Harvard Educational Review
  Spring 2012 Issue

    Symposium: By What Measure?

    Mapping and Expanding the Teacher Effectiveness Debate

    Even without the benefit of peer-reviewed research, anyone who has experienced school can describe a teacher who was effective in challenging, inspiring, encouraging, and instructing them, as well as one who frustrated or disappointed them. Among administrators and teacher trainers there is the widely held belief that we can recognize effective teaching when we see it. And yet, even with decades of research, we still grapple with the basic questions: What does it mean to be an effective teacher? How do we tell which teachers are the most and least effective? What can we do to improve teacher effectiveness?

    These questions have moved to the center of an increasingly heated policy debate over how to define, measure, and improve teacher effectiveness in K–12 education in the United States. While this debate has been going on for more than half a century, large-scale standardized testing has more recently been used to quantify the performance of students and schools. By logical extension, this same student data has been called on to quantify the effectiveness of teachers. Although there are potentially many ways to measure teachers’ performance—and some may argue that the art of teaching is perhaps too complex to be measured at all—what has emerged is a burgeoning literature (and a fast-growing industry) on measuring the effectiveness of teachers, largely tied to a teacher’s ability to raise students’ test scores.

    These approaches include value-added measurements and observational assessments, as well as other approaches. The research on value-added measurements suggests that teachers have a profound impact on student learning as measured by test scores (Nye, Konstantopoulos, & Hedges, 2004; Rivkin, Hanushek, & Kain, 2005) and that some teachers appear to be more effective than others (Gordon, Kane, & Staiger, 2006; Sanders & Rivers, 1996), assertions that have been posited before, but that now have been quantified in ways not previously imagined. Finally, it seemed, we had scientifically based, objective data around which to create sound policy, including decisions about hiring and firing, merit pay, and teaching assignments. Proponents believed that the widespread use of these measures might root out those who don’t belong in the profession and incentivize the most dedicated and talented teachers to remain in the classroom. Many hoped that these measures could be used to ensure that the most effective teachers could be placed with the students who need them most.

    The promulgation of these efforts to measure teachers has certainly garnered much attention from the media, teacher unions, parents, policy makers, and researchers and has created significant conflict at all levels of education. From the Los Angeles Times’s decision to calculate and publish online the value-added scores of every teacher in the Los Angeles Unified School District to the allegations of cheating being made across the country, these efforts have simultaneously been labeled “a witch hunt” and “the salvation of education.”1 In spite of these unresolved issues, the Obama administration made the use of multiple measures of teacher effectiveness, including those that tie teacher performance ratings to student test scores, a cornerstone of its Race to the Top incentive grants.

    States have begun scrambling to design and implement systems (McNeil, 2012), a rush that has created differences in the ways states implement these requirements, mirroring the standards movement that began twenty years ago. This push to have quantifiable effectiveness ratings for all three million American teachers (Keigher, 2010) teaching under differing state requirements and standards will take unprecedented effort, and many have begun to question whether the results will be useful enough to justify the time and expense. Proponents say we can no longer afford to have ineffective teachers in our classrooms, particularly for students who have traditionally struggled to succeed in our nation’s schools. Though hardly a new claim, this call has been amplified as politicians and policy makers have touted the objectivity of value-added measures of teacher effectiveness.

    The current debate has been fueled by states and districts that have begun attaching these value-added scores to such decisions as tenure and salary, amplifying teachers’ concerns about the fairness and reliability of these measures. The resulting acrimony has often drowned out the voices of parents, students, and teachers—who seem to have the most to lose or gain.

    As editors of the Harvard Educational Review, we have seen the way the academic discourse around this debate has largely devolved into methodological disagreements surrounding the technical aspects of the measurements, while the popular media has focused on the ideological conflicts inherent in bringing high-stakes assessments to the very human endeavor of teaching. Through this symposium we present diverse viewpoints from a variety of researchers, practitioners, and students. We reintroduce some much-needed context and complexity to the current conversation and begin a new dialogue that moves us beyond the current myopia surrounding one way to measure teacher effectiveness.

    The symposium’s central feature is the edited transcript of a roundtable discussion held in New York City in December 2010. Organized by Heather Harding of Teach For America, this gathering of researchers and practitioners provided the opportunity for those who approach these issues from different vantage points to express both their struggles with and hopes for improving the recruitment, selection, assessment, and training of teachers. The roundtable was led by Anthony Bryk, longtime educational researcher and president of the Carnegie Foundation for the Advancement of Teaching. The rich array of voices at the table included Jesse Solomon, the founder of Boston Teacher Residency; Edward Liu, former Rutgers professor and now director of organizational learning at Boston Teacher Residency; Ann Clark, the chief academic officer at the Charlotte-Mecklenburg (NC) school district; Jane Hannaway, a vice president at the American Institutes for Research (AIR); Steven Farr, the chief knowledge officer at Teach For America; and Pam Grossman, professor of education at Stanford University School of Education. The roundtable delved into how to move from a loose system of assessing teacher effectiveness to a more coherent and scalable approach. Around the table these researchers and practitioners shared their perspectives on promising practices and their efforts to refine and expand on them. The diverse experiences and points of view provided a cross-section of perspectives on the current policy debates that are shaping personnel decisions across the United States.

    Through four additional pieces in this symposium, we aim to inform and also offer complexity and nuance to the conversation. In the first piece, Susan Moore Johnson synthesizes some of her research from the past two decades in light of the current demands for greater teacher accountability. In this piece, she challenges assumptions about the conclusions we can draw about individual teacher performances without considering the organizational conditions of teachers’ work. While much of the current policy debate considers the hiring and firing of effective or ineffective teachers, Johnson reminds us that teachers can be developed over time. She notes that school improvement must not only assist individual teachers but also make for greater interdependence and consistency throughout the school.

    In the current milieu of quantitative approaches to measuring teacher effectiveness, John Papay provides important details of—and context for—the teacher evaluation tools currently available to administrators and policy makers. He argues that teacher evaluation tools should be assessed on how well they inform ongoing teacher development while also providing accurate measurements. He describes the technical details that underlie the quantitative mechanics of value-added measures, including issues of bias, reliability, and validity. Importantly, this piece also considers how a teacher evaluation system using value-added measures could be implemented in a way that helps teachers improve. By situating each of these aspects of teacher evaluation in the current literature and policy conversation, Papay’s essay strikes a balance between methodological and practical knowledge and provides some much-needed clarity around the evaluation tools frequently discussed in policy circles.

    Education historian Jeremy Sullivan provides context for the teacher effectiveness debate in his examination of teacher evaluation and the development of the Peer Assistance and Review (PAR) system of teacher evaluation in Montgomery County, Maryland. By detailing the historical role of the collective bargaining unit and district administration, he conveys the background and conversations that have taken place regarding teacher effectiveness in one local setting. The history of Montgomery County reminds us that pay-for-performance systems and the desire to remove ineffective teachers are not new ideas and that there are alternatives to antiquated methods of teacher evaluation that do not necessarily rely on students’ standardized test scores. It also reminds us that many teachers are themselves concerned with ensuring that all students are taught by competent professionals and are willing to partner with districts to fairly evaluate and develop teachers to their fullest potential.

    Finally, we hear from participants and leaders of the Boston Student Advisory Council (BSAC), a student-directed organization that has added the all-important voice of students to the teacher evaluation process in Boston and the Commonwealth of Massachusetts. They recount their experiences in gaining a voice in the conversation on measuring teacher effectiveness in the Boston Public Schools, in expanding their efforts to the state level, and in developing their vision for a nationwide approach to teacher evaluation that includes student input. They also draw our attention to important new findings from the Measures of Effective Teaching (MET) study on teacher effectiveness, which has shown that student perception of teacher effectiveness is a strong predictor of achievement gains made by other students who have that same teacher (Bill and Melinda Gates Foundation, 2010). These students remind us that they are our primary stakeholders, that they are the ones taught by the teachers, and that we would do well to integrate their voices into the process of determining what does and does not count as effective teaching.

    What we include in this teacher effectiveness symposium is hardly exhaustive, but it does provide a texture and depth not represented in the popular media. Mirroring the spirit of the original roundtable, we have endeavored to tackle one idea from several angles in order to further the conversation. Importantly, although the voices of current teachers and school administrators are absent from this symposium, we recognize that they are essential to this conversation. We further recognize that there are many other aspects of high-quality teaching—such as cultural and linguistic competence, civic and moral development, and numerous subjects, including the arts and early childhood education—that do not easily lend themselves to large-scale, standardized assessments. Though we do not address these elements in this brief symposium, they deserve serious consideration by both practitioners and policy makers.

    Through these five articles, we shine light not only on dominant aspects of the conversation but also on elements of the debate that have been receiving less column space and airtime but that are no less part of a nuanced consideration of the topic of teacher effectiveness.

    1. See http://projects.latimes.com/value-added/


    Bill and Melinda Gates Foundation. Measures of Effective Teaching Project. (2010). Learning about teaching: Initial findings from the Measures of Effective Teaching Project. Retrieved from http://www.metproject.org/downloads/Preliminary_Finding_Policy_Brief.pdf

    Gordon, R., Kane, T. J., & Staiger, D. O. (2006). Identifying effective teachers using performance on the job. Washington, DC: Brookings Institution. 

    Keigher, A. (2010). Teacher attrition and mobility: Results from the 2008–09 Teacher Follow-Up Survey (NCES 2010-353). Washington, DC: National Center for Education Statistics. Retrieved from http://nces.ed.gov/pubsearch

    McNeil, M. (2012, January 17). Reports detail Race to Top winners’ challenges: States face difficulties delivering on promises. Education Week. Retrieved from www.edweek.org.

    Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.

    Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417–458.

    Sanders, W. L., & Rivers, J. C. (1996). Cumulative and residual effects of teachers on future student academic achievement. Knoxville: University of Tennessee Value-Added Research and Assessment Center.

    The editors thank former cochair and editor Candice Bocala for her continued support and leadership in making this symposium a reality.

    Spring 2012 Issue


    “A Few of the Brightest, Cleanest Mexican Children”
    School Segregation as a Form of Mundane Racism in Oxnard, California, 1900–1940
    David G. García, Tara J. Yosso, and Frank P. Barajas
    Changing Our Landscape of Inquiry for a New Science of Education
    Gary Thomas
    Institutional Racist Melancholia
    A Structural Understanding of Grief and Power in Schooling
    Sabina Vaught
    Symposium: By What Measure?
    Mapping and Expanding the Teacher Effectiveness Debate
    Contextual Influences on Inquiries into Effective Teaching and Their Implications for Improving Student Learning
    Anthony Bryk, Heather Harding, and Sharon Greenberg
    Having It Both Ways
    Building the Capacity of Individual Teachers and Their Schools
    Susan Moore Johnson
    Refocusing the Debate
    Assessing the Purposes and Tools of Teacher Evaluation
    John Papay
    A Collaborative Effort
    Peer Review and the History of Teacher Evaluations in Montgomery County, Maryland
    Jeremy P. Sullivan
    “We Are the Ones in the Classrooms—Ask Us!”
    Student Voice in Teacher Evaluations
    Boston Student Advisory Council

    Book Notes

    Our Difficult Sunlight
    Georgia A. Popoff and Quraysh Ali Lansana
