Harvard Educational Review
    Summer 2013 Issue

    Symposium: Teacher Effectiveness

    The Role of Context

    In 2012, the Harvard Educational Review (HER) asked some important questions about teacher effectiveness in its symposium titled “By What Measure?” These questions included: Who defines effectiveness and how can it be measured? And what effect does the language around teacher effectiveness have on policies and systems such as teacher certification, evaluation, and professional learning? Now as then, the journal seeks to remain at the forefront of this important debate. As teacher evaluation policies are hotly debated, adopted, and implemented around the country, HER builds on its earlier work to highlight the crucial role of context in the teacher effectiveness conversation. 

    The inclusion of requirements for measuring teacher effectiveness in Race to the Top (RTTT), the Obama administration’s competitive system for awarding federal funding, together with the highly touted results of the Gates Foundation–funded Measures of Effective Teaching (MET) study, continues to underscore the salience of teacher effectiveness in current education policy debates. As a result, many states and localities are creating teacher evaluation systems that rely heavily on value-added measures (VAM) of teacher quality. In fact, states that have accepted waivers from No Child Left Behind or RTTT funding have done so on the condition that they use standardized test scores, along with other measures, to evaluate teacher performance. Other states have voluntarily adopted such programs, with some forgoing any measures other than student test scores. On the whole, this trend means that teachers in a growing number of states are, or in the coming years will be, subject to evaluation systems that use these metrics.

    The final MET Project (2013) report states unequivocally that using three metrics in combination (VAM scores, classroom observation instruments, and student surveys) reliably identifies “great teaching.” These findings reinforce the heart of what the RTTT requires of state evaluation systems. While not without flaws, the findings of the MET study do highlight important components of teaching and teaching effectiveness that have been missing from many conversations and from existing evaluation systems that rely only on VAM. Principally, the report highlights the role of context and the multidimensionality of the practice of teaching that cannot be captured dynamically in a test score or a set of quantifiable student characteristics.

    We believe that policies that seek to address the needs and meet the challenges of teaching practice must be debated, evaluated rigorously, and considered thoroughly, both in their infancy and, above all, after being deployed, if their impact and application are to be fully understood. The budding programs for teacher evaluation are certainly no exception. As these new policies move from theory to practice, with considerable consequences for teachers and students alike, their logic, assumptions, and entailments need to be examined if we are to foresee and potentially minimize unintended consequences.

    This symposium includes two articles that look more deeply at the role that context plays in our understanding of teacher effectiveness. While the articles conceptualize context somewhat differently, considering them together allows us to further emphasize that the details of context matter. We see that context can be incorporated both in the complex statistical methodologies for calculating teacher value-added scores and in the more holistic observational rubrics designed to capture important dimensions of pedagogy and practice not otherwise visible in VAM.

    In the first piece, “Rethinking Teacher Evaluation: A Conversation about Statistical Inferences and Value-Added Models,” Everson, Feinauer, and Sudweeks take a methodological look at context, offering a critique of the ways in which value-added scores are currently calculated and the potential inferences that can then be made about teacher effectiveness. The authors offer an alternative statistical methodology that changes the very inference being drawn about teachers by their value-added scores, incorporating context literally into the equation. They argue that if teachers are to be evaluated using student test scores at all, they should be evaluated for the job they were hired to do in the context in which they were hired to do it. In other words, the metric itself should not be a measure of how effective teachers are at teaching all students on average but, rather, how effective teachers are at teaching their own classroom composition of students.

    In the second piece, Hill and Grossman, both developers and researchers of classroom observation instruments, caution against the rush to include teacher observation metrics in teacher evaluation without unpacking important assumptions about their validity and reliability for both evaluation and feedback purposes. In “Learning from Teacher Observations: Challenges and Opportunities Posed by New Teacher Evaluation Systems,” the authors call on policy makers and practitioners to be thoughtful in considering potential problems of implementation, including content specificity, grain size of what is measured and discussed, rater expertise, observation system design, accuracy and alignment of scores and feedback to teachers, and time investments in such observational systems.

    While the ideas presented in these pages are new and worthy of reflection and policy consideration, we recognize that there remain gaps in what we have considered here and in our previous symposium on this issue. For example, at present, only teachers in tested subjects and grades are subject to the existing forms of evaluation. It is an open question what teacher effectiveness and context will look like for teachers of nontested subjects. Thus, we assert that context becomes important in ways we have only begun to explore. We aim to add further texture to the debate on teacher effectiveness and spur new conversations and scholarship that will continue on our pages and elsewhere.

    MET Project. (2013). Ensuring fair and reliable measures of effective teaching: Culminating findings from the MET project’s three-year study. Seattle: Bill & Melinda Gates Foundation. 