Evaluating Teacher Performance and Teaching Effectiveness: Conceptual and Methodological Considerations

Educational theory inextricably links teachers to student learning, treating them as the key factor mediating between educational policies and students' experiences in the classroom, and research consistently shows that a range of teacher and classroom variables exert an important influence on student outcomes. This chapter highlights the key conceptual and methodological issues involved in evaluating teaching and teachers, with particular focus on the distinction between the concepts of performance and effectiveness. It considers the implications of assumptions and choices about why the evaluation is conducted, what is evaluated, and how it is evaluated, and presents a range of methods for collecting data on performance and effectiveness. Additionally, we analyze issues related to the reliability and validity of the resulting inferences about teacher performance or effectiveness, along with their implications for policy and practice. Finally, different models of teacher evaluation are presented to illustrate the distinctions and commonalities between evaluating performance and evaluating effectiveness in practice.
Notes
Subject knowledge; commitment to student learning; monitoring and managing student learning; reflecting on and learning from their own practice; and membership in learning communities.
Learner development; learning differences; learning environments; content knowledge; application of content; assessment; planning for instruction; instructional strategies; professional learning and ethical practice; and leadership and collaboration.
In 2020, guidelines for remote teaching were issued for the FFT, which focus on components that are thought to be most relevant for online learning and remote instruction (The Danielson Group, 2020).
The area of emotional support encompasses the dimensions of classroom climate, teacher sensitivity, and regard for student perspectives, while classroom organization includes behavior management, productivity, and instructional learning format. Finally, instructional support is operationalized into concept development, quality of feedback, and language modeling.
In these models, teachers who have been identified for their excellence in teaching and mentoring are chosen as coaches to provide support to new teachers as well as experienced colleagues who may require help. Coaches are also responsible for the teachers’ formal personnel evaluations. Typically, coaches do not work in a single school, but are matched with teachers from different schools according to grade level or subject area.
Adequate Yearly Progress (AYP) targets were defined as the specific amount of progress in standardized test scores that a school, district, or state was expected to make in a year.
Schools can adopt commercially available tests or develop their own, provided these are “rigorous, aligned to content standards, and appropriate for the teacher’s classes and students” (District of Columbia Public Schools, 2011, p. 2; Gitomer & Joyce, 2015).
References
- AERA, APA, NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.
- Amrein-Beardsley, A. (2008). Methodological concerns about the education value-added assessment system. Educational Researcher, 37(2), 65–75.
- Anderson, J. (2013, March 30). Curious grade for teachers: Nearly all pass. New York Times.
- Apple, M. W. (2007). Ideological success, educational failure? On the politics of No Child Left Behind. Journal of Teacher Education, 58(2), 108–116.
- Australian Institute for Teaching and School Leadership. (2018). Australian Professional Standards for Teachers. AITSL.
- Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Shepard, L. A., et al. (2010). Problems with the use of student test scores to evaluate teachers. EPI Briefing Paper (278).
- Ball, D. L., & Rowan, B. (2004). Introduction: Measuring instruction. The Elementary School Journal, 105(1), 3–10.
- Ball, D. L., Thames, M. H., & Phelps, G. (2008). Content knowledge for teaching: What makes it special? Journal of Teacher Education, 59(5), 389–407.
- Bell, C. A., Dobbelaer, M. J., Klette, K., & Visscher, A. (2019). Qualities of classroom observation systems. School Effectiveness and School Improvement, 30(1), 3–29.
- Bell, C. A., Klieme, E., & Praetorius, A.-K. (2020). Conceptualising teaching quality into six domains for the Study. In OECD, global teaching insights technical report (pp. 1–24). OECD Publishing.
- Betebenner, D. W. (2009). Norm- and criterion-referenced student growth. Educational Measurement: Issues and Practice, 28(4), 42–51.
- Betebenner, D. W. (2011). A technical overview of the student growth percentile methodology: Student growth percentiles and percentile growth projections/trajectories. The National Center for the Improvement of Educational Assessment.
- Bill & Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Bill & Melinda Gates Foundation.
- Bill & Melinda Gates Foundation. (2012a). Gathering feedback for teaching. Research Paper. Bill & Melinda Gates Foundation.
- Bill & Melinda Gates Foundation. (2012b). Asking students about teaching. Policy and practice brief.
- Bransford, J., Darling-Hammond, L., & LePage, P. (2005). Introduction. In L. Darling-Hammond & J. Bransford (Eds.), Preparing teachers for a changing world: What teachers should learn and be able to do (pp. 1–39). Jossey-Bass.
- Brennan, R. L. (2001). Some problems, pitfalls, and paradoxes in educational measurement. Educational Measurement: Issues and Practice, 20(4), 6–18.
- Brookhart, S. M. (2009). The many meanings of multiple measures. Educational Leadership, 67(3), 6–12.
- Brophy, J., & Good, T. L. (1986). Teacher behavior and student achievement. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 328–375). Macmillan.
- California Commission on Teacher Credentialing. (2009). California standards for the teaching profession (CSTP).
- CASEL. (2020). CASEL’s SEL framework: What are the core competence areas and where are they promoted? CASEL.
- Close, K., Amrein-Beardsley, A., & Collins, C. (2020). Putting teacher evaluation systems on the map: An overview of state’s teacher evaluation systems post–Every Student Succeeds Act. Education Policy Analysis Archives, 28(58), 1–26.
- Cohen, D. K. (1995). Rewarding teachers for student performance. In S. Fuhrman & J. O’Day (Eds.), Rewards and reforms: Creating educational incentives that work. Jossey-Bass.
- Cole, M. S., Bedeian, A. G., Hirschfeld, R. R., & Vogel, B. (2011). Dispersion-composition models in multilevel research: A data-analytic framework. Organizational Research Methods, 14(4), 718–734.
- Connecticut State Department of Education. (2010). Common core of teaching: Foundational skills. CSDE.
- Corcoran, S. P. (2010). Can teachers be evaluated by their students’ test scores? Should they be? The use of value-added measures of teacher effectiveness in policy and practice. Annenberg Institute for School Reform.
- Council of Chief State School Officers. (2013). InTASC model core teaching standards and learning progressions for teachers 1.0. CCSSO.
- Danielson, C. (2013). The framework for teaching evaluation instrument, 2013 edition. Danielson Group.
- Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8(1), 1–44.
- Darling-Hammond, L. (2006). Constructing 21st-century teacher education. Journal of Teacher Education, 57(3), 300–314.
- Darling-Hammond, L. (2008). Reshaping teaching policy, preparation, and practice: Influences of the National Board for Professional Teaching Standards. In R. Stake, S. Kushner, L. Ingvarson, & J. Hattie (Eds.), Assessing teachers for professional certification: The first decade of the National Board for Professional Teaching Standards (Advances in Program Evaluation) (Vol. 11, pp. 25–53). Emerald Group Publishing Limited.
- Darling-Hammond, L. (2015). Can value added add value to teacher evaluation? Educational Researcher, 44(2), 132–137.
- Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8–15.
- De Corte, W., Lievens, F., & Sackett, P. R. (2007). Combining predictors to achieve optimal tradeoffs between selection quality and adverse impact. Journal of Applied Psychology, 92, 1380–1393.
- De Pascale, C. (2012). Managing multiple measures. Principal, 91(5), 6–10.
- Department for Education, England. (2013). Teachers’ standards: Guidance for school leaders, school staff and governing bodies. DFE.
- District of Columbia Public Schools. (2011). Teacher-assessed student achievement data (TAS) guidance. DCPS.
- Doss, C. J. (2019). Student growth percentiles 101: Using relative ranks in student test scores to help measure teaching effectiveness. RAND Corporation.
- Duncan, A. (2012, August 22). Change is hard. Retrieved from US Department of Education: https://www.ed.gov/news/speeches/change-hard
- Dynarski, M. (2016). Teacher observations have been a waste of time and money. Brookings Institution.
- Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. J. (2014). The sensitivity of value-added estimates to specification adjustments: Evidence from school- and teacher-level models in Missouri. Statistics and Public Policy, 1(1), 19–27.
- Elmore, R. F. (1996). Getting to scale with good educational practice. Harvard Educational Review, 66(1), 1–26.
- Every Student Succeeds Act, Title I Section 1111(2)(B)(III)(vi) (2015).
- Ferguson, R. F. (2012). Can student surveys measure teaching quality? Phi Delta Kappan, 94(3), 24–28.
- Florida Department of Education. (2018). 2017–18 District educator evaluation ratings. Retrieved from Archived Statewide District Evaluation Results: http://www.fldoe.org/teaching/performance-evaluation/archive.stml
- Gitomer, D. H., & Joyce, J. (2015). A review of the DC IMPACT teacher evaluation system. National Research Council.
- Gitomer, D. H., & Zisk, R. C. (2015). Knowing what teachers know. Review of Research in Education, 39, 1–53.
- Gitomer, D. H., Martinez, J. F., Battey, D., & Hyland, N. E. (2019). Assessing the assessment: Evidence of reliability and validity in the edTPA. American Educational Research Journal, 58(1), 3–31.
- Glazerman, S., Goldhaber, D., Loeb, S., Raudenbush, S., Staiger, D. O., & Whitehurst, G. J. (2011). Passing muster: Evaluating evaluation systems. Brown Center on Education Policy at Brookings.
- Glazerman, S., Loeb, S., Goldhaber, D., Staiger, D., Raudenbush, S., & Whitehurst, G. (2010). Evaluating teachers: The important role of value-added. Brookings Institution.
- Goe, L. (2007). The link between teacher quality and student outcomes: A research synthesis. National Comprehensive Center for Teacher Quality.
- Goe, L., & Croft, A. (2009). Methods of evaluating teacher effectiveness. National Comprehensive Center for Teacher Quality.
- Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. National Comprehensive Center for Teacher Quality.
- Goldhaber, D., & Anthony, E. (2007). Can teacher quality be effectively assessed? National Board certification as a signal of effective teaching. The Review of Economics and Statistics, 89(1), 134–150.
- Goldhaber, D., Walch, J., & Gabele, B. (2014). Does the model matter? Exploring the relationship between different student achievement-based teacher assessments. Statistics and Public Policy, 1(1), 28–39.
- Goldstein, J., & Noguera, P. A. (2006). A thoughtful approach to teacher evaluation. Educational Leadership, 63(6), 31–37.
- Good, T. L. (2014). What do we know about how teachers influence student performance on standardized tests: And why do we know so little about other student outcomes? Teachers College Record, 116, 1–41.
- Goodman, S. F., & Turner, L. J. (2013). The design of teacher incentive pay and educational outcomes: Evidence from the New York City bonus program. Journal of Labor Economics, 31(2), 409–420.
- Grossman, P., Loeb, S., Cohen, J., & Wyckoff, J. (2013). Measure for measure: The relationship between measures of instructional practice in middle school English language arts and teachers’ value-added scores. American Journal of Education, 119, 445–470.
- Guarino, C. M., Reckase, M. D., & Wooldridge, J. M. (2012). Can value-added measures of teacher performance be trusted? Education Policy Center at Michigan State University.
- Guarino, C. M., Reckase, M. D., Stacy, B., & Wooldridge, J. M. (2015). A comparison of student growth percentile and value-added models of teacher performance. Statistics and Public Policy, 2(1), 1–11.
- Guerriero, S. (2018). Teachers’ pedagogical knowledge and the teaching profession: Background report and project objectives. OECD Publishing.
- Hallinger, P., Heck, R. H., & Murphy, J. (2014). Teacher evaluation and school improvement: An analysis of the evidence. Educational Assessment, Evaluation and Accountability, 26(1), 5–28.
- Hamilton, L. (2005). Lessons from performance measurement in education. In R. Klitgaard & P. C. Light (Eds.), High-performance government (pp. 381–405). RAND Corporation.
- Hamre, B. K., & Pianta, R. C. (2007). Learning opportunities in preschool and early elementary classrooms. In R. C. Pianta, M. J. Cox, & K. L. Snow (Eds.), School readiness & the transition to kindergarten in the era of accountability (pp. 49–84). Paul H. Brookes Publishing Co.
- Hanushek, E. A., & Rivkin, S. G. (2010). Using value-added measures of teacher quality. CALDER - Urban Institute.
- Hanushek, E. A., Kain, J. F., & Rivkin, S. G. (1999). Do higher salaries buy better teachers? NBER Working Paper No. 7082.
- Harris, D. N., & Sass, T. R. (2009). What makes for a good teacher and who can tell? CALDER working paper.
- Harris, D. N., Ingle, W. K., & Rutledge, S. A. (2014). How teacher evaluation methods matter for accountability: A comparative analysis of teacher effectiveness ratings by principals and teacher value-added measures. American Educational Research Journal, 51(1), 73–112.
- Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
- Hill, H. C., Blunk, M., Charalambous, C., Lewis, J., Phelps, G. C., Sleep, L., & Ball, D. L. (2008). Mathematical knowledge for teaching and the mathematical quality of instruction: An exploratory study. Cognition and Instruction, 26, 430–511.
- Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56–64.
- Jackson, C. K. (2016). What do test scores miss? The importance of teacher effects on non-test score outcomes. NBER.
- Johnson, S. M., & Fiarman, S. E. (2012). The potential of peer review. Educational Leadership, 70(3), 20–25.
- Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). American Council on Education.
- Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation. NBER Working Paper 14607.
- Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching. Bill & Melinda Gates Foundation. Retrieved from http://k12education.gatesfoundation.org/download/?Num=2680&filename=MET_Gathering_Feedback_Research_Paper1.pdf
- Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2011, Summer). Evaluating teacher effectiveness: Can classroom observations identify practices that raise achievement? Education Next (pp. 55–60).
- Kane, T., McCaffrey, D., Miller, T., & Staiger, D. (2013). Have we identified effective teachers? Validating measures of effective teaching using random assignment. Research Paper. Bill & Melinda Gates Foundation.
- Kennedy, M. M. (2008). Sorting out teacher quality. Phi Delta Kappan, 90(1), 59–63.
- Kloser, M. (2014). Identifying a core set of science teaching practices: A Delphi expert panel approach. Journal of Research in Science Teaching, 51(9), 1185–1217.
- Kloser, M., Edelman, A., Floyd, C., Martinez, J. F., Stecher, B., Srinivasan, J., & Lavin, E. (2021). Interrogating practice or show and tell? Using a digital portfolio to anchor a professional learning community of science teachers. Journal of Science Teacher Education, 32(2), 210–241.
- Kuhfeld, M. (2017). When students grade their teachers: A validity analysis of the Tripod student survey. Educational Assessment, 22(4), 253–274.
- Kurtz, M. D. (2018). Value-added and student growth percentile models: What drives differences in estimated classroom effects? Statistics and Public Policy, 5(1), 1–8.
- Lachlan-Haché, L., Cushing, E., & Bivona, L. (2012a). Student learning objectives as measures of educator effectiveness: The basics. American Institutes for Research.
- Lachlan-Haché, L., Cushing, E., & Bivona, L. (2012b). Student learning objectives: Benefits, challenges, and solutions. American Institutes for Research.
- LAUSD. (2021a, April 3). History of EDST. Retrieved from https://achieve.lausd.net/Page/11782#spn-content
- LAUSD. (2021b). Teaching and learning framework. LAUSD.
- Linn, R. L. (2000). Assessments and accountability. Educational Researcher, 29(2), 4–16.
- Lockwood, J. R., McCaffrey, D. F., Hamilton, L. S., Stecher, B., Le, V.-N., & Martinez, J. F. (2007). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures. Journal of Educational Measurement, 44(1), 47–67.
- Los Angeles Unified School District. (2019). 2018–2019 EDS final evaluation report for teachers and non-classroom teachers: Administrator handbook. LAUSD.
- Maine Department of Education. (2012). Common core teaching standards. MDE.
- Martínez Rizo, F. (2015). La evaluación del desempeño docente. Una propuesta para la educación básica en México. In G. Guevara Niebla, M. T. Melendez Irigoyen, F. E. Ramon Castaño, H. Sanchez Perez, & F. Tirado Segura (Eds.), La evaluación docente en México (pp. 64–95). INEE-Fondo de Cultura Económica.
- Martinez, J. F. (2012). Consequences of omitting the classroom in multilevel models of schooling: An illustration using opportunity to learn and reading achievement. School Effectiveness and School Improvement, 23(3), 305–326.
- Martinez, J. F., & Fernandez, M. P. (2019). Evaluación docente con indicadores múltiples: Consideraciones conceptuales y metodológicas en torno a la validez. In J. Manzi, M. R. Garcia, & S. Taut (Eds.), Validez de Evaluaciones Educacionales en Chile y Latinoamérica (pp. 531–562). Ediciones UC.
- Martinez, J. F., Borko, H., & Stecher, B. (2012). Measuring instructional practices in middle school science using classroom artifacts. Journal of Research in Science Teaching, 49, 38–67.
- Martinez, J. F., Schweig, J., & Goldschmidt, P. (2016a). Approaches for combining multiple measures of teacher performance: Reliability, validity, and implications for evaluation policy. Educational Evaluation and Policy Analysis, 38(4), 738–756.
- Martinez, J. F., Taut, S., & Schaaf, K. (2016b). Classroom observation for evaluating and improving teaching: An international perspective. Studies in Educational Evaluation, 49, 15–29.
- Marzano, R. J., & Toth, M. D. (2013). Teacher evaluation that makes a difference: A new model for teacher growth and student achievement. ASCD.
- Matsumura, L. C., Garnier, H. E., Slater, S. C., & Boston, M. D. (2008). Toward measuring instructional interactions “at-scale.” Educational Assessment, 13, 267–300.
- Medley, D. M., & Coker, H. (1987). The accuracy of principals’ judgments of teacher performance. The Journal of Educational Research, 80(4), 242–247.
- Meyer, R. H. (1996). Value-added indicators of school performance. In E. A. Hanushek & D. W. Jorgenson (Eds.), Improving America’s schools: The role of incentives (pp. 197–223). The National Academies Press.
- Meyer, R., Pier, L., Mader, J., Christian, M., Rice, A., Loeb, S., Hough, H., et al. (2019). Can we measure classroom supports for social-emotional learning? Applying value-added models to student surveys in the CORE districts. PACE.
- Mihaly, K., McCaffrey, D., Staiger, D., & Lockwood, J. R. (2013). A composite estimator of effective teaching (MET Project). The RAND Corporation.
- Millman, J. (1981). Student achievement as a measure of teacher competence. In Handbook of teacher evaluation (pp. 146–166). Sage.
- Ministry of Education, Chile. (2008). Marco para la Buena Enseñanza. MINEDUC.
- Muijs, D. (2006). Measuring teacher effectiveness: Some methodological reflections. Educational Research and Evaluation, 12(1), 53–74.
- Mullens, J. E. (1995). Classroom instructional processes: A review of existing measurement approaches and their applicability for the teacher followup survey. U.S. Department of Education.
- Mullis, I. V., Martin, M. O., Foy, P., Kelly, D. L., & Fishbein, B. (2020). TIMSS 2019 international results in mathematics and science. TIMSS & PIRLS International Study Center.
- National Board for Professional Teaching Standards. (2016). What teachers should know and be able to do (2nd ed.). NBPTS.
- National Commission on Excellence in Education. (1983). A nation at risk: The imperative for educational reform. U.S. Department of Education.
- National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. NCTM.
- National Research Council. (2010). Preparing teachers: Building evidence for sound policy. National Academy of Sciences.
- NCTQ. (2015). State teacher policy yearbook: National summary. National Council on Teacher Quality (NCTQ).
- NCTQ. (2017). Running in place: How new teacher evaluations fail to live up to promises. NCTQ.
- NCTQ. (2019). State of the states 2019: Teacher & principal evaluation policy. National Council on Teacher Quality (NCTQ).
- New York City Department of Education. (2019). Advance guide for educators 2019–2020. NYCDE.
- OECD. (2013). Teachers for the 21st century: Using evaluation to improve teaching. OECD Publishing.
- OECD. (2019). TALIS 2018 results: Teachers and school leaders as lifelong learners (Vol. 1). OECD Publishing.
- OECD. (2020). Global teaching insights: A video study of teaching. OECD Publishing.
- Paige, M. (2020). Moving forward while looking back: How can VAM lawsuits guide teacher evaluation policy in the age of ESSA? Education Policy Analysis Archives, 28(64), 1–18.
- Papay, J. (2012). Refocusing the debate: Assessing the purposes and tools of teacher evaluation. Harvard Educational Review, 82(1), 123–141.
- Pecheone, R. L., Shear, B., Whittaker, A., & Darling-Hammond, L. (2013). 2013 edTPA field test: Summary report. SCALE.
- Peterson, K. D. (1995). Teacher evaluation: A comprehensive guide to new directions and practices. Corwin.
- Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, measurement, and improvement of classroom processes: Standardized observation can leverage capacity. Educational Researcher, 38(2), 109–119.
- Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2007). Classroom assessment scoring system. Paul H. Brookes.
- Popham, W. J. (1971). Performance tests of teaching proficiency: Rationale, development, and validation. American Educational Research Journal, 8(1), 105–117.
- Popham, W. J. (2007). Instructional insensitivity of tests: Accountability’s dire drawback. Phi Delta Kappan, 146–155.
- Porter, A., Youngs, P., & Odden, A. (2001). Advances in teacher assessments and their uses. In V. Richardson (Ed.), Handbook of research on teaching (4th ed., pp. 259–297). AERA.
- Reynolds, A. (1992). Getting to the core of the apple: A theoretical view of the knowledge base of teaching. Journal of Personnel Evaluation in Education, 6, 41–55.
- Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417–458.
- Rothstein, J. (2016). Can value-added models identify teachers’ impacts? IRLE - UC Berkeley.
- Rowan, B., & Correnti, R. (2009). Studying reading instruction with teacher logs: Lessons from the study of instructional improvement. Educational Researcher, 38(2), 120–131.
- Rubin, D. B., Stuart, E. A., & Zanutto, E. L. (2004). A potential outcomes view of value-added assessment in education. Journal of Educational and Behavioral Statistics, 29(1), 103–116.
- S.B. 736, Student Success Act. (2010). St. FL.
- S.B. 736, Student Success Act, Section 1012.343(3)(a)1 (2010).
- Sass, T. R. (2008). The stability of value-added measures of teacher quality and implications for teacher compensation policy. CALDER - Urban Institute.
- Sato, M. (2014). What is the underlying conception of teaching of the edTPA? Journal of Teacher Education, 65(5), 421–434.
- Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and Mathematics, 102(6), 245–253.
- Schweig, J. D. (2016). Moving beyond means: Revealing features of the learning environment by investigating the agreement of student ratings. Learning Environments Research, 19(3), 441–462.
- Schweig, J., Baker, G., Hamilton, L. S., & Stecher, B. M. (2018). Building a repository of assessments of interpersonal, intrapersonal, and higher-order cognitive competencies. RAND Corporation.
- Shulman, L. (1998). Teacher portfolios: A theoretical activity. In N. Lyons (Ed.), With portfolio in hand (pp. 23–37). Teachers College Press.
- Shulman, L. S. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Educational Review, 57(1), 1–22.
- Stecher, B. M., Wood, A. C., Gilbert, M., Borko, H., Kuffner, K. L., Arnold, S. C., & Dorman, E. H. (2005). Using classroom artifacts to measure instructional practices in middle school mathematics: A two-state field test (CSE Report 662). CRESST.
- Stecher, B., & Kirby, S. N. (2004). Organizational improvement and accountability: Lessons for education from other sectors. RAND Corporation.
- Steele, J., Hamilton, L. S., & Stecher, B. M. (2010). Incorporating student performance measures into teacher evaluation systems. The RAND Corporation.
- Stodolsky, S. S. (1990). Classroom observation. In J. Millman & L. Darling-Hammond (Eds.), The new handbook of teacher evaluation: Assessing elementary and secondary school teachers (pp. 175–190). Corwin Press.
- Taut, S., & Sun, Y. (2014). The development and implementation of a national, standards-based, multi-method teacher performance assessment system in Chile. Education Policy Analysis Archives, 22(71).
- The Danielson Group. (2020). The framework for remote teaching. The Danielson Group.
- Tucker, P. D., & Stronge, J. H. (2005). Linking teacher evaluation and student learning. Association for Supervision and Curriculum Development.
- U.S. Department of Education. (2001). No Child Left Behind Act (Executive Summary). U.S. Department of Education.
- Walkington, C., & Marder, M. (2018). Using the UTeach observation protocol (UTOP) to understand the quality of mathematics instruction. ZDM Mathematics Education, 50, 507–519.
- Walsh, E., & Isenberg, E. (2013). How does a value-added model compare to the Colorado growth model? Mathematica Policy Research.
- Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. The New Teacher Project.
- West, M. R. (2016). Should non-cognitive skills be included in school accountability systems? Preliminary evidence from California’s CORE districts. Evidence Speaks Reports, 1(13), 1–7.
- Windschitl, M., Thompson, J., & Braaten, M. (2018). Ambitious science teaching. Harvard Education Press.
- Wise, A. E., Darling-Hammond, L., McLaughlin, M. W., & Bernstein, H. T. (1985). Teacher evaluation: A study of effective practices. The Elementary School Journal, 86(1), 60–121.
- Wragg, E. C. (1999). An introduction to classroom observation. Routledge.
- Yuan, K., Le, V., McCaffrey, D. F., Marsh, J. A., Hamilton, L. S., Stecher, B. M., & Springer, M. G. (2013). Incentive pay programs do not affect teacher motivation or reported practices: Results from three randomized studies. Educational Evaluation and Policy Analysis, 35(1), 3–22.
Author information
Authors and Affiliations
- UCLA, Los Angeles, CA, USA: María Paz Fernández & José Felipe Martínez
- María Paz Fernández