
Formative Assessment: Evidence-Based or Oversold?


Formative assessment, the integration of assessment with instruction or “assessment for learning,” has the potential to yield substantial learning gains (Black & Wiliam, 1998). Researchers often summarize the impact of programs and practices in terms of an effect size, which measures the difference between the treatment and control groups in standard deviation units. In this case, effect sizes would reflect the difference between a group that implemented a formative assessment intervention and a control group that did not. Across the approximately 20 quantitative studies reviewed by Black and Wiliam, the effect sizes of formative assessment interventions ranged from 0.4 to 0.7 standard deviations. This is a very large effect for an educational intervention. To put it in context, an effect size of 0.4 is equivalent to a gain of 16 percentile points (e.g., moving from the 50th percentile to the 66th percentile) and an effect size of 0.7 is equivalent to a gain of 26 percentile points (e.g., moving from the 50th percentile to the 76th percentile).[1]
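The percentile conversions above follow from the standard normal distribution: a student at the 50th percentile who gains *d* standard deviations moves to the percentile given by the normal CDF at *d*. A minimal sketch of that arithmetic (the function name is ours, not from any of the cited studies, and it assumes approximately normally distributed scores):

```python
from math import erf, sqrt

def percentile_gain(effect_size: float) -> float:
    """Percentile-point gain for a student starting at the 50th percentile,
    assuming scores are approximately normally distributed.

    The standard normal CDF is Phi(d) = 0.5 * (1 + erf(d / sqrt(2)));
    the gain is the new percentile minus the starting 50th percentile.
    """
    new_percentile = 100 * 0.5 * (1 + erf(effect_size / sqrt(2)))
    return new_percentile - 50

# Effect sizes from the Black & Wiliam (1998) range:
print(round(percentile_gain(0.4)))  # 16 percentile points
print(round(percentile_gain(0.7)))  # 26 percentile points
```

The same conversion applied to the smaller effect sizes discussed below (0.20 to 0.25) yields gains of roughly 8 to 10 percentile points.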

However, other researchers have argued that the effect size for formative assessment has been overstated. Kingston and Nash (2011) found that only 13 studies of formative assessment interventions provided sufficient information to calculate relevant effect sizes, and that the median observed effect size was 0.25, well below the 0.40–0.70 range. A more recent study of a formative assessment system in mathematics for kindergarten and first grade also had positive but smaller results, with effect sizes of 0.20 and 0.24, respectively (Lang, Schoen, LaVenia, & Oberlin, 2014). While still positive, the findings from these studies suggest that formative assessment interventions likely have a more modest impact, roughly an 8 to 10 percentile point gain in student achievement.

In a recent post, Heather Hill of Harvard’s Graduate School of Education questioned the efficacy of formative assessment in isolation from other shifts in teaching practice. She cites studies in which formative assessment or “data-driven instruction” interventions had little to no measurable impact on student outcomes. Such disappointing findings could be due to a lack of implementation fidelity, or could reflect differences in how we define or measure formative assessment. That is, the process of formative assessment may be defined and enacted differently from one place to another. It seems likely that, as Hill concludes, the effectiveness of formative assessment hinges on how it is situated within a broader package of teaching practices that support student learning. Formative assessment may be a necessary but not sufficient component of instructional quality. As Bennett (2011, p. 7) reasons, “well-designed and implemented formative assessment should be able to suggest how instruction should be modified, as well as suggest impressionistically to the teacher what students know and can do.”

We are big believers in the power of formative assessment, but we are also critical consumers of education research. While formative assessment has proven effective in some situations, it is not always successful. A related series of posts describes the conditions that must be met to ensure the effective use of classroom assessment. In our next post, we’ll discuss the role of principals in supporting teachers’ use of formative assessment.


References:

Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice, 18(1), 5–25.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74.

Kingston, N., & Nash, B. (2011). Formative assessment: A meta-analysis and a call for research. Educational Measurement: Issues and Practice, 30(4), 28–37.

Lang, L. B., Schoen, R. R., LaVenia, M., & Oberlin, M. (2014). Mathematics formative assessment system–Common Core State Standards: A randomized field trial in kindergarten and first grade. Society for Research on Educational Effectiveness.

About the Authors:

Dr. Cara Jackson is an evaluation support specialist in the Office of Shared Accountability of Montgomery County Public Schools in Maryland.

Dr. Amelia Wenk Gotwals is an associate professor in the Department of Teacher Education at Michigan State University in East Lansing, MI.

Dr. Beth Tarasawa is the manager for Education Research Partnerships at Northwest Evaluation Association in Portland, OR.

