Chasing the Golden Unicorn

Chasing the Golden Unicorn: A Critique of Objectivist Educational Research

The following is an edited version of a paper submitted as coursework for the Educational Psychology program at the University of New Mexico in May 2012.

There’s an old joke. Two elderly women are at a Catskill mountain resort. And one of them says, “Boy the food at this place is really terrible.” And the other one says, “Yeah, I know, and such small portions” (Joffe & Allen, 1977).

With the successful completion of this and one other course, I will complete the requirements for the Master of Arts program in Educational Psychology at the University of New Mexico (UNM). Two years ago in my application for admission to the program, I articulated my generalized academic interest in a single question, “Are educational practices consistent with cross-disciplinary knowledge?” I admit the question was asked from a skeptical viewpoint.

In this paper, I want to discuss “educational practices” in an expanded context that includes educational research. I realize that “educational research” generalizes a broad spectrum of differing worldviews regarding objectives, methods, and consequences. For the purposes of this paper, I will adopt the “philosophical continuum … anchored by an objectivist worldview on one end and a subjectivist worldview on the other” (Marley & Levin, 2011, p. 198). My comments are directed specifically to the objectivist end of that continuum because it is this worldview that currently prevails in setting the educational research and policy agenda in the United States, as evidenced by federal legislation (Eisenhart & Towne, 2003), federally funded clearinghouses for educational research (Slavin, 2008; Whitehurst, 2003), and leading publications in the field (Hsieh et al., 2005).

I contend that educational research is open to criticism due to three problematic issues which each result from a lack of awareness, or inadequate consideration, of inherent constraints and limitations:

  1. the commitment to an empirical, experimental approach to educational research is based on a belief;
  2. educational research has adopted standards and created expectations that cannot be achieved; and
  3. educational research ignores or under-appreciates the degree to which it does not exist in isolation from the rest of the environment in which it is situated.

This paper discusses each of these issues, then concludes with a summary.

1. Educational research is rooted in faith in science, not evidence.

If there is a linguistic triad upon which educational research rests, it surely must include some form of the terms scientific, evidence, and empirical. E.L. Thorndike is generally regarded as the first consequential educational psychologist. His pronouncement in 1906 that the success of teaching depends upon the profession’s embrace of scientific methods and investigation provides the cornerstone for the current orthodoxy in educational thinking (Mayer, 2005). This evangelical spirit lives on in the profession’s devotion to the notions ascribed to all things scientific, evidence-based, and empirical.

One problem with such specific declarations of faith, however, is that they inevitably invite the question, how well do they withstand self-reflexive scrutiny? In this case, we can ask of Thorndike and those who invoke the spirit of his assertion, how do you know? Is this fundamental belief in evidence-based, scientific, and empirical investigation derived from actual evidence-based, scientific, and empirical investigation? To my knowledge, the answer is no. Whether or not one finds this problematic is likely inversely proportional to the degree one believes it. After all, philosophically one must start somewhere, even if the foundation rests on a profession of faith rather than a demonstration of fact. (It might make for an interesting graduate study to consider how one could empirically test Thorndike’s assertion.)

My criticism is not with Thorndike’s faith in science but with the largely uncritical and singular focus on ‘scientific evidence’ that now drives the debate on educational research, policy, and funding. Within the educational literature, the three privileged terms mentioned previously have become de rigueur catchphrases that, like a tariff, must be paid their due in any serious discussion about education lest authors expose themselves to immediate rebuke. The ubiquity of these terms has so influenced the educational conversation that, from my standpoint, they are well on their way to becoming the first of R.E. Mayer’s six obstacles to educational reform: sloganeering (Mayer, 2005).

The flip side of indiscriminately exalting science in educational research is an unwarranted disparagement of all non-evidence-based practices (such as teachers sharing best practices and lessons learned from experience), which must be presumed to be included in such dismissive characterizations as superstitions, fancies, unverified guesses, and opinions (Mayer, 2005; Biesta, 2007).

A consequence of this faith in the applicability of scientific method to educational research is the general acceptance that the most scientific of methods, experiments using random assignment, represents the “gold standard” (Biesta, 2007; Hsieh et al., 2005; Maxwell, 2004). This inevitably results in a mindset that presumes an it, a singular right or best intervention or technique that lies beyond merely better. Therefore educational research, as advocated by its most ardent and rigor-focused adherents, is guided more by the quest for the golden best than by incremental and continual improvements. Is this quest for the best reasonable?

2. Educational research cannot achieve its own self-professed standards or expectations.

A second fundamental belief about educational research espoused by Thorndike is quoted by Dr. Joel Levin, a leading educational psychologist of the past four decades. Regarding the comparison of two teaching techniques, “… all that is required … is for enough teachers to use both methods with enough different classes, keeping everything else except the method constant …” (Levin, 2004, p. 173). This prescription suggests a simple, straightforward, commonsensical approach to empirically answering the question, which technique is best?

How well does this simple prescription hold up in practice, according to the current standards of rigorous educational research? Not very well, according to Dr. Levin and Dr. Scott Marley (my academic adviser at UNM). They argue that prescriptive statements, defined as “recommendation(s) that, if a course of action is taken, then a desirable outcome will likely occur,” are “hardly ever” justified (Marley & Levin, 2011, p. 205).

This seems to represent an irreconcilable catch for educational researchers. On the one hand, the only hope for achieving evidence-based results is strict adherence to a rigorous scientific research protocol. On the other hand, if one adheres to that scientific rigor, one can “hardly ever” claim actionable results. This leads to something of a conundrum: what is the legitimacy of a prescribed method that purports to produce prescriptive results, if those results are judged not to warrant the status of prescriptions? Does this not logically cast doubt on the prescribed method itself?

The field of educational research seems to be ensnared in a trap of its own making. Even as there are calls for more empirical research on educational interventions (Hsieh et al., 2005), there are concerns expressed about deficiencies in the rigor of the research that is done (Hargreaves, 1997) and doubts cast as to the relevance and applicability of the results of this research (Biesta, 2007; Lagemann, 2008). A skeptic might point out that in some sense, the thought-leaders in educational research have created a game with such stringent rules that few attempt to play and fewer still can play successfully. The frustrated exasperation of Casey Stengel, manager of the hapless 1962 New York Mets, comes to mind: can’t anybody here play this game? (Breslin & Veeck, 2002).

I can understand how, in the pursuit of respectability, the leaders of an academic field must set and maintain high standards. If you want to be perceived as being scientific, then by necessity you have to manifest some of the attitudes and practices generally recognized as scientific. However, I argue that in their zeal for credibility, the advocates for, and adherents to, educational research (again, those tending to the objectivist end of the philosophical continuum) may have “jumped the shark,” so to speak, in two respects.

First, they seem to discount a critical assumption in Thorndike’s advocacy of classroom intervention testing quoted by Levin: keeping everything else except the method constant (Levin, 2004, p. 173). Rephrasing the question posed by Marley and Levin (Marley & Levin, 2011), I think it’s fair to ask, “When in educational research is everything else except the method constant?” Barring an overly lenient definition of what constitutes everything or constant, the answer must surely transcend “hardly ever” to an emphatic, “never!” Indeed, this practical impossibility underlies external validity problems related to generalizing research results both to the target population and to different academic settings (Marley & Levin, 2011).

The neglect of this inherent limitation, that everything else can never be equal, is reflected in the educational slogan of choice over the past decade: what works. The educational literature is replete with references to what works (Biesta, 2007; Sanderson, 2003; Slavin, 2004; Slavin, 2008), no doubt because the federal government’s online repository for educational research is called the What Works Clearinghouse (Eisenhart & Towne, 2003; Whitehurst, 2003). The simple-minded slogan is by itself problematic, however, in that it implicitly presumes that what worked somewhere in the past will work everywhere in the future. No matter how many caveats or qualifiers are used to truthfully temper claims of what works (with whom, where, when, under what conditions, and why) (Biesta, 2007; Evans & Benefield, 2001; Hargreaves, 1997; Sanderson, 2003; Slavin, 2004), the slogan itself remains a cornerstone of the overall mindset that drives research goals and expectations … what works. Period, full stop.

Secondly, the proponents of more objectivist educational research forget that, especially in the research setting, the subjects are always human beings. This may sound like a silly and unwarranted criticism since, of course, educational research involves teachers and students. That fact is responsible for establishing requirements for Institutional Review Boards (IRBs) to oversee all research with human subjects.

Beyond the IRB, however, the research language employs abstractions of verbal and psychological constructions, intervention methodologies, and statistical determinations of significance. Independent variables are defined not in terms of teachers, but intervention techniques; dependent variables aren’t the students, but the assessment of their behaviors measured according to some kind of symbolized score. Teachers are not considered variable individuals themselves; they are simply interchangeable implementation vessels assigned to the experimental or control condition. Students are likewise simply data points differentiated solely by their assessment scores and only to the point where they (their scores) are aggregated into statistical means, deviations, significance levels, and effect sizes.

This presumed, elementalistic verbal separation of teaching technique from an individual teacher, and student performance from an individual student, can be contrasted to a different field of research that educational researchers like to point to as analogous: clinical trials (Eisenhart & Towne, 2003; Evans & Benefield, 2001; Hsieh et al., 2005; Mayer, 2003). While there are some challenges to the relevance of this comparison (Evans & Benefield, 2001), the most obvious difference to me is more basic. In clinical trials, the dependent variable consists of biological material in some form, such as a protein, enzyme, chemical, or bacterium. The change in the dependent variable can generally be measured to fine degrees of precision. In educational research, the dependent variable consists of the cognitively-enabled neurological behaviors of a human being, symbolized to some scored form. Of course, these cognitive behaviors also have a substrate of biological material at the neuronal level, but currently there are no means to measure the effects of an intervention at the level of the actual change; assessments are limited to symbols that suggest the effects of change, such as scores on a test or observed behavioral demonstrations. It stands to reason, then, that this difference should temper any attempts to draw meaningful comparisons between the two domains. I would argue that this difference is central to the many “uncontrollable variables” that differentiate the so-called hard and soft sciences (Diamond, 1987).

3. Educational research does not exist in isolation.

The difficulties facing educational research addressed thus far in this paper should not surprise anyone who acknowledges that “educational problems are, by definition, multifaceted and complex” (Lagemann, 2008, p. 424). What has surprised me during the course of my two-year study of educational psychology is the degree to which some researchers (and Department of Education policy-makers) are committed to the belief that education lends itself to scientific study – and methods – on a par with medicine (Mayer, 2003; Whitehurst, 2003). Such a belief can only be sustained if one presumes the external, or ecological, factors that influence educational research are the same as, or comparable to, those that affect clinical trials.

An inconvenient reality in educational research is that the more significant the matter under investigation, the greater the chance that considerations other than the research results will influence decisions. A perfect example of this was illustrated in a recent PBS Newshour report on how a Colorado high school was teaching climate science. A science teacher, Cheryl Manning, explained how, after being challenged by both students and parents about the legitimacy of scientific claims about climate change, she altered her classroom approach to rely on questions to students and student activities in which they would research and collect reported data on their own. Her technique had proven effective, and the report ended with a note that Manning “is now sharing the lessons she has learned with other teachers through online and in-person workshops throughout the country” (Sreenivasan, 2012).

Two aspects of this report run afoul of advocates for rigorous educational research. Manning’s experience is not evidence-based; therefore it amounts (in their view) to nothing more than opinion, hearsay, or fancy. And if there were evidence-based empirical research that demonstrably documented a “gold standard” intervention technique for teaching climate change in high schools, it would probably be trumped by the religious and political influences in the school district. Schools don’t live in an isolated academic bubble; neither do the fruits of educational research, meager as they are.

Religious and political factors are not the only environmental considerations that educators must consider in determining whether or not to implement a particular intervention or program. They must contend with local factors such as budget constraints, administration politics, parent advocates’ and/or activists’ concerns, state mandates, and ideological influences on textbook selections. These are in addition to, and compounded by, national “social, economic, political, and moral-ethical factors” (Lin et al., 2010).

Another factor that speaks to challenges encountered in educational research is the difficulty in replicating studies “due to a number of contextual factors and the lack of researcher control over these factors” (Lin et al., 2010). As a result, despite an acknowledged need for replication to provide support for generalizability, in an analysis of manuscripts submitted to the Journal of Teacher Education, “very few were self-categorized by authors as replications of previous studies” (Lin et al., 2010).

In short, educational decision-makers at all levels do not have the luxury of simply asking, “what’s the best evidence-based curriculum for early readers,” going to a Consumer Reports-type clearinghouse to find which curriculum is top-rated, and then implementing it. At best, the results of educational research can inform the decision-making process but they cannot eliminate the need for prudent judgment and consideration for the myriad other factors that confront policy makers, administrators, and teachers.

Concluding Summary

I have criticized the prevailing thinking of the objectivist end of the educational research continuum in three areas that, from my knothole, represent inherent constraints and limitations on research that are not generally acknowledged: 1) its faith-based commitment to evidence-based research; 2) its unrealistic standards and expectations; and 3) its insular focus that does not appropriately acknowledge the broader context or environment in which educational research is situated.

I should make explicit, in case my arguments have not been as finely targeted as intended, that my criticisms are not directed toward science or the desire to apply scientific methods to the field of educational research. My criticism arises from an impression that the pendulum has swung far from a center of balance, especially in the decade since No Child Left Behind. During this decade, the ideals of “evidence-based” and “scientific” research have, for the most part, marginalized other methods or practices without respect to whether or not they offer the potential of “working.” The mantra of “what works” has subtly shifted from focusing on the results of a program or intervention, to assessing the methodology by which the program or intervention has been scientifically or empirically tested. The evidence of interest has become more about the research process and less about the results. Initially driven by deficiencies in student achievement, the focus of the debate (or more correctly, a debate) across the range of the objectivist-subjectivist educational continuum now seems more preoccupied with perceived deficiencies in research paradigms and methodologies than actual student performance.

Consider the mixed results of the federal Internet-based What Works Clearinghouse (WWC), funded in 2002 by the Department of Education’s Institute of Education Sciences as the “single place for policy makers, practitioners, and parents to turn for information on what works in education” (Whitehurst, 2003). Operating with an initial 5-year contract worth $26 million, the WWC did not produce any “product” until 2004, causing critics to derisively refer to it as the “nothing works clearinghouse” (Viadero, 2006). A Government Accountability Office report in 2010 documented a litany of concerns with the WWC, including the fact that the WWC had failed to establish any cost/benefit metrics; its screening criteria excluded “some rigorous research designs that may be appropriate”; the dissemination of its results to states and school districts was not timely or adequate; and the WWC had failed to disclose potential conflicts of interest with respect to research that had been provided by textbook publishers and authors who stood to benefit financially (U.S. Government Accountability Office, 2010). After ten years, $80 million, and two different contractors, the WWC now offers online reviews of 275 different interventions with these effectiveness ratings:

  • Mixed Effects – 13
  • No Discernible Effects – 118
  • Potentially Negative Effects – 5
  • Potentially Positive Effects – 110
  • Positive Effects – 29
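To put those counts in proportion, the tally can be converted to shares of the 275 reviewed interventions. The sketch below is only an illustration: the counts are copied from the list above, while the percentages and the helper name `share` are my own additions and are not figures reported by the WWC.

```python
# Counts copied from the WWC ratings list above.
# The percentages are derived here; they are not reported in the source.
ratings = {
    "Mixed Effects": 13,
    "No Discernible Effects": 118,
    "Potentially Negative Effects": 5,
    "Potentially Positive Effects": 110,
    "Positive Effects": 29,
}

# Total number of reviewed interventions across all ratings.
total = sum(ratings.values())


def share(label):
    """Return the percentage of all reviewed interventions with this rating."""
    return 100 * ratings[label] / total


for label, count in ratings.items():
    print(f"{label}: {count} of {total} ({share(label):.1f}%)")
```

Run as written, the five counts sum to the 275 interventions cited in the text, and only about one review in ten carries the unqualified “Positive Effects” rating.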

The WWC performance has proved so lackluster that competition has emerged, such as the Best Evidence Encyclopedia (BEE), also funded by the Institute of Education Sciences (Slavin, 2008). One can imagine that in a few more years, we’ll need a clearinghouse just to review the performances of all the federally-funded “one-stop” clearinghouses.

The experience of the WWC represents a telling indictment that the train of thought that began with Thorndike’s profession of faith in the scientific method has derailed due to the single-minded, or mindless, attempts to hold educational research to inappropriate standards. In the pursuit of a theoretical gold standard, more prosaic thinking towards the ideal of continual improvement has been preemptively disparaged. At some point, the most ardent proponents of rigorous evidence-based research ought to recognize their own scientific imperative to test the actual results of their convictions: what evidence is there that this philosophy has successfully produced compelling evidence to guide educational decision-makers and educators?

While we wait for that day of reckoning, I’ll continue to find analogous relevance in the opening scene from Woody Allen’s Oscar-winning Annie Hall that prefaces this paper. For until there is evidence that evidence-based research in education can be added to the short list of “what works,” educators can expect more disagreement and dissatisfaction and argument about what counts as research, how research should be conducted, how the results are not conclusive or implementable or applicable to certain situations … and yet even with problematic evidence that what purportedly works doesn’t necessarily …  there’s just not enough of it.

References

Biesta, G. (2007). Why “what works” won’t work: Evidence-based practice and the democratic deficit in educational research. Educational Theory, 57(1), 1-22.

Breslin, J. & Veeck, B. (2002). Can’t anybody here play this game? The improbable saga of the New York Mets’ first year. Chicago: Ivan R. Dee.

Diamond, J. (1987). Soft sciences are often harder than hard sciences. Discover, 1987 (August), 34-37.

Eisenhart, M. & Towne, L. (2003). Contestation and change in national policy on “scientifically based” education research. Educational Researcher, 32(7), 31-38.

Evans, J. & Benefield, P. (2001). Systematic reviews of educational research: Does the medical model fit? British Educational Research Journal, 27(5), 527-541.

Hargreaves, D.H. (1997). In defence of research for evidence-based teaching: A rejoinder to Martyn Hammersley. British Educational Research Journal, 23(4), 405-419.

Hsieh, P-H., Taylor, A., Chung, W-H., Hsieh, Y-P., Kim, H., Thomas, G.D., et al. (2005). Is educational intervention research on the decline? Journal of Educational Psychology, 97(4), 523-529.

Joffe, C.H. (Producer) & Allen, W. (Director). (1977). Annie Hall. United States: MGM.

Lagemann, E.C. (2008). Education research as a distributed activity across universities. Educational Researcher, 37(7), 424-428.

Levin, J.R. (2004). Random thoughts on the (in)credibility of educational-psychological intervention research. Educational Psychologist, 39(3), 173-184.

Lin, E., Wang, J., Klecka, C.L., Odell, S.J., & Spalding, E. (2010). Judging research in teacher education. Journal of Teacher Education, 61(4), 295-301.

Marley, S.C. & Levin, J.R. (2011). When are prescriptive statements in educational research justified? Educational Psychology Review, 23(2), 197-206.

Maxwell, J.A. (2004). Causal explanation, qualitative research, and scientific inquiry in education. Educational Researcher, 33(2), 3-11.

Mayer, R.E. (2003). Learning environments: The case for evidence-based practice and issue-driven research. Educational Psychology Review, 15(4), 359-366.

Mayer, R.E. (2005). The failure of educational research to impact educational reform: Six obstacles to educational reform. In G.D. Phye, D.H. Robinson, & J.R. Levin (Eds.), Empirical methods for evaluating educational interventions (pp. 67-81). San Diego: Elsevier Academic Press.

Sanderson, I. (2003). Is it ‘what works’ that matters? Evaluation and evidence-based policy making. Research Papers in Education, 18(4), 331-345.

Slavin, R.E. (2004). Education research can and must address “what works” questions. Educational Researcher, 33(1), 27-28.

Slavin, R.E. (2008). Evidence-based reform in education: Which evidence counts? Educational Researcher, 37(1), 47-50.

Sreenivasan, H. (Writer and reporter). (2012). Teachers endure balancing act over climate change curriculum. In PBS Newshour. Washington, D.C.: Corporation for Public Broadcasting. Retrieved from http://www.pbs.org/newshour/bb/climate-change/jan-june12/teachclimate_05-02.html

U.S. Government Accountability Office. (2010, July). Improved dissemination and timely product release would enhance the usefulness of the What Works Clearinghouse. (Publication No. GAO-10-644). Retrieved from: http://www.gao.gov/products/GAO-10-644

Viadero, D. (2006). ‘One stop’ research shop seen as slow to yield views that educators can use. Education Week, September 26, 2006. Retrieved from http://www.edweek.org/ew/articles/2006/09/27/05whatworks.h26.html?p

Whitehurst, G.J. (2003). Statement of Assistant Secretary Grover J. Whitehurst before the House Subcommittee on Labor/HHS/Education Appropriations on the FY 2004 budget request for the Institute of Education Sciences. Retrieved from http://www2.ed.gov/news/speeches/2003/03/03132003a.html.