From Approved Solution to Dynamic Mosaic: A Personal Reflection on the Accelerating Evolution of Education

The following is an edited version of a paper submitted as coursework for the Educational Psychology program at the University of New Mexico in May 2012.

To situate myself by educational era, I learned how to “touch type” in a semester-long elective course in high school. I also learned how to use a slide rule, a technology that was necessary for the more mathematics-intensive courses offered by my small high school in Texas. This technology was the basis for statewide academic competition sponsored by the University Interscholastic League, just like band, drama, and football.

I did not participate in the slide rule competition, but I continued to develop my own proficiency with the technology throughout my college career at the U.S. Air Force Academy. The first handheld scientific calculator I saw, the HP-35, appeared during my sophomore year, but its cost was prohibitive for me and for most of my classmates. I took the required Computer Science course as a junior; the computer was a Burroughs mainframe programmed via IBM punch cards. We wrote programs in the ALGOL programming language, tediously typing the commands onto punch cards, then leaving the stack of cards to batch run overnight. The next morning we could pick up a printout that provided the output of our program – or more likely, the error codes with cryptic explanations for why the program failed to run, such as “undeclared variable in line 17.”

In my math and engineering courses, the instructors employed the instructional technique of “going to the boards.” Each wall in every classroom was covered with a blackboard. A portion of most class periods was spent with the cadets standing at the board, working out assigned problems under the scrutiny of the instructor. At the completion of the drill, he (the Academy’s faculty, as well as the Cadet Wing, was all male at that time) would walk through “the approved solution” to each problem.

Now approaching the 40th anniversary of the day I entered the Air Force Academy, I’ve attended my last class session and am about to graduate from the Educational Psychology Masters program at the University of New Mexico (UNM). The “blackboard” employed in this last class did not use chalk – Blackboard is the brand name of a Learning Management System (LMS) that was, ironically, displayed to the class by projection onto a whiteboard. Slide rules are now trivia answers or Halloween costume accessories. On most days, I carried two computers to my UNM class, one in a backpack and one in my pocket. The class itself was held in a computer lab with a couple of dozen networked computers available to students. So far as I know, punch cards haven’t been used in decades.

The purpose of this preface, however, is not to reminisce about the good old days but to establish a baseline from which we might consider the mind-blowing (or mind-numbing?) pace of technological change and what it might mean for education. I contend that we are not only well into an indeterminate period of “disruption” with respect to technologies themselves (Conole, de Laat, Dillon, & Darby, 2008; Henn, 2012), but we also find ourselves in great uncertainty about how these disruptive technologies have caused, and will cause, us to think differently about virtually all aspects of education.

In this paper I want to discuss how technology may well impact an area of education that is, in my opinion, ripe for widespread and long overdue disruption – research.

The Approved Solution

While my Academy instructors used references to “the approved solution” only in the context of mathematics and engineering problems, I’m going to appropriate the phrase as a metaphor. The phrase here serves to represent a mindset that believes there is always one answer, or approach, or process, that is right, best, only, or simply what works. In philosophical terms, one might say that “the approved solution” succinctly captures the gist of the paradigm or worldview that has been called positivism (Guba & Lincoln, 1994) or objectivism (Marley & Levin, 2011). I believe it is also fair to say that this worldview is the foundational philosophy that underlies the current attitude toward educational research, especially in the decade since No Child Left Behind and the unmistakable bias favoring evidence-based, empirical research (Eisenhart & Towne, 2003).

One of the nation’s leading educational psychologists, Dr. Joel Levin, visited UNM for two days in March 2012. I was fortunate to attend his two lectures. The first, “How to conduct more scientifically credible educational intervention research,” provided a comprehensive summation of the theory and practice for rigorous intervention experiments in educational research. In the second, “Practical points for the prospective professional publisher,” Dr. Levin shared his wisdom and experience regarding the academic publishing business for aspiring researchers who would prefer to publish rather than perish.

Among my many takeaways from these two lectures, I’ll mention two that are relevant to this paper. Regarding rigorous research, I was impressed that valid and reliable research in education is exceedingly difficult to achieve or assess, even among educational researchers. For the average teacher, the results of studies that are published in peer-reviewed journals are not usually understandable or meaningful. As a result, Dr. Levin noted there is a need for “translators” who are skilled in both analyzing research and communicating results to the layperson. Regarding his tips for publishing, he stepped through a typical timeline to illustrate that the process from research idea to publication is much longer than most students realized – almost three years.

This caused me to wonder not only about the timeliness of new research, but also about the age of research that is referenced in new research. To satisfy a curiosity, I compiled the age of references for two datasets from a course on principles of classroom learning, required in the UNM Educational Psychology program. The first dataset included all of the references in the textbook for the course (Bruning, Schraw, & Norby, 2011). The second dataset included a list of 21 articles assigned during the course, including all the references in those articles. Table 1 summarizes the data.

Table 1. Age of References

The textbook contained 1,250 unique references. The mean date of publication for those references was 1994, and the median date was 1996. Age-at-publication data are not included for the textbook, since it was originally published in 1990 and last updated in 2011. With a median publication date of 1996, the median age of the textbook’s references was 16 years in 2012. The 21 articles contained a total of 1,307 references, an average of 62 per article (median of 48). For those 1,307 references, the mean date of publication was 1986 and the median date was 1989. The mean age of a reference at the time its citing article was published was 10.4 years, with a median age at publication of 7.4 years. With a median publication date of 1989, the median age of the articles’ references was 23 years in 2012.
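
The arithmetic behind these figures is simple descriptive statistics. As a minimal sketch (in Python, with invented stand-in years rather than the actual course datasets), the computation looks like this:

```python
from statistics import mean, median

# Hypothetical illustration only -- the tuples below are invented stand-ins,
# not the actual UNM course datasets described above.
# Each tuple is (reference_publication_year, citing_work_publication_year).
references = [(1984, 2005), (1996, 2008), (1999, 2011), (1972, 2003), (1989, 2010)]

ref_years = [ref for ref, _ in references]
ages_at_publication = [citing - ref for ref, citing in references]

print("mean publication date:", round(mean(ref_years)))
print("median publication date:", median(ref_years))
print("mean age at publication:", round(mean(ages_at_publication), 1))
print("median age of references in 2012:", 2012 - median(ref_years))
```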

What do these data mean? I do not suggest any specific interpretation or inference is appropriate, other than these general observations. First, one would expect that the faster a domain is changing and discovering (or creating) new findings, the more recent the references will be in published books and articles. Second, the older the references are, the more one can assume that the knowledge in a domain is settled, accepted, and not as susceptible to change or challenge. And third, a preponderance of older references might suggest that more recent research is regarded as less relevant. In this context, therefore, a 3-year wait before the results of the latest educational research are published does not strike me as a concern.

Now I want to return to my takeaway from Dr. Levin’s first lecture and his suggestion that the field of educational research is in need of “translators” to help communicate the results of studies to educators. Spawned and funded by No Child Left Behind legislation, the federal Internet-based What Works Clearinghouse (WWC) was created in 2002 by the Department of Education’s Institute of Education Sciences as the “single place for policy makers, practitioners, and parents to turn for information on what works in education” (Whitehurst, 2003). Operating with an initial 5-year contract worth $26 million, the WWC did not produce any “product” until 2004, causing critics to derisively refer to it as the “nothing works clearinghouse” (Viadero, 2006). A Government Accountability Office report in 2010 documented a litany of concerns with the WWC: it had failed to establish any cost/benefit metrics; its screening criteria excluded “some rigorous research designs that may be appropriate;” its dissemination of results to states and school districts was not timely or adequate; and it had failed to disclose potential conflicts of interest with respect to research that had been provided by textbook publishers and authors who stood to benefit financially (U.S. Government Accountability Office, 2010). After ten years, $80 million, and two different contractors, the WWC now offers online reviews of 275 different interventions, with the results shown in Table 2.

Table 2. WWC Summary Results

The Effectiveness column refers to an assessment of the extent to which the intervention achieved the intended outcome, scored as Negative Effects, Potentially Negative Effects, No Discernible Effects, Mixed Effects, Potentially Positive Effects, and Positive Effects. Table 2 indicates only the number of interventions scored as Positive or Potentially Positive. The Extent of Evidence column is an indicator of sample size and number of studies included, which relates to generalizability. The three possible extent indicators are Not Rated, Small, and Medium-Large. This column in Table 2 reflects only the number of interventions with Medium-Large extent of evidence, of those that also had Positive or Potentially Positive Effects.

From a cost-effectiveness standpoint, for its $80 million investment over 10 years, the federal government has so far received 275 intervention reports ($0.291M per report), 139 reports that showed Positive or Potentially Positive Effects ($0.575M per report), and 34 reports that showed Positive or Potentially Positive Effects with a Medium-Large Extent of Evidence ($2.35M per report). Of those 34, only one was in Math and Science. I would also point out that of those 34 “highest rated” interventions, 27 (79%) are commercial textbook or curriculum offerings whose publishers have a financial interest in the WWC ratings.
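
These cost-per-report figures are simple division over the same $80 million; the following sketch merely reproduces the arithmetic, rounded to two decimals:

```python
# Reproducing the cost-per-report arithmetic quoted above.
total_cost_millions = 80.0  # 10-year federal investment in the WWC
report_counts = {
    "all intervention reports": 275,
    "Positive or Potentially Positive Effects": 139,
    "...plus Medium-Large Extent of Evidence": 34,
}
for label, n in report_counts.items():
    print(f"{label}: ${total_cost_millions / n:.2f}M per report")
```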

To summarize this “approved solution” mindset in educational research, based on this brief analysis:

  1. it reflects a positivist or objectivist worldview that is beholden to strict and rigorous scientific method;
  2. only research that employs such scientific methods can be deemed to provide evidence on which educational policy and practice decisions should be based;
  3. most results of educational research studies are dated and not understandable by everyday teachers; and
  4. given the 10-year results of the WWC, only a modest number of credible or definitive research results have been reported, and those positive results have come at a high cost.

The Developing Dynamic Mosaic

Any brief analysis of why technology matters to education risks trivializing the subject, but I will note a few things to consider.

  • There are families of hardware devices, software services, web or cloud-based capabilities, and applications and platforms that scale from the individual to the enterprise.
  • There are advantages that can be characterized by cost, speed, accessibility (in terms of location, disability, disadvantage), learning modality, pacing, interests, individual need, and institutional scale.
  • The scope of technology applications crosses domains and dimensions, from pre-school to adult, formal education to personal enrichment to corporate training, online distance learning to classroom seminars, scheduled synchronous connections to asynchronous on-demand access.

One manifestation of how digital technologies are disrupting education is the degree to which they are disrupting, or influencing, every aspect of our cultural and social lives. Students of all ages who meet even the most rudimentary levels of technology literacy have adopted and integrated these capabilities in their own lives such that “it is central to how they organize and orientate their learning” (Conole, de Laat, Dillon, & Darby, 2008). Our always-on communication devices connect us with friends, family, music, video, games, Facebook, Twitter, email, Flickr, and the rest of the online world all the time. Some educators and researchers want to take this momentum and ride it to an educational philosophy that goes beyond “learning anytime and anywhere” to “learning all the time and everywhere” (Cook, 2012). Just as previous generations sought a work|life balance before idealizing the concept of seamlessly integrating a distinction-less work-as-life, the current generation may be the first to recognize no distinction between living and learning.

The disruption to education has been so tumultuous it has spawned a theory of learning by George Siemens, now affiliated with Athabasca University, Canada’s open university. He has proposed what he calls connectivism, “an integration of principles explored by chaos, network, complexity, and self-organization theories” (Siemens, 2005).

The rate of change has been so swift that the relatively recent incorporation of Learning Management Systems (LMS) as integral components of university online offerings is already viewed by some as an institutional albatross. Rather than the centralized, capable, but cumbersome do-everything system that serves the university, some academics are now advocating for a student-centered approach they call Personal Learning Environments, or PLE (Mott, 2010; Tu, Sujo-Montes, Yeh, Chan, & Blocher, 2012). Tu, Sujo-Montes, Yeh, Chan, and Blocher attribute three characteristics to a PLE: students set their own learning goals, they manage their own learning process and content, and they interact with others throughout their learning process (p. 14). Mott contrasts the PLE with the LMS by describing it as “the educational manifestation of the web’s ‘small pieces loosely joined’.”

I think a better name than “small pieces loosely joined” is dynamic mosaic.

Two examples that illustrate where the momentum of this dynamic mosaic may be heading are the educational initiatives begun by Sebastian Thrun and Salman Khan.

Thrun might be the nearest thing there is to a digital renaissance man. He is the visionary behind Google’s driverless car project and its new Project Glass eyewear. He is also the head of Stanford’s Artificial Intelligence (AI) lab. Last year, he and a colleague, Peter Norvig, offered a free online course in AI through Stanford. They were astounded when over 160,000 people around the world enrolled. Thrun has since given up some of his teaching responsibilities at Stanford and raised venture capital to found udacity.com. The new site offers free, world-class courses that focus primarily on computing-related topics. The philosophy that drives Udacity is a commitment to “free online education for everybody” (Henn, 2012).

Salman Khan, founder of the Khan Academy and holder of multiple degrees from MIT and Harvard Business School, consented to help a 13-year-old cousin with math in 2004. From phone calls to Yahoo chats to crude videos, Khan gradually built a library of short (generally less than 10 minutes) video tutorials covering a variety of classroom lessons and posted them to YouTube. The number of views rapidly climbed to over a million. So began a dizzying ascent to a position of influence in education such that he has attracted funding from the Bill and Melinda Gates Foundation and Google, as well as being profiled by “60 Minutes” and the New York Times. His YouTube channel is approaching an astonishing 150 million views. His Khan Academy website offers over 3,100 videos where one can “learn almost anything for free.”

In late 2010, Khan used some of his funding to hire developers to create a teacher management system to monitor and direct students in a classroom that uses Khan videos extensively. The students can proceed at their own pace while the teacher follows their progress, or their stumbles, on her own monitor. The system also provides real-time data on progress, quiz scores, incentive rewards reached, how much time is spent on each module, and other relevant data. That kind of ongoing, real-time information in the teacher’s hands must be the envy of “old school” researchers.

At the other end of the scale are the data associated with those nearly 150 million tutorial-video views. Khan and his associates are now mining that “massive pile of data about how people learn and where they get stuck” (Thompson, 2011). Their objective is to develop algorithms that can be used to tailor or customize lessons for students based on variables such as how many times a video should be viewed before taking a quiz, correlations among results from different subject areas, and how to spot someone who’s stuck on a concept.
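
Khan Academy’s actual algorithms are proprietary and are not described in the source reporting; purely as an illustration of the kind of heuristic such data mining might begin with, here is a naive, hypothetical “stuck” detector (the function, thresholds, and scores below are invented):

```python
# Purely illustrative heuristic, NOT Khan Academy's actual algorithm:
# flag a learner as "stuck" on a concept if, after several attempts,
# recent quiz scores show no improvement over earlier ones.
def looks_stuck(scores, min_attempts=4, window=2):
    """scores: chronological quiz scores (0-100) for one learner and concept."""
    if len(scores) < min_attempts:
        return False  # not enough evidence yet
    early = sum(scores[:window]) / window
    recent = sum(scores[-window:]) / window
    return recent <= early  # no measurable progress across attempts

print(looks_stuck([40, 45, 42, 38]))  # True: scores flat or falling
print(looks_stuck([40, 45, 60, 75]))  # False: clear improvement
```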

More and more individuals and organizations are launching education initiatives based on open source, non-proprietary standards at no, or low, cost. These include a variety of providers that target different constituencies, such as iTunes U, TED Talks, Connexions, Bill Hammack’s EngineerGuy.com, Google Code U, PBS Teachers, and YouTube Edu.

To summarize the key characteristics of the emerging dynamic mosaic enabled by online technologies:

  • there is increasing focus on the needs and expectations of the learner rather than the provider;
  • platforms cross technologies and modalities;
  • high quality (even world-class) lessons are offered for free; and
  • data on user progress, time on task, and other key metrics are collected and in some cases analyzed in near real-time.

From Approved Solution to Dynamic Mosaic

I close by highlighting what’s missing in the evolution from the educational research mindset of the approved solution to the development of the dynamic mosaic.

In fact, I contend that nothing is missing. So far as I know, the educational research establishment had nothing to do with the development of online technologies. The technologies and capabilities that are beginning to be woven into the ever-developing mosaic of online education did not wait for educational research to give a thumbs-up that online education would or ought to work.

Of course, that’s not to say that the mosaic cannot be improved for future learners. But I will go out on a limb and speculate that the kind of educational research that’s been in search of the “approved solution” is rapidly becoming as obsolete as my old slide rule. When a provider like Salman Khan is collecting real-time data on student learning during actual classroom activities, the need for rigorous and expensive experimental trials that produce inconclusive and difficult-to-understand results is obviated.

Some progress just doesn’t want to wait for research.

References

Bruning, R.H., Schraw, G.J., & Norby, M.M. (2011). Cognitive Psychology and Instruction (5th ed.). Boston: Pearson Education.

Conole, G., de Laat, M., Dillon, T., & Darby, J. (2008). ‘Disruptive technologies’, ‘pedagogical innovation’: What’s new? Findings from an in-depth study of students’ use and perception of technology. Computers and Education, 50(2), 511-524.

Cook, V. (2012). Learning everywhere, all the time. The Delta Kappa Gamma Bulletin, 78(3), 48-51.

Eisenhart, M. & Towne, L. (2003). Contestation and change in national policy on “scientifically based” education research. Educational Researcher, 32(7), 31-38.

Guba, E. G., & Lincoln, Y. S. (1994). Competing paradigms in qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of Qualitative Research (pp. 105–117). Thousand Oaks: Sage Publications.

Henn, S. (January 23, 2012). Stanford takes online schooling to the next academic level. Retrieved from http://www.npr.org/blogs/alltechconsidered/2012/01/23/145645472/stanford-takes-online-schooling-to-the-next-academic-level

Marley, S.C. & Levin, J.R. (2011). When are prescriptive statements in educational research justified? Educational Psychology Review, 23(2), 197-206.

Mott, J. (2010). Envisioning the Post-LMS Era: The Open Learning Network. EDUCAUSE Quarterly, 33(1). Retrieved from http://www.educause.edu/EDUCAUSE+Quarterly/EDUCAUSEQuarterlyMagazineVolum/EnvisioningthePostLMSEraTheOpe/199389

Siemens, G. (April 5, 2005). Connectivism: A Learning Theory for the Digital Age. Retrieved from http://www.elearnspace.org/Articles/connectivism.htm

Thompson, C. (July 15, 2011). How Khan Academy is changing the rules of education. Wired, August 2011. Retrieved from http://www.wired.com/magazine/2011/07/ff_khan/all/1

Tu, C-H., Sujo-Montes, L., Yeh, C-J., Chan, J-Y., & Blocher, M. (2012). Personal Learning Environments & Open Network Learning Environments. TechTrends, May/June 2012. Retrieved from http://www.springerlink.com/content/36541j5782346770/

U.S. Government Accountability Office. (2010, July). Improved dissemination and timely product release would enhance the usefulness of the What Works Clearinghouse. (Publication No. GAO-10-644). Retrieved from http://www.gao.gov/products/GAO-10-644

Viadero, D. (2006). ‘One stop’ research shop seen as slow to yield views that educators can use. Education Week, September 26, 2006. Retrieved from http://www.edweek.org/ew/articles/2006/09/27/05whatworks.h26.html?p

Whitehurst, G.J. (2003). Statement of Assistant Secretary Grover J. Whitehurst before the House Subcommittee on Labor/HHS/Education Appropriations on the FY 2004 budget request for the Institute of Education Sciences. Retrieved from http://www2.ed.gov/news/speeches/2003/03/03132003a.html

Chasing the Golden Unicorn: A Critique of Objectivist Educational Research

The following is an edited version of a paper submitted as coursework for the Educational Psychology program at the University of New Mexico in May 2012.

There’s an old joke. Two elderly women are at a Catskill mountain resort. And one of them says, “Boy the food at this place is really terrible.” And the other one says, “Yeah, I know, and such small portions” (Joffe & Allen, 1977).

With the successful completion of this and one other course, I will fulfill the requirements for the Master of Arts program in Educational Psychology at the University of New Mexico (UNM). Two years ago in my application for admission to the program, I articulated my generalized academic interest in a single question, “Are educational practices consistent with cross-disciplinary knowledge?” I admit the question was asked from a skeptical viewpoint.

In this paper, I want to discuss “educational practices” in an expanded context that includes educational research. I realize that “educational research” generalizes a broad spectrum of differing worldviews regarding objectives, methods, and consequences. For the purposes of this paper, I will adopt the “philosophical continuum … anchored by an objectivist worldview on one end and a subjectivist worldview on the other” (Marley & Levin, 2011, p.198). My comments are directed specifically to the objectivist end of that continuum because it is this worldview that currently prevails in setting the educational research and policy agenda in the United States, as evidenced by federal legislation (Eisenhart & Towne, 2003), federally-funded clearinghouses for educational research (Slavin, 2008; Whitehurst, 2003), and leading publications in the field (Hsieh et al., 2005).

I contend that educational research is open to criticism due to three problematic issues which each result from a lack of awareness, or inadequate consideration, of inherent constraints and limitations:

  1. the commitment to an empirical, experimental approach to educational research is based on a belief;
  2. educational research has adopted standards and created expectations that cannot be achieved; and
  3. educational research ignores or under-appreciates the degree to which it does not exist in isolation from the rest of the environment in which it is situated.

This paper discusses each of these issues, then concludes with a summary.

1. Educational research is rooted in faith in science, not evidence.

If there is a linguistic triad upon which educational research rests, it surely must include some form of the terms scientific, evidence, and empirical. E.L. Thorndike is generally regarded as the first consequential educational psychologist. His pronouncement in 1906 that the success of teaching depends upon the profession’s embrace of scientific methods and investigation provides the cornerstone for the current orthodoxy in educational thinking (Mayer, 2005). This evangelical spirit lives on in the profession’s devotion to the notions ascribed to all things scientific, evidence based, and empirical.

One problem with such specific declarations of faith, however, is that they inevitably invite the question, how well do they withstand self-reflexive scrutiny? In this case, we can ask of Thorndike and those who invoke the spirit of his assertion, how do you know? Is this fundamental belief in evidence-based, scientific, and empirical investigation derived from actual evidence-based, scientific, and empirical investigation? To my knowledge, the answer is no. Whether or not one finds this problematic is likely inversely proportional to the degree one believes it. After all, philosophically one must start somewhere, even if the foundation rests on a profession of faith rather than a demonstration of fact. (It might make for an interesting graduate study to consider how one could empirically test Thorndike’s assertion.)

My criticism is not with Thorndike’s faith in science but with the largely uncritical and singular focus on “scientific evidence” that now drives the debate on educational research, policy, and funding. Within the educational literature, the three privileged terms mentioned previously have become de rigueur catch phrases that, like a tariff, must be paid their due in any serious discussion about education lest authors expose themselves to immediate rebuke. The ubiquity of these terms has so influenced the educational conversation that, from my standpoint, they are well on their way to becoming the first of R.E. Mayer’s six obstacles to educational reform — sloganeering (Mayer, 2005).

The flip side of indiscriminately exalting science in educational research is an unwarranted disparagement of all non-evidence-based practices (such as teachers sharing best practices and lessons learned from experience), which must be presumed to be included in such dismissive characterizations as superstitions, fancies, unverified guesses, and opinions (Mayer, 2005; Biesta, 2007).

A consequence of this faith in the applicability of scientific method to educational research is the general acceptance that the most scientific of methods — experiments using random assignment — represents the “gold standard” (Biesta, 2007; Hsieh et al., 2005; Maxwell, 2004). This inevitably results in a mindset that presumes an it, a singular right or best intervention or technique that lies beyond merely better. Therefore educational research, as advocated by its most ardent and rigor-focused adherents, is guided more by the quest for the golden best rather than incremental and continual improvements. Is this quest for the best reasonable?

2. Educational research cannot achieve its own self-professed standards or expectations.

A second fundamental belief about educational research espoused by Thorndike is quoted by Dr. Joel Levin, a leading educational psychologist of the past four decades. Regarding the comparison of two teaching techniques, “… all that is required … is for enough teachers to use both methods with enough different classes, keeping everything else except the method constant …” (Levin, 2004, p. 173). This simple prescription suggests a simple, straightforward, common sensical approach to empirically answering the question, which technique is best?

How well does this simple prescription hold up in practice, according to the current standards of rigorous educational research? Not very well, according to Dr. Levin and Dr. Scott Marley (my academic adviser at UNM). They argue that prescriptive statements, defined as “recommendation(s) that, if a course of action is taken, then a desirable outcome will likely occur,” are “hardly ever” justified (Marley & Levin, 2011, p. 205).

This seems to represent an irreconcilable catch-22 for educational researchers. On the one hand, the only hope for achieving evidence-based results is strict adherence to a rigorous scientific research protocol. On the other hand, if one adheres to that scientific rigor, one can “hardly ever” claim actionable results. This leads to something of a conundrum — what is the legitimacy of a prescribed method that purports to produce prescriptive results, if those results are judged not to warrant use as prescriptions? Does this not logically cast doubt on the prescribed method itself?

The field of educational research seems to be ensnared in a trap of its own making. Even as there are calls for more empirical research on educational interventions (Hsieh et al., 2005), there are concerns expressed about deficiencies in the rigor of the research that is done (Hargreaves, 1997) and doubts cast as to the relevance and applicability of the results of this research (Biesta, 2007; Lagemann, 2008). A skeptic might point out that in some sense, the thought-leaders in educational research have created a game with such stringent rules that few attempt to play and fewer still can play successfully. The frustrated exasperation of Casey Stengel, manager of the hapless 1962 New York Mets, comes to mind: can’t anybody here play this game? (Breslin & Veeck, 2002).

I can understand how, in the pursuit of respectability, the leaders of an academic field must set and maintain high standards. If you want to be perceived as being scientific, then by necessity you have to manifest some of the attitudes and practices generally recognized as scientific. However, I argue that in their zeal for credibility, the advocates for, and adherents to, educational research (again, those tending to the objectivist end of the philosophical continuum) may have “jumped the shark,” so to speak, in two respects.

First, they seem to discount a critical assumption in Thorndike’s advocacy of classroom intervention testing quoted by Levin: keeping everything else except the method constant (Levin, 2004, p. 173). Rephrasing the question posed by Marley and Levin (2011), I think it’s fair to ask, “When in educational research is everything else except the method constant?” Barring an overly lenient definition of what constitutes everything or constant, the answer must surely transcend “hardly ever” to an emphatic, “never!” Indeed, this practical impossibility underlies external validity problems related to generalizing research results both to the target population and to different academic settings (Marley & Levin, 2011).

The neglect of this inherent limitation, that everything else can never be equal, is reflected in the educational slogan of choice over the past decade — what works. The educational literature is replete with references to what works (Biesta, 2007; Sanderson, 2003; Slavin, 2004; Slavin, 2008), no doubt because the federal government’s online repository for educational research is called the What Works Clearinghouse (Eisenhart & Towne, 2003; Whitehurst, 2003). The simple-minded slogan is by itself problematic, however, in that it implicitly presumes that what worked somewhere in the past will work everywhere in the future. No matter how many caveats or qualifiers are used to truthfully temper claims of what works — with whom, where, when, under what conditions, and why (Biesta, 2007; Evans & Benefield, 2001; Hargreaves, 1997; Sanderson, 2003; Slavin, 2004) — the slogan itself remains a cornerstone to the overall mindset that drives research goals and expectations … what works. Period, full stop.

Second, the proponents of more objectivist educational research forget that, especially in the research setting, the subjects are always human beings. This may sound like a silly and unwarranted criticism since, of course, educational research involves teachers and students. That fact is responsible for establishing requirements for Institutional Review Boards (IRBs) to oversee all research with human subjects.

Beyond the IRB, however, the research language employs abstractions of verbal and psychological constructions, intervention methodologies, and statistical determinations of significance. Independent variables are defined not in terms of teachers, but intervention techniques; dependent variables aren’t the students, but the assessment of their behaviors measured according to some kind of symbolized score. Teachers are not considered variable individuals themselves — they are simply interchangeable implementation vessels assigned to the experiment or control condition. Students are likewise simply data points differentiated solely by their assessment scores and only to the point where they (their scores) are aggregated into statistical means, deviations, significance levels, and effect sizes.

This presumed, elementalistic verbal separation of teaching technique from an individual teacher, and student performance from an individual student, can be contrasted to a different field of research that educational researchers like to point to as analogous — clinical trials (Eisenhart & Towne, 2003; Evans & Benefield, 2001; Hsieh et al., 2005; Mayer, 2003). While there are some challenges to the relevance of this comparison (Evans & Benefield, 2001), the most obvious difference to me is more basic. In clinical trials, the dependent variable consists of biological material in some form, such as a protein, enzyme, chemical, or bacterium. The change in the dependent variable can generally be measured to fine degrees of precision. In educational research, the dependent variable consists of the cognitively-enabled neurological behaviors of a human being, symbolized in some scored form. Of course, these cognitive behaviors also have a substrate of biological material at the neuronal level, but currently there are no means to measure the effects of an intervention at the level of the actual change; assessments are limited to symbols that suggest the effects of change, such as scores on a test or observed behavioral demonstrations. It stands to reason, then, that this difference should temper any attempts to draw meaningful comparisons between the two domains. I would argue that this difference is central to the many “uncontrollable variables” that differentiate the so-called hard and soft sciences (Diamond, 1987).

3. Educational research does not exist in isolation.

The difficulties facing educational research addressed thus far in this paper should not surprise anyone who acknowledges that “educational problems are, by definition, multifaceted and complex” (Lagemann, 2008, p. 424). What has surprised me during the course of my two-year study of educational psychology is the degree to which some researchers (and Department of Education policy-makers) are committed to the belief that education lends itself to scientific study – and methods – on a par with medicine (Mayer, 2003; Whitehurst, 2003). Such a belief can only be sustained if one presumes the external, or ecological, factors that influence educational research are the same as, or comparable to, those that affect clinical trials.

An inconvenient reality in educational research is that the more significant the matter under investigation, the greater the chance that considerations other than the research results will influence decisions. A perfect example was illustrated in a recent PBS Newshour report on how a Colorado high school was teaching climate science. A science teacher, Cheryl Manning, explained how, after being challenged by both students and parents about the legitimacy of scientific claims about climate change, she altered her classroom approach to rely on questioning and on activities in which students research and collect published data on their own. Her technique had proven effective, and the report ended with a note that Manning “is now sharing the lessons she has learned with other teachers through online and in-person workshops throughout the country” (Sreenivasan, 2012).

Two aspects of this report run afoul of advocates for rigorous educational research. Manning’s experience is not evidence-based; therefore it amounts (in their view) to nothing more than opinion, hearsay, or fancy. And if there were evidence-based empirical research that demonstrably documented a “gold standard” intervention technique for teaching climate change in high schools, it would probably be trumped by the religious and political influences in the school district. Schools don’t live in an isolated academic bubble; neither do the fruits of educational research, meager as they are.

Religious and political factors are not the only environmental considerations that educators must consider in determining whether or not to implement a particular intervention or program. They must contend with local factors such as budget constraints, administration politics, parent advocates’ and/or activists’ concerns, state mandates, and ideological influences on textbook selections. These are in addition to, and compounded by, national “social, economic, political, and moral-ethical factors” (Lin et al., 2010).

Another factor that speaks to challenges encountered in educational research is the difficulty in replicating studies “due to a number of contextual factors and the lack of researcher control over these factors” (Lin et al., 2010). As a result, despite an acknowledged need for replication to provide support for generalizability, in an analysis of manuscripts submitted to the Journal of Teacher Education, “very few were self-categorized by authors as replications of previous studies” (Lin et al., 2010).

In short, educational decision-makers at all levels do not have the luxury of simply asking, “what’s the best evidence-based curriculum for early readers,” going to a Consumer Reports-type clearinghouse to find which curriculum is top-rated, and then implementing it. At best, the results of educational research can inform the decision-making process but they cannot eliminate the need for prudent judgment and consideration for the myriad other factors that confront policy makers, administrators, and teachers.

Concluding Summary

I have criticized the prevailing thinking of the objectivist end of the educational research continuum in three areas that, from my knothole, represent inherent constraints and limitations on research that are not generally acknowledged: 1) its faith-based commitment to evidence-based research; 2) its unrealistic standards and expectations; and 3) its insular focus that does not appropriately acknowledge the broader context or environment in which educational research is situated.

I should make explicit, in case my arguments have not been as finely targeted as intended, that my criticisms are not directed toward science or the desire to apply scientific methods to the field of educational research. My criticism arises from an impression that the pendulum has swung far from a center of balance, especially in the decade since No Child Left Behind. During this decade, the ideals of “evidence based” and “scientific” research have, for the most part, marginalized other methods or practices without respect to whether or not they offer the potential of “working.” The mantra of “what works” has subtly shifted from focusing on the results of a program or intervention, to assessing the methodology by which the program or intervention has been scientifically or empirically tested. The evidence of interest has become more about the research process and less about the results. Initially driven by deficiencies in student achievement, the focus of the debate (or more correctly, a debate) across the range of the objectivist|subjectivist educational continuum now seems more preoccupied with perceived deficiencies in research paradigms and methodologies than actual student performance.

Consider the mixed results of the federal Internet-based What Works Clearinghouse (WWC), funded in 2002 by the Department of Education’s Institute of Education Sciences as the “single place for policy makers, practitioners, and parents to turn for information on what works in education” (Whitehurst, 2003). Operating with an initial 5-year contract worth $26 million, the WWC did not produce any “product” until 2004, causing critics to derisively refer to it as the “nothing works clearinghouse” (Viadero, 2006). A Government Accountability Office report in 2010 documented a litany of concerns with the WWC: it had failed to establish any cost/benefit metrics; its screening criteria excluded “some rigorous research designs that may be appropriate;” its dissemination of results to states and school districts was not timely or adequate; and it had failed to disclose potential conflicts of interest with respect to research that had been provided by textbook publishers and authors who stood to benefit financially (U.S. Government Accountability Office, 2010). After ten years, $80 million, and two different contractors, the WWC now offers online reviews of 275 different interventions with these effectiveness ratings:

  • Mixed Effects – 13
  • No Discernible Effects – 118
  • Potentially Negative Effects – 5
  • Potentially Positive Effects – 110
  • Positive Effects – 29

The WWC performance has proved so lackluster that competition has emerged, such as the Best Evidence Encyclopedia (BEE), also funded by the Institute of Education Sciences (Slavin, 2008). One can imagine that in a few more years, we’ll need a clearinghouse just to review the performances of all the federally-funded “one-stop” clearinghouses.

The experience of the WWC represents a telling indictment that the train of thought that began with Thorndike’s profession of faith in the scientific method has derailed due to single-minded, or mindless, attempts to hold educational research to inappropriate standards. In the pursuit of a theoretical gold standard, more prosaic thinking toward the ideal of continual improvement has been preemptively disparaged. At some point, the most ardent proponents of rigorous evidence-based research ought to recognize their own scientific imperative to test the actual results of their convictions: what evidence is there that this philosophy has successfully produced compelling evidence to guide educational decision-makers and educators?

While we wait for that day of reckoning, I’ll continue to find analogous relevance in the opening scene from Woody Allen’s Oscar-winning Annie Hall that prefaces this paper. For until there is evidence that evidence-based research in education can be added to the short list of “what works,” educators can expect more disagreement and dissatisfaction and argument about what counts as research, how research should be conducted, how the results are not conclusive or implementable or applicable to certain situations … and yet even with problematic evidence that what purportedly works doesn’t necessarily … there’s just not enough of it.

References

Biesta, G. (2007). Why “what works” won’t work: Evidence-based practice and the democratic deficit in educational research. Educational Theory, 57(1), 1-22.

Breslin, J. & Veeck, B. (2002). Can’t anybody here play this game? The improbable saga of the New York Mets’ first year. Chicago: Ivan R. Dee.

Diamond, J. (1987). Soft sciences are often harder than hard sciences. Discover, 1987 (August), 34-37.

Eisenhart, M. & Towne, L. (2003). Contestation and change in national policy on “scientifically based” education research. Educational Researcher, 32(7), 31-38.

Evans J. & Benefield, P. (2001). Systematic reviews of educational research: does the medical model fit? British Educational Research Journal, 27(5), 527-541.

Hargreaves, D.H. (1997). In defence of research for evidence-based teaching: A rejoinder to Martyn Hammersley. British Educational Research Journal, 23(4), 405-419.

Hsieh, P-H., Taylor, A., Chung, W-H., Hsieh, Y-P., Kim, H., Thomas, G.D., et al. (2005). Is educational intervention research on the decline? Journal of Educational Psychology, 97(4), 523-529.

Joffe, C.H. (Producer) & Allen, W. (Director). (1977). Annie Hall. United States: MGM.

Lagemann, E.C. (2008). Education research as a distributed activity across universities. Educational Researcher, 37(7), 424-428.

Levin, J.R. (2004). Random thoughts on the (in)credibility of educational-psychological intervention research. Educational Psychologist, 39(3), 173-184.

Lin, E., Wang, J., Klecka, C.L., Odell, S.J., & Spalding, E. (2010). Judging research in teacher education. Journal of Teacher Education, 61(4), 295-301.

Marley, S.C. & Levin, J.R. (2011). When are prescriptive statements in educational research justified? Educational Psychology Review, 23(2), 197-206.

Maxwell, J.A. (2004). Causal explanation, qualitative research, and scientific inquiry in education. Educational Researcher, 33(2), 3-11.

Mayer, R.E. (2003). Learning environments: The case for evidence-based practice and issue-driven research. Educational Psychology Review, 15(4), 359-366.

Mayer, R. E. (2005). The failure of educational research to impact educational reform: Six obstacles to educational reform. In G. D. Phye, D. H. Robinson, & J. R. Levin (Eds.), Empirical methods for evaluating educational interventions (pp. 67–81). San Diego: Elsevier Academic Press.

Sanderson, I. (2003). Is it ‘what works’ that matters? Evaluation and evidence-based policy making. Research Papers in Education, 18(4), 331-345.

Slavin, R.E. (2004). Education research can and must address “what works” questions. Educational Researcher, 33(1), 27-28.

Slavin, R.E. (2008). Evidence-based reform in education: Which evidence counts? Educational Researcher, 37(1), 47-50.

Sreenivasan, H. (Writer and reporter). (2012). Teachers endure balancing act over climate change curriculum. In PBS Newshour. Washington, D.C.: Corporation for Public Broadcasting. Retrieved from http://www.pbs.org/newshour/bb/climate-change/jan-june12/teachclimate_05-02.html

U.S. Government Accountability Office. (2010, July). Improved dissemination and timely product release would enhance the usefulness of the What Works Clearinghouse. (Publication No. GAO-10-644). Retrieved from http://www.gao.gov/products/GAO-10-644

Viadero, D. (2006). ‘One stop’ research shop seen as slow to yield views that educators can use. Education Week, September 26, 2006. Retrieved from http://www.edweek.org/ew/articles/2006/09/27/05whatworks.h26.html?p

Whitehurst, G.J. (2003). Statement of Assistant Secretary Grover J. Whitehurst before the House Subcommittee on Labor/HHS/Education Appropriations on the FY 2004 budget request for the Institute of Education Sciences. Retrieved from http://www2.ed.gov/news/speeches/2003/03/03132003a.html

 

Evaluation of the Strong Interest Inventory (SII)

For more than eight decades, the work of E.K. Strong, Jr., has served as a standard bearer in the field of vocational and educational interest measurement. This evaluation will summarize the history of the instrument; its current form, content, and administration; and evidence of its reliability and validity. It should be noted that the author has twice used earlier versions of the Strong Interest Inventory to inform his own personal career reassessment.

E.K. Strong, Jr., served in World War I as a psychologist who contributed to designing new vocational instruments necessitated by America’s entry into the war. For the first time, the U.S. government faced the urgent need to rapidly mobilize a large fighting force. The emerging field of psychology was called upon to assist the war effort by developing tests to guide military decision-making in quickly assessing a recruit’s fitness for combat service. These early tests provided a general indication as to which recruits should be “cooks and which should be members of the cavalry” (Donnay, Morris, Schaubhut, & Thompson, 2004, p. 1).

After the war, Strong continued to develop his thinking on the factors related to vocational interests. Observing the rapid post-war industrialization boom and the accelerated transition of America from a dispersed agricultural economy to a diversified manufacturing economy, Strong’s rigorous scientific research methods (Campbell & Hansen, 1981) were based on two assumptions. First, different occupations tended to attract people with certain personality and psychological attitudes and interests. Second, people who share similar attitudes and interests with those who are happily engaged in a particular occupation will likely find fulfillment in that occupation themselves. In 1927, he published his first instrument, the Strong Vocational Interest Blank (SVIB), which purported to assess both vocational and educational interests.

Beginning in the 1950s, other psychological researchers continued to build on, refine, modify, and update Strong’s work. These included David P. Campbell, who extended Strong’s inventory and analysis such that for almost two decades the assessment was known as the Strong-Campbell Interest Inventory (Campbell & Hansen, 1981). Jo-Ida C. Hansen worked closely with Campbell during this period and continued to contribute research and analysis through the 1990s. John L. Holland’s theories and taxonomies on occupational themes, based on Strong’s work, were incorporated into the assessment in 1985 in the form of the General Occupational Themes (GOT) and the Basic Interest Scales (BIS) (Donnay et al., 2004). A significant milestone in the evolution of the assessment instrument came in 1974 when, reacting to heightened sensitivities for gender equality and non-discrimination, the contributors merged what had been two different forms for men and women into a single form.

Since its initial publication in 1927, the inventory bearing Strong’s name has been regularly and methodically revised. The current form of the Strong Interest Inventory (SII) was revised in 2004 and documented in the manual edited by Donnay, Morris, Schaubhut, and Thompson (2004), on which the following summary is based.

The SII is published by CPP, Inc., based in Mountain View, California, which claims that more than 70% of all U.S. colleges and universities use it to assist students in identifying their educational and vocational interests (CPP, Inc., 2009). Different forms of the instrument are tailored for high school students, college students, and professionals, as well as for individuals in different countries across the globe. There are also versions of the inventory that are sold and administered in conjunction with the Strong Skills Confidence Inventory and the Myers-Briggs Type Indicator. The per-item cost of the SII begins at $7.85 for the college form in quantities of more than 500, with higher costs for other forms and for smaller quantities. The manual and a variety of evaluation reports and guides are also available from the publisher. Online as well as paper versions of the different forms are available for both the 1994 and 2004 editions of the SII.

The SII instrument consists of 291 item statements, organized in six sections that reflect different activities, occupations, traits, etc., that respondents rate using a 5-point Likert scale, from Strongly Dislike (or Strongly Unlike Me) to Strongly Like (or Strongly Like Me). The instrument is not time-limited, but is typically completed in 30 to 45 minutes. The instrument itself, its instructions, and the individual report are written at an eighth to ninth grade reading level.

Each respondent’s answers to the item statements are evaluated against three norm-referenced sample categories that have been continually updated throughout the evolution of the SII. The General Representative Sample (GRS) consists of 1,250 men and 1,250 women who closely represent the latest demographics published by the U.S. Census. The second category of samples includes those reflected in the 244 Occupational Scales, or job codes, published in the appendix. The third category of samples consists of college students who indicated their academic major at the time of completing the inventory, providing results reflecting 75 different majors.

The results of the SII are provided to the individual in four different formats. At the most general level, the respondent receives a score for each of the six General Occupational Themes (GOT) defined by Holland and known by the acronym RIASEC — Realistic, Investigative, Artistic, Social, Enterprising, and Conventional. These scales are based on an approximate mean of 50 with a standard deviation of 10. Holland’s theory purports that these six themes differentiate occupational environments that attract and fulfill different personalities and interests. Each of these six themes can be characterized by four to six categories of professions, which total 30 and are known as the Basic Interest Scales (BIS). Like the GOT, the individual receives a score for each of the 30 BIS categories with an approximate mean of 50 and standard deviation of 10. At the most detailed level, reflecting Strong’s initial research, the respondent’s interests are scored against 244 Occupational Scales (OS), or specific job codes. Although presented on a numerical scale similar to that of the GOT and BIS, these scores must be interpreted differently, as the norm group for each of the 244 scales consists solely of people engaged in that particular occupation. The fourth format assesses the respondent’s Personal Style Scales (PSS) according to discriminators related to work style, learning environment, leadership style, and preference for risk taking.
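
Scales built around an approximate mean of 50 and standard deviation of 10 are standard T-scores. A minimal sketch of the conversion follows; the norm values shown are hypothetical placeholders, not actual SII norms:

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a standard score with mean 50, SD 10."""
    return 50 + 10 * (raw - norm_mean) / norm_sd

# Hypothetical norm values for a single theme -- not actual SII norms,
# which are derived from the General Representative Sample.
print(t_score(raw=32, norm_mean=25.0, norm_sd=7.0))  # 60.0, one SD above the mean
```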

Holland’s theory maintains that the 244 occupations listed in the OS can be mapped to both the 30 BIS categories and the six GOT themes by using either the dominant GOT theme or a combination of the two or three highest-scored themes. These codes, such as “ASC” for someone scoring highest on the Artistic theme followed by the Social and Conventional themes, are known as “Holland Codes.” He further theorized that each of the six GOT themes correlates with the other five to an expected degree of overlap, with the two adjacent themes correlating more highly and the opposite theme much less. This theory yields a hexagonal diagram labeled with the GOT acronyms at each vertex such that, for example, one can discern that a high score on the Artistic theme overlaps most highly with scores on the Investigative and Social themes, while the Conventional theme can be considered the opposite of the Artistic, with a lower degree of overlapping correlation.
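
The hexagon model lends itself to a simple computational summary: the circular distance between two themes (1 = adjacent, 2 = alternate, 3 = opposite) predicts the relative size of their correlation. A small sketch of that relationship, assuming only the standard RIASEC ordering:

```python
# RIASEC themes in hexagon order; distance around the hexagon (1-3)
# predicts relative correlation: adjacent (1) high, alternate (2) lower,
# opposite (3) lowest.
RIASEC = "RIASEC"

def hexagon_distance(a, b):
    i, j = RIASEC.index(a), RIASEC.index(b)
    d = abs(i - j)
    return min(d, 6 - d)  # wrap around the hexagon

print(hexagon_distance("A", "S"))  # 1: adjacent, expected high correlation
print(hexagon_distance("A", "C"))  # 3: opposite, expected low correlation
```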

The manual for the SII has evolved with the instrument itself. The contents of the manual parallel the organization of the instrument, with chapters devoted to the item statements, GOT, BIS, OS, and PSS. Each of these substantive chapters is similarly structured, including sections that address issues related to interpretation, construction, norming, reliability, and validity. The manual also contains a short introduction, a chapter on the “Administrative Indexes” with useful information for interpreting unusual responses and scores, a chapter to guide the interpreter or counselor on “Strategies and Challenges in Interpreting the Strong,” and the appendix containing data for each of the 244 occupational scales (Donnay et al., 2004).

The current manual provides a straightforward and understandable presentation for the reader who is interested in not only the mechanics of administering the inventory and interpreting its results, but who is also concerned about the theory and the research on which the assessment is based. Each of the five data-driven chapters contains both narrative descriptions and tabulated statistical data regarding composition of samples, means and standard deviations of scores, correlation coefficients, test-retest results, norm groups and norming adjustments, and more.

With respect to reliability, four primary methods are documented in the manual in both descriptive and tabulated form. Internal consistency, calculated with Cronbach’s coefficient alpha, is reported for the item statements, themes, and scales. Test-retest methodology was used to gather additional reliability data at two intervals between tests: “short term,” defined as 2-7 months, and “long term,” 8-23 months. Because the 2004 version of the test differed significantly from the previous 1994 version (a reduction in the number of item statements, a change from a 3-point to a 5-point Likert scale, and revisions to both the BIS and OS), the reliability data obtained from the two versions had to be compared for each major section. And because of the purported correlations of Holland Codes that relate GOTs to BISs to OSs, the internal consistency between each theme and its subordinate scales also had to be evaluated. The manual presents a plethora of data documenting this wide variety of reliability calculations, so no single reliability figure can represent the entire inventory. The editors do, however, offer summary analyses such as this for the six GOTs in the 2004 version: Cronbach’s alpha for five of the six themes improved from 1994, with all six measuring at least .90, while test-retest reliabilities remained roughly the same at .87 for all themes (Donnay et al., 2004).
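
For readers unfamiliar with the internal-consistency statistic cited throughout the manual, the following sketch computes Cronbach's alpha from a respondents-by-items matrix. The data are simulated, and the roughly .9 result is illustrative only:

    import numpy as np

    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)
    def cronbach_alpha(items: np.ndarray) -> float:
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars / total_var)

    rng = np.random.default_rng(0)
    latent = rng.normal(size=(100, 1))                     # shared interest factor
    items = latent + rng.normal(scale=0.7, size=(100, 8))  # 8 correlated items
    print(round(cronbach_alpha(items), 2))                 # high alpha, roughly .9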

With respect to validity, as with reliability, the editors offer an abundance of data that serve as an effective example of what Cronbach and Meehl (1955) proposed as a nomological net for construct validity. Each aspect of a respondent’s assessment (the item statement responses, theme scores, and scale scores) can be correlated to the three comparative samples: the General Representative Sample (GRS, n=2,250), the Occupational Sample (n=3,340), and the Academic Majors Sample (n=879). Within the instrument itself, validity can be assessed through the inter-correlations among the GOT, BIS, and OS. The manual reports, for example, that the correlations between GOT themes support Holland’s theory of adjacent and opposite interests on the RIASEC hexagon: the three “opposite” interest pairs (RS, IE, AC) have correlation coefficients of .07 to .12, while the six adjacent pairs correlate from .36 to .56. Discriminant validity between the 244 Occupational Scales was determined by correlating results within each GOT and BIS according to gender and race/ethnicity. The manual also reports on studies that have attempted to assess predictive validity in terms of which academic majors high school respondents selected in college, and which occupations college respondents eventually chose. Such studies, as the manual explains, are fraught with complications related to subject attrition, timing, and the inability to control for other factors and changes in the life circumstances of each respondent.
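
The hexagon's structural claim can be stated as a single testable condition: every adjacent-pair correlation should exceed every opposite-pair correlation. The coefficients below are invented to mimic the ranges the manual reports (.36-.56 versus .07-.12):

    # Adjacent RIASEC pairs sit next to each other on the hexagon; opposite
    # pairs sit across from each other. All coefficients are illustrative.
    corr = {
        ("R", "I"): .49, ("I", "A"): .41, ("A", "S"): .52,   # adjacent
        ("S", "E"): .44, ("E", "C"): .56, ("C", "R"): .36,   # adjacent
        ("R", "S"): .09, ("I", "E"): .07, ("A", "C"): .12,   # opposite
    }
    adjacent = [("R", "I"), ("I", "A"), ("A", "S"), ("S", "E"), ("E", "C"), ("C", "R")]
    opposite = [("R", "S"), ("I", "E"), ("A", "C")]

    # Holland's prediction: the weakest adjacent correlation still beats
    # the strongest opposite correlation.
    assert min(corr[p] for p in adjacent) > max(corr[p] for p in opposite)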

There have been an “extensive number of studies” (Donnay et al., 2004, p. 41) relating Strong results to other, similar measures as evidence of validity. Following is a brief discussion of five studies that illustrate the variety of validity evidence gathered: construct, predictive, concurrent, and consequential.

Drawing on data collected over a 29-year period, Johansson (1970) used 129 male occupational samples to assess the construct validity of the SVIB with respect to an Occupational Introversion-Extraversion (OIE) scale. Johansson ranked the 129 samples according to their mean OIE scores and concluded, by inspection rather than quantification, that the degree of introversion-extraversion roughly corresponded to the SVIB occupational scales. For example, the extreme of the introversion scale was represented by physicists, farmers, mathematicians, and artists, while at the extreme of the extraversion scale Johansson found various types of salesmen, political officers, and social workers.

Despite the problems inherent in gathering predictive evidence, Athelstan and Paul (1971) studied longitudinal data from more than 1,500 medical students to determine if their SVIB results accurately predicted their primary medical specialty area of interest. From the initial results, six different specialty areas were determined to be sufficiently represented to consider as criterion groups: 1) general practice; 2) general surgery; 3) OB/GYN; 4) internal medicine; 5) pediatrics; and 6) psychiatry. The students completed the SVIB at the same time as indicating their specialty interests. After a four-year period for those who graduated and became physicians, the SVIB was administered again. Due to a high number of “false positives,” the researchers determined that while the SVIB should not be used as a predictor of medical specialty, the results suggested that a SVIB constructed specifically for this purpose might be beneficial.

A different element of predictive evidence was studied by Hansen and Swanson (1983), who determined that the 1981 revision of the Strong-Campbell Interest Inventory (SCII) was as predictive of student academic majors as the 1974 version had been. They also concluded that it was slightly more predictive of female interests than of male interests. The researchers used a sample of 428 students who completed the SCII as college freshmen, then compared their SCII results with their college majors three years later.

Shortly after publication of the 1974 SCII that combined both male and female forms into one common form, Mary C. Whitton (1975) conducted a concurrent validity study to assess same-sex and cross-sex validities. Using the accepted methodology of classifying results as “Good Hit,” “Poor Hit,” or “Complete Miss,” Whitton determined that the percentage of “Good Hits” for same-sex and cross-sex were not significantly different for the General Occupational Themes and the Basic Interest Scales. However, due to small sample size and the fact that the sample was drawn from student volunteers, Whitton expressed caution in drawing firm conclusions from the study.

An attempt to assess consequential evidence of validity was made by Swanson, Gore, Leuwerke, D’Achiardi, Edwards, and Edwards (2006). They presumed that one measure of the “consequence” of SII results would be the respondent’s ability to recall his or her scores. They designed a study using freshman students in a semester-long orientation course that included administration of the SII as part of a four-class lecture and discussion on career assessments. The researchers designed a test to indicate the degree to which the students recalled the results of their SII. This test was administered on three occasions: immediately after the students received their SII results, six weeks later, and six months later. The results of the study were inconclusive, other than verifying that longer-term recall of high scores was related to the accuracy of immediate recall, and that the ability to recall scores generally diminished over time.

These five studies could be considered “stretches” in terms of their intentions and methodologies, but they are indicative of the breadth and variety of analyses possible with the SII. Given the thorough documentation provided in the manual, combined with the insights gleaned from personal experience with the instrument, this reviewer has no hesitation in endorsing the Strong Interest Inventory as an effective tool for career and educational counseling.

References

Athelstan, G.T., & Paul, G.J. (1971). New approach to the prediction of medical specialization: Student-based Strong Vocational Interest Blank Scales. Journal of Applied Psychology, 55(1), 80-86.

Campbell, D.P., & Hansen, J.-I.C. (1981). Manual for the SVIB-SCII: Strong-Campbell Interest Inventory (3rd ed.). Stanford, CA: Stanford University Press.

CPP, Inc. (2009). Strong Interest Inventory: Help students find satisfying college majors and careers they can be passionate about. Retrieved from https://www.cpp.com/products/strong/index.aspx

Cronbach, L.J., & Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281-302.

Donnay, D.A.C., Morris, M.L., Schaubhut, N.A., & Thompson, R.C. (Eds.) (2004). Strong Interest Inventory Manual. Mountain View, CA: CPP, Inc.

Hansen, J.-I.C., & Swanson, J.L. (1983). Stability of interests and the predictive and concurrent validity of the 1981 Strong-Campbell Interest Inventory for college majors. Journal of Counseling Psychology, 30(2), 194-201.

Johansson, C.B. (1970). Strong Vocational Interest Blank introversion-extraversion and occupational membership. Journal of Counseling Psychology, 17(5), 451-455.

Swanson, J.L., Gore, P.A., Jr., Leuwerke, W., D’Achiardi, C., Edwards, J.H., & Edwards, J. (2006). Accuracy in recalling interest inventory information at three time intervals. Measurement and Evaluation in Counseling and Development, 38, 236-246.

Whitton, M.C. (1975). Same-sex and cross-sex reliability and concurrent validity of the Strong-Campbell Interest Inventory. Journal of Counseling Psychology, 22(3), 204-209.

Implications of Neuroscience for Education (video)

I prepared the following video, “What Difference Does It Make?” for an introductory neuroscience class, Biological Bases of Behavior. The textbook is the 1,300-page Principles of Neural Science by Kandel, Schwartz, and Jessell. My intention was to highlight seven points, or findings, from neuroscience that have implications for educational theory/practice. These findings include:

  1. One difference between humans and other species is the capability for abstract thought. As described by Daniel Povinelli, “we invent the unobservable (god, ghosts, gravity) in order to explain the observable.” Other species don’t do that.
  2. “Learning” literally changes the brain, and brains are continuously “learning,” or changing.
  3. We don’t experience “reality” directly. Our senses and nervous system mediate our experiences based on our individual models of the world that our brains have constructed.
  4. The brain operates a continuously-running simulation of motor and sensory behaviors.
  5. The nervous system expects feedback, and responds to feedback – even when that feedback isn’t ‘real’.
  6. To focus attention on any one object, the brain must suppress attention on all other objects.
  7. Language organization in the brain, especially for grammatical structures, takes years to develop and may be related to tool use.

The paper is available here.

Program Evaluation Proposal

A Proposal to Evaluate the Graduate Student Writing Studio

DISCLAIMER: This proposal is hypothetical. No such evaluation has been requested. There are no plans to conduct such an evaluation, nor is the funding for the Graduate Student Writing Studio under review. Dr. Palm was interviewed for this proposal with the understanding that it was for a course requirement and only hypothetical. Fall 2010

Background and Rationale

The Graduate Student Writing Studio (GSWS, or the Writing Studio) was established five years ago by the College of Education (COE) at the University of New Mexico (UNM) to address perceived deficiencies in the quality of graduate student academic writing.  Rebel Palm, Ph.D., has led the Writing Studio as Director since its inception. Beginning with two graduate assistants, she now employs four graduate assistants who each work 10-20 hours per week.  With Dr. Palm herself serving as a tutor, GSWS employs approximately two full-time equivalents.  The Studio is funded by graduate student fees and typically assists 130-140 COE graduate students each semester.  As a service available exclusively to graduate students in the College of Education, the GSWS program has no counterpart on the UNM Main Campus, although the Center for Academic Program Support (CAPS) has recently begun offering writing assistance to graduate students across the campus.

Dr. Palm reports that most students seek tutoring on more than one writing project, which suggests students find value in the services provided by the GSWS tutors.  Some professors within COE regularly refer their students to the Studio, while others do not. Although she suspects that students who use the services offered by the Writing Studio are demographically representative of the graduate student population within the COE, her records and assessment surveys do not provide that type of sensitive demographic data.

There has been no formal evaluation of GSWS in its five year history.  Dr. Palm regularly sends out an assessment survey at the end of each semester to students who have received tutoring assistance.  She estimates a return rate of about 60%.  These assessments provide feedback regarding the students’ degree of satisfaction with the tutoring as well as the quality of the individual tutors who assisted them.

Given the continued budgetary pressures faced by the University, and prompted by a challenge from certain departments that believe the budget allocated to the Graduate Student Writing Studio could be better spent elsewhere, Richard Howell, Dean of the College of Education, has requested a comprehensive evaluation of the Writing Studio (the evaluand).

Purpose

The purpose of the proposed evaluation, to be completed over the course of the Spring 2011 semester, is to: 1) assess the quality of the tutoring services GSWS provides to COE graduate students; 2) evaluate the value proposition that the Writing Studio, as presently implemented, provides to COE; and 3) provide a range of alternatives to Dean Howell regarding the future of GSWS in terms of how COE and its graduate students can realize the most cost-effective results from the Writing Studio.

Stakeholders

Primary stakeholders consist of the leadership of the College of Education and the Writing Studio.  Dean Howell is responsible for all academic programs and operations within the College of Education. As the decision maker for all budgetary matters, Dean Howell will use the results of the evaluation in determining future budget levels for the GSWS program.  He will also consider how the three alternatives to be provided by the evaluation (status quo, contraction, or expansion of the program) best position GSWS within the overall COE five-year strategy.  He will be asked to participate in an interview at the beginning of the evaluation period, and his office will be asked to provide historical financial data for GSWS.  Otherwise, he will not contribute to the evaluation.  He will receive the four Monthly Progress Reports and the Final Report, and will be invited to the presentation of the Final Report.  As the decision maker for COE, Dean Howell will have the authority to implement any strategic decisions resulting from this proposed evaluation.

The heads of the six academic departments reporting directly to Dean Howell are also primary stakeholders.  As the individuals responsible for the academic performance of their departments, and in their leadership roles directing the COE faculty, they will be interested in the execution of this evaluation as well as its results.  They will each be individually interviewed for this evaluation, receive the Monthly Progress Reports and the Final Report, and be invited to the presentation of the Final Report.

Other primary stakeholders include Dr. Palm and her staff of tutors, as they contribute directly to the evaluation and will be affected directly by the evaluation.  Over the course of the evaluation, they will be asked to participate in individual interviews, one group interview, and one survey.  It is recognized that as vested stakeholders with personal interests in the outcome of the evaluation, the tutors and Dr. Palm herself are not impartial participants. However, their participation is required in order to fully assess the overall value proposition of the GSWS program.  Their contributions to the evaluation will therefore be accepted and considered with this in mind.

Secondary stakeholders include the current faculty and graduate students in the College of Education who would be affected by any changes to the tutoring program. These include those faculty and students who use the services of the Writing Studio, as well as those who do not.  Tertiary stakeholders with less immediate interests in the outcome of the evaluation are future COE graduate students. Additional tertiary stakeholders who could be affected by the results of the evaluation include graduate programs in other colleges on the UNM campus, depending on the recommendations produced by the evaluation.  For example, the services offered by CAPS could be affected by the results of the evaluation.  The secondary and tertiary stakeholders will not be asked to contribute to the evaluation.  Distribution of the monthly and final reports to these stakeholders will be at the discretion of the COE department heads.

It should be noted that GSWS will continue to provide tutoring services to students during the course of the evaluation.  The evaluation has been designed to minimize the impact on the tutors and the students and not interfere with their important work.  Ultimately, the results of the interactions between these two parties drive this evaluation.  The evaluation will therefore focus on these individuals and value their inputs and comments as its most critical data.  While there might be merit in pursuing a broader survey of COE faculty and students regarding their opinions about the GSWS program, that is not the purpose of this evaluation.

Key Questions

The key questions to be addressed during the proposed evaluation include:

  1. How successfully has the Graduate Student Writing Studio achieved its stated purpose to improve the quality of graduate student writing from the perspectives of:
    • COE graduate students who use the tutoring services?
    • COE Department Heads, as communicated through their faculty members?
    • the GSWS tutors?
  2. What value does the Writing Studio provide to COE in its delivery of tutoring services in terms of cost-effectiveness and quality of service?
  3. What factors should be considered by COE administration in deciding the future direction of the Graduate Student Writing Studio?

Evaluation Design

The proposed evaluation is formative in nature. It will assess an ongoing program with the intent of improving the program for future implementation, and it will provide well-justified alternative strategies for Dean Howell to consider regarding GSWS.  Because the purpose of the proposed evaluation is to determine the effectiveness of an ongoing program, the behavioral objectives evaluation approach will be employed.  This approach is best adapted to determinations of whether or not a program is meeting its objectives.

A mixed method approach has been selected for this evaluation.  While judgments about writing and improvements in writing are inherently subjective, quantitative data do exist and can be collected throughout the evaluation to provide a productive balance of qualitative and quantitative techniques.  Qualitative observations and assessments will be derived from analysis of archived assessments from prior years, personal interviews, and group interviews.  In keeping with a qualitative approach, data collected from verbal sources (previous assessments and the interviews) will be analyzed and summarized in narrative form in order to provide a “thick description” of how GSWS tutoring is delivered and how it results in benefits to the students.

A quantitative approach will be employed in analyzing the survey results, and in coding and characterizing data from the prior assessments so that both the previous and new data can be aggregated for analysis.  Financial data will factor into determinations about the value proposition delivered by GSWS.  Financial metrics will be developed for analysis and comparison, such as average cost per student, average cost per hour, and the average value of benefit as perceived and reported by students.
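
As an illustration of how those metrics would be computed, the sketch below uses hypothetical placeholder figures; actual values would come from the financial records requested under TASK #3.

    # Hypothetical inputs, not actual GSWS budget data.
    semester_cost = 30_000.00   # assumed total GSWS cost for one semester
    students_served = 135       # midpoint of the reported 130-140 range
    tutoring_hours = 1_200      # assumed total tutoring hours delivered

    print(f"Average cost per student: ${semester_cost / students_served:,.2f}")
    print(f"Average cost per hour:    ${semester_cost / tutoring_hours:,.2f}")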

Data Collection and Sampling

The intent of the proposed evaluation is to collect data from all primary stakeholders and student participants during the evaluation period, as well as all available archived data.  Sampling is therefore not applicable, as the objective of this evaluation is to provide a census of the direct participants (tutors and students) in the GSWS program.  It is acknowledged, however, that some students will elect not to participate; it is therefore assumed that 60% of students will complete the survey (an estimated 84 of 140) and 25% will agree to provide writing samples (35 of 140).

The available archived data consist of post-semester assessment surveys completed by GSWS students in prior years.  According to Dr. Palm, approximately 60% of past students completed and returned the surveys, so it is estimated that approximately 336 should be available for analysis.  Other archived data assumed to be available to the evaluator are financial records pertaining to GSWS for the past five years, including salaries paid, cost of computer resources, supplies, and facilities charges. [TASK #3 on the Management Plan.]

Upon contract award, the first task will be to schedule individual interviews with Dean Howell and Dr. Palm. Prior to scheduling the interviews, both Dean Howell and Dr. Palm will be asked to provide a short declaration of 500-750 words that generally expresses what they want the evaluation to accomplish.  These statements will be provided to the evaluator at least two days prior to the interviews.  The purpose of these interviews will be to gather information regarding their personal expectations for and inputs to the evaluation process.  Inputs will include their desires for specific types of data or analysis output, as well as their initial feedback on the preliminary content of the evaluation instruments (interview and survey scripts). [TASK #1]  The evaluator will incorporate their individual feedback into a second iteration of the instruments, obtain their feedback again, and continue to revise until both Dean Howell and Dr. Palm concur with the formats and contents of the interview and survey scripts.  As part of the instrument finalization process, the student surveys will be pilot tested with students from prior semesters. [TASK #2]

The heads of the six departments within the College of Education will be individually interviewed as soon as can be scheduled after the evaluation instruments have been approved.  [TASK #4]  The purpose of these interviews is to gather impressions, opinions, and comments reflecting the perspectives of faculty regarding the Writing Studio.  These, and all subsequent individual and group interviews, will follow a similar protocol.  The interviews will be conducted by the evaluator and an assistant.  The evaluator will ask permission to audio-record each interview.  The evaluator and assistant will each take notes on pre-printed forms containing the script questions, so that their observations can be compared afterward.

The five tutors (the four graduate assistants and Dr. Palm) will each be interviewed individually at mutually convenient times. [TASK #5]  There is no time or sequence requirement for the staff interviews.  The purpose of these interviews is to give the tutors an opportunity to express their personal feelings and opinions about the general performance of GSWS, tutoring philosophies and processes, notable successes and failures or frustrations, critiques of the current program, and suggestions for how it might be improved.

The tutors will also be asked to individually complete a survey designed specifically to capture their responses to the items of concern expressed by Dean Howell and Dr. Palm. [TASK #6]  This survey will be constructed using multiple-choice and Likert-type formats in which the respondents will indicate their degree of agreement or disagreement with certain statements regarding the GSWS program. The survey will be provided to the tutors, including Dr. Palm, during the second half of the semester; they will have approximately two weeks to complete and return it to the evaluator.

Approximately one month prior to the end of the semester, the evaluator will schedule a group interview with all tutors. [TASK #9]  The purpose of the group interview is to allow a free exchange of ideas and discussion among the tutors in order to capture any relevant data (opinions, suggestions, frustrations, anecdotes, etc.) not already collected, as well as to confirm or discuss in more detail findings from the staff surveys. The evaluator recognizes that during these final weeks of the semester, the demands for time from the staff may make scheduling a meeting with all tutors difficult. Should this prove the case, the evaluator will use his discretion in deciding to either schedule the group interview with one tutor absent, or delay the group interview until after finals week in order to obtain the best possible attendance.

The evaluator is cognizant that students may not wish to participate in the evaluation for a variety of reasons, including time, personal embarrassment, or an unwillingness to obligate themselves. Nevertheless, participation from the current graduate students seeking assistance from the writing tutors is vital to the success of the evaluation, as it is the only way to assess how effectively the GSWS program is meeting its objectives.

The evaluator will ask each tutor to review the following information, provided in the form of a one-page consent form, with each participant prior to the initiation of each tutoring engagement:

  1. The Dean of the College of Education and the Director of the GSWS have requested an evaluation of the Writing Studio and its services to the COE graduate student community.
  2. Each of the writing tutors is participating in the evaluation.
  3. Each COE graduate student who seeks assistance from GSWS tutors will be asked to participate. Participation is strictly voluntary and will not affect the quality of the tutoring the student will receive.
  4. Names of participants will not be released to the evaluator. Assigned control numbers will be used to track responses.
  5. Students who elect to participate can withdraw at any time.
  6. Students who elect to participate will be asked to complete an online survey following their tutoring engagement. Individual survey results will not be released to GSWS staff.
  7. Selected participants may be asked to contribute to a one-hour group interview. Participation in the online survey does not commit the student to participate in a group interview if asked.
  8. Participants will be asked to provide writing samples that illustrate the effects of their individual tutoring engagement. These samples will be provided through the tutors with student names replaced by assigned control numbers.
  9. Participant signature.

It is anticipated that reviewing the above consent information with the student will require no more than five minutes of the appointment time. Should this not be the case, the evaluator and staff may jointly decide to make the consent form available to potential participants via email prior to the scheduled appointment time in order to expedite its completion.

Participants will be contacted after each tutoring session via email by their tutor and given instructions to complete an online survey. [TASK #7]  The tutor will provide the participant with an assigned control number to use to authenticate their online survey completion. The participant survey questions will concentrate on the participant’s perceptions of the value received from the tutoring session, satisfaction with the tutor, whether or not expectations were met, and benefits received (increased confidence, knowledge, a better grade, etc.). The survey will consist of twenty questions using multiple-choice and Likert-type formats for discrete responses, and one text area for general comments.

From those online surveys completed, 24 participants will be selected to take part in one of four participant group interviews. Two group interviews will each consist of six participants who have sought assistance from GSWS more than once (repeat students); the other two will each consist of six participants who have sought assistance only once. Each group interview will be loosely structured to allow the repeat students to discuss why they have returned to the Writing Studio, and the first-timers to discuss their experiences with GSWS and the likelihood they will return. These group interviews will be scheduled as soon as practicable so they are completed well before the end of the semester.

The final data to be collected will be writing samples offered by the participants. At the completion of the online survey, each student will be provided with instructions on how to submit before/after writing samples to his or her tutor. Participants may submit either selections from a paper or an entire paper. They will be asked to remove all identifying information from the papers, save each using a prescribed file-naming convention based on their individual control number, and attach the files to an email to their tutor. The tutors will then forward the attached files to the evaluator for analysis.

In summary, the data collection methods and number of collected items will include:

[TASK #1]  Individual interviews with Dean Howell and Dr. Palm. (2)

[TASK #4]  Individual interviews with the six COE department heads. (6)

[TASK #5]  Individual interviews with the staff of tutors.  (5)

[TASK #9]  Group interview with staff of tutors (5 individuals). (1)

[TASK #6]  Staff surveys of 20 questions plus comments. (5)

[TASK #7]  Student surveys of 20 questions plus comments. (estimate 60% of 140, or 84)

[TASK #10]  Student participant group interviews (6 individuals each). (4)

[TASK #8]  Writing samples from participants. (estimate 25% of 140, or 35)

[TASK #3]  Archived assessment surveys from prior years (estimate 60% of 140 x 4 years, or 336).   Financial data from prior and current years.

Internal validity has been considered in these data collection designs.  The required approvals by Dean Howell and Dr. Palm should ensure that the content of the interviews and surveys is directly applicable to their interests.  The interviews will be conducted by the evaluator and an assistant to provide corroboration of observations and inferences.  The student surveys will be pilot tested prior to use in order to ensure readability, correct interpretation, and relevance.  Finally, the interim reports to Dean Howell, the department heads, and Dr. Palm will guard against deviations from the plan and allow immediate corrective action should events dictate.

Sample Topics and Questions for Data Collection Instruments

As previously noted, the data collection instruments will be constructed with input from Dean Howell and Dr. Palm and will be reviewed and approved by them before they are finalized for use. The following examples are representative of topics and questions that will be initially offered to the Dean and Dr. Palm for consideration.

Interviews with Dean Howell and Dr. Palm:

a)     Without respect to cost, what do you believe is the ideal solution for eliminating deficiencies in graduate student writing?

b)    What are your general impressions from students and faculty regarding the success of the GSWS program?

c)     What do you believe distinguishes good student writing from bad?

d)    At a department or college level, how do you assess whether the quality of writing is acceptable or not?

e)     How might this evaluation fail your expectations?

f)     What three pieces of data are you most interested in obtaining through this evaluation?

Interviews with department heads:

a)     To what degree does the faculty in your department refer students to GSWS?

b)    What do your faculty members report in terms of improvement or benefits that students gain from GSWS tutoring?

c)     What reasons do your faculty members give for not referring students to GSWS?

d)    How could GSWS provide better tutoring to students in your department?

Interviews with tutoring staff:

a)     What are the most common reasons students give for seeking tutoring assistance?

b)    How many students have you assisted? How many were repeats?

c)     What training or preparation did you receive prior to working as a writing tutor?

d)    How would you assess the general quality of work that students bring to the Writing Studio, in terms of a) organization and content; b) grammar, vocabulary, and spelling; and c) appropriate format and style?

e)     What are your frustrations about working as a tutor?

f)     What would allow you to do a better job as a tutor?

Student and Tutor surveys (modified accordingly)

a)     How many times have you come to the Writing Studio for writing assistance?

b)    How satisfied have you been with the assistance you’ve received?

c)     Why did you initially decide to seek assistance from the Writing Studio?

d)    Prior to coming to GSWS, how confident were you about the content and organization of your writing? Your grammar, vocabulary, and spelling? Your compliance with the appropriate formatting and style?

e)     What did you find most valuable about your experience with the writing tutor?

f)     If the GSWS services were not available to you, would you have sought assistance elsewhere? Where?

g)     How much time did you spend preparing for your tutoring appointment?

h)    What’s the most you would have paid out of pocket for the services you received at GSWS? In other words, how much monetary value would you place on the benefits you received from the Writing Studio?

Evaluation Project Management Plan

The proposed evaluation will be executed over a 22-week period during the Spring 2011 semester and will include the 13 Tasks identified previously.  Key assumptions include:

  1. Contract Award for the evaluation will occur no later than January 3, 2011 to allow for adequate preparation prior to evaluation kickoff with Dean Howell and Dr. Palm.
  2. Archived assessment data will be made available to the evaluator by mid-February.
  3. Any students who come to the Studio prior to final approval of the evaluation instruments will not be asked to participate in the evaluation.
  4. The Final Report to Dean Howell and Dr. Palm will be due on June 15, 2011.
  5. The evaluation will not be subject to Institutional Review Board (IRB) approval.

Data Analysis

Qualitative techniques will be used in analyzing data collected via interviews, archived survey assessments, and the participant writing samples. These sources will produce primarily verbal data that must be subjectively interpreted and analyzed; the evaluator will therefore develop, as appropriate, coding, classifying, and summarizing schemes that fit the data in order to provide the types of information and level of detail necessary to meet the objectives of the evaluation. These primarily categorical data will provide measures of frequency that can be graphically depicted using bar and pie charts, tables, or other techniques that best visually represent the results. Narrative samples of text and statements from the group interviews may be used to provide rich description that further illustrates specific conclusions or assessments.  The evaluator will contract directly with a number of faculty members outside of COE to offer independent assessments of the before/after writing samples.  The criteria for these assessments will be approved, along with the other evaluation instrument requirements, by Dean Howell and Dr. Palm.

Quantitative techniques including descriptive statistics will be used to analyze the survey results. These can also be depicted in appropriate graphical formats (bar and line charts, histograms, etc.) in order to best present data associated with prevalence of attitudes, ratings, etc.  Financial data will be analyzed and computed in order to provide the specified metrics, including cost per student, cost per hour, and value of perceived benefits.  The key factor in determining the specific course of the survey data analyses will be the requirements established by Dean Howell and Dr. Palm as the evaluation instruments are created and finalized.
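
A minimal sketch of the descriptive statistics for a single Likert-type survey item, using invented responses and only Python's standard library, illustrates the intended summaries:

    import statistics
    from collections import Counter

    # Invented responses to one item: 1 = strongly disagree ... 5 = strongly agree
    responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]

    print("n      =", len(responses))
    print("mean   =", statistics.mean(responses))
    print("median =", statistics.median(responses))
    print("stdev  =", round(statistics.stdev(responses), 2))

    # Frequency counts feed directly into the bar charts described above.
    print(sorted(Counter(responses).items()))  # [(2, 1), (3, 2), (4, 4), (5, 3)]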

Evaluation Constraints

The fundamental constraints that may threaten the success of this evaluation relate primarily to the willing and committed participation by the primary stakeholders and the student participants throughout the semester.  Each step of the data collection process involves participation in the form of candid, truthful, and accurate responses to the pertinent questions.   The guiding consideration that the evaluators must continually communicate is that this evaluation serves the purposes of everyone involved — everyone wants to provide the best possible services to COE graduate students.

However, any effort to “improve” an ongoing service inevitably runs the risk of being perceived in a negative sense in that some deficiency must usually be identified in order for “improvement” to occur. Therefore the evaluators must be especially sensitive to the possibility of negative perceptions of questions and comments during all interviews.

It is vital to the success of the evaluation that the students who seek assistance from GSWS during the period of the evaluation participate in the interviews and the survey, and offer their writing samples for analysis.  If, for whatever reasons, a sufficient number of student participants cannot be obtained, the evaluation will lack a critical component.

Communicating and Reporting

The management plan for the evaluation provides for four monthly progress reports to be delivered to Dean Howell, the department heads, and Dr. Palm.  These interim reports will keep them informed of accomplishments, status, and issues that may need to be addressed or considered as the evaluation proceeds.  These reports will be submitted as PDF attachments to email and will include: a) accomplishments during the reporting period; b) any variances from the planned schedule; c) interim results or items of interest as specified during the instrument finalization process; and d) an accounting of how many students have elected to become participants, volunteered writing samples, etc.  Distribution beyond these primary stakeholders will be at their discretion.

The Final Report will be a written report of approximately 20 pages that summarizes the evaluation process, presents the analysis and significance of the data collected, and articulates the considerations and forecasted consequences of three alternative scenarios: the status quo, a contraction of GSWS services, and an expansion of services.  This report, with its explanation of alternatives, will inform Dean Howell and his leadership team of department heads as to how best to utilize GSWS resources to maximize their value to COE in the coming years.

Cost Estimates and Pricing Support

The proposed price to conduct the evaluation is $25,724. A detailed cost breakdown is provided in the table below. This estimate is based on the following assumptions:

  1. The evaluation will be conducted by a local independent contractor who has no affiliation with the University of New Mexico.
  2. Appropriate support, estimated below, from University employees will be provided at no cost to the contractor.  Estimated requirements for UNM employee support are included in the table below for informational purposes only.
  3. The contractor will arrange for expert faculty review of writing samples through direct contract with the faculty members, therefore that cost is included in the proposed bid.
  4. Work space will be provided to the evaluators adequate for conducting private individual interviews and group interviews.
  5. The contractor will provide all necessary computer resources.
  6. All deliverables (Progress Reports and the Final Report) will be delivered electronically in PDF format. Costs for scanning or re-formatting source data into electronic format are included in the contractor’s hours under Assistant.

Quantitative Research Prospectus

Racial Color-Blindness and Context-Blindness (Quantitative Research Prospectus)

Following is a research proposal submitted as a course requirement for EDPY 505, Conducting Quantitative Educational Research, Fall 2010.

With the election of President Barack Obama, the phrase “post-racial” became a widely used and often-debated adjective within the American mass media landscape. Among segments on both sides of the political divide, sentiments such as “race doesn’t matter” and “race shouldn’t matter” seemed to be conflated. Research such as the most recent update of the seminal Bogardus social distance scale supports the assertion that there has indeed been a narrowing of self-reported expressions of racial, ethnic, and religious discrimination over the past 70 years (Parrillo & Donoghue, 2005).

Within weeks of President Obama’s inauguration, however, protesters gathered in Washington, D.C., to counter the new president’s perceived “socialist” agenda. A small but visible slice of the protesting crowd proudly displayed signs depicting or associating President Obama with Hitler, Lenin, Mao, Muslims, Arabs, and terrorists. Exploiting the “post-racial” eradication of racial boundaries, some of these protesters felt free to display signs featuring symbols historically associated with slavery and denigration of blacks, including apes and monkeys, chains, nooses, and claims of “white slavery.” Language and images that even a decade earlier would have been roundly condemned as overtly racist were expressed openly on signs, billboards, and bumper stickers.

This interpretation and exploitation of “post-racial” attitudes is referred to in critical race theory as color-blindness, which has been theoretically described as attitudes or beliefs that “ideological and structural racism does not exist” (Neville, Lilly, Duran, Lee, & Browne, 2000). Operationally, color-blindness is manifested by “the denial, distortion, and/or minimization of race and racism” (Neville, Spanierman, & Doan, 2006). A psychometric instrument, the Color-Blind Racial Attitudes Scale (CoBRAS), was developed to measure how strongly an individual maintains color-blind attitudes (Neville et al., 2000). Respondents indicate their beliefs about assertions concerning race on a scale from 1 (strongly disagree) to 6 (strongly agree). Since its initial development and validation in 2000, CoBRAS has been used and further validated in studies correlating color-blind attitudes with insensitivity toward Native-themed sports mascots (Steinfeldt & Wong, 2010), counseling competencies among white graduate students (Neville, Spanierman, & Doan, 2006), and attitudes toward affirmative action policies (Oh, Choi, Neville, Anderson, & Landrum-Brown, 2010).

Neville et al. (2000) take care to differentiate color-blindness from racism, which they define as “the belief in racial superiority and also the structures of society, which create racial inequalities in social and political institutions.” Color-blind attitudes, as measured by CoBRAS, do not equate to racist attitudes, but rather tend to mask or deny realities that arise from racial inequities and injustices. What is perceived as racism or racist acts in actual life situations is a function of the context and behaviors of the actors, objects, and observers. Like beauty, racism would seem to reside in the eye of the beholder. According to color-blind theory, those manifesting color-blind attitudes are more likely to not “behold” racism or attribute it as a factor in perceived judicial, economic, social, or political inequities. Therefore, even benign color-blindness can tacitly allow racism and racist attitudes to perpetuate.

Studies point to a variety of psychological constructs that may account for different factors that influence color-blindness, without specifically using the term. O’Brien, Crandall, Horstman-Reser, Warner, Alsbrooks, and Brooks (2010) found one possible explanation in the notion of Downward Social Comparison (DSC). They concluded that individuals within their specific sociological and cultural peer groups feel pressure to “maintain unprejudiced self-images” even though the group itself may exhibit prejudiced behaviors (O’Brien et al., 2010). Such individuals exhibit DSC when they compare themselves only to the worst or most explicit of the objectionable behaviors. Not wanting to be perceived as “part of the prejudice problem,” they exclude their own attitudes and behaviors as racist by downwardly comparing themselves to more openly intolerant peers (O’Brien et al., 2010). Sommers and Norton (2006) determined that individuals who manifest color-blind attitudes may fail to accept their own prejudices because they “pick and choose aspects of these theories to fit their own psychological needs.” In other words, those exhibiting high levels of color-blindness tend to define racist acts or racism in terms that are self-excluding. Their study further implies that the individuals who are most likely to hold or act on racist beliefs are also the least likely to define such acts or beliefs as racist (Sommers & Norton, 2006). Federico and Sidanius (2002) debunk the oft-cited theory that the culprit behind such an inability to see color and race as a consequential reality is a lack of education and knowledge. Their findings reveal the contrary: among individuals who exhibit “racist and anti-egalitarian motives,” those with the most pronounced and entrenched attitudes are those with the highest levels of education and political knowledge (Federico & Sidanius, 2002).

A different potential consequence of color-blindness was suggested in the August 2010 media firestorm resulting from radio commentator Dr. Laura Schlessinger’s use of the word nigger in response to a self-identified black female caller. Most of the media commentary and criticism focused on Schlessinger’s seemingly gratuitous use of the “n-word,” which she uttered 11 times during her on-air exchange with the caller. Left largely ignored, however, was her comment, captured in program transcripts, that “if you’re that hypersensitive about color and don’t have a sense of humor, don’t marry out of your race.” In the context of the exchange, this statement is arguably the more problematic example of color-blindness in that it presupposes the existence of a racial hierarchy. Yet most media reports headlined and featured Schlessinger’s use of the “n-word,” while few provided the fuller context of the exchange, including the “don’t marry out of your race” statement.

In light of this and other highly publicized episodes involving accusations of racism, the proposed study seeks to explore potential associations between color-blindness and the inability, or unwillingness, to thoughtfully evaluate the full context of potentially offensive racial situations. In other words, in this 21st-century “post-racial” environment, have the terms racist and racism come to be defined so narrowly by those with color-blind attitudes that the utterance, or avoidance, of the “n-word” epithet is the sole predicate that determines whether a statement or a situation is judged to reflect racist attitudes and behaviors?

The proposed study hypothesizes that individuals who exhibit higher degrees of color-blindness are more likely to over-consider the usage of racial epithets, and under-consider context, when evaluating racially sensitive situations than are individuals who exhibit lower levels of color-blindness. In this directional correlational study, the independent (predictor) variable is the degree of racial color-blindness as measured on an interval scale by the CoBRAS instrument. The dependent (criterion) variable is the response to potentially offensive racial situations as measured on an interval scale by an instrument created for this study, the Study Instrument (SI). It is predicted that individuals with higher CoBRAS scores will be, compared to individuals with lower CoBRAS scores: 1) more likely to label behaviors and attitudes as racist when racial epithets are used, even when the usage of the epithet is not germane to the context of the situation; and 2) less likely to label behaviors and attitudes as racist when racial epithets are not used, even when the context of the situation involves generally accepted racist attitudes and behaviors.

Method

Sampling

Two populations are targeted in the proposed study. The first is the 18-23 year-old young adult demographic that has come to adulthood during the so-called “post-racial” period. The second targeted population is the older half of the “Baby Boomer” generation, individuals born prior to 1954, whose formative years encompassed the height of the civil rights movement.

The two respective accessible populations are undergraduate students on the main campus of the University of New Mexico (UNM) representing 18-23 year olds, and faculty and staff at UNM who are 56 years of age or older representing older Baby Boomers.

Two groups will be recruited to represent the undergraduate students (Group A), and faculty and staff over 56 (Group B). The Group A samples will consist of four intact university classes whose professors have agreed to participate in the study. Undergraduate classes in the language arts and social studies departments will be solicited for the study with the objective of matching the following criteria: a) a minimum of 20 enrolled students; b) gender imbalance of no more than 60/40; c) the instructor agrees to administer the two instruments in accordance with a prescribed procedure; and d) the instructor is confident that the two instruments can be integrated into the class presentation such that students will not feel that being asked to complete the instruments is beyond the scope of the course.

The Group B sample will be recruited from the faculty and staff at UNM. Potential respondents will be solicited through written and word-of-mouth communications. The Group B sample will be selected from respondents based on their age and gender in order to achieve the same gender imbalance of no more than 60/40. Up to 60 respondents will be selected to participate, depending on response rate within the prescribed schedule window, with a minimum of 30 required for this component of the study.

This sampling plan does not reflect random selection or assignment. It has been selected to purposefully draw from the students and adults accessible on campus. This approach is generally consistent with sampling schemes employed in the studies referenced herein, including the initial CoBRAS validation study (Neville et al., 2000), as well as Steinfeldt and Wong (2010), O’Brien et al. (2010), and Sommers and Norton (2006). As human subjects are involved, the study proposal must be approved by the UNM Institutional Review Board (IRB).

Instrumentation

The CoBRAS assessment instrument provides a reliable measurement of the construct of racial color-blindness (Neville et al., 2000). The 20-item instrument employs a 6-point Likert scale for expressing the participant’s agreement or disagreement with statements such as: White people in the U.S. have certain advantages because of the color of their skin; Everyone who works hard, no matter what race they are, has an equal chance to become rich; and Race plays an important role in who gets sent to prison. The instrument was initially tested in a series of 5 studies involving 760 university students. Results from each study were used to calculate Cronbach’s alpha coefficient to determine reliability, with figures ranging from .84 to .91 (Neville et al., 2000). Further reliability data were collected through a two-week test-retest design in one of the 5 studies, which resulted in a total CoBRAS reliability statistic of .68. Subsequent studies that used CoBRAS reported alpha coefficients of .85 (Neville et al., 2006; Steinfeldt & Wong, 2010). Measurement validity was demonstrated through extensive consultation and consensus with racial studies experts, reading teachers, and students in the construction and pilot testing of the instrument prior to its use in the studies. One of the initial 5 studies was designed to examine the validity of the instrument by correlating its results with those from instruments that measure factors associated with color-blindness. These included the Global Belief in a Just World Scale, the Multidimensional Belief in a Just World Scale, and the Marlowe-Crowne Social Desirability Scale, with intercorrelations for the CoBRAS factors ranging from .42 to .54 (Neville et al., 2000).

The Study Instrument (SI) will be developed specifically for this study. This instrument will consist of 8 hypothetical scenarios of 3-5 sentences each. Each scenario will describe an encounter in which one character makes an assertion or exhibits behavior that could be interpreted as racially offensive. Four of the scenarios will include the use of a commonly-recognized racial epithet (“With Epithet” set), while the other four will not (“Without Epithet” set). (See Table 1 below.) Within each set, two scenarios will be constructed to indicate actions or assertions which are deemed by subject matter experts to be more offensive, while two scenarios will be constructed to indicate actions or assertions that are deemed less offensive. The respondent will be asked to read each scenario, then answer 4 questions that reflect the respondent’s attitudes and judgments about the degree of offensiveness exhibited in the scenario. Questions will be constructed using a Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). In order to demonstrate internal content validity, each scenario will be critiqued by a panel of faculty members whose academic interests include the historical effects of racism, critical race theory, sociology, and language arts/linguistics. These subject matter experts will assess and offer guidance regarding both the readability of each scenario (is it understandable as intended?), as well as the degree of offensiveness each scenario indicates (would most people find it racially objectionable, or not?). The SI will be written to a 6th-7th grade reading level, consistent with CoBRAS (Neville et al., 2000), and must not introduce any confounding factors that may influence participant responses, such as character likeability, religion, politics, etc. The SI will be pilot tested with a minimum of 20 selected students and faculty members. As an instrument created specifically for the proposed study, an assessment of reliability will be calculated using the Cronbach alpha coefficient formula as part of the pilot testing and for each sample obtained (4 student samples, 1 adult sample). The expert feedback and pilot testing results will be completed prior to study submission to the IRB for approval.

Procedures

Table 1 illustrates the design of the Study Instrument scenarios. The scenarios of interest for analyzing the predicted correlations are Scenarios 4 and 7 (which higher CoBRAS scorers are predicted to judge as more offensive than lower CoBRAS scorers do) and Scenarios 2 and 5 (which higher CoBRAS scorers are predicted to judge as less offensive). For Scenarios 1, 3, 6, and 8, no significant differences in the SI results are expected. Each participant will complete the same CoBRAS and SI instruments.

For Group A, the instructors of the classes selected for the study will administer the two instruments to their respective classes during the same class period. Each student will be given a packet that contains a consent form to be signed, a copy of the CoBRAS, and a copy of the SI. The CoBRAS and the SI will include pre-printed control numbers in order to correlate responses. The SI will ask, but not require, the respondent to provide information regarding age, gender, and race/ethnicity. Two of the four classes will be instructed to complete the CoBRAS instrument first, followed by the SI; the other two classes will be instructed to reverse the order. The purpose of counterbalancing the order is to offset potential priming effects and to make them recognizable if suspected.

Table 1. Predicted Responses on the Study Instrument (Sensitivity to Racial Offense)

Group B participants will receive packets at their workplace that include the same consent and assessment forms provided to Group A. In addition, Group B respondents will be provided with an instruction sheet and a postage-paid addressed envelope in which to return completed materials. Someone other than the researcher will prepare the Group B packets to ensure anonymity, so that control numbers cannot be linked to individuals. To mitigate potential priming effects, half the packets will place the CoBRAS on top of the Study Instrument, while the other half will reverse the order. The instructions will request that the two instruments be completed during the same sitting and returned by mail by a specified date.
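
As a concrete sketch of the counterbalancing, the order assignments might be generated as follows; the class labels and the random seed are hypothetical.

    import random

    # Randomly assign half the Group A classes to complete the CoBRAS first
    # and half to complete the SI first, so that order (priming) effects can
    # be detected during analysis. All names here are hypothetical.
    classes = ["Class 1", "Class 2", "Class 3", "Class 4"]
    random.seed(42)  # fixed seed so the assignment is reproducible
    random.shuffle(classes)
    order = {c: ("CoBRAS first" if i < 2 else "SI first") for i, c in enumerate(classes)}
    print(order)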

Data Analyses

Scores will be analyzed in the aggregate as well as for each sample (5 total). Scatterplots of score pairs (x axis = CoBRAS score, y axis = SI score) will be generated to visually assess the predicted results (positive correlation for Scenarios 4 and 7, negative correlation for Scenarios 2 and 5, no correlation for Scenarios 1, 3, 6, and 8). Scores will also be compared between Groups for potential age differences, although the study makes no prediction of differences due to age. Pearson’s r correlation coefficients will be calculated to assess the strength and statistical significance of the correlations. Cronbach’s alpha coefficient will be used to assess reliability. CoBRAS means and variances will be compared with previously published studies as a check on validity. Results will be scrutinized for influences other than color-blindness that may affect scores on the SI.
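
To illustrate the planned correlation analysis, the sketch below computes Pearson’s r and draws a scatterplot for a single scenario; all of the paired scores are hypothetical.

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    # Hypothetical paired scores: x = CoBRAS total, y = SI score for Scenario 4
    cobras = np.array([52, 61, 48, 70, 66, 55, 73, 44, 59, 68])
    si_scenario4 = np.array([12, 14, 10, 17, 15, 12, 18, 9, 13, 16])

    r, p_value = stats.pearsonr(cobras, si_scenario4)
    print(f"Pearson r = {r:.2f}, p = {p_value:.3f}")

    plt.scatter(cobras, si_scenario4)
    plt.xlabel("CoBRAS score")
    plt.ylabel("SI score (Scenario 4)")
    plt.title("Predicted positive correlation for Scenario 4")
    plt.show()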

Strengths and Limitations

The study relies on convenience samples, which introduces external validity risks: participants may not adequately represent the target populations and may reflect a bias in terms of who chooses to participate in the survey. The use of intact university classes risks ecological validity in that students may answer differently than they would in a more natural, less structured environment. Internal validity risks include the possibility of data collector influence or bias if the instructors do not administer the instruments as prescribed. Subject testing risks include priming effects, as well as the possibility that participants will anticipate the responses they believe are “correct” and not answer carefully and sincerely. Depending on when participants choose to complete the assessments, responses may also be degraded if they are rushed, tired, or distracted. As the Study Instrument is a new assessment tool, its reliability and validity must be considered suspect until demonstrated by results.

However, the study design attempts to mitigate these validity threats. Having participants complete both instruments in one sitting reduces mortality and maturation risks, as well as the threat of history or intervening events influencing results. The possibility of priming effects is recognized and accounted for in the protocol. Care has been taken to avoid contact between the researcher and participants. Although not directly pertinent to the study’s hypotheses, the data collected from different age groups may yield informative results that could guide future research in replicating the study and further exploring the causes, as well as the effects, of color-blindness.

References

Federico, C. M., & Sidanius, J. (2002). Sophistication and the antecedents of Whites’ racial policy attitudes: Racism, ideology, and affirmative action in America. The Public Opinion Quarterly, 66, 145-176.

Neville, H. A., Lilly, R. L., Duran, G., Lee, R. M., & Browne, L. (2000). Construction and initial validation of the Color-Blind Racial Attitudes Scale (CoBRAS). Journal of Counseling Psychology, 47, 59-70.

Neville, H. A., Spanierman, L., & Doan, B.-T. (2006). Exploring the association between color-blind racial ideology and multicultural counseling competencies. Cultural Diversity and Ethnic Minority Psychology, 12, 275-290.

O’Brien, L. T., Crandall, C. S., Horstman-Reser, A., Warner, R., Alsbrooks, A., & Brooks, A. (2010). But I’m no bigot: How prejudiced white Americans maintain unprejudiced self-images. Journal of Applied Social Psychology, 40, 917-946.

Oh, E., Choi, C.-C., Neville, H. A., Anderson, C. J., & Landrum-Brown, J. (2010). Beliefs about affirmative action: A test of the group self-interest and racism belief models. Journal of Diversity in Higher Education, 3, 163-176.

Parrillo, V. N., & Donoghue, C. (2005). Updating the Bogardus social distance studies: A new national survey. The Social Science Journal, 42, 257-271.

Sommers, S. R., & Norton, M. I. (2006). Lay theories about white racists: What constitutes racism (and what doesn’t). Group Processes & Intergroup Relations, 9, 117-138.

Steinfeldt, J. A., & Wong, Y. J. (2010). Multicultural training on American Indian issues: Testing the effectiveness of an intervention to change attitudes toward Native-themed mascots. Cultural Diversity and Ethnic Minority Psychology, 16(2), 110-115.

Muraling Myths

Muraling Myths: A Qualitative Research Prospectus

Fall 2010

Abstract

In this study prospectus, I pose the question, “What do University of New Mexico students, faculty, and staff think about the Kenneth Adams murals in the west wing of the Zimmerman Library?” This question presupposes that the university community does think something about the murals. Commissioned in 1939 by James F. Zimmerman, then-president of the university, the murals have been criticized for their idealized depictions of cultural assimilation among Native American, Spanish/Mexican, and Anglo peoples. Critics such as Chris Wilson (2003) maintain that the artistic representations portrayed in the murals perpetuate inaccurate and outdated cultural stereotypes, reflecting the narrow historical perspective of the White privileged world view. Although no recent student protests have been directed toward the murals, during the 1970s and 1990s attempts were made to deface the fourth mural (Wilson, 2003). However, the current national political climate of anti-Muslim and anti-immigrant demonstrations highlights an American predisposition toward the political subjugation, castigation, and exploitation of targeted minorities. The Adams murals reflect this unfortunate reality in their representations of White privilege over Mexican and Native Americans in Northern New Mexico. I contend that the UNM community of students, faculty, and staff needs to understand the historical and sociological myths memorialized in these murals. This study proposes to assess the degree to which this community does think about the murals, what they think, and what actions they feel are appropriate in response to the murals’ criticized false history, cultural and symbolic stereotypes of non-Whites as “the other,” and the ongoing effects of these depictions on all races and cultures in Northern New Mexico.

The Kenneth Adams Murals in the Zimmerman Library, University of New Mexico

Adams Murals in Zimmerman Library
Photographs by Steve Stockdale (2010)

Historical significance is not a property of the event itself. It is something that others ascribe to that event, development, or situation. (Counsell, 2004, p. 30)

I learned of the Kenneth Adams murals in the west wing of the Zimmerman Library on the main campus of the University of New Mexico (UNM) from a professor. Off-handedly, prior to beginning a discussion about a book by bell hooks, he informed our graduate class that on this very campus there existed an exhibition of public art, condoned and rationalized by the university administration, that depicts White-privileged racism and male-dominated oppression. My initial reaction was to judge his comment as a biased overstatement. I recognized that, primed with his opinion, I was already prejudiced against the murals as something that, according to this professor’s matter-of-fact pronouncement, I should find offensive.

I made a visit to the library to see the murals for myself. To my surprise, I judged the murals in terms similar to those of the professor. I left with questions: Who did this? Why? What were they thinking? Finding the answers easily on the university’s website, I wrote about the murals in a graded essay for the course. After the semester ended, I exchanged emails with the Dean of Libraries about the murals. I offered my criticisms, as well as the general recommendation that something should be done. The Dean’s initial responses were promising, acknowledging that my analysis and suggestions were appropriate. However, after consulting with library staff, her conclusion, “which seems to reflect widespread thinking,” was that the murals were fine as is and warranted no “extra effort or attempt to define a stand or create additional context” (M.A. Bedard, personal communications, July 9 – August 31, 2010). She attached a note from an unnamed faculty representative who offered a more pointed reaction, “Will we start putting plaques on all other works of art, displays, exhibits and books to explain their historical context too?” [See SUBSEQUENT NOTES below.]

The purpose of this proposed study is to challenge the assertions by Dean Bedard and to deliberately address the question posed by the unnamed faculty representative. What does the UNM community think about the murals? To what degree are they aware of the murals’ history, purpose, and criticisms? To what degree does the community believe that the murals do indeed deserve special scrutiny and “historical context”?

The merits of this proposed study can be assessed by inquiring into three domains of research that bear directly on the issues pertinent to the Adams murals: 1) the employment of public art as a means to achieve educational, political, and cultural objectives; 2) the historical realities of the peoples depicted in the murals; and 3) the contemporary realities of these peoples seen through the critical lenses of multiple cultural perspectives.

Public Art to Achieve Specific Objectives

With regard to controversial public art, Lankford and Pankratz (1992) highlight two central questions that must be considered — what is the political environment and social context in which the art was produced, and what was the intention of the artist? For the Adams murals, these questions are answered directly by the university on its website. Then-president of the university James F. Zimmerman commissioned the four murals in 1939 with the purpose of depicting “the Indian, the Spanish, and the Anglo” and the “union of all three in the life of the Southwest” (University of New Mexico Libraries, 2009). The same webpage quotes a 1939 article in the student newspaper The Lobo stating that the murals “will be purely architectural decoration,” and goes on to describe the four murals. The article declares that the fourth mural “represents the dawn of a new day, all the contributions combining for better living … reflecting the spirit of democracy … three races as socially equal” (University of New Mexico Libraries, 2009). Adams’ intention was to comply with the two-fold commission articulated by Zimmerman — depict the history of the past union, and the dawn of a socially equal future.

Sociological and cultural aspirations for public art are not uncommon. In South Africa, before and immediately after Nelson Mandela’s release from prison, officially-sanctioned wall art covered entire blocks of buildings in cities across the country (Marschall, 2008). Used for both political and commercial purposes, the wall art projected ideals and possibilities for a post-apartheid democratic nation. As Marschall notes, however, after almost two decades those idealistic images depicting hopeful national aspirations have in most cases become crumbling markers of progress stymied, still awaiting social, economic, and political union.

Thus, a work of art created with historical motives can be evaluated in two directions — how accurately does the work depict the past and are its aspirations for the future realized?

Historical Accuracy

There can be no dispute that Adams’ flat paint glosses over the historical realities of “tricultural New Mexico” (Wilson, 2003) to an astonishing, if not irresponsible, degree. The adage that history is written by the victors has no more convincing witness than the White telling of the history of the southwestern U.S. Laura E. Gómez (2007) provides a detailed account of how White Anglos from the U.S. exploited historical animosities and rivalries between Indian and Mexican descendants for their own interests, especially after the Treaty of Guadalupe Hidalgo in 1848. By treaty, former Mexican settlers in territories ceded to the U.S. by Mexico were afforded citizenship in the new territory, including voting rights, which at the time were restricted to white males. Mexicans therefore had a legal claim to “whiteness,” as opposed to the native Indians, who were expressly defined as “not white” and ineligible for citizenship (Gómez, 2007). Socially, however, a huge divide separated the minority Whites who emigrated from the U.S. from the long-established “white” Mexican majority. This divide encompassed differences of language, religion, cultural traditions, and work ethic, which led to White stereotyping and denigration of Mexicans. For these and other reasons, Mexicans were more accurately considered “off-white” rather than white (Gómez, 2007). Thus was established a racial hierarchy with Whites at the top, Indians at the bottom, and Mexicans in between. Gómez documents the accepted practice of Indians serving as slaves, often to Mexican owners, through the early 20th century. She explains that, even though most of the western lands were ceded by Mexico at the same time, New Mexico and Arizona were not admitted to the Union until sixty-two years after California’s admission in 1850. This lifetime of a delay was due to the American public’s distrust of and skepticism toward Mexicans and Indians in terms of their intellect, temperament, and overall worthiness as franchised citizens (Gómez, 2007).

After admission to statehood, Mexicans in New Mexico and Arizona joined other Mexican Americans across the Southwest in facing more overt and explicit discrimination. Years before the phrase “civil rights” became associated with the struggle of black Americans for equal legal protections, Mexican Americans went to the courts to establish important precedents to remedy segregationist and discriminatory practices related to voting, housing, marriage, language, and education (Ruiz, 2006).

Native Americans suffered even more severe treatment, sanctioned by official U.S. Government policy. In addition to well-documented disputes over treaties, land rights, and tribal sovereignty, one of the signature denigrations of Native Americans was the Indian boarding school program. The rationale behind shipping Indian children off to distant boarding schools was stated, without shame, as “Kill the Indian, save the man” (King, 2008).

None of these historical and unjust realities is evident in the Adams murals. Instead, the White-dictated assimilation of the cultures is depicted in the fourth mural. Ostensibly projecting an idealized future, the blond, blue-eyed White mediates between the Mexican and the Indian, who are now shown in the “business casual” attire of the White. The inevitability of such forced assimilation is poignantly reflected in the comments of a Native American grandmother in 1950: “Some day we’re all going to be like white people” (Horse, 2005).

The tricultural myth is silent regarding the unfairness of the inherent tensions that Native Americans and Mexican Americans must continually reconcile, for which White Americans have no counterpart. Whites do not face the relentless pressures of maintaining a cultural identity with distinctive traditions while at the same time adapting to the more mainstream ways dictated by a dominating socio-economic-political culture. For Native Americans, especially those residing on reservations and pueblo lands, this dilemma is acute. Their challenges have been described as the need to reconcile five different and sometimes conflicting aspects of awareness: 1) ethnic nomenclature, or the labels that are applied to groups of “the other”; 2) racial attitudes held by the dominant Whites about “the other”; 3) the legal and political status of minority groups such as Native Americans; 4) the inevitability of cultural change; and 5) issues related to “personal sensibility” — native language, genealogy, world view, self-concept, and tribal relations (Horse, 2005).

The White appropriation of marketable Mexican American and Native American cultural attributes was the basis of the thriving tourism industry in the late 19th and early 20th centuries. The tricultural myth, including the peaceful and settled Indian and industrious “Spanish” (not “Mexican”), was a cornerstone of the tourism strategy to quell fears about the “Spanish” majority in New Mexico, as well as a lingering distrust of Indians (Gómez, 2007; Wilson, 2003).

Realization of Public Art Aspirations

These inconvenient historical realities continue to have consequences in the contemporary realities of Mexican Americans and Native Americans. Centuries-old grievances lie at the root of acts of political vandalism, such as that involving the monument erected to commemorate the area’s first Spanish colonial governor, Don Juan de Oñate, who arrived in 1598. In advance of planned celebrations of the 400th anniversary of Spanish colonization, the right foot of the memorialized bronze figure was sawed off, mimicking the heinous punishment commonly ordered against dissidents under Oñate. Communications to local newspapers from a group called The Friends of Acoma, a nearby pueblo, claimed responsibility (Trujillo, 2009). In Santa Fe, prior to the city’s largest annual market, Indian Market, vandals have covered the white Cross of the Martyrs, dedicated to Spanish priests killed during the pueblo uprising of 1680, with red paint (Grammer, 2010). The murals themselves have been targeted for protest, as Wilson (2003) notes that “the final panel was defaced with splattered paint twice in the early 1970s, and students repeatedly protested for their removal through the early 1990s” (p. 29). These events serve as data points indicating that the aspirations for tricultural assimilation have not been fully realized.

Historical Significance

Historian Christine Counsell (2004) has proposed the “Five R’s” for considering whether an event merits consideration as historically significant: it must be Remarkable (in that it has been remarked upon since its occurrence); Remembered; Resonant (in that it connects people to a specific memory or experience); have Results (consequences); and Reveal something about other events in the past. By this standard, not only does the myth of tricultural “union” qualify, but the mural itself qualifies as significant in its depiction of the myth. With respect to contemporary consequences for Mexican Americans, Gómez (2007) warns that “by continuing to uncritically reproduce the standard account of race in the United States, we may inadvertently reinforce white supremacy” (p. 143). Native Americans continue to struggle with the dilemma of resisting the “essentializing tendencies” that result in simplistic stereotypes, while maintaining a cultural identity that embraces those same tendencies (Brayboy, 2000).

Therefore it seems reasonable to conclude that, looking to the past, the Adams murals do contribute to the myth of tricultural assimilation and depict a false history. Looking to the future, they reveal the failure of seven decades of White-articulated aspirations based on denial, ignorance, and privilege. The question becomes, “now what?”

Methodology

Tradition and Philosophy

This prospectus reflects the influences of three world views or philosophical paradigms as described by Creswell (2007). Social constructivism recognizes the inevitable multiple realities that arise from different traditions, cultures, and perspectives. The Adams murals clearly reflect only the reality of White Americans of privilege and power. The advocacy/participatory paradigm is applicable because the proposed study does advocate for action. I am motivated by a desire not only to gauge awareness, but to raise awareness levels such that others perceive the need to “do something” beyond silently condoning the murals. The objectives of this study are also tempered by pragmatism in that it is reasonable to acknowledge both the existence of the murals and their significance as a fixture of the library and the university. There is no question that the murals will remain. There is a question as to what reasonable alternatives to the status quo may exist that reflect the informed sensibilities of the UNM community.

Creswell (2007) describes the overarching theories that inform this inquiry. “Critical theory perspectives are concerned with empowering human beings to transcend the constraints placed on them by race, class, and gender” (p. 27). The murals depict each of these constraints while purporting to deny them. To my mind, a necessary step toward empowering those affected by these constraints has yet to occur, and that is to formally acknowledge the complex and unattractive truths underlying historical myths. Creswell also explains that critical race theory focuses on “race and how racism is deeply embedded within the framework of American society” (p. 28). While some may argue that the murals do not depict different races but rather ethnicities, or cultures, or traditions, two arguments justify viewing the murals from a critical racial perspective — the university’s webpage refers to “the three racial groups” (University of New Mexico Libraries, 2009), and the extensive evidence of a racial hierarchy (Gómez, 2007) clearly meets the most general definition of racism, “the belief in racial superiority” embedded in “the structures of society which create racial inequalities in social and political institutions” (Neville, Lilly, Duran, Lee, & Browne, 2000). Such inequalities are inevitable outcomes of a society’s construction of “race” as a defining discriminator between “them” and “us.”

A third theory that informs this research is known as color-blind theory, which refers to the post-civil rights era belief that “ideological and structural racism does not exist” (Neville et al., 2000). Such beliefs result in a tendency to not see racism when it does exist, hence less sensitivity to objections about certain examples or manifestations of racial offense. I suggest that color-blindness is partially responsible for the status quo and reluctance to engage in critical dialogue about the murals.

Methods for Data Collection

Two populations are targeted in the proposed study: UNM students (Group A) and UNM faculty/staff (Group B). Purposive sampling will be used for both populations. The Group A samples will consist of eight intact university classes whose professors have agreed to participate in the study. Undergraduate classes in the language arts and social studies departments will be solicited to match these criteria: a) a minimum of 20 enrolled students; b) a gender imbalance of no more than 60/40; and c) the instructor is confident that the study survey can be unobtrusively introduced within the scope of the class. The Group B sample will be recruited from faculty and staff volunteers at UNM. Potential respondents will be solicited through written and word-of-mouth communications. Up to 60 respondents will be selected to participate in Group B. Institutional Review Board approval will be required because the study involves human subjects.

Data collection will consist of two phases for each group. The first phase will be to administer a survey to the Group A intact classes and the Group B individuals. The surveys will be administered to the Group A classes by the instructors, who will read a prepared script, then provide the students with a consent form and the survey forms. Group B participants will receive a packet at their work place that includes an instruction sheet, consent form, survey form, and a postage-paid addressed envelope in which to return the completed materials.

The survey will include the Color-Blind Racial Attitudes Scale (CoBRAS), which provides a reliable measurement of racial color-blindness (Neville et al., 2000). The 20-item instrument employs a 6-point Likert scale for expressing agreement or disagreement with statements such as: “White people in the U.S. have certain advantages because of the color of their skin”; “Everyone who works hard, no matter what race they are, has an equal chance to become rich”; and “Race plays an important role in who gets sent to prison.” The instrument was initially tested and validated in a series of 5 studies involving 760 university students (Neville et al., 2000). Appended to the CoBRAS form, using the same 6-point Likert scale, will be 4 statements such as: “I am familiar with the murals in the west wing of the Zimmerman Library”; “I am aware of objections to the murals”; “I have an opinion about the appropriateness of the murals”; and “Although I may not agree with them, I understand why some objections have been voiced.” At the conclusion of the survey, space will be provided for the participant to volunteer for the second phase of the study.
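
A minimal scoring sketch follows, assuming a simple sum of the 6-point responses; the reverse-keyed item indices shown are hypothetical, since the CoBRAS scoring key is not reproduced here.

    # Hypothetical reverse-keyed item indices, for illustration only
    REVERSE_KEYED = {2, 5}

    def score_survey(responses):
        # Sum 6-point Likert responses, flipping any reverse-keyed items
        total = 0
        for i, r in enumerate(responses):
            total += (7 - r) if i in REVERSE_KEYED else r
        return total

    print(score_survey([4, 5, 2, 3, 6, 1]))  # one hypothetical respondent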

The second phase of data collection will consist of a series of group interviews: up to 4 groups of 8 volunteers from Group A, and 2 groups of 6 faculty/staff volunteers from Group B. Interviews will be scheduled to last 60-90 minutes. The group interviews will be led by the evaluator, who will provide each group with an explanation of the objectives for the interview, slides of the four murals, and a summary of some of the objections to the murals, and will then facilitate an open-ended discussion with participants concerning their feelings and attitudes about the murals. Tailored from questions suggested for classroom discussions about public art (Argiro, 2004), questions such as these could be asked to stimulate discussion: If the characters in the murals emerged today, what stories might they tell about their lived experiences? What might each of the three “races” object to, or embrace, in each of the murals? If a similar work were to be commissioned today, what would you suggest to President Schmidly to include, or avoid?

The researcher will be assisted by a colleague whose role is to observe, take notes, and prompt the facilitator as necessary. Each session will be audio recorded.

Each survey will be scored and analyzed to compute the CoBRAS color-blind score and record the responses to the 4 questions regarding awareness of the murals. Results will be entered into an Excel spreadsheet and summarized by sample, Group, and total. Means and standard deviations will be calculated. CoBRAS scores will be compared with responses to the 4 additional questions to look for any correlations between degree of color-blindness and awareness of, or sensitivity about, the murals. Detailed notes from the group interviews will be transcribed and analyzed. Items of interest will be coded, and categories will be recognized as they emerge from the analysis. Observations regarding the interpersonal dynamics of the participants, revealing comments, and tabulated frequency results from coded items will be captured. From all these data, a thorough narrative report will be prepared and delivered to the university administration.
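
As an illustration of the planned quantitative summary, the sketch below computes a mean, a standard deviation, and one correlation between CoBRAS totals and a single awareness item; all values are hypothetical.

    import numpy as np

    # Hypothetical results: CoBRAS totals and responses to awareness item 1 (1-6 scale)
    cobras_totals = np.array([48, 62, 55, 71, 40, 66, 58, 50])
    awareness_item1 = np.array([4, 2, 3, 1, 5, 2, 3, 4])

    print("CoBRAS mean:", cobras_totals.mean(), "SD:", round(cobras_totals.std(ddof=1), 1))
    r = np.corrcoef(cobras_totals, awareness_item1)[0, 1]
    print("r(CoBRAS, awareness item 1):", round(r, 2))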

Conclusions/Implications

It is hoped and anticipated that this study will raise awareness of the Kenneth Adams murals beyond those who participate in the survey and interviews. I hope to capture a representative sample of the UNM community’s sentiments about not only the murals as art, but also the historical and sociological issues inherent in the murals. Through the interviews, I expect to gain insight into the effects of a rational and informative conversation about the murals on those who previously expressed little awareness or sensitivity about them. I believe that a significant number of these participants will agree that action is called for, beginning with a dialogue with the university administration. I hope that such a dialogue would become an ongoing activity beyond the limited term of the study, bringing much-needed and long-awaited focus on the lived realities of the three depicted peoples, rather than the muralized myths perpetuated by White privilege.

References

Argiro, C. (2004). Teaching with public art. Art Education, July, 25-32.

Brayboy, B.M. (2000). The Indian and the researcher. Qualitative Studies in Education, 13(4), 415-426.

Counsell, C. (2004). Looking through a Josephine-Butler-shaped window: focusing pupils’ thinking on historical significance. Teaching History, 114, 30-36.

Creswell, J.W. (2007). Qualitative Inquiry and Research Design: Choosing Among Five Approaches (2nd ed.). Thousand Oaks, CA: Sage Publications, Inc.

Gómez, L.E. (2007). Manifest Destinies: The Making of the Mexican American Race. New York and London: New York University Press.

Grammer, G. (2010, August 23). Vandals hit Cross of the Martyrs for third year in a row. The Santa Fe New Mexican. Retrieved from http://www.santafenewmexican.com/localnews/Cross-of-the-Martyrs-Third-year-for-vandals–message-in-red

Horse, P. G. (2005). Native American Identity. New Directions for Student Services, 109, 61-68.

King, C.R. (2008). Teaching intolerance: Anti-Indian imagery, racial politics, and (anti)racist pedagogy. Review of Education, Pedagogy & Cultural Studies, 30(5), 420-436. doi: 10.1080/10714410802426574

Lankford, E.L., & Pankratz, D.B. (1992). Justifying controversial art in arts education. Design for Arts in Education, 93(6), 17-25.

Marschall, S. (2008). Transforming symbolic identity: Wall art and the South African city. African Arts, 41(2), 12-23.

Neville, H. A., Lilly, R. L., Duran, G., Lee, R. M., & Browne, L. (2000). Construction and initial validation of the Color-Blind Racial Attitudes Scale (CoBRAS). Journal of Counseling Psychology, 47, 59-70.

Noley, G.B. (2008). Writing American Indian history. American Educational History Journal, 35(1/2), 95-101.

Ruiz, V.L. (2006). Nuestra América: Latino history as United States history. Journal of American History, 93(3), 655-672.

Steinfeldt, J.A., & Wong, Y.J. (2010). Multicultural training on American Indian issues: Testing the effectiveness of an intervention to change attitudes toward Native-themed mascots. Cultural Diversity and Ethnic Minority Psychology, 16(2), 110-115.

Trujillo, M.L. (2009). Remembering and Disremembering. In Land of Disenchantment: Latina/o Identities and Transformations in Northern New Mexico (pp. 27-56). Albuquerque, NM: University of New Mexico Press.

University of New Mexico Libraries. (2009). Murals by Kenneth Adams in Zimmerman Library. Retrieved from http://elibrary.unm.edu/zimmerman/murals.php

Wilson, C. (2003). Ethnic/Sexual Personas in Tricultural New Mexico. In Rothman, H. (Ed.), The Culture of Tourism, the Tourism of Culture: Selling the past to the present in the American Southwest (pp. 12-37). Albuquerque, NM: University of New Mexico Press.

———————

SUBSEQUENT NOTES:

  1. A version of the material found on the UNM website regarding the murals is posted on a piece of paper next to the first mural in the Zimmerman Library. This was overlooked when I prepared the prospectus and should have been mentioned. If anything, to my mind it does more harm than good by explicitly stating the underlying myth.
  2. In an email dated December 15, 2010, Dr. Bedard objected to my summarized characterization of the views that she had expressed about the murals during the previous July and August. After carefully reviewing that email exchange, I maintain that my wording in this prospectus is appropriate.

Methodological Review

A Methodological Review of
Muslim American Youth: Understanding Hyphenated Identities through Multiple Methods
by Selcuk R. Sirin and Michelle Fine

In Muslim American Youth: Understanding Hyphenated Identities through Multiple Methods (2008), Selcuk R. Sirin and Michelle Fine document two years of research on what it means to come of age as a Muslim living in the shadow of moral exclusion in post-9/11 America. As indicated by the title, the authors employ a variety of research methods, drawing from both qualitative and quantitative approaches in order to study the effects of living on, at, and in the hyphenated construct of “Muslim American.” Sirin and Fine use the figurative (not literal) hyphen as a metaphor throughout the book to highlight the identity-related tensions that this group of young Americans, who also happen to be Muslims, must reconcile.

Based on their own upbringings and lived experiences in the Muslim (Sirin) and the Jewish (Fine) traditions, and informed by a thorough review of historical precedents and insightful analysis of previous empirical research on the effects of discrimination and prejudice, Sirin and Fine document their study and findings in descriptive and engaging detail. They present a compelling portrait of the many and varied complexities that underlie not only the multiple realities of Muslim American identities, but also the research methodologies required to study, analyze, and understand such diverse realities. They challenge pre-defined research categories and paradigms (e.g., their insistence on self-identifying as methodologists rather than as qualitative or quantitative researchers) as well as conventional thinking (e.g., the “clash of civilizations” frame). The perspectives that guide their “research exploration” (p. 13) are complex, flexible, adaptable to the subject under investigation, and not beholden to any particular textbook-based theoretical approach.

Therefore, while this methodological review of the Sirin and Fine study will follow the general flow of a naturalistic study as defined by Lincoln and Guba (1985), I will attempt to highlight deviations or exceptions to the Lincoln and Guba process as applicable.

Theoretical Paradigms

Using the four theoretical paradigms described by Creswell (2007, Chapter 2), this study by Sirin and Fine does not fit discretely into any one paradigm, but manifests perspectives which can be attributed, to some degree, to each of the four.

While the study relies primarily on qualitative methods, certain aspects of Postpositivism are clearly exhibited by Sirin and Fine, including: their rigor and scientific approach in defining and conducting methods and processes, as well as in their documentation of results; their use of previously-validated quantitative instruments, such as assessment tools for measuring perceptions of discrimination and prejudice; their recognition of multiple perspectives; and the use of multiple levels of data analysis.

The Social Constructivism worldview is also clearly reflected in their research methods, as well as the language that Sirin and Fine use in describing their results. While not specifically using the term “multiple realities,” they repeatedly emphasize the importance of context in multiple dimensions — as a group, as individuals, as a gender, as a generation, during a specific period, etc. They struggle with whether to use the label of “Muslim American,” acknowledging that there is no such single, fixed identity as the categorical label implies. Therefore they make concerted and conscientious efforts to qualify the term when used within a specific context. For them, the underlying purpose of this study is to understand and give meaning to the phenomenon of growing up as a Muslim in post-9/11 America. To achieve this, they devote significant effort to giving voice to those living the phenomenon. In doing so, they take care to constrain their findings and qualify their generalizations, recognizing the contextual limits of such generalizations.

The Advocacy/Participatory worldview is evidenced by the authors’ decision to study this specific demographic group — a group defined and constructed by social and political events following 9/11. Sirin and Fine exhibit this perspective in their participatory engagement with the young people they recruited for the study. Given the sensitive and personal nature of the questions they asked and the topics they addressed through interviews, focus groups, surveys, and identity map drawings, Sirin and Fine forced their participants to self-reflexively think about not only their responses, but the feelings and attitudes about their life conditions that yielded those responses. These 200 or so young people could not avoid being personally affected, to varying degrees, by their participation in such a study. And I presume that readers of this book cannot avoid being personally affected after gaining such an insightful understanding of the lived experiences of these young participants and fellow American citizens.

Lastly, the worldview of Pragmatism is reflected in the authors’ insistence on avoiding categorical labels while emphasizing their roles as methodologists. Indeed, they profess the belief that there is “no methodological justification to limit ourselves to a single ‘fixed’ methodology.” (p. 24)

Informing the Study

Sirin and Fine prepared for this complex and uncertain research exploration by establishing an extensive and informed foundation of history, research results, and potential methods. Their personal backgrounds — Sirin as a child raised by a secular father and devout Muslim mother, forced to “come out” as a Muslim after 9/11 (p. 22), and Fine as the child of Jewish immigrants who has previously compiled an impressive record of critical feminist research — reflect their own histories of living “at the hyphen” (p. 23) of national and religious identities. As researchers for this specific study, they bring a unique blend of both lived experience and informed praxis that are required in order to define, implement, and analyze such a complex and multi-dimensional study. They recognize the challenges of such an inquiry in their insistence on identifying themselves as “methodologists.” (p. 23) They eschew association with “qualitative and quantitative camps” (p. 23), insisting on the flexibility of employing multiple methods in order to provide the most accurate, robust and insightful portrait of their subjects.

Sirin and Fine demonstrate an impressive familiarity with American history and the body of previous empirical research that addresses the impact of prejudice and discrimination on minority youth. They summarize the history of targeted marginalization of minorities: the enslavement of Africans beginning in the 17th century, the subjugation of Native American Indians during the westward realization of manifest destiny, the oppression of Mexican and Hispanic descendants after the Treaty of Guadalupe Hidalgo, the Jim Crow segregation of blacks that continued through the Civil Rights era, and the internment of Japanese Americans during World War II. In historical context, therefore, the current “moral exclusion” of Muslim Americans is nothing new. To inform the design and scope of their study, Sirin and Fine draw on the body of prior moral exclusion research literature, available instruments to assess perceived discrimination and prejudice, demographic census data, and the advice of an Advisory Group composed of six student participants.

The authors acknowledge approximately thirty individuals who assisted with the study, from conducting the focus groups to providing clerical support. They acknowledge a debt to New York University and City University of New York for providing facilities and support throughout the two-year study, which was funded by New York University and the Foundation for Child Development.

Study Process and Methods

Having bounded their study to focus on the phenomenon of Muslim American youths coming of age “on the hyphen” and in the shadow of moral exclusion in post-9/11 America, Sirin and Fine quite explicitly selected their sample of participants. They required that potential participants identify themselves as Muslim, be either first or second generation in this country, have become official residents of the United States prior to age 10, speak fluent English, and be between the ages of 12 and 25. The selected sample of over 200 young people provided adequate representation from which the researchers were able to collect raw data through a variety of forms, including pre-existing validated assessment surveys, questionnaires, focus groups, in-depth personal interviews, and psychological projection tools such as identity maps.

They employed an array of techniques to analyze the collected data. While the scoring of the quantitative surveys was straightforward, Sirin and Fine had to implement a thoughtful and rigorous process for interpreting and understanding the voluminous data acquired through the interviews and focus groups. Not only did they record these encounters (evidenced by extensive verbatim quotes in the book), but they ensured that at least two researchers were present in order to compare reactions and interpretations. As they compiled these data, they developed a scheme for categorically coding the results in order to systematically analyze, tabulate, and evaluate what the data meant and suggested. From these data, Sirin and Fine were able to provide rich portraits of individuals, accurate composites of distinct types, and statistically relevant characterizations of the overall sample. For interpreting the psychological projections inferred from the identity maps, they relied not only on Fine’s expertise as a psychologist but also consulted other professionals to confirm interpretations and judgments. From these interpretations, three categories emerged that summarized the inferred assessments of how well the young people had either: 1) integrated their Muslim identities with their American identities; 2) maintained parallel and unconflicted identities; or 3) exhibited significant conflict between their two identities. (p. 121) These results exemplify the thorough and deliberate processes applied by Sirin and Fine in their efforts to inductively analyze the data in order to understand its significance and meaning.
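
By way of illustration, tabulating coded excerpts by category might look like the sketch below; the code labels reuse the three identity categories the authors report, but the excerpt assignments are hypothetical.

    from collections import Counter

    # Hypothetical category codes applied to interview and focus-group excerpts
    coded_excerpts = [
        "integrated", "parallel", "conflictual", "integrated",
        "integrated", "parallel", "conflictual", "integrated",
    ]
    for code, count in Counter(coded_excerpts).most_common():
        print(code, count)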

From what began as a research exploration lacking an a priori thesis or prescribed process, a design for the study emerged from the inferences and understandings suggested by the data. The first example of this recounted by Sirin and Fine was their reliance on an Advisory Group of six students, who provided initial guidance and feedback on certain topics or themes to address or avoid. The study proceeded as the researchers assessed the results from their various data collection techniques, not according to a predetermined schedule.

This emergent process, over the course of the two-year study, produced the results that Sirin and Fine document in the book as descriptive findings, careful conclusions, and limited generalizations. The research results provide not only a detailed description of the targeted phenomenon, but they suggest a theoretical viewpoint about how any marginalized subset of American society is affected by being forced to “live at the hyphen” of two different identities. This theoretical perspective is thoroughly grounded in the research and extensively documented by the authors.

Study Results

Although Sirin and Fine do not comply explicitly with Lincoln and Guba’s mandate that study results “must be subjected to scrutiny by respondents who earlier acted as sources,” (p. 211) they do provide evidence that their data and conclusions derived from the data were confirmed or negotiated with the participants throughout, rather than after, the study. They give voice to their participants indirectly through the guidance and feedback offered by the Advisory Group. They give direct voice to their participants by the numerous and sometimes lengthy verbatim quotations of the participants’ comments from interviews and focus groups. Their insistence on having at least two researchers present during discussions indicates to me an exemplary sensitivity to accuracy and thoroughness at the point of the researcher-participant encounter, rather than a more distant after-the-fact member checking.

The narrative form that documents this study (the book) could be considered a “case report” as loosely defined by Lincoln and Guba (p. 214). Sirin and Fine protect the anonymity of their respondents by using pseudonyms. Their detailed accounts of historical precedents, as well as extensive documentation of their methods, observations, encounters, and analyses of all forms of collected data, represent fine examples of “thick description.” However, they do not focus their study on individual cases, but on the sample as a whole. The profiles of specific individuals, the transcribed excerpts from interviews and focus groups, and the selected identity maps all provide the rich details about individuals that would be expected in a case report, but Sirin and Fine use these as data points to portray and characterize the sample from which to draw tentative conclusions.

This study serves as a fine example of an “idiographic interpretation,” as defined by Lincoln and Guba. (p. 216) Sirin and Fine repeatedly emphasize the importance of context in terms of time, place, and individual. They point out the “deep distinctions and variations among Muslim Americans” (p. 43) revealed through their research based on discriminating variables such as age, gender, generation, geography, religious sect, etc. They strive to understand the perspectives of their participants according to their particular contexts, and they take care to generalize only as appropriate to the data. Their struggle with the label of Muslim American reflects their sensitivity to terms, not out of simple political correctness but as a sincere effort to accurately describe a potentially inaccurate, yet still useful, verbal construct.

This overall attitude of context-specific and holistic interpretation carries through to their summary conclusions regarding the meaning of and applications for their findings. Throughout their narrative, Sirin and Fine exhibit a self-awareness of the inherent limitations of a study like this that involves dynamic, developing human beings. They characterize their conclusions in qualified language commensurate with their findings, without speculation. They describe findings using phrases such as provide evidence for and lean toward rather than in more dogmatic or absolutist terms. This articulation of tentative conclusions and applications for their work is evidenced in the title of the very last section of the book, “Concluding Thoughts … For Now.” (p. 204)

Lincoln and Guba conclude their discussion of the naturalistic inquiry process by suggesting four criteria to assess the trustworthiness of a study: credibility, transferability, dependability, and confirmability. (p. 219) Applying these criteria, the Sirin and Fine study certainly scores high in credibility. They conducted a prolonged engagement with persistent observations over the two-year period of the research project. They employed a variety of research methods, data collection and analysis techniques, and activity sequences with their participants in order to triangulate results for more robust, valid, and supportable conclusions. They sought peer reviews by consulting advisors and other experts throughout the study period, and they solicited critical comments from peers as they prepared their manuscript. They pursued some measure of negative case analysis by noting individual exceptions whose responses or identity maps did not conform to the majority or the norm. They obtained some degree of member checking throughout the process (as explained above under “Study Results”), although they did not apparently provide an opportunity for respondents to review the manuscript that documented their final results.

With respect to transferability, in Muslim American Youth Sirin and Fine have provided a full accounting, or “thick description,” (Lincoln and Guba, p. 219) of not only their research methods and processes, but their own histories, perspectives, tentative conclusions, and suggestions for future research. They have provided a well-documented template which other researchers could use to conduct similar studies for similar purposes, using different participants of perhaps different ages, in different geographic areas, from different socioeconomic backgrounds, etc. Therefore I would maintain that this study exceeds the threshold defined by Lincoln and Guba for transferability.

Finally, Lincoln and Guba suggest that auditing can establish a measure for determining dependability and confirmability. (p. 219) Although Sirin and Fine do not indicate that any audit was performed on their data, I presume from the material they present in the appendices that they retained all raw data and that such an audit could be performed.

Critique

As this was the first book-length qualitative research study I have read, I was encouraged to find that a thorough and rigorous empirical study could be documented in such a readable and engaging format. While I have several criticisms regarding some of the narrative choices made by Sirin and Fine in their organization of the book, I can offer only one comment concerning the study itself. In applying the methodological framework described by Lincoln and Guba, the one recognized omission is that Sirin and Fine did not specifically state that they negotiated outcomes with their respondents during the preparation of the manuscript. Without knowing this for sure, I can only speculate between two possibilities — they did not solicit comments, or they did but chose not to mention it in the book. My criticism, then, is that it would have been informative and beneficial for the reader if the authors had explained the reason for the omission.

Several aspects of the book’s organization caused me some difficulties as a reader. First, the non-linear organizational structure chosen by the authors made it impossible to discern the sequence of the study. Second, I was confused by the inclusion of so many different participant names. The authors highlighted six students in different chapters, but throughout the book also referred to or quoted a dozen or so others. I would have found it easier to relate to the students on a more personal and intimate basis if the authors had included more about the six highlighted students throughout the book, rather than in isolated profiles. Third, although I very much appreciated the inclusion of the colorful identity maps, I found it cumbersome and interruptive to have to leaf to the back of the book to view the maps referenced in the text. (I realize this choice likely resulted from a publication necessity to colocate the glossy pages in order to reproduce the color artwork.)

Lastly, even though they did explain that references to “the hyphen” were to be taken figuratively (or metaphorically), I feel there is a legitimate case to be made that a literal hyphen (“Muslim-American”) may have provided a more effective construction in presenting their results. Had they chosen to use the actual hyphen, it would have been easier to visualize, and therefore easier to manipulate. For example, in discussing the categories that emerged from their analysis of the identity maps, Sirin and Fine could have demonstrated the joining effects on the hyphen in the case of those students who evidenced integrated identities (in which case the hyphen is appropriate), compared to those who felt parallel or non-integrated identities (no hyphen or perhaps “Muslim|American”), compared to those with conflictual identities (perhaps “Muslim/American”). This could have led to a more generalized discussion that challenges not only the appropriateness of the hyphen, but the appropriateness of all multiple-identity monikers (e.g., African Americans, Hispanic Americans, Native Americans, etc.). However, given the bounds of the research study as they defined it, their decision to use the figurative hyphen is understandable.

References

Creswell, J.W. (2007). Qualitative Inquiry & Research Design: Choosing among five approaches (Second Edition). Thousand Oaks, CA: Sage Publications, Inc.

Guba, E.G. and Lincoln, Y.S. (1994). Competing paradigms in qualitative research. In N.K. Denzin and Y.S. Lincoln (Eds.), Handbook of Qualitative Research (pp. 105-117). Thousand Oaks, CA: Sage Publications, Inc.

Lincoln, Y. and Guba, E. (1985). Naturalistic Inquiry. London: Sage Publications, Inc.

Sirin, S.R. and Fine, M. (2008). Muslim American Youth: Understanding Hyphenated Identities through Multiple Methods. New York and London: New York University Press.

WWbhD?

What Would bell hooks Do? (Regarding the Kenneth Adams murals in the Zimmerman Library, University of New Mexico)

June 2010

Per the objectives for this course, I have gained a greater understanding of how educational institutions reproduce social inequalities. I have also studied, and understood to some degree, the severe criticisms levied against these institutions by authors such as Paulo Freire, Ivan Illich, and bell hooks. Thanks to Bill Moyers, I’ve become acquainted with the work and accomplishments of Myles Horton, founder of the Highlander School. Having no prior knowledge of any of these educators, their philosophies, or their work, and therefore no pre-existing bias or opinion, “openness” was not an issue for me in assessing their views, although I don’t concur with or accept all of them.

However, when it was suggested to the class that on this very campus there existed a good example of the white supremacist, racist, male-dominated oppression that has fueled the critical theorists, I bristled. Before I had a chance to experience and evaluate it for myself, I heard from a figure of authority that there was something on campus — specifically, the mural in the west wing of the Zimmerman Library, funded by the Works Progress Administration (WPA) in the 1930s — that I was supposed to find offensive. So what if I didn’t find the mural panels racist, or objectionable, or insensitive? What if I determined through my own “openness” of experience and reflection that the mural simply depicted the artistic views of one artist, views that were inevitably influenced and shaped by his times? Was there room here for an open debate and consideration without resorting to easy, lazy, and dismissive labels? Or had it already been decreed that the panels were racist, and all the rest — case closed? And if that was the case, did that mean there was something fundamentally deficient and wrong with me that I didn’t react with the requisite outrage expressed by one authority figure?

I had to see the murals for myself. Last Friday I made a quick pass through the library, fully prepared to find the murals benign, the artist unfairly maligned, and further evidence of how someone sees what they want to see because it fits the narrative of their own predisposed agenda.

Instead, I had to shake my head and sigh at what I saw. I didn’t even stop to analyze the specifics. As I absorbed the fourth panel, I thought to myself — this is bad. I hurried out and away with questions such as who did this, and why, and more specifically, what were they thinking?

When I Googled UNM Zimmerman library mural, the first search result was the library’s own page that describes and explains the mural, painted by Kenneth Adams, a member of the Taos Society of Artists (http://elibrary.unm.edu/zimmerman/murals.php). The library building, including space for the four panels in the west wing, was designed by the well-known southwestern architect John Gaw Meem and was named a building of the century by the American Institute of Architects (http://elibrary.unm.edu/zimmerman/history.php). According to the mural’s web page, then university president James Fulton Zimmerman envisioned the four panels to depict the Indians as artists, the Spanish contributing agriculture and architecture, the Anglos contributing science, and the fourth panel showing “the union of all three in the Southwest.” With his commission from Zimmerman, Adams was given complete artistic freedom to interpret and express this vision.

Upon seeing the mural panels myself, without knowing the history behind them, I based my “this is bad” response on what seemed to me obvious objections. Each different “race” is represented by stark color differences; the Indians and Spanish are depicted in subservient poses with heads bowed, the women kneeling, the men engaged in menial labor wearing “native” work clothes; the fair-haired, blue-eyed Anglo doctor is responsible for delivering life as his identically-fair assistants are seated doing “scientific” work; and then the “union” of the three (male) races is made possible by the Anglo in the middle, facing outward with full facial features, with the Indian and Spanish now adopting the Anglo’s clothes, joined only through the patriarchal Anglo, both faces in profile without discernible features.

After reading the history of the mural and Zimmerman’s charge to Adams, the deeper issue to me is the underlying presumption that the three “races” had indeed “united,” with the clear implication that this union resulted from mutual and peaceful willingness. There is no hint of the Spanish conquest (through both military force and religion) and subjugation of the native peoples; of the wars that Anglos (Americans) fought with the Spanish descendants and Mexicans; of the broken treaties; of the exploitative native-as-tourist-attraction commercialization of a culture; or of the near eradication of that culture through patriarchal and dominating policies, such as the Indian schools, that the Anglos (Americans) perpetrated upon the proud indigenous peoples who had occupied these lands for over a thousand years.

Both in terms of the presumptions that created the vision and the artistic expression of that vision onto the panels, I find the mural worthy of offended judgments, sincere objections, and harsh criticism, irrespective of its otherwise “artistic” contribution to its historic home.

So … now what? The mural is there. What do we do with it?

Let’s consider a range of actions available to critics, along a continuum from the most benign (a “1”) to the most radical (a “5”). At one extreme, a “1” would be to do nothing: maintain the status quo, tolerate the mural, don’t draw attention to it, and let it be. At the other extreme, a radical “5” would channel the harshest, most visceral sense of offense into eliminating the offensive material by any means necessary, defacing or destroying it. Not surprisingly, I would not advocate either extreme.

A less benign approach (call it a “2”) would be for critics to request a meeting with the university administration to present their concerns about the presumptions and depictions symbolized by the murals. The objective of this approach would be merely to get the hearing, while trusting in good faith that the administration would act with good judgment after duly considering all the factors.

A less-than-fully-radical approach (a “4”) would also include a meeting with administrators, but the objective of this more aggressive approach would be to demand a prescribed action. The demands themselves could be chosen from a spectrum of possibilities: paint over the murals, commission a more critical artwork to reside in the same room as a counterpoint to Adams’ murals, or simply mount a small written display that explains to the viewer why the work was commissioned and why some find it objectionable.

The “3” option, which per my scheme reflects my own feelings, would be to dialogue (in the terms of critical theory) with the administration, the objective being to use the murals as an ongoing learning opportunity on the university campus. Rather than being scorned and despised by some while ignored by others, the mural could become an educational asset for all. Some examples: it could be a stop for all incoming freshmen during summer orientation, either with an informative briefing about the mural and its history so that they can interpret it in context, or simply as a chance to experience it for themselves and then discuss their reactions; it could be the source of assigned essays for many different courses from various perspectives (artistic, historical, sociological/cultural, psychological); and it could serve as an example of how different our perspectives are now from what they were 70 years ago, and as a basis for speculating how the perspectives of people 70 years in the future might differ from the ones we hold today.

That’s what I would do about the Kenneth Adams mural. But having read Teaching Community: A Pedagogy of Hope by bell hooks this week, I’ve wondered … what would bell hooks do? (WWbhD?)

I won’t attempt to seriously speculate on an answer to that question, but based on her book, here are some reasons why I believe bell hooks might support my solution. First, she emphasizes throughout the book the importance not just of thinking, thoughts, and words, but of the actions and behaviors that result; I don’t believe she would tolerate silence or be content with verbal outrage. Second, she advocates living in community, which for her goes beyond simple recognition of diversity to what she defines as pluralism, or “commitment to communicate with and relate to the larger world — with a very different neighbor, or a distant community.” She recognizes the humanizing value of living with and among “folks not like us” (my term). Entering into the dialogue I’ve proposed initiates that type of communication. Third, she reiterates that those seeking to advance anti-racist, anti-white-supremacist attitudes (and behaviors) must be willing to accept, and expect, that change among racists is possible, and that change can and must involve learning to unlearn racist, white supremacist, patriarchal ways and views. Therefore I believe that bell hooks would also see the potential for learning and the opportunity for community building that the presence of the Kenneth Adams mural provides.

Would something like what I’ve proposed work? Does something, anything, even need to be done? What if nothing is done? It seems to me that the worst failing for an educational institution is to avoid a learning opportunity or fail to take advantage of a “teachable moment.” I’ve experienced this before. As I wrote in a column for the Fort Worth Star-Telegram in May 2008, two months earlier the trustees of TCU had run away from hosting an event to honor and feature the Reverend Jeremiah Wright, less than two weeks after then-Senator Barack Obama’s speech on race in America, a speech precipitated by video clips from Reverend Wright’s sermons. I asked then, what if Reverend Wright had been permitted to come as planned? What educational value might the TCU students have derived? “This community had an opportunity to go beyond talking about talking about race. We could have started the conversation.” The pressure to thwart that conversation was exerted by white male patriarchs whose primary concern was not educating the students but not offending multi-million-dollar donors. Here at UNM, there does not seem to be any such pressure from either critics or the administration. So perhaps this is simply an essay written to complete an assignment. But it seems to me that someone should be considering whether this is a learning opportunity too valuable to ignore. What matters isn’t the speculation about what bell hooks would do, but the consequences of what UNM does, or doesn’t, do.

Changing Thinking (Reaction to Ivan Illich’s Deschooling Society)

After reading the first chapter of Ivan Illich’s Deschooling Society, I felt something like a cheerleader rooting for the home team. With no significant exceptions, I agreed with his characterizations of problematic schools created by, and perpetuated within, a problematic society (thinking of the U.S. specifically). I concurred with his analysis and the presentation of his arguments regarding the economic waste associated with schooling; the cycle of dependency that schools create within society; the misplaced trust in and reliance on credentials and certifications; the instilled presumption that children (and adults) must learn within the context of a school or else “learning” doesn’t take place; the inevitable and inarguable social divisions that schools exacerbate; and, perhaps most damning, the fact that despite ever-increasing resources and expenditures, the gap continues to widen between the actual performance of schools and the expectations of the communities that support them.

However, after standing to cheer Illich for his “victory” in articulating the problems, I found myself dealing with a disillusionment similar to what I felt with Freire. Once again, I perceived a major disconnect between the insightful diagnosis of a dire, perhaps intractable, institutional failing and the resulting grand pronouncements of an ideologically consistent but pragmatically unworkable “non-starter” of a prescription. When I read Illich’s admission on page 73 that the “educational institutions I will propose, however, are meant to serve a society which does not now exist,” my inclination was to throw the book to the floor. So why even bother? At what point do these intellectually gifted but reality-challenged revolutionary theorists acknowledge that they do not have the luxury of starting clean, without the dirty constraints of a real world peopled by real people who continue to act every day in accordance with an established set of ideals, attitudes, presumptions, beliefs, expectations, and uncompromising demands?

Thankfully, a counterexample was exhibited yesterday in Bill Moyers’ interview with Myles Horton regarding his Highlander School. Rather than theorize about an idyllic prescription to manifest “the revolution,” Horton took the approach that engineers might call a “prototype” or “proof of principle.” On a small scale, he implemented big ideas. He demonstrated in action the first necessary change required for “the revolution”: that people can learn to think differently from what they have been taught; indeed, they can learn to think for themselves, as themselves.

I don’t recall that Horton talked specifically about changing thinking, but I interpreted this from his comment that (paraphrasing) we’re about people, not institutions, and nothing will change until we change.

I immediately recalled the quote attributed to Einstein, in various forms, that I’ve noted as, “The world we have created today as a result of our thinking thus far has problems which cannot be solved by thinking the way we thought when we created them.” I believe that Myles Horton lived the principle that Einstein advocated; not only did he change the way he thought about problems, he taught (or facilitated) others to change their own thinking. I also infer from his comment to Moyers that Horton understood the abstract nature of what our language reifies as institutions. Institutions are created, perpetuated, and peopled by people. Therefore to change an institution, you have to change the people who people that institution. And the first step in the change process is to change the thinking.

Two recent experiences illustrate the effects of “institutional” thinking that has not changed because the thinking of individuals has not changed.

Yesterday morning I left my apartment at 6:45 a.m. to walk to the Santa Fe Depot to catch the train to Albuquerque. As soon as I stepped outside I heard a low-level humming noise that I first associated with a lawn mower. Within two blocks of the downtown post office, I located the source: a worker with a backpack-mounted, gasoline-powered leaf blower, clearing the employee parking lot on the north side of the post office. (I assume stronger-than-usual winds the night before could be blamed for whatever needed to be blown.) My first judgmental reaction was to the early-morning noise pollution this worker was creating, which caused me to ask rhetorically, “Why couldn’t you just use a broom?” Not only would a broom not disturb the neighbors within a three-block radius, it also wouldn’t waste gasoline and burn more carbon.

I contemplated this question all the way to the Depot. I came to the conclusion that this worker, and undoubtedly his employer, who contracted with whatever governmental entity managed the facilities for the post office, chose the gasoline-powered leaf blower because it was the most time- and cost-efficient solution. Assuming the parking lot needed to be leaf-free, the leaf blower might get the job done in a fraction of the time it would take the same worker to sweep the leaves with a broom. For the contractor, the costs of the gasoline, the emissions, and the noise were insignificant compared to his out-of-pocket hourly cost for the labor. With the more efficient blower, the contractor could offer the facility manager a good value proposition: for perhaps no more than $20 (assuming an hour of labor with applicable overheads and profit), the contractor would give her a leaf-free lot.

But I kept thinking, why not the broom? If it took the worker an hour with the leaf blower, would it take that much longer with a broom? Which led me to question the need for clearing the lot in the first place; how much was it really worth for the manager to have a leaf-free lot? Would she pay $1,000? Obviously not. Was it worth $50, if that’s what the broom method cost? Maybe, maybe not. So from the manager’s standpoint, there was only value to be gained from the work if the cost for the work was minimal. The leaf blower, powered by a small amount of affordable gasoline, provided the means for this value proposition to be satisfactory to all parties. Except me, of course, and perhaps a few of my sleep-deprived neighbors. What’s there to even think about?

Well, just last week Jon Stewart on “The Daily Show” highlighted in brilliant, comedic detail the decades-old institutional thinking that has maintained and assured the value proposition for small gasoline engines (and big ones). The day after President Obama’s Oval Office declaration to the nation that “now is the moment” for this generation of Americans to “seize control of our own destiny” regarding foreign oil dependence, Stewart’s team pieced together video footage of each of the seven presidents preceding Obama pronouncing similar pablum about petroleum. Beginning with Nixon in 1974, who promised to “break the back of the energy crisis, to meet America’s needs from America’s resources,” each decried our dependence on foreign oil and extolled our can-do ability to break that dependence. For Nixon in 1974, the target date was 1980; for Ford in 1975, 1985; for Carter in 1979, 2000; for Bush in 2006, merely a 75% reduction by 2025. (Apparently, good things don’t always come to those who wait.) Each president touted new technologies, new ideas, new know-how, and so on. And yet, if we consider the U.S. demand for foreign oil as a cancerous pathology rather than an addiction, it would seem the cancer has now metastasized beyond any foreseeable cure that would not, in turn, kill the patient.

I argue that this particular cancer is the result of self-serving, yet collaborative, institutional thinking by corporate entities that are too big to be displaced. Oil companies, natural gas companies, the drilling industries, the refineries, the automobile manufacturers, the parts suppliers, and all the businesses, schools, and government tax bases that depend on petroleum cannot allow thinking that obviates their products and services. Without attributing to them any motive more nefarious than self-interested profitability and a desire to maintain the petroleum-based status quo, these corporate interests, together with their governmental accomplices, have facilitated the fallacious thinking that we can just “keep on keeping on” with respect to our “American way of life.”

The success of this institutionalized message/thought control is that, for many (most?) Americans, the post-BP-spill national priority ought simply to be a return to “normal” as soon as possible. The thought that this type of thinking could be debated, or even considered, seems … unthinkable.

It’s ironic to me that in 1974, when Nixon first proposed energy independence as a national objective, there was no such thing as a “personal computer.” The following year, a small company in Albuquerque built a kit known as the Altair 8800 that was featured on the cover of Popular Electronics magazine. Bill Gates, then a student at Harvard, and his friend Paul Allen bought the magazine and rushed to Albuquerque to convince the Altair’s designer and owner, Ed Roberts, that they could write the software for the Altair. They eventually left to form their own company, Microsoft. In 1976, Steve Wozniak and Steve Jobs formed Apple Computer in California. Thus was born the personal computer industry.

So in the past 36 years, one industry (personal computing) was created and has grown beyond any realistic expectation, while another (petroleum) has grown steadily despite repeated presidential pronouncements that our dependence cannot be sustained. While our thinking about computers has changed dramatically, our thinking about petroleum, with all its economic, ecological, and environmental consequences, has not. It would seem that we Americans can adapt our thinking to assimilate new and innovative discontinuities (like personal computers), but we have a hard time accepting the inevitable discontinuities involved in displacing or obviating long-established institutions that have served us satisfactorily in the past, and serve us still.

Therefore initiatives like Myles Horton’s Highlander School are needed now more than ever, with ever-increasing stakes, but for the same purpose: not to promote change that can be believed in, but change in thinking that inexorably necessitates action.