For more than eight decades, the work of E.K. Strong, Jr., has served as a standard-bearer in the field of vocational and educational interest measurement. This evaluation summarizes the history of the instrument; its current form, content, and administration; and the evidence for its reliability and validity. It should be noted that the author has twice used earlier versions of the Strong Interest Inventory to inform his own career reassessment.
E.K. Strong, Jr., served in World War I as a psychologist who helped design the new vocational instruments necessitated by America’s entry into the war. For the first time, the U.S. government faced the urgent need to mobilize a large fighting force rapidly. The emerging field of psychology was called upon to assist the war effort by developing tests to guide military decision-making in quickly assessing each recruit’s fitness for combat service. These early tests provided a general indication of which recruits should serve as “cooks and which should be members of the cavalry” (Donnay, Morris, Schaubhut, & Thompson, 2004, p. 1).
After the war, Strong continued to develop his thinking on the factors related to vocational interests. Observing the rapid post-war industrialization boom and America’s accelerated transition from a dispersed agricultural economy to a diversified manufacturing economy, Strong grounded his rigorous scientific research methods (Campbell & Hansen, 1981) in two assumptions. First, different occupations tend to attract people with certain personalities, psychological attitudes, and interests. Second, people who share the attitudes and interests of those happily engaged in a particular occupation will likely find fulfillment in that occupation themselves. In 1927, he published his first instrument, the Strong Vocational Interest Blank (SVIB), which purported to assess both vocational and educational interests.
Beginning in the 1950s, other psychological researchers continued to build on, refine, modify, and update Strong’s work. These included David P. Campbell, who extended Strong’s inventory and analysis such that for almost two decades the assessment was known as the Strong-Campbell Interest Inventory (Campbell & Hansen, 1981). Jo-Ida C. Hansen worked closely with Campbell during this period and continued to contribute research and analysis through the 1990s. John L. Holland’s theories and taxonomies on occupational themes, based on Strong’s work, were incorporated into the assessment in 1985 in the form of the General Occupational Themes (GOT) and the Basic Interest Scales (BIS) (Donnay et al., 2004). A significant milestone in the evolution of the instrument came in 1974 when, responding to heightened sensitivity to gender equality and non-discrimination, the contributors merged what had been separate forms for men and women into a single form.
Since its initial publication in 1927, the inventory bearing Strong’s name has been regularly and methodically revised. The current form of the Strong Interest Inventory (SII) was revised in 2004 and documented in the manual edited by Donnay, Morris, Schaubhut, and Thompson (2004), on which the following summary is based.
The SII is published by CPP, Inc., based in Mountain View, California, which claims that more than 70% of all U.S. colleges and universities use it to assist students in identifying their educational and vocational interests (CPP, Inc., 2009). Different forms of the instrument are tailored for high school students, college students, and professionals, as well as for individuals in different countries across the globe. There are also versions of the inventory that are sold and administered in conjunction with the Strong Skills Confidence Inventory and the Myers-Briggs Type Indicator. The per-item cost of the SII begins at $7.85 for the college form in quantities of more than 500, with higher costs for other forms and for smaller quantities. The manual and a variety of evaluation reports and guides are also available from the publisher. Online as well as paper versions of the different forms are available for both the 1994 and 2004 editions of the SII.
The SII instrument consists of 291 item statements, organized in six sections that reflect different activities, occupations, traits, and other characteristics, which respondents rate on a 5-point Likert scale, from Strongly Dislike (or Strongly Unlike Me) to Strongly Like (or Strongly Like Me). The instrument is not time-limited, but is typically completed in 30 to 45 minutes. The instrument itself, its instructions, and the individual report are written at an eighth- to ninth-grade reading level.
Each respondent’s responses to the item statements are evaluated against three norm-referenced sample categories that have been continually updated throughout the evolution of the SII. The General Representative Sample (GRS) consists of 1,250 men and 1,250 women who closely represent the latest demographics published by the U.S. Census. The second category of samples includes those reflected in the 244 Occupational Scales, or job codes, published in the appendix. The third category of samples consists of college students who indicated their academic major at the time of completing the inventory, providing results reflecting 75 different majors.
The results of the SII are provided to the individual in four different formats. At the most general level, the respondent receives a score for each of the six General Occupational Themes (GOT) defined by Holland and known by the acronym RIASEC — Realistic, Investigative, Artistic, Social, Enterprising, and Conventional. These scales are based on an approximate mean of 50 with a standard deviation of 10. Holland’s theory purports that these six themes differentiate occupational environments that attract and fulfill different personalities and interests. Each of the six themes can be characterized by four to six categories of professions, totaling 30, which are known as the Basic Interest Scales (BIS). Like the GOT, each of the 30 BIS categories is scored with an approximate mean of 50 and standard deviation of 10. At the most detailed level, reflecting Strong’s initial research, the respondent’s interests are scored against 244 Occupational Scales (OS), or specific job codes. Although presented on a similar numerical scale as the GOT and BIS, the scores must be interpreted differently, as the norm group for each of the 244 scales consists solely of people engaged in that particular occupation. The fourth format assesses the respondent’s Personal Style Scales (PSS) according to discriminators related to work style, learning environment, leadership style, and preference for risk taking.
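The mean-50, standard-deviation-10 metric described above is the conventional T-score transformation. As an illustration of that scaling only, the following sketch converts a raw scale score to a T-score; the function name and the norm values in the example are hypothetical, not actual SII norms.

```python
# Standard-score (T-score) scaling: mean 50, standard deviation 10.
# The norm mean/SD values used in the example are illustrative only.

def to_t_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against a norm group, then rescale."""
    z = (raw - norm_mean) / norm_sd   # z-score relative to the norm group
    return 50 + 10 * z                # T-score: mean 50, SD 10

# A raw score one SD above a (hypothetical) norm mean of 24, SD 6:
print(to_t_score(30, 24, 6))  # -> 60.0
# A raw score exactly at the norm mean:
print(to_t_score(24, 24, 6))  # -> 50.0
```

A respondent scoring 60 on a theme is thus one standard deviation above the norm-group mean, regardless of the underlying raw-score range.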
Holland’s theory maintains that the 244 occupations listed in the OS can be mapped to both the 30 BIS categories and the six GOT themes by using either the dominant GOT theme or a combination of the two or three highest-scored themes. These codes, such as “ASC” for someone scoring highest on the Artistic theme followed by the Social and Conventional themes, are known as “Holland Codes.” He further theorized that each of the six GOT themes could be correlated with the other five according to an expected degree of overlap, with the two adjacent themes correlating more highly and the opposing theme correlating much less. This theory yields a hexagonal diagram labeled with the GOT acronyms at each vertex such that, for example, one can discern that a high score on the Artistic theme overlaps most highly with scores on the Investigative and Social themes, while the Conventional theme could be considered the opposite of the Artistic, with a lower degree of overlapping correlation.
The manual for the SII has evolved with the instrument itself. The contents of the manual are organized as the instrument is organized, with chapters devoted to the item statements, GOT, BIS, OS, and PSS. Each of these substantive chapters is similarly structured, including sections that address issues related to interpretation, construction, norming, reliability, and validity. The manual also contains a short introduction, a chapter on the “Administrative Indexes” with useful information for interpreting unusual responses and scores, a chapter to guide the interpreter or counselor on “Strategies and Challenges in Interpreting the Strong,” and the appendix containing data for each of the 244 occupational scales (Donnay et al., 2004).
The current manual provides a straightforward and understandable presentation for the reader who is interested not only in the mechanics of administering the inventory and interpreting its results, but also in the theory and research on which the assessment is based. Each of the five data-driven chapters contains both narrative descriptions and tabulated statistical data regarding the composition of samples, means and standard deviations of scores, correlation coefficients, test-retest results, norm groups and norming adjustments, and more.
With respect to reliability, four primary methods are documented in the manual in both descriptive and tabulated data format. Internal consistency, calculated with Cronbach’s coefficient alpha, is reported for the item statements, themes, and scales. Test-retest methodology was used to gather additional reliability data using two time intervals between tests: “short term,” defined as 2 to 7 months, and “long term,” 8 to 23 months. Because the 2004 version of the test differed significantly from the previous 1994 version (due to a reduction in the number of item statements, a change from a 3-point to a 5-point Likert scale, and revisions to both the BIS and OS), there was a need to compare the reliability data obtained from the two versions for each of the major sections. Because of the purported correlations of Holland Codes that relate GOTs to BISs to OSs, there was also a need to evaluate the internal consistency between each theme and its subordinate scales. The manual presents a wealth of data documenting this wide variety of reliability calculations; consequently, no single reliability figure can represent the entire inventory. The editors do, however, offer some summarized analysis, such as this for the six GOTs in the 2004 version: Cronbach’s alpha for five of the six themes improved from 1994, with all six measuring at least .90, while test-retest reliabilities remained roughly the same at .87 for all themes (Donnay et al., 2004).
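Cronbach’s coefficient alpha, the internal-consistency statistic reported throughout the manual, is computed from item-level responses. The sketch below illustrates the standard formula; the response matrix is invented for illustration and is not SII data.

```python
# Cronbach's coefficient alpha for a set of Likert-type items:
#   alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total score))
# The response data below are invented for illustration only.

def cronbach_alpha(responses):
    """responses: one list of k item scores per respondent."""
    k = len(responses[0])

    def variance(values):  # sample variance (n - 1 denominator)
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Five hypothetical respondents answering four 5-point Likert items
data = [
    [5, 4, 5, 4],
    [2, 2, 1, 2],
    [4, 4, 4, 5],
    [3, 2, 3, 3],
    [1, 1, 2, 1],
]
print(round(cronbach_alpha(data), 2))  # -> 0.97
```

Because these invented items move together closely across respondents, the total-score variance dwarfs the summed item variances and alpha approaches 1, which is the pattern the manual reports (alpha of at least .90) for the GOT themes.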
With respect to validity, as with reliability, the editors offer an abundance of data that serve as an effective example of what Cronbach and Meehl (1955) proposed as a nomological net for construct validity. Each aspect of a respondent’s assessment (the item statement responses, theme scores, and scale scores) can be correlated to the three comparative samples — the General Representative Sample (GRS, n=2,250), the Occupational Sample (n=3,340), and the Academic Majors Sample (n=879). Within the instrument itself, validity can be assessed through the inter-correlations among the GOT, BIS, and OS. The manual reports, for example, that the correlations between GOT themes support Holland’s theory of adjacent and opposite interests on the RIASEC hexagon in that the three “opposite” interest pairs (RS, IE, AC) have correlation coefficients of .07 to .12, while the six adjacent pairs correlate from .36 to .56. Discriminant validity between the 244 Occupational Scales was determined by correlating results within each GOT and BIS according to gender and race/ethnicity. The manual reports on studies that have attempted to assess predictive validity in terms of the academic majors that high school respondents later selected in college and the occupations that college respondents eventually chose. Such studies, as the manual explains, are fraught with complications related to subject attrition, timing, and the inability to control for other factors and changes in the life circumstances of each respondent.
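The hexagon prediction cited above (adjacent themes correlate more highly than opposite themes) can be checked mechanically against any theme-correlation matrix. The sketch below uses invented correlations chosen to fall within the ranges the manual reports (.36 to .56 for adjacent pairs, .07 to .12 for opposite pairs); they are not actual SII data.

```python
# Check Holland's hexagon ordering: in the circular RIASEC arrangement,
# adjacent themes (1 step apart) should correlate more highly than
# opposite themes (3 steps apart). Correlations below are illustrative
# only, chosen within the ranges reported in the manual.

THEMES = ["R", "I", "A", "S", "E", "C"]

# Hypothetical pairwise correlations (upper triangle of a symmetric matrix)
CORR = {
    ("R", "I"): .47, ("I", "A"): .41, ("A", "S"): .52,
    ("S", "E"): .44, ("E", "C"): .56, ("C", "R"): .36,  # adjacent pairs
    ("R", "S"): .09, ("I", "E"): .07, ("A", "C"): .12,  # opposite pairs
}

def distance(a, b):
    """Steps around the hexagon (1 = adjacent, 3 = opposite)."""
    d = abs(THEMES.index(a) - THEMES.index(b))
    return min(d, 6 - d)

adjacent = [r for pair, r in CORR.items() if distance(*pair) == 1]
opposite = [r for pair, r in CORR.items() if distance(*pair) == 3]

# Every adjacent correlation should exceed every opposite correlation
print(min(adjacent) > max(opposite))  # -> True
```

The circular-distance function is what formalizes the geometry of the hexagon: “opposite” simply means maximal distance (3 steps) around the RIASEC circle.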
There have been an “extensive number of studies” (Donnay et al., 2004, p. 41) relating Strong results to other similar measures as evidence of validity. Following is a brief discussion of five studies that together illustrate several kinds of validity evidence: construct, predictive, concurrent, and consequential.
Drawing on data collected over a 29-year period, Johansson (1970) used 129 male occupational samples to assess the construct validity of the SVIB with respect to an Occupational Introversion-Extraversion (OIE) scale. Johansson ranked the 129 samples according to the mean scores determined by the OIE and generally concluded (through visual inspection rather than statistical quantification) that the degree of Introversion-Extraversion roughly correlated with the SVIB occupational scale. For example, individuals at the extreme of the Introversion scale represented occupations such as physicists, farmers, mathematicians, and artists, while at the extreme of the Extraversion scale Johansson found various types of salesmen, political officers, and social workers.
Despite the problems inherent in gathering predictive evidence, Athelstan and Paul (1971) studied longitudinal data from more than 1,500 medical students to determine if their SVIB results accurately predicted their primary medical specialty area of interest. From the initial results, six different specialty areas were determined to be sufficiently represented to consider as criterion groups: 1) general practice; 2) general surgery; 3) OB/GYN; 4) internal medicine; 5) pediatrics; and 6) psychiatry. The students completed the SVIB at the same time as indicating their specialty interests. After a four-year period for those who graduated and became physicians, the SVIB was administered again. Due to a high number of “false positives,” the researchers determined that while the SVIB should not be used as a predictor of medical specialty, the results suggested that a SVIB constructed specifically for this purpose might be beneficial.
A different element of predictive evidence was studied by Hansen and Swanson (1983). Using a sample of 428 college freshmen who completed the SCII as freshmen and comparing those results with the students’ college majors three years later, they determined that the 1981 revision of the Strong-Campbell Interest Inventory was as predictive of student academic majors as the 1974 version had been. They also concluded that it was slightly more predictive of female interests than of male interests.
Shortly after publication of the 1974 SCII that combined both male and female forms into one common form, Mary C. Whitton (1975) conducted a concurrent validity study to assess same-sex and cross-sex validities. Using the accepted methodology of classifying results as “Good Hit,” “Poor Hit,” or “Complete Miss,” Whitton determined that the percentages of “Good Hits” for same-sex and cross-sex scoring did not differ significantly for the General Occupational Themes and the Basic Interest Scales. However, due to the small sample size and the fact that the sample was drawn from student volunteers, Whitton expressed caution in drawing firm conclusions from the study.
An attempt to assess consequential evidence of validity was made by Swanson, Gore, Leuwerke, D’Achiardi, Edwards, and Edwards (2006). They presumed that one measure of the “consequence” of SII results would be the respondent’s ability to recall his or her scores. They designed a study using freshmen in a semester-long orientation course, which included administration of the SII as part of a four-class lecture and discussion unit on career assessments. The researchers designed a test to indicate the degree to which the students recalled the results of their SII. This test was administered on three occasions: immediately after the students received their SII results, six weeks later, and six months later. The results of the study were inconclusive, other than verifying that longer-term recall of high scores was related to the accuracy of immediate recall, and that the ability to recall scores generally diminished over time.
These five studies could be considered “stretches” in terms of their intentions and methodologies, but they are indicative of the breadth and variety of analyses possible with the SII. Given the thorough documentation in the manual, combined with the insights gleaned from personal experience with the instrument, this reviewer has no hesitation in endorsing the Strong Interest Inventory as an effective tool for career and educational counseling.
Athelstan, G.T., & Paul, G.J. (1971). New approach to the prediction of medical specialization: Student-based Strong Vocational Interest Blank Scales. Journal of Applied Psychology, 55(1), 80-86.
Campbell, D.P., & Hansen, J.C. (1981). Manual for the SVIB-SCII: Strong-Campbell Interest Inventory (3rd ed.). Stanford, CA: Stanford University Press.
CPP, Inc. (2009). Strong Interest Inventory: Help students find satisfying college majors and careers they can be passionate about. Retrieved from https://www.cpp.com/products/strong/index.aspx
Cronbach, L.J., & Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281-302.
Donnay, D.A.C., Morris, M.L., Schaubhut, N.A., & Thompson, R.C. (Eds.) (2004). Strong Interest Inventory Manual. Mountain View, CA: CPP, Inc.
Hansen, J.C., & Swanson, J.L. (1983). Stability of interests and the predictive and concurrent validity of the 1981 Strong-Campbell Interest Inventory for college majors. Journal of Counseling Psychology, 30(2), 194-201.
Johansson, C.B. (1970). Strong Vocational Interest Blank introversion-extraversion and occupational membership. Journal of Counseling Psychology, 17, 451-455.
Swanson, J.L., Gore, P.A., Jr., Leuwerke, W., D’Achiardi, C., Edwards, J.H., & Edwards, J. (2006). Accuracy in recalling interest inventory information at three time intervals. Measurement and Evaluation in Counseling and Development, 38, 236-246.
Whitton, M.C. (1975). Same-sex and cross-sex reliability and concurrent validity of the Strong-Campbell Interest Inventory. Journal of Counseling Psychology, 22(3), 204-209.