Last Updated on December 16, 2022
The purpose of this post is to provide a comprehensive overview of racial and ethnic disparities on cognitive and academic tests in the United States. The primary focus is on black and white Americans because most data focuses on comparing these groups, but I’ll also mention disparities for other groups (mainly Hispanics and Asians) when such data is available. I start by reviewing data on the magnitude of racial disparities in cognitive ability. Next, I consider racial disparities in other kinds of tests, including college admissions and academic achievement tests, finding that these disparities are about as large as disparities in cognitive ability. Then, to better contextualize the magnitude of racial disparities in test scores, I compare racial gaps to gaps between other groups, such as students from different countries or different levels of socioeconomic status. Finally, I present data on the ubiquity of test score gaps, showing that the gaps persist across all levels of education, across all geographical units of analysis, and across all parental socioeconomic levels.
This post is mainly a descriptive exercise. I’m focusing on reviewing the most important uncontroversial observable patterns of racial disparities in test scores. In this post, I don’t concern myself with explanatory questions (e.g., what is the cause of group disparities in cognitive ability?) or with prescriptive questions (e.g., how should we address group disparities in cognitive ability?). Of course, these are important questions, but I avoid them in this post for two reasons: (1) addressing these complex questions responsibly would greatly expand the scope of this post beyond readability, and (2) I believe that, before explaining a phenomenon or prescribing actions in response, one should become acquainted with the basic observational data. If one doesn’t understand the basic patterns of a problem, then one will be unable to interpret these patterns, which means one will be ill-prepared to diagnose the causes of the problem or to propose solutions to it. I believe one should be aware of the information in this post before attempting to explain, or propose responses to, racial disparities in test scores.
The magnitude of the gaps
In this section, I’ll provide data on the size of racial and ethnic gaps on cognitive ability tests. The first subsection will cite various reviews that summarize basic findings from the literature on racial disparities in these tests. The second subsection will directly cite various studies and meta-analyses for more precise measures of the magnitude of such racial disparities. Next, I present data on the racial disparities over time. Finally, I describe some of the statistical implications of such large racial differences in distributions of cognitive ability test scores.
Research on race and IQ has fairly consistently shown that the average IQ score for blacks in the US is about one standard deviation (about 15 points) lower than the average IQ score for whites. In an authoritative report by the APA surveying the field of intelligence, Neisser et al. 1996 [archived] (page 93) reported the following racial differences in tests of cognitive ability:
The relatively low mean of the distribution of African American intelligence test scores has been discussed for many years. Although studies using different tests and samples yield a range of results, the Black mean is typically about one standard deviation (about 15 points) below that of Whites (Jensen, 1980; Loehlin et al., 1975; Reynolds et al., 1987). The difference is largest on those tests (verbal or nonverbal) that best represent the general intelligence factor g (Jensen, 1985).
Note that this report was written following the release of the controversial book, The Bell Curve. In response, the Board of Scientific Affairs of the APA established a task force of 11 experts on intelligence to prepare the report. Disputes between experts on the task force were all resolved through discussion, resulting in the report having the unanimous support of the entire task force. The report also found that Hispanics tend to score between blacks and whites, with particularly low scores on verbal subtests:
In the United States, the mean intelligence test scores of Hispanics typically lie between those of Blacks and Whites. There are also differences in the patterning of scores across different abilities and subtests (Hennessy & Merrifield, 1978; Lesser, Fifer, & Clark, 1965)…Latino children typically score higher on the performance than on the verbal subtests of the English-based Wechsler Intelligence Scale for Children–Revised (WISC-R; Kaufman, 1994).
Similar results were reported in a very brief 3-page statement outlining conclusions regarded as mainstream among researchers on intelligence (Gottfredson 1997 [archived]). The statement was signed by 52 experts in intelligence and allied fields to promote more reasoned discussion of research in the field. The statement reported the following racial and ethnic disparities in cognitive ability (page 14):
Members of all racial-ethnic groups can be found at every IQ level. The bell curves of different groups overlap considerably, but groups often differ in where their members tend to cluster along the IQ line. The bell curves for some groups (Jews and East Asians) are centered somewhat higher than for whites in general. Other groups (blacks and Hispanics) are centered somewhat lower than non-Hispanic whites.
The bell curve for whites is centered roughly around IQ 100; the bell curve for American blacks roughly around 85; and those for different subgroups of Hispanics roughly midway between those for whites and blacks. The evidence is less definitive for exactly where above IQ 100 the bell curves for Jews and Asians are centered.
A review of group differences in intelligence by Loehlin (2000) also reported similar racial and ethnic differences in cognitive ability. Consistent with prior reviews, this review reports that “An average difference on the order of one standard deviation between U.S. individuals of predominantly European ancestry (Whites) and predominantly African ancestry (Blacks) has been evident ever since the advent of mass intelligence testing” (page 179). The review also notes that Asians tend to perform relatively better on visuo-spatial tests than on verbal tests, although there is some dispute as to whether Asians have higher overall scores than whites (page 180).
There is some dispute as to whether Asian Americans obtain higher average scores on IQ tests than European Americans or score at about the same level…There is, however, a difference between IQ subtests primarily measuring visuo-spatial skills and those primarily measuring verbal skills. Asian Americans (and Asians in Asia) tend to do relatively better on visuo-spatial tests than on verbal tests. Such differences have, for example, been obtained between Americans of Japanese ancestry and of European ancestry in Hawaii, where they have been found to be stable across two generations despite major changes in the degree of acculturation to American ways (Nagoshi & Johnson, 1987).
Similar to Hispanics and Asians, Native Americans tended to perform better on visuo-spatial tests than on verbal tests. However, their overall scores were lower than those of Asians. Thus, while Native Americans tended to perform similarly to European Americans on visuo-spatial tests, they tended to perform worse on verbal tests.
On the whole, Native Americans tend to perform comparably to European Americans on nonverbal tests – particularly those with a visuo-spatial emphasis…Typically, Native American groups obtain lower verbal IQs. In many of the earlier studies, this was confounded with the fact that the tests were given in English, and English was a second language for the group concerned. However, Lynn restricted his tabulation to groups for which English was the first language, and it still showed verbal IQs averaging some 20 points below visuo-spatial ones.
Similar differences are reported in books reviewing more recent data on racial differences in cognitive ability. For example, in a chapter on group differences in intelligence in the book IQ and Human Intelligence, Mackintosh (2011) reported the following on racial and ethnic differences in cognitive ability (page 332):
There can be no serious doubt that African-Americans obtain on average test scores substantially below the white mean…This difference showed up in the early US Army data, was repeatedly confirmed in subsequent studies between the wars (Shuey, 1966), and has been maintained after the Second World War (Loehlin et al., 1975). The first American restandardizations of the Wechsler tests (WAIS-R and WISC-R) revealed that there was still a difference of some 15 points (Kaufman and Doppelt, 1976; Reynolds et al. 1987).
In the past 25 years, a new group difference has threatened to displace that between blacks and whites as the main focus of American attention and concern. Several commentators have argued that just as blacks lag behind whites, so do whites lag behind one other ethnic or racial group, often referred to as East Asians, more specifically meaning mostly Chinese, and to a much lesser extent (since fewer data are available) Koreans and Vietnamese.
In Human Intelligence, Hunt (2011) also confirmed the prior results on racial differences in cognitive ability (page 411):
In order to make a comparison between the scores of different groups we need to have data from a representative sample of the national population. Table 11.4 presents the results from several such surveys involving battery-type tests. There is some variety in the results, but not a great deal. The African American means are about 1 standard deviation unit (15 points on the IQ scale) below the White means, and the Hispanic means fall in between.
A similar gap was observed in a meta-analysis by Roth et al. (2001) [archived]. As far as I know, this is the most recent meta-analysis reporting the magnitude of racial gaps in cognitive ability. The meta-analysis considered studies on racial disparities in general cognitive ability (g) among adults across educational and employment settings. The black-white gap in g hovered at about 1 standard deviation across different settings. The average gap was 1.1 standard deviations across over 6 million individuals pulled from 105 samples. Table 1 shows the magnitude of the gap in standard deviations (see columns for d) across different testing settings.
There were also large gaps in cognitive ability between Hispanics and whites, although the magnitude of the gaps was typically smaller than that of the gaps between blacks and whites. The Hispanic-white gap hovered at around 0.7 to 0.8 standard deviations across over 5 million individuals from 39 samples.
The latest iteration of the Wechsler Adult Intelligence Scale, WAIS-IV, also shows similar racial disparities in cognitive ability. The following table shows the average IQ by race on the standardization sample of the WAIS-IV as reported in Weiss et al. (2010). The standardization sample used 2,200 people in the United States in 2008 between the ages of 16 and 90 years. Scores are normed to have a mean of 100 points and standard deviation of 15 points.
- For reference, FSIQ = Full-scale IQ which is based on the total combined performance of the VCI, PRI, WMI, and PSI. VCI = Verbal Comprehension Index, PRI = Perceptual Reasoning Index, WMI = Working Memory Index, and PSI = Processing Speed Index.
The FSIQ gaps in standard deviation units are as follows:
- The black-white gap is 103.21 – 88.67 = 14.54 points, or about 14.54/15 = 0.97 standard deviations.
- The Hispanic-white gap is 103.21 – 91.63 = 11.58 points, or about 11.58/15 = 0.77 standard deviations.
- The Asian-white gap is 106.07 – 103.21 = 2.86 points, or about 2.86/15 = 0.19 standard deviations.
These gaps are in line with gaps reported in the meta-analysis by Roth et al.
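The conversion used in the bullets above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the means are the WAIS-IV FSIQ values quoted above, the divisor is the scale's norming standard deviation of 15 (not a within-group or pooled standard deviation), and the function name `standardized_gap` is my own.

```python
# Sketch: express a difference in mean scores in standard-deviation units.
# Means are the WAIS-IV FSIQ values quoted in the text. NORMING_SD is the
# scale's norming SD (15), used here as a simplifying assumption.

NORMING_SD = 15.0

fsiq_means = {
    "white": 103.21,
    "black": 88.67,
    "hispanic": 91.63,
    "asian": 106.07,
}


def standardized_gap(mean_a: float, mean_b: float, sd: float = NORMING_SD) -> float:
    """Return (mean_a - mean_b) / sd, i.e. the gap in SD units."""
    return (mean_a - mean_b) / sd


for label, a, b in [
    ("black-white", "white", "black"),
    ("Hispanic-white", "white", "hispanic"),
    ("Asian-white", "asian", "white"),
]:
    gap = standardized_gap(fsiq_means[a], fsiq_means[b])
    print(f"{label} gap: {gap:.2f} SD")
```

Dividing by the norming standard deviation keeps the arithmetic simple, but it is not the same as dividing by a pooled within-group standard deviation, so the results should be read as approximate.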
There are also similar gaps reported in the Wechsler Intelligence Scale for Children, although the gaps are typically smaller in magnitude. The following table shows the average IQ by race in the latest iteration of this test, the WISC-V, as reported by Weiss et al. (2015). The standardization sample was based on 2,200 children in the United States in 2014 between the ages of 6 and 16 years. Like the WAIS-IV, scores are normed to have a mean of 100 points and standard deviation of 15 points.
- FRI = Fluid Reasoning Index.
The FSIQ gaps in standard deviation units are as follows:
- The black-white gap is 103.5 – 91.9 = 11.6 points, or about 11.6/15 = 0.77 standard deviations.
- The Hispanic-white gap is 103.5 – 94.4 = 9.1 points, or about 9.1/15 = 0.61 standard deviations.
- The Asian-white gap is 108.6 – 103.5 = 5.1 points, or about 5.1/15 = 0.34 standard deviations.
Gaps across time
When considering IQ tests across time, one finds moderate closing of the black-white gap in the 70s and 80s, but mostly stagnation since then. Consider the following black-white disparities (in standard deviation units) on standardized tests between the 1970s and the early 2000s as reported in Sackett and Shen (2010) [archived]:
For more context, one should integrate these findings with the previously cited findings on the WAIS-IV and WISC-V. Recall that the WISC-V (2014) showed a black-white gap of about 0.77 standard deviations, which is about the same as the gap of 0.78 standard deviations on the WISC-IV (2002). Also, the WAIS-IV (2008) revealed a gap of about 0.97 standard deviations, which is about the same as previous WAIS tests.
Murray (2006) [archived] also examined black-white gaps in test scores on the standardization samples of different iterations of the Woodcock-Johnson test of cognitive abilities. The following shows the average scores for blacks and whites across three iterations of the WJ tests.
As you can see, there was a large reduction in the gap between WJ1 and WJ2. However, there was no improvement between WJ2 and WJ3. Murray also used this data to calculate the black-white gap by birth year. This analysis more clearly shows substantial reductions until about the 70s.
As Murray states, the gap on WJ tests fell from about 1.33 standard deviations for the earliest birth cohorts to 0.98 standard deviations for the most recent ones.
The B–W difference among persons born from 1920 to 1939 was 1.33σ. The difference dropped to 1.08σ for those born from 1940 to 1955. When the line begins in 1958, the difference was extremely large, reaching a high of 1.45σ in 1959. The difference dropped steeply throughout the 1960s, reaching its low in 1972, at 0.83σ. For those born most recently, 1987–1991, the difference was 0.98σ
He concludes that gap reductions dissipated in the 1970s:
This analysis has used data from the Woodcock–Johnson standardizations to explore a hypothesis for explaining for the disparate findings in the literature on the B–W difference over time: Narrowing in the B–W difference on highly g-loaded measures did occur during the 20th century, but the difference stopped narrowing for persons born in the 1970s and thereafter.
Cognitive ability scores, as measured by IQ tests, are approximately normally distributed for all racial groups. The standard deviation of IQ is similar for all racial and ethnic groups, usually 13 to 15 points as shown in the WAIS data above. Thus, the shape of the distribution (or the “bell curve”) of IQ scores is fairly similar for all racial and ethnic groups. However, the large differences in mean IQ imply that the distribution of IQ scores for certain racial or ethnic groups is shifted above or below that of others. This means that, if group A has a higher mean IQ than group B, then a larger share of individuals from group A than from group B will exceed any given IQ threshold. However, both groups will have some number of individuals exceeding any given IQ threshold (unless one makes the threshold so high that only rare outliers can exceed that amount). Thus, even if group A has a higher mean IQ than group B, for any individual in group A, there will almost always be some individual from group B with a higher IQ.
Think of racial differences in IQ the same way we think of sex differences in height. Men are taller than women on average, and there are far more men than women who are relatively tall (say, over 6 feet), yet it is common to find individual women taller than individual men. Likewise, whites score higher than blacks on IQ tests on average, and a far larger share of whites than blacks score relatively high (say, over 115 IQ), yet it is common to find individual black people with higher scores than individual white people. One additional point worth noting is that the black-white gap in IQ is not as large as the male-female gap in height. In the United States, the mean height for men is about 5 inches greater than the mean height for women, which is equivalent to about 2 standard deviations. By contrast, the black-white gap in cognitive ability tends to hover at around 1 standard deviation. Thus, whereas only about 2 percent of women are taller than the average height for men, about 16 percent of blacks have higher IQs than the average IQ for whites.
For a visual illustration of racial differences in IQ distributions, consider the following distributions of IQ scores of blacks and whites in the United States (this graph was pulled from page 279 of The Bell Curve):
If we scaled the distributions to reflect population sizes, it would appear as follows:
One might think that a gap of 1 standard deviation in IQ is not much. After all, it’s half as great as the difference in height between men and women. However, if we assume that blacks and whites have a mean IQ of 85 and 100, respectively, with a standard deviation of 15 in both groups, then this would have the following profound statistical implications:
- Only about 16% of blacks have an IQ above 100, compared to 50% of whites.
- Only about 2% of blacks have IQs above 115, compared to 16% of whites.
- About 63% of blacks have an IQ below 90, compared to about 25% of whites.
- About 37% of blacks have an IQ below 80, compared to less than 10% of whites (these individuals are below the cutoff point for acceptance into the US armed forces).
- About 16% of blacks have an IQ below 70, compared to about 2% of whites (this constitutes intellectual disability according to the DSM-5 [archived]).
(These percentages can be calculated using a z-table, which gives the percentage of a normal distribution that lies above or below a threshold expressed in standard deviations from the mean.)
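Instead of a z-table, the same tail percentages can be computed with Python's standard library. The sketch below assumes exactly what the list above assumes: two normal distributions with means of 85 and 100 and a common standard deviation of 15. The helper names `pct_above` and `pct_below` are mine, and the output describes this idealized model, not any particular dataset.

```python
from statistics import NormalDist

# Modeling assumptions taken from the text: normal distributions with
# means 85 and 100 and a shared standard deviation of 15.
SD = 15.0
dist_mean_85 = NormalDist(mu=85.0, sigma=SD)
dist_mean_100 = NormalDist(mu=100.0, sigma=SD)


def pct_above(dist: NormalDist, threshold: float) -> float:
    """Percentage of the distribution lying above the threshold."""
    return 100.0 * (1.0 - dist.cdf(threshold))


def pct_below(dist: NormalDist, threshold: float) -> float:
    """Percentage of the distribution lying below the threshold."""
    return 100.0 * dist.cdf(threshold)


for threshold in (100, 115):
    print(
        f"above {threshold}: "
        f"{pct_above(dist_mean_85, threshold):.1f}% (mean 85) vs "
        f"{pct_above(dist_mean_100, threshold):.1f}% (mean 100)"
    )

for threshold in (90, 80, 70):
    print(
        f"below {threshold}: "
        f"{pct_below(dist_mean_85, threshold):.1f}% (mean 85) vs "
        f"{pct_below(dist_mean_100, threshold):.1f}% (mean 100)"
    )
```

Because real score distributions are only approximately normal, figures far out in the tails (for example, below 70) are the least reliable part of this calculation.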
I have written elsewhere on the robust predictive validity of cognitive ability on important social outcomes. Cognitive ability is a strong predictor (oftentimes the best predictor) of outcomes such as academic achievement, job performance, educational attainment, job status, income, wealth, crime, and health. The predictive power of cognitive ability has been succinctly illustrated in the following chart reported by Gottfredson (1998) [archived] (page 28):
The above chart shows the probability of various social outcomes at different IQ levels. Note that the data here is only reported for young white adults (to avoid conflating the effects of IQ with the effects of race). As you can see, negative social outcomes are far more common for individuals with lower IQ scores. For example, young white adults with IQs between the 5th and 25th percentile are about 6 times as likely to drop out of high school as young white adults with IQs between the 25th and 75th percentile (35% vs 6%).
Gottfredson (2004) [archived] goes into some detail on the impact of racial differences in cognitive ability by considering the life outcomes associated with a variety of different IQ thresholds:
- An IQ of 75 “signals the ability level below which individuals are not likely to master the elementary school curriculum or function independently in adulthood in modern societies” (page 28). They are likely to be eligible for “financial support provided to mentally and physically disabled adults” by the U.S. government. Such individuals are “difficult to train except for the simplest tasks, so they are fortunate in industrialized nations to get any paying job at all. While only 1 out of 50 Asian-Americans faces such risk, Figure 3 shows that 1 out of 6 black Americans does.”
- An IQ of 85 is another threshold considered because “the U.S. military sets its minimum enlistment standards at about this level” (page 28). The military is often viewed as a last resort by many people, but “this minimum standard rules out almost half of blacks (44%) and a third of Hispanics (34%), but far fewer whites (13%) and Asians (8%).” Individuals with IQs in this range “live at the edge of unemployability in modern nations, and the jobs they do get are typically the least prestigious and lowest paying: for example, janitor, food service worker, hospital orderly, or parts assembler in a factory.”
- An IQ of 105 can be viewed as “the minimum threshold for achieving moderately high levels of success” (page 30). People above this level are “highly competitive for middle-level jobs (clerical, crafts and repair, sales, police and firefighting), and they are good contenders for the lower tiers of managerial and professional work (supervisory, technical, accounting, nursing, teaching).” The percentages of people achieving this threshold of IQ are “53%, 40%, 27%, and 8%, respectively, for Asians, whites, Hispanics, and blacks.”
Thus, the large racial differences in cognitive ability are likely to explain much of the racial inequality in many important social outcomes. In fact, I’ve written elsewhere demonstrating that controlling for racial differences in cognitive ability mostly or entirely eliminates racial disparities in income, income mobility, educational attainment, job status, and incarceration.
Racial differences in test scores are not limited to formal tests of cognitive ability. Racial gaps are also found on virtually all standardized tests of academic achievement. For example, Sackett and Shen (2010) [archived] also report large gaps on college admissions tests (e.g., SAT, ACT) and school achievement tests. The following table shows average black-white disparities in test scores across a variety of different tests.
In line with the earlier data, the black-white gap seems to hover at around 1 standard deviation. Also in line with earlier data, the Hispanic-white gap (not shown here because the table is too long) hovers at around 0.7 standard deviations.
Similar gaps were reported in Roth et al. (2001) [archived]. These authors show large black-white disparities on a variety of college admissions tests, including graduate admissions (e.g. GRE) tests.
Again, Hispanics also lagged behind whites, although to a lesser extent than did blacks.
While we find racial differences across a wide variety of tests, it’s important to note that racial gaps (particularly black-white gaps) on test scores tend to be greater on more g-loaded tests. This finding has been reported in a meta-analysis by Te Nijenhuis and Van den Hoek (2015) [archived].
The largest assessment of academic achievement from representative samples comes from the National Assessment of Educational Progress (NAEP). The NAEP provides data on results of academic assessments that have been conducted regularly since the early 1970s. Data is published each year for free on the NCES website. Data published in 2021 is available here. This table shows the average reading scale scores by race/ethnicity and age from 1971 to 2020. I converted the data into a line chart in Excel to better illustrate the changes over time. Here are the results:
As you can see, racial disparities in test scores have been persistent for decades. There is some sign of narrowing of disparities among 9-year-olds, but the disparities for 13- and 17-year-olds have been mostly stagnant since the late 1980s. Another thing to note is that white 13-year-olds score at about the same level as black and Hispanic 17-year-olds. In fact, the white 13-year-olds have consistently outscored black 17-year-olds since the 1990s.
There are similar findings when examining average mathematics scores. This table shows the average mathematics scale scores by race and age from 1973 to 2020. I converted the data into a line chart in Excel to better illustrate the changes over time. Here are the results:
Again, there are similar patterns here as with the reading data. The gaps appear to have been mostly stagnant since the late 1980s. Also, white 13-year-olds have scored at the level of black and Hispanic 17-year-olds fairly consistently since the 1970s (aside from about the mid 80s to the mid 90s). Perhaps more shockingly, white 9-year-olds have begun to match black 13-year-olds in recent years.
One can also view academic achievement gaps by race over time at this link.
There are also large racial gaps in standardized college admissions tests. For example, blacks scored 5.3 points (22.2 – 16.9) lower than whites on the ACT in 2018 (NCES 2018 [archived]). The standard deviation of scores was 5.8 points, implying that black students scored about 0.91 (5.3/5.8) standard deviations lower than white students.
Similar disparities were found on SAT test scores. In 2020, black students scored 177 points (1104 – 927) lower than whites on the SAT (page 3, CollegeBoard 2021 [archived]). The standard deviation of scores was 211 points (page 4), implying that black students scored about 0.84 (177/211) standard deviations lower than white students. These gaps are so large that only 1% of black test-takers scored in the 1400-1600 range (the highest range), compared to 7% of whites and 24% of Asians (page 5). Furthermore, 69% of black test-takers scored below 1000 (the lowest ranges), compared to only 28% of whites and 16% of Asians.
Note: the gaps for the ACT and SAT reported above are slight under-estimates of the gap because I used the total sample standard deviation instead of the pooled standard deviation. Calculating the pooled standard deviation would require data on standard deviation by race, which I wasn’t able to find.
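To illustrate why dividing by the total-sample standard deviation understates the gap, here is a minimal sketch comparing the two denominators. All of the numbers in it are placeholders chosen for easy arithmetic (they are not ACT or SAT figures), the function names are my own, and the total-sample calculation assumes, for simplicity, that the sample contains only the two groups being compared.

```python
import math


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled within-group SD: sqrt(((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2))."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def total_sd(m1: float, s1: float, n1: int, m2: float, s2: float, n2: int) -> float:
    """SD of the combined sample: within-group variance plus the variance
    contributed by the difference between the two group means."""
    p1 = n1 / (n1 + n2)
    p2 = 1.0 - p1
    within = p1 * s1**2 + p2 * s2**2
    between = p1 * p2 * (m1 - m2) ** 2
    return math.sqrt(within + between)


# Placeholder inputs (hypothetical, not taken from any test report).
m1, s1, n1 = 10.0, 2.0, 500
m2, s2, n2 = 8.0, 2.0, 500

d_pooled = (m1 - m2) / pooled_sd(s1, n1, s2, n2)
d_total = (m1 - m2) / total_sd(m1, s1, n1, m2, s2, n2)

print(f"gap using pooled SD: {d_pooled:.3f}")
print(f"gap using total SD:  {d_total:.3f}")
```

The total-sample standard deviation includes the spread created by the difference in group means, so it is larger than the pooled standard deviation whenever the means differ, and the resulting standardized gap is correspondingly smaller.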
The gaps in ACT and SAT scores are to be expected given racial disparities in cognitive ability, because these tests are good proxies for cognitive ability. In fact, some researchers have concluded that “the ACT is an acceptable measure of general intelligence” (page 158, Koenig et al. 2008 [archived]) and that “the SAT is an adequate measure of general intelligence” (page 377, Frey and Detterman 2004 [archived]).
When looking at racial gaps in standardized test scores across time, there has not been much narrowing of the gap, at least between the late 1980s and early 2000s. Sackett and Shen (2010) [archived] show that the black-white and Hispanic-white SAT gaps in 1987 are virtually identical to the gaps in 2006. In fact, the Hispanic-white math gap grew from 0.58 to 0.71 standard deviations.
For more recent SAT data, see the NCES webpage here. The page shows average SAT critical reading and mathematics scores by race from 1987 to 2015. There is data after 2015 here, but the data should not be used because of a redesign of the SAT in 2016. I have converted both the reading and mathematics scores into line charts in Excel to better illustrate changes across time. Here are the average reading scores by year and race/ethnicity:
As you can see, the scores for all groups have been stagnant across this time period except for Asian American students, who saw large gains, almost catching up to whites. There were also moderate increases for Native American test takers.
Here are the average mathematics scores by year and race/ethnicity:
Again, the scores for all groups have been stagnant except for Asian Americans, who increased their advantage over other groups even further.
Achievement at the extremes
Racial disparities in test scores are so pronounced that blacks (and Hispanics to a lesser degree) are virtually absent at the highest level of achievement. For example, a 2005 article [archived] published by The Journal of Blacks in Higher Education reported that there are “almost no blacks” among the top scorers on the SAT:
If we eliminate Asians and other minorities from the statistics and compare just white and black students, we find that 5.8 percent of all white SAT test takers scored 700 or above on the verbal portion of the test. But only 0.79 percent of all black SAT test takers scored at this level. Therefore, whites were more than seven times as likely as blacks to score 700 or above on the verbal SAT. Overall, there are more than 39 times as many whites as blacks who scored at least 700 on the verbal SAT.
On the math SAT, only 0.7 percent of all black test takers scored at least 700 compared to 6.3 percent of all white test takers. Thus, whites were nine times as likely as blacks to score 700 or above on the math SAT. Overall, there were 45 times as many whites as blacks who scored 700 or above on the math SAT.
If we raise the top-scoring threshold to students scoring 750 or above on both the math and verbal SAT — a level equal to the mean score of students entering the nation’s most selective colleges such as Harvard, Princeton, and CalTech — we find that in the entire country 244 blacks scored 750 or above on the math SAT and 363 black students scored 750 or above on the verbal portion of the test. Nationwide, 33,841 students scored at least 750 on the math test and 30,479 scored at least 750 on the verbal SAT. Therefore, black students made up 0.7 percent of the test takers who scored 750 or above on the math test and 1.2 percent of all test takers who scored 750 or above on the verbal section.
In an article published by the Brookings Institution, Reeves and Halikias (2017) [archived] showed that Asians are several times more likely to score at the highest levels on the SAT math section than whites, who are several times more likely to do so than blacks and Hispanics.
Perhaps more surprisingly, the SAT math scores show that most (72%) of the lowest scoring test takers (300 to 350 points) are black or Hispanic, and most (60%) of the highest scoring test takers (750 to 800 points) are Asian. In fact, there are more blacks scoring at the lowest level than white and Asian students combined, despite the fact that there are 4-5 times as many white/Asian test takers as black test takers.
Similar results are found when examining NAEP achievement at the highest levels for mathematics and reading. To get the percentage of students at the highest level, I will use the percentage of 12th-graders scoring at “advanced”, as this is the highest level recognized by the NAEP. Scoring at this level reflects the ability to perform the following tasks:
- For mathematics [archived], “advanced” indicates the ability to use mathematical knowledge to “solve nonroutine and challenging problems, provide mathematical justifications for their solutions, and make generalizations and provide mathematical justifications for those generalizations”.
- For reading [archived], “advanced” indicates an ability to “describe more abstract themes and ideas in the overall text” and “analyze both the meaning and the form of the text and explicitly support their analyses with specific examples from the text”.
The Data Explorer at the official NAEP website allows us to create custom data tables, which provide information on the percentage of students of each racial group scoring at various achievement levels. In 2019, these were the percentages of 12th-grade students scoring at the advanced level in reading and mathematics:
- Reading: About 9% of whites scored at advanced. This was about 9 times the percentage of blacks (1%) and 3 times the percentage of Hispanics (3%) who achieved the same level. Multi-racial (10%) and Asian (13%) students were slightly more likely than whites to score at the advanced level.
- Mathematics: About 4% of whites scored at advanced. This was about 4 times the percentage of Hispanics who achieved the same (1%). Asians were actually about 3 to 4 times as likely as whites to score at advanced (14%). Multi-racial students were about equally as likely as whites to achieve this performance level (4%). The percentage of blacks who scored at advanced was so low that it was rounded to zero, so all we know is that less than 0.5% of blacks achieved this level. Since the percentage of whites who scored at advanced is at least 3.5%, this implies that whites were at least 7 times as likely as blacks to score at this level (>3.5% vs <0.5%).
Note: some data on the percentage of high-scoring students is presented on the predefined data tables at the NCES website (see here for reading and here for mathematics). But this data does not contain scores for Asians specifically (they are lumped in with “other”) and the reading data does not contain the percentage of students scoring at the highest level (350 points or higher).
In summary, nationally representative data shows massive racial differences in the percentage of students scoring at the highest level of academic achievement. Whites were about as likely as multi-racial students to reach the advanced level in both mathematics and reading. Asians were somewhat (about 44%) more likely than whites to reach this level in reading, but were 3-4 times more likely to reach this level in mathematics. Whites were about 3-4 times as likely as Hispanics and 7-9 times as likely as blacks to reach the highest levels in mathematics and reading.
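The ratios in this summary follow from simple division of the reported percentages. The sketch below is only an illustration of that arithmetic: the function names are my own, and the bound assumes that each published percentage was rounded to the nearest whole number (so a reported 0% means a true value below 0.5%).

```python
# Reported 2019 NAEP grade-12 "advanced" percentages, as quoted in the text above.
# Assumption: each value was rounded to the nearest whole percent.
advanced_pct = {
    "reading": {"white": 9, "black": 1, "hispanic": 3, "multiracial": 10, "asian": 13},
    "mathematics": {"white": 4, "black": 0, "hispanic": 1, "multiracial": 4, "asian": 14},
}


def ratio(numerator_pct, denominator_pct):
    """Point ratio of two rounded percentages, or None if the denominator rounds to zero."""
    if denominator_pct == 0:
        return None
    return numerator_pct / denominator_pct


def lower_bound_ratio(numerator_pct, denominator_pct, half_width=0.5):
    """Smallest ratio consistent with both percentages having been rounded."""
    return (numerator_pct - half_width) / (denominator_pct + half_width)


for subject, pct in advanced_pct.items():
    for group in ("black", "hispanic", "asian"):
        hi, lo = (pct["white"], pct[group]) if group != "asian" else (pct["asian"], pct["white"])
        label = f"white/{group}" if group != "asian" else "asian/white"
        point = ratio(hi, lo)
        if point is None:
            # Only a bound is available when the reported percentage is 0.
            print(subject, label, "at least", lower_bound_ratio(hi, lo))
        else:
            print(subject, label, round(point, 2))
```

The bound is deliberately conservative: it pairs the smallest value the numerator could have had with the largest value the denominator could have had.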
Racial disparities in test scores persist even to the point of admission into graduate school. For example, Dalessandro et al. (2012) [archived] reported data on LSAT performance during the 2005-2006 through 2011-2012 testing years. The data showed racial gaps in LSAT test scores that were similar in magnitude to racial gaps in other tests. During the 2011-2012 testing year, the average score for black test-takers was 10.96 (152.8 – 141.84, Table 4B) points lower than the average score for white test-takers. The standard deviation for test scores during this period was 10.19 points (Table 1), implying that black test-takers scored about 1.08 (10.96/10.19) standard deviations below white test-takers (Note: this is a slight underestimate of the gap because I used the total sample standard deviation instead of the pooled standard deviation. I’ll update with the pooled standard deviation later). The following graph shows the LSAT score distribution by race/ethnicity (Figure 14):
Bleske-Rechek and Browne (2014) [archived] reported similar racial disparities among GRE test-takers. In 2007, black test-takers scored 98 points lower than white test-takers on the verbal portion and 143 points lower on the quantitative reasoning portion (Table 3). This corresponds to gaps of about one standard deviation on both tests (the black SD for the verbal and quantitative subscores was 95 and 139 points, respectively; the white SD was 106 and 135 points, respectively). The study concluded by noting that “In Verbal reasoning, White and Asian examinees currently score a full standard deviation above Black examinees” and “in Quantitative reasoning, Asian examinees score about a one-third standard deviation higher than White examinees and well over one standard deviation higher than Black examinees” (page 31). These gaps have been present for decades (Figure 3):
Similar racial disparities were also observed on MCAT scores in a 2020 report [archived] published by the Association of American Medical Colleges (AAMC). In one of the studies in this report, the authors examined various group disparities in MCAT scores among test takers in 2017. Figure 1 shows the distribution of scores by demographic group (page 22):
In line with other tests, one finds black and Hispanic test takers at the bottom, and white and Asian test takers at the top.
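For readers who want to check the standardized gaps quoted above for the LSAT and GRE, here is a minimal sketch of the arithmetic. It uses only the means and standard deviations cited in the text. The helper names are hypothetical, and because group sizes are not given here, the GRE calculation pools the two group SDs with equal weights, which is an assumption rather than the method used by the cited authors.

```python
import math


def standardized_gap(mean_difference, sd):
    """Mean difference expressed in standard deviation units."""
    return mean_difference / sd


def pooled_sd_equal_weight(sd_a, sd_b):
    """Root mean square of two group SDs (assumes equal weights; sizes not given)."""
    return math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)


# LSAT, 2011-2012: white and black means (Table 4B) and total-sample SD (Table 1).
lsat_gap = standardized_gap(152.8 - 141.84, 10.19)

# GRE, 2007: raw point gaps and the black/white SDs quoted from Table 3.
gre_verbal_gap = standardized_gap(98, pooled_sd_equal_weight(95, 106))
gre_quant_gap = standardized_gap(143, pooled_sd_equal_weight(139, 135))

for name, gap in [("LSAT", lsat_gap), ("GRE verbal", gre_verbal_gap), ("GRE quant", gre_quant_gap)]:
    print(f"{name}: {gap:.2f} SD")
```

Dividing by the total-sample SD (as in the LSAT line) gives a slightly smaller figure than dividing by a within-group pooled SD would, since the total SD also includes the between-group spread.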
As one might expect, racial disparities in cognitive ability manifest as racial disparities in basic literacy skills that are highly valuable in an advanced market-based economy. Data by the National Center for Education Statistics [archived] reported large disparities in all assessed forms of literacy as of 2003:
A subject’s scores on these literacy tests are used to classify the subject as “below basic”, “basic”, “intermediate”, or “proficient” (Table 1). Given the racial disparities in average literacy scores, blacks were far more likely to be classified as “below basic” and much less likely to be classified as “intermediate” or “proficient”. For example, the percentage of blacks who were “below basic” in quantitative literacy (47%) was nearly 4 times greater than the same percentage for whites (13%). The percentage of whites who were “intermediate” or “proficient” in quantitative literacy (56%) was over 3 times greater than the same percentage for blacks (17%). Furthermore, on each measure of literacy (prose, document, and quantitative literacy), the percentage of whites who were “proficient” (15-17%) was about 8 times greater than the same percentage for blacks (1-2%):
Cognitive ability is one of the best predictors of job performance. In fact, Gottfredson (1997) [archived] has forcefully asserted that “g can be said to be the most powerful single predictor of overall job performance” and that “no other single predictor measured to date (specific aptitude, personality, education, experience) seems to have such consistently high predictive validities for job performance” (page 83). I have also written on the predictive validity of cognitive ability for job performance elsewhere. Given this, and given racial differences in cognitive test scores, one should also expect large racial differences in job performance. This is exactly what we find.
A meta-analysis by Roth et al. (2003) [archived] measured racial differences in job performance among dozens of samples involving tens of thousands of subjects. The main results of the meta-analysis were summarized in table 2:
The results indicate that whites scored 0.27 standard deviations higher than blacks on overall ratings of job performance. Racial gaps were largest on job knowledge tests (d = 0.48) and work sample tests (d = 0.52). Further analysis also reveals larger job knowledge gaps with objective measures (d = 0.55) than with subjective measures (d = 0.15) (Table 3).
Similar results were reported in later meta-analyses on racial differences in work sample assessments (Roth et al. 2008), job trainability (Roth et al. 2011 [archived]), and larger meta-analyses of job performance (McKay and McDaniel 2006 [archived]). Thus, there are large racial differences in achievement regardless of whether achievement is measured in terms of intellectual/scholastic performance or job performance.
Comparisons with other gaps
To help put the magnitude of racial disparities in test scores into perspective, it may be useful to compare racial disparities to other disparities. I’ll start by considering disparities in mean test scores across different socioeconomic levels. Then I note how racial and socioeconomic disparities in achievement test scores have changed over time for students. Finally, I report how racial disparities in scores on international tests compare to national disparities on the same tests.
When comparing test gaps by race to gaps by educational level, we see that blacks tend to score closer to high school dropouts, whereas Asians score closer to college graduates. For example, Weiss et al. (2010) report WAIS-IV IQ scores for adults by race (Table 4.3) and by educational level (Table 4.1). The mean FSIQ scores by race and by educational level are as follows:
Parental socioeconomic gaps
Similar patterns are found when examining test scores for children broken out by parental educational level. That is, black children tend to score at about the same level as children of high school dropouts, whereas Asians tend to score at about the same level as children of college graduates. For example, Weiss et al. (2015) reported WISC-V IQ scores for children (between the ages of 6 and 16 years) by race (Table 5.3) and by parental educational level (Table 5.2). The mean FSIQ scores by race and by parental educational level are as follows:
Similar patterns are observed when examining NAEP mathematics scores. The NCES published a variety of pages showing 2019 mathematics scores by race, by parental education and eligibility for free/reduced lunch, and by percentage of a student’s school eligible for free/reduced lunch. The average 8th grade mathematics scores for each group are as follows:
Notes on the chart:
- The red bars on the right show the average mathematics scores for 8th graders based on the student’s eligibility for free/reduced lunch.
- The green bars on the right show the average mathematics scores for 8th graders based on the percentage of the student’s school that is eligible for free/reduced lunch.
As you can see, black students score lower than the most disadvantaged groups from all of the other socioeconomic measures. Black students score lower than the children of high school dropouts, children eligible for free/reduced lunch, and children in schools where >75% of students are eligible for free/reduced lunch. By contrast, Asians outscore the most advantaged group from all of the other measures. Asian students outscore children of college graduates, students not eligible for free/reduced lunch, and students in schools where <25% of the students are eligible for free/reduced lunch.
Racial vs socioeconomic achievement gaps across time
An article by Hanushek et al. (2019) [archived] examined NAEP achievement gaps by socioeconomic status and race throughout the second half of the 20th century. They show that academic achievement gaps by socioeconomic status have narrowed only slightly across birth cohorts since 1954. The gap between students in the top and bottom deciles narrowed from about 1.2 to about 1.05 standard deviations between the 1954 and 2001 birth cohorts. The gap between students in the top and bottom quartiles also narrowed slightly, from about 0.9 to about 0.8 standard deviations:
As can be seen in Figure 1, the disparities in achievement between students from the highest and lowest socioeconomic status groups are strikingly persistent throughout the time period. The socioeconomic achievement divide hardly wavers over this half century. In the 1954 birth cohort, the achievement gap between the average of those in the top and bottom deciles of the socioeconomic distribution stood at slightly less than 1.2 standard deviations. For those born in 2001, the gap is only slightly less—about 1.05 standard deviations. That is, the most-disadvantaged students have made the same gains in achievement over the decades as those realized by the most-advantaged students.
The disparity between students in the top and bottom quartiles of the socioeconomic distribution was about 0.9 standard deviations for the 1954 birth cohort. This 75–25 gap falls slightly during the next two decades, settling at barely below 0.8 for the cohort born in 2001.
By contrast, the black-white achievement gap narrowed substantially, from about 1.3 standard deviations to about 0.8 in the 1980s. Since then, the gap has remained stable at about 0.8 standard deviations through 2001. This is comparable to the gap between students at the top and bottom quartiles, and is greater than the gap between students eligible and ineligible for free lunch.
Figure 2 also shows the white-black achievement gap. While this is not accurately thought of as a socioeconomic gap because of the improvements in black incomes, it represents another potential dimension of continuing societal disparities. As Figure 2 shows, there is a sizable shrinking of the racial gap in the early period but little change across the last two decades.
This data has also been analyzed in more detail in Hanushek et al. (2020) [archived].
Another useful analysis is to compare test scores by race to test scores by country. The most comprehensive standardized tests of scholastic performance come from PISA. PISA regularly measures the proficiency of 15-year-old students in mathematics, reading, and science in dozens of countries. The most recent assessment was performed in 2018, with the results available here. I’ll focus on reports of mathematics literacy (see page 20).
When looking at the country as a whole, the U.S. mathematics scores are fairly mediocre. The average mathematics score for the United States was 478, considerably lower than the average for OECD countries at 489 (see Figure M3). However, when U.S. scores are broken out by race, U.S. Asians and whites perform relatively well internationally whereas U.S. blacks and Hispanics perform relatively poorly. The following chart shows average mathematics scores by country (Figure M3) and by U.S. racial group (page 33).
As you can see, U.S. Asian and white students score fairly well relative to other countries. U.S. Asian students score at about the level of Japanese and Korean students, whereas U.S. white students score at about the level of Swedish and Finnish students. By contrast, blacks and Hispanics score rather poorly. Hispanics score similarly to students from Greece and Ukraine, whereas blacks score at about the level of students from Thailand or Chile.
There are also similar patterns observed when analyzing PIAAC literacy scores by race and country.
The ubiquity of test score gaps
Racial gaps in test scores are ubiquitous. They persist across all places in the country, across age, across educational levels, and across socioeconomic levels.
Racial disparities in test scores of varying magnitudes are found in every state in the country. For example, racial disparities in 8th grade math scores by state were nicely summarized in the following graph in Vanneman et al. (2009) [archived]:
In this 2007 sample, the greatest mean score among black students was 272 points (DoDEA, Colorado, and Oregon), which was just 1 point greater than the lowest mean score among white students (West Virginia, average white score of 271), and 7 points lower than the second lowest mean score among white students (Mississippi, average white score of 279).
The NCES has also released more recent statistics on the average NAEP mathematics scale scores of 8th-grade public school students by race and state in 2019. To illustrate racial disparities by state, I created a histogram of state-level scores by race for black, white, Hispanic, and Asian students. I excluded the scores for DoDEA and Washington D.C. to focus on the 50 states. I also excluded scores for Pacific Islanders and American Indians due to a low number of data points.
As you can see, there is no overlap between black and white scores in the histogram. The highest average black score was 268 (Virginia), whereas the lowest average score for white students was 273 (West Virginia). Also, there was no state where the average score for black students came close to the common range of white scores, which seems to be between 284 and 296 points. Furthermore, only Asian Americans have state-level averages above 308 points. In fact, there are only a few states where Asian Americans have scores lower than 304, which is about the highest state-level average for any other group.
Similarly large gaps are observed when analyzing test scores from more granular units of data, such as counties and districts. The Educational Opportunity Project at Stanford University has produced a very useful tool showing black-white achievement gaps at different levels of analysis (e.g. states, counties, districts) for elementary/middle school students on NAEP assessments across the country.
This chart shows the average gap in test scores between black and white students in over 3,000 counties across the country.
As you can see, there are almost no dots above the green line, implying that there are almost no counties where black students outscore white students. The few counties where blacks outscore whites seem to be very small counties where test scores are rather mediocre.
This chart shows the average black-white gap in test scores in over 13,000 districts across the country.
The data for districts is similar to the data for counties. There are almost no dots above the green line, implying that there are almost no districts where black students outscore white students. The few districts where blacks outscore whites seem to be smaller districts where test scores are rather mediocre.
Similar findings were reported in an analysis of racial test score gaps by geography by Reardon and Kalogrides (2019) [archived], who analyzed racial achievement gaps in hundreds of metropolitan areas and thousands of school districts in the United States. They found that there are almost no school districts where achievement gaps are near zero. The only exceptions tend to be districts where all students are low performing or there are very few minority students. The authors emphasize the following key finding from their research (page 1204):
[O]f the several thousand school districts we analyze, which enroll over 90% of all black and Hispanic students in the United States, there are but a handful in which the achievement gap is near zero. With the notable exceptions of the Detroit, Michigan, and Clayton County, Georgia, school districts, these tend to be districts that enroll few minority students and in which achievement gaps are very imprecisely estimated even in our large data set. And while Detroit and Clayton County do have achievement gaps near zero, this does not appear, at least in the case of Detroit, to be a desirable form of equity: Census data show that both white and black families in Detroit are very poor, on average; and given how low average test scores are in Detroit, the absence of an achievement gap implies that both black and white students are equally low scoring. In other words, there is no school district in the United States that serves a moderately large number of black or Hispanic students in which achievement is even moderately high and achievement gaps are near zero.
Test score gaps are also found across all ages. In fact, studies show that racial gaps emerge before children reach formal schooling. For example, Gottfredson (1997) [archived] reports that “Racial-ethnic differences in IQ bell curves are essentially the same when youngsters leave high school as when they enter first grade” (page 15). This was a claim published in a very brief 3-page statement that outlines conclusions regarded as mainstream among over 50 experts in intelligence and allied fields.
Farkas and Beron (2004) [archived] investigated oral vocabulary scores for black and white children by age using data from the Children of the NLSY79 (CNLSY). Vocabulary is measured using the Peabody Picture Vocabulary Test (PPVT). The PPVT is described as follows (page 473):
The Peabody Picture Vocabulary Test (PPVT) was used to measure oral vocabulary. The PPVT of spoken vocabulary consists of 175 words of generally increasing difficulty. The tester reads the word to the child, and the child points to one of four pictures that best describes its meaning. Testing stops and the child’s score (“ceiling”) is established when he or she incorrectly identifies six of eight consecutive items. The child’s score is the number of words identified correctly. (We analyze this variable in its raw score form, since we are controlling the child’s age in months.)
The researchers analyze data on vocabulary of children between the years of 1986 and 2000. Regarding racial disparities, the study found large racial disparities at the earliest observation which persisted throughout the entire study period (page 477):
Beginning with the earliest observation at 36 months of age, Whites average significantly higher scores than African-Americans. This pattern is consistent over the full age span, with the White lead remaining significant through 13 years of age. To see how very substantial this vocabulary gap is, note that Whites cross the 40-word level at approximately 50 months of age, whereas African-Americans do not reach this level until approximately 63 months, which puts them 13 months, or more than one year, behind in vocabulary development.
In the introductory chapter of The Black-White Test Score Gap, Jencks and Phillips (1998) [archived] also report on the black-white gap using the same dataset (page 1):
African Americans currently score lower than European Americans on vocabulary, reading, and mathematics tests, as well as on tests that claim to measure scholastic aptitude and intelligence. This gap appears before children enter kindergarten (figure 1-1), and it persists into adulthood. It has narrowed since 1970, but the typical American black still scores below 75 percent of American whites on most standardized tests. On some tests the typical American black scores below more than 85 percent of whites.
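The percentile statements in the passage above can be translated into standard deviation units, which makes them easier to compare with the gaps reported elsewhere in this post. The sketch below is a rough illustration, not a calculation from the source: it assumes that white scores are normally distributed, that "typical" means the median, and that the gap is expressed in white standard deviations. The function names are my own.

```python
from statistics import NormalDist

# Standard normal curve (mean 0, SD 1). Normality of scores is an assumption here.
STANDARD_NORMAL = NormalDist()


def gap_in_sd(share_of_reference_group_scoring_higher):
    """SD gap implied when the median member of one group scores below the
    given share of a normally distributed reference group."""
    return STANDARD_NORMAL.inv_cdf(share_of_reference_group_scoring_higher)


def share_scoring_higher(gap_sd):
    """Inverse: share of the reference group above a score gap_sd SDs below its mean."""
    return STANDARD_NORMAL.cdf(gap_sd)


for share in (0.75, 0.85):
    print(f"below {share:.0%} of the reference group -> {gap_in_sd(share):.2f} SD")

print(f"a 1.0 SD gap -> below {share_scoring_higher(1.0):.0%} of the reference group")
```

Under these assumptions, scoring below 75 percent of the reference group corresponds to a gap of roughly two-thirds of a standard deviation, and scoring below 85 percent corresponds to a gap of roughly one standard deviation.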
More recent studies also show similar gaps that emerge before children reach formal schooling:
- Cottrell, Newman, & Roisman (2015) [archived] examined cognitive ability of children of 1,364 families who participated in the National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD). General cognitive ability/knowledge (g) was measured using the math, vocabulary, and reading ability facets of the Woodcock-Johnson Psycho-Educational Battery-Revised (WJ-R). Researchers find that “Black-White gaps in cognitive test scores are large and pervasive, and are already established by 54 months of age” (page 11). They further report that “between 54 months and 15 years of age, this gap did not significantly increase over time” (page 11). The black-white gap in g ranged from around 1.2 to 1.4 standard deviations during this time period (page 11).
- Quinn (2015) [archived] examined cognitive disparities using data from the Early Childhood Longitudinal Study – Kindergarten Class of 2010-2011. This is a new nationally representative dataset of over 10,000 children who entered kindergarten in 2010. The cognitive and behavioral outcomes of the children were assessed in regularly scheduled follow ups as they progressed through school. Consistent with prior data, there were significant black-white gaps in reading (.32 SDs) and mathematics (.54 SDs) (page 128). Black children also scored .52 SDs worse than whites in working memory (page 128) during the fall of kindergarten. The author notes that this may actually “underestimate the WM gap” because valid scores were not available for young, low-scoring students (page 128). For comparison, Hispanic students had working memory scores similar to those of blacks, and Asians had working memory scores about .19 SDs higher than whites.
Across levels of education
This point has already been illustrated, but it is worth emphasizing that test score gaps persist across all levels of education as well. For example, the meta-analysis by Roth et al. (2001) [archived] shows large racial gaps among high school students, college applicants, college students, and even graduate school applicants. The gaps are all at or above 1 standard deviation, except for the gap for college students, which is only about 0.7 standard deviations.
Gaps of similar magnitude are found regardless of whether one analyzes elementary school samples or graduate application samples. A study of racial differences in MCAT scores by Dwight et al. (2013) [archived] included the following table summarizing racial gaps across different samples:
We even find large racial disparities in NAEP mathematics scale scores regardless of the highest level of mathematics completed. See the following data on mathematics scores among 12th grade students by race and highest mathematics course completed:
As the arrows indicate, white students who have completed lower levels of mathematics routinely outscore blacks who have completed higher levels. In fact, whites who have only completed Algebra I or below score at about the same level as blacks who have completed Algebra II (119 vs 121).
Across socioeconomic levels
There are also large racial disparities in test scores across socioeconomic status. For example, in Facing Reality, Murray presented data on average IQ for black, white, and Hispanic subjects across 3 nationally representative datasets. The total sample involved 20,000 subjects from the 1972 National Longitudinal Study and the 1979 and 1997 cohorts of the National Longitudinal Survey of Youth. IQ scores were estimated from g-loaded tests taken by subjects when they were in their teens.
As you can see, black-white disparities were about 1 SD or larger in virtually all occupations. The Hispanic-white gap hovered at around 0.7 SDs.
There are similarly large IQ gaps across levels of education and occupational status. For example, Kaufman and Lichtenberger (2005) reported data showing that blacks scored substantially lower than whites on the WAIS-R at every level of education. Surprisingly, the IQ gap was greater at higher levels of education, with the largest FSIQ gap being 15.8 points for subjects with 13+ years of education. In fact, whites with only 9-11 years of education had FSIQs that were several points higher than blacks with 13+ years of education.
In the WAIS-IV, Weiss et al. (2010) performed an analysis to see how much of the FSIQ gap among adults could be eliminated after controlling for socioeconomic status (page 124). Socioeconomic status was measured as educational level, occupational status, median income of subject’s zip code, region, and gender. They found that the black-white IQ gap reduced from 14.88 points to 11.23 points. The Hispanic-white IQ gap reduced from 11.95 points to 6.56 points.
Thus, using this index of socioeconomic status reduces racial disparities in FSIQ by about one-quarter (black-white gap) to a little under one-half (Hispanic-white gap). These findings were also reported in Weiss and Saklofske (2020) (Table 7).
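The share of each gap removed by the socioeconomic controls follows directly from the unadjusted and adjusted WAIS-IV figures quoted above. A minimal sketch of that arithmetic (the function name is hypothetical, and the inputs are just the four numbers cited from Weiss et al. 2010):

```python
def share_of_gap_removed(unadjusted_gap, adjusted_gap):
    """Fraction of the original gap that disappears after adjustment."""
    return (unadjusted_gap - adjusted_gap) / unadjusted_gap


# FSIQ point gaps before and after controlling for the SES index, as quoted above.
wais_iv_gaps = {
    "black-white": (14.88, 11.23),
    "Hispanic-white": (11.95, 6.56),
}

shares = {
    comparison: share_of_gap_removed(before, after)
    for comparison, (before, after) in wais_iv_gaps.items()
}

for comparison, share in shares.items():
    print(f"{comparison}: {share:.1%} of the gap removed")
```

Note that this is a purely descriptive ratio: it says how much the point gap shrinks under this particular set of controls, not what causes the remaining gap.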
Across parental socioeconomic levels (individual measures)
The evidence is strong that cognitive ability gaps persist even after adjusting for SES. In this subsection, I’ll focus on reports of gaps after adjusting for individual measures of SES (e.g., parental income OR parental education).
From the introductory chapter of The Black-White Test Score Gap, Jencks and Phillips (1998) [archived] report that “a two-year reduction of the black-white education gap among mothers would only reduce the IQ gap by about a point for children” (page 22). They further state that “eliminating black-white income differences would reduce the IQ gap by about a point” (page 23).
The findings from Jencks and Phillips regarding parental education are corroborated by recent data [archived] on NAEP test scores (for the source: if using the archived webpage, click “Focus: racial/ethnic groups” on the left sidebar). In 2013, the following racial gaps in test scores were observed among 12th-grade students:
Some important points are worth noting from this figure:
- Controlling for parental education does not succeed in explaining the test score gap between blacks and whites. In fact, the average NAEP mathematics score of black students with parents who graduated from college was equal to the average score of white students with parents who did not finish high school (both groups score 138 points). The exact same pattern is found for reading scores (both groups score 274 points).
- The gap in academic achievement actually increases among children with more educated parents. Among children with parents who did not finish high school, the black-white mathematics test score gap is 16 points. Among children with parents who finished high school, the gap is 24 points. Among children with parents who graduated from college, the gap is 32 points. The same pattern is found for the reading test score gaps. A 1999 report [archived] by the College Board had similar findings (Table 2).
The findings from Jencks and Phillips regarding parental income are corroborated by SAT scores disaggregated by race and income. Black-white differences in income do not explain a significant portion of black-white differences in SAT scores. Consider the following data [archived] on the racial gap in SAT scores in 2005:
Whites from families with incomes of less than $10,000 had a mean SAT score of 993. This is 129 points higher than the national mean for all blacks.
Whites from families with incomes below $10,000 had a mean SAT test score that was 61 points higher than blacks whose families had incomes of between $80,000 and $100,000.
Blacks from families with incomes of more than $100,000 had a mean SAT score that was 85 points below the mean score for whites from all income levels, 139 points below the mean score of whites from families at the same income level, and 10 points below the average score of white students from families whose income was less than $10,000.
The same pattern in SAT scores is shown by Dixon-Roman et al. (2013) [archived] (Table 2) and a 2008 report [archived] (page 11) by The Journal of Blacks in Higher Education. For example, Dixon-Roman et al. (2013) show that black-white SAT gaps – particularly math gaps – are remarkably stable across all ranges of family income (Table 2):
Note that the test score gap is about the same at every level of parental income. Among families with incomes between $10,000 and $15,000, the black-white SAT math gap was 83 points. Among families with incomes between $40,000 and $50,000, the gap was 79 points. Among families with incomes exceeding $100,000, the gap was 78 points.
The 2008 report [archived] (page 11) has the same findings for 2008 SAT scores:
More recent SAT data by race/ethnicity has been reported here [archived]. Unfortunately, I’ve been unable to find the original data for this source. However, the data is mostly in line with previously cited findings. I cite this data because it’s the most recent data on SAT scores by race that I’ve found and it also includes more racial groups than just blacks and whites.
As you can see, we find the typical racial patterns at every level of parental income. At each level of parental income, Asians and whites are at the top with blacks and Hispanics at the bottom. I added the horizontal red lines to compare the scores of the poorest whites and Asians to the scores of other race/income groups. As you can see, the poorest whites and Asians outperform all but the richest (>$160k) blacks.
Across parental socioeconomic levels (composite measures)
The above findings showed racial differences in test scores after controlling for individual measures of SES (e.g., parental income OR parental education). One also finds that racial disparities persist after controlling for composite measures of SES (i.e., controlling for multiple variables simultaneously). For example, Herrnstein and Murray (1994) showed that the gap in AFQT scores persists after controlling for parental SES, where “parental SES” is measured based on “information about the education, occupations, and income of the parents of NLSY youths” (page 131). In fact, the magnitude of the gap was greatest at the higher SES levels (page 288):
Persistent gaps were also found in the standardization sample of the most recent installment of the WISC test, the WISC-V. Weiss et al. (2015) analyzed how much of the racial gap in FSIQ scores on the WISC-V could be eliminated after controlling for parental education and parental income. They found that controlling for these variables reduced the racial disparity for children from about 12.8 points to about 6.7 points, eliminating about half of the gap.
Recall the study by Reardon and Kalogrides (2019) [archived] cited earlier, which analyzed racial achievement gaps in hundreds of metropolitan areas and thousands of school districts in the United States. The authors tested the degree to which district-level racial achievement gaps in test scores could be explained by district-level racial disparities in socioeconomic status. District-level socioeconomic status was measured by median family income, the proportion of families with a parent with a bachelor’s degree, poverty rates, unemployment rates, SNAP recipient rates, and single-mother-headed household rates (page 1190). Researchers found that while “there is a moderate association between socioeconomic disparities and achievement gaps” (page 1197), there were still large achievement gaps “even in the relatively few districts where white and minority students have similar socioeconomic backgrounds and levels of economic isolation”. Thus, the authors conclude, “racial/ethnic socioeconomic disparities alone do not account for the large racial achievement gaps, despite being highly predictive of the magnitudes of the gaps” (page 1200). The following figures show the association between district-level achievement gaps and corresponding socioeconomic disparities:
Each point in the figures corresponds to a school district; the size of each point is proportional to the average number of black or Hispanic students in the district. The data shows that, even in school districts with no (or even reversed) SES gaps by race, large achievement gaps by race persist.
One might wonder if adjusting for a broader set of controls will succeed at explaining a larger portion of test score gaps. One common finding in studies that attempt to do this is that such broader controls can explain test score gaps in early childhood, but they fail to explain gaps as children age.
One study reporting this finding was conducted by Fryer and Levitt (2005) [archived]. These authors analyzed 1998 data from the Early Childhood Longitudinal Study Kindergarten Cohort (ECLS-K), a nationally representative sample of over 20,000 children entering kindergarten (page 5). Children were administered standardized tests in mathematics and reading in the fall and spring of kindergarten, spring of first grade, and spring of third grade. The authors included a variety of controls including gender, age, birth weight, a composite indicator of SES (parental education, occupation, and income), books in the home, a proxy for maternal age at birth, and WIC participation (see page 9). The results revealed that “Black children enter school substantially behind their white counterparts in reading and math.” The authors find that their controls can explain gaps in kindergarten, but not the gaps in third-grade:
The estimates in Table 2 suggest that, controlling for other factors, black students score only slightly worse in math than whites upon kindergarten entry, but their trajectories after entry into school are very different. After controlling for our parsimonious specification, blacks score .099 standard deviations below whites in the fall of kindergarten. This deficit increases to .279 standard deviations by the spring of first grade and .382 by the spring of third grade. Thus, the Black-White test score gap grows by almost .30 standard deviations between the fall of kindergarten and spring of third grade. The table also illustrates that the control variables included in the specification shrink the gap a roughly constant amount of approximately .50 standard deviations regardless of the year of testing. In other words, although Blacks systematically differ from Whites on these background characteristics, the impact of these variables on test scores is remarkably stable over time. Whatever factor is causing Blacks to lose ground is operating through a different channel.
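The growth figure in the quoted passage can be checked directly from the three adjusted gaps it reports. The following is a small sketch under stated assumptions: the dictionary and the helper `gap_growth` are hypothetical names introduced only for illustration, and the values are the adjusted math gaps (in standard deviations) as quoted above, not figures re-estimated from the ECLS-K data.

```python
# Adjusted black-white math gaps, in standard deviations, as quoted above
adjusted_gap_sd = {
    "fall kindergarten": 0.099,
    "spring first grade": 0.279,
    "spring third grade": 0.382,
}


def gap_growth(gaps, start, end):
    """Change in the adjusted gap between two testing waves, in SD units."""
    return gaps[end] - gaps[start]


growth = gap_growth(adjusted_gap_sd, "fall kindergarten", "spring third grade")
print(f"Growth in adjusted gap: {growth:.3f} SD")
```

The difference is 0.283 standard deviations, which matches the passage's "almost .30" once the unit is read as standard deviations.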
Similar findings were reported by Yeung and Pfeiffer (2009). These authors used data from the Panel Study of Income Dynamics (PSID) and its two waves of Child Development Supplements (CDS) to examine the degree to which various socioeconomic factors (in combination with other controls) could account for racial differences in test scores:
Based on panel data for three age cohorts of children from the Panel Study of Income Dynamics, we examine how early home environment contributes to black–white achievement gaps at different developmental stages and the extent to which early gaps contribute to later racial achievement gaps. We find large black–white test score differences among children of all ages even before children start formal schooling. Except for the oldest cohort, the gaps for all tests widened when children’s cognitive skills were assessed six years later. Racial achievement gaps in applied problem scores by grade three and letter–word scores by grade six, can be accounted for by child’s characteristics, family socioeconomic background, and mother’s cognitive skills. However, these covariates explain an increasingly smaller proportion of the black–white achievement gap as children advance to higher grades. Gaps in early cognitive skills are highly predictive of gaps at later ages, setting off a trajectory of cumulative disadvantage for black children over time. Our results underscore the key role of early home environment and the intergenerational roots of the persistent black–white achievement gap.
Some studies have successfully accounted for the entire test-score gap for older students, though these studies are rare (I know of only two that have done this). These studies, among others, were considered in a separate post.
This completes the review of data on racial disparities in test scores. As I stated in the introduction, this post is concerned with merely describing uncontroversial observations regarding racial disparities in tests in the United States. I do not address any explanatory or prescriptive questions in this post. However, I do believe the observations reviewed in this post have important implications for any proposed explanation of racial disparities in tests. Any adequate explanation of racial disparities in tests must be able to explain the following observations that were noted above:
- The cause of test score gaps must be present within the first few years of life, since this is when test score gaps begin to emerge.
- The cause of test score gaps must be present across all states, counties, and districts within the country, since performance disparities are found across all such units of analysis.
- The cause of test score gaps must persist across all levels of education, since performance disparities are found across all educational levels. This implies that, whatever is causing test score gaps, it must also have similar effects among black and white applicants to graduate school.
- The entire distribution of black test scores is shifted below the distribution of white test scores. Thus, whatever is causing the disparity between low-scoring blacks and whites is likely also causing the disparity between high-scoring blacks and whites. Thus, the cause of test score differences between blacks and whites cannot be limited to something that only impacts individuals on one end of the distribution (e.g., family poverty is likely to be a poor explanation of the gap since this does not explain why, say, the top 10% of blacks score so much worse than the top 10% of whites). The cause of the gap must have similar effects across the entire range of the distribution.
- The gap between high-SES blacks and whites is about the same as, if not greater than, the gap between low-SES blacks and whites. Thus, similar to the above point, the cause of test score gaps between blacks and whites cannot be limited to something that only impacts rich or poor blacks. The cause of the gap must have similar effects across the entire range of socioeconomic conditions.
- Most of the test score gap persists after controlling for conventional measures of socioeconomic status (income, education, occupational status). Thus, these conventional measures of SES cannot explain the test score gap. Moreover, the cause of the test score gap cannot be something that correlates highly with these conventional measures (otherwise, controlling for conventional measures of SES would implicitly control for the cause of the test score gap, which should eliminate any such gaps).
I also believe that the data here have important implications for racial inequalities in the United States. I have written elsewhere that many social disparities are mostly or entirely eliminated after controlling for youth scores on standardized tests. Thus, the findings here have a few important implications for racial inequalities in the United States:
- Racial inequalities are likely to persist for the foreseeable future, because racial gaps in tests show no signs of closing any time soon.
- Because test score gaps emerge within the first few years of life, and because racial inequalities are mostly downstream of whatever is causing test score gaps, racial inequalities are likely downstream of factors present within the first few years of life.
- One-time transfers of resources are unlikely to have permanent impacts on racial inequality, because cognitive and achievement gaps persist even after controlling for conventional measures of parental SES. These cognitive and achievement gaps are likely to lead to racial inequalities even among individuals born to parents of similar socioeconomic conditions. This explains why there are such large racial disparities in intergenerational mobility, as I’ve outlined here.
5 comments on The scope of racial disparities in test scores in the United States
“The black-white gap is 103.21 – 88.67 = 14.54 points, or about 14.54/15 = 0.97 standard deviations.”
You are mistaken. You have to use the pooled SD to compute the d values, which are the ones reported elsewhere. The value you are computing here is using the total sample SD, which is a function of the composition of this population, and not an invariant metric. The gap as such is 1.06 d.
True, I’ll see to updating that later
This is the most comprehensive review on this specific subject I’ve ever read on the internet. Thanks for putting this together. It should prove to be a valuable resource for many people.
One question…Does comparing regression to the mean in different populations shed any light on IQ differences?