The early emergence of black-white disparities

Last Updated on December 21, 2021

Most people are aware of the significant disparities between blacks and whites regarding a wide range of important social outcomes, including crime, income, education, poverty, single-mother households, etc. I have written extensively on racial disparities in crime and on the degree to which disparities in IQ explain many of the important racial disparities. In this post, I will review studies and data that show that many of these disparities emerge extremely early in life. Specifically, I will investigate racial disparities in IQ, cognitive skills, misconduct, and self-regulation. Black-white disparities along each of these metrics emerge at preschool age or earlier. The black-white disparities gradually grow as children age until the magnitude of the disparities eventually mirrors the gaps found between black and white adults.

Cognitive disparities

In previous posts, I have written extensively on the importance of the racial cognitive ability gap in explaining racial disparities in a variety of important social outcomes, such as income, educational attainment, and incarceration. Because of the crucial role that cognitive ability plays in explaining many black-white disparities, it is important to examine when cognitive disparities first emerge between blacks and whites. Knowing when the gap emerges should help direct us toward the explanatory variables that are responsible for it. In this section, I’ll consider studies that document and investigate early cognitive disparities between black and white children.

Traditional cognitive ability tests

The data seem to indicate that the cognitive ability gap between blacks and whites is in full effect (about one standard deviation) by the time children are 3 to 6 years old. For example, Gottfredson (1997) [archived] reports that “Racial-ethnic differences in IQ bell curves are essentially the same when youngsters leave high school as when they enter first grade” (page 15). This was a claim published in a very brief 3-page statement that outlines conclusions regarded as mainstream among over 50 experts in intelligence and allied fields. Consider the following reviews of past data reporting the same finding:

  • Rushton and Jensen (2005) [archived] argue that the size of the black-white IQ gap is one standard deviation by the time children reach age 6 (page 241). They cite Peoples et al. (1995) which finds a black-white IQ gap of one standard deviation among 3-year-olds matched on age, gender, birth order, and maternal education. They also cite Lynn (1996) which finds a black-white IQ gap of 15.06 points in general cognitive ability in a representative sample of 2.5-6 year olds (table 1). The Lynn study was based on data published for the Differential Ability Scale (DAS) which measures general cognitive ability, verbal ability, non-verbal reasoning, and spatial ability.
  • Gottfredson (2003) [archived] cites a long list of studies measuring mean IQ differences between blacks and whites (Table 1). I will cite the studies that report the gap in standard deviations (SDs) for children aged 6 and younger. Shuey (1966) found IQ gaps of 0.6 SDs (1922-1944) and 1.07 SDs (1945-1966) among children aged 2-6 years. Coleman et al. (1966) found black-white IQ gaps of 0.78 SDs (verbal IQ) and 1.07 SDs (non-verbal IQ) among 1st graders in 1965. The Stanford-Binet IV standardization sample found an IQ gap of 0.86 SDs among children aged 2-6 years in 1986. The DAS standardization sample found IQ gaps of 0.77 SDs (for children aged 2.6-3.5 years) and 1.23 SDs (for children aged 3.6-5.11) among children in 1986. Finally, a study investigating the children of NLSY mothers found IQ gaps of 1.2 SDs (for children aged 3-4) and 1.13 SDs (for children aged 5-6) during 1986-1994.

More recent studies tend to have similar findings:

  • Brooks–Gunn et al. (2003) examined black-white test score gaps among young children from two data sets. One of the samples involved 315 premature, low birth weight 3 and 5 year olds from the Infant Health and Development Program (IHDP). The cognitive ability of the children was measured using the Stanford–Binet Intelligence Scale at age 3 and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) at age 5 (page 242). The standard deviations of scores for the two tests were 16 and 15 points, respectively. The black-white gap was about 20 points on the Stanford-Binet and about 15-18 points on the WPPSI (Table 1). This corresponds to gaps of about one standard deviation (page 244).
  • Cottrell, Newman, & Roisman (2015) [archived] examined cognitive ability of children of 1,364 families who participated in the National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD). General cognitive ability/knowledge (g) was measured using the math, vocabulary, and reading ability facets of the Woodcock-Johnson Psycho-Educational Battery-Revised (WJ-R). The researchers find that “Black-White gaps in cognitive test scores are large and pervasive, and are already established by 54 months of age” (page 11). They further report that “between 54 months and 15 years of age, this gap did not significantly increase over time” (page 11). The black-white gap in g ranged from around 1.2 to 1.4 standard deviations during this time period (page 11).
  • Quinn (2015) [archived] examined cognitive disparities using data from the Early Childhood Longitudinal Study – Kindergarten Class of 2010-2011. This is a new nationally representative dataset of over 10,000 children who entered kindergarten in 2010. The cognitive and behavioral outcomes of the children were assessed in regularly scheduled followups as they progressed through school. Consistent with prior data, there were significant black-white gaps in reading (.32 SDs) and mathematics (.54 SDs) (page 128). Black children also scored .52 SDs worse than whites in working memory (page 128) during the fall of kindergarten. The author notes that this may actually “underestimate the WM gap” because valid scores were not available for young, low-scoring students (page 128). For comparison, Hispanic students had working memory scores similar to those of blacks, and Asians had working memory scores about .19 SDs higher than whites (see Table 3).

Controlling for a variety of variables (parental income, parental education, number of books in the home, age, etc.) reduced the black-white gap in working memory to about .24 SDs in the fall of kindergarten (Table 5).

The data seems to converge on the conclusion that the IQ gap becomes about one standard deviation (about 15 IQ points) sometime before children reach age 3-6 years.

Achievement tests

Many other studies have documented the early cognitive (non-IQ) gaps between blacks and whites. These studies typically measure achievement gaps in reading and mathematics between black and white children. The general finding is that the gaps are large for these children, although they are often smaller than the IQ gap.

Brooks–Gunn et al. (2003) examined black-white test score gaps among young children from two data sets. The first was a sample of 315 premature, low birth weight 3 and 5 year olds from the Infant Health and Development Program (IHDP). The second was a nationally representative sample of 2,220 3 to 4 year olds and 1,354 5 to 6 year olds from the National Longitudinal Study of Youth-Child Supplement (NLSY-CS). Both datasets included measures of verbal receptive ability tests using the Peabody Picture Vocabulary Test–Revised (PPVT–R) at ages 3 and 5. The standard deviation of scores for the PPVT–R was 15 points (page 242). The black-white PPVT–R score gaps at age 3 and 5 were about 17 to 25 points, depending on the age and sample (Table 2). That corresponds to gaps of between 1.1 and 1.7 standard deviations.
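The conversion from score points to standard-deviation units in the paragraph above is a simple division. The following is a minimal sketch that reproduces it from the figures quoted in the text. The function name `gap_in_sd` is hypothetical, and the sketch assumes that the reported test SD of 15 points is the appropriate denominator for both ages and both samples.

```python
def gap_in_sd(gap_points: float, sd_points: float) -> float:
    """Express a raw score gap (in test points) in standard-deviation units."""
    if sd_points <= 0:
        raise ValueError("standard deviation must be positive")
    return gap_points / sd_points


# Figures quoted in the text for the PPVT-R (page 242 and Table 2).
PPVT_R_SD = 15.0          # reported standard deviation, in points
SMALLEST_GAP = 17.0       # low end of the reported gap range, in points
LARGEST_GAP = 25.0        # high end of the reported gap range, in points

low = gap_in_sd(SMALLEST_GAP, PPVT_R_SD)
high = gap_in_sd(LARGEST_GAP, PPVT_R_SD)

print(f"PPVT-R gap: {low:.1f} to {high:.1f} SDs")
```

Rounded to one decimal place, the two endpoints match the range of 1.1 to 1.7 standard deviations given above.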

Farkas and Beron (2004) [archived] investigated oral vocabulary (PPVT) scores for black and white children using data from the NLSY-CS. This sample was also studied by Brooks-Gunn et al. (2003), but the benefit of this study is that they report test scores for the children at regular intervals, from 36 months to 13 years. The authors found large gaps in vocabulary that emerged by 36 months of age, which persisted (did not widen or narrow) until 13 years of age (page 477):

Beginning with the earliest observation at 36 months of age, Whites average significantly higher scores than African-Americans. This pattern is consistent over the full age span, with the White lead remaining significant through 13 years of age. To see how very substantial this vocabulary gap is, note that Whites cross the 40-word level at approximately 50 months of age, whereas African-Americans do not reach this level until approximately 63 months, which puts them 13 months, or more than one year, behind in vocabulary development. Similarly, Whites cross the 60-word level at approximately 67 months of age, whereas African-Americans do so at approximately 81 months, and so on.

The paper ended with the following conclusion (page 491):

Where groups are concerned, a significant gross (that is, unadjusted) Black–White vocabulary gap is observed at the very beginning of our data, at 36 months of age. This finding has not been reported before using such a large sample of nationally representative data. At this early age, the magnitude of this gap is already large, being larger than the amount of vocabulary growth achieved by a typical Black 3–4 year old in approximately one full year. Thus one might say that, at 36 months of age, the oral vocabulary of the typical African-American infant in our data was more than one year behind that of the typical White infant. This strongly directs stratification researchers to the period from birth to 36 months, as being the time when this inequality gap first appears. This large race gap in vocabulary knowledge reaches its peak during the preschool years, and ceases to widen thereafter, suggesting that, at least on this measure, inequality in cognitive performance between African-Americans and Whites is attributable to family differences between the groups, rather than to processes that occur in school. This, is consistent with the analyses of Jencks and Phillips (1998), and Guo (1998), and also with Hart and Risley’s (1995, 1999) emphasis on early class and race differences in home oral language instruction within families.

In the introductory chapter of The Black-White Test Score Gap, Jencks and Phillips (1998) [archived] present the following two score distributions showing the stark differences between blacks and whites (page 1):

Fryer and Levitt (2005) [archived] analyzed 1998 data from the Early Childhood Longitudinal Study Kindergarten Cohort (ECLS-K), a nationally representative sample of over 20,000 children entering kindergarten (page 5). In the longitudinal study, children were administered standardized tests in the fall and spring of kindergarten, spring of first grade, and spring of third grade. The results revealed that “Black children enter school substantially behind their white counterparts in reading and math.” The specific gaps for different racial groups were as follows:

  • In the fall of kindergarten, the black-white gap was .66 and .40 standard deviations on the math exams and reading exams, respectively (page 7). By the end of third grade, the math and reading gaps grow to .88 and .77 standard deviations, respectively.
  • Hispanics enter kindergarten with scores similar to blacks, with scores that are about .74 and .45 standard deviations lower than whites in those respective subjects (Table 3). By the end of third grade, these disparities shrink to .54 and .39 standard deviations. 
  • Asians enter kindergarten with scores that were about .11 and .31 standard deviations greater than whites in those respective subjects. By the end of third grade, there is no Asian advantage. 

Murnane, Willett, Bub, and McCartney (2006) [archived] examined the black-white test score gap using two longitudinal datasets – the kindergarten cohort of the Early Childhood Longitudinal Study (ECLS-K) and the National Institute of Child Health (NICHD). The NICHD measured children’s skill in mathematics and English Language Arts (ELA) when they were 54 months old, just prior to kindergarten (page 101). Similar measures were taken for the children in the ECLS-K, except that they were measured just after the start of kindergarten. The gaps found in the ECLS-K sample were similar in magnitude to the gaps found by Fryer and Levitt, with black-white gaps of about 0.64 and 0.4 standard deviations in mathematics and ELA, respectively. In the NICHD sample, the black-white gaps in mathematics and ELA were about 1 standard deviation at 54 months (pages 109-110).

  • Standard deviations are in parentheses.

Fryer and Levitt (2007) [archived] examined the mental function of very young black and white children from two datasets. The primary dataset came from the Early Childhood Longitudinal Study–Birth Cohort (ECLS-B). The ECLS-B was the first nationally representative sample with measures of mental functioning for children under the age of one. The study involved over 10,000 children born in 2001 who were tested (using a shortened version of the Bayley Scale of Infant Development (BSID)) at two waves, one when children were between 8 and 12 months of age and one when children were around 2 years old. The researchers also investigated the Collaborative Perinatal Project (CPP). The CPP was not a nationally representative sample; it consists of over 31,000 women who gave birth at one of 12 medical centers between 1959 and 1965. Mental function was tested at 8 months of age (using the full BSID), at 4 years of age (using the Stanford-Binet), and at 7 years of age (using the Wechsler Intelligence Test).

Data from the ECLS-B revealed black-white gaps of around .06 and .38 standard deviations at the ages of 9 months and 2 years, respectively. Data from the CPP revealed larger black-white gaps of around 0.1, 0.79, and 0.85 standard deviations at the ages of 8 months, 4 years, and 7 years, respectively. Hispanics and Asians also scored below whites, with the Hispanic-white gap being about as large as the black-white gap and the Asian-white gap being about half as large. The authors summarized the results as follows (page 14):

For purposes of comparison, we reproduce the results for 9 month olds in ECLS in the first two columns of Table 3. The next two columns correspond to these same children tested around age two. Raw gaps of almost .4 standard deviations between Blacks and Whites are present on the test of mental function at age 2. Even after including extensive controls, a Black-White gap of more than .2 standard deviations remains. Hispanics look similar to Blacks. Asians lag Whites, but by a smaller magnitude. The CPP data shows similar patterns. Columns 5 and 6 of Table 3 report results from CPP at age 8 months. These results for Blacks are quite similar to the ECLS estimates at age 9 months in columns 1 and 2 of the table: raw differences of less than .1 standard deviations that shrink and become statistically insignificant with the inclusion of controls. Hispanics actually outperform whites in the raw data in the early wave of CPP (although this appears to be solely an artifact of systematic differences across the set of interviewers who tested Hispanic children relative to those of other races), but do worse with controls. By age four, however, a large test score gap has emerged for Blacks, Hispanics, and “other race.” In the raw data, Blacks lag Whites by almost .8 standard deviations and Hispanics fare even worse. The inclusion of controls reduces the gap to roughly .3 standard deviations for Blacks and .5 standard deviations for Hispanics. The age seven results are generally similar to those at age four.

The following graph illustrates the magnitude of the age-4 cognitive differences quite well (Figure 3B):

Yeung and Pfeiffer (2009) examined data from the Panel Study of Income Dynamics (PSID) and its two waves of Child Development Supplements (CDS). Yeung and Pfeiffer used a sample of 1,794 children, including 856 blacks and 938 whites (page 417). The researchers studied three cohorts of children who were tested at two points in time – once in 1997 and once in 2003. Cognitive skills were measured with the Woodcock-Johnson achievement test-revised, with the applied problem score as an indicator of the child’s math skills and the letter–word test as a measure of children’s verbal skills (page 417). The authors note that, “before children start formal schooling, black children score about 0.78 and 0.43 standard deviations lower than whites on applied problem and letter–word tests, respectively” (page 424).

Kreisman (2012) [archived] examined trajectories for language acquisition for black and white children using data from the Early Head Start Research and Evaluation study (EHSRE). The EHSRE was initiated in 1995 to evaluate the impacts of Early Head Start programs on low income families. The children in the sample were assessed for language acquisition at 14, 24, and 36 months after birth. The author analyzes 1,458 of these children to determine when trajectories in language acquisition begin to diverge for white and black children. He finds that large gaps in language acquisition emerge between black and white children by 24 months of age and widen by 36 months of age. Most surprising are the gaps between black and white children of mothers with more than a high school degree, which grow to over 0.5 standard deviations by 36 months of age. In fact, by 24 months of age, black children of mothers with more than a high school education have lower scores than white children of mothers without a high school degree (Figure 1):

In the conclusion, the author notes that, while various family background variables can explain some of the racial disparity, large gaps persist even after controlling for those variables (page 1444):

Results indicate several significant findings. First, the Black–White gap in in the home language environment, measured by a principal factor score from linguistic variables from the HOME and 3-bag evaluations of the child’s learning environment, is over a half a standard deviation at 14 months and remains relatively constant over the 22 month observation period. Second, although at 14 months Black children slightly outperform their White peers, by 24 months a significant gap in language emerges which widens by 36 months; these gaps persist despite a robust set of controls for demographic characteristics, family circumstances and the home language environment. Third, persistent negative coefficients on interactions between age, race and maternal education, and between race and a time-varying measure of the home language environment confirm Farkas and Beron’s (2004) hypothesis that returns to SES and measures of the home language environment accrue differently with respect to race over these ages. Lastly, a decomposition of the gap at each wave indicates that while at 24 months differences in endowments of the covariates explain much of the measured language gap, at 36 months only half of the race gap is due to differences in endowments of the covariates, and half is attributable to differences in returns to these endowments, in particular to differential returns to maternal education and a measure of the home language environment.

All of the studies reviewed here find large achievement gaps before children even reach kindergarten. The magnitude of the gap varies, with some studies finding gaps of about 0.4 standard deviations and others finding gaps exceeding a full standard deviation. Gottfredson (2004) also finds that the achievement gap is smaller than the IQ gap (Table 6). She posits that one explanation for the lower achievement gaps (compared to the IQ gap) is that the correlation between IQ and achievement is not perfect (page 30). Another explanation might be that achievement tests are less g-loaded than IQ tests and black-white gaps tend to be larger on more g-loaded tests (Neisser et al. 1996, page 93; Nijenhuis and Van den Hoek 2016, table 3). One finding that supports this hypothesis is that the mathematics achievement gap appears to be larger than the reading achievement gap in most of these studies. This might be explained by the fact that mathematics tests tend to be more g-loaded than reading tests. Regardless, it is fair to conclude that the achievement gap between blacks and whites is substantial and emerges within the first few years of life.

Early Childhood Longitudinal Study data

For direct data regarding achievement gaps, we can view publicly available data from the Early Childhood Longitudinal Study (ECLS), a program sponsored by the National Center for Education Statistics (NCES). The program includes 4 longitudinal studies that examine children’s knowledge, skills, and socioemotional development throughout elementary school. One particularly interesting study, the ECLS-B, provides information on cognitive and motor skills at 9 months, 2 years, and 4 years of age across a nationally representative sample of approximately 14,000 children born in the U.S. in 2001. This information is disaggregated by race and published on their official website. The results are as follows:

  • At about 9 months of age [archived], white children outperformed black children on every measure of cognitive skill. For example, compared to black children, a greater percentage of white children demonstrated ability to explore objects with a purpose (84.0% vs 80.8%), proficiency in communication through diverse nonverbal sounds and gestures (30.4% vs 27.8%), and proficiency in engaging in early problem solving (3.9% vs 3.3%). In contrast, black children outperformed white children on every measure of motor skill. For example, compared to white children, a greater percentage of black children demonstrated proficiency in being able to use visual tracking to guide hand movements to pick up a small object (91% vs 88.8%), proficiency in ability to engage in various prewalking types of mobility (69.7% vs 63.8%), and proficiency in ability to walk with help and to stand independently (22.8% vs 18.0%).
  • At about 2 years of age [archived], the exact same patterns were found, only this time the cognitive disparities were more pronounced. White children outperformed black children on every measure of cognitive skill. For example, compared to black children, a greater percentage of white children demonstrated ability to recognize and understand spoken words or to indicate a named object by pointing (88.7% vs 79.4%), verbal expressiveness using gestures, words, and sentences (70.7% vs 55.7%), ability to understand actions depicted by a story, in pictures, or by verbal instructions, and proficiency in engaging in early problem solving (42.2% vs 29.9%), and ability to match objects by their properties (e.g., color) or differentiate one object from another (37.1% vs 25.5%). In contrast, black children outperformed white children on every measure of motor skill. For example, compared to white children, a greater percentage of black children demonstrated ability to use fine motor control with hands (58.0% vs 55.8%) and ability to walk up and down stairs (50.2% vs 48.4%).
  • Finally, at about 4 years of age [archived], we again found similar patterns regarding cognitive skills. White children outperformed black children on every measure of cognitive skill. For example, compared to black children, white children achieved higher average early reading scale scores (27.4 vs 22.9) and mathematics scale scores (31.6 vs 26.9), and a greater percentage of white children were able to name the colors of five pictured objects (71.0% vs 55.3%). However, unlike the results from the earlier followups, black children did not outperform white children in motor skills. In fact, white children had a slightly greater average fine motor score (3.5 vs 3.2).

Aside from expressive vocabulary, where blacks perform relatively well, the results for each outcome exhibit the same pattern: Asians and whites score the highest, followed by mixed children, followed by blacks, Hispanics, Pacific Islanders and Native Americans.

A more recent longitudinal study by the ECLS measured child reading and mathematics scores from kindergarten to fifth grade. The results were consistent with the results presented earlier:

In 2010, black kindergarteners had lower reading scores [archived] as early as the fall of kindergarten. Specifically, the black-white reading score gap was 3.1 points (SD=10.5) in the fall of kindergarten. The gap grew to 4.9 points by the spring (SD=13.2). The gap grew to 7.6 points in the first grade (SD=16.5) and remained fairly stable until the fifth grade. Interestingly, black kindergarteners actually outperformed several other ethnic groups in the fall, including Hispanics (53.0 vs 50.8), Pacific Islanders (53.0 vs 52.7), and American Indians/Alaska Natives (53.0 vs 52.7). However, by the spring of third grade, black children had the lowest reading scores of all ethnic groups.

Similar racial gaps of a greater magnitude were found for mathematics scores [archived]. The black-white mathematics score gap was 6.7 points (SD=10.7) in the fall of kindergarten. The gap grew to 9.3 points by the spring (SD = 12.4). The gap grew to 11.7 points in the first grade (SD=14.4) and 15.7 points in the second grade (SD=16.6). The gap remained fairly stable until the fifth grade where the gap was 16.1 points (SD=15.8). Interestingly, black kindergarteners actually outperformed Hispanics (32.0 vs 31.4) in the fall. However, by the spring of kindergarten, black children had lower mathematics scores than Hispanics and every other ethnic group. To show how significant these gaps are, the black-white mathematics gap was nearly a full standard deviation by second grade (the same magnitude as the black-white IQ gap among adults). The most worrying finding here may be that white and Asian third graders outperform black fifth graders in mathematics.
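The point gaps quoted in the two paragraphs above can be put on a common scale by dividing each gap by the standard deviation reported alongside it. The following is a rough sketch using only the figures quoted in the text. The variable names are hypothetical, and the sketch assumes that each reported SD is the appropriate denominator for the gap it accompanies (the text does not say whether these are pooled or full-sample SDs).

```python
# (assessment wave, black-white gap in scale points, reported SD in points)
READING_GAPS = [
    ("fall of kindergarten", 3.1, 10.5),
    ("spring of kindergarten", 4.9, 13.2),
    ("first grade", 7.6, 16.5),
]
MATH_GAPS = [
    ("fall of kindergarten", 6.7, 10.7),
    ("spring of kindergarten", 9.3, 12.4),
    ("first grade", 11.7, 14.4),
    ("second grade", 15.7, 16.6),
    ("fifth grade", 16.1, 15.8),
]


def standardize(gaps):
    """Return {wave: gap expressed in SD units} for a list of (wave, gap, sd)."""
    return {wave: gap / sd for wave, gap, sd in gaps}


reading_sd_gaps = standardize(READING_GAPS)
math_sd_gaps = standardize(MATH_GAPS)

for subject, table in (("reading", reading_sd_gaps), ("math", math_sd_gaps)):
    for wave, value in table.items():
        print(f"{subject:8s} {wave:24s} {value:.2f} SD")
```

On these figures the second-grade mathematics gap comes out just under one standard deviation, which is the comparison the paragraph above draws.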

Non-cognitive disparities


I have written extensively on the magnitude of black-white disparities in criminality in past posts. I will give a brief review here.

  • Data from FBI crime statistics (2015) [archived] shows that despite making up only 13% of the US population, black people commit 36% of violent crime in the US. Even worse, they commit over half of the robberies and murders in the country. Among criminals under the age of 18, black youth commit over 60% of the robberies and murders in the country, and over half of the violent crime.
  • In 2003, the Bureau of Justice Statistics [archived] (Figure 4) released a report showing that 1 in 3 black males could be expected to go to prison if the current rates of imprisonment had remained unchanged (fortunately, imprisonment rates have decreased since then).
  • The CDC (2015) [archived] reports that homicide is the leading cause of death for black males aged 15-34, with nearly half of deaths for men aged 15-24 the result of homicide (see page 34). Contrast this with white males of the same age for whom homicide causes roughly between 3-5% of deaths (see page 27). In fact, the death rate due to homicides for blacks aged 20-24 (110.8 deaths per 100,000 population) is over 20 times the rate for similarly aged whites (5.4 deaths per 100,000 population).

I provide more detail on the scope and magnitude of black criminality in a separate post.

With such stark racial disparities in crime, a natural question to ask is how early these disparities in misbehavior emerge. Racial disparities in suspensions suggest that black children display disproportionate levels of misconduct very early in childhood.

A 2010 report [archived] by the Southern Poverty Law Center analyzed suspension rates in middle schools in 18 of the nation’s largest school districts. Data on suspensions were gathered from the Civil Rights Data Collection (CRDC). The study found that the suspension rate for black males was nearly 3 times the rate for white males (28.3% vs 10%). The suspension rate for black females was over 4 times the rate for white females (18% vs 4%). In fact, black females had higher suspension rates than white males.

Analyzing the suspension rates by race in each district showed that black children had greater suspension rates in every district except for Indianapolis, where the suspension rate for white children was 1 percentage point higher (Table 1a, Table 1c, Table 1e):

Suspension rate for males by race
District          Black   Hispanic   White   B:W ratio
Palm Beach        53%     20%        18%     2.94
Des Moines        46%     25%        24%     1.92
San Antonio       42%     23%        19%     2.21
Los Angeles       32%     15%        11%     2.91
Springfield, MA   31%     30%        17%     1.82
Jackson, MS       29%     0%         20%     1.45

A 2018 report [archived] by the U.S. Government Accountability Office on discipline disparities shows that the racial disparity in suspension rates is not accounted for by controlling for level of school poverty. At every level of school poverty, black students are significantly more likely to be suspended than students of any other race. In fact, the out-of-school suspension rate for black students attending schools with the lowest level of poverty (7.5%) is higher than the rate for white students attending schools with the highest level of poverty (7.3%) [Table 14].

The racial disparity in misconduct appears even at preschool. For example, a 2014 report [archived] by the U.S. Department of Education Office for Civil Rights shows that “Black students represent 18% of preschool enrollment, but 42% of preschool students suspended once, and 48% of students suspended more than once” (page 3).

The report [archived] by the U.S. GAO on discipline disparities shows black preschoolers are over 3 times as likely as white preschoolers to be suspended (1.1% vs 0.3%) [Table 17].

Now, some people might accept these facts but deny that they demonstrate early misconduct by black children. One might argue that this demonstrates only that black children are disproportionately punished for alleged misconduct, but it doesn’t show that they disproportionately engage in misconduct. Such a person might argue that racial disparities in suspensions are driven entirely or primarily not by racial disparities in behavior but by a deep anti-black bias in schools across the country. To this argument, I have a few responses:

  • While black teachers suspend black children at lower rates than white teachers do, they still suspend black children at a rate several times that of white children. For example, white male teachers suspend black male students at 3 times the rate of white male students (15.4% vs 5.2%), and black male teachers suspend black male students at 2.4 times the rate of white male students (12.7% vs 5.3%) [source: see Figure 2] [archived]. In other words, the racial suspension disparity for male students when taught by black male teachers is about 70% of the disparity when taught by white male teacher (disparity of 10.2 percentage points vs 7.4 percentage points). Therefore, if racial differences in suspension rates are due to an anti-black bias, it appears that the same bias is found among black teachers. While it is certainly possible that black teachers exhibit similarly sized anti-black biases as white teachers, this hypothesis is certainly less immediately plausible than the hypothesis that bias by white teachers is driving the racial difference in suspension.
  • Wright et al. (2014) [archived] used data from the Early Childhood Longitudinal Study – Kindergarten class (ECLS-K) to find that “the racial gap in suspensions was completely accounted for by a measure of the prior problem behavior of the student.” More specifically, the study finds that, prior to including any controls, the odds ratio (OR) on suspensions for being black was OR = 3.78 (page 5). After including controls (e.g., gender, grades, poverty status, teacher race, parent-reported delinquency, etc.), the effect of being black reduced to OR = 1.89 (Table 2, model 1), reducing the racial disparity in suspension odds by 68%. After including a control for prior problem behavior, the effect further reduced to OR = 1.20 (reducing the racial differential by over 90%), with the effect of being black no longer statistically significant. This finding is completely unexpected if racial differences in suspension were explained by strong anti-black bias in the schools. After all, if schools possessed such an anti-black bias, why do black and white children with comparable backgrounds and histories have similar suspension rates? Does the anti-black bias magically disappear when comparing similar blacks and whites? No, the simplest explanation here is that the racial disparity in suspensions is not the result of race per se, but rather is the result of certain variables that correlate with race (such as e.g., delinquency, problem behavior, etc.).
  • According to a report [archived] by the National Center for Education Statistics, black third-graders are significantly more likely to report having assaulted other students. In particular, in 2010, the percentage of black children who reported that they “pushed, shoved, slapped, hit, or kicked other students” (7.9%) was over 6 times greater than the percentage of white children who reported the same (1.2%). A similar report [archived] also found that black third-graders were more likely to report having been victimized in this manner, although the disparity here was less dramatic (20% vs 12.7%).
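
The arithmetic behind the first two bullets above can be checked with a short sketch. The function name is my own and does not come from any of the cited sources. I am also assuming that the "reduction in disparity" for odds ratios is measured as the share of the excess odds (OR minus 1) that is removed, since that assumption reproduces the 68% and 90%+ figures quoted above.

```python
def excess_or_reduction(or_before, or_after):
    """Share of the excess odds (OR - 1) removed when the odds ratio
    falls from or_before to or_after. Assumed definition, see lead-in."""
    return (or_before - or_after) / (or_before - 1.0)

# Suspension rates (percent) for male students, as quoted in the first bullet.
white_teacher = {"black_students": 15.4, "white_students": 5.2}
black_teacher = {"black_students": 12.7, "white_students": 5.3}

# Rate ratios (black students relative to white students).
ratio_white_teacher = white_teacher["black_students"] / white_teacher["white_students"]
ratio_black_teacher = black_teacher["black_students"] / black_teacher["white_students"]

# Gaps in percentage points, and the black-teacher gap as a share of the white-teacher gap.
gap_white_teacher = white_teacher["black_students"] - white_teacher["white_students"]
gap_black_teacher = black_teacher["black_students"] - black_teacher["white_students"]
gap_share = gap_black_teacher / gap_white_teacher

print(round(ratio_white_teacher, 1), round(ratio_black_teacher, 1))  # 3.0 2.4
print(round(gap_white_teacher, 1), round(gap_black_teacher, 1))      # 10.2 7.4
print(round(gap_share, 2))                                           # 0.73

# Odds ratios from Wright et al. (2014), as quoted in the second bullet.
print(round(excess_or_reduction(3.78, 1.89), 2))  # 0.68
print(round(excess_or_reduction(3.78, 1.20), 2))  # 0.93
```

Other definitions of "reduction" (for example, on the log-odds scale) would give somewhat different percentages, so these figures should be read as depending on the assumed definition.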

Further evidence that disparities in discipline are real (i.e. not entirely driven by racial bias) comes from data suggesting that racial differences in misbehavior emerge as early as infancy. Bakermans-Kranenburg et al. (2004) used data from the NICHD Early Childcare Research Network to examine differences in attachment security between 142 black and 1,002 white infants. The children were around 24 months of age when differences in attachment security and compliance were assessed. The study found that black children displayed greater levels of misconduct (page 423):

A set of items indicating children’s compliance showed consistently lower mean scores for the African-American group. African-American children appeared to be less compliant to their mothers’ suggestions or requests, and were less inclined to ‘stop misbehavior when told no’. African-American children showed on average also more active and even rough behavior in the context of play. Play materials were more roughly handled and the children became more easily angry with toys.

In a separate post, I provide more detail on why these data points provide evidence of significant disparities in misbehavior between black and white children.


In a recent post, I reviewed an extensive list of studies showing the robust predictive validity of childhood self-regulation. “Self-regulation” is an umbrella term that covers self-discipline, impulse control, self-control, ability to delay gratification, etc. I showed that self-regulation has strong predictive power for a variety of adulthood outcomes, including high school graduation, college attendance and graduation, criminal offending, single-parent childrearing, etc. In fact, childhood self-regulation maintains this predictive power even after controlling for IQ and parental SES (Moffitt et al. 2011 [archived] and Fergusson et al. 2013). Further, many studies demonstrated that various components of self-regulation (e.g. self-control) outdo IQ in predicting a variety of important life outcomes (Duckworth and Seligman 2005 [archived] and Duckworth et al. 2012 [archived]). I provide more studies and go into more detail on these studies in a separate post. Given the robust predictive validity of childhood self-regulation for adulthood outcomes and given the large black-white disparities in many outcomes associated with self-regulation (e.g. criminal offending, high school graduation, etc.), it is worthwhile to investigate whether black and white children diverge on measures of self-regulation. In this section, I will review studies which find substantial black-white differences in self-regulation that appear extremely early in life.

Watts et al. (2018) [archived] attempted to replicate the famous marshmallow study by Shoda et al. (1990). Researchers used data from the National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD) to examine the influence of various cognitive and behavioral skills at preschool. The researchers investigated the relationship between the ability to delay gratification, cognitive ability, various non-cognitive skills and academic and behavioral outcomes until age 15. The ability to delay gratification was measured by presenting the 54-month-old children with a treat and giving them two options: they could either (a) wait 7 minutes before eating the treat, in which case they would eat the treat and get a reward, or (b) eat the treat before 7 minutes and get no reward. The measure of delay of gratification was recorded as the number of seconds the child waited (7 minutes being the maximum). The authors also measured self-control using parent- and teacher-reports, and impulsivity and attention (using the Continuous Performance Task) when children were 54 months of age. Before getting into race differences, note that the results showed significant associations between the ability to delay gratification and later outcomes:

  • Consistent with other research regarding self-regulation, there was a significant correlation between the ability to delay gratification at age 4 and outcomes at age 15. The results showed that the number of seconds that a child waited at age 4 was significantly correlated with grade 1 achievement (r = .28) and age 15 achievement (r = .24) (Table 4). The results suggested that children’s Grade 1 achievement would improve by approximately 0.1 standard deviations for every additional minute waited at age 4. Comparing children who waited 7 minutes to children who waited less than 20 seconds showed larger effects. Compared to children who waited less than 20 seconds, children who waited 7 minutes scored about .72 and .65 standard deviations higher on achievement composites at grade 1 and age 15, respectively (Table 4).
  • One point worth mentioning is that the association between minutes waited and academic achievement was almost entirely accounted for after controlling for child characteristics (gender, race, and birth weight), family background (maternal age and education, and family income), quality of home environment, child temperament at 6 months, child vocabulary at 36 months, child cognitive functioning at 24 months and 36 months, and a variety of measures of cognitive and behavioral skills and problems. This does not disprove the predictive validity of the ability to delay gratification. The reason for this is that the predictive power of ability to delay gratification is likely due to its association with the broader cognitive and behavioral abilities that were included in the aforementioned controls.

Black children were found to have lower abilities to delay gratification, lower self-control, lower attention, and higher impulsivity:

  • Black children comprised a disproportionate share of children who had shorter waiting times. Among children of nondegreed mothers, black children constituted 16% of children (Table 1), yet constituted 24% of the children of nondegreed mothers who did not wait 7 minutes and 7% of children who did (Table 3). Among children of degreed mothers, black children constituted 2% of children (Table 1), yet constituted 5% of children who did not wait 7 minutes and 0% of children who did (Table 3).
  • There was a medium-sized negative association between being black and the time waited during the experiment (r = –.25, Table 7). For comparison, this association was similar in magnitude to the association between being black and scores on different subtests on the Woodcock-Johnson Psycho-Educational Battery Revised (WJ-R) test, for which the correlations ranged from –.18 to –.33 (Table 7). This suggests a mean difference of about .86 standard deviations between blacks and non-blacks in the ability to delay gratification.
  • There was a small to medium sized negative association between being black and self-control (r = –.16) and attention (r = –.12) (Table 7). Likewise, there was a medium sized positive association between being black and impulsivity (r = .20). This suggests a mean difference of about 0.54, 0.40, and 0.68 standard deviations between blacks and non-blacks in self-control, attention, and impulsivity, respectively.

Note, I calculated mean differences by using equation 9 here to convert from correlation coefficients to standardized mean differences. The calculation assumes that 10% of the sample in this study were black (based on Table 1, which shows that there were 552 children of nondegreed mothers [16% black] and 366 children of degreed mothers [2% black]). This exact same equation was used by Cottrell, Newman, & Roisman (2015) [archived], who also examined the NICHD–SECCYD dataset, to convert from correlation coefficients to standardized mean differences (Table 2).
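
To make the conversion concrete, here is a minimal sketch. I am assuming that the equation referenced above is the standard conversion from a point-biserial correlation r to a standardized mean difference d when a proportion p of the sample belongs to one group, namely d = r / sqrt(p(1 - p)(1 - r²)). That assumption reproduces the figures quoted above, but the function name `r_to_d` and the code itself are only illustrative.

```python
import math

def r_to_d(r, p):
    """Convert a point-biserial correlation r into a standardized mean
    difference d, where p is the proportion of the sample in one group.
    Assumed formula: d = r / sqrt(p * (1 - p) * (1 - r**2))."""
    return r / math.sqrt(p * (1.0 - p) * (1.0 - r ** 2))

# Share of black children implied by Table 1 of Watts et al. (2018):
# 552 children of nondegreed mothers (16% black), 366 of degreed mothers (2% black).
n_nondegreed, n_degreed = 552, 366
p_black = (0.16 * n_nondegreed + 0.02 * n_degreed) / (n_nondegreed + n_degreed)
print(round(p_black, 2))  # 0.1

# Correlations with being black, as reported above (Table 7).
correlations = {
    "delay of gratification": -0.25,
    "self-control": -0.16,
    "attention": -0.12,
    "impulsivity": 0.20,
}

# Magnitude of the implied gap in standard deviations, using p = 0.10.
gaps = {name: round(abs(r_to_d(r, 0.10)), 2) for name, r in correlations.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap}")
# delay of gratification: 0.86
# self-control: 0.54
# attention: 0.4
# impulsivity: 0.68
```

Because p(1 - p) appears in the denominator, the same correlation implies a larger standardized difference when the smaller group makes up a smaller share of the sample, which is why the assumed 10% share matters for these estimates.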

Duckworth et al. (2012) [archived] conducted two longitudinal, prospective studies of middle school students to compare the relative impacts of self-control and intelligence on standardized test scores and school grades. The results of the first study were as follows:

  • The first study was a data analysis of a sample of the 1,364 students in the NICHD Study of Early Child Care and Youth Development (NICHD-SECCYD). Self-control and IQ data were collected from participants in the 4th grade. Self-control was measured by reports from the participant’s mother, father, and teacher. IQ was measured using the Wechsler Abbreviated Scale of Intelligence (WASI). Both 4th grade IQ and self-control (particularly teacher-reported self-control) were significantly associated with 8th and 9th grade achievement and GPA.
  • The study also found a significant negative correlation between black and all measures of self-control (table 1). In fact, the negative correlation between black and teacher-reported self-control (the most predictive measure of self-control) was almost as great in magnitude as the negative correlation between black and IQ (r = –0.30 vs r = –0.35). Given that blacks constituted about 13% of the sample, this implies a mean difference of about 0.94 standard deviations in teacher-reported self-control.

The results of the second study were as follows:

  • The second study involved 510 5th through 8th grade students at two public schools in New York City. Self-control was measured using reports from homeroom teachers, parents, and students who completed the Impulsivity Scale for Children (ISC) test with students as targets. IQ was measured using scores on the Raven’s Progressive Matrices test. Again, teacher-reported self-control was more predictive of success than other measures of self-control. In fact, teacher-reported self-control outperformed IQ in predicting spring GPA (r=0.55 vs r=0.40) (table 3).
  • Like study 1, this study also found a significant inverse correlation between black and all measures of self-control (table 3). In fact, the negative correlation between black and teacher-reported self-control was greater in magnitude than the negative correlation between black and IQ (r=–0.18 vs r=–0.12). Given that blacks constituted 35% of the sample, this suggests a mean difference of about 0.38 standard deviations in teacher-reported self-control.

Duncan and Magnuson (2011) [archived] reviewed literature on the “association between early achievement, attention, and behavior and later school achievement and such late-adolescent schooling outcomes as dropping out and college attendance.” As expected, they found that persistent achievement or behavior problems were predictive of outcomes later in life, including, e.g. high school graduation, college attendance, early-adult crime, etc. (see executive summary). Some of the findings are as follows:

  • Using data from the 1979 National Longitudinal Survey of Youth (NLSY79), researchers found significant associations between persistent anti-social and attention problems at ages 6, 8, and 10 and the probability of graduating high school, attending college, and being arrested when subjects were around 20. Now, attention problems were no longer associated with these outcomes after controlling for an extensive list of covariates, including race, gender, poverty, number of siblings, parental marital status, mother academic aptitude, etc. (see footnote of table 3.A9 for the full list of controls). However, persistent anti-social behavior remained significantly associated with these outcomes despite these controls (tables 3.A10 and 3.A12).
  • Significant black-white gaps in both attention problems and anti-social behaviors emerge as early as kindergarten and gradually grow at least until the 5th grade. For example, the black-white gap in attention/engagement grows from 0.36 standard deviations (SDs) to 0.44 SDs between the 1st and the 5th grade (Figure 3.3). The black-white gap in anti-social behavior grows from 0.31 SDs to 0.5 SDs between the 1st and the 5th grade (Figure 3.4).

Finally, consider Elder and Zhou (2021) [archived], the one study I found with the explicit purpose of investigating black-white gaps in non-cognitive skills. They analyze data from two cohorts of the Early Childhood Longitudinal Study (ECLS) to measure the racial differences in a variety of non-cognitive skills. One cohort features children who graduated kindergarten in 1999 (ECLS-K:1999) and the other cohort features children who graduated in 2011 (ECLS-K:2011). Non-cognitive skills are measured using 5 composite scales known as Social Rating Scales (SRS) comprising 24 total items. The scales are described as follows (pages 108-109):

  • The “externalizing problem behaviors” scale uses information about the frequency with which a child acts impulsively, interrupts ongoing activities, fights with other children, gets angry, and argues.
  • The “approaches to learning” scale is based on information about a child’s attentiveness, task persistence, eagerness to learn, learning independence, flexibility, and organization.
  • The “self-control” scale includes four items that measure a child’s ability to control his or her behavior.
  • The “interpersonal skills” scale uses five items that measure a child’s ability to interact with others.
  • The “internalizing problem behaviors” scale includes four items that rate the presence of anxiety, sadness, loneliness, and low self-esteem.

Data on the non-cognitive scales were extracted from teacher-reports of student behavior from kindergarten to 5th grade. The basic results are as follows:

  • The results showed that, in each grade, “White children outperform Black children in all five measures of noncognitive skills” (page 111). The greatest differences were on the self-control and externalizing problem behaviors scales. For example, in the ECLS-K:2011 cohort, from kindergarten to 3rd grade, the black-white gap in self-control grows from .40 to .58 SDs, the gap in externalizing problem behaviors grows from about .40 to .52 SDs, and the gap in approaches to learning grows from 0.33 to 0.52 SDs (Table 2). Similar patterns were observed in the ECLS-K:1999 cohort (Table B), which is consistent with test score gaps which show that “the Black-White test score gap has remained roughly constant since the early 1990s” (page 106).
  • The gaps in non-cognitive skills were partially reduced after controlling for home environment and school environment. The home environment includes paternal education, parental marital status, whether the child lives in a two-parent household, number of books owned by the child, SES composite index, and child birth weight. Controlling for these variables reduced the non-cognitive gaps by about 45 to 66 percent (page 114). Interestingly, most of the gaps in kindergarten persisted after including these controls (Table 2), whereas more of the gap in later grades was explained by the controls.

The authors note that these reported gaps may actually be underestimates of the actual non-cognitive skill gaps because “teachers’ opinions for what constitutes ‘normal’ levels of achievement and behavior are systematically different in schools with disadvantaged student populations as compared to more advantaged schools” (page 118). The authors implemented a number of different methods to adjust for this “reference bias”. Explaining the different methods is a bit complicated, so you’ll have to read the study for more detail and clarity. Regardless, the study found that each of these methods results in much greater estimates of the black-white gaps in non-cognitive skills. For example, the initial self-control gap of 0.4 SDs in kindergarten increases to 0.6 – 1.0 SDs, depending on which method is applied (Figure 4).

In third grade, the initial self-control gap of 0.5 SDs increases to exceed 0.8 SDs across all methods to control for reference bias (Figure 7).

Most surprising is the fact that the black-white gaps, after adjusting for reference bias, are rather large even after controlling for home environment variables mentioned above. For example, prior to adjusting for reference bias, the third-grade gap in self-control was about 0.28 SDs after controlling for home environment. However, after adjusting for reference bias, the gap was about 0.5 standard deviations even after controlling for home environment. Gaps of similar magnitude were found for approaches to learning, externalizing behaviors, and interpersonal skills.