Last Updated on April 13, 2021
In a previous post, I argued that the black-white cognitive ability gap is responsible for many of the important social disparities we find between blacks and whites, e.g. disparities regarding income, crime, education, occupational prestige, etc. Therefore, insofar as it is important that we resolve racial disparities in these social outcomes, it is also important that we resolve racial disparities in cognitive ability. In order to solve a problem, we should know the cause of the problem. Therefore, assuming it’s important to resolve racial disparities in these social outcomes, it is also important that we know the cause of the cognitive ability gap.
I began investigating possible environmental causes of the cognitive ability gap in another post. I argued that three common environmental explanations – test bias, schooling, and socioeconomic differences – failed to explain the gap. Because common environmental explanations fail, it may prove useful to consider non-environmental (i.e. genetic) explanations of the gap. I will perform this task in this post. First, I will clarify the meaning of estimates of the heritability of intelligence. Next, I will criticize some common poor arguments against genetic explanations of the gap. Then I will consider some arguments for a genetic explanation of the gap that are related to the heritability of intelligence. I conclude that such arguments are not sufficient to support any conclusion either way about the cause of the gap. I end by detailing the kinds of direct data that should be used to confidently conclude whether and to what degree the cognitive ability gap is due to genetic differences.
What is heritability?
The heritability of a trait is the proportion of phenotypic differences between individuals within a population that is due to genotypic differences between those individuals. The mathematical definition of heritability requires understanding variance. To calculate the heritability of a trait, one must split the phenotypic variance of the trait into what Griffiths et al. (2000) call genetic variance and environmental variance:
If a trait is shown to have some heritability in a population, then it is possible to quantify the degree of heritability. In Figure 25-3, we saw that the variation between phenotypes in a population arises from two sources. First, there are average differences between the genotypes; second, each genotype exhibits phenotypic variance because of environmental variation. The total phenotypic variance of the population can then be broken into two parts: the variance between genotypic means and the remaining variance. The former is called the genetic variance, and the latter is called the environmental variance.
Now that we have the notions of genotypic variance and environmental variance, we can give a mathematical definition of heritability. The heritability of a trait is simply the genotypic variance divided by the phenotypic variance (where the phenotypic variance is the sum of the genotypic variance and environmental variance). If the environmental variance is zero (e.g., the environments are identical for all members), then all of the phenotypic variance will be due to genetic variance, which means heritability would be 100%. If the genetic variance is zero (e.g., all members of the population are genetic clones), then all of the phenotypic variance will be due to environmental variance, which means heritability would be 0%. All psychological traits show significant genetic and environmental influence, so the heritability for psychological traits is always greater than 0% and less than 100% (Nisbett et al. 2012, page 132; Plomin et al. 2016). One useful way of thinking about the heritability of a trait is to view it as a measure that allows us to predict the hypothetical variation of a trait if we eliminated all environmental variation. Griffiths et al. (2000) make the following point:
If all the relevant environmental variation is eliminated and the new constant environment is the same as the mean environment in the original population, then [heritability] estimates how much phenotypic variation will still be present. So, if the heritability of performance on an IQ test were, say, 0.4, then, if all children had the same developmental and social environment as the “average child,” about 60 percent of the variation in IQ test performance would disappear and 40 percent would remain.
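The definition and the quoted example can be written compactly. The symbols below (H² for heritability, V_G for genotypic variance, V_E for environmental variance, V_P for phenotypic variance) are notation assumed here for illustration, not taken from the sources above, and the decomposition assumes the simple additive split of variance described in the text.

```latex
% Heritability as the share of phenotypic variance due to genotypic variance
\[
H^2 = \frac{V_G}{V_P} = \frac{V_G}{V_G + V_E}
\]

% Worked example from the quoted passage: H^2 = 0.4
\[
H^2 = 0.4 \;\Rightarrow\; V_G = 0.4\,V_P, \qquad V_E = 0.6\,V_P
\]

% Equalizing environments sets V_E = 0, so the remaining variance is
\[
V_P' = V_G = 0.4\,V_P
\]
```

In other words, 60 percent of the original variance disappears and 40 percent remains, which matches the figures in the quotation.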
Keep in mind that heritability is concerned with the sources of variation of a trait within a population. It is not concerned with the cause of a trait for a given individual. For example, if height is 80% heritable, this does not mean that individuals get 80% of their height from their genetics and 20% from their environment. It would be nonsensical to say that a 70-inch tall man gets 56 inches of his height from his genetics and another 14 inches from his environment. This kind of measure is both impossible and nonsensical, because an individual’s height (or any other phenotype) is the result of an interaction between their environment and their genetics in a way that is not purely additive. Thus, it makes no sense to partition an individual’s phenotype into the component caused by genetics and the component caused by environment as if we could identify and isolate the causes of these two components.
Finally, there is not a single heritability for a given trait. Heritability may differ for different populations and times. This is because the environmental variance (and possibly genetic variance) is not identical across different times and places. In fact, other things equal, the heritability of a trait will tend to be higher in countries with more egalitarian environments, because equalizing environments shrinks the environmental variance. For example, in the book IQ and Human Intelligence, Mackintosh (2011) notes (page 66):
There is no answer, therefore, to the question of what is the heritability of IQ. It may differ in different societies, or in the same society at different times. Since some people profess to see sinister consequences in the possibility that the heritability of IQ might be quite high, and impute sinister motives to those who claim that it is, it is worth suggesting that the heritability of IQ may well have increased in Western societies in the past 100 years or so, and that this is a consequence of some modest improvements in the conditions of those societies. Other things equal, in a society where the heritability of IQ is low, this must be because that society permits significant differences in those environmental circumstances that affect IQ…Any moves towards equality of health care or of educational opportunity, however imperfect, will probably reduce the total variance of IQ in the population, but increase the proportion of that variance attributable to genetic differences between members of that population. In other words, the changes in this direction that have occurred in most industrialized countries in the past 150 years have probably been associated with an increase in the heritability of IQ. And any future increases in such equality of opportunity will presumably increase the heritability of IQ further. A low heritability for IQ could be regarded as a mark of an unjust society.
Heritability of intelligence
There is scientific consensus that intelligence is substantially heritable. The current estimates of heritability range somewhere between 40% and 80% for adults in developed countries (Nisbett et al. 2012, page 132; Plomin and Deary 2015; Plomin et al. 2016). A common finding has been what is called the “Wilson Effect”, which is that the heritability of intelligence increases from childhood to adulthood (Neisser et al. 1996, page 85; Bouchard 2013; Plomin et al. 2016). For example, Plomin and Deary 2015 [archived] report that “for intelligence, heritability increases linearly, from (approximately) 20% in infancy to 40% in adolescence, and to 60% in adulthood. Some evidence suggests that heritability might increase to as much as 80% in later adulthood but then decline to about 60% after age 80.”
As stated earlier, phenotypic variance is due to two sources: genetic variance and environmental variance. This environmental variance can be further broken down into two components: shared environmental effects and non-shared environmental effects. The ACE model is commonly used to separate the proportion of variation in intelligence due to differences in genes, differences in shared environment (between-family differences), and differences in non-shared environment (within-family differences). The shared environment includes the environmental factors that operate at the family level and make children in the same household similar (e.g., household income, parental education, etc.). The non-shared environment includes the factors that operate at the individual level and make children in the same household dissimilar (e.g., peer groups, personal experiences, etc.). Non-shared environmental effects have often been called random environmental effects because they cannot be easily controlled. These three components – genetics (A), shared environment (C), and non-shared environment (E) – can be estimated using twin studies, as Haworth et al. (2010) note:
The twin method uses MZ (identical) and DZ (fraternal) twin intraclass correlations to dissect phenotypic variance into genetic and environmental sources. MZ twins are 100% genetically similar, whereas DZ twins are on average only 50% similar for segregating genes. Environmental variance can be dissected into shared environmental effects (that is, environmental effects that make members of the same family more similar) and non-shared environmental effects (that is, environmental effects that do not make members of the same family similar). These genetic and environmental effects are commonly represented as A, C and E. ‘A’ is the additive genetic effect size, also known as narrow heritability. Heritability can be estimated by doubling the difference between MZ and DZ twin correlations. Shared environment (C, for effects common to family members) refers to variance that makes MZ and DZ twins similar beyond twin similarity explained by additive genetic effects. C can be estimated by subtracting the estimate of heritability from the MZ correlation. In addition, non-shared environmental influences (E) can be estimated from the total variance not shared by MZ twins; non-shared environmental influences are the only influences deemed to make MZ twins different. E also includes measurement error. Twin intraclass correlations were calculated that index the proportion of total variance due to between-pair variance. Rough estimates of genetic (A) and environmental influences (C and E) can be calculated from these twin correlations.
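The rough estimation procedure described in the quotation can be sketched in a few lines of Python. This is a minimal illustration, not the model-fitting approach used in published twin studies: the function name `falconer_ace` is hypothetical, and the input correlations are made-up values chosen only to show the arithmetic.

```python
def falconer_ace(r_mz, r_dz):
    """Rough ACE estimates from MZ and DZ twin intraclass correlations."""
    a = 2 * (r_mz - r_dz)  # A: double the difference between MZ and DZ correlations
    c = r_mz - a           # C: MZ similarity not explained by additive genetics
    e = 1 - r_mz           # E: variance not shared by MZ twins (includes measurement error)
    return a, c, e


# Hypothetical correlations, chosen only for illustration.
a, c, e = falconer_ace(r_mz=0.80, r_dz=0.50)
print(f"A={a:.2f}, C={c:.2f}, E={e:.2f}")
```

By construction the three components sum to 1. With real data these rough estimates can fall outside the 0 to 1 range (for example, C becomes negative when the MZ correlation is more than twice the DZ correlation), which is one reason they are treated only as approximations.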
A common finding is that, as children age, the proportion of variance in intelligence due to genetics (heritability) increases, the proportion due to non-shared environmental factors remains constant, and the proportion due to shared environmental factors declines dramatically (Briley and Tucker-Drob 2013, Figure 2). The result is often that the shared environment is responsible for the smallest proportion of the variance of intelligence for adults in developed countries (Haworth et al. 2010, Figure 1). Some studies have even found that between-family differences (differences due to shared environment) have zero influence on intelligence differences by the time individuals reach adulthood (Neisser et al. 1996, page 85). This is similar to many other psychological traits, for which the environmental sources of variation are mostly the result of differences in the non-shared environment (or within-family differences). For example, Plomin et al. 2016 report that “although environmental effects have a major impact” for psychological traits, “the salient environmental influences do not make siblings growing up in the same family similar.”
There are a variety of methods used to estimate heritability. In the book Human Intelligence, Hunt (2011) mentions the traditional methods for estimating heritability, which include family studies, twin studies, and adoption studies. I will summarize the estimates of heritability using these methods here:
- The correlation of IQ scores among identical twins (0.86 for twins raised together, 0.76 for twins raised apart) is larger than the correlation among fraternal twins (0.55 for twins raised together, 0.35 for twins raised apart). If one assumes that the environments of identical twins are no more similar than the environments of fraternal twins, then the higher correlation of IQ scores found among identical twins can be attributed to their higher level of genetic similarity (identical twins share all of their genes whereas fraternal twins share, on average, only 50% of their segregating genes). Using twins raised together, heritability can be estimated as 2 × (0.86 – 0.55) = 0.62. Using twins raised apart, heritability can be estimated as 2 × (0.76 – 0.35) = 0.82.
- The correlation of IQ scores among identical twins, fraternal twins, and non-twin siblings raised apart is 0.76, 0.35, and 0.24, respectively. If one assumes that these positive correlations are due to genetic similarity (rather than environmental similarity), these figures can provide estimates of heritability. Using identical twins raised apart, the estimated heritability would simply be 0.76 (identical twins share all of their genes). Using fraternal twins raised apart, heritability can be estimated at 2 × 0.35 = 0.7 (fraternal twins share half of their genes). Using siblings raised apart, heritability can be estimated at 2 × 0.24 = 0.48 (siblings share half of their genes).
- The correlation of IQ scores between biological parents and children raised apart is 0.24. If one assumes that these correlations are completely due to the genetic similarity of the parent and child (rather than any environmental similarity), then these figures can provide estimates of heritability. Heritability can be estimated at 2 × 0.24 = 0.48.
These figures were all pulled from Hunt (2011) (Table 8.2) (as of March 2021, this book is available for free at the Internet Archive). Using these methods, estimates for the heritability of intelligence range from roughly 0.5 to 0.8. See Griffiths et al. (2000) and Hunt (2011) for more detail regarding these estimation procedures. Also, see Plomin and Spinath (2004) for correlations of IQ among pairs of individuals with different degrees of genetic relatedness (Figure 1).
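The arithmetic behind the estimates listed above can be reproduced with a short script. This is only a sketch of the back-of-the-envelope calculations in this post, using the correlations quoted from Hunt (2011); the names `REARED_APART` and `kinship_estimate` are hypothetical, and the relatedness values (1.0 and 0.5) encode the assumption that the correlations are entirely genetic in origin.

```python
# (IQ correlation, assumed share of segregating genes) for pairs raised apart.
REARED_APART = {
    "identical twins": (0.76, 1.0),
    "fraternal twins": (0.35, 0.5),
    "non-twin siblings": (0.24, 0.5),
    "biological parent and child": (0.24, 0.5),
}


def kinship_estimate(correlation, relatedness):
    """Heritability estimate, assuming the correlation is purely genetic."""
    return correlation / relatedness


estimates = {
    label: kinship_estimate(r, k) for label, (r, k) in REARED_APART.items()
}

# Twin-difference estimates: double the gap between identical (MZ)
# and fraternal (DZ) twin correlations.
estimates["MZ vs DZ twins, raised together"] = 2 * (0.86 - 0.55)
estimates["MZ vs DZ twins, raised apart"] = 2 * (0.76 - 0.35)

for label, h2 in estimates.items():
    print(f"{label}: {h2:.2f}")
```

Each estimate rests on a different assumption about which similarities are genetic, so the spread across rows is itself a rough indication of how sensitive the result is to those assumptions.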
These methods for estimating heritability have been heavily criticized due to their assumptions. One criticism is that identical twins may be more similar in their environment than fraternal twins. If so, it is not valid to assume that the higher correlation in IQ among identical twins is due entirely to their higher genetic similarity. Another criticism is that children reared apart are often raised in a narrow selection of environments. For example, adoptive parents tend to have a relatively high socioeconomic status and they may exhibit different parenting practices (e.g., they may be more motivated, more patient, etc.) compared to the average population. If the range of environments for children raised apart is not reflective of the range of environments in the general population, then we cannot use such studies to estimate heritability in the general population. See Mackintosh (2011) (Chapter 3) and Hunt (2011) (Chapter 8) for more discussion of these criticisms.
While these criticisms are fair, at most they suggest that the estimates of heritability are not as high as indicated here and/or that we may be unable to provide precise estimates of the heritability of intelligence using these methods. We are still warranted in inferring that the heritability of intelligence is substantial (even if the exact value is difficult to determine). There are a few reasons for this.
- As Plomin et al. 2016 note, twin and adoption designs “generally converge on the same conclusion, despite making very different assumptions, which adds strength to these conclusions.”
- It is implausible to suggest that the correlation of IQ (of around r=0.76) among identical twins reared apart (sometimes separated shortly after birth) is mostly due to their environmental similarity (rather than their genetic similarity) because the correlation of IQ among siblings reared together is only r=0.47 and the correlation of IQ among unrelated children raised together is only r=0.04-0.26 (depending on when their IQs are measured) (see Table 8.2 of Hunt 2011).
- Some twin studies find large correlations between IQ among twins even when minimizing the effect of environmental similarity between twins. For example, Bouchard (1990) found a high correlation of IQ in adulthood among identical twins reared apart (r=0.7) even though many of the twins were separated earlier than 6 months after birth and there was no evidence that any environmental factors (e.g., parental education, material possessions in the home, etc.) explained the high correlation (pages 224-225).
A final criticism is that the high IQ correlation among twins (both fraternal and identical) is due to the fact that they shared a womb at the same time, meaning they had very similar prenatal environments. There are a few reasons to believe that prenatal environmental similarity cannot entirely explain the high IQ correlation among twins. Firstly, identical twins have far higher IQ correlations than fraternal twins, despite the fact that both sets of twins share a womb at the same time, suggesting that the higher IQ correlation among the former group is primarily caused by higher genetic similarity. Secondly, siblings raised apart tend to have much higher IQ correlations (r=0.24) than unrelated adults raised in the same home (r=0.02), despite the fact that none of these individuals share a womb at the same time, suggesting that the higher IQ correlation among the former group is primarily caused by their higher genetic similarity.
Finally, new methods involving molecular genetics enable us to estimate the heritability of intelligence without the assumptions required by twin or adoption designs. For example, genome-wide association studies (GWAS) (Davies et al. 2011, Davies et al. 2015) and genome-wide complex trait analysis (GCTA) (Trzaskowski et al. 2014, Kirkpatrick et al. 2015) have shown substantial heritability of intelligence. Plomin and Deary 2015 give the following useful description of GCTA:
Like other quantitative genetic designs such as the twin design, GCTA uses genetic similarity to predict phenotypic similarity. However, instead of using genetic similarity from groups differing markedly in genetic similarity such as monozygotic and dizygotic twins, GCTA uses genetic similarity for each pair of unrelated individuals based on that pair’s overall similarity across hundreds of thousands of single nucleotide polymorphisms (SNPs) for thousands of individuals; each pair’s genetic similarity is then used to predict their phenotypic similarity. Even remotely related pairs of individuals (genetic similarity greater than 0.025, which represents fifth-degree relatives) are excluded so that chance genetic similarity is used as a random effect in a linear mixed model. The power of the method comes from comparing not just two groups like monozygotic and dizygotic twins, but from the millions of pair-by-pair comparisons in samples of thousands of individuals.
GCTA studies can only provide underestimates for the actual heritability of intelligence. One reason for this is that GCTA can only detect additive effects of common SNPs, which ignores gene-gene and gene-environment interactions that may impact intelligence. Another reason is that GCTA may suffer from imperfect tagging of causal SNPs. Also, GCTA is limited because it requires several thousand individuals to extract genetic similarity across the genome. See Plomin and Deary 2015 for more information on the limitations of GCTA. Despite these limitations, Plomin et al. 2016 report that “GCTA has consistently yielded evidence for significant genetic influence for cognitive abilities.” There is no longer any doubt that intelligence is substantially heritable (even if more research is required to determine the precise quantification) and that genes have a substantial influence on intelligence.
Poor arguments against genetic explanations of the gap
Before investigating some of the arguments for genetic explanations of the gap, I would like to consider some poor arguments against such explanations. The problem with these arguments is that they do not appeal to any direct evidence of an environmental explanation of the gap.
Race is a societal construction with no biological basis
There are a variety of ways of expressing this argument. One might say “race is not real”, “race is a social construct”, “race is not rooted in biology”, etc. The problem is that all of these claims may very well be true, but they say nothing about the possible genetic cause of the cognitive ability gap. It may be true that race is a social construct while simultaneously being true that all of the cognitive ability gap between blacks and whites is due to genetic differences. Even environmentalists who investigate the causes of the race and IQ gap do not accept this argument as evidence against the possibility that genetic differences explain IQ differences. For example, in his book IQ and Human Intelligence, Mackintosh (2011) writes (page 150):
Human beings form a single interbreeding species and no serious geneticist or anthropologist today would subscribe to a view of genetically distinct ‘races’. There is no single genetic marker common to all white groups and absent in blacks, or vice versa; all human genes are found in both groups. Some writers (e.g. Gould, 1986) have attempted to argue from this that there could not be genetic differences for IQ between blacks and whites. The argument seems curious, for it is clear enough that blacks and whites do, on average, differ in the distribution and frequency of certain genes, and the genetic hypothesis needs nothing more than an average difference in the distribution of the no doubt vast array of genes affecting IQ (Jones, 1996)
These a priori arguments have even been rejected by Richard Nisbett, a much more staunch environmentalist. For example, Nisbett (2005) argued that “the evidence most relevant to the question indicates that the genetic contribution to the Black–White IQ gap is nil.” Despite his environmentalist leanings, in his book Intelligence and How to Get It, Nisbett (2009) wrote (page 94):
Laypeople differ markedly in whether they think race differences in IQ have a partly genetic origin or a purely environmental origin—and so do behavioral scientists. Some laypeople I know— and some scientists as well—believe that it is a priori impossible for a genetic difference in intelligence to exist between the races. But such a conviction is entirely unfounded. There are a hundred ways that a genetic difference in intelligence could have arisen— either in favor of whites or in favor of blacks. The question is an empirical one, not answerable by a priori convictions about the essential equality of groups. As it turns out, there is a great deal of empirical evidence on the question.
There are no large genetic differences between blacks and whites
Again, there are a variety of ways to express this argument. Some common expressions include “there is more variation within races than between races”, “all humans share 99.9% of their DNA with each other”, “pairs of individuals from different populations are often more similar than pairs from the same population”, etc. Some have even argued against genetic explanations of the cognitive ability gap by noting that humans migrated out of Africa only 40,000 years ago, which is apparently too recent to plausibly produce genetic differences that might cause differences in intelligence.
The problem with all of these arguments is that small genetic differences are sufficient to cause large effects on observable characteristics. For example, humans and chimpanzees share about 98-99% of their DNA, yet these “small” differences are what allow us to build civilizations. Even among humans, there are vast differences between individuals that are sometimes due to genetic differences despite the fact that humans share 99.9% of our DNA. For example, men tend to be roughly two standard deviations taller than women despite the fact that men and women share 99.9% of all DNA. Finally, we know that there are significant genetic differences between members of different racial groups (e.g., those underlying differences in skin color, hair texture, etc.), so we cannot rule out a priori that genetic differences may also produce differences in intelligence.
In fact, genetic data can be used to determine someone’s self-identified race/ethnicity (Tang et al. 2004) or continent of origin (Bamshad et al. 2003, Witherspoon et al. 2007) with near perfect accuracy. Further, cluster analysis of genetic data from a diverse range of individuals has been shown to partition individuals into clusters that correspond to major geographic regions on the planet (Rosenberg 2002, Rosenberg 2005), suggesting that genetic markers can be used to determine geographic ancestry. Since one’s racial identity typically corresponds to their geographic ancestry, this suggests that genetic markers can be used to determine racial identity.
Regardless, since racial groups obviously have mean genetic differences on a variety of physical characteristics (e.g., skin color, hair texture), there is no reason to assume a priori that racial groups do not also have mean genetic differences with respect to psychological characteristics such as intelligence. In his book Human Intelligence, Hunt (2011) has also written (page 408) about the problems with these kinds of a priori arguments:
While it is true that within-group genetic variation is greater than between-group variation, the amount of genetic variation between groups is quite sufficient, statistically, to make accurate assignments associating an individual with one of the three largest groups of origin in the United States – African, European, and East Asian. There is also a high level of agreement between self-identification as White, Black, East Asian, or Latino and assignment of a person to clusters based upon similarities of their genomes. While there is no one defining characteristic, other than self identification, that assigns someone to a particular racial/ethnic group, identification of a person as a member of a racial/ethnic group will probabilistically tell us something about that person’s standing on a variety of social and biological variables.
It should be clear that no a priori argument can be given against genetic explanations of the cognitive ability gap. In order to disprove the genetic hypothesis, one needs to actually provide evidence that contradicts the hypothesis.
Arguments for genetic explanations of the gap
The heritability of intelligence
One argument for a primarily genetic explanation of racial IQ differences runs as follows: IQ is highly heritable, which means that most individual differences in IQ are due to genetic differences. Differences between groups are just aggregated differences between individuals. Therefore, most of the differences between groups are also due to genetic differences. This argument is flawed for several reasons.
First, experts currently debate the applicability of the heritability estimates for individuals in deprived environments. For example, a meta-analysis by Tucker-Drob and Bates (2016) found that the heritability of IQ is substantially lower in low-SES populations within the United States. The study reports that “genetic variance in intelligence increases from 0.24 at 2 standard deviations below the mean SES to 0.61 at 2 standard deviations above the mean SES.” Also, see Nisbett et al. (2012) [archived] for discussion of evidence that IQ heritability is lower in low-SES populations (pages 132-134).
Second, heritability estimates measure within-group heritability, but within-group heritability says nothing about between-group heritability. Within-group heritability indicates the proportion of the differences between individuals within a population that can be attributed to genetic differences between those individuals. Between-group heritability indicates the proportion of the average difference between populations that can be attributed to genetic differences between those populations. The reason within-group heritability says nothing about between-group heritability is that it is possible for a trait to have high within-group heritability while simultaneously having low between-group heritability between any two given populations. Consider the following examples:
- Using height as an example can again be useful. Imagine two populations, A and B, with identical distributions of genes. In population A, all individuals have equally excellent nutrition. In population B, all individuals have equally terrible nutrition. As expected, the average height of people in population A is greater than the average height of people in population B. We would find that the heritability of height is high within both populations, because there is very little environmental variation between individuals within either population to explain height differences. However, the height gap between the two populations would be attributable 100% to environmental differences.
- A real-world example of this is North and South Koreans. Johnson et al. (2010) [archived] report that “height is on the order of 90% heritable, yet North and South Koreans, who come from the same genetic background, presently differ in average height by a full 6 inches” (for context, one standard deviation for male height is about 3-4 inches). In general, between-group heritability can be much lower than within-group heritability if the relevant environmental feature(s) (e.g. nutrition) that differ between the groups do not differ widely within the groups. So high heritability of individual differences within the two groups does not imply that average differences between the two groups are due to genetic differences.
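The two-population thought experiment above can be checked with a small simulation. This is a toy sketch under loudly simplified assumptions: phenotype is modeled as a purely additive sum of a genetic value and an environmental value (real development is not additive), both groups draw genetic values from the same distribution, and all names and parameter values (`simulate_group`, the standard deviations, the environment means of 70 and 64 inches) are hypothetical.

```python
import random
import statistics


def simulate_group(n, env_mean, rng, genetic_sd=3.0, env_sd=1.0):
    """Return (genetic values, phenotypes) under a toy additive model."""
    genetic = [rng.gauss(0.0, genetic_sd) for _ in range(n)]
    # Environment differs little within a group but may differ between groups.
    phenotype = [g + rng.gauss(env_mean, env_sd) for g in genetic]
    return genetic, phenotype


def within_group_heritability(genetic, phenotype):
    """Genetic variance divided by phenotypic variance within one group."""
    return statistics.pvariance(genetic) / statistics.pvariance(phenotype)


rng = random.Random(0)  # fixed seed so the run is repeatable
gen_a, height_a = simulate_group(10_000, env_mean=70.0, rng=rng)  # good nutrition
gen_b, height_b = simulate_group(10_000, env_mean=64.0, rng=rng)  # poor nutrition

h2_a = within_group_heritability(gen_a, height_a)
h2_b = within_group_heritability(gen_b, height_b)
phenotype_gap = statistics.fmean(height_a) - statistics.fmean(height_b)
genetic_gap = statistics.fmean(gen_a) - statistics.fmean(gen_b)

print(f"within-group heritability: A={h2_a:.2f}, B={h2_b:.2f}")
print(f"mean height gap: {phenotype_gap:.2f}")
print(f"mean genetic gap: {genetic_gap:.2f}")
```

By construction, the expected within-group heritability is 9 / (9 + 1) = 0.9 in both groups, while the expected genetic gap is zero and the expected height gap is 6, so the whole between-group difference comes from the environment despite the high within-group heritability.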
More information on heritability can be found in Griffiths et al. (2000) [archived]. Further discussion about why the heritability of IQ can be misleading regarding group differences in IQ is given by Block (1996) [archived]. Biologist Richard Lewontin (1970) [archived] advanced a similar point to argue that the high heritability of IQ does not suggest that racial differences in IQ are due to genetic differences (pages 7-8). Heritability estimates indicate the role that genetic differences play in explaining individual differences in IQ. But we are interested in knowing the role that genetic differences play in explaining group differences in IQ. This is a question that is not answered by heritability estimates. Even Rushton and Jensen (2005) [archived] – two of the most prominent advocates of a primarily genetic explanation of the IQ gap – have recognized this point (page 239):
The cause of individual differences within groups has no necessary implication for the cause of the average difference between groups. A high heritability within one group does not mean that the average difference between it and another group is due to genetic differences, even if the heritability is high in both groups. However, within-groups evidence does imply the plausibility of the between groups differences being due to the same factors, genetic or environmental. If variations in level of education or nutrition or genes reliably predict individual variation within Black and within White groups, then it would be reasonable to consider these variables to explain the differences between Blacks and Whites. Of course, independent evidence would then be needed to establish any relationship.
This seems correct. The high heritability of IQ makes it more plausible that genetic differences play a large role in explaining group differences in IQ. After all, if the heritability of IQ were 0% (genetic differences did not explain any of the individual differences in IQ), then it would be very implausible (impossible, even) that genetic differences could explain group differences. However, as Rushton and Jensen admit, we need independent evidence to truly establish any conclusions regarding the role that genetic differences play in the IQ gap. It should be noted that even though the high heritability of IQ makes it more plausible that genetic differences explain group differences in IQ, this does not make it implausible that group differences in IQ may be explained entirely by environmental differences.
In fact, we have empirical evidence that there can be large IQ differences between two populations, despite high heritability estimates within both populations – the Flynn Effect. The Flynn Effect refers to the substantial increase in IQ during the 20th century. Throughout many countries around the world, the average IQ increased by about 3 points every 10 years, or about 15 points (a full standard deviation) every 50 years. However, the heritability of IQ throughout the 20th century has been consistently shown to be high. Does this mean that the best explanation of the Flynn Effect is genetic inferiority of our ancestors? Of course not. Compared to our ancestors, we have superior environments, not superior genes. Flynn (2018) [archived] states that some sources of superior environment include better home environments, more years in school, and more cognitively demanding jobs (page 6).
Another heritability-related argument for a primarily genetic cause of the black-white IQ gap focuses on the fact that the black-white IQ gap is larger on more heritable IQ tests and subtests. Rushton and Jensen (2005) [archived] (page 250) argue:
Heritability data are especially informative when the hereditarian and the culture-only models make opposite predictions. For example, the hereditarian model predicts race differences will be greater on those subtests that are more heritable within races, whereas culture-only theory predicts they will be greater on subtests that are more culturally malleable (i.e., those with lower heritabilities) on which races should grow apart as a result of dissimilar experiences. Analyses of several independent data sets support the genetic hypothesis.
The problem with this argument is that a primarily environmental explanation of the IQ gap is perfectly consistent with the finding that the gap is larger on more heritable subtests. To see why, first note that the black-white IQ gap tends to be greater on more g-loaded tests (Nijenhuis and Van den Hoek 2016). The g-loading of a test, to some degree, reflects its cognitive complexity (Gottfredson 2002, page 28). For example, if someone is told a series of digits, repeating the digits in order will have a far lower g-loading than repeating the digits in reverse order. Now, does the fact that black-white IQ gaps are larger on more g-loaded tests show that the gap is due to genetic differences? No, this fact only indicates that blacks lag behind whites on IQ tests with greater cognitive complexity. But this only tells us about differences between blacks and whites; it doesn’t tell us anything about the causes of such differences. In fact, Halpern and Turkheimer (2012) [archived] argued that an environmental explanation of the gap would predict larger gaps on more cognitively complex tests (page 504):
He [Rushton] believes that a genetic hypothesis about the origin of the racial IQ gap would predict this pattern of larger differences for more heritable, heavily g-loaded items, and that environmental ones would not. This belief is mistaken. The construct of g would have no significance if it were not a measure of cognitive complexity. If a group is environmentally disadvantaged, its performance in comparison to non-disadvantaged groups will be greater on more complex tasks than on less complex ones.
Now, note that highly g-loaded tests tend to have higher heritability than less g-loaded tests. This implies that an environmentalist explanation of the gap would also predict larger gaps on more heritable tests (since an environmental explanation of the gap predicts larger gaps on more g-loaded tests). Therefore, the fact that the gap is larger on more heritable tests is not evidence that the black-white IQ gap is due to genetic differences. Flynn (2010) [archived] has presented a similar argument (page 365):
(1) g would be of no interest were it not correlated with cognitive complexity. (2) Given a hierarchy of tasks, a worse performing group (whatever the cause of its deficit) will tend to hit a “complexity ceiling” — fall further behind a better group the more complex the task. (3) Heritability of relevant traits will increase the more complex the task. (4) Thus, the fact that group performance gaps correlate with heritability gives no clue to the origin of group differences.
Originally, Jensen argued: (1) the heritability of IQ within whites and probably within blacks was 0.80 and between family factors accounted for only 0.12 of IQ variance — with only the latter relevant to group differences; (2) the square root of the percentage of variance explained gives the correlation between between-family environment and IQ, a correlation of about 0.33 (square root of 0.12= 0.34); (3) if there is no genetic difference, blacks can be treated as a sample of the white population selected out by environmental inferiority; (4) enter regression to the mean — for blacks to be one SD below whites for IQ, they would have to be 3 SDs (3 × .33= 1) below the white mean for quality of environment; (5) no sane person can believe that — it means the average black cognitive environment is below the bottom 0.2% of white environments; (6) evading this dilemma entails positing a fantastic “factor X”, something that blights the environment of every black to the same degree (and thus does not reduce within black heritability estimates), while being totally absent among whites (thus having no effect on within-white heritability estimates).
This is a compelling argument. It presents two implausible options for the environmentalist. Either they posit that the average environment for blacks is below the bottom 0.2% of white environments, or they posit the mysterious “factor X.” Neither option is particularly attractive for the environmentalist:
- First, it is extremely implausible to posit that the average environment for blacks is below the bottom 0.2% of white environments. This would mean that 99.8% of white people have superior environments to the average black person regarding cognitive development. There is no empirical evidence to support this idea. For example, the average racial gaps in, say, income, education, school quality, etc. are far less than 3 standard deviations; the gaps are often around one standard deviation or less. Furthermore, the correlations between many of these variables and cognitive ability are not great enough to explain the large racial differences in cognitive ability.
- Second, the idea of a “factor X” is also implausible. Lewontin (1970) [archived] famously argued that high within-group heritability does not suggest high between-group heritability by using an example of two handfuls of the same stock of seeds planted in radically different environments (page 7). This is supposed to exemplify a case of large within-group heritability with no between-group heritability. The problem here is that there is no empirical evidence that similar patterns apply to blacks and whites in the United States. There is no empirical evidence that there is some factor (or set of factors) that harms all black people to the same degree, which has no effect on white people, which nevertheless can explain the racial cognitive ability gap.
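The arithmetic behind steps (2), (4) and (5) of the quoted argument can be checked with a short sketch. It takes the 0.12 variance share and the one-SD gap from the quote, and adds one assumption that the quote leaves implicit: that environmental quality is normally distributed, which is needed to turn an SD gap into a percentile. The function name `required_environment_gap` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def required_environment_gap(iq_gap_sd, env_variance_share):
    """Environmental gap (in SDs) needed to produce iq_gap_sd via regression to the mean."""
    # The correlation between environment and IQ is the square root of the variance share.
    correlation = sqrt(env_variance_share)
    return iq_gap_sd / correlation

env_gap = required_environment_gap(iq_gap_sd=1.0, env_variance_share=0.12)
# Assumption: environment quality is normally distributed in the reference group.
share_below = NormalDist().cdf(-env_gap)

print(f"required environmental gap: {env_gap:.2f} SD")
print(f"share of reference group below that point: {share_below:.2%}")
```

Using the unrounded square root gives a gap a little under 3 SDs and a tail share of roughly 0.2%, which is consistent with the figure in the quote; rounding the correlation to 0.33 gives the "3 SDs" stated there.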
Regarding the “factor X”, even Flynn (2018) himself – an environmentalist – has criticized this explanation of the gap:
Needless to say, in the real world, no one could find a Factor X. Any factor that varied between the black and white groups also varied within black and white groups (e.g., family quality, schooling quality, jobs, leisure quality – even the impact of discrimination). Thus, despite trying to answer Jensen’s case that the IQ gap between races could not be explained environmentally, Lewontin made Jensen’s triumph inevitable.
Some consider “systemic racism” to be the “factor X”. There are a few reasons why this is implausible. Firstly, even if we posit that systemic racism exists, we need a way to test the hypothesis that it is responsible for the black-white cognitive ability gap. This requires empirical evidence that accurately measures systemic racism and quantifies the degree to which it impedes the cognitive development of black people. Secondly, even if such empirical evidence exists (which is unlikely), we would need to show that systemic racism has an equally harmful effect on all black children, which is implausible. Lastly, any effect that systemic racism has on the cognitive development of blacks is almost certainly due to its effect on intermediate variables that also vary significantly within both blacks and whites. Flynn (1980) [archived] has made the same point when addressing the plausibility of “systemic racism” as a “factor X” (page 60):
Racism is not some magic force that operates without a chain of causality. Racism harms people because of its effects and when we list those effects, lack of confidence, low self-image, emasculation of the male, the welfare mother home, poverty, it seems absurd to claim that any one of them does not vary significantly within both black and white America. Certainly there are some blacks who have self-confidence, enjoy a stable home, a reasonable income, good housing; and certainly we all know whites who have a poor self-image, suffer from emasculation, or suffer from poverty.
So how do we respond to Jensen’s fork? Should we conclude that genetic differences explain a significant portion of the black-white cognitive ability gap? Not quite. Flynn (2010) [archived] has responded to Jensen’s fork by referring to the Flynn Effect (page 364):
I used the Flynn Effect to break this steel chain of ideas: (1) the heritability of IQ both within the present and the last generations may well be 0.80 with factors relevant to group differences at 0.12; (2) the correlation between IQ and relevant environment is 0.33; (3) the present generation is analogous to a sample of the last selected out by a more enriched environment (a proposition I defend by denying a significant role to genetic enhancement); (4) enter regression to the mean — since the Dutch of 1982 scored 1.33 SDs higher than the Dutch of 1952 on Raven’s Progressive Matrices, the latter would have had to have a cognitive environment 4 SDs (4 × 0.33= 1.33) below the average environment of the former; (5) either there was a factor X that separated the generations (which I too dismiss as fantastic) or something was wrong with Jensen’s case.
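Step (4) of this parallel is the same regression arithmetic applied to the generational gap. A brief sketch, using only the 1.33 SD gap and the rounded 0.33 correlation given in the quote, and again assuming a normal distribution of environmental quality (an assumption not stated in the quote); `implied_environment_gap` is a hypothetical helper name:

```python
from statistics import NormalDist

def implied_environment_gap(score_gap_sd, correlation):
    """Environmental gap (in SDs) implied by a score gap and an environment-IQ correlation."""
    return score_gap_sd / correlation

dutch_gap = implied_environment_gap(score_gap_sd=1.33, correlation=0.33)
# Assumption: environment quality is normally distributed within the later generation.
share_below = NormalDist().cdf(-dutch_gap)

print(f"implied environmental gap between generations: {dutch_gap:.2f} SD")
print(f"share of the later generation below that point: {share_below:.4%}")
```

The implied gap of about 4 SDs would place the average 1952 environment below all but a tiny sliver of 1982 environments, which is the absurdity that motivates the conclusion that something is wrong with the original chain of reasoning.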
Flynn uses the Flynn Effect as an example of large environmental differences in IQ between groups without positing a “factor X” and without positing that the groups differ in environment by 3+ SDs, thereby avoiding both horns of Jensen’s fork. Flynn argues that large environmental differences in IQ between groups can be explained by positing “that the usual environmental factors are more potent between- than within-groups” (page 5). See Flynn (2010) for more details on his model. Note that the Flynn Effect is not sufficient to conclude that the black-white IQ gap is due to environmental differences. Flynn (2010) himself has pressed this point in asserting that “casual explanation of IQ gains does not provide the key to the black/ white IQ gap” (page 364), contrary to what many other environmentalists have claimed. Flynn does not claim that the Flynn Effect has “causal relevance” to the black-white IQ gap; rather he claims that it has “analytic relevance” (page 364). Flynn’s arguments only demonstrate the possibility that environmental differences can account for the black-white IQ gap, but this is not sufficient to conclude anything about the actual cause of the gap. He reiterates this point in Flynn (2018) (page 3):
Even if the gap between the generations is an intelligence gap, and even if it larger than the intelligence gap between the races, providing an environmental explanation for the former cannot substitute for providing an environmental explanation for the latter. It is quite possible that the environmental factors that separate the generations are quite unlike those that separate the races.
So in order to demonstrate that the black-white IQ gap is due to environmental differences, we need to directly test and confirm specific environmental hypotheses. We cannot merely appeal to the Flynn Effect to do the work. Similarly, in order to demonstrate that the IQ gap is due to genetic differences, we need to directly test and refute specific environmental hypotheses. We cannot merely appeal to considerations about heritability to do the work. Flynn (2010) has expressed this point well (page 365):
American blacks are not in a time warp so that the environmental causes of their IQ gap with whites are identical to the environmental causes of the IQ gap between the generations. The race and IQ debate should focus on testing the relevant environmental hypotheses. The Flynn Effect is no shortcut; correlations offered by Rushton and Jensen are no shortcut. There are no shortcuts at all.
Testing environmental hypotheses
Both hereditarians (e.g., Rushton and Jensen) and environmentalists (e.g., Flynn) would agree that analogical arguments regarding the heritability of IQ or the Flynn Effect are not sufficient to determine the degree to which genetic differences or environmental differences are causally responsible for the cognitive ability gap. Rather, we need independent evidence to determine whether the gap is due to genetic differences or environmental differences. This involves testing various environmental hypotheses. Different environmental hypotheses posit different environmental factors as being causally responsible for the gap. To establish that a candidate environmental factor X is causally responsible for the gap, we need to establish the following two conditions:
- Correlation: this requires establishing that racial differences in the posited factor X correlate with racial differences in cognitive ability. That is, racial differences in cognitive ability must be reduced when racial differences in X are reduced (all else equal). If the proposed environmental hypothesis posits that X accounts for the entirety of the gap, this requires establishing that racial differences in cognitive ability are eliminated when racial differences in X are eliminated (all else equal).
- Non-spuriousness: establishing the correlation between racial differences in X and racial differences in cognitive ability is not sufficient to indicate the causal effect of X. We also need to establish that the correlation is not spurious. This is because, for any given two variables which are correlated, there may be any number of possible explanations for the correlation. Of course, the most obvious possible explanation is that there is a causal relationship between the two variables. But another plausible explanation is that there is a third variable (a “confounding variable”) that causes both variables, resulting in both variables being correlated without either being causal. Therefore, in order to show that a correlation is due to the causal effect of X, we need to rule out alternative plausible explanations, i.e. rule out plausible confounders that might be causally responsible for the correlation.
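The non-spuriousness condition can be illustrated with a generic sketch. In this made-up example, a confounder `z` causes both `x` and `y`, and `x` has no causal effect on `y` at all. The variables, the effect sizes and the helper names `pearson` and `partial_corr` are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(20_000)]   # confounder
x = [zi + rng.gauss(0, 1) for zi in z]         # caused by z only
y = [zi + rng.gauss(0, 1) for zi in z]         # caused by z only, not by x

print("raw correlation of x and y:", round(pearson(x, y), 2))
print("correlation after controlling for z:", round(partial_corr(x, y, z), 2))
```

The raw correlation is substantial while the partial correlation is close to zero, which is what a spurious association looks like once the confounder is measured. When the confounder is unmeasured, the raw correlation is all one sees.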
Another condition important for establishing a causal effect of X is establishing a mechanism that explains the relationship between X and the cognitive ability gap. While this is important, this is not considered to be necessary to establish causation (e.g. see this chapter [archived] of a social science textbook on causation and research design). That said, I will now consider the legitimacy of different methods to determine whether genetic differences or environmental differences are causally responsible for the cognitive ability gap.
The value and limitations of statistical explanations
Some people attempt to determine whether environmental (rather than genetic) differences are causally responsible for the cognitive ability gap by using the following strategy. First, take a dataset that contains cognitive ability scores for an (ideally representative) sample of blacks and whites, then statistically control for the proposed environmental factor(s), and finally measure the magnitude of the gap that persists after controlling for those factors. In other words, this strategy attempts to determine whether (and to what extent) environmental differences cause the cognitive ability gap by quantifying the portion of the gap that is statistically explained after controlling for the proposed environmental factor(s). However, there are two deep flaws with using this strategy:
- This approach may underestimate the role that environmental differences play in explaining the gap. This is because there may be some uncontrolled environmental factors that account for the cognitive ability gap. For example, statistically controlling for differences in SES between blacks and whites may not adequately control for differences in parenting practices which may have a large impact on cognitive differences. In order to accurately estimate the genetic component of cognitive differences, we would need to statistically control for all relevant environmental factors. But this is implausible for a variety of reasons. First, our measurements of the proposed environmental factors may be unreliable (e.g., precisely measuring “parenting practices” may be difficult). Secondly, we may lack data on all relevant environmental factors (e.g., many studies control for family SES, but not parenting practices). Thirdly, we might be unaware of which environmental factors are relevant.
- This approach may overestimate the role that environmental differences play in explaining the gap. This can happen if there is a spurious relationship between proposed environmental factors and the gap. For example, if the entirety of the IQ gap was due to genetic differences, we would expect that the IQ gap would reduce statistically after controlling for SES. This is because higher-IQ parents are more likely to produce high-SES environments, which means that controlling for SES also controls for parental IQ differences, which in turn controls for genetic differences. Therefore, even if controlling for SES reduced the IQ gap by some amount, we cannot infer that this reduction was caused by controlling for SES. This mistaken inference has been called the “sociologist’s fallacy” by hereditarians. This point has been advanced by hereditarians – e.g. Rushton and Jensen (2005) (page 267) – and has been acknowledged by environmentalists – e.g. Nisbett et al. (2012) (page 7) and Neisser et al. (1996) (page 94).
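Both flaws can be seen in a toy simulation. This is a sketch under invented assumptions, not an analysis of real data: the two hypothetical groups, the effect sizes and the function names are all made up. In the first scenario, the gap is caused by two factors but only one is measured and controlled. In the second, the controlled covariate has no causal effect and merely shares a hidden cause with the outcome.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def raw_gap(group, outcome):
    """Difference in mean outcome between group 1 and group 0."""
    g1 = [y for g, y in zip(group, outcome) if g == 1]
    g0 = [y for g, y in zip(group, outcome) if g == 0]
    return mean(g1) - mean(g0)

def adjusted_gap(group, covariate, outcome):
    """Coefficient on group from an OLS fit of outcome on group and covariate."""
    v_g, v_c = cov(group, group), cov(covariate, covariate)
    c_gc = cov(group, covariate)
    c_gy, c_cy = cov(group, outcome), cov(covariate, outcome)
    return (c_gy * v_c - c_cy * c_gc) / (v_g * v_c - c_gc ** 2)

def unmeasured_cause(n, rng):
    """The gap has two causes; only one of them is measured and controlled."""
    group, measured, outcome = [], [], []
    for i in range(n):
        g = i % 2
        m = 0.5 * g + rng.gauss(0, 1)   # measured factor, causal
        u = 0.5 * g + rng.gauss(0, 1)   # unmeasured factor, equally causal
        group.append(g)
        measured.append(m)
        outcome.append(m + u + rng.gauss(0, 1))
    return group, measured, outcome

def spurious_control(n, rng):
    """The covariate has no causal effect; it only shares a hidden cause with the outcome."""
    group, covariate, outcome = [], [], []
    for i in range(n):
        g = i % 2
        hidden = 1.0 * g + rng.gauss(0, 1)            # unmeasured common cause
        group.append(g)
        covariate.append(hidden + rng.gauss(0, 1))    # driven by the hidden cause
        outcome.append(hidden + rng.gauss(0, 1))      # driven by the hidden cause only
    return group, covariate, outcome

rng = random.Random(1)
for name, scenario in [("unmeasured cause", unmeasured_cause),
                       ("spurious control", spurious_control)]:
    g, c, y = scenario(40_000, rng)
    print(f"{name}: raw gap {raw_gap(g, y):.2f}, adjusted gap {adjusted_gap(g, c, y):.2f}")
```

In both scenarios the raw gap is about 1.0 and the adjusted gap is about 0.5. The same statistical result arises from opposite causal structures, so the size of the "explained" portion cannot by itself tell us how much of a gap the controlled factor actually causes.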
This second point has been articulated nicely in a context that is directly relevant to the topic of this post. Consistent with the points I’ve made, Magnuson and Duncan (2006) [archived] emphasize that a causal relationship between racial differences in SES and racial differences in children’s test scores is not guaranteed just because controlling for the former reduces the latter (page 386):
Simply documenting that SES accounts for about .4–.5 of standard deviation of the black–white test score gap does not prove that differences in SES have caused differences in children’s test scores. For example, although Fryer and Levitt (2004) are able to account for virtually all of the racial and ethnic gaps in kindergarten achievement using measures of family background, they lack any measure of genetic endowments and are thus unable to discount the possibility that what appear to be family socioeconomic effects are really caused by other family characteristics.
Despite these points, statistical explanations are useful when it comes to evaluating the merits of specific environmental factors as proposed explanations of the gap. There are two reasons for this:
- Statistical explanations can falsify the hypothesis that certain environmental factors are explanations of the gap. For example, the fact that parental income/education fails to statistically explain a large portion of the cognitive ability gap (as shown in a previous post) is sufficient to falsify the hypothesis that differences in parental income/education are responsible for most of the cognitive ability gap.
- Statistical explanations can establish that certain environmental factors are viable candidate causal explanations of the gap. For example, if the cognitive ability gap is eliminated after controlling for racial differences in parenting practices, this establishes that parenting practices are a viable candidate explanation of the gap (of course, this does not verify that those parenting practices are responsible for the gap, due to possible issues with spuriousness that I mentioned earlier).
I review studies that attempt to statistically explain the cognitive ability gap in a later post.
Establishing non-spuriousness using direct data
In order to establish that environmental factors are causally responsible for the black-white cognitive ability gap, we must establish a non-spurious association between such factors and the gap, i.e. an association that persists after controlling for possible confounders. This requires establishing an association between environmental factors and the cognitive ability gap that persists after controlling for genetic confounding. In order to control for genetic confounding, I believe we must rely on direct data, i.e. data that measures the cognitive ability of black children raised in sufficiently enriched environments. This kind of data can establish a non-spurious relationship between environment and cognitive ability differences.
For example, imagine an experiment where we acquire a representative sample of black and white infants at birth. For each infant, let’s say we randomly assign them to be raised by either a random black family or a random white family immediately upon birth. Then we measure the cognitive ability of the children when they reach adulthood. I believe that such data allows us to isolate the relative effects of genetic differences vs environmental influences on the black-white cognitive ability gap, enabling us to quantify the degree to which the cognitive ability gap is due to genetic differences vs environmental differences. Any cognitive ability gap that obtains between black children raised by black families vs white families can be attributed to differences in the environments associated with the two families. Additionally, any cognitive ability gap that obtains between black children and white children raised by white families can be attributed to genetic differences.
Obviously, this is an idealized experiment that cannot be performed practically or ethically (e.g., it’s impractical/unethical to get a random sample of infants, to assign them to families immediately upon birth, to assign them to a perfectly random black or white family, etc.). Even though no actual experiments can realize this ideal, we can use actual experiments that approximate this ideal as much as possible (e.g., transracial adoption studies). Note, however, that even these ideal experiments cannot control for all relevant environmental factors (e.g., they cannot control for the prenatal environment). Nevertheless, I believe this kind of direct data (e.g., transracial adoption studies) is the best data to establish an association between environmental differences and cognitive ability differences while controlling for possible genetic confounds, which is necessary to establish that environmental differences are causally responsible for cognitive ability differences. Other writers have also remarked on the usefulness of this kind of data. For example, Mackintosh (2011) argues that the “critical experiment” for determining whether genetics are responsible for the gap involves a method where we “take a random sample of black and white children at birth and bring them up in carefully matched adoptive homes or other comparable environments and measure their IQ scores at age 10 or so” (page 150). Loehlin (2000) has also remarked (page 185) on the usefulness of transracial adoption studies:
When infants of one racial group are reared by parents of another, if the children tend to display the characteristics of the adopting group, it is prima facie evidence for postnatal environmental effects, and if they tend to display the characteristics of the group from which they came, it is prima facie evidence for the genes or prenatal effects.
That establishes what I mean by “direct” data. “Indirect” data is any data that is not direct. I have already criticized one kind of indirect data, i.e. analyses that involve measuring the proportion of the cognitive ability gap that can be statistically explained by some proposed set of environmental factors. Another example of indirect data would be data regarding racial differences in other traits that correlate with intelligence. For example, Rushton and Jensen (2005) appeal to racial differences in brain size as evidence that genetic differences are responsible for some portion of the black-white cognitive ability gap. I ignore this kind of indirect data for the following reasons:
- It is not clear what proportion of racial differences in brain size are due to genetic versus environmental differences, and
- It is not clear to what degree racial differences in brain size cause cognitive differences. The correlation between brain size and intelligence is only 0.3-0.4 according to a meta-analysis by McDaniel (2005) [archived], and we have examples of groups with mean brain size differences (even after controlling for body size) with no mean IQ differences (e.g. men and women).
These criticisms apply to most other instances of indirect data used by hereditarians. Finally, another benefit of using direct data is that it allows us to quantify the degree to which black-white differences in cognitive ability are due to genetic differences vs environmental differences. To illustrate this, return to the idealized, perfect experiment I mentioned earlier. In this ideal experiment, if we find a black-white IQ gap of X points among children raised by random white families, then we can infer that X points of the IQ gap is due to genetic differences.
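The logic of the idealized experiment can be sketched as a toy simulation of a randomized cross-fostering design. All of it is hypothetical: the two origin groups, the planted effect sizes and the function names are assumptions chosen for illustration, and nothing here is an estimate from real data. Because rearing environment is assigned at random, it is independent of origin, so the design recovers whatever effects are planted. Note that the "origin" effect here bundles genetic and prenatal influences, since the design cannot separate the two.

```python
import random
from statistics import fmean

def simulate_cross_fostering(n, origin_effect, rearing_effect, seed=0):
    """Generate (origin, rearing, outcome) rows with rearing assigned at random."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        origin = rng.randint(0, 1)    # group the child was born into
        rearing = rng.randint(0, 1)   # randomly assigned rearing environment
        outcome = origin_effect * origin + rearing_effect * rearing + rng.gauss(0, 1)
        rows.append((origin, rearing, outcome))
    return rows

def cell_mean(rows, origin, rearing):
    """Mean outcome for one origin-by-rearing cell."""
    return fmean([y for o, r, y in rows if o == origin and r == rearing])

def estimate_effects(rows):
    """Return (origin effect, rearing effect), each averaged over the other factor."""
    origin_est = fmean([cell_mean(rows, 1, r) - cell_mean(rows, 0, r) for r in (0, 1)])
    rearing_est = fmean([cell_mean(rows, o, 1) - cell_mean(rows, o, 0) for o in (0, 1)])
    return origin_est, rearing_est

# Two hypothetical worlds: one where only rearing matters, one where both matter.
for planted in [(0.0, 1.0), (0.6, 0.4)]:
    rows = simulate_cross_fostering(40_000, *planted)
    origin_est, rearing_est = estimate_effects(rows)
    print(f"planted {planted}: origin {origin_est:.2f}, rearing {rearing_est:.2f}")
```

The estimates track the planted values in both worlds, which is the sense in which such a design would quantify each contribution rather than assume it.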
In conclusion, we should not use indirect data to determine whether the black-white cognitive ability gap is explained by genetic differences or environmental differences. Instead, we should use direct data, i.e. data that measures the cognitive ability of random samples of black children raised in sufficiently enriched environments (e.g., transracial adoption studies). The use of direct data allows us to quantify the degree to which racial differences in cognitive ability are due to genetic differences vs environmental differences. I attempt to review all relevant direct data in a later post.
- James Flynn (1980). Race, IQ, and Jensen [archived].
- Ulric Neisser et al. (1996). “Intelligence: Knowns and unknowns” [archived]. Inspired by the heated debate regarding intelligence following the release of The Bell Curve, the Board of Scientific Affairs of the American Psychological Association established a task force of 11 experts on intelligence to prepare an authoritative report surveying the current state of the field. The report was continually revised and discussed until it received unanimous support from each member of the task force.
- Arthur R. Jensen (1998). The g factor.
- Christopher Jencks and Meredith Phillips (1998). The Black-White Test Score Gap. A comprehensive collection of works outlining the history, causes, and impacts of Black-White test score gaps. The first chapter can be found here [archived].
- J. Philippe Rushton and Arthur R. Jensen (2005). “Thirty years of research on race differences in cognitive ability” [archived]. Rushton and Jensen present the case for a “hereditarian” model of the cause of the black-white gap in cognitive ability, i.e. that genetic differences explain at least 50% of the gap. They give their interpretation of an ample amount of data across a wide range of domains to support their argument.
- Richard E. Nisbett et al. (2005). “A Commentary on Rushton and Jensen” [archived]. An environmentalist objection to the argument from Rushton and Jensen’s article “Thirty years of research”.
- J. Philippe Rushton and Arthur R. Jensen (2005). “Wanted: More Race Realism, Less Moralistic Fallacy” [archived]. A response to the objections from Nisbett in “A Commentary on Rushton and Jensen” and other environmentalist criticisms.
- James Flynn (2010). “The spectacles through which I see the race and IQ debate” [archived].
- Earl Hunt (2011). Human Intelligence. A comprehensive survey of our scientific knowledge about human intelligence. See chapters “The genetic basis of intelligence” and “The demography of intelligence” for an overview of research relevant to the heritability of intelligence and the nature/causes of racial differences in intelligence, respectively.
- Nicholas Mackintosh (2011). IQ and Human Intelligence. An authoritative overview of the main issues in intelligence research. See chapters “The heritability of IQ” and “Group differences” for an overview of research relevant to the heritability of intelligence and the nature/causes of racial differences in intelligence, respectively.
- James Flynn (2018). “Reflections about intelligence over 40 years” [archived].
For arguments from prominent advocates of the environmentalist model, see works by James Flynn and Richard Nisbett. For arguments from prominent advocates of the hereditarian model, see works by Philippe Rushton and Arthur Jensen.