Regional analyses of racial disparities in violent crime in the United States

Last Updated on August 6, 2023

In this post, I analyze racial disparities in violent crime across different regions in the United States. Relying on publicly available data published by the CDC, the FBI, and the Census, I perform different analyses on racial differences in violent crime at the state, county, and city levels. For example, I consider differences in violent crime commission or victimization rates, the percentage of violent crime offenders or victims by race, correlations between racial demographics and violent crime rates, and regression analyses to estimate the association between racial demographics and violent crime controlling for socioeconomic variables.

None of the findings here are too surprising. They mostly reinforce my previous posts discussing the ubiquity of racial disparities in violent crime in the United States, showing that blacks have by far the highest incidence of violent crime, followed by Hispanics, whites, and Asians (in that order). A much more in-depth analysis of racial disparities in crime by state and county was conducted by Random Critical Analysis [archived] in 2015.

Sources


The main sources I used for this post are described in this section.

Homicide death data (CDC)

The CDC has published several free databases under their CDC WONDER program. These databases allow the public to analyze statistics on various health-related data in the United States, such as births, deaths, cancer, etc. The Underlying Cause of Death database is particularly relevant for my purposes. This database allows users to query data from death certificates between 1999 and 2020 and filter/breakout the data by variables such as race, gender, cause-of-death, place of residence, etc. In this post, I rely heavily on racial disparities in deaths due to homicides by region (e.g. state or county).

Crime offense data (FBI)

The FBI’s Uniform Crime Reporting (UCR) program gathers data on arrests, offenses, offender characteristics, victim characteristics, etc. reported by law enforcement agencies in the United States. Data on crime from 1995 to 2019 is available on the FBI’s official website. More recent data is available at the FBI’s Crime Data Explorer (CDE), which allows for more interactive data investigation. What is particularly useful about the CDE is that it allows users to filter for data at the level of particular police departments. I will rely on this functionality to estimate rates of crime at the city level.

Both the FBI data and the CDC data have their own advantages and disadvantages. The CDC data has two main advantages: it is more complete than the FBI data, as it relies on all death certificates in the country instead of voluntary reporting by law enforcement agencies; and it allows querying with more specific filters (e.g., separating Hispanic whites from non-Hispanic whites). The FBI data has three main advantages: it covers more crimes (whereas the CDC is limited to homicide deaths alone); it includes both victim characteristics and offender characteristics (whereas the CDC only has data on homicide victims); and it is available at the level of individual police departments (whereas the lowest level of granularity for the CDC data is the county level).

Demographic and socioeconomic data (Census)

The demographic and socioeconomic data was gathered from the U.S. Census. Depending on the particular data that I wanted, I sometimes used the official website for Census data (e.g., this link shows racial/ethnic demographic data by state) and I sometimes used the Census QuickFacts website, as it made data collection quicker. These sources were used to collect data on racial/ethnic demographics, median income, poverty, single motherhood, etc. by state and city.

National data


To start, I consider national data on racial disparities in violent crimes. This data should not be surprising given my previous posts. But a quick refresher on the national data can be useful.

First, let us start by considering racial disparities in homicide death rates as reported by the CDC. The following table shows the number and rate of homicide deaths by race/ethnicity in 2020.

Here are the query criteria used to generate this table.

The data shows large differences in homicide by race and ethnicity. The non-Hispanic black homicide death rate is nearly 10 times the rate for non-Hispanic whites (30.6 vs 3.2) and about 5 times the rate for Hispanics (30.6 vs 6.2). Moreover, blacks constituted about 55% of homicide deaths in 2020 (13,594 out of 24,545) despite constituting about 13% of the population. Among the major racial/ethnic groups, the homicide rate is by far the lowest for non-Hispanic Asians and Pacific Islanders at 1.7, about half the rate for non-Hispanic whites. The homicide death rate for non-Hispanic blacks is about 18 times the rate for non-Hispanic Asians and Pacific Islanders.
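The ratios quoted above follow directly from the published rates. A minimal sketch of the arithmetic, using only the figures quoted in this section (the function name `rate_ratio` is mine, purely for illustration):

```python
# Figures quoted in the text above (CDC, 2020): homicide deaths per 100,000.
rates = {
    "non-Hispanic black": 30.6,
    "non-Hispanic white": 3.2,
    "Hispanic": 6.2,
    "non-Hispanic Asian/Pacific Islander": 1.7,
}


def rate_ratio(group: str, reference: str) -> float:
    """Ratio of one group's homicide death rate to a reference group's rate."""
    return rates[group] / rates[reference]


# Counts quoted in the text above: black homicide deaths and all homicide deaths.
black_deaths, total_deaths = 13_594, 24_545
share_of_deaths = black_deaths / total_deaths

for reference in ("non-Hispanic white", "Hispanic", "non-Hispanic Asian/Pacific Islander"):
    ratio = rate_ratio("non-Hispanic black", reference)
    print(f"non-Hispanic black vs {reference}: {ratio:.1f}x")
print(f"black share of homicide deaths: {share_of_deaths:.1%}")
```

Because the published rates are rounded to one decimal place, the ratios are only accurate to about one significant digit after the decimal point.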

The homicide death rates reported by the CDC are in line with the arrest data reported by the FBI. The following table shows the number and percent of arrests by race in 2019.

As you can see, blacks are highly overrepresented in each of the crimes reported here, with the largest disparities for the most serious crimes, such as violent crimes and specifically homicides. For example, despite constituting just 13.6% of the population (see demographic data here [archived]), blacks were arrested for about 30% of property crimes, 36% of violent crimes, 33% of aggravated assaults, 51% of homicides, and 53% of robberies.

Consistent with the CDC homicide data, Asian Americans are highly underrepresented in each of the crimes reported here. Despite constituting 6.3% of the U.S. population, Asians were arrested for just 1.9% of property crimes, 2.3% of violent crimes, 1.6% of homicides, and 1.1% of robberies.

The FBI’s Crime Data Explorer tool also provides estimates of the percent of offenders/victims by race as reported by law enforcement agencies. Here are the findings for 2021:

So blacks are about 38% and whites 56% of victims of violent crimes. However, blacks and whites are each about 44% of offenders for violent crimes according to law enforcement agencies. Again, Asians are highly underrepresented as both victims and offenders of violent crimes, constituting only 1 to 2% of incidents.

States


In this section, I analyze racial disparities in crime commission and victimization at the state level.

To start, I compiled the percentage of offenders of violent crimes who were black in 2021 according to the CDE. The data and calculations are stored in my Google spreadsheet here. The following table shows violent crime data in the 20 states with at least 10% of the population black. The “General population” column indicates the percentage of the state’s population that is black according to the U.S. Census. The next 5 columns indicate the percentage of the corresponding category of crime with a black offender (Note: the sample used for these percentages is restricted to cases where the race of the offender is known; the race of the offender was known in about 90% of the crimes).

Note: The last column indicates the percentage of the state’s population that was covered by the law enforcement agencies that reported data. This provides some information about the likelihood that the CDE data represents the actual crimes known by law enforcement. Caution should be used when interpreting findings for states with a low figure here. For example, in Florida, only 17% of the population was covered by the law enforcement agencies that reported data.

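| State | # | General population | Violent | Murder | Rape | Robbery | Aggravated Assault | Population covered by CDE |
|---|---|---|---|---|---|---|---|---|
| Mississippi | 1 | 37.6% | 70% | 86% | 51% | 87% | 69% | 59% |
| Louisiana | 2 | 32.0% | 64% | 79% | 47% | 79% | 65% | 73% |
| Georgia | 3 | 31.2% | 70% | 79% | 54% | 87% | 70% | 89% |
| Maryland | 4 | 29.4% | 70% | 69% | 50% | 84% | 63% | 65% |
| South Carolina | 5 | 26.6% | 62% | 80% | 47% | 77% | 62% | 99% |
| Alabama | 6 | 26.5% | 69% | 79% | 51% | 85% | 67% | 82% |
| Delaware | 7 | 21.6% | 67% | 77% | 42% | 84% | 64% | 98% |
| North Carolina | 8 | 21.1% | 65% | 70% | 45% | 80% | 64% | 97% |
| Virginia | 9 | 18.8% | 57% | 72% | 37% | 74% | 55% | 100% |
| Tennessee | 10 | 16.6% | 60% | 70% | 41% | 81% | 56% | 100% |
| Florida* | 11 | 15.3% | 36% | N/A | 36% | 60% | 56% | 17% |
| Arkansas | 12 | 15.3% | 48% | 60% | 28% | 69% | 49% | 97% |
| New York* | 13 | 14.3% | 56% | 57% | 34% | 73% | 53% | 20% |
| Illinois | 14 | 14.0% | 70% | 77% | 40% | 86% | 63% | 64% |
| Michigan | 15 | 13.6% | 58% | 72% | 33% | 77% | 60% | 93% |
| New Jersey* | 16 | 12.7% | 57% | 80% | 38% | 69% | 56% | 48% |
| Ohio | 17 | 12.2% | 64% | 73% | 42% | 79% | 61% | 91% |
| Texas | 18 | 11.8% | 46% | 52% | 27% | 60% | 43% | 98% |
| Missouri | 19 | 11.4% | 49% | 65% | 31% | 72% | 46% | 95% |
| Pennsylvania* | 20 | 10.6% | 74% | 78% | 54% | 83% | 71% | 36% |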

* These states had data from law enforcement agencies that covered only a minority (<50%) of the population. Interpret their numbers with caution. In particular, Florida only reported data for 80 violent crime offenses.

To summarize the findings:

  • Violent crime: blacks committed >45% of violent crimes in 19 out of the 20 states. They committed >50% of violent crimes in 16 out of the 20 states. They committed >60% of violent crimes in 11 out of the 20 states.
  • Homicide: blacks committed >50% of homicides in each of the states with data (Florida had no data on homicides in the CDE). They committed >70% of homicides in 14 out of the 19 states.
  • Robbery: blacks committed >60% of robberies in each of these states. They committed >70% of robberies in 16 out of the 20 states.
  • Aggravated assault: blacks committed >40% of aggravated assaults in each of these states. They committed >50% of aggravated assaults in 17 out of the 20 states. They committed >60% of aggravated assaults in 12 out of the 20 states.
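The violent crime counts in this summary can be re-derived from the table. The sketch below hard-codes the rounded "Violent" column; the helper name `states_above` is mine. Since the table is rounded to whole percentages, a value sitting exactly on a threshold after rounding (e.g., a robbery figure of 60%) may be counted differently here than in the underlying spreadsheet, so I only reproduce the violent crime bullet, where no such ambiguity changes the result.

```python
# Percent of violent crime offenders recorded as black, by state, copied from
# the "Violent" column of the table above (rounded values as published).
violent_pct = {
    "Mississippi": 70, "Louisiana": 64, "Georgia": 70, "Maryland": 70,
    "South Carolina": 62, "Alabama": 69, "Delaware": 67, "North Carolina": 65,
    "Virginia": 57, "Tennessee": 60, "Florida": 36, "Arkansas": 48,
    "New York": 56, "Illinois": 70, "Michigan": 58, "New Jersey": 57,
    "Ohio": 64, "Texas": 46, "Missouri": 49, "Pennsylvania": 74,
}


def states_above(values: dict[str, int], threshold: int) -> int:
    """Count states whose value is strictly greater than the threshold."""
    return sum(1 for value in values.values() if value > threshold)


for threshold in (45, 50, 60):
    count = states_above(violent_pct, threshold)
    print(f">{threshold}%: {count} of {len(violent_pct)} states")
```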

Basic charts

In this subsection, I will present some basic charts illustrating racial disparities in crimes across states.

The previous subsection had a table with the percentage of violent crime committed by blacks compared to the percentage of the population that is black by state. The following chart illustrates the relationship between these variables more intuitively (the yellow line indicates what we would expect if the percentage of violent crime offenders matched the percentage of the state’s population that is black).

The next chart shows similar data, except the y-axis is the percentage of homicide deaths (instead of violent crime offenders) that are black according to the CDC.

As you can see, as the percentage of the state that is black increases, the percentage of homicide victims that are black increases much more rapidly. In fact, blacks begin to constitute about 50% of homicide victims once they constitute about 12 to 13 percent of a state’s population. Furthermore, in all states where blacks are at least 13 percent of the population, blacks also constitute at least 50% of homicide victims.

So far, we’ve looked at the percentage of crimes committed by blacks. Now let us analyze racial disparities in crime rates across states. When analyzing crime rates by state, I will rely mainly on homicide death data published by the CDC instead of the FBI data, since the CDC has more complete data as explained earlier.

The following histogram shows the counts of state-level homicide death rates by race between 2015 and 2019 (I picked the most recent 5 years in the CDC’s database to avoid problems with yearly fluctuations; the chart was created before the 2020 data was available).

As you can see, the homicide death rate for blacks is much more variable and is much higher than the homicide death rates for both whites and Hispanics. In fact, there is no overlap in homicide death rates between blacks and whites. The lowest black homicide death rate (8.3 per 100,000 in Rhode Island) is greater than the highest white homicide death rate (5.9 per 100,000 in New Mexico) for any state.

The next histogram illustrates the state-level black/white homicide death rate ratios.

The smallest ratio is 3.8 (New Mexico) and the largest is 19.8 (Illinois). In the typical state, the black homicide death rate is about 7 to 8 times the rate for whites, although there is high variance with many states having ratios greater than 10.

The following chart shows the correlation between the homicide death rate of a state and the percentage of the state’s population that is black.

As you can see, there’s a very strong relationship between the percentage of a state’s population that is black and the state’s homicide rate. The states with the highest concentration of black people have homicide death rates around 10 per 100,000, which is about 2 to 3 times greater than the homicide death rates of states with the lowest concentration of black people. The R^2 is 0.551, which corresponds to a correlation coefficient of r = 0.74, which is rather large in social sciences.
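For a regression with a single predictor, the correlation coefficient is the square root of R^2, with the sign taken from the slope of the fitted line. A small sketch of that conversion (the function name and the `slope_sign` parameter are my own):

```python
import math


def r_from_r_squared(r_squared: float, slope_sign: int = 1) -> float:
    """Correlation coefficient for a one-predictor linear fit.

    R^2 alone loses the sign of the relationship, so the sign of the
    fitted slope has to be supplied separately.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must lie in [0, 1]")
    return math.copysign(math.sqrt(r_squared), slope_sign)


# R^2 reported for the state-level chart above; the slope is positive.
print(round(r_from_r_squared(0.551), 2))
```

This identity only holds for a single predictor. For the multi-variable models later in the post, the square root of R^2 is the coefficient of multiple correlation (R), which is always non-negative.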

Correlations

Now, let us consider correlations between state-level crime rates and various predictors of crime, such as income, poverty rate, education, etc. This provides a gauge for the strength of the correlation between race and crime.

When comparing the percent of a state that is black to other predictors of crime, percent black is one of the best predictors of a state’s crime rate. The following table shows correlations between different measures of crime and different predictors:

The demographic and socioeconomic data was mostly pulled from Census websites mentioned above. You can see the sources and calculations in my spreadsheet here.

The table above shows that, relative to other variables commonly taken to be strong predictors of crime such as poverty or education, percent black has a very high association with crime. This is particularly true for homicide rates, as the correlation between homicide rate and percent black is greater than the correlation between homicide rate and any other predictor except for single motherhood rate. The one surprising finding is the negative correlation between the percent black and rape rate of a state.

Note: it cannot be assumed that these correlations are entirely or even primarily due to a causal effect of the predictors in question. For example, there are likely traits that confound the relationship between poverty and crime, as the very traits that lead to poverty (e.g., low intelligence, impulsivity, etc.) likely also contribute to crime. Moreover, there is likely to be reverse causation as well. For example, as crime increases in a region, this is likely to increase poverty and single motherhood rate in that region as well (e.g., there are fewer non-incarcerated and desirable men to marry).

Now let us consider the association between race-specific homicide death rates and race-specific measures of socioeconomic status. For this table, I must rely only on the death data by the CDC, as the FBI does not report race-specific crime rates anywhere and it would be a pain to calculate manually.

The findings here are mostly in line with what one would expect regarding whites and blacks. For both racial groups, homicide death rates are associated with poverty, single motherhood, and education in the expected directions. However, the patterns are somewhat strange for Hispanics. Among Hispanics, homicide has very little association with poverty, single motherhood, or education.

Another interesting finding is that the variables that most strongly correlate with homicide death rates vary widely between groups. For whites, the variables with the highest correlations with homicide death rates are poverty, education, income, and gun ownership (which all have absolute correlations in the 0.67 to 0.74 range). However, for blacks, the correlations for single motherhood rate and bachelor’s degree are much higher than any of the other variables. For Hispanics, none of the variables have correlations greater than 0.48.

Because of the high poverty-crime correlation and poverty-race correlation, it may be worth analyzing whether the race-crime correlation is explained by poverty. The following chart shows the relationship between race-specific poverty and race-specific homicide rates across states.

As you can see, there is a correlation between poverty and homicide rates within racial groups (particularly for whites and blacks; you can see a positive slope for their lines). However, poverty still leaves much of the racial disparities in homicide unaccounted for. That is, at each level of poverty, blacks have higher homicide rates than whites and Hispanics with similar poverty rates. Furthermore, blacks in states with relatively low poverty rates still have homicide rates much higher than whites and Hispanics in states with higher poverty rates.

Regressions

The previous subsection merely reported bivariate correlations between various variables and crime at the state level. Now let us consider regression analyses in order to simultaneously model the association of multiple independent variables with crime. This allows us to see how well multiple variables can predict crime together. And it allows us to roughly estimate the association of each independent variable with crime while taking into account the association of each other independent variable.

However, one large limitation of this approach is multicollinearity. This greatly decreases the reliability of the estimates of the effect of each independent variable. The following paragraph from the Wikipedia page on multicollinearity describes some of the problems quite well:

In statistics, multicollinearity (also collinearity) is a phenomenon in which one predictor variable in a multiple regression model can be linearly predicted from the others with a substantial degree of accuracy. In this situation, the coefficient estimates of the multiple regression may change erratically in response to small changes in the model or the data. Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data set; it only affects calculations regarding individual predictors. That is, a multivariable regression model with collinear predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with respect to others.

Multicollinearity will be an issue for many of the analyses here because many of the independent variables are highly correlated with one another. For example, the different variables to measure race/ethnicity (e.g. percent white vs percent black) are necessarily highly correlated because the percent of a state of each race/ethnicity must sum to 100%. Various socioeconomic variables such as poverty and single motherhood will also be highly correlated. Thus, one should not place too much stock in the coefficients of the independent variables. More attention should be placed on the overall explanatory power of the models considered.
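A common diagnostic for this problem is the variance inflation factor (VIF). In the special case of a model with exactly two predictors, each predictor's VIF is 1 / (1 - r^2), where r is the correlation between the two. The sketch below illustrates why shares that must sum to 100% produce large VIFs. The data is entirely synthetic (randomly generated, not real state data), and the variable names are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(xs, ys):
    """VIF of either predictor in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


random.seed(0)
# SYNTHETIC shares (not real data): share_a + share_b + residual = 100.
share_a = [random.uniform(1, 40) for _ in range(50)]
residual = [random.uniform(0, 5) for _ in range(50)]
share_b = [100 - a - o for a, o in zip(share_a, residual)]
# A predictor generated independently of share_a, for comparison.
unrelated = [random.uniform(0, 100) for _ in range(50)]

vif_compositional = vif_two_predictors(share_a, share_b)
vif_unrelated = vif_two_predictors(share_a, unrelated)
print(f"VIF for two shares that nearly sum to 100: {vif_compositional:.1f}")
print(f"VIF for two independently generated predictors: {vif_unrelated:.2f}")
```

With more than two predictors, the VIF for a predictor uses the R^2 from regressing it on all the other predictors, which needs a full least-squares fit rather than a single correlation.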

Now, in order to perform the regression analyses, I decided to use this free online tool instead of learning to use more rigorous statistical software (firstly, because this isn’t a ton of data to analyze, and secondly because I don’t feel like learning new software).

First, I modeled the effects of a state’s racial/ethnic demographics on the homicide rate. The independent variables were percent black, percent white, percent Hispanic, and percent Asian. The results of the analysis are reported as follows:

There are two iterations because the first iteration included each independent variable in the model, whereas the second iteration excluded percent black because it did not have a statistically significant association with the dependent variable (p-value was well over 0.05). This means that adding percent black to the model doesn’t provide much incremental predictive validity once percent white, percent Hispanic, and percent Asian are already included in the model.

  • Note: this does not mean that percent black has no effect on a state’s homicide rate. Instead, this is likely the result of multicollinearity as mentioned earlier (note the really high VIF values). For example, if you remove “Asian” from the model, the “black” variable has a very high effect in the new model whereas “white” and “Hispanic” both lose significance. These highly erratic changes of significance despite small changes in the model are signs of multicollinearity. Furthermore, the fact that the standardized coefficient for white is greater than 1 is also indicative of multicollinearity.

Anyway, looking at the final iteration, the R^2 of the model is 0.74 (not shown above) and the adjusted R^2 is 0.72. The coefficient of multiple correlation (R) is 0.86. These values indicate that a state’s racial/ethnic demographics are a very good predictor of the state’s homicide rate. The “Coeff” column tells you the estimated association of each independent variable with crime, holding fixed the other independent variables. For example, the -29.6 in column 1 indicates that a 100 percentage point increase in a state’s white population was associated with about 29.6 fewer homicides per 100,000 population. In other words, a 10 percentage point increase in the percent of a state’s population that is white is associated with about 2.96 fewer homicides per 100,000 population. But again, the coefficients for each individual variable should be treated with caution given multicollinearity.
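To make the arithmetic above concrete, here is a rough sketch of the standard adjusted R^2 formula and of the coefficient rescaling. The number of observations is not stated in this post, so `N_STATES = 50` is an assumption on my part (including or excluding D.C. would change it); the three predictors are the ones left in the final iteration (percent white, percent Hispanic, percent Asian).

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjusted R^2: penalizes R^2 for the number of predictors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def predicted_change(coefficient: float, change_in_share: float) -> float:
    """Change in the outcome associated with a change in a predictor.

    The predictor is expressed as a proportion, so 0.10 means 10 percentage points.
    """
    return coefficient * change_in_share


N_STATES = 50  # ASSUMPTION: the post does not state the number of observations.

print(round(adjusted_r_squared(0.74, N_STATES, 3), 2))
print(round(predicted_change(-29.6, 0.10), 2))
```

Under that assumption, the formula reproduces the adjusted R^2 reported above, which suggests the tool uses the standard adjustment, though I have not verified how the tool computes it.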

Next, let us consider the association of socioeconomic variables with a state’s homicide rates. In this model, I included the state’s poverty rate, median household income, percent of 25- to 35-year-olds with high school degrees, percent of 25- to 35-year-olds with bachelor’s degrees, and inequality (measured by the Gini coefficient).

As you can see, in the first iteration, the only statistically significant variable was poverty, which had a p-value of 0.0152. All other variables had p-values above 0.05. After iteratively re-running the analysis after removing variables with the highest p-values, high school degree eventually became statistically significant, leaving us with a model with only two variables:

This implies that once one knows the poverty rate and rate of high school degree attainment for a state, learning about household income, bachelor’s degree attainment, or Gini coefficient provides little incremental validity for predicting homicide rates (again, due to multicollinearity, we cannot assume that poverty and high school degree attainment have a much stronger causal impact on homicides than the other variables).
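The iterative procedure described above is usually called backward elimination. A minimal sketch of the control flow, assuming one variable is dropped per iteration and a 0.05 significance threshold (the online tool may differ in its details). The fitting step is passed in as a function; `toy_fit` below is a hypothetical stand-in that returns made-up p-values for placeholder variables `x1` to `x4`, not results from this post.

```python
from typing import Callable, Dict, List


def backward_eliminate(
    predictors: List[str],
    fit_p_values: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,
) -> List[str]:
    """Repeatedly refit and drop the predictor with the highest p-value
    until every remaining predictor has a p-value at or below alpha."""
    remaining = list(predictors)
    while remaining:
        p_values = fit_p_values(remaining)
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:
            break
        remaining.remove(worst)
    return remaining


def toy_fit(current: List[str]) -> Dict[str, float]:
    """HYPOTHETICAL p-values for illustration only (not from any real model).

    Note that x2 only becomes significant once x3 and x4 have been dropped.
    """
    table = {
        ("x1", "x2", "x3", "x4"): {"x1": 0.02, "x2": 0.30, "x3": 0.60, "x4": 0.45},
        ("x1", "x2", "x4"): {"x1": 0.01, "x2": 0.12, "x4": 0.40},
        ("x1", "x2"): {"x1": 0.004, "x2": 0.03},
    }
    return table[tuple(current)]


print(backward_eliminate(["x1", "x2", "x3", "x4"], toy_fit))
```

In practice `fit_p_values` would wrap whatever regression routine is in use. As noted above, which variables survive this procedure can be unstable when the predictors are highly correlated.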

In the final iteration, the R^2 was 0.56 and the adjusted R^2 was 0.54. The coefficient of multiple correlation (R) was 0.75. This implies that the socioeconomic variables considered here are strong predictors for the homicide rate of a state, though they are not quite as strong as racial/ethnic demographics considered earlier. When comparing the standardized coefficients for poverty and high school degree attainment, both variables seem to have similarly sized independent associations with homicides.

Just for fun, I also decided to see if gun ownership provided incremental validity over the previous socioeconomic variables.

Adding gun ownership did not increase the predictive validity of the model. In fact, the adjusted R^2 of the first iteration here is lower than the adjusted R^2 of the first iteration of the model considered earlier without gun ownership. In the final iteration, the only remaining statistically significant predictors were poverty and high school degree attainment, which had the same values that were reported previously.

I also decided to check if single motherhood rate provided incremental validity over the previous socioeconomic variables.

Looking at the values here, it appears that single motherhood does provide incremental validity to predict the homicide rates of a state, over that of the previous socioeconomic variables alone. In fact, single motherhood remained a statistically significant variable in the final iteration after removing the non-significant variables.

Note that after adding single motherhood into the model, bachelor’s degree attainment and Gini coefficient were significant in the final model whereas poverty and high school degree attainment were no longer significant. This change likely doesn’t reflect anything too meaningful (e.g., it’s probably not that the causal effects of high school degree attainment are confounded by or mediated through single motherhood, whereas bachelor’s degree attainment is not). These are probably just noisy changes due to multicollinearity. However, the fact that the explanatory power of the model is greater after adding single motherhood shows that the association between single motherhood and homicides is not entirely shared with these other variables in the model.

Anyway, in the final iteration, the R^2 was 0.70 and the adjusted R^2 was 0.69. The coefficient of multiple correlation (R) equals 0.84. So a state’s socioeconomics and single motherhood rate predict a state’s homicide rate about as well as the state’s racial/ethnic demographics (although it could be that a state’s single motherhood rate is basically just a proxy for the state’s racial/ethnic demographics).

Finally, I combined all of the variables that have been considered thus far and threw them into one giant model to see how much variance could be explained by both racial/ethnic demographic data and socioeconomic data. The final iteration of this analysis was as follows:

The R^2 was 0.82 and the adjusted R^2 was 0.80. The coefficient of multiple correlation (R) equals 0.91. This is just slightly better than the values reported when just considering racial/ethnic demographics. Thus, if one already knows the racial demographics of a state, adding socioeconomic variables such as poverty, income, education, etc. provides only slight incremental validity for predicting homicide rates.

Again, due to multicollinearity, I wouldn’t pay much attention to which specific variables survived to the final iteration. For example, if one removes the Gini coefficient from the model, poverty no longer survives and instead bachelor’s degree survives. Which variable survives to the final iteration is unlikely to be meaningful. Moreover, the specific magnitudes of the coefficients are also unlikely to be meaningful due to multicollinearity. The more meaningful finding is the overall explanatory power of the different models.

Counties


In this section, I analyze racial disparities in homicide deaths by county. For this section, I only use CDC data because the FBI does not publish data at the county level (as far as I can see). Thus, I rely on the CDC’s Underlying Cause of Death database, which reports data on death in 3,147 counties throughout the United States. The homicide death rates here are based on the aggregated number of homicides from 1999 to 2020 and number of person-years over this time period. For a much more in-depth analysis of racial disparities in crime by county, I highly recommend the post by Random Critical Analysis [archived] on the topic in 2015.

Note: I decided to include all 22 years of data (1999 to 2020) instead of the past 5 years in order to have enough data to include more counties, as the CDC does not report homicide rates for counties with fewer than 20 homicides over the time period (the CDC reports “unreliable” for these counties instead of a numerical homicide rate).
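As a rough illustration of how an aggregated rate of this kind is computed, the sketch below divides deaths by person-years and scales to 100,000, returning no rate when the death count falls under the 20-death threshold described above. The function name and the example county figures are hypothetical.

```python
from typing import Optional

# Threshold described in the note above: rates based on fewer than 20 deaths
# are reported as "unreliable" rather than as a number.
MIN_DEATHS_FOR_RELIABLE_RATE = 20


def homicide_rate_per_100k(deaths: int, person_years: float) -> Optional[float]:
    """Crude homicide death rate per 100,000 person-years.

    Returns None when the count is too small for a reliable rate.
    """
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    if deaths < MIN_DEATHS_FOR_RELIABLE_RATE:
        return None
    return deaths / person_years * 100_000


# HYPOTHETICAL county: 55 homicide deaths over 1,100,000 person-years.
print(homicide_rate_per_100k(55, 1_100_000))
# HYPOTHETICAL county with too few deaths for a reliable rate.
print(homicide_rate_per_100k(12, 400_000))
```

Pooling more years raises both the death count and the person-years, which is why a longer window lets more small counties clear the threshold.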

I focused specifically on analyzing deaths due to homicide in these counties disaggregated by race/ethnicity. First, I will report data on homicide deaths for victims of all ages. Next, I will focus on homicide deaths specifically for young victims (between the ages of 15 and 24).

Homicide deaths of all-aged victims

First, let us consider racial disparity in homicide deaths for victims of all ages. The data and charts for this subsection are found in my spreadsheet here. For this subsection, I limited the data to only include counties with at least 50 homicides during this time period (which resulted in just 900 counties) in order to avoid having to process too many counties with little to no impact on the aggregate trends.

The following chart shows the county homicide death rate by race/ethnicity, with the counties sorted by poverty rate.

As you can see, regardless of the poverty rate of the county, there are stark racial/ethnic disparities in homicide rate. In particular, the black homicide rate is typically several times higher than that of Hispanics and whites. In fact, there are very few counties where the Hispanic or white homicide rate approaches 20 per 100,000 people, even in relatively poor counties, yet the black homicide rate regularly exceeds this threshold. There are even some counties where the black homicide rate exceeds 40 per 100,000 people, which is several times the highest homicide rate for whites or Hispanics. Also, the Hispanic homicide rate seems to hover just over the white homicide rate, but the differences are not very stark compared to the black rates.

The following histogram shows county-level homicide rates by race/ethnicity across these 900 counties.

As you can see, the median white homicide rate is somewhere between 2 and 4 homicides per 100,000 people, with the upper range falling off in the low teens. For Hispanics, the distribution is slightly shifted to the right, with the median somewhere between 6 and 10 homicides per 100,000 people. However, for blacks, the median seems to be somewhere between 18 and 24 homicides per 100,000 people. Moreover, the range of homicide death rates is much greater for blacks than for either whites or Hispanics. For example, it is more common to find counties where the black homicide rate is above 40 per 100,000 than it is to find counties where the white homicide rate is above 10 per 100,000 people. Furthermore, there are virtually no counties where the black homicide death rate is between 2 and 4 per 100,000 people, which is the typical range for whites.

The following histogram shows the county-level black/white homicide death rate ratio across these counties.

The typical county has a black/white homicide rate ratio between 4 and 6. In the 900 counties analyzed, there were no counties where the black homicide death rate was lower than the white homicide death rate (i.e. no county where the ratio was below 1). There were just a few where the ratio was below 2. There are even some counties where the black homicide death rate is over 20 times the white rate.

This chart shows the relationship between the percent of a county that is black vs the total homicide rate of that county.

As you can see, there is an incredibly strong relationship between the percent of a county that is black and the homicide rate of the county. An R^2 value of 0.591 corresponds to a correlation coefficient of about r = 0.77. In counties with near zero percent black people, the typical homicide rate is about 3 to 4 per 100,000 people. However, the typical homicide rate in counties that are 20 percent black is about twice that. Among counties that are 50% black, virtually none has a homicide rate below 10 per 100,000 people.

The following chart shows the relationship between the percent of a county that is black vs the percent of homicides with a black victim. The yellow line indicates what we would expect if the percentage of homicides with a black victim matched the percentage of the county’s population that is black.

As you can see, as the percentage of a county that is black increases, the percentage of homicide victims that are black increases much more rapidly. In fact, blacks begin to constitute about 50% of homicide victims once they constitute about 18 to 19 percent of a county’s population. In counties where blacks are about 50% of the population, blacks constitute about 80% of the homicide deaths, with the range somewhere between 70% and 90%. To put this into perspective, males constitute 50% of the U.S. population, yet they constituted only about 78% of the murder victims in 2019 (FBI).
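A population share and a victim share together imply a rate ratio between the group and everyone else combined. If a group is a fraction p of the population and a fraction v of the victims, its rate relative to the rest of the population is (v / p) / ((1 - v) / (1 - p)). A short sketch of that identity, applied to the round figures quoted above (the function name is mine):

```python
def implied_rate_ratio(population_share: float, victim_share: float) -> float:
    """Victimization rate of a group relative to everyone else combined,
    implied by the group's share of the population and share of victims."""
    if not 0.0 < population_share < 1.0:
        raise ValueError("population_share must be strictly between 0 and 1")
    if not 0.0 <= victim_share < 1.0:
        raise ValueError("victim_share must be in [0, 1)")
    group_relative = victim_share / population_share
    rest_relative = (1.0 - victim_share) / (1.0 - population_share)
    return group_relative / rest_relative


# 50% of the population and about 80% of homicide deaths (counties above).
print(round(implied_rate_ratio(0.50, 0.80), 2))
# 50% of the population and about 78% of murder victims (males, 2019).
print(round(implied_rate_ratio(0.50, 0.78), 2))
```

Note that this compares a group to all other residents pooled together, so it is not the same quantity as the black/white ratio shown in the earlier histogram.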

Homicide death of young victims

Now let us consider the racial disparity in homicide deaths for young victims (aged 15 to 24). The data and charts for this subsection are found in my spreadsheet here. All the charts presented here are analogous to the charts presented in the previous subsection, with the only difference being that the data is restricted to young victims. Also, because this data restriction reduces the sample size, I considered counties with at least 20 homicides (instead of 50, as above) during the time period (1999 to 2020). The results here mostly mirror those of the previous subsection, but the magnitudes of the differences are much greater.
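The rate and ratio calculations behind these charts can be sketched as follows. This is only an illustration under stated assumptions: the field names, the helper functions, and every number in the example are hypothetical and are not taken from CDC WONDER or from the spreadsheet.

```python
def rate_per_100k(deaths: int, person_years: int) -> float:
    """Crude death rate per 100,000 person-years."""
    return deaths * 100_000 / person_years

def black_white_ratios(counties, min_homicides=20):
    """Rate ratio per county, skipping counties below the homicide threshold."""
    ratios = {}
    for c in counties:
        if c["total_homicides"] < min_homicides or c["white_deaths"] == 0:
            continue  # too few deaths for a stable rate, or ratio undefined
        black_rate = rate_per_100k(c["black_deaths"], c["black_person_years"])
        white_rate = rate_per_100k(c["white_deaths"], c["white_person_years"])
        ratios[c["name"]] = black_rate / white_rate
    return ratios

# Made-up counties: person-years are the population summed over the period.
example_counties = [
    {"name": "County A", "total_homicides": 120,
     "black_deaths": 80, "black_person_years": 200_000,
     "white_deaths": 30, "white_person_years": 600_000},
    {"name": "County B", "total_homicides": 12,
     "black_deaths": 6, "black_person_years": 50_000,
     "white_deaths": 5, "white_person_years": 400_000},
]
print(black_white_ratios(example_counties))  # only County A meets the threshold
```

Raising `min_homicides` trades coverage for stability: fewer counties qualify, but each remaining rate rests on more deaths.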

Let us start by considering the following chart which shows the county-level youth homicide death rate by race, sorted by county poverty rate.

Again, blacks have far greater homicide death rates than whites and Hispanics regardless of county-level poverty. In fact, the black youth homicide death rate in counties with the lowest poverty rates is still several times greater than the white youth homicide death rate in counties with the highest poverty rates. The Hispanic vs white homicide gap for youth is greater than the gap among victims of all ages. It seems that there is a constant gap of about 10 homicides per 100,000 people between white and Hispanic youths across counties.

The following histogram shows counts of county-level youth homicide rates by race.

This chart shows the vastly different realities that young blacks and whites live in. For whites, the youth homicide death rate in virtually all counties is below 10 per 100,000 people. For blacks, there is no county where the youth homicide death rate is that low. Instead, the most common homicide death rate seems to be somewhere around 40 per 100,000. Again, there is also huge variation in black homicide death rates, with some counties having homicide rates below 20 per 100,000 (which is still substantially higher than the typical rate for whites) and some counties having homicide rates exceeding 100 per 100,000. This is a completely different world than what virtually all white youths experience.

This chart also shows a sizable difference in homicide rates between Hispanics and whites. For Hispanics, the homicide death rate is higher, with the median somewhere between 15 and 20 per 100,000 people.

The next histogram shows counts of the county-level black/white youth homicide rate ratio throughout the country.

As you can see, the black youth homicide rate is at least twice as high as the white youth homicide rate in every county. And it is rare to find a county where the homicide rate ratio is below 4. The median homicide rate ratio seems to be around 10, though some counties see homicide rate ratios of over 30. Thus, there are some counties where black youth are 30 times more likely to be murdered than are white youth.

Now, let us consider the relationship between the percent of a county’s youth that is black vs the county’s youth homicide rate.

As you can see, there is an incredibly strong relationship between the percent of a county’s youth that is black and the homicide rate of the county’s youth. An R^2 value of 0.517 corresponds to a correlation coefficient of about r = 0.72. In counties with near zero percent black people, the typical youth homicide rate is near zero. However, the typical youth homicide rate in counties where the youth is 20 percent black is over 10 homicides per 100,000. In counties that have around 50% black youth, there are no counties with a homicide rate below 10 per 100,000 people, with the typical homicide rate being around 20 per 100,000.

The following chart shows the relationship between the percent of a county’s youth that is black vs the percent of youth homicides with a black victim.

As you can see, as the percentage of a county’s youth that is black increases, the percentage of youth homicide victims that are black increases much more rapidly. In fact, blacks begin to constitute about 50% of youth homicide victims once they constitute about 10 percent of a county’s youth population. In counties where blacks are about 50% of the youth population, blacks constitute well over 85% of the homicide deaths. Again, this is greater than the percentage of homicide victims that are male in the country.

Large cities


In this section, I analyze racial disparities in crime in the major cities of the United States. This focuses on crime data aggregated over all large cities.

Violent crime in large cities

The CDE tool by the FBI provides estimates of the number of arrestees and victims of violent crimes in U.S. cities by size. I checked the estimates for cities with a population of 250,000 or greater in 2021. The findings are as follows:

So blacks are about 48.5% and whites 48.3% of victims of violent crimes. However, blacks are about 58.6% and whites 39.0% of arrestees for violent crimes. I performed similar calculations for the specific categories of violent crime: murder and non-negligent manslaughter, rape, aggravated assault, and robbery. Here are the results for arrestees:

Violent crime arrestees by race in cities 250,000 population or greater in 2021

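| Offense | Black arrestees | White arrestees | Total arrestees | Black % | White % |
| --- | --- | --- | --- | --- | --- |
| Rape | 1,278 | 1,255 | 2,587 | 49.4% | 48.5% |
| Murder | 2,779 | 1,060 | 3,870 | 71.8% | 27.4% |
| Robbery | 13,502 | 6,475 | 20,263 | 66.6% | 32.0% |
| Aggravated Assault | 46,570 | 33,775 | 81,485 | 57.2% | 41.4% |
| Violent | 64,466 | 42,888 | 110,012 | 58.6% | 39.0% |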
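The percentage columns in the arrestee table are simply each group's count divided by the total for that offense. A short sketch that recomputes them from the counts shown above (the variable and function names are illustrative only):

```python
# (black arrestees, white arrestees, total arrestees) from the table above
arrestees = {
    "Rape": (1278, 1255, 2587),
    "Murder": (2779, 1060, 3870),
    "Robbery": (13502, 6475, 20263),
    "Aggravated Assault": (46570, 33775, 81485),
    "Violent": (64466, 42888, 110012),
}

def share(count: int, total: int) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)

for offense, (black, white, total) in arrestees.items():
    print(f"{offense}: black {share(black, total)}%, white {share(white, total)}%")
```

The two shares do not sum to 100% because the totals also include arrestees of other races.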

Here is the same data but for victims instead of arrestees:

Violent crime victims by race in cities 250,000 population or greater in 2021

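| Offense | Black victims | White victims | Total victims | Black % | White % |
| --- | --- | --- | --- | --- | --- |
| Rape | 8,618 | 15,445 | 25,029 | 34.4% | 61.7% |
| Murder | 6,009 | 2,586 | 8,748 | 68.7% | 29.6% |
| Aggravated Assault | 176,919 | 146,389 | 332,922 | 53.1% | 44.0% |
| Violent | 235,927 | 234,765 | 486,552 | 48.5% | 48.3% |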

Thus, blacks are involved in an extremely large portion of violent crimes in large U.S. cities, particularly for murder and robbery, where they constitute about 70% of arrestees. In fact, blacks constitute about half or more of the arrestees for each category of violent crime in these large cities. Keep in mind that the black/white disparity seen here is actually understated because the “white” figure includes Hispanics. If the FBI provided data to compare non-Hispanic blacks and non-Hispanic whites, the disparity would be even greater.

The CDC WONDER tool also reports data on homicides by urbanization. This is not the same as data on homicides in large cities, but it is the closest approximation available. The largest urbanization category is “Large Central Metro”, which the CDC describes as “counties in metropolitan statistical areas (MSAs) of one million or more population that have been identified by NCHS classification rules as central because they contain all or part of a principal city of the area”. The following table lists the number and rates of deaths due to homicide in large central metros in 2020.

Here are the query criteria used to generate this table:

The findings show racial/ethnic disparities in homicides that are in line with all data that has been presented thus far. Non-Hispanic blacks have by far the highest homicide death rates, followed by Hispanics and American Indians, and then whites and Asians. In fact, blacks constituted about 65% (6,832 out of 10,483) of homicide victims in 2020, which is in line with the data published by the FBI for 2021.

Predictors of crime

Now let us compare race to other predictors of crime in the 100 most populous cities in the United States. To start, I calculated the correlation between different measures of violent crime and various demographic and socioeconomic variables. The demographic and socioeconomic data were pulled from the Census QuickFacts website. Crime data were pulled from 2019 FBI data as reported in the Wikipedia page on “List of United States cities by crime rate” [archived]. The calculations for the correlations are present in my spreadsheet here. Here are the main findings:

The findings here are mostly what one would expect. Lower crime rates are predicted by higher concentrations of Asians, higher household incomes, and higher levels of education. Higher crime rates are predicted by higher concentrations of black people and higher poverty rates. In fact, across every measure of crime, the two consistently leading predictors of crime are percent black and poverty rate. The one exception concerns rape, where percent Asian and percent American Indian are top predictors as well. For each crime, percent black and poverty rate are similarly strong predictors, except for murder where percent black seems to be a considerably stronger predictor (r = 0.76 vs r = 0.61).
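For reference, each correlation of this kind is a Pearson coefficient computed across cities between one predictor and one crime measure. Below is a minimal standard-library sketch. The function name and the sample values are hypothetical and are not the figures from the spreadsheet.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up city values: poverty rate (%) and murders per 100,000 residents.
poverty_rate = [10, 15, 20, 25, 30]
murder_rate = [4, 7, 9, 15, 20]
print(round(pearson_r(poverty_rate, murder_rate), 2))
```

Because r is computed one predictor at a time, two predictors that are themselves correlated (such as percent black and poverty rate) can both show high values, which is why the regression analyses are needed to assess their combined explanatory power.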

Just as I did for states, I will now consider regression analyses to quantify the combined explanatory power of these variables. The dependent variable in each of these analyses was the city murder rate.

First, I ran analyses to calculate the association between a city’s racial/ethnic demographics and the city’s murder rate. The results are as follows.

As you can see, none of the independent variables achieved a p-value below 0.05, although black and American Indian were close. Both of these variables become statistically significant after removing Asian, White, and Hispanic from the model.

In the final iteration of the analysis, a linear model with just percent black and percent American Indian explains 59.5% of the variation of a city’s murder rate. The coefficient of multiple correlation (R) was 0.78.

Next, I considered a model with socioeconomic variables like poverty, income, and education. I also included population density since it was available on the Census website.

Only poverty was statistically significant in the initial iteration. After removing statistically insignificant variables, the final iteration was as follows:

In the 4th iteration, I removed all statistically insignificant variables except for H.S. degree, which was nearly significant with a p-value of 0.052.

In the final iteration with just poverty, the adjusted R^2 was 0.35 (I’m not sure why it isn’t shown above the table). The coefficient of multiple correlation (R) was 0.60. Consistent with the state-level data, the socioeconomic variables explain less of the variation in city-level murder rates than do racial/ethnic demographic variables. Even if we use the 4th iteration of the model which includes both poverty and the nearly significant H.S. degree, the R^2 is still lower than the R^2 of percent black and American Indian.
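The relationship between adjusted R^2, unadjusted R^2, and the multiple correlation R can be sketched as below. The formula is the standard adjustment for the number of predictors. The sample size of 100 is an assumption (it presumes every one of the 100 cities had complete data), and the function names are hypothetical.

```python
import math

def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 from unadjusted R^2, sample size, and predictor count."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def r_squared_from_adjusted(adj_r2: float, n_obs: int, n_predictors: int) -> float:
    """Invert the adjustment to recover the unadjusted R^2."""
    return 1 - (1 - adj_r2) * (n_obs - n_predictors - 1) / (n_obs - 1)

N_CITIES = 100  # assumption: no city dropped for missing data

# Poverty-only model: adjusted R^2 of 0.35 with one predictor.
r2_poverty = r_squared_from_adjusted(0.35, N_CITIES, 1)
print(round(math.sqrt(r2_poverty), 2))  # 0.6

# Percent black + percent American Indian: adjusted R^2 of 0.595, two predictors.
r2_race = r_squared_from_adjusted(0.595, N_CITIES, 2)
print(round(math.sqrt(r2_race), 2))  # 0.78
```

Under that assumption, the recovered values are consistent with the R figures of 0.60 and 0.78 reported for the two models.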

Finally, I ran an analysis including all of the racial/ethnic and socioeconomic variables considered above. In the final iteration, the only statistically significant independent variables were percent black, percent American Indian, and poverty rate.

Recall that the adjusted R^2 for the model including just black and American Indian was 0.595. After including poverty in the model, this rises to only 0.617. Thus, it appears that introducing socioeconomic variables such as poverty, income, or education provides very little incremental validity to predict a city’s murder rate once one already knows the racial demographics of the city.

Violent crime in top 30 cities


In this section, I consider racial disparities in violent crime in the top 30 cities in the United States. Instead of looking at data aggregated across cities (as in the previous subsection), I will consider the data from each city individually. To start, I will report the findings from an existing study that analyzed racial disparities in homicide victimization in the 30 largest U.S. cities. Next, I will perform my own analysis of violent crime more generally in these cities.

Crime rates by race

Schober et al. (2021) analyzed homicide victimization rates by race in the 30 largest cities in the United States. Homicide victimization rates were measured using data from the CDC’s National Vital Statistics System, which collects data on causes of death throughout the country. Data were analyzed across two time periods: 2008 to 2012 (T1) and 2013 to 2017 (T2). The researchers focused specifically on the death rates for non-Hispanic blacks and non-Hispanic whites. They also excluded cities with fewer than 20 deaths for either whites or blacks in either time period. This left 26 total cities to analyze (cities removed: El Paso, TX; Washington D.C.; San Jose, CA; Boston, MA; page 3).

The researchers found substantially higher homicide victimization rates among blacks in each of the 26 cities analyzed in both time periods:

All the 26 cities had higher Black homicide rates than White homicide rates at both time periods as well as statistically significant Black-to-White mortality rate ratios. At T1, the smallest disparity was seen in Houston (rate ratio=1.8, 95% CI=1.6, 2.1) and the biggest in San Francisco (rate ratio=20.1, 95% CI=14.4, 28.1). At T2, the smallest disparity was seen in Detroit (rate ratio=3.2, 95% CI=2.3, 4.4) and the biggest in Chicago (rate ratio=26.4, 95% CI=21.0, 33.2).
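For readers unfamiliar with rate ratios and their confidence intervals, here is a minimal sketch using a common large-sample approximation on the log scale. This is not necessarily the method used by Schober et al.; the function name and the counts are hypothetical.

```python
import math

def rate_ratio_ci(deaths_a, pop_a, deaths_b, pop_b, z=1.96):
    """Rate ratio (group A vs group B) with an approximate 95% interval.

    Treats death counts as Poisson and uses the usual log-scale standard
    error sqrt(1/deaths_a + 1/deaths_b).
    """
    if min(deaths_a, deaths_b) <= 0:
        raise ValueError("both groups need at least one death")
    ratio = (deaths_a / pop_a) / (deaths_b / pop_b)
    se_log = math.sqrt(1 / deaths_a + 1 / deaths_b)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper

# Made-up counts: 400 deaths in 1,000,000 person-years vs 50 in 2,500,000.
ratio, lower, upper = rate_ratio_ci(400, 1_000_000, 50, 2_500_000)
print(f"rate ratio = {ratio:.1f}, 95% CI = {lower:.1f}, {upper:.1f}")
```

The interval widens as either death count shrinks, which is one reason a minimum of 20 deaths per group is a sensible inclusion rule.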

The data for each city were presented in the following table:

Perhaps the most surprising findings here are the high rates of black homicides in cities not traditionally known for high rates of violent crime (e.g., Seattle, San Francisco, Portland, etc.). Even in these cities, the homicide rate for blacks is several times that of the typical homicide rate for whites. In fact, in some time periods, the black homicide rates in these cities are not far off from the black homicide rates in cities that are more notoriously associated with violence (e.g., Detroit, Chicago, Baltimore, etc.). For example, in the first time period, the black homicide rate was highest in San Francisco, higher than in cities like Detroit, Chicago, Baltimore, etc.

The following figure shows each city’s homicide rate vs black-white homicide rate ratio in time period T2. The 4 quadrants were constructed based on the national homicide rate and national homicide rate ratio.

According to this graph, only two cities (Austin and San Diego) experienced both homicide rates and racial homicide rate ratios below the national averages.

Percent of crimes committed by blacks

I decided to perform an analysis similar to the one above, except that I focus on violent crime generally in addition to homicide specifically. Also, instead of calculating and reporting rates of crime, I decided to report the percentage of violent crimes committed by blacks. I picked the top 30 cities based on 2020 Census data, which I pulled from the Wikipedia page on List of United States cities by population [archived]. The largest cities listed here are identical to the largest cities reported in Schober et al. (2021). Demographic data was pulled from the Census website.

To calculate the percentage of violent crimes committed by blacks, I used data from the FBI’s Crime Data Explorer (CDE). Note, the glossary at this site defines violent crime as “composed of four offenses to include murder and nonnegligent manslaughter, rape, robbery, and aggravated assault.” The explorer allows users to filter data by state and by agency. For each of the top cities, I checked the offender data reported by that city’s police department for 2021. Unfortunately, 7 of the cities did not have data reported by their respective police departments (New York City, Los Angeles, Phoenix, San Jose, Jacksonville, San Francisco, Baltimore). For these cities, I collected data via other means (see bottom of table).

The following table shows the percentage of offenders of violent crime reported by the police department for each city. I bolded figures where blacks constitute over 50% of crimes (excluding unknown).

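| City | % black of general population | Violent crime offenders: % black | Violent crime offenders: % unknown race | Violent crime offenders: % black excluding unknown | Homicide offenders: % black | Homicide offenders: % unknown race | Homicide offenders: % black excluding unknown |
| --- | --- | --- | --- | --- | --- | --- | --- |
| New York City, NY* | 24% | | | 55% | | | 62% |
| Los Angeles, CA*** | 9% | | | | | | 37% |
| Chicago, IL | 29% | 62% | 28% | 86% | 60% | 29% | 85% |
| Houston, TX | 23% | 59% | 8% | 64% | 55% | 20% | 69% |
| Phoenix, AZ*** | 7% | | | | | | 26% |
| Philadelphia, PA | 41% | 75% | 7% | 81% | 58% | 29% | 82% |
| San Antonio, TX | 7% | 19% | 17% | 23% | 18% | 26% | 24% |
| San Diego, CA | 6% | 29% | 7% | 31% | 24% | 16% | 29% |
| Dallas, TX | 24% | 58% | 11% | 65% | 55% | 26% | 74% |
| San Jose, CA** | 3% | | | 13% | | | 18% |
| Austin, TX | 8% | 38% | 7% | 41% | 53% | 9% | 58% |
| Jacksonville, FL*** | 31% | | | | | | 74% |
| Fort Worth, TX | 19% | 53% | 10% | 59% | 46% | 13% | 53% |
| Columbus, OH | 29% | 70% | 11% | 79% | 77% | 13% | 89% |
| Indianapolis, IN | 29% | 67% | 7% | 72% | 65% | 13% | 75% |
| Charlotte, NC | 36% | 78% | 6% | 83% | 83% | 4% | 86% |
| San Francisco, CA*** | 5% | | | | | | 55% |
| Seattle, WA | 7% | 40% | 21% | 51% | 44% | 28% | 61% |
| Denver, CO | 9% | 33% | 14% | 38% | 37% | 8% | 40% |
| Washington DC | 45% | 68% | 29% | 96% | 50% | 49% | 98% |
| Nashville, TN | 27% | 60% | 13% | 69% | 46% | 43% | 81% |
| Oklahoma City, OK | 14% | 45% | 15% | 53% | 45% | 28% | 63% |
| El Paso, TX | 3% | 14% | 5% | 15% | 35% | 4% | 36% |
| Boston, MA | 24% | 54% | 22% | 69% | 61% | 29% | 86% |
| Portland, OR | 6% | 35% | 8% | 38% | 50% | 9% | 55% |
| Las Vegas, NV | 12% | 54% | 4% | 56% | 44% | 2% | 45% |
| Detroit, MI | 77% | 91% | 2% | 93% | 89% | 1% | 90% |
| Memphis, TN | 64% | 74% | 20% | 93% | 64% | 5% | 67% |
| Louisville, KY | 24% | 64% | 3% | 66% | 61% | 17% | 73% |
| Baltimore, MD*** | 62% | | | | | | 92% |
| Average | 24% | 54% | 12% | 57% | 53% | 18% | 63% |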

* New York City data was not present in the CDE database. The percentages were calculated from this source [archived] on crime in 2021. I calculated the values for violent crime by aggregating the number of blacks who were arrested for murder (page 1 of 16), rape (page 2), robbery (page 4), and felonious assault (page 5).

** San Jose data was not present in the CDE database. The percentages were calculated from this source [archived] on arrests in 2021. I calculated the values for violent crime by counting the number of blacks who committed homicide, forcible rape, robbery, and assault (page 3).

*** I was unable to find official data on violent crimes committed by race for 5 cities (Los Angeles, Phoenix, Jacksonville, San Francisco, Baltimore). The police departments for these cities did not report any data to the CDE or any other publicly available page that I could find. Instead, I was able to find data on the percentage of homicides by race from various articles. Some articles reported data on homicide victims whereas other articles reported data on homicide offenders. The difference should not matter much, since homicides tend to be intraracial (e.g., in 2019, the FBI reports that at least 2,574 out of 2,906, or 88.6%, homicides with a black victim involved a black offender). When both figures are reported, I reported values for offenders. The sources for these 5 cities are here: Los Angeles [archived], Phoenix [archived], Jacksonville [archived], San Francisco [archived], Baltimore [archived].

I also found some alternative sources with crime data by race in Los Angeles, San Francisco, Jacksonville, and Portland.
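The “% black excluding unknown” columns in the table above follow from the other two columns in each group: the black share is divided by the share of offenders whose race is known. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def pct_excluding_unknown(pct_group: float, pct_unknown: float) -> float:
    """Share of a group among offenders of known race, in percent."""
    if not 0 <= pct_unknown < 100:
        raise ValueError("unknown share must be at least 0 and below 100")
    return 100 * pct_group / (100 - pct_unknown)

# Chicago violent crime row: 62% black, 28% unknown race.
print(round(pct_excluding_unknown(62, 28)))  # 86
# Washington DC violent crime row: 68% black, 29% unknown race.
print(round(pct_excluding_unknown(68, 29)))  # 96
```

Because the inputs used here are already rounded to whole percentages, recomputed values can differ from the table by about a point for some cities.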

Summarizing the findings

The previous table shows that blacks committed over 50% of violent crime (among known race offenders) in 18 of the 25 cities (72%) for which we have data on violent crime. Furthermore, blacks were involved with over 50% of homicides (among offenders/victims with known race) in 22 of the 30 cities (73%). On average, blacks committed about 60% of violent crimes and homicides in these cities.

In fact, in each of the 17 cities with >10% black population (for which we have violent crime data), blacks commit at least 50% of the violent crime. And in each of the 14 cities (except for New York) with >20% black population (for which we have violent crime data), blacks commit at least 60% of the violent crime and homicides. Moreover, blacks commit >50% of the violent crime and/or homicides even in a few cities where they constitute 5 to 8% of the population: Austin, San Francisco, Seattle, and Portland.

In the cities where blacks commit <50% of the violent crimes and homicides (Los Angeles, Phoenix, San Antonio, San Diego, San Jose, Denver, El Paso), the following pattern seems to hold true:

  • These are all cities with relatively small black populations (3% to 9%).
  • For 5 of these 7 cities (Los Angeles, Phoenix, San Antonio, San Jose, El Paso), Hispanics constitute a large percentage [archived] (between 31% and 82%) of the population. More importantly, Hispanics commit either the majority of violent crimes or nearly half of the homicides in each of these cities. For San Antonio and El Paso, see CDE data showing that most violent crime involves Hispanics (although for El Paso, we have to rely on victim data, which shows that 79% of the victims of violent crime are Hispanic). For San Jose, see this source on arrests by ethnicity, which shows Hispanics constituting about 57% of arrests for violent crime (see page 3; violent crimes are murder, forcible rape, assault, and robbery). Data on homicides by ethnicity were reported in the articles cited above for Los Angeles and Phoenix, which show Hispanics as victims of nearly half of homicides.
  • For the 2 cities not addressed in the previous point (San Diego, Denver), there are also relatively large Hispanic populations (~30% in both cities [archived]). According to CDE data, Hispanics constitute 37% and 28% of violent crime offenders, respectively, in these two cities. This rises to about 37% and 41%, respectively, if we restrict the sample to violent crime offenders where the ethnicity is known.

Thus, it seems that, among the most populous U.S. cities, we can expect blacks to commit the majority of violent crime and homicides. When there are exceptions, this is because the cities have small black populations and large Hispanic populations that tend to be responsible for most or nearly half of the crimes.

Conclusion


The above findings show that racial demographics are highly associated with violent crime rates. In particular, blacks are consistently overrepresented in rates of both victimization and offending for violent crimes. In fact, at every level of granularity (e.g. state vs county vs city), we consistently find that blacks begin to commit about 50% of violent crimes and/or homicides in a region once they constitute around 15% of that region’s population, which is almost identical to the ratio found at the national level. Moreover, we find similar racial disparities in rates of violent crime in virtually every state, county, and city. The disparities are so great that it is fair to infer that blacks and whites (particularly young blacks and whites) live in different realities with respect to exposure to violence. These disparities are not adequately explained by traditional socioeconomic variables such as poverty, income, education, or inequality.

There are also differences in violent crime for groups other than blacks and whites, although not nearly as stark. For example, Hispanics experience higher rates of homicide victimization than non-Hispanic whites, particularly for younger individuals. Asians also have considerably lower rates of exposure to violence than whites.

2 comments on Regional analyses of racial disparities in violent crime in the United States

  1. When you do regression with compositional predictors, you should leave out White, as the reference group, and add all the others. This way, the betas for the other groups reflect the difference between substituting that group for 100 % White. You will get nonsensical values if you try to include all or nearly all of them, especially when your dataset is small. That’s why your state-level regression is nonsense.

  2. What’s up with Detroit having such a high white homicide rate? Even if you add hispanics and arabs to the white mix it still is pretty high. Las Vegas also has a high white rate which partially can be explained by it being a large tourist area with more transients.
