Section 4.7 Racial Gerrymandering and Statistics

Although the 15th Amendment (1870) established that U.S. citizens cannot be denied the right to vote “on account of race, color, or previous condition of servitude,” it was not effectively enforced until almost 100 years later, when the Voting Rights Act (VRA) of 1965 was passed during the Civil Rights movement. In the meantime, many Black citizens (particularly in the South) faced tremendous obstacles to voting, including literacy tests, poll taxes, intimidation, and violence. The VRA made it illegal for states to pass any laws that discriminate against racial or language minorities through vote denial or vote dilution. This includes anything that “diminishes the ability to elect their candidate(s) of choice,” a standard that applies to districting plans and serves as the legal basis for racial gerrymandering claims.

Motivating Questions: Does racially polarized voting exist in a certain area? How can statistics help us decide?

In Thornburg v. Gingles (1986), the Supreme Court further established what constitutes an illegal racial gerrymander by adding a set of criteria known as the Gingles test:

  • a racial minority group forms a numerical majority of voting-age population in a compact geographic area

  • the minority group is “politically cohesive” (members of the group vote similarly)

  • the racial majority group is also politically cohesive but supports a different political party (its preferred candidate usually defeats the minority group's preferred candidate)

The last two criteria are collectively known as racially polarized voting. If all three criteria are met, the law requires states to draw a majority-minority district in that geographic area, that is, a district in which the majority of the population belongs to a racial, ethnic, or language minority group. These required districts are often referred to as VRA districts.

To determine if the conditions are met, ask the following questions:

  1. Is it possible to draw a geographically compact district in which the racial or language minority forms a majority of the voting-age population?

  2. Does the racial or language minority tend to vote as a bloc and back the same preferred candidate?

  3. Does the remaining population also generally vote as a bloc and in doing so defeat the candidate backed by the racial or language minority?

Satisfying the first of the Gingles criteria requires analyzing maps and demographic data, and determining if racially polarized voting exists requires the use of statistics. In this section, we follow an approach similar to that outlined in [4.12.44].

Subsection 4.7.1 Inference

Voting in the United States is done by secret ballot, so although we cannot know the votes of individuals, we can use statistics to infer the voting behavior of different groups based on election results in various precincts. In the following example, we consider two different voting scenarios that have the same aggregate data (i.e. the same vote totals).

Suppose a city's voting population is 55% White and 45% non-White, and suppose that in an election, a candidate from the Heart Party receives 60% of the votes and a candidate from the Club Party receives 40% of the votes. We can summarize this information in a two-way table as shown below.
Table 4.7.2.
\(\heartsuit \) voters \(\clubsuit \) voters totals
White voters ?? ?? 55
non-White voters ?? ?? 45
totals 60 40 100

The challenge is that due to the secret ballot, there are many possibilities for the inside of the table. Here we consider two different possibilities. For each table below, fill in the missing values. (Note that once we choose one value, the rest are determined.)

Table 4.7.3.
\(\heartsuit \) voters \(\clubsuit \) voters totals
White voters 45 ?? 55
non-White voters ?? ?? 45
totals 60 40 100

Table 4.7.4.
\(\heartsuit \) voters \(\clubsuit \) voters totals
White voters 30 ?? 55
non-White voters ?? ?? 45
totals 60 40 100
Hint.
To complete each table, start with the number of White voters who voted for the Heart candidate and use it to complete the first row of data. In the first table, we have 45 White voters for Heart out of 55 total White voters. That means there must be 10 White voters who voted for the Club candidate. In a similar way, fill in the other values in the table.
Answer.
Table 4.7.5.
\(\heartsuit \) voters \(\clubsuit \) voters totals
White voters 45 10 55
non-White voters 15 30 45
totals 60 40 100
Table 4.7.6.
\(\heartsuit \) voters \(\clubsuit \) voters totals
White voters 30 25 55
non-White voters 30 15 45
totals 60 40 100
Solution.

In the first scenario, when 45 of the White voters voted for the Heart candidate, we had 10 White voters vote for the Club candidate (since the total number of White voters is 55). Furthermore, since there are 60 total Heart voters and 45 of them are White voters, we must have \(60 - 45 = 15 \) non-White Heart voters. To complete the second row of data, note that we now have 15 non-White voters who voted for the Heart candidate and 45 total non-White voters, so we must have \(45 - 15 = 30\) non-White voters who chose the Club candidate.

In the second scenario, we have 30 of the White voters choosing the Heart candidate. Thus we have \(55 - 30 = 25\) White Club voters (total White voters - White Heart voters = White Club voters). Similarly, since there are 60 total Heart voters and 30 of them are White voters, we must have \(60 - 30 = 30 \) non-White Heart voters. To complete the second row of data, note that we now have 30 non-White voters who voted for the Heart candidate and 45 total non-White voters, so we must have \(45 - 30 = 15\) non-White voters who voted for the Club candidate.

Notice that in the first table, \(45/55\text{,}\) or \(82\%\text{,}\) of White voters voted for the Heart (\(\heartsuit \)) candidate while only \(15/45 = 33 \%\) of non-White voters voted for the Heart candidate, suggesting that there is a significant difference in voting patterns between the racial groups. The second table tells a different story: \(30/55 = 55 \%\) of White voters voted for the Heart candidate and \(30/45 = 67 \%\) of non-White voters voted for the Heart candidate. The closeness of these percentages, with a majority of both groups supporting the same candidate, suggests that there is little or no relationship between race and political party in this scenario.
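
The arithmetic used to complete the two tables can also be carried out in a few lines of code. The following is a minimal Python sketch, not part of the original exercise; the function names `complete_table` and `heart_share` are our own, and the totals are those of the hypothetical city above.

```python
def complete_table(white_total, nonwhite_total, heart_total, club_total, white_heart):
    """Fill in a 2x2 table of voters from its totals and one interior cell."""
    if white_total + nonwhite_total != heart_total + club_total:
        raise ValueError("row totals and column totals must agree")
    white_club = white_total - white_heart          # rest of the White row
    nonwhite_heart = heart_total - white_heart      # rest of the Heart column
    nonwhite_club = nonwhite_total - nonwhite_heart  # rest of the non-White row
    if min(white_heart, white_club, nonwhite_heart, nonwhite_club) < 0:
        raise ValueError("this choice is impossible given the totals")
    return {
        "white": (white_heart, white_club),
        "nonwhite": (nonwhite_heart, nonwhite_club),
    }


def heart_share(row):
    """Percentage of a group's voters who chose the Heart candidate."""
    heart, club = row
    return 100 * heart / (heart + club)


# The two scenarios from the text: 45 or 30 White voters for Heart.
for white_heart in (45, 30):
    table = complete_table(55, 45, 60, 40, white_heart)
    print(table,
          round(heart_share(table["white"])),
          round(heart_share(table["nonwhite"])))
```

Both scenarios share the same totals, yet the group percentages printed at the end differ sharply (82 and 33 in the first scenario, 55 and 67 in the second), which is exactly the ambiguity the secret ballot creates.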

Given that both of the scenarios above are possible (knowing only the vote totals), how can we determine potential racially polarized voting? First, we can analyze homogeneous precincts, precincts where the vast majority of people are of the same racial group. For example, consider a voting precinct in which \(90 \%\) of the voters are White and \(95 \%\) of the votes went to the Heart (\(\heartsuit \)) candidate in a recent election, as shown below.

Table 4.7.7.
\(\heartsuit \) voters \(\clubsuit \) voters totals
White voters ?? ?? 90
non-White voters ?? ?? 10
totals 95 5 100

In this scenario, there is a much narrower range of possibilities, as we know that at least 85 of the 90 White voters voted for the Heart candidate in this precinct. (Why? Check: since there are only 5 total votes for the Club candidate, at most 5 of the White voters could have voted for the Club candidate.) That means at least \(85/90 = 94 \%\) of White voters voted for the Heart candidate, so it appears that White voters have a strong preference for the Heart candidate. We could then investigate precincts with a small percentage of White voters to estimate the vote choice of other racial groups.
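
The reasoning above gives a lower and an upper bound on the number of White voters who chose the Heart candidate. A short Python sketch of that bound computation follows; the function name `white_heart_bounds` is our own, and the numbers are the hypothetical ones from the text.

```python
def white_heart_bounds(white_total, heart_total, precinct_total=100):
    """Smallest and largest possible number of White Heart voters."""
    club_total = precinct_total - heart_total
    # At most club_total White voters can have voted Club,
    # so at least white_total - club_total voted Heart.
    lower = max(0, white_total - club_total)
    # White Heart voters cannot exceed either the White total or the Heart total.
    upper = min(white_total, heart_total)
    return lower, upper


# Homogeneous precinct: 90 White voters, 95 Heart votes.
print(white_heart_bounds(90, 95))

# Citywide totals from the earlier example: 55 White voters, 60 Heart votes.
print(white_heart_bounds(55, 60))
```

For the homogeneous precinct the bounds are 85 and 90, a narrow range. For the citywide totals they are 15 and 55, which is why the aggregate data alone told us so little.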

Subsection 4.7.2 Scatterplots and linear regression

Another statistical tool is to create a scatterplot of all precincts and perform a linear regression to estimate voter support by race. We place the percentage of voters of a particular racial group on the horizontal axis and the percentage of votes for a particular political party on the vertical axis. The information from each voting precinct then becomes a point on the graph.

The table below shows hypothetical data from the 15 voting precincts in a given town. For each precinct, the percentage of White voters and the percentage of votes for the Heart candidate are shown. Each of those pairs becomes a point on the scatterplot.
Table 4.7.9.
precinct % White % \(\heartsuit \)
1 80 72
2 45 55
3 64 69
4 78 65
5 50 47
6 59 57
7 61 62
8 35 46
9 41 52
10 24 32
11 52 60
12 28 42
13 46 60
14 37 45
15 55 60

Now we plot the points. Note that this can be done using any spreadsheet program.

Next we determine the line of best fit, a line that approximates the data points that we have entered. This is done through a process called linear regression, in which we find the line that minimizes the sum of the squared vertical distances from the data points to the line. We are looking for an equation of the form \(y = ax + b\text{,}\) where \(y\) represents the percentage of votes for the Heart candidate and \(x\) represents the percentage of voters in the precinct who are White. We use technology to determine the values of \(a\) and \(b\text{.}\) Here we use Sage, although any spreadsheet program is able to do the same analysis. Our code follows the Sage data-fitting tutorial by Brandon Curtis (linked at the end of this section).
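
For readers without access to Sage, the same least-squares line can be computed in plain Python. The sketch below is an illustrative stand-in, not the Sage code the text refers to; the function name `least_squares_line` is our own, and the data are the 15 hypothetical precincts from the table above.

```python
# Percent White and percent Heart for precincts 1 through 15.
percent_white = [80, 45, 64, 78, 50, 59, 61, 35, 41, 24, 52, 28, 46, 37, 55]
percent_heart = [72, 55, 69, 65, 47, 57, 62, 46, 52, 32, 60, 42, 60, 45, 60]


def least_squares_line(xs, ys):
    """Return slope a and intercept b of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    b = mean_y - a * mean_x
    return a, b


a, b = least_squares_line(percent_white, percent_heart)
print(f"y = {a:.1f}x + {b:.1f}")
```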

How do we interpret the output? In our example, we found \(a \approx 0.6\) and \(b \approx 24.7\text{,}\) so our line of best fit is given approximately by \(y = 0.6x + 24.7\text{.}\) In other words,

\begin{equation*} \text{(\% of votes for Heart) } = 0.6 \text{ (\% of voters who are White) } + 24.7. \end{equation*}

We can use this to predict the voting patterns of each group. Suppose in a given precinct, 100% of voters are White. Then using our model, our prediction is that

\begin{equation*} \text{(\% of votes for Heart) } = 0.6 (100) + 24.7 = 84.7\%. \end{equation*}

Now consider a precinct in which 0% of voters are White. Then using our model, our prediction is that

\begin{equation*} \text{(\% of votes for Heart) } = 0.6 (0) + 24.7 = 24.7\%. \end{equation*}

In other words, we predict that the Heart candidate is the candidate of choice for 24.7% of the non-White voters. Do the White voters appear to be a politically cohesive group? What about the non-White voters?

Subsection 4.7.3 Caution

Additional statistical analysis, including calculation of correlation and confidence intervals, can be done to further analyze the situation. We note again that all individual ballots are secret, so we are relying here on group totals (aggregate data) to draw conclusions about voting patterns of different groups. Technology tools are available to perform calculations, but any interpretation should be done carefully, with a solid understanding of the VRA criteria and careful consideration of the data used for analysis.
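
As one example of such additional analysis, the correlation coefficient \(r\) measures how closely the precinct points follow a straight line. The following plain-Python sketch computes it for the 15 hypothetical precincts from the table in the previous subsection; the function name `correlation` is our own. Note that \(r\) describes the precinct-level data only and does not by itself tell us how individuals voted.

```python
from math import sqrt

# Percent White and percent Heart for precincts 1 through 15.
percent_white = [80, 45, 64, 78, 50, 59, 61, 35, 41, 24, 52, 28, 46, 37, 55]
percent_heart = [72, 55, 69, 65, 47, 57, 62, 46, 52, 32, 60, 42, 60, 45, 60]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation(percent_white, percent_heart)
print(f"r = {r:.3f}")
```

A value of \(r\) close to 1 indicates a strong positive linear relationship between the percentage of White voters in a precinct and the percentage of votes for the Heart candidate.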

sage.brandoncurtis.com/data-fitting.html