
Section 4.8 Outlier Analysis

In the previous section on partisan gerrymandering, each measure compared election results to some version of proportional representation, but in many states proportionality may not be the appropriate baseline.

Consider the recent history of congressional districts in the state of Massachusetts. As of 2021, Massachusetts has 9 House districts, all of which have been represented by Democrats for roughly 25 years. In fact, the last time a Republican won a Massachusetts House seat was in 1994, and Democrats have since won 125 consecutive House elections! Despite the lopsided nature of the state's representation, its map is not controversial and has not been challenged for any type of gerrymandering.

Why is this result not controversial? Republicans do exist in Massachusetts and regularly receive between 30 and 40 percent of the vote in statewide elections (President, Governor, Senate). Proportional representation would give Republicans about 3 of the 9 congressional districts; however, even though the number of valid districting plans exceeds the number of particles in the universe, it is impossible to come up with a plan that yields even one Republican district! [4.12.35] This is due to the state's political geography: there are no large Republican enclaves within the state. Because Republican voters are spread fairly evenly, they do not form a majority in any region of Massachusetts large enough to make up a district, which explains their prolonged lack of representation in Congress.

Figure 4.8.1. These figures are taken from [4.12.35] (Figure 4). As described there, they show the voting pattern for Republicans George W. Bush in the 2000 presidential race (left, by town) and Kenneth Chase in the 2006 Senate race (right, by precinct). Regions shaded dark red voted in favor of the Republican candidate, while regions in orange and yellow were the most "Republican favorable" regions available to collect enough population to create a congressional district. Even in districts assembled this way, the Democratic candidate would have won the larger share of the vote.

Motivating Question: If proportionality isn't necessarily an appropriate benchmark, what should we measure a district plan against? Can we determine if a plan produces an unlikely result?


Definition 4.8.2.

An outlier is an observation that is unusual within a given context. In a large set of possible district plans, those with unlikely outcomes relative to the rest may be outliers, hence the term outlier analysis. There are many different criteria for determining outliers (or potential outliers) in statistics; we will not cover those in depth here.

Let's suppose a state with 7 districts has a proposed district plan in which the Heart Party is favored to win 6 of the 7 districts. A computer program generates 1,000,000 possible district plans and in only 100 of the plans is the Heart Party favored to win as many as 6 of the districts. Due to the unlikely nature of this outcome (occurring in only 0.01% of plans), there is evidence that the proposed plan may have been manipulated to benefit the Heart Party. That is, it represents a potential outlier.

Consider the following distribution of voters.

Table 4.8.3.
\(\heartsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \)
\(\heartsuit \) \(\clubsuit \) \(\heartsuit \) \(\heartsuit \) \(\clubsuit \)
\(\clubsuit \) \(\heartsuit \) \(\heartsuit \) \(\clubsuit \) \(\heartsuit \)
\(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\heartsuit \)
\(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \)

Suppose we want to create 5 districts of 5 voters each. Some quick math shows the best outcome for the Heart Party (with 8 total voters) is to win 2 districts and it would take some intentional packing and cracking to achieve that outcome. In the example below, we create plans that lead to this outcome.
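The bound stated above can be checked with a short calculation. The sketch below is illustrative only: the function name `max_winnable_districts` is hypothetical, and it assumes a district is won with a strict majority. It ignores geography entirely, so it gives an upper bound rather than a guarantee that a contiguous plan achieving it exists.

```python
def max_winnable_districts(supporters: int, district_size: int, n_districts: int) -> int:
    """Upper bound on the districts a party can win, ignoring geography."""
    # A strict majority of a 5-voter district is 3 votes.
    votes_to_win = district_size // 2 + 1
    # Each winning district uses up at least `votes_to_win` supporters.
    return min(n_districts, supporters // votes_to_win)


# 8 Heart voters, 5 districts of 5 voters each.
print(max_winnable_districts(8, 5, 5))  # 2
```

With 8 supporters and 3 votes needed per district, two districts use 6 supporters and the remaining 2 cannot carry a third.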

Example 4.8.4.

Draw district lines for the map given above so that the Heart Party wins two districts. Create 5 districts of 5 voters each.

Hint.
Note that since there are 5 voters in each district, the Heart party needs at least 3 voters in a district to win the district. How can you draw the lines so that the Heart party wins two districts?
Answer.
Table 4.8.5.
\(\heartsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \)
\(\heartsuit \) \(\clubsuit \) \(\heartsuit \) \(\heartsuit \) \(\clubsuit \)
\(\clubsuit \) \(\heartsuit \) \(\heartsuit \) \(\clubsuit \) \(\heartsuit \)
\(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\heartsuit \)
\(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \)
Solution.
Note that there are multiple possibilities! As an alternative to the plan given as the “answer”, we could also use the plan given here.
Table 4.8.6.
\(\heartsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \)
\(\heartsuit \) \(\clubsuit \) \(\heartsuit \) \(\heartsuit \) \(\clubsuit \)
\(\clubsuit \) \(\heartsuit \) \(\heartsuit \) \(\clubsuit \) \(\heartsuit \)
\(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\heartsuit \)
\(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \) \(\clubsuit \)

While the Heart Party achieves their best possible outcome, can we say that gerrymandering is present? In other words, is the outcome of this plan an outlier relative to other possible plans?

We can look at this from a couple of different perspectives. We turn to technology to help us, using the MGGG grid tool (mggg.org/metagraph/5x5.html). It turns out that there are 4,006 ways to divide a 5x5 grid into 5 contiguous districts of 5 blocks each. The MGGG grid tool evaluates all 4,006 plans against the voter locations shown here:

Figure 4.8.7. Image of Hearts and Clubs voter distribution grid from MGGG tool, matching distribution given in Table 4.8.3.
It then provides the results shown in the histogram below.
Figure 4.8.8. Bar chart showing number of plans in which Heart party wins particular number of seats.
As shown in the histogram, the Heart party wins 0 districts (seats) in 961 plans (24%), 1 district in 2,421 plans (60%), and 2 districts in 624 plans (16%). (Our event, the Heart party winning 2 districts, falls in the green bar to the right.)
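For readers who want to reproduce this kind of count without the web tool, here is a minimal brute-force sketch. It is not the MGGG tool's implementation. It assumes that "contiguous" means cells joined edge to edge, and that the grid in Figure 4.8.7 matches Table 4.8.3 as transcribed in `GRID`. All function names are hypothetical.

```python
from collections import Counter

N = 5  # the grid is N x N and each district has N cells

# Voter grid transcribed from Table 4.8.3 ("H" = Heart, "C" = Club).
GRID = [
    "HCCCC",
    "HCHHC",
    "CHHCH",
    "CCCCH",
    "CCCCC",
]


def neighbors(cell):
    """Yield the cells sharing an edge with `cell` (cells are numbered 0..24)."""
    row, col = divmod(cell, N)
    if row > 0:
        yield cell - N
    if row < N - 1:
        yield cell + N
    if col > 0:
        yield cell - 1
    if col < N - 1:
        yield cell + 1


def districts_containing(start, free):
    """Return all connected sets of N free cells that include `start`."""
    shapes = {frozenset([start])}
    for _ in range(N - 1):
        grown = set()
        for shape in shapes:
            for cell in shape:
                for nb in neighbors(cell):
                    if nb in free and nb not in shape:
                        grown.add(shape | {nb})
        shapes = grown
    return shapes


def all_plans(free=frozenset(range(N * N))):
    """Yield every way to split the free cells into contiguous districts."""
    if not free:
        yield []
        return
    # Always extend from the lowest-numbered free cell so that each plan
    # is generated exactly once.
    start = min(free)
    for district in districts_containing(start, free):
        for rest in all_plans(free - district):
            yield [district] + rest


def seats_for_hearts(plan):
    """Count the districts in which Hearts hold a strict majority."""
    return sum(
        sum(GRID[cell // N][cell % N] == "H" for cell in district) > N // 2
        for district in plan
    )


plans = list(all_plans())
histogram = Counter(seats_for_hearts(plan) for plan in plans)
print(len(plans), "plans")
for seats in sorted(histogram):
    share = histogram[seats] / len(plans)
    print(f"{seats} seats: {histogram[seats]} plans ({share:.0%})")
```

The total should come to the 4,006 plans mentioned above, and the per-seat tallies can be compared against the histogram in Figure 4.8.8.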

Alternatively, there are 1,081,575 different ways to place 8 Hearts and 17 Clubs in a 5x5 grid. This time, we consider all of these arrangements with the same district plan given above in Table 4.8.5 (the answer to Example 4.8.4), shown here. (Different colors denote different districts.)

Figure 4.8.9. District plan matching boundaries in Table 4.8.5. In the image, colors are used to distinguish different districts.
The MGGG tool provides the following analysis:
Figure 4.8.10. Bar chart showing number of voter arrangements in which Heart party wins certain number of seats, given 8 Heart voters and a district plan.
In this case, the Heart party wins 2 districts in 120,450 out of the possible 1,081,575 arrangements, which is 11% of arrangements.
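These counts can be checked without listing every arrangement. The sketch below is an independent check, not the MGGG tool's method. It relies on one observation: once the plan is fixed, the number of seats depends only on how many Hearts land in each district, so it is enough to loop over the possible Heart counts per district and multiply binomial coefficients. The variable names are illustrative.

```python
from itertools import product
from math import comb

N_DISTRICTS, DISTRICT_SIZE, HEARTS = 5, 5, 8
MAJORITY = DISTRICT_SIZE // 2 + 1  # 3 of 5 votes wins a district

arrangements_by_seats = {}
for hearts_per_district in product(range(DISTRICT_SIZE + 1), repeat=N_DISTRICTS):
    if sum(hearts_per_district) != HEARTS:
        continue
    # Number of ways to place this many Hearts inside each district.
    ways = 1
    for h in hearts_per_district:
        ways *= comb(DISTRICT_SIZE, h)
    seats = sum(h >= MAJORITY for h in hearts_per_district)
    arrangements_by_seats[seats] = arrangements_by_seats.get(seats, 0) + ways

total = sum(arrangements_by_seats.values())
print(total)                                       # 1081575
print(arrangements_by_seats[2])                    # 120450
print(f"{arrangements_by_seats[2] / total:.0%}")   # 11%
```

Because only the district sizes enter the calculation, the same totals hold for any plan with five districts of five cells, not just the one in Table 4.8.5.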

Either way, even though the Heart Party has maximum representation (2 seats), an outcome that occurs more than 10% of the time is not all that uncommon. In research that uses statistics, 5% (1 in 20) is the most common threshold for deciding whether an occurrence is unusual enough to be considered an outlier. (Sometimes lower values such as 1% are used to capture only “stronger” outliers.)
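As a rough sketch, the threshold rule described above can be written as a one-line comparison. The function name `is_outlier` is hypothetical, and a single cutoff is a simplification of how outliers are judged in practice.

```python
def is_outlier(count: int, total: int, threshold: float = 0.05) -> bool:
    """Flag an outcome whose share of the ensemble falls below the threshold."""
    return count / total < threshold


# Heart Party wins 2 seats: share of district plans, then share of voter arrangements.
print(is_outlier(624, 4_006))          # False (about 16%)
print(is_outlier(120_450, 1_081_575))  # False (about 11%)
```

Passing `threshold=0.01` applies the stricter 1% cutoff instead.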

Now suppose a ninth Heart is added to the voter distribution above (replacing any one of the Clubs). Mathematically, it is now possible for the Heart Party to win 3 districts, but even if they do, would it be considered an outlier and potential gerrymandering?

Go to the MGGG grid tool (mggg.org/metagraph/5x5.html) and scroll down to the section with bar graphs. Create a grid that matches Table 4.8.3, then change one of the Clubs to a Heart so that there are exactly 9 Hearts. Then create a district plan (using “Build a Plan \(\mathcal{D}\)”) in which Hearts win 3 of the 5 districts. Observe the distributions to determine if your plan is an outlier.

Solution.
The specific information depends on which square in the grid is changed to a Heart. Choose different possibilities to see the different results! As one example, place a Heart in row 5, column 5. This is shown below:
Figure 4.8.12. Vote distribution grid from MGGG tool, modified from Table 4.8.3 to have 9 Hearts. Plan drawn so that Hearts win 3 seats.
Then 38 plans out of 4,006 possible plans, or 0.9%, yield 3 seats for the Heart party, as shown in the histogram below:
Figure 4.8.13. Bar chart showing number of plans in which Heart party wins some number of seats; reveals chance of winning 3 seats is small.
Hearts winning 3 seats occurs in less than 5% of the possible plans; in fact, in less than 1% of them. The plan therefore appears to be an outlier when compared to the ensemble of all possible plans.

Although we can analyze all possible plans and distributions on a 5x5 grid, this is nowhere near feasible for the precincts on a real map. However, there are algorithmic techniques that can quickly generate thousands or millions of valid districting plans, so that a particular district plan of interest can be compared to an ensemble.

mggg.org/metagraph/5x5.html