For the next two exercises use Bayes’ theorem and the Colab notebook.
7.
For this collaborative assignment, the class will look at racism and policing across a variety of cities.
We will use the code cells below. Your professor will also post a link to a shared document where you can post your results.
We’ll look at data from the following 10 cities:
Burlington, VT
Camden, NJ
Little Rock, AR
Long Beach, CA
Louisville, KY
Madison, WI
New Orleans, LA
Oakland, CA
Philadelphia, PA
Raleigh, NC
We’re going to look at the probabilities around the race of stopped motorists, and whether they were issued a citation.
Choose two cities to work on and add your name to the list of contributors for those cities in the shared document. You should calculate the following probabilities for each racial group listed in the table:
The probability a stopped motorist is of that race.
The probability that a stopped motorist of that race gets a citation, P(citation|race).
The probability that a motorist who gets a citation is of that race, P(race|citation).
The first two probabilities can be calculated directly from the tables, and the final probability can be calculated using Bayes’ theorem. You will also need to add a graph of the racial breakdown of stopped motorists, and find the actual racial breakdown of the city’s population (or of the metropolitan area, since motorists who are stopped may not live in the city itself).
Here’s an example using Hartford, CT so that you can see how it is done.
First, we’ll calculate the probabilities we need for Hartford. Let’s get the data from the table:
To calculate the probability that a stopped motorist is of a particular race, we first need to add up all the categories to see how many stops there were in total. In this case, the total number of stops is
\begin{equation*}
176+7104+5072+29+6054=18435
\end{equation*}
Now we can calculate the probability that a stopped motorist was of a particular race. For example, let’s look at Hispanic motorists: what is the probability that a motorist who was stopped is Hispanic?
\begin{equation*}
\frac{5072}{18435} \approx 0.2751
\end{equation*}
So 27.51% of stopped motorists in Hartford were Hispanic. If you look online, you’ll see that the Hartford metropolitan area is 16% Hispanic, so the percentage of stopped motorists who are Hispanic is much higher than you would expect.
We can complete these calculations for the other racial groups listed. We get 0.95% Asian American and Pacific Islander (AAPI), 38.54% Black, 0.16% Other, and 32.84% White. How do these compare to the racial demographics of the Hartford metropolitan area?
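The same shares can be checked with a few lines of Python. This is only a minimal sketch: the counts are the five summands from the total above, the pairing of each count with a group is inferred from the percentages quoted in the text, and the names `stops` and `p_race` are illustrative rather than taken from the notebook.

```python
# Stop counts for Hartford, taken from the sum above.
# Assumption: the count-to-group pairing is inferred from the quoted percentages.
stops = {
    "AAPI": 176,
    "Black": 7104,
    "Hispanic": 5072,
    "Other": 29,
    "White": 6054,
}

# Total number of stops across all groups
total_stops = sum(stops.values())

# P(race): share of all stopped motorists that falls in each group
p_race = {race: count / total_stops for race, count in stops.items()}

for race, p in p_race.items():
    print(f"{race}: {p:.2%}")
```

Because every share uses the same denominator, the five probabilities should add up to 1, which is a quick way to catch a typing mistake in the counts.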
Now let’s look at our next calculation: the probability that someone of a particular race receives a citation if they are stopped. Since this is a conditional probability, we divide only by the number of stopped motorists of that particular race, not by the total number of stops:
\begin{equation*}
P(citation|White) = \frac{4831}{6054} = 0.7980
\end{equation*}
There is a 79.80% chance that a White motorist who has been stopped receives a citation. We can repeat this for the other groups, and we get 73.30% for AAPI motorists, 58.49% for Black motorists, 52.80% for Hispanic motorists, and 58.62% for motorists of other races. These percentages are all pretty different. What could explain the differences?
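As a quick check of the conditional probability, here is a small Python sketch. The helper name `p_citation_given_race` is hypothetical (it is not part of the notebook); the two numbers are the White citation and stop counts used in the equation above.

```python
def p_citation_given_race(citations: int, stops: int) -> float:
    """Return P(citation | race): citations divided by stops within one group."""
    if stops <= 0:
        raise ValueError("stops must be positive")
    return citations / stops


# White motorists in Hartford: 4831 citations out of 6054 stops
p_cit_white = p_citation_given_race(4831, 6054)
print(f"P(citation | White) = {p_cit_white:.4f}")
```

The denominator here is the number of stops for that group only, which is what makes this a conditional probability rather than a share of all stops.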
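Finally, we need to calculate \(P(race|citation)\text{.}\) For this, we’ll use Bayes’ theorem. The denominator adds up \(P(citation|race)P(race)\) over all five groups, which gives the overall probability that a stopped motorist receives a citation.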
\begin{equation*}
P(Black|citation) = \frac{P(citation|Black)P(Black)}{P(citation|AAPI)P(AAPI) + P(citation|Black)P(Black) + P(citation|Hispanic)P(Hispanic) + P(citation|Other)P(Other) + P(citation|White)P(White)}
\end{equation*}
\begin{equation*}
P(Black|citation) = \frac{0.5849 \times 0.3854}{0.7330 \times 0.0095 + 0.5849 \times 0.3854 + 0.5280 \times 0.2751 + 0.5862 \times 0.0016 + 0.7980 \times 0.3284} \approx 0.3518
\end{equation*}
So 35.18% of stopped motorists who receive citations are Black. We can repeat this calculation for the other groups as well, getting 1.09% for AAPI motorists, 22.68% for Hispanic motorists, 0.14% for other racial groups, and 40.91% for White motorists.
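The Bayes’ theorem step can also be sketched in Python. This example assumes the rounded four-decimal values quoted in the text above as inputs, and the dictionary names are illustrative. Because the inputs are rounded, the last digit of a result may differ slightly from the percentages quoted above.

```python
# P(race) for stopped motorists, rounded values quoted in the text
p_race = {
    "AAPI": 0.0095,
    "Black": 0.3854,
    "Hispanic": 0.2751,
    "Other": 0.0016,
    "White": 0.3284,
}

# P(citation | race), rounded values quoted in the text
p_cit_given_race = {
    "AAPI": 0.7330,
    "Black": 0.5849,
    "Hispanic": 0.5280,
    "Other": 0.5862,
    "White": 0.7980,
}

# Denominator of Bayes' theorem (law of total probability):
# P(citation) = sum over races of P(citation | race) * P(race)
p_citation = sum(p_cit_given_race[r] * p_race[r] for r in p_race)

# Bayes' theorem: P(race | citation) = P(citation | race) * P(race) / P(citation)
p_race_given_cit = {
    r: p_cit_given_race[r] * p_race[r] / p_citation for r in p_race
}

for race, p in p_race_given_cit.items():
    print(f"P({race} | citation) = {p:.4f}")
```

Computing the denominator once and reusing it for every group avoids retyping the long sum, and the five resulting probabilities should again add up to 1.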
Each member contributing to a particular city should do or check all of the calculations for that city. You can also check the answers and offer to help your fellow students working on the other cities. After everyone is done with their calculations, the class will go over the calculations and discuss the results. The contributors from each group will discuss the results they found, and then the class will discuss as a whole what this tells us about racism and policing stops in these cities.
For the discussion in class, you may wish to research other information about your city, like the racial makeup of the city or other facts about the city.
Here is a code cell which loads in the data for the other cities. There is a lot of data in these files, so delete the lines for the cities that you are not using to save yourself time!