Math for the People

8.9 Exercises

For the following three exercises, edit the blocks of code in Section 8.5 to show results for stopped drivers in Philadelphia rather than in Hartford.

1.

What was the mean age of stopped drivers in Philadelphia during this time? What about the median? Create a histogram of the age variable from the philadelphia data.

2.

Create a table and bar chart of the sex variable from the philadelphia data.

3.

In Subsubsection 8.5.2.2 we analyzed how often white and Black drivers were searched after they were stopped, and how often contraband was found among those who were searched, for stopped Hartford drivers. Apply the same techniques to the Philadelphia drivers. Were stopped white or Black drivers searched at a higher rate? Was contraband found at a higher rate among white or Black searched drivers?

For the next two exercises, use Bayes’ theorem and the Colab notebook.

4.

Recent data estimate that 28.5% of emails worldwide are spam [8.10.155]. A new software company states that its filter correctly labels 98% of spam emails as spam. However, 2% of the time the filter incorrectly labels a non-spam email as spam (a false positive). With these percentages in mind, what is the probability that an email labeled spam is actually not spam? (Hint: there are many ways you can approach this, but it may make sense to use A for the event that an email is labeled spam and B for the event that the email actually is spam.)

5.

Are white motorists more likely to have a warning issued than Hispanic motorists? Use the Colab notebook to answer this question.

6.

The code below gives the tables of how many motorists who were stopped in Hartford were of each race, and how likely those individuals were to have contraband:

(a)

Find the probability that a stopped motorist is Black, \(P(black)\text{,}\) and the probability that a stopped motorist is white, \(P(white)\text{.}\)

(b)

Find the probability that a stopped motorist who is white has contraband, \(P(contraband|white)\text{.}\)

(c)

Find the probability that a stopped motorist who is Black has contraband, \(P(contraband|black)\text{.}\)

(d)

Use your answers from the previous parts and Bayes’ theorem to find the probability that someone who has contraband is white, \(P(white|contraband)\text{,}\) and the probability that they are Black, \(P(black|contraband)\text{.}\)

7.

For this collaborative assignment, the class will look at racism and policing across a variety of cities.
We will use the code cells below. Your professor will also post a link to a shared document where you can post your results.
We’ll look at data from the following 10 cities:
  • Burlington, VT
  • Camden, NJ
  • Little Rock, AR
  • Long Beach, CA
  • Louisville, KY
  • Madison, WI
  • New Orleans, LA
  • Oakland, CA
  • Philadelphia, PA
  • Raleigh, NC
We’re going to look at the probabilities around the race of stopped motorists, and whether they were issued a citation.
Choose two cities to work on and add your name to the list of contributors for those cities in the shared document. You should calculate the following probabilities for each racial group listed in the table:
  • The probability a stopped motorist is of that race.
  • The probability that a stopped motorist of that race gets a citation, P(citation|race).
  • The probability that a motorist who gets a citation is of that race, P(race|citation).
The first two probabilities can be calculated from the tables, and the final probability can be calculated using Bayes’ theorem. You also will need to add a graph of the racial breakdown of stopped motorists, and find the actual racial breakdown of the city’s population (or the metropolitan area, since motorists who are stopped may not live in the city itself).
Here’s an example using Hartford, CT so that you can see how it is done.
First, we’ll calculate the probabilities we need for Hartford. Let’s get the data from the table:
To calculate the probability that a stopped motorist is of a particular race, we need to add up all the categories to see how many stops there were total. In this case, there were
\begin{equation*} 176+7104+5072+29+6054=18435 \end{equation*}
Now we can calculate the probability that a stopped motorist was of a particular race. For example, consider Hispanic motorists: what is the probability that a stopped motorist is Hispanic?
\begin{equation*} P(Hispanic) = \frac{5072}{18435} = 0.2751 \end{equation*}
27.51% of stopped motorists in Hartford were Hispanic. If you look online, you’ll see that the Hartford metropolitan area is 16% Hispanic, so the percentage of stopped motorists who are Hispanic is much higher than you would expect.
We can complete these calculations for the other racial groups listed. We get 0.95% Asian American and Pacific Islander (AAPI), 38.54% Black, 0.16% Other, and 32.84% White. How do these compare to the racial demographics of the Hartford metropolitan area?
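The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch, not the notebook's own code (which may use a different language). It assumes the five counts in the sum correspond, in order, to AAPI, Black, Hispanic, Other, and White motorists. The Hispanic and White counts are confirmed by the worked example, while the other three labels are inferred from the reported percentages. The names `stops`, `total_stops`, and `p_race` are illustrative.

```python
# Stop counts per group, taken from the sum in the Hartford example.
# The group labels for 176, 7104, and 29 are an assumption inferred
# from the percentages reported in the text.
stops = {
    "AAPI": 176,
    "Black": 7104,
    "Hispanic": 5072,
    "Other": 29,
    "White": 6054,
}

# Total number of stops across all groups.
total_stops = sum(stops.values())

# P(race) = stops for that group / all stops.
p_race = {race: count / total_stops for race, count in stops.items()}

print(f"Total stops: {total_stops}")
for race, p in p_race.items():
    print(f"P({race}) = {p:.4f} ({p:.2%})")
```

Because every probability shares the same denominator, the five values should sum to 1, which is a quick check that no group was left out.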
Now let’s look at our next calculation: the probability that someone of a particular race receives a citation if they are stopped. Since this is a conditional probability, we divide only by the number of stopped motorists of that race, not by the total number of stops:
\begin{equation*} P(citation|White) = \frac{4831}{6054} = 0.7980 \end{equation*}
There is a 79.80% chance that a white motorist who has been stopped receives a citation. We can repeat this for the other groups, and we get 73.30% for AAPI motorists, 58.49% for Black motorists, 52.80% for Hispanic motorists, and 58.62% for motorists of other races. These percentages are all pretty different. What could explain the differences?
Finally, we need to calculate \(P(race|citation)\text{.}\) For this, we’ll use Bayes’ theorem.
\begin{equation*} P(Black|citation) = \frac{P(citation|Black)P(Black)}{P(citation|AAPI)P(AAPI) + P(citation|Black)P(Black) + P(citation|Hispanic)P(Hispanic) + P(citation|Other)P(Other) + P(citation|White)P(White)} \end{equation*}
\begin{equation*} P(Black|citation) = \frac{0.5849 \times 0.3854}{0.7330 \times 0.0095 + 0.5849 \times 0.3854 + 0.5280 \times 0.2751 + 0.5862 \times 0.0016 + 0.7980 \times 0.3284} \approx 0.3518 \end{equation*}
So 35.18% of stopped motorists who receive citations are Black. We can repeat this calculation for the other groups as well, getting 1.09% for AAPI motorists, 22.68% for Hispanic motorists, 0.14% for other racial groups, and 40.91% for White motorists.
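The Bayes’ theorem calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the inputs are the rounded probabilities quoted in the Hartford example, and the names `p_race`, `p_cit_given_race`, `p_citation`, and `p_race_given_cit` are hypothetical, not taken from the notebook. Because the inputs are rounded to four decimal places, the results may differ from the percentages in the text in the last digit.

```python
# Rounded values quoted in the Hartford example (assumed inputs).
# P(race): probability that a stopped motorist is of each race.
p_race = {
    "AAPI": 0.0095,
    "Black": 0.3854,
    "Hispanic": 0.2751,
    "Other": 0.0016,
    "White": 0.3284,
}

# P(citation | race): probability of a citation given the motorist's race.
p_cit_given_race = {
    "AAPI": 0.7330,
    "Black": 0.5849,
    "Hispanic": 0.5280,
    "Other": 0.5862,
    "White": 0.7980,
}

# Denominator of Bayes' theorem: the overall probability of a citation,
# found by summing P(citation | race) * P(race) over every group.
p_citation = sum(p_cit_given_race[r] * p_race[r] for r in p_race)

# Bayes' theorem: P(race | citation) = P(citation | race) * P(race) / P(citation).
p_race_given_cit = {
    r: p_cit_given_race[r] * p_race[r] / p_citation for r in p_race
}

for race, p in p_race_given_cit.items():
    print(f"P({race} | citation) = {p:.4f}")
```

The denominator is the same for every group, so it is computed once and reused. The resulting probabilities should sum to 1.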
Each member contributing to a particular city should do or check all of the calculations for that city. You can also check the answers and offer to help your fellow students working on the other cities. After everyone is done with their calculations, the class will go over the calculations and discuss the results. The contributors from each group will discuss the results they found, and then the class will discuss as a whole what this tells us about racism and policing stops in these cities.
For the discussion in class, you may wish to research other information about your city, like the racial makeup of the city or other facts about the city.
Here is a code cell that loads the data for the other cities. These files contain a lot of data, so delete the lines for the cities you are not using to save yourself time!