1) Foundations of Probability

Vitor Kamada

econometrics.methods@gmail.com

Last updated 9-13-2020

1.1) What is the probability of getting a Head (H) or Tail (T), when you flip a coin?

Let’s visualize the concept of mutually exclusive events using Venn diagrams.

First, let’s import Python libraries to draw Venn diagrams.

from matplotlib import pyplot as plt
from matplotlib_venn import venn2

Let’s plot the events Head (H) and Tail (T) with respective probabilities:

\[P(H)=\frac{1}{2}\]
\[P(T)=\frac{1}{2}\]
venn2( (1/2, 1/2, 0), set_labels = ('Head', 'Tail') )
[Figure: Venn diagram of the events Head and Tail, each of size 1/2]

Two events are disjoint (mutually exclusive) if they have no outcomes in common.

When you flip a coin, the events Head (H) and Tail (T) are mutually exclusive.

If \(H\) and \(T\) are disjoint events, then

\[ P(H\cup T) = P(H) + P(T) \]
\[ = \frac{1}{2} + \frac{1}{2}\]
\[ = 1 \]

1.2) A fair coin is tossed 3 times. Find the probability that the sequence (Tail, Head, and Tail) is obtained.

Two events A and B are said to be independent if the outcome of event A doesn’t affect the outcome of event B, and vice versa. Each time a coin is tossed, there is no reason to believe that the result of one toss can affect the result of another toss. Therefore, the 3 events are independent of each other. In this case, we multiply the probabilities of the individual events.

\[P(T) \cdot P(H)\cdot P(T)\]
\[\frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2}=\frac{1}{8}\]
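
The multiplication rule can be cross-checked by brute force. The sketch below assumes a fair coin, so that all sequences of 3 tosses are equally likely, and simply counts how many of them match (Tail, Head, Tail). The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All sequences of 3 tosses; equally likely under the fair-coin assumption
sequences = list(product(['H', 'T'], repeat=3))

# Count the sequences equal to (Tail, Head, Tail)
target = ('T', 'H', 'T')
favorable = sequences.count(target)

prob_tht = Fraction(favorable, len(sequences))
print(len(sequences), prob_tht)  # 8 1/8
```

Counting 1 favorable sequence out of 8 agrees with the product of the three individual probabilities.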

1.3) A coin is tossed twice at random. What is the probability of getting at least one Tail?

The possible outcomes are HH, HT, TH, TT.

Outcomes with at least one Tail: HT, TH, TT.

Since the 4 outcomes are equally likely, the probability of getting at least one Tail is \(\frac{3}{4}\).
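
The same count can be reproduced in code. This sketch, assuming a fair coin, lists the four outcomes and computes the probability in two ways: by counting directly, and through the complement rule \(1 - P(HH)\), which is a standard identity not used in the text above.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT (assumed equally likely)
outcomes = [''.join(p) for p in product('HT', repeat=2)]

# Direct count of the outcomes that contain at least one Tail
at_least_one_tail = [o for o in outcomes if 'T' in o]
prob_direct = Fraction(len(at_least_one_tail), len(outcomes))

# Complement rule: 1 - P(no Tail) = 1 - P(HH)
prob_complement = 1 - Fraction(outcomes.count('HH'), len(outcomes))

print(at_least_one_tail, prob_direct, prob_complement)  # ['HT', 'TH', 'TT'] 3/4 3/4
```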

1.4) In a regular deck, what is the probability of getting a diamond card or face card?

For any two events A and B, the probability that one or the other occurs is the sum of their probabilities minus the probability of their intersection:

\[ P(A\cup B) = P(A) + P(B) - P(A\cap B) \]
venn2( (10, 9, 3) )
[Figure: Venn diagram with region sizes 10, 9, and 3]
\[ P(\diamondsuit) + P(face\ card) - P(\diamondsuit \cap face\ card)\]
\[ \frac{13}{52} + \frac{12}{52} - \frac{3}{52} \]
# Use a name that does not shadow Python's built-in sum()
prob = 13/52 + 12/52 - 3/52
round(prob, 2)
0.42
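
The counts 13, 12, and 3 can be verified by building the deck explicitly. The sketch below assumes a standard 52-card deck in which the face cards are J, Q, and K; the names `deck`, `diamonds`, and `faces` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))

# Events as sets of cards; face cards are assumed to be J, Q, K
diamonds = {card for card in deck if card[1] == 'diamonds'}
faces = {card for card in deck if card[0] in ('J', 'Q', 'K')}

# The set union counts the 3 diamond face cards only once
prob_union = Fraction(len(diamonds | faces), len(deck))
print(len(diamonds), len(faces), len(diamonds & faces), prob_union)  # 13 12 3 11/26
```

The fraction 11/26 equals 22/52, which rounds to 0.42.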

1.5) In a regular deck, is the event getting a heart card independent from the event getting an Ace?

Two events are independent if the occurrence of one does not affect the chances for the occurrence of the other.

There is no reason to believe that getting a heart card can affect the probability of getting an Ace, and vice-versa.

Two events A and B are independent if the probability that both A and B occur is the product of the probabilities of the two events:

\[ P(A\cap B) = P(A)\cdot P(B) \]
\[ P(\heartsuit \cap Ace) = P(\heartsuit)\cdot P(Ace) \]
\[ \frac{1}{52} = \frac{1}{4}\cdot \frac{1}{13} \]

\(P(\heartsuit \cap Ace) = P(Ace\ of\ Hearts) = \frac{1}{52}\)

\(P(\heartsuit) = \frac{13}{52} = \frac{1}{4}\)

\(P(Ace) = \frac{4}{52} = \frac{1}{13}\)

Therefore, these two events are independent, which can be written:

\[ \heartsuit \perp Ace\]
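
The product rule for this example can also be checked by counting cards. This is a minimal sketch assuming one card is drawn uniformly at random from a standard deck; the helper `prob` is hypothetical and defined only for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))

def prob(event):
    """Probability of an event (a set of cards) for one uniformly drawn card."""
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == 'hearts'}
aces = {card for card in deck if card[0] == 'A'}

lhs = prob(hearts & aces)        # P(heart and Ace)
rhs = prob(hearts) * prob(aces)  # P(heart) * P(Ace)
print(lhs, rhs, lhs == rhs)  # 1/52 1/52 True
```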

1.6) According to the Venn diagram below, are events A and B independent?

venn2( (1/2, 1/2, 0) )
[Figure: Venn diagram of events A and B, each of size 1/2, with an empty intersection]

No. Events A and B are dependent.

If I have the information that event A occurred, I know for sure that event B didn’t occur, and vice-versa.

It was given that \(P(A) = 0.5\) and \(P(B) = 0.5\).

\[0 = P(A\cap B) \neq P(A)\cdot P(B) = \frac{1}{4}\]

1.7) When is P(A|B) = P(A)?

By definition, the probability of event A given that event B occurs (with \(P(B) > 0\)) is:

\[P(A|B) = \frac{P(A\cap B)}{P(B)}\]

If A and B are independent, then \(P(A\cap B) = P(A)\cdot P(B)\)

\[P(A|B) = \frac{P(A)\cdot P(B)}{P(B)}\]
\[=P(A)\]

Intuition: Knowledge of event B doesn’t affect the probability of event A.
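
As a concrete check, the numbers from exercise 1.5 can be plugged into the definition. Taking A = Ace and B = heart (this choice of example is ours, not part of the derivation above):

\[P(Ace|\heartsuit) = \frac{P(Ace\cap \heartsuit)}{P(\heartsuit)}\]
\[= \frac{1/52}{1/4} = \frac{1}{13} = P(Ace)\]

Knowing that the card is a heart leaves the probability of an Ace unchanged at \(\frac{1}{13}\).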

1.8) What is the probability that the total of two dice will be greater than 9, given that the first die is a 5?

Let A = first die is 5.

Let B = total of two dice is greater than 9.

\[P(A)=\frac{1}{6}\]

Possible outcomes for A and B: (5, 5), (5, 6):

\[P(A \ and \ B)=\frac{2}{36}=\frac{1}{18}\]

The asked probability is:

\[P(B|A)=\frac{P(A \ and \ B)}{P(A)}\]
\[=\frac{1/18}{1/6}=\frac{1}{3}\]
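
The conditional probability can be confirmed by listing all 36 outcomes. The sketch below assumes two fair six-sided dice; it restricts the sample space to the outcomes in A and then counts those that are also in B.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes
rolls = list(product(range(1, 7), repeat=2))

# A = first die is 5; A and B = first die is 5 and the total exceeds 9
event_a = [r for r in rolls if r[0] == 5]
a_and_b = [r for r in event_a if r[0] + r[1] > 9]

# P(B|A) = P(A and B) / P(A)
p_a = Fraction(len(event_a), len(rolls))
p_a_and_b = Fraction(len(a_and_b), len(rolls))
p_b_given_a = p_a_and_b / p_a
print(a_and_b, p_b_given_a)  # [(5, 5), (5, 6)] 1/3
```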

1.9) Why do you believe that the probability of getting a Tail in a fair coin is equal to \(\frac{1}{2}\)?

Probability = Long Run Relative Frequency

The Law of Large Numbers (LLN): The relative frequency of an outcome converges to its probability as the number of observed trials increases.

In its standard form, the LLN assumes that the trials are independent and identically distributed.

If you toss a fair coin thousands of times, the proportion of Tails will likely be close to \(\frac{1}{2}\).

Let’s toss a coin 10,000 times and count how many Tails:

# Library to generate random numbers
import random

# Simulation
result = []

for toss in range(10000):
    coin = random.choice(['Head', 'Tail'])
    result.append(coin)

result.count('Tail')
5015

1.10) What is the number of trials for the Law of Large Numbers (LLN) to kick in?

The question asks when the relative frequency of an outcome will be “close enough” to the theoretical probability. The answer depends on the specific problem. In the case of flipping a coin, beyond roughly 500 tosses the relative frequency gives a “reasonable” approximation of the theoretical probability (1/2). See the simulation below.

# Simulation
n=1
prob=[]
flip=[]

while n < 2000:
    head = 0
    tail = 0

    for i in range(n):
        if random.randint(0,1) == 0:
            head+=1 
        else:
            tail+=1

    k = head/(head+tail)
    prob.append(k)
    flip.append(n)
    n+= 1

# Plot chart
import plotly.express as px
fig = px.scatter(x=flip, y=prob,
                 labels={"x": "Number of Flips",
                         "y": "Proportion of Heads"})

# Horizontal line at the theoretical probability 1/2
fig.update_layout(shapes=[
    dict(type='line',
         yref='y', y0=0.5, y1=0.5,
         xref='x', x0=0, x1=2000)])

fig.show()
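
The simulation above draws a fresh batch of tosses for every sample size. A complementary view, sketched here as an alternative and not as the author's method, follows one single sequence of tosses and tracks the running proportion of Heads. The seed value and the checkpoints are arbitrary choices made for illustration.

```python
import random
from itertools import accumulate

random.seed(42)  # arbitrary seed, used only to make the run repeatable

n_tosses = 2000
# In this sketch 1 stands for Head and 0 for Tail
tosses = [random.randint(0, 1) for _ in range(n_tosses)]

# Running count of Heads after each toss, then the running proportion
running_heads = list(accumulate(tosses))
running_prop = [h / (i + 1) for i, h in enumerate(running_heads)]

# Distance from 1/2 at a few checkpoints
for n in (10, 100, 500, 2000):
    print(n, round(abs(running_prop[n - 1] - 0.5), 4))
```

With a single sequence, the distance from 1/2 typically shrinks as more tosses accumulate, although it does not have to decrease at every step.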

1.11) If a female tests positive for breast cancer, what is the probability that she in fact has cancer?

The question asks for the probability that a female has breast cancer, conditional on a positive test:

\[P(Cancer|Test\ Positive)\]
\[= \frac{P(Cancer\cap Test \ Positive)}{P(Test \ Positive)}\]

Based on data from Banks et al. (2004), the probability that mammography detects cancer when it is present is 0.87. Among females without cancer, the probability of a negative result is 0.97. The overall incidence of breast cancer is 0.003.

We can calculate that:

\[P(Cancer\cap Test \ Positive)\]
\[= P(Cancer)\cdot P(Test \ Positive|Cancer)\]
\[= 0.003\cdot 0.87\]
\[= 0.00261\]
\[P(No \ Cancer\cap Test \ Negative)\]
\[= P(No \ Cancer)\cdot P(Test \ Negative\ |No \ Cancer)\]
\[= 0.997\cdot 0.97\]
\[= 0.96709\]

It is easy to organize the information in a table:

            Positive   Negative   Total
Cancer      0.00261               0.003
No Cancer              0.96709    0.997
Total                             1

We can infer the remaining values of the table, because the joint probabilities (inside the table) in each row or column sum to the marginal probability on the border.

            Positive   Negative   Total
Cancer      0.00261    0.00039    0.003
No Cancer   0.02991    0.96709    0.997
Total       0.03252    0.96748    1

\[P(Cancer|Test\ Positive)\]
\[= \frac{0.00261}{0.03252}\]
\[=0.0803\]

The probability that a female who tested positive has breast cancer is only about 8%. This result is counterintuitive, because humans do not process events with very low probability well. Unless a person is educated in the laws of probability, someone who tests positive and is later found to be cancer-free might believe that she was cured by a divine force.
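
The whole calculation fits in a few lines of code. The function below is a minimal sketch; its name and parameter names (`prevalence`, `sensitivity`, `specificity`) are our own labels for the three numbers quoted from Banks et al. (2004), not terms used in the text above.

```python
def prob_cancer_given_positive(prevalence, sensitivity, specificity):
    """P(Cancer | Positive) computed from the two joint probabilities."""
    true_positive = prevalence * sensitivity               # P(Cancer and Positive)
    false_positive = (1 - prevalence) * (1 - specificity)  # P(No Cancer and Positive)
    return true_positive / (true_positive + false_positive)

posterior = prob_cancer_given_positive(prevalence=0.003,
                                       sensitivity=0.87,
                                       specificity=0.97)
print(round(posterior, 4))  # 0.0803
```

Changing `prevalence` shows how strongly the answer depends on the low incidence of the disease.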

1.12) Is the test result independent of the disease?

Let’s create a dataframe (df).

disease = ['Cancer', 'No Cancer']
positive = [0.00261, 0.02991]
negative = [0.00039, 0.96709]

columns = {'Disease':disease, 'Positive': positive,
           'Negative': negative}

import pandas as pd
df = pd.DataFrame(columns)

df['Total'] = df['Positive'] + df['Negative']
df
     Disease  Positive  Negative  Total
0     Cancer   0.00261   0.00039  0.003
1  No Cancer   0.02991   0.96709  0.997

The test result is not independent of the disease, because:

\[P(Positive)=0.03252\]
\[P(No\ Cancer)=0.997\]
\[P(Positive) \cdot P(No\ Cancer)=0.03242244\]
\[P(Positive \cap No \ Cancer)=0.02991\]

A single counterexample is enough to prove that the test result and the disease are dependent events.

\[P(Positive) \cdot P(No\ Cancer) \neq P(Positive \cap No \ Cancer)\]
0.03252*0.997
0.03242244
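
The same comparison can be repeated for every cell of the table. This sketch uses plain dictionaries with illustrative names; it computes the marginals from the joint probabilities and prints each joint probability next to the product of its marginals.

```python
# Joint probabilities from the table, keyed by (disease, test result)
joint = {
    ('Cancer', 'Positive'): 0.00261,
    ('Cancer', 'Negative'): 0.00039,
    ('No Cancer', 'Positive'): 0.02991,
    ('No Cancer', 'Negative'): 0.96709,
}

# Marginals: sum the joint probabilities over the other variable
p_disease = {}
p_result = {}
for (disease, result), p in joint.items():
    p_disease[disease] = p_disease.get(disease, 0) + p
    p_result[result] = p_result.get(result, 0) + p

# Under independence, each joint probability would equal the product of its marginals
for (disease, result), p in joint.items():
    product_of_marginals = p_disease[disease] * p_result[result]
    print(disease, result, round(p, 5), round(product_of_marginals, 5))
```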

Let’s get additional intuition by transforming the table: take each joint probability and divide it by the marginal probability of its column.

Without information about the test result, an educated guess is that the probability that a random female has breast cancer is 0.3%. But if I receive the additional information that she tested positive, I would update her probability of having cancer to 8%. In this case, the conditional probability \(P(Cancer|Positive)\) is bigger than the marginal probability \(P(Cancer)\). The bigger this difference, the more relevant the new information and the higher the degree of dependence between test result and disease.

# joint probabilities / marginal probabilities
df['Negative'] = df['Negative']/df['Negative'].sum()
df['Positive'] = df['Positive']/df['Positive'].sum()
df
     Disease  Positive  Negative  Total
0     Cancer  0.080258  0.000403  0.003
1  No Cancer  0.919742  0.999597  0.997

Let’s assume a situation where the test result is useless, that is, the test result is independent of the disease. In that case, we would see the made-up numbers in the table below.

df['Negative'] = df['Total']
df['Positive'] = df['Total']
df
     Disease  Positive  Negative  Total
0     Cancer     0.003     0.003  0.003
1  No Cancer     0.997     0.997  0.997

Exercises

1| Disjoint events are not the same as independent events. Explain in plain English the difference between the two concepts.

2| Assume a fair coin: P(H) = 0.5 and P(T) = 0.5. Is the event getting a Tail independent from the event getting a Head? Justify your answer rigorously.

3| Let A = {The roll of a die is odd} and B = {The roll of a die is even}. Are the two events mutually exclusive? Are the two events independent? Justify.

4| Vitor gave his class two tests. 25% of the class passed both tests and 42% of the class passed the first test. What percent of those who passed the first test also passed the second test?

5| Write code to roll a die 60,000 times and count how many times the face 3 shows up.

6| Let A = {student is absent} and C = {student has coronavirus}. Decide if each statement is True or False. Justify your choice.

a) If the probability that a student is absent is greater than the probability that a student has coronavirus, then \(P(A|C) > P(C|A)\).

b) The probability that a student has coronavirus when it is known that the student is absent is equal to the probability that a student is absent when it is known that the student has coronavirus.

c) If the probability that a student is absent is greater than the probability that a student has coronavirus, then A and C are dependent events.

d) The probability of a student being absent is greater than the probability that the student is absent given that the student has coronavirus.

7| Events A and B are independent. Draw both events, using the function ‘venn2’. Use the numbers of your Venn diagram to prove that A and B are independent.

Reference

Adhikari, A., Pitman, J. (2020). Probability for Data Science.

Banks, E., et al. (2004). Influence of personal characteristics of individual women on sensitivity and specificity of mammography in the Million Women Study: cohort study. BMJ, 329(7464), 477. doi:10.1136/bmj.329.7464.477

Diez, D. M., Barr, C. D., Çetinkaya-Rundel, M. (2014). Introductory Statistics with Randomization and Simulation.