2) Random Variables¶
econometrics.methods@gmail.com
Last updated: 9-16-2020
2.1) What is a random variable?¶
It is a statistical model that describes uncertain outcome of a random process.
Let’s model stock price as a random variable X:
stock = ['Increases', 'Stays same','Decreases']
x = [10, 0, -9]
prob = [0.3, 0.5, 0.2]
table = {'Stock Price':stock, 'x': x, 'P(X=x)':prob }
import pandas as pd
X = pd.DataFrame(table)
X
Stock Price | x | P(X=x) | |
---|---|---|---|
0 | Increases | 10 | 0.3 |
1 | Stays same | 0 | 0.5 |
2 | Decreases | -9 | 0.2 |
The capital “X” stands for the random variable; whereas the lower case “x” indicates the possible outcomes (10, 0, -9). The statement \(P(X=x_i)\) means the probability of the outcome \(x_i\). For example, \(P(X=x_0) = P(X=10) = 0.3\).
2.2) How to calculate the expected value of a random variable X, denoted by E(X)?¶
Expected value is a weighted average that uses probabilities to weight the possible outcomes.
Let’s calculate the mean (\(\mu\)) or expected value of the previous example random variable (X):
Therefore, the average return of this stock X is $1.2.
Let’s show the calculations step by step:
X['x*P(X=x)'] = X['x']*X['P(X=x)']
X
Stock Price | x | P(X=x) | x*P(X=x) | |
---|---|---|---|---|
0 | Increases | 10 | 0.3 | 3.0 |
1 | Stays same | 0 | 0.5 | 0.0 |
2 | Decreases | -9 | 0.2 | -1.8 |
Sum up the rows of ‘x*P(X=x)’:
sum(X['x*P(X=x)'])
1.2
2.3) How to calculate the variance of X, denoted by Var(X)?¶
The variance of a random variable X is the expected value of the squared deviation from its mean \(\mu\):
X['x-mu'] = X['x'] - sum(X['x*P(X=x)'])
X['(x-mu)^2'] = X['x-mu']*X['x-mu']
X['[(x-mu)^2]*P(X=x)'] = X['(x-mu)^2']*X['P(X=x)']
X
Stock Price | x | P(X=x) | x*P(X=x) | x-mu | (x-mu)^2 | [(x-mu)^2]*P(X=x) | |
---|---|---|---|---|---|---|---|
0 | Increases | 10 | 0.3 | 3.0 | 8.8 | 77.44 | 23.232 |
1 | Stays same | 0 | 0.5 | 0.0 | -1.2 | 1.44 | 0.720 |
2 | Decreases | -9 | 0.2 | -1.8 | -10.2 | 104.04 | 20.808 |
# Round 2 decimals
%precision 2
varX = sum(X['[(x-mu)^2]*P(X=x)'])
varX
44.76
Variance is a measure of variability around the mean. It is hard to interpret 44.76, because the measurement unit is the square of the measurement unit ($) of the random variable.
2.4) How to calculate the standard deviation of X, denoted by SD(X) or \(\sigma\)?¶
Standard Deviation is the square root of the variance.
$$ $6.7 $$
sdX = varX**(1/2)
sdX
6.69
The standard deviation, \(\sigma = 6.7\), has the same unit ($) of the random variable. Therefore, it is easy to interpret.
One \(\sigma\) above the mean (\(\mu\)) or below the mean (\(\mu\)) is a very likely outcome.
The standard deviation is a measure of variability around the mean. Bigger the number, bigger the variation.
In Finance, it is a proxy for risk. You want to minimize risk (\(\sigma\)) and maximize return (\(\mu\)).
2.5) Prove that \(E(cX) = cE(X)\), where c is a constant.¶
In the previous examples, I was using the index \(i\) starting from 0. Let’s start counting from 1 but remember that Python starts counting from 0.
By definition:
Then:
2.6) Prove that \(Var(cX)=c^2Var(X)\), where c is a constant.¶
By definition:
Then:
2.7) Prove that \(E(X+c)=E(X)+c\), where c is a constant.¶
As probability must sum up to 1:
Then:
2.8) Prove that \(E(c)=c\), where c is constant.¶
By definition: \($ E(X) = x_1p_1+x_2p_2+...+x_np_n\)$
Then:
2.9) Prove that \(Var(c) = 0\), where c is a constant.¶
By definition:
Then:
Intuition: By definition a constant has no variation.
2.10) Draw the distribution or probability mass function (PMF) of a random variable \(X\), that represents one roll from a fair six-sided die.¶
# Library to plot the chart
import matplotlib.pyplot as plt
# Function to plot the chart
def plot_pmf(xs, probs, rv_name='X'):
plt.plot(xs, probs, 'ro', ms=12, mec='b', color='r')
plt.vlines(xs, 0, probs, colors='g', lw=4)
plt.xlabel('$x$')
plt.ylabel('$P(X = x)$')
plt.ylim(0, 1)
plt.title('Probability Mass Function of $X$');
import numpy as np
# Generate x-axis and y-axis
xk = np.arange(1, 7)
pk = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)
plot_pmf(np.arange(1, 7), np.repeat(1/6, 6))
plt.yticks(np.linspace(0, 1, 7),
('0', r'$\frac{1}{6}$', r'$\frac{2}{6}$', r'$\frac{3}{6}$',
r'$\frac{4}{6}$', r'$\frac{5}{6}$', '1'));
2.11) Show that \(Var(X)=E(X^2) -[E(X)]^2\).¶
2.12) What is the Bernoulli random variable?¶
It is binary random variable, that can take only two values: 1 and 0. For example, we can simulate a biased coin with \(P(Tail) = p\). Let \(X = 1\) if the coin flip is tail, and \(X = 0\) if the coin flip is head.
Outcome |
x |
P(X=x) |
---|---|---|
Tail |
1 |
\(p\) |
Head |
0 |
\(1-p\) |
The expected value of \(X\) is:
The variance of \(X\) is:
The standard deviation of \(X\) is:
Exercises¶
1| Let the random variable \(X\) be the number of heads in three tosses.
The outcome space is:
The probability distribution of \(X\) is:
\(\text{Possible value of } x\) |
\(~~0~~\) |
\(~~1~~\) |
\(~~2~~\) |
\(~~3~~\) |
---|---|---|---|---|
\(P(X = x)\) |
\(1/8\) |
\(3/8\) |
\(3/8\) |
\(1/8\) |
a) What is the chance of getting more than one head, \(P(X>1)\)?
b) What is the chance of getting two or less than two heads, \(P(X \leq 2)\)?
2| Let \(Y\) be a random variable that represents a stock.
stock = ['Increases', 'Stays same','Decreases']
y = [5, 0, -3]
prob = [0.4, 0.1, 0.5]
table = {'Stock Price':stock, 'y': y, 'P(Y=y)':prob }
import pandas as pd
Y = pd.DataFrame(table)
Y
Stock Price | y | P(Y=y) | |
---|---|---|---|
0 | Increases | 5 | 0.4 |
1 | Stays same | 0 | 0.1 |
2 | Decreases | -3 | 0.5 |
a) Calculate the \(E(Y)\).
b) Calculate \(Var(Y)\).
c) Calculate the \(SD(Y)\).
3| Let \(D\) be a random variable that represents the roll of a single fair six-sided die.
a) Calculate the expected value of \(D\).
b) Calculate the standard deviation of \(D\).
4| Prove that \(Var(X+c)=Var(X)\), where c is a constant.
Reference¶
Adhikari, A., Pitman, J. (2020). Probability for Data Science.
Diez, D. M., Barr, C. D., Çetinkaya-Rundel, M. (2014). Introductory Statistics with Randomization and Simulation.
Lau, S., Gonzalez, J., Nolan, D. (2020). Principles and Techniques of Data Science.