{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "2)_Random_Variables.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "3xG4g-yFKnhm"
},
"source": [
"# 2) Random Variables"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "q-cWOMku-lx_"
},
"source": [
"[Vitor Kamada](https://www.linkedin.com/in/vitor-kamada-1b73a078)\n",
"\n",
"econometrics.methods@gmail.com\n",
"\n",
"Last updated: 9-16-2020"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "JX0o6kS_g2mP"
},
"source": [
"#### 2.1) What is a random variable?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "-ABczChsiUmx"
},
"source": [
"A random variable is a statistical model that describes the uncertain outcome of a random process by assigning a number to each possible outcome.\n",
"\n",
"Let's model the change in a stock price as a random variable X:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "q2PQ_0WH5F-u",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 139
},
"outputId": "1625df69-6918-47af-c834-61d51cbe3af2"
},
"source": [
"stock = ['Increases', 'Stays same','Decreases']\n",
"x = [10, 0, -9]\n",
"prob = [0.3, 0.5, 0.2]\n",
"table = {'Stock Price':stock, 'x': x, 'P(X=x)':prob }\n",
"import pandas as pd\n",
"X = pd.DataFrame(table)\n",
"X"
],
"execution_count": 28,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Stock Price x P(X=x)\n",
"0 Increases 10 0.3\n",
"1 Stays same 0 0.5\n",
"2 Decreases -9 0.2"
]
},
"metadata": {
"tags": []
},
"execution_count": 28
}
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "S4L4ihVq5JmT"
},
"source": [
"The capital \"X\" stands for the random variable, whereas the lowercase \"x\" denotes its possible outcomes (10, 0, -9). The statement $P(X=x_i)$ means the probability that X takes the value $x_i$. For example, $P(X=x_0) = P(X=10) = 0.3$."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "aVp5JdjwsPgS"
},
"source": [
"#### 2.2) How to calculate the expected value of a random variable X, denoted by E(X)?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "VDLxIGPls18q"
},
"source": [
"The expected value is a weighted average of the possible outcomes, where the weights are their probabilities.\n",
"\n",
"Let's calculate the mean ($\\mu$), or expected value, of the random variable X from the previous example:\n",
"\n",
"$$ \\mu = E(X) = \\sum_{i=0}^{n} x_iP(X=x_i) $$\n",
"\n",
"$$ = x_0P(X=x_0)+...+x_nP(X=x_n)$$\n",
"\n",
"$$ = 10 \\times 0.3+0 \\times 0.5+(-9) \\times 0.2$$\n",
"\n",
"$$ = 3+0-1.8 = 1.2$$\n",
"\n",
"Therefore, the average return of this stock is \\$1.2.\n",
"\n",
"Let's show the calculations step by step:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "FDltDYfx2ud-",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 139
},
"outputId": "839e3a7b-e351-43d7-a12f-036a0241f059"
},
"source": [
"X['x*P(X=x)'] = X['x']*X['P(X=x)']\n",
"X"
],
"execution_count": 29,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Stock Price x P(X=x) x*P(X=x)\n",
"0 Increases 10 0.3 3.0\n",
"1 Stays same 0 0.5 0.0\n",
"2 Decreases -9 0.2 -1.8"
]
},
"metadata": {
"tags": []
},
"execution_count": 29
}
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "DCNEeZPR7AuD"
},
"source": [
"Sum the values in the column 'x*P(X=x)':"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "zOZGp-Bh6zIW",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "037cdbca-abfb-4644-80ed-96a661384440"
},
"source": [
"sum(X['x*P(X=x)'])"
],
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1.20"
]
},
"metadata": {
"tags": []
},
"execution_count": 30
}
]
},
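{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same weighted average can also be computed in one step as a dot product. The cell below is only a sketch, not part of the original example: it assumes NumPy is available, and the names `outcomes`, `probs`, and `expected_value` are chosen here for illustration. It first checks that the probabilities sum to 1."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# Outcomes and probabilities of the stock example (illustrative names)\n",
"outcomes = np.array([10, 0, -9])\n",
"probs = np.array([0.3, 0.5, 0.2])\n",
"\n",
"# A valid probability distribution must sum to 1\n",
"assert np.isclose(probs.sum(), 1.0)\n",
"\n",
"# E(X) is the dot product of the outcomes and their probabilities\n",
"expected_value = np.dot(outcomes, probs)\n",
"expected_value"
],
"execution_count": null,
"outputs": []
},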
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "F1c7PWMZ3svM"
},
"source": [
"#### 2.3) How to calculate the variance of X, denoted by Var(X)?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "4wljk7iyrUyY"
},
"source": [
"The variance of a random variable X is the expected value of the squared deviation from its mean $\\mu$:\n",
"\n",
"$$ \\sigma^2 = Var(X) = E[(X-\\mu)^2]$$\n",
"\n",
"$$ = \\sum_{i=0}^{n} (x_i - \\mu)^2 P(X=x_i) $$\n",
"\n",
"$$ = (x_0 - \\mu)^2 P(X=x_0)+...+(x_n - \\mu)^2 P(X=x_n) $$\n",
"\n",
"$$ = (10-1.2)^2 \\times 0.3+(0-1.2)^2 \\times 0.5+(-9-1.2)^2 \\times 0.2 $$\n",
"\n",
"$$ = 77.44 \\times 0.3+1.44 \\times 0.5+104.04 \\times 0.2 $$\n",
"\n",
"$$ = 23.232 + 0.720 + 20.808 $$\n",
"\n",
"$$ = 44.76$$"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "g92JBJ35rHML",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 139
},
"outputId": "4aac219f-6a3d-4e41-cc8d-6d1c9c062343"
},
"source": [
"X['x-mu'] = X['x'] - sum(X['x*P(X=x)'])\n",
"X['(x-mu)^2'] = X['x-mu']*X['x-mu']\n",
"X['[(x-mu)^2]*P(X=x)'] = X['(x-mu)^2']*X['P(X=x)']\n",
"X"
],
"execution_count": 31,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Stock Price x P(X=x) x*P(X=x) x-mu (x-mu)^2 [(x-mu)^2]*P(X=x)\n",
"0 Increases 10 0.3 3.0 8.8 77.44 23.23\n",
"1 Stays same 0 0.5 0.0 -1.2 1.44 0.72\n",
"2 Decreases -9 0.2 -1.8 -10.2 104.04 20.81"
]
},
"metadata": {
"tags": []
},
"execution_count": 31
}
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "uLVAL8WCKZEB",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "a5c0c5f2-67c3-4430-a6c9-398b65045b9e"
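},
"source": [
"# Display floats with 2 decimal places (IPython magic)\n",
"%precision 2\n",
"\n",
"# Var(X): sum of the squared deviations weighted by their probabilities\n",
"varX = sum(X['[(x-mu)^2]*P(X=x)'])\n",
"varX"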
],
"execution_count": 32,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"44.76"
]
},
"metadata": {
"tags": []
},
"execution_count": 32
}
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "PR8NaVlgM2Uu"
},
"source": [
"Variance is a measure of variability around the mean. The value 44.76 is hard to interpret because its unit is the square of the unit (\\$) of the random variable, that is, squared dollars."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "oodtbfINOXSA"
},
"source": [
"#### 2.4) How to calculate the standard deviation of X, denoted by SD(X) or $\\sigma$?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "-mbh7TUQrHfi"
},
"source": [
"The standard deviation is the square root of the variance:\n",
"\n",
"$$ \\sigma = \\sqrt{\\sigma^2}=\\sqrt{Var(X)}$$\n",
"\n",
"$$ = \\sqrt{44.76} $$\n",
"\n",
"$$ \\approx \\$6.7 $$"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "qglfn6n3iTs5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "e51d6559-43ef-4064-e0bb-cef7510bfefc"
},
"source": [
"sdX = varX**(1/2)\n",
"sdX"
],
"execution_count": 33,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"6.69"
]
},
"metadata": {
"tags": []
},
"execution_count": 33
}
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "hxwj75XRRIc4"
},
"source": [
"The standard deviation, $\\sigma = 6.7$, has the same unit (\\$) as the random variable. Therefore, it is easier to interpret than the variance.\n",
"\n",
"Roughly speaking, outcomes that fall about one $\\sigma$ above or below the mean ($\\mu$) are common.\n",
"\n",
"The standard deviation is a measure of variability around the mean: the larger the number, the larger the variation.\n",
"\n",
"In finance, it is used as a proxy for risk. Investors want to minimize risk ($\\sigma$) and maximize expected return ($\\mu$)."
]
},
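{
"cell_type": "markdown",
"metadata": {},
"source": [
"The variance and standard deviation steps from 2.3 and 2.4 can be wrapped in one small function. This is a minimal sketch under stated assumptions: it relies on NumPy, and `discrete_var_sd` is a hypothetical helper name introduced here, not something defined elsewhere in this notebook."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"def discrete_var_sd(outcomes, probs):\n",
"    '''Return (variance, standard deviation) of a discrete random variable.'''\n",
"    outcomes = np.asarray(outcomes, dtype=float)\n",
"    probs = np.asarray(probs, dtype=float)\n",
"    mu = np.dot(outcomes, probs)               # E(X)\n",
"    var = np.dot((outcomes - mu) ** 2, probs)  # E[(X - mu)^2]\n",
"    return var, np.sqrt(var)\n",
"\n",
"# Stock example: outcomes 10, 0, -9 with probabilities 0.3, 0.5, 0.2\n",
"var_x, sd_x = discrete_var_sd([10, 0, -9], [0.3, 0.5, 0.2])\n",
"print(round(var_x, 2), round(sd_x, 2))"
],
"execution_count": null,
"outputs": []
},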
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "NdTX9NuDKLsF"
},
"source": [
"#### 2.5) Prove that $E(cX) = cE(X)$, where c is a constant."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "X25OpQRciT-x"
},
"source": [
"The previous examples indexed the outcomes starting from $i=0$. From now on, let's count from 1, but remember that Python starts counting from 0. Let $p_i = P(X=x_i)$.\n",
"\n",
"By definition:\n",
"\n",
"$$ E(X) = x_1p_1+x_2p_2+...+x_np_n$$\n",
"\n",
"Then:\n",
"\n",
"$$ E(cX) = cx_1p_1+cx_2p_2+...+cx_np_n$$\n",
"\n",
"$$ = c(x_1p_1+x_2p_2+...+x_np_n)$$\n",
"\n",
"$$ = cE(X)$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "ieexrAZWMLpV"
},
"source": [
"#### 2.6) Prove that $Var(cX)=c^2Var(X)$, where c is a constant."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "AN6t9KaVMim_"
},
"source": [
"By definition:\n",
"\n",
"$$ Var(X)=E[(X-\\mu_x)^2] $$\n",
"\n",
"From 2.5, the mean of $cX$ is $E(cX)=c\\mu_x$. Then:\n",
"\n",
"$$ Var(cX)=E[(cX-c\\mu_x)^2] $$\n",
"\n",
"$$ = E[c^2(X-\\mu_x)^2] $$\n",
"\n",
"$$ = c^2E[(X-\\mu_x)^2] $$\n",
"\n",
"$$ = c^2Var(X) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "iMkWpDPGOV3f"
},
"source": [
"#### 2.7) Prove that $E(X+c)=E(X)+c$, where c is a constant.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Aw2D45ZlOWpb"
},
"source": [
"$$ E(X+c) = (x_1+c)p_1+(x_2+c)p_2+...+(x_n+c)p_n$$\n",
"\n",
"$$ = x_1p_1+cp_1+x_2p_2+cp_2+...+x_np_n+cp_n$$\n",
"\n",
"$$ = (x_1p_1+x_2p_2+...+x_np_n) + c(p_1+p_2+...+p_n)$$\n",
"\n",
"Because the probabilities must sum to 1:\n",
"\n",
"$$ \\sum_{i=1}^{n}p_i=1 $$\n",
"\n",
"we get:\n",
"\n",
"$$ E(X+c) = E(X) + c$$"
]
},
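{
"cell_type": "markdown",
"metadata": {},
"source": [
"The properties proved in 2.5, 2.6, and 2.7 can be checked numerically on the stock example. The cell below is a sketch for illustration only: the constant `c = 3` is an arbitrary choice, and the helpers `mean` and `var` are hypothetical names defined just for this check. A numerical check on one example does not replace the proofs above."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"x = np.array([10, 0, -9])      # outcomes of the stock example\n",
"p = np.array([0.3, 0.5, 0.2])  # their probabilities\n",
"c = 3                          # arbitrary constant, chosen for illustration\n",
"\n",
"def mean(values):\n",
"    # Expected value of a variable that takes `values` with probabilities p\n",
"    return np.dot(values, p)\n",
"\n",
"def var(values):\n",
"    # Variance: expected squared deviation from the mean\n",
"    return np.dot((values - mean(values)) ** 2, p)\n",
"\n",
"scaling_mean_ok = np.isclose(mean(c * x), c * mean(x))     # E(cX) = cE(X)\n",
"scaling_var_ok = np.isclose(var(c * x), c ** 2 * var(x))   # Var(cX) = c^2 Var(X)\n",
"shift_mean_ok = np.isclose(mean(x + c), mean(x) + c)       # E(X+c) = E(X) + c\n",
"print(scaling_mean_ok, scaling_var_ok, shift_mean_ok)"
],
"execution_count": null,
"outputs": []
},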
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "N3bita8gSTUW"
},
"source": [
"#### 2.8) Prove that $E(c)=c$, where c is a constant."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "iILJMshTSpLq"
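},
"source": [
"By definition:\n",
"\n",
"$$ E(X) = x_1p_1+x_2p_2+...+x_np_n$$\n",
"\n",
"A constant takes the value $c$ for every outcome, so $x_i = c$ for all $i$. Then:\n",
"\n",
"$$ E(c) = cp_1+cp_2+...+cp_n$$\n",
"\n",
"$$ = c(p_1+p_2+...+p_n)$$\n",
"\n",
"$$ = c$$\n",
"\n",
"The last step holds because the probabilities sum to 1."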
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "yQd1rvJ7UC1V"
},
"source": [
"#### 2.9) Prove that $Var(c) = 0$, where c is a constant."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "hSKU6k4vUP4c"
},
"source": [
"By definition:\n",
"\n",
"$$ Var(X)=E[(X-\\mu_x)^2] $$\n",
"\n",
"From 2.8, $\\mu_c = E(c) = c$. Then:\n",
"\n",
"$$ Var(c)=E[(c-\\mu_c)^2] $$\n",
"\n",
"$$ = E[(c-c)^2] $$\n",
"\n",
"$$ = E[0^2] $$\n",
"\n",
"$$ = 0$$\n",
"\n",
"**Intuition:** A constant does not vary, so its variance is zero."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "B9G_mxyRKEy9",
"colab_type": "text"
},
"source": [
"#### 2.10) Draw the distribution, or probability mass function (PMF), of a random variable $X$ that represents one roll of a fair six-sided die."
]
},
{
"cell_type": "code",
"metadata": {
"id": "l0CD8X7cLC8V",
"colab_type": "code",
"colab": {}
},
"source": [
"# Library to plot the chart\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Plot a PMF as red markers on top of green vertical stems\n",
"def plot_pmf(xs, probs, rv_name='X'):\n",
"    plt.plot(xs, probs, 'o', ms=12, mec='b', color='r')\n",
"    plt.vlines(xs, 0, probs, colors='g', lw=4)\n",
"    plt.xlabel('$x$')\n",
"    plt.ylabel(f'$P({rv_name} = x)$')\n",
"    plt.ylim(0, 1)\n",
"    plt.title(f'Probability Mass Function of ${rv_name}$')"
],
"execution_count": 34,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "f8qVhIUVLFZi",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"outputId": "77a8d486-11f5-4581-c44b-1f2e0595b071"
},
"source": [
"import numpy as np\n",
"\n",
"# Outcomes of the die (x-axis) and their probabilities (y-axis)\n",
"xk = np.arange(1, 7)\n",
"pk = np.repeat(1/6, 6)\n",
"\n",
"plot_pmf(xk, pk)\n",
"\n",
"# Label the y-axis ticks in sixths\n",
"plt.yticks(np.linspace(0, 1, 7),\n",
"           ('0', r'$\\frac{1}{6}$', r'$\\frac{2}{6}$', r'$\\frac{3}{6}$',\n",
"            r'$\\frac{4}{6}$', r'$\\frac{5}{6}$', '1'));"
],
"execution_count": 35,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEYCAYAAAC9Xlb/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAWY0lEQVR4nO3de7hldX3f8fdnGATmoBIixiEwEKVFrfokcrRNvZSgRIwKao0x3g3pqIlGH41ootZaL9E02ihGDfWGFi8pEp6YqLWtIhoD5Ix3AVu1IAPKSJTKMJBB5ts/9pp4GM5ln3P2XnvO/r1fz7Of2Xvdfr/v2vDZ6/z2WmunqpAkTbcNk+6AJGn8DHtJaoBhL0kNMOwlqQGGvSQ1wLCXpAYY9pLUAMNekhpg2GtRSa5I8vBxrJvkG0lO3HfZ+dM1WpPYt0mOT/LlJDck+b0+29ZtGfZTpgvOm5LsTHJtkvclOXTS/dpXVf2Lqrpguekj+MDZneQu+0z/UpJKcuxqtrvKfux9T/Y+juyhzdvst8X2+ZidAXymqu5YVW/dd2aSeyTZlWTzvGlPSXJNkqN77emUM+yn02Oq6lDg/sAs8Ip9F0iysfdeTcb/BX5z74sk9wU2TaAfj6mqQ+c9rplAHybhGOAbi82sqm8DHwNeCJDkl4G3AadV1VW99LARhv0Uq6qrgU8A94F/Otp7aZKvAjcm2ZjkXkkuSHJ992f+qfts5gFJLk3yoyTvTXLw3hlJXpbk292f6JcmedwK1l3wiH2fIZ0PAFuAj3VHwy9N8tF9ln9rkrcssRs+ADx93utnAO9foN1Fa+navbqb980kD1tq+kp0f2EcN+/1+5K8dt7rK5L8fpKvJvl/ST6ydz8mOTrJeUl+kOQfkrytm77vfjtj3rb27ttF3/el2lykhgW3leTTwK8Ab+v68c8X2cQbgWcnuQ9wHvDsqvr7le5LLaOqfEzRA7gCeHj3/GgGR1WvmTfvy930Q4ADgW8BfwjcATgJuAE4ft7yX++WPxz4W+C189r6deBIBgcNvwHcCGwect35/Vzw+QLzNndtHNa93gjsAE5Yal8A3wTuBRwAbGdwtFnAscvVAhwPXAUc2S13LHCPxaYv954sMK+A4+a9ft8C++mSrm+HA5cBz+lq+Qrwn4EZ4GDgwUu1OW9/DPO+367NRfq/3LYuAH57iP9uP9Xt838/6f+HpvXhkf10Oj/J9cDngc8Cr583761VdVVV3QT8K+BQ4A1VtbuqPg38NfOGPYC3dcv/EHjd/HlV9d+q6pqq2lNVHwH+D/DAYdZdjar6HnAhg2AGOAW4rqq2LbPq3qP7kxkE19ULbHuxWm4FDgLuneTAqrqiBkMPi01fzPndke/1Sc4fvmpg8J5d0+3HjwG/2PXtSOAlVXVjVd1cVZ8fcnvDvO8LtbnabS0pyQYG+3MPg6N8jYFhP50eW1WHVdUxVfU7XbDvNX8c9EjgqqraM2/alcDPL7L8ld06ACR5egZnWlzffbjcB7jLMOuuwdnAU7vnT2UQ5Mv5APBk4JksMIQDi9dSVd9iMJ78H4AdST6c5MjFpi/Rh73vyWFV9dgh+jzf9+c938UgXI8Grqyqn6xwWzDc+75Qm6vd1nLeBBzG4AP2KStYTytg2Ldn/g8YXAMc3R1Z7bWF2x75Hr3PvGsAkhwD/BfgecDPVtVhDIZtsty6a+gvwPnA/brx3UcD5yy7gaorGXxR+2sMxoRvY7laquqDVfVgfjr888alpq/QLm77hfHdhlzvKmDLEl+0L/VDFcO878Na07aSPBt4HHAag/33kiRZei2thmHftosZhM0ZSQ7M4BzsxwAfnrfM7yY5KsnhwMuBj3TTZxgEyg8AkjyL7ovgIdZdiWuBu+99UVU3A+cCHwQuqarvDrmd04GTqurGBeYtWksG54mflOQg4GbgJmDPYtNXUd+XgScnOSDJKcC/GXK9S4
DvAW9IMpPk4CQPmjf/NvttH8O878Na9ba6L4tfDzy6qnYweF/vwCD4NWKGfcOqajeD/zEfCVwHvB14elVdPm+xDzL48uw7wLeB13brXsrgz++/YxAs92XwJSzLrbtCfwS8ohte+f1u2tlde8MM4dD199tVNbfIvKVqOQh4A4P9833grsAfLDF9pV7A4D24nsEQxlDj+VV1a7feccB3GXzx/BvzFllov+1dd5j3fSir3VaSezL4QHhaVX19Xk1vBl660n5oeanyZwm1viTZAlwO3K2qfjzp/kjrgUf2Wle6seEXAR826KXh9XIVZZL3MPgybUdV7TuuKw0lyQyDYZYrGZx2KWlIvQzjJHkosBN4v2EvSf3rZRinqi4EfthHW5Kk29uvboaVZCuwFWBmZuaEe97znhPukSStH9u2bbuuqo5YaN5+FfZVdRZwFsDs7GzNzS14ppwkaQFJrlxsnmfjSFIDDHtJakAvYZ/kQwyuTjw+yfYkp/fRriRpoJcx+6pa061tJUlr4zCOJDXAsJekBhj2ktQAw16SGmDYS1IDDHtJaoBhL0kNMOwlqQGGvSQ1wLCXpAYY9pLUAMNekhpg2EtSAwx7SWqAYS9JDTDsJakBhr0kNcCwl6QGGPaS1ADDXpIaYNhLUgMMe0lqgGEvSQ0w7CWpAYa9JDXAsJekBhj2ktSAXsI+yYlJPpfknUlO7KNNSdJP9XVkX8BO4GBge09tSpI6G3tq53NV9dkkPwe8GXhKT+1KkujpyL6q9nRPfwQc1EebkqSf6uXIPsnjgUcAhwFvW2K5rcBWgC1btvTRNUlqQqpq0n1Y0OzsbM3NzU26G5K0biTZVlWzC83z1EtJakBfwzgL/vlQVemjfUlqXS9hb6hL0mT1dWS/AXgNcCdgrqrO7qNdSdJAX2P2pwFHAbfgRVWS1Lu+wv544AtV9SLguT21KUnq9HUF7XZgd/f81p7alCR1+gr784AzkzwEuLCnNiVJnb7OxtkFnN5HW5Kk2/OiKklqgGEvSQ0w7CWpAYa9JDXAsJekBhj2ktQAw16SGmDYS1IDDHtJaoBhL0kNMOwlqQGGvSQ1wLCXpAYY9pLUAMNekhpg2EtSAwx7SWqAYS9JDTDsJakBhr0kNcCwl6QG9Bb2SWaSzCV5dF9tSpIG+jyyfynwFz22J0nqbOyjkSQnA5cCB/fRniTptnoJe+BEYAa4N3BTko9X1Z6e2pak5vUS9lX1coAkzwSuWyzok2wFtgJs2bKlj65JUhNSVZPuw4JmZ2drbm5u0t2QpHUjybaqml1onqdeSlID+vqCdsE/H6oqfbQvSa3ra8zeUJekCerryH4D8BrgTsBcVZ3dR7uSpIG+xuxPA44CbgG299SmJKnTV9gfD3yhql4EPLenNiVJnb4uqtoO7O6e39pTm5KkTl9hfx5wZpKHABf21KYkqdPX2Ti7gNP7aEuSdHteVCVJDTDsJakBhr0kNcCwl6QGGPaS1ADDXpIaYNhLUgMMe0lqwIrDPslMkgPG0RlJ0ngsG/ZJNiR5cpK/SbIDuBz4XpJLk/ynJMeNv5uSpLUY5sj+M8A9gJcBd6uqo6vqrsCDgYuANyZ56hj7KElao2HujfPwqrolyVuq6gV7J1bVD4GPAh9NcuDYeihJWrNlj+yr6pbu6Q1JPpZkBiDJI5L87T7LSJL2Q0Pf9bKqXpHkycAFSXYDOxkM7UiS9nNDh32ShwH/DrgR2Az8VlV9c1wdkySNzkpOvXw58MqqOhF4AvCRJCeNpVeSpJFayTDOSfOefy3JIxl8Qfuvx9ExSdLorPoK2qr6HvCwEfZFkjQma7pdQlXdNKqOSJLGx3vjSFIDVhT2e7+Q9YtZSVpfVnpk/yf7/DuUJPdK8s4k5yZ57grblCSt0WqHcbKShavqsqp6DvBE4EGrbFOStEq9jdknORX4G+DjfbUpSRroLeyr6q+q6pHAU/pqU5I0MPRFVWuR5ETg8cBBLHFkn2QrsBVgy5
YtfXRNkpqw0rDf2f17w0pWqqoLgAuGWO4s4CyA2dnZWmHfJEmLWNEwTlU9dP6/kqT1oa9hnAWP0qtqRWf1SJJWp5ewN9QlabKG+cHxs5PcYS2NdD9a/rokZyZ5xlq2JUlauWHG7K8C/i7JsfMnJrlfkvcM2c5pwFHALcD2lXRQkrR2yw7jdD9HeBHwP5O8ADgQeCFwR+AtQ7ZzPPCFqvrzJOcC/2u1HZYkrdywY/YXAp8EPgbsAJ5YVReuoJ3twO7u+a0rWE+SNALDjNm/Hfgag3Ps7wV8Gvi9JJtW0M55wCOSnMngg0OS1KNhjuy/Arx43g+VPDnJi4GLkjyhqv73chuoql3A6WvopyRpDYYJ+7Oq6jbnyVfVm5J8icGtD45Lkn2XkSTtP4Y5G+czSZ6fZN+b1XweeHWSswFPp5Sk/dgwR/anAL8FfCjJ3YEfAYcw+KD4FPCnVfWl8XVRkrRWw5x6eTPwduDtSQ4E7gLcVFXXj7tzkqTRGOZsnGckuS7JD4F3ATsNeklaX4YZs38lcDJwT+C7wOvH2iNJ0sgNM2b/43lj8q9McvE4OyRJGr1hwn5z9wtSlwOXMbhdgiRpHRkm7F8F3JfBb8feFzg0yccZXGz11ar60Bj7J0kagWHOxjlr/uskRzEI/fsBvwYY9pK0n1vxj5dU1XYGNzb7xOi7I0kahxX9Bq0kaX0y7CWpAYa9JDXAsJekBhj2ktQAw16SGmDYS1IDDHtJaoBhL0kNMOwlqQErvl3CaiR5LPAo4E7Au6vqU320K0ka6CXsq+p84PwkPwP8CYPfrpUk9aTvYZxXAH/Wc5uS1Ly+hnECvAH4RFV9sY82JUk/1UvYA88HHg7cOclxVfXOhRbqfhFrK8CWLVt66pokTb9U1aT7sKDZ2dmam5ubdDckad1Isq2qZhea56mXktSAvsbsF/zzoarSR/uS1Lq+Tr001CVpgvo6st8AvIbBRVVzVXV2H+1Kkgb6GrM/DTgKuIXBj5VLknrUV9gfD3yhql4EPLenNiVJnb7Os98O7O6e39pTm5KkTl9hfx5wZpKHABf21KYkqdPX2Ti7gNP7aEuSdHteVCVJDTDsJakBhr0kNcCwl6QGGPaS1ADDXpIaYNhLUgMMe0lqgGEvSQ0w7CWpAYa9JDXAsJekBhj2ktQAw16SGmDYS1IDDHtJaoBhL0kNMOwlqQGGvSQ1wLCXpAZMRdhXwcUXwzN+fReHz9zMARv2cPjMzTzzibu45JLB/GljzdNfc2v1gjWPteaq2i8fJ5xwQg1j9+6qZz1pVx276dr64w1n1NVsrls4oK5mc/3xhjPq2Jlr61lP2lW7dw+1uXXBmqe/5tbqrbLmUdQMzNUimdpLcAN3B94NnDvsOsOE/Z49gx31iE2frZ1sGpSzz2Mnm+pXD7mwnvWkXbVnz3A7bH9mzdNfc2v1VlnzqGqeeNj/U2MjDvuLLqo6dubaRXfU/B127My1dfHFy++s/Z01T3/NrdVbZc2jqnmpsF/XY/bveNMufuemNzHDriWXm2EXz73pzbzjTUsvtx5Y8+KmpebW6gVrXsrIal7sU2AcD0Z8ZP8zm26qq9m85Kfi3sd2jqzDZ25a/qNxP2fN019za/VWWfOoamaJI/sM5o9Xkp8FXgecDLyrqv5okeW2AlsBtmzZcsKVV1655HYP2LCHf6w7sJFbl+3DLWzkkA3/yE9uXdd/zFjzMqah5tbqBWtezrA1J9lWVbMLzetlb1XVP1TVc6rqHosFfbfcWVU1W1WzRxxxxLLbvfMhu9nBXYfqww7uyp0P2T18p/dT1ry0aai5tXrBmpcziprX9UfjqY/awzkbnjbUsudseBqnPmrPmHs0fta8tGmoubV6wZqXM5KaFxvfGeUDqIUeS60zyrNxbmCmjtnU1jf41rx+tVZvlTWPqmYmfTZOVWWhx1q3+8AHwq885o48/pBPciObFlxmJzP820M+wUmn3pEHPGCtLU6eNU9/za3VC9
bcS82LfQqM8sFguOh1wJnAM4ZZZ8VX0M5cW6/PS2o7R9ZuNtZ2jqw3bnhpHbNpiq+6s+aprbm1equseRQ1M+mLqoDHAWcDbwYeNsw6w4Z91eBKtIsvrjrk3u+tgw68rpJb6qADr6tnPvHGuuSSoTezrljz9NfcWr1V1rzWmpcK+75OvXwZ8KOq+vMk51bVE5ZbZ3Z2tubm5lbWzqtvOzJUrxp/bZNmzdNfc2v1gjXD6mpe6tTLjavr1optB/aeN7T8SaWSpJHqK+zPA85M8hDgwp7alCR1egn7qtoFnN5HW5Kk21vXF1VJkoZj2EtSAwx7SWqAYS9JDTDsJakBhr0kNcCwl6QGGPaS1ADDXpIaYNhLUgMMe0lqgGEvSQ0w7CWpAYa9JDXAsJekBhj2ktQAw16SGmDYS1IDDHtJaoBhL0kNMOwlqQGGvSQ1oLewT3JKkm8m+VaSl/XVriSpp7BPcgDwZ8AjgXsDv5nk3n20LUnq78j+gcC3quo7VbUb+DBwWk9tS1LzUlXjbyR5AnBKVf129/ppwL+squfts9xWYGv38njgm6ts8i7Adatcd72y5unXWr1gzSt1TFUdsdCMjavvz+hV1VnAWWvdTpK5qpodQZfWDWuefq3VC9Y8Sn0N41wNHD3v9VHdNElSD/oK+78H/lmSX0hyB+BJwF/11LYkNa+XYZyq+kmS5wH/HTgAeE9VfWOMTa55KGgdsubp11q9YM0j08sXtJKkyfIKWklqgGEvSQ2YqrBP8p4kO5J8fdJ96UOSo5N8JsmlSb6R5AWT7tO4JTk4ySVJvtLV/OpJ96kvSQ5I8qUkfz3pvvQhyRVJvpbky0nmJt2fPiQ5LMm5SS5PclmSXx7ZtqdpzD7JQ4GdwPur6j6T7s+4JdkMbK6qLya5I7ANeGxVXTrhro1NkgAzVbUzyYHA54EXVNVFE+7a2CV5ETAL3KmqHj3p/oxbkiuA2apq5qKqJGcDn6uqd3VnLm6qqutHse2pOrKvqguBH066H32pqu9V1Re75zcAlwE/P9lejVcN7OxeHtg9pueIZRFJjgIeBbxr0n3ReCS5M/BQ4N0AVbV7VEEPUxb2LUtyLPBLwMWT7cn4dcMZXwZ2AP+jqqa+ZuBPgTOAPZPuSI8K+FSSbd2tVKbdLwA/AN7bDde9K8nMqDZu2E+BJIcCHwVeWFU/nnR/xq2qbq2qX2RwJfYDk0z1kF2SRwM7qmrbpPvSswdX1f0Z3C33d7th2mm2Ebg/8I6q+iXgRmBkt4M37Ne5btz6o8A5VXXepPvTp+5P3M8Ap0y6L2P2IODUbgz7w8BJSf7rZLs0flV1dffvDuAvGdw9d5ptB7bP+0v1XAbhPxKG/TrWfVn5buCyqnrzpPvThyRHJDmse34IcDJw+WR7NV5V9QdVdVRVHcvgViOfrqqnTrhbY5VkpjvpgG4o41eBqT7Lrqq+D1yV5Phu0sOAkZ1ssV/d9XKtknwIOBG4S5LtwKuq6t2T7dVYPQh4GvC1bgwb4A+r6uMT7NO4bQbO7n4QZwPwF1XVxKmIjfk54C8HxzNsBD5YVZ+cbJd68XzgnO5MnO8AzxrVhqfq1EtJ0sIcxpGkBhj2ktQAw16SGmDYS1IDDHtJaoBhL0kNMOwlqQGGvTSk7rcDTu6evzbJmZPukzSsqbqCVhqzVwH/McldGdxh9NQJ90camlfQSiuQ5LPAocCJ3W8ISOuCwzjSkJLcl8G9eXYb9FpvDHtpCN1PQJ4DnAbsTDLtt1XWlDHspWUk2QScB7y4qi4DXsNg/F5aNxyzl6QGeGQvSQ0w7CWpAYa9JDXAsJekBhj2ktQAw16SGmDYS1ID/j9BQ+HoPdDikQAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TmZWdjkwZzsR",
"colab_type": "text"
},
"source": [
"#### 2.11) Show that $Var(X)=E(X^2) -[E(X)]^2$."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GtnElkthaefj",
"colab_type": "text"
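},
"source": [
"$$Var(X)=E\\{[X-E(X)]^2\\}$$\n",
"\n",
"$$=E\\{X^2 -2XE(X)+[E(X)]^2\\}$$\n",
"\n",
"Because $E(X)$ is a constant, linearity of expectation gives $E[2XE(X)] = 2E(X)E(X)$ and $E\\{[E(X)]^2\\} = [E(X)]^2$:\n",
"\n",
"$$=E(X^2) -2E(X)E(X)+[E(X)]^2$$\n",
"\n",
"$$=E(X^2) -[E(X)]^2$$"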
]
},
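{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check of the identity above, the cell below computes the variance of the stock example from 2.1 in both ways. It is only a sketch: it assumes NumPy, and the names `var_definition` and `var_shortcut` are introduced here for illustration."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"x = np.array([10, 0, -9])      # outcomes of the stock example\n",
"p = np.array([0.3, 0.5, 0.2])  # their probabilities\n",
"\n",
"mu = np.dot(x, p)                           # E(X)\n",
"var_definition = np.dot((x - mu) ** 2, p)   # E[(X - E(X))^2]\n",
"var_shortcut = np.dot(x ** 2, p) - mu ** 2  # E(X^2) - [E(X)]^2\n",
"\n",
"# Both routes should agree up to floating-point rounding\n",
"print(np.isclose(var_definition, var_shortcut))"
],
"execution_count": null,
"outputs": []
},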
{
"cell_type": "markdown",
"metadata": {
"id": "qHDXVT8lPPrq",
"colab_type": "text"
},
"source": [
"#### 2.12) What is the Bernoulli random variable?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m00ES6J8Pq0z",
"colab_type": "text"
},
"source": [
"It is a binary random variable that can take only two values: 1 and 0. For example, we can model a biased coin with $P(Tail) = p$. Let $X = 1$ if the coin lands tails, and $X = 0$ if it lands heads.\n",
"\n",
"| Outcome | x | P(X=x) |\n",
"|----------|---------|---------|\n",
"| Tail | 1 | $p$ |\n",
"| Head | 0 | $1-p$ |\n",
"\n",
"\n",
"The expected value of $X$ is:\n",
"\n",
"$$E[X] = 1 \\times p + 0 \\times (1 - p) = p$$\n",
"\n",
"The variance of $X$ is:\n",
"\n",
"$$Var(X) = E[X^2] - (E[X])^2$$\n",
"\n",
"$$= [1^2 \\times p + 0^2 \\times (1 - p)] - p^2$$\n",
"\n",
"$$= p - p^2$$\n",
"\n",
"$$= p(1 - p)$$\n",
"\n",
"The standard deviation of $X$ is:\n",
"\n",
"$$SD(X) = \\sqrt{Var(X)}$$\n",
"\n",
"$$= \\sqrt{p(1 - p)}$$"
]
},
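{
"cell_type": "markdown",
"metadata": {},
"source": [
"The formulas $E[X] = p$ and $Var(X) = p(1-p)$ can be compared with a simulation. The cell below is a sketch under assumptions that are not in the text above: $p = 0.3$, 100,000 flips, and the seed 0 are arbitrary choices, and a Bernoulli draw is generated as a binomial draw with a single trial. The sample mean and sample variance should land close to, but not exactly on, the theoretical values."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"p = 0.3            # assumed P(Tail), chosen only for illustration\n",
"n_flips = 100_000  # number of simulated coin flips\n",
"rng = np.random.default_rng(seed=0)\n",
"\n",
"# A binomial draw with n=1 is a Bernoulli draw: 1 = tail, 0 = head\n",
"flips = rng.binomial(n=1, p=p, size=n_flips)\n",
"\n",
"sample_mean = flips.mean()  # estimate of E[X] = p\n",
"sample_var = flips.var()    # estimate of Var(X) = p(1 - p)\n",
"print('mean:', sample_mean, 'theory:', p)\n",
"print('variance:', sample_var, 'theory:', p * (1 - p))"
],
"execution_count": null,
"outputs": []
},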
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "QKZQTL-kzGQm"
},
"source": [
"## Exercises"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9zRyEBPAzCma",
"colab_type": "text"
},
"source": [
"1| Let the random variable $X$ be the number of heads in three tosses of a fair coin.\n",
"\n",
"The outcome space is:\n",
"\n",
"$$\\Omega = \\{ HHH, HHT, HTH, THH, HTT, THT, TTH, TTT \\}$$\n",
"\n",
"The probability distribution of $X$ is:\n",
"\n",
"|$\\text{Possible value of } x$|$~~0~~$|$~~1~~$|$~~2~~$|$~~3~~$|\n",
"|-------------------------:|:-----:|:-----:|:-----:|:-----:|\n",
"|$P(X = x)$ |$1/8$ |$3/8$ |$3/8$ |$1/8$ |"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uasxrJSyzvdT",
"colab_type": "text"
},
"source": [
"a) What is the chance of getting more than one head, $P(X>1)$?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zA0qsXF80XqK",
"colab_type": "text"
},
"source": [
"b) What is the chance of getting two or fewer heads, $P(X \\leq 2)$?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "qeNcOAc0W414"
},
"source": [
"2| Let $Y$ be a random variable that represents the change in a stock price."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "xZQKoYTCPdIt",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 139
},
"outputId": "2f72833b-7cb7-4703-a1e3-356d349b8383"
},
"source": [
"stock = ['Increases', 'Stays same','Decreases']\n",
"y = [5, 0, -3]\n",
"prob = [0.4, 0.1, 0.5]\n",
"table = {'Stock Price':stock, 'y': y, 'P(Y=y)':prob }\n",
"import pandas as pd\n",
"Y = pd.DataFrame(table)\n",
"Y"
],
"execution_count": 36,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Stock Price y P(Y=y)\n",
"0 Increases 5 0.4\n",
"1 Stays same 0 0.1\n",
"2 Decreases -3 0.5"
]
},
"metadata": {
"tags": []
},
"execution_count": 36
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "G0kCpc3qOdlL",
"colab_type": "text"
},
"source": [
"a) Calculate $E(Y)$."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "fP6WH182ZGSA"
},
"source": [
"b) Calculate $Var(Y)$."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "qyEBb72RbA4v"
},
"source": [
"c) Calculate $SD(Y)$."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R5VPGkf-ND8F",
"colab_type": "text"
},
"source": [
"3| Let $D$ be a random variable that represents the roll of a single fair six-sided die.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VGcTZKBcNwWS",
"colab_type": "text"
},
"source": [
"a) Calculate the expected value of $D$."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IGTywSV8N4oz",
"colab_type": "text"
},
"source": [
"b) Calculate the standard deviation of $D$."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "urKmFvQKbS3n"
},
"source": [
"4| Prove that $Var(X+c)=Var(X)$, where c is a constant."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Qv3izRSaa6PZ"
},
"source": [
"## Reference"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "jTCLmFEoa79O"
},
"source": [
"Adhikari, A., Pitman, J. (2020). [Probability for Data Science](http://prob140.org/textbook/README.html).\n",
"\n",
"Diez, D. M., Barr, C. D., Çetinkaya-Rundel, M. (2014). [Introductory Statistics with Randomization and Simulation](https://www.openintro.org/stat/textbook.php?stat_book=isrs).\n",
"\n",
"Lau, S., Gonzalez, J., Nolan, D. (2020). [Principles and Techniques of Data Science](https://www.textbook.ds100.org/intro)."
]
}
]
}