class: center, middle, inverse, title-slide

.title[
# MATH 204 Introduction to Statistics
]
.subtitle[
## Lecture 8: Introduction to Random Variables
]
.author[
### JMG
]

---

## Goals for Lecture

* Introduce basic concepts and terminology for random variables (textbook section 3.4).

--

* We begin our discussion of the important notion of a **random variable**.

--

* In this lecture, we focus on building an intuitive grasp of random variables. In the next lecture, we will treat the topic more precisely.

---

## Introduction to Random Variables

- It's often useful to model a process using what's called a **random variable**.

--

- Suppose we toss a coin ten times and add up the number of heads that have appeared. Tossing a coin ten times is a random process; the total number of heads after ten tosses is a random variable.

--

- In general, a random variable assigns a **numerical value** to each outcome of a random process.

--

- We will see later that random variables have distributions associated with them, and we want to be able to describe these distributions.

--

- The sample mean is an important example of a random variable, and its distribution is an example of a **sampling distribution**.

---

## Notation for Random Variables

- We typically denote random variables by capital letters at the end of the alphabet such as `\(X\)`, `\(Y\)`, or `\(Z\)`.

--

- For example, let `\(X\)` be the total number of heads that we obtain after tossing a coin ten times. The possible values that `\(X\)` can take on are `\(X=0\)`, `\(X=1\)`, `\(X=2\)`, `\(\ldots\)`, `\(X=10\)`.

--

- Typical questions include: what is the probability that `\(X=2\)`? What is the probability that `\(X\)` is less than 5? What do these questions mean in the context of our coin tossing example?

--

- The distribution of `\(X\)` allows us to answer such questions.
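---

## Sketch: A Random Variable as a Function

To make the questions about `\(X\)` concrete, here is a minimal Python sketch; it is not taken from the course materials. It assumes a fair coin with independent tosses, so that all 1024 sequences of ten tosses are equally likely. The names `outcomes`, `X`, and `prob` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Every possible sequence of ten tosses (1 = heads, 0 = tails).
# Assumption: a fair coin and independent tosses, so all sequences are equally likely.
outcomes = list(product([0, 1], repeat=10))


def X(outcome):
    """The random variable: assign to each outcome its total number of heads."""
    return sum(outcome)


def prob(event):
    """Fraction of equally likely outcomes whose X value satisfies `event`."""
    favorable = sum(1 for outcome in outcomes if event(X(outcome)))
    return Fraction(favorable, len(outcomes))


p_two = prob(lambda x: x == 2)             # probability that X = 2
p_less_than_five = prob(lambda x: x < 5)   # probability that X is less than 5

print(p_two, float(p_two))
print(p_less_than_five, float(p_less_than_five))
```

Under these assumptions, the probability that `\(X=2\)` is 45/1024 (about 0.044) and the probability that `\(X\)` is less than 5 is 193/512 (about 0.377).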
---

## A Look at Some Data

- Let's look at six rounds of tossing a coin ten times. At each toss, we record 1 if the coin lands heads and 0 if it lands tails. Adding up the 1's gives the total number of heads after ten tosses. Our data might look as follows:

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> </th>
   <th style="text-align:right;"> X1 </th>
   <th style="text-align:right;"> X2 </th>
   <th style="text-align:right;"> X3 </th>
   <th style="text-align:right;"> X4 </th>
   <th style="text-align:right;"> X5 </th>
   <th style="text-align:right;"> X6 </th>
   <th style="text-align:right;"> X7 </th>
   <th style="text-align:right;"> X8 </th>
   <th style="text-align:right;"> X9 </th>
   <th style="text-align:right;"> X10 </th>
   <th style="text-align:right;"> total_heads </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> round_1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> round_2 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> round_3 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 7 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> round_4 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> round_5 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> round_6 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
</tbody>
</table>

--

- Let's repeat this process many more times and create a barplot that shows the number of times each of the values 0, 1, ..., 10 occurs.
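---

## Sketch: Repeating the Experiment

The repetition described on the previous slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used to produce the plots in these slides: it assumes a fair coin, and the number of rounds (`N_ROUNDS`) and the seed are arbitrary choices, since the slides do not state them.

```python
import random
from collections import Counter

random.seed(204)  # arbitrary seed so that reruns give the same numbers

N_TOSSES = 10
N_ROUNDS = 10_000  # assumed value; the slides do not say how many rounds were used


def total_heads(n_tosses):
    """Toss a fair coin n_tosses times (1 = heads, 0 = tails) and return the total."""
    return sum(random.randint(0, 1) for _ in range(n_tosses))


# One observed value of the random variable per round.
totals = [total_heads(N_TOSSES) for _ in range(N_ROUNDS)]

# Number of rounds in which each value 0, 1, ..., 10 occurred.
counts = Counter(totals)

# A text version of the barplot: one '#' per 1% of rounds (rounded).
for value in range(N_TOSSES + 1):
    bar = "#" * round(100 * counts[value] / N_ROUNDS)
    print(f"{value:2d} {counts[value]:5d} {bar}")
```

The exact counts depend on the seed and on `N_ROUNDS`, so they will not match the plots in these slides digit for digit.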
---

## The Distribution of the Number of Heads

<img src="index_files/figure-html/coin_dist-1.png" style="display: block; margin: auto;" />

--

- If we divide each count by the total number of rounds, we get the density (relative frequency) of each value. This provides an estimate of the probability of each value.

---

## Probabilities for Number of Heads

This plot shows the density instead of the count for the previous barplot.

<img src="index_files/figure-html/probs-1.png" style="display: block; margin: auto;" />

--

- From this plot, we can easily estimate probability values. For example, what is the probability of getting 4 heads out of ten tosses? It's about 0.2. What is the probability of getting fewer than three heads? Adding the estimates for 0, 1, and 2 heads gives about 0.04 + 0.01 + 0.0025 = 0.0525.

---

## The Mean and Variance for Number of Heads

We can compute the mean and variance of the observed number of heads, in which case we get:

```
## [1] "The mean is 5.035400"
```

```
## [1] "The variance is 2.385024"
```

--

- This tells us that the "average" number of heads out of ten tosses is about 5. How does this compare with your real-life experience or expectations?

--

- These values provide estimates for the **expected value** and **variance** of our random variable. These concepts will be defined and discussed in detail in the next lecture.

---

## Summary

In this lecture, we introduced the basic notion of a random variable.

---

## Notes