Lesson 17: A Normal Measure of Spread
Objective
You will learn that standard deviation is another way to measure variability.
Vocabulary
standard deviation (SD)
Essential Concepts
The standard deviation is another measure of spread. Statisticians commonly use it because of its role in widely used models and distributions, such as the normal model.
Lesson
-
In your IDS Journal, create a two-column table and label the left column Measures of Center (Central Tendency) and the right column Measures of Spread (Dispersion).
-
Review the information below about methods you have learned so far for measuring center and measuring spread in distributions. Place each of those measures in the correct column in your table.
-
A measure of center tells us the value that is typical, or in the center. A measure of spread tells us how variable, or how spread apart, the data are.
Measures of Center: mean (average or typical value), median
Measures of Spread: mean absolute deviation (MAD), interquartile range (IQR)
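To see why center and spread are separate ideas, it can help to look at two data sets that share the same center but differ in spread. The sketch below uses plain base R and two small data sets, `a` and `b`, that are invented for illustration and do not come from any class data.

```r
# Two small data sets, invented for illustration (not from class data).
a <- c(3, 4, 5, 6, 7)
b <- c(1, 3, 5, 7, 9)

# Measures of center: both data sets have the same mean and the same median.
mean(a)
mean(b)
median(a)
median(b)

# Measure of spread: the IQR is larger for b, whose values sit farther apart.
IQR(a)
IQR(b)
```

Both data sets are centered at the same value, so only a measure of spread can tell them apart.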
-
Below is another student's response to question #2 above. Do you agree or disagree?
Insert a sample student response.
-
A measure of center or a measure of spread summarizes a distribution with a single value. In your IDS Journal, write down what you think the value of each measure tells you about the data in the distribution.
-
Add the term standard deviation (SD) to your Measures of Spread column.
-
The standard deviation of a distribution is another way to measure spread, or variability. The standard deviation is similar to the mean absolute deviation (MAD).
-
Recall the formula for calculating the MAD:
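The MAD formula is usually written as shown below. The notation is an assumption, since the original formula is not reproduced in this text: $x_i$ stands for each data value, $\bar{x}$ for the mean, and $n$ for the number of data values.

```latex
\[
  \text{MAD} = \frac{1}{n} \sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert
\]
```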
-
While the MAD uses the absolute distance of each data point from the mean, the standard deviation squares the distance of each data point from the mean. Both methods result in measurements that are never negative, because neither an absolute value nor a square can be negative.
-
Observe the formula for calculating the standard deviation of a data set:
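One common way to write the standard deviation is shown below. The notation is an assumption, since the original formula is not reproduced in this text: $x_i$ stands for each data value, $\bar{x}$ for the mean, and $n$ for the number of data values. This version divides by $n$, so it is literally the square root of the average of the squared distances. Some versions of the formula divide by $n - 1$ instead, so check which version your handout uses.

```latex
\[
  \text{SD} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}
\]
```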
-
Now you will use the formula in #10 above to calculate the standard deviations of the dotplots in the How Far Apart? (with standard deviation -- SD) handout.
Follow the directions on the handout. When you get to part iii, compute the SD for plot (a) as you follow along with the steps in the video. Then compute the SD for plot (c) on your own.
To download a fillable copy of the How Far Apart? (with standard deviation -- SD) handout (LMR_2.16) click on the document name.
Insert a video showing step-by-step calculations of SD for plot (a).
Remember that you calculated MAD values in the How Far Apart? handout during Lesson 4 of this unit. Download a fillable copy of the How Far Apart? handout (LMR_2.6) by clicking on the document name.
Compare and contrast the standard deviations that you just calculated with the MAD values that you obtained in LMR_2.6 from Lesson 4.
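If you want to check a hand calculation, the two measures can be computed step by step in R. The sketch below is only an illustration: the data set `x` is invented (it is not one of the dotplots from the handouts), the names `mad_value` and `sd_value` are made up for this example, and the SD is computed by dividing by n, which is an assumption about the version of the formula you are using.

```r
# Hypothetical data set, invented for illustration (not from the handouts).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

x_bar <- mean(x)            # the mean of the data
deviations <- x - x_bar     # signed distance of each value from the mean

# MAD: the average of the absolute distances from the mean.
# (Note: R's built-in mad() is the MEDIAN absolute deviation, a different measure.)
mad_value <- mean(abs(deviations))

# SD, divide-by-n version: square the distances, average them, take the square root.
sd_value <- sqrt(mean(deviations^2))

mad_value   # 1.5
sd_value    # 2
```

For this data set the SD is larger than the MAD, because squaring gives extra weight to the values that sit farthest from the mean.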
In your IDS Journal, write down why you think the SD takes the square root of the average of the squares.
Using RStudio, you will now estimate the standard deviation for a few numerical distributions and explain the reasoning for your estimate. Load and view the atus data, then run the following functions one by one:
> histogram(~sleep, data=atus, breaks=seq(0,1500,by=100), main="Distribution of sleep in minutes")
> sleep_mean<-mean(~sleep, data=atus)
> add_line(vline=sleep_mean)
Observe the visual obtained and estimate the standard deviation. In your IDS Journal, report your estimate using the following sentence frame:
“The time spent sleeping (in minutes) typically varies from the mean by ____ minutes.”
How did you come up with your estimate?
Obtain the actual standard deviation by running the function:
> sd(~sleep, data=atus)
Compare your estimate to the actual standard deviation.
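When you compare values, keep in mind that R's sd() function divides by n - 1 rather than by n, so it can differ slightly from a hand calculation that averages the squared distances. The sketch below shows the difference on a small data set invented for illustration (it is not the atus data), using plain base R. The names `sd_n`, `sd_r`, and `sd_manual` are made up for this example.

```r
# Hypothetical data set, invented for illustration (not the atus data).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Sum of the squared distances from the mean.
sum_sq <- sum((x - mean(x))^2)

# "Average of the squares" version: divide by n.
sd_n <- sqrt(sum_sq / n)

# R's sd() divides by n - 1 instead; the manual version below matches it.
sd_r <- sd(x)
sd_manual <- sqrt(sum_sq / (n - 1))

c(divide_by_n = sd_n, r_sd = sd_r, manual_n_minus_1 = sd_manual)
```

With only eight values the two versions are noticeably different, but the gap shrinks as the number of observations grows, so for a large data set the choice of n or n - 1 makes little practical difference.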
Repeat this process with the following numerical variables: Household Size and Socializing. Refer to the functions below and report your estimate using the sentence frames provided:
Household Size:
> histogram(~household_size, data=atus, nint=13)
> household_mean<-mean(~household_size, data=atus)
> add_line(vline=household_mean)
“Household sizes typically vary from the mean by ____ people.”
> sd(~household_size, data=atus)
Socializing:
> histogram(~socializing, data=atus, breaks=seq(0,2000,by=100))
> social_mean<-mean(~socializing, data=atus)
> add_line(vline=social_mean)
“The time spent socializing (in minutes) typically varies from the mean by ____ minutes.”
> sd(~socializing, data=atus)
Reflection
What are the essential learnings you are taking away from this lesson?