Lesson 16: Categorical Associations
Objective
You will learn to construct and interpret two-way frequency tables and to calculate their joint relative frequencies.
Vocabulary
two-way frequency table, joint relative frequency
Essential Concepts
A two-way table is a summary of the association/relationship between two categorical variables. Joint relative frequencies answer questions like "What proportion of the people/objects had this value on the first variable and this value on the second?"
Lesson
1. Read the following scenario:
Rosa has a theory that cat owners are also musical. To test it, she collected data on the relationship between cat ownership and instrument playing among the students in her art class. Her survey found that of the 35 students in the class, 16 owned a cat, and of those cat owners, 7 played an instrument. The other 9 cat owners did not play an instrument. There were also 9 students who neither owned a cat nor played an instrument.
2. Rosa asked her classmates two questions to collect the data she needed. What could those two questions be?
3. Based on the questions that you came up with for #2, what could be the variables that Rosa collected information about? For example, if "owns a cat" is a variable, what value or number of students represents that variable? What were the values of the other variables?
4. In your IDS Journal, summarize Rosa's findings in one table. The last column should show the total of each row, and the last row should show the total of each column. The row totals should add up to 35, and so should the column totals. Use your knowledge of data structures from Lesson 2, especially organizing in rows and columns.
5. In this lesson you will be looking at associations between categorical variables. Recall from Lesson 3 that the values of a categorical variable are words or labels rather than measured quantities.
6. There are many ways to organize the data that Rosa collected. One of them is a two-way frequency table, which displays counts for two categorical variables measured on the same group. The categories of one variable are shown in the rows and the categories of the other are shown in the columns.
The Cat Ownership and Instruments two-way frequency table below represents the data Rosa collected from the students in her art class. Compare your table to the one below. It might be organized a little differently, but the totals should add to 35.
Cat Ownership and Instruments
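The counts in Rosa's scenario are enough to rebuild this table. The sketch below is one possible way to do it in Python. Python, the variable names, and the choice to put cat ownership in the rows are all assumptions made for illustration. They are not part of the lesson. The only cell not stated in the scenario (students who do not own a cat but play an instrument) is found by subtracting the three known cells from 35.

```python
# Joint counts stated in Rosa's scenario (35 students surveyed).
# Keys are (cat ownership, instrument playing) pairs.
TOTAL_STUDENTS = 35
counts = {
    ("Owns a cat", "Plays an instrument"): 7,
    ("Owns a cat", "Does not play"): 9,
    ("Does not own a cat", "Does not play"): 9,
}

# The fourth cell is not given directly; it is whatever is left over.
counts[("Does not own a cat", "Plays an instrument")] = (
    TOTAL_STUDENTS - sum(counts.values())
)

rows = ["Owns a cat", "Does not own a cat"]
cols = ["Plays an instrument", "Does not play"]

# Marginal totals: add each row across and each column down.
row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}

# Print the table with a Total column and a Total row.
print(f"{'':<20}" + "".join(f"{c:>22}" for c in cols) + f"{'Total':>8}")
for r in rows:
    cells = "".join(f"{counts[(r, c)]:>22}" for c in cols)
    print(f"{r:<20}{cells}{row_totals[r]:>8}")
bottom = "".join(f"{col_totals[c]:>22}" for c in cols)
print(f"{'Total':<20}{bottom}{sum(row_totals.values()):>8}")
```

If your own table has instrument playing in the rows instead, swapping `rows` and `cols` (and the order of each key) gives the same counts in a different layout.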
7. Based on the Cat Ownership and Instruments table, generate a couple of questions that can be asked and answered by the data and write them in your IDS Journal.
8. A two-way frequency table can also show relative frequencies. A relative frequency is how often something occurs relative to the total number of observations. It is simply a ratio (proportion) and can be expressed as a fraction, a decimal, or a percent. For example, the relative frequency of students who own a cat and play an instrument is 7/35 = 0.2 = 20%.
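The same calculation can be written as a short Python sketch that shows one relative frequency in all three forms. The function name `relative_frequency` is made up for this illustration, and the use of Python is an assumption. Note that `Fraction` reduces 7/35 to lowest terms, so it prints as 1/5.

```python
from fractions import Fraction


def relative_frequency(count, total):
    """Return count / total as an exact fraction."""
    return Fraction(count, total)


# 7 of the 35 students own a cat and play an instrument.
rf = relative_frequency(7, 35)

print(rf)                   # fraction in lowest terms: 1/5
print(float(rf))            # decimal: 0.2
print(f"{float(rf):.0%}")   # percent: 20%
```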
9. The relative frequencies have been calculated in the two-way relative frequency table below.
Cat Ownership and Instruments: Relative Frequencies
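A relative frequency table like this one can be produced by dividing every count by the total number of students. The sketch below is one possible way to do that in Python. The nested-dictionary layout and the names `table` and `relative` are assumptions for illustration. The count of 10 is derived from the scenario (35 - 7 - 9 - 9), not stated in it.

```python
# Two-way frequency table as nested dictionaries: table[row][column] = count.
table = {
    "Owns a cat": {"Plays an instrument": 7, "Does not play": 9},
    "Does not own a cat": {"Plays an instrument": 10, "Does not play": 9},
}

# Grand total: every student is counted in exactly one cell.
grand_total = sum(sum(row.values()) for row in table.values())

# Joint relative frequency of each cell = cell count / grand total.
relative = {
    row_name: {col: count / grand_total for col, count in row.items()}
    for row_name, row in table.items()
}

for row_name, row in relative.items():
    cells = ", ".join(f"{col}: {value:.1%}" for col, value in row.items())
    print(f"{row_name} -> {cells}")
```

Because each student falls into exactly one cell, the four joint relative frequencies add up to 1 (100%), which is a quick way to check the table.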
10. Compare the two-way frequency table in #6 and the relative frequency table in #9. What is the difference between a two-way frequency table and a two-way relative frequency table? When would it be better to use one over the other?
11. Using the Cat Ownership and Instruments: Relative Frequencies table, answer the following questions about relative frequencies:
    a. What percent of the students in Rosa's art class own a cat?
    b. What percent of the students in Rosa's art class do not own a cat but play an instrument?
    c. What does the relative frequency 18/35 represent?
12. Choose two categorical variables that you predict might be associated. Then generate two questions about the two categorical variables that you chose and write them in your IDS Journal. Pick variables that you have a reason to think are related, rather than two variables chosen at random.
Here's an example:
I think there is a relationship between watching scary movies and drinking soda, so I am going to survey family and friends by asking them whether (1) they like watching scary movies and (2) they drink soda.
Here's how you might organize a two-way table for the two questions above:
                      Likes scary movies   Does not like scary movies   Total
Drinks soda
Does not drink soda
Total
Now it's your turn. Come up with your own two questions about two categorical variables, then create a two-way table for organizing your data. Record these in your IDS Journal.
Reflection
What are the essential learnings you are taking away from this lesson?
Homework
Survey at least 20 people (family or friends) and ask them to answer the two questions you generated so that you can fill out your two-way table.
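Once the survey is done, the answers can be tallied into a two-way table by hand or with a short program. The Python sketch below shows one possible approach. The responses in it are HYPOTHETICAL, made up only to show the mechanics, and the function name `two_way_table` is likewise an assumption. Replace the list with your own survey results.

```python
from collections import Counter

# HYPOTHETICAL responses for illustration only; use your own survey data.
# Each tuple is one person's answers: (likes scary movies?, drinks soda?).
responses = [
    ("yes", "yes"),
    ("yes", "no"),
    ("no", "yes"),
    ("no", "no"),
    ("yes", "yes"),
]


def two_way_table(responses):
    """Count each (first answer, second answer) pair and each answer's total."""
    joint = Counter(responses)                  # the four inner cells
    first = Counter(a for a, _ in responses)    # totals for the first question
    second = Counter(b for _, b in responses)   # totals for the second question
    return joint, first, second


joint, movie_totals, soda_totals = two_way_table(responses)
n = len(responses)

# Show each cell as a count and as a joint relative frequency.
for (movies, soda), count in sorted(joint.items()):
    print(f"scary movies={movies}, soda={soda}: {count} of {n} ({count / n:.0%})")
```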