Introduction to Data Science Daily Overview: Unit 4

| Theme | Day | Lessons and Labs | Campaign | Topics | Page |
|---|---|---|---|---|---|
| Predictions and Models (15 days) | 1 | Lesson 1: Water Usage | | Data cycle, official data sets | 315 |
| | 2 | Lesson 2: Exploring Water Usage | | Exploratory data analysis, campaign creation | 319 |
| | 3 | Lesson 3: Evaluating and Implementing a Water Campaign | Water Campaign—data | Statistical questions, evaluate & mock implement campaign | 321 |
| | 4 | Lesson 4: Refining the Water Campaign | Water Campaign—data | Revise and edit campaign, data collection | 323 |
| | 5 | Lesson 5: Statistical Predictions Using One Variable | Water Campaign—data | One-variable predictions using a rule | 325 |
| | 6 | Lesson 6: Statistical Predictions by Applying the Rule | Water Campaign—data | Predictions applying mean square deviation, mean absolute error | 328 |
| | 7 | Lesson 7: Statistical Predictions Using Two Variables | Water Campaign—data | Two-variable statistical predictions, scatterplots | 333 |
| | 8 | LAB 4A: If the Line Fits… | Water Campaign—data | Estimate line of best fit | 335 |
| | 9 | LAB 4B: What’s the Score? | Water Campaign—data | Comparing predictions to real data | 337 |
| | 10 | Lesson 8: What’s the Trend? | Water Campaign—data | Trend, associations, linear model | 339 |
| | 11 | Lesson 9: Spaghetti Line | Water Campaign—data | Estimate line of best fit, single linear regression | 343 |
| | 12 | LAB 4C: Cross-Validation | Water Campaign—data | Use training and testing data for predictions | 346 |
| | 13 | Lesson 10: Predicting Values | Water Campaign—data | Predictions based on linear models | 348 |
| | 14 | Lesson 11: How Strong Is It? | Water Campaign—data | Correlation coefficient, strength of trend | 351 |
| | 15 | LAB 4D: Interpreting Correlations | Water Campaign—data | Use correlation coefficient to determine best model | 353 |
| Piecing it Together (6 days) | 16 | Lesson 12: More Variables to Make Better Predictions | Water Campaign—data | Multiple linear regression | 358 |
| | 17 | Lesson 13: Combination of Variables | Water Campaign—data | Multiple linear regression | 361 |
| | 18 | LAB 4E: This Model Is Big Enough for All of Us | Water Campaign—data | Multiple linear regression | 364 |
| | 19 | Practicum: Predictions | Water Campaign—data | Linear regression | 365 |
| | 20 | Lesson 14: Improving Your Model | Water Campaign—data | Non-linear regression | 366 |
| | 21 | LAB 4F: Some Models Have Curves | Water Campaign—data | Non-linear regression | 368 |
| The Growth of Landfills (5 days) | 22 | Lesson 15: The Growth of Landfills | Water Campaign—data | Modeling to answer real-world problems | 372 |
| | 23 | Lesson 16: Exploring Trash via the Dashboard | Water Campaign—data | Analyze data to improve models | 376 |
| | 24 | Lesson 17: Exploring Trash via RStudio | Water Campaign—data | Analyze data to improve models | 377 |
| | 25 | Prepare Team Presentations | Water Campaign—data | Modeling with statistics | - |
| | 26 | Present Team Recommendations | Water Campaign—data | Modeling with statistics | - |
| Decisions, Decisions! (3 days) | 27 | Lesson 18: Grow Your Own Decision Tree | Water Campaign—data | Multiple predictors, classifying into groups, decision trees | 380 |
| | 28 | Lesson 19: Data Scientists or Doctors? | Water Campaign—data | Decision trees based on training and testing data | 385 |
| | 29 | LAB 4G: Growing Trees | Water Campaign—data | Decision trees to classify observations | 388 |
| Ties that Bind (3 days) | 30 | Lesson 20: Where Do I Belong? | Water Campaign—data | Clustering, k-means | 392 |
| | 31 | LAB 4H: Finding Clusters | Water Campaign—data | Clustering, k-means | 397 |
| | 32+ | Lesson 21: Our Class Network | Water Campaign—data | Clustering, networks | 399 |
| End of Unit Project (7 days) | 33-40 | End of Unit 3 and 4 Design Project and Oral Presentations: Water Usage | Water Campaign | Synthesis of above | 403 |
^=Data collection window begins.
+=Data collection window ends.