Model selection and overfitting
Lecture 18
Warm-up
While you wait…
Go to your ae project in RStudio.
Make sure all of your changes up to this point are committed and pushed, i.e., there’s nothing left in your Git pane.
If you missed class last Thursday, pull to get today’s application exercise file: ae-15-modeling-loans.qmd.
Make sure you’ve completed the “Get to know the data” section of your AE.
Announcements
- My office hours this week:
  - I’ll hold them at a modified time: 1-2 pm on Wednesday at Old Chem 213 (in place of Dav’s office hours)
  - Dav will fill in for me 2-4 pm on Wednesday at Old Chem 203B
- Make sure you’re caught up with prepare materials before Thursday’s class
Reminders
What is the difference between \(R^2\) and adjusted \(R^2\)?
- \(R^2\):
  - Proportion of variability in the outcome explained by the model.
  - Useful for quantifying the fit of a given model.
- Adjusted \(R^2\):
  - Proportion of variability in the outcome explained by the model, with a penalty added for the number of predictors in the model.
  - Useful for comparing models.
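For reference, the two quantities can be written out using the standard textbook definitions. The symbols are assumptions that do not appear on the slides: \(n\) is the number of observations, \(k\) the number of predictors, \(\hat{y}_i\) the fitted values, and \(\bar{y}\) the mean of the outcome.

\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{\text{adj}} = 1 - \left(1 - R^2\right) \frac{n - 1}{n - k - 1}
\]

Since \(\frac{n - 1}{n - k - 1} \geq 1\), adjusted \(R^2\) is never larger than \(R^2\), and the gap widens as \(k\) grows for a fixed \(n\).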
Application exercise
Finish up Thursday’s AE
Go to your ae project in RStudio.
Get back to working on ae-15-modeling-loans.
Goals:
Review prediction and interpretation of model results
Review main and interaction effects models
Discuss model selection further
Recap
What is the practical difference between a model with parallel and non-parallel lines?
What is the definition of R-squared?
Why do we choose models based on adjusted R-squared and not R-squared?
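As one way to think about the last question, here is a minimal numeric sketch of how adjusted \(R^2\) can rank two models differently than \(R^2\) does. It is written in Python only to stay self-contained. The function name `adjusted_r2`, the sample size, and both models' \(R^2\) values are hypothetical and are not taken from the loans data.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a linear model with k predictors fit to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical numbers for illustration only (not from the AE).
n = 50
small_model = {"r2": 0.300, "k": 2}  # fewer predictors
large_model = {"r2": 0.302, "k": 6}  # four extra predictors, tiny gain in R^2

adj_small = adjusted_r2(small_model["r2"], n, small_model["k"])
adj_large = adjusted_r2(large_model["r2"], n, large_model["k"])

print(f"small model: R^2 = {small_model['r2']:.3f}, adjusted R^2 = {adj_small:.3f}")
print(f"large model: R^2 = {large_model['r2']:.3f}, adjusted R^2 = {adj_large:.3f}")
```

With these made-up values the larger model has the higher \(R^2\) but the lower adjusted \(R^2\), because its small gain in fit does not offset the penalty for four extra predictors.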