STA 199: Introduction to Data Science and Statistical Thinking

This page outlines the topics, content, and assignments for the semester. The schedule will be updated as the semester progresses, so the timeline of topics and assignments may change.

WEEK DATE PREPARE TOPIC MATERIALS DUE
1 Mon, Aug 26

Lab 0: Hello, World and STA 199!

💻 lab 0

Lab 0 due at the end of lab session


Tue, Aug 27

Welcome to STA 199

🖥️ slides 00
🗒️ notes 00
⌨️ ae 00



Thu, Aug 29

📗 r4ds - intro
📘 ims - chp 1
🎥 Meet the toolkit :: R and RStudio
🎥 Meet the toolkit :: Quarto
🎥 Code along :: First data viz with UN Votes

Meet the toolkit

🖥️ slides 01
🗒️ notes 01
⌨️ ae 01


2 Mon, Sep 2

No lab - Labor Day




Tue, Sep 3

📗 r4ds - chp 1
📘 ims - chp 4
🎥 Visualizing data
🎥 Building a plot step-by-step with ggplot2
🎥 Grammar of graphics
🎥 Code along :: First look at Palmer Penguins

Grammar of data visualization

🖥️ slides 02
🗒️ notes 02
⌨️ ae 02
✅ ae 02



Thu, Sep 5

📗 r4ds - chp 2
📗 r4ds - chp 3.1-3.5
🎥 Grammar of data transformation
🎥 Code along :: Flights and pipes

Grammar of data transformation

🖥️ slides 03
🗒️ notes 03
⌨️ ae 03
✅ ae 03


3 Mon, Sep 9

Lab 1: From the Midwest to North Carolina

💻 lab 1
✅ lab 1



Tue, Sep 10

📗 r4ds - chp 3.6-3.7
🎥 Visualizing and summarizing categorical data
🎥 Visualizing and summarizing numerical data
🎥 Visualizing and summarizing relationships
🎥 Code along :: Star Wars characters

Exploring data I

🖥️ slides 04
🗒️ notes 04
⌨️ ae 04
✅ ae 04



Thu, Sep 12

📘 ims - chp 5
📘 ims - chp 6
🎥 Code along :: Diving deeper with Palmer Penguins

Exploring data II

🖥️ slides 05
🗒️ notes 05
⌨️ ae 05
✅ ae 05


4 Mon, Sep 16

📗 r4ds - chp 4

Lab 2: Revisiting the Midwest

💻 lab 2

Lab 1 due at 8:30 am


Tue, Sep 17

🎥 Tidy data
🎥 Tidying data
🎥 Code along :: Country populations over time
📗 r4ds - chp 5

Tidying data

🖥️ slides 06
🗒️ notes 06
⌨️ ae 06
✅ ae 06



Thu, Sep 19

🎥 Joining data
🎥 Code along :: Continent populations
📗 r4ds - chp 19.1-19.3

Joining data

🖥️ slides 07
🗒️ notes 07
⌨️ ae 07
✅ ae 07


5 Mon, Sep 23

Lab 3: Inflation everywhere

💻 lab 3

Lab 2 due at 8:30 am


Tue, Sep 24

🎥 Data types
🎥 Data classes
🎥 Code along :: That's my type
📗 r4ds - chp 16

Data types and classes

🖥️ slides 08
🗒️ notes 08
⌨️ ae 08
✅ ae 08



Thu, Sep 26

🎥 Importing data
🎥 Code along :: Halving CO2 emissions
🎥 Code along :: Student survey
📗 r4ds - chp 7
📗 r4ds - chp 17.1-17.3

Importing and recoding data

🖥️ slides 09
🗒️ notes 09
⌨️ ae 09
✅ ae 09


6 Mon, Sep 30

Lab 4: Everything so far I

💻 lab 4

Lab 3 due at 8:30 am


Tue, Oct 1

🎥 Web scraping basics
🎥 Code along :: Scraping an eCommerce page
🎥 Code along :: Scraping many eCommerce pages
🎥 Web scraping considerations
📗 r4ds - chp 24.1-24.6
📗 r4ds - chp 25.1-25.2

Web scraping

🖥️ slides 10
🗒️ notes 10
⌨️ ae 10
✅ ae 10



Thu, Oct 3

Midterm review

🖥️ slides 11
🗒️ notes 11
📝 midterm review
✅ midterm review


7 Mon, Oct 7

📝 Merge conflicts

Project milestone 1 - Working collaboratively

📓 project milestone 1

Lab 4 due at 8:30 am
Project milestone 1 due at the end of lab session


Tue, Oct 8

Midterm - In-class + take-home released




Thu, Oct 10

Working with generative AI tools

🖥️ slides 12
🗒️ notes 12
⌨️ ae 11

Midterm course evaluation due at midnight (optional)


Fri, Oct 11


Midterm take-home due at 5:00 pm

8 Mon, Oct 14

No lab - Fall Break




Tue, Oct 15

No lecture - Fall Break




Thu, Oct 17

📕 mdsr - chp 8
📝 How to make a racist AI in R without really trying
🎥 Alberto Cairo - How charts lie
🎥 Joy Buolamwini - How I'm fighting bias in algorithms

Data science ethics

🖥️ slides 13
🗒️ notes 13



Fri, Oct 18


Peer evaluation 1 due by 5:00 pm

9 Mon, Oct 21

📝 Project description
📝 Tidyverse style guide - Chp 1-5

Project milestone 2 - Project proposals

📓 project milestone 2



Tue, Oct 22

🎥 The language of models
📘 ims - chp 7.1

The language of models

🖥️ slides 14
🗒️ notes 14
⌨️ ae 12
✅ ae 12



Thu, Oct 24

🎥 Fitting and interpreting models
🎥 Modeling nonlinear relationships
📘 ims - chp 7.2

Linear regression with a single predictor

🖥️ slides 15
🗒️ notes 15
⌨️ ae 13
✅ ae 13


10 Mon, Oct 28

Lab 5: Visualize, model, interpret

💻 lab 5

Project milestone 2 due at 8:30 am


Tue, Oct 29

🎥 Models with multiple predictors
🎥 More models with multiple predictors
📘 ims - chp 8.1-8.2

Linear regression with multiple predictors I

🖥️ slides 16
🗒️ notes 16
⌨️ ae 14
✅ ae 14



Thu, Oct 31

📘 ims - chp 8.3-8.5

Linear regression with multiple predictors II

🖥️ slides 17
🗒️ notes 17
⌨️ ae 15



Fri, Nov 1


Peer evaluation 2 due by 5:00 pm

11 Mon, Nov 4

Lab 6: Visualize, model, interpret again

💻 lab 6

Lab 5 due at 8:30 am


Tue, Nov 5

🎥 Logistic regression

Model selection and overfitting

🖥️ slides 18
🗒️ notes 18
⌨️ ae 15 - Continue
✅ ae 15



Thu, Nov 7

📘 ims - chp 9

Logistic regression

🖥️ slides 19
🗒️ notes 19
⌨️ ae 16
✅ ae 16


12 Mon, Nov 11

Lab 7: Explore and classify

💻 lab 7

Lab 6 due at 8:30 am


Tue, Nov 12

🎥 Prediction and overfitting

Evaluating models

🖥️ slides 20
🗒️ notes 20
⌨️ ae 17
✅ ae 17



Thu, Nov 14

🎥 Quantifying uncertainty
🎥 Bootstrapping
📘 ims - chp 12

Quantifying uncertainty with bootstrap intervals

🖥️ slides 21
🗒️ notes 21



Fri, Nov 15


Project milestone 3 - Improvement and progress due at 5:00 pm

13 Mon, Nov 18

Lab 8: Everything so far II

💻 lab 8

Lab 7 due at 8:30 am


Tue, Nov 19

📘 ims - chp 11

Making decisions with randomization tests

🖥️ slides 22
🗒️ notes 22
⌨️ ae 18
✅ ae 18



Thu, Nov 21

Inference overview

🖥️ slides 23
🗒️ notes 23
⌨️ ae 19



Fri, Nov 22


Peer evaluation 3 due by 5:00 pm

14 Mon, Nov 25

Project milestone 4 - Peer review

📓 project milestone 4

Project milestone 4 due at the end of lab session


Tue, Nov 26

🎥 Tips for effective data visualization
📘 ims - chp 6
📗 r4ds - chp 10
🎥 Doing data science

Communicating data science results effectively
Customizing Quarto reports and presentations

🖥️ slides 24
🗒️ notes 24
⌨️ ae 19
✅ ae 19

Lab 8 due at 10:30 pm


Thu, Nov 28

No lecture - Thanksgiving



15 Mon, Dec 2

Project milestone 5 - Work on writeup and presentations

📓 project milestone 5



Tue, Dec 3

Looking back: STA 199 overview

🖥️ slides 25
🗒️ notes 25
⌨️ ae 20
✅ ae 20



Thu, Dec 5

Looking further: Building interactive web apps with R and Shiny

🖥️ slides 26
🗒️ notes 26
⌨️ ae 21
✅ ae 21

Project milestone 5 - Writeup and presentation videos due by 5:00 pm


Fri, Dec 6


Peer evaluation 4 due by 5:00 pm


Tue, Dec 10

Final review (11 am - 1 pm, Bio Sci 111)

📝 final review
✅ final review


16 Thu, Dec 12

Final (9 am - 12 pm)