Lecture 1
Duke University
STA 199 - Fall 2024
August 29, 2024
If you have not yet completed the Getting to know you survey, please do so ASAP!
If you have not yet accepted the invite to join the course GitHub Organization (I’m looking at 25 of you as of this morning!), please do so ASAP!
Peer tutoring
More info at https://sta199-f24.github.io/course-support.html#peer-tutoring
Complete all the preparation work before class.
Ask questions.
Do the readings.
Do the lab.
Don’t procrastinate – at least on a weekly basis!
Course operation
Doing data science
By the end of the course, you will be able to…
What does it mean for a data analysis to be “reproducible”?
Short-term goals:
Long-term goals:
Packages: Fundamental units of reproducible R code, including reusable R functions, the documentation that describes how to use them, and sample data
As of 27 August 2024, there are 21,168 R packages available on CRAN (the Comprehensive R Archive Network)
We’re going to work with a small (but important) subset of these!
Option 1:
Sit back and enjoy the show!
Option 2:
Go to your container and launch RStudio.
Install with install.packages(), once per system:
Note
We already pre-installed many of the packages you’ll need for this course, so you might go the whole semester without needing to run install.packages()!
Load with library(), once per session:
If data analysis was cooking…
Installing a package would be like buying ingredients from the store
Loading a package would be like getting the ingredients out of your pantry and setting them on your counter top to be used
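The install-once, load-every-session pattern can be sketched as follows. This is only an illustration: the package name palmerpenguins is an assumption (the slides name the penguins data frame but not the package), and the install line is left commented out so the script never downloads anything by itself.

```r
# Assumed package name; the slides only mention the penguins data frame.
pkg <- "palmerpenguins"

# "Buying ingredients": check whether the package is already on this system.
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (!is_installed) {
  # Installing is needed once per system. Uncomment to actually install.
  # install.packages(pkg)
  message("Package '", pkg, "' is not installed yet.")
} else {
  # "Getting ingredients out of the pantry": load once per R session.
  library(pkg, character.only = TRUE)
  message("Package '", pkg, "' is loaded for this session.")
}
```

In everyday use you would simply write library(palmerpenguins) at the top of your document; the check above only makes the two separate steps visible.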
aka the package you’ll hear about the most…
penguins data frame
bill_length_mm
flipper_length_mm
First, run the code below and read the error. Then, fix the code to access the flipper_length_mm variable in the penguins data frame.
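The exercise code itself is not reproduced here, so the following is a sketch of the kind of error and fix involved, not the original exercise. It assumes the error comes from typing the variable name by itself, and it uses a small hypothetical stand-in data frame (toy_penguins, with made-up values) so that it runs without any package installed.

```r
# Hypothetical stand-in for the penguins data frame; the values are made up.
toy_penguins <- data.frame(
  bill_length_mm = c(39.1, 46.5, 50.0),
  flipper_length_mm = c(181, 210, 196)
)

# Typing the variable name alone fails: R looks for a standalone object
# with that name and does not find one. tryCatch() captures the error
# message so the script can keep running.
result <- tryCatch(
  flipper_length_mm,
  error = function(e) conditionMessage(e)
)
print(result)

# The $ operator looks the variable up inside the data frame.
flippers <- toy_penguins$flipper_length_mm
print(flippers)
```

With the real data, the same fix would read penguins$flipper_length_mm.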
function(argument)
Functions are (most often) verbs, followed by what they will be applied to in parentheses:
trim
med mean()
Object documentation can be accessed with ?
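As a small illustration of the function(argument) pattern, the sketch below calls mean() and median() on a made-up vector. The values are hypothetical; trim is an actual argument of mean() that drops a fraction of observations from each end before averaging.

```r
# Made-up values with one extreme observation.
x <- c(1, 2, 3, 4, 100)

# function(argument): the verb, then what it applies to.
plain_mean <- mean(x)

# A second, named argument changes how the function behaves:
# trim = 0.2 drops the lowest and highest 20% of values first.
trimmed_mean <- mean(x, trim = 0.2)

middle_value <- median(x)

print(c(plain_mean, trimmed_mean, middle_value))

# In an interactive session, ?mean opens the help page that documents
# the trim argument.
```

Here the plain mean is 22, while the trimmed mean and the median are both 3, which shows how a single argument can change the answer.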
I cleaned up your data!
No thanks to the people who responded “yo” or “The fifth day in the month of October!!” or “may twentieth”! 🤣
Dates with more than one student with a birthday:
# A tibble: 65 × 2
birthday n
<chr> <int>
1 04-05 4
2 04-15 4
3 05-20 4
4 06-18 4
5 10-25 4
6 01-24 3
7 03-15 3
8 03-29 3
9 04-19 3
10 04-22 3
11 05-15 3
12 06-25 3
13 07-26 3
14 08-29 3
15 09-11 3
16 11-19 3
17 12-20 3
18 01-07 2
19 01-08 2
20 01-13 2
21 01-20 2
22 01-21 2
23 01-23 2
24 01-25 2
25 01-27 2
26 02-07 2
27 02-18 2
28 03-07 2
29 03-14 2
30 03-16 2
31 03-22 2
32 03-23 2
33 03-28 2
34 03-30 2
35 04-06 2
36 04-12 2
37 05-01 2
38 05-05 2
39 05-11 2
40 05-12 2
41 05-28 2
42 05-30 2
43 06-02 2
44 06-04 2
45 06-28 2
46 07-22 2
47 08-09 2
48 08-16 2
49 09-04 2
50 09-15 2
51 09-17 2
52 09-19 2
53 09-28 2
54 09-30 2
55 10-12 2
56 10-14 2
57 10-28 2
58 11-03 2
59 11-06 2
60 11-07 2
61 11-14 2
62 11-16 2
63 11-17 2
64 11-21 2
65 12-03 2
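A table like the one above could be produced by counting how often each date occurs and keeping the dates that occur more than once. The output shown is a tibble, which suggests the original used tidyverse functions; the sketch below swaps in base R so that it runs without extra packages, and the birthdays vector holds made-up values in the same MM-DD format.

```r
# Made-up birthdays in MM-DD format (not the class data).
birthdays <- c("04-05", "04-05", "10-25", "01-24", "10-25", "04-05", "12-31")

# Count how many times each date appears.
counts <- table(birthdays)

# Keep dates shared by more than one person, most common first.
shared <- sort(counts[counts > 1], decreasing = TRUE)

# Arrange the result as a two-column data frame: birthday and n.
shared_df <- data.frame(
  birthday = names(shared),
  n = as.integer(shared),
  stringsAsFactors = FALSE
)
print(shared_df)
```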
GitHub is the home for your Git-based projects on the internet – like Dropbox but much, much better
We will use GitHub as a platform for web hosting and collaboration (and as our course management system!)
with human readable messages
Option 1:
Sit back and enjoy the show!
Note
You’ll need to stick to this option if you haven’t yet accepted your GitHub invite and don’t have a repo created for you.
Option 2:
Go to the course GitHub organization and clone the ae-your_github_name repo to your container.
Find your application exercise repo, which will always be named using the convention assignment_title-your_github_name
Click on the green “Code” button, make sure SSH is selected, copy the repo URL
yes in the pop-up dialogue
Never received GitHub invite → Fill out the “Getting to know you” survey
Never accepted GitHub invite → Look for it in your email and accept it
Cloning repo fails → Review/redo Lab 0 steps for setting up SSH key
Still no luck? Come by my office today (Thursday, 8/29) between 4 and 5 pm or post on Ed for help
Option 1:
Sit back and enjoy the show!
Note
If you chose (or had to choose) this option for the previous tour, or if you couldn’t clone your repo for any reason, you’ll need to stick to this option.
Option 2:
Go to RStudio and open the document ae-01-meet-the-penguins.qmd.
Once we made changes to our Quarto document, we
went to the Git pane in RStudio
staged our changes by clicking the checkboxes next to the relevant files
committed our changes with an informative commit message
pushed our changes to our application exercise repos
confirmed on GitHub that we could see our changes pushed from RStudio
Grab one before you leave!