Final review
Suggested answers
- Take a random sample of size 25, with replacement, from the original sample. Calculate the proportion of students in this simulated sample who work 5 or more hours. Repeat this process 1000 times to build the bootstrap distribution. Take the middle 95% of this distribution to construct a 95% confidence interval for the true proportion of statistics majors who work 5 or more hours.
- The exact 95% CI is (40%, 80%). Answers reasonably close to the upper and lower bounds would be accepted.
- (e) None of the above. The correct interpretation is “We are 95% confident that 40% to 80% of statistics majors work at least 5 hours per week.”
- (c) For every additional $1,000 of annual salary, the model predicts the raise to be higher, on average, by 0.0155%.
- $R^2$ of `raise_2_fit` is higher than $R^2$ of `raise_1_fit` since `raise_2_fit` has one more predictor and $R^2$ always increases (or at least never decreases) when a predictor is added.
- The reference level of `performance_rating` is High, since it’s the first level alphabetically. Therefore, the coefficient -2.40% is the predicted difference in raise for Successful compared to High. In this context a negative coefficient makes sense, since we would expect those with a High performance rating to get higher raises than those with a Successful performance rating.
- (a) “Poor”, “Successful”, “High”, “Top”.
- Option 3. It’s a linear model with no interaction effect, so the lines are parallel. And since the coefficient for `salary_typeSalaried` is positive, the intercept of the Salaried line is higher. The equations of the lines are as follows:

  Hourly:

  \[ \begin{align*} \widehat{percent\_incr} &= 1.24 + 0.0000137 \times annual\_salary + 0.913 \times salary\_typeSalaried \\ &= 1.24 + 0.0000137 \times annual\_salary + 0.913 \times 0 \\ &= 1.24 + 0.0000137 \times annual\_salary \end{align*} \]

  Salaried:

  \[ \begin{align*} \widehat{percent\_incr} &= 1.24 + 0.0000137 \times annual\_salary + 0.913 \times salary\_typeSalaried \\ &= 1.24 + 0.0000137 \times annual\_salary + 0.913 \times 1 \\ &= 2.153 + 0.0000137 \times annual\_salary \end{align*} \]
- A parsimonious model is the simplest model with the best predictive performance.
- (c) The model predicts that the percentage increase for employees with a Successful performance rating is, on average, higher by a factor of 1025 compared to employees with a Poor performance rating.
- (a) and (d).
- (c) We are 95% confident that the mean number of texts per month of all American teens is between 1450 and 1550.
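
The bootstrap procedure described in the first answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the original sample is not reproduced in this answer key, so the 0/1 data below (15 of 25 students working 5 or more hours) are hypothetical, as are the function name `bootstrap_ci` and the fixed seed.

```python
import random


def bootstrap_ci(sample, n_reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for a proportion.

    `sample` is a list of 0/1 values (1 = works 5 or more hours).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    props = []
    for _ in range(n_reps):
        # Resample n observations WITH replacement from the original sample
        resample = rng.choices(sample, k=n)
        props.append(sum(resample) / n)

    # Keep the middle `level` share of the sorted bootstrap proportions
    props.sort()
    tail = (1 - level) / 2
    lo_idx = round(tail * n_reps)
    hi_idx = round((1 - tail) * n_reps) - 1
    return props[lo_idx], props[hi_idx]


# Hypothetical sample of size 25 (NOT the actual exam data)
sample = [1] * 15 + [0] * 10
lower, upper = bootstrap_ci(sample)
print(f"95% bootstrap CI: ({lower:.2f}, {upper:.2f})")
```

Because resampling is random, the bounds vary a little from run to run and seed to seed, which is why answers reasonably close to the stated bounds are accepted.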
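
The parallel-lines reasoning in the Option 3 answer can also be checked numerically. The sketch below plugs the coefficients from that answer into a small helper; the function name `predict_percent_incr` and the example salaries are assumptions made for illustration.

```python
# Coefficients taken from the fitted model in the Option 3 answer
INTERCEPT = 1.24
SLOPE_SALARY = 0.0000137
COEF_SALARIED = 0.913


def predict_percent_incr(annual_salary, salaried):
    """Predicted percent increase for an hourly or salaried employee."""
    indicator = 1 if salaried else 0  # salary_typeSalaried dummy variable
    return INTERCEPT + SLOPE_SALARY * annual_salary + COEF_SALARIED * indicator


# Hypothetical salaries: the gap between the two lines is the same at each one
for salary in (40_000, 80_000, 120_000):
    hourly = predict_percent_incr(salary, salaried=False)
    salaried = predict_percent_incr(salary, salaried=True)
    print(salary, round(hourly, 3), round(salaried, 3), round(salaried - hourly, 3))
```

With no interaction term, the difference between the Salaried and Hourly predictions equals the `salary_typeSalaried` coefficient at every salary, which is what makes the lines parallel.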