Warm Up

Andrew Dickinson

Published

May 21, 2024

Testing Hypotheses in Regression Analysis

Consider a dataset obtained from a study on the impact of various factors on the number of bicyclists hit by motor vehicles in Eugene. The dataset includes a random sample of 26 months and the following variables:

  • \(\text{Hits}_i\): The total number of bicyclists hit by motor vehicles in month \(i\).
  • \(\text{Rain}_i\): The total number of days it rained in month \(i\).
  • \(\text{Temp}_i\): The average daytime temperature in month \(i\) (in degrees Celsius).
  • \(\text{Days}_i\): The total number of days in month \(i\).
  • \(\text{WeekDays}_i\): The total number of weekdays (Monday-Friday) in month \(i\).
  • \(\text{School}_i\): A binary variable indicating if the University of Oregon is in session in month \(i\).

You then estimate the following regression models via OLS:

Regression Models

  1. \(\text{Hits}_i = \beta_1 + \beta_2 \text{Rain}_i + \beta_3 \text{Temp}_i + \beta_4 \text{Days}_i + \beta_5 \text{WeekDays}_i + \beta_6 \text{School}_i + u_i\)
  2. \(\text{Hits}_i = \beta_1 + \beta_2 \text{Temp}_i + \beta_3 \text{Days}_i + \beta_4 \text{WeekDays}_i + \beta_5 \text{School}_i + u_i\)
  3. \(\text{Hits}_i = \beta_1 + \beta_2 \text{Days}_i + \beta_3 \text{WeekDays}_i + u_i\)
  4. \(\text{Hits}_i = \beta_1 + \beta_2 \text{Days}_i + \beta_3 \text{WeekDays}_i + \beta_4 \text{School}_i + u_i\)
  5. \(\text{Hits}_i = \beta_1 + \beta_2 \text{Rain}_i + \beta_3 \text{Temp}_i + \beta_4 \text{Days}_i + \beta_5 \text{School}_i + u_i\)
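As a rough sketch of how such models could be estimated in R and their RSS values extracted: the data frame `bike` below is simulated purely for illustration (the study's data are not given), so its RSS values will not match the ones reported in this exercise. The helper `rss()` and all object names are hypothetical.

```r
# Simulated stand-in data: NOT the Eugene dataset, only the same variable names.
set.seed(1)
n <- 26
bike <- data.frame(
  Rain   = rpois(n, 10),                       # rainy days per month
  Temp   = rnorm(n, mean = 12, sd = 6),        # average daytime temperature (Celsius)
  Days   = sample(28:31, n, replace = TRUE),   # days in the month
  School = rep(c(1, 1, 1, 0), length.out = n)  # 1 if the university is in session
)
bike$WeekDays <- bike$Days - sample(8:10, n, replace = TRUE)  # weekdays in the month
bike$Hits     <- rpois(n, 8)                                  # bicyclists hit

# Regression 1 (all regressors) and Regression 4 (drops Rain and Temp)
m1 <- lm(Hits ~ Rain + Temp + Days + WeekDays + School, data = bike)
m4 <- lm(Hits ~ Days + WeekDays + School, data = bike)

# Residual sum of squares for a fitted model
rss <- function(model) sum(resid(model)^2)

rss(m1)
rss(m4)

# For nested models, anova() reports the F-test comparing them
anova(m4, m1)
```

Because Regression 4 is nested in Regression 1, its RSS can never be smaller than that of Regression 1, which is the pattern in the RSS values reported below.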

The corresponding values of RSS for each regression model are as follows:

Regression      RSS
Regression 1    400
Regression 2    410
Regression 3    450
Regression 4    420
Regression 5    445

Questions

1. Interpretation of Coefficients (5 points)

In Regression 1, interpret the coefficient \(\beta_3\) (the coefficient for the variable \(\text{Temp}_i\)). What does this coefficient tell us about the relationship between the average daytime temperature and the number of bicyclists hit by motor vehicles?

Solution:
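One way to frame the answer, sketched as a worked equation and assuming the usual OLS conditions hold (the exercise does not state them): \(\beta_3\) is the partial effect of temperature on expected hits, holding the other regressors in Regression 1 fixed.

\[ \dfrac{\partial\, \mathbb{E}\left[\text{Hits}_i \mid \text{Rain}_i, \text{Temp}_i, \text{Days}_i, \text{WeekDays}_i, \text{School}_i\right]}{\partial\, \text{Temp}_i} = \beta_3 \]

In words: holding the number of rainy days, days in the month, weekdays, and whether school is in session constant, a 1°C increase in the average daytime temperature is associated with a change of \(\beta_3\) in the expected number of bicyclists hit in a month. The exercise does not report an estimate of \(\beta_3\), so its sign and magnitude cannot be interpreted here.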

2. Hypothesis Testing

Test the following hypothesis at the 1% significance level (parameter subscripts refer to Regression 1):

\[ H_0: \beta_2 = \beta_3 = 0 \] \[ H_1: \beta_2 \neq 0 \text{ or } \beta_3 \neq 0 \]

Recall the \(F\)-statistic is given by:

\[ \begin{align} F_{q,\,n-k-1} = \dfrac{\left(\text{RSS}_r - \text{RSS}_u\right)/q}{\text{RSS}_u/(n-k-1)} \end{align} \]

where \(q\) is the number of restrictions, \(n\) is the number of observations, and \(k\) is the number of regressors in the unrestricted model. The critical value for the \(F\)-statistic at the 1% significance level is:

F_crit = qf(0.99, q, n-k-1)
F_crit
[1] 5.925879

Do you reject or fail to reject the null hypothesis? Explain your answer.
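A minimal R sketch of this test, under assumptions not stated explicitly in the exercise: setting \(\beta_2 = \beta_3 = 0\) in Regression 1 leaves Days, WeekDays, and School, which matches Regression 4, so its RSS (420) is used as \(\text{RSS}_r\); and \(k = 5\) counts the slope regressors in Regression 1, excluding the intercept. The variable names are illustrative.

```r
# Inputs taken from the exercise
rss_u <- 400   # unrestricted model: Regression 1
rss_r <- 420   # restricted model: Regression 4 (assumed, drops Rain and Temp)
n <- 26        # observations (months)
k <- 5         # slope regressors in Regression 1 (assumption: intercept not counted)
q <- 2         # restrictions: beta_2 = 0 and beta_3 = 0

df2 <- n - k - 1                                  # denominator degrees of freedom
F_stat <- ((rss_r - rss_u) / q) / (rss_u / df2)   # (20 / 2) / (400 / 20)
F_crit <- qf(0.99, q, df2)                        # 1% critical value

F_stat           # 0.5
F_crit
F_stat > F_crit  # FALSE: fail to reject H_0
```

With these inputs the statistic is 0.5, far below the 1% critical value, so this sketch fails to reject \(H_0\): the data do not show that rain and temperature jointly matter once the other regressors are included. Note that \(k = 5\) gives 20 denominator degrees of freedom, whereas the critical value printed above (5.925879) appears to correspond to 19; the conclusion is the same either way.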

3. Calculating \(R^2\)

Suppose the total sum of squares (TSS) is 500. What is the \(R^2\) for Regression 3?
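A worked version of this calculation, assuming the standard definition \(R^2 = 1 - \text{RSS}/\text{TSS}\) for a model with an intercept and taking the RSS for Regression 3 (450) from the table above:

\[ R^2 = 1 - \dfrac{\text{RSS}}{\text{TSS}} = 1 - \dfrac{450}{500} = 0.10 \]

Under that definition, Regression 3 would explain about 10% of the variation in monthly hits.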