PS02

Due date: April 23 @ 11:59p

Instructions

Complete the questions below and show all work. You may either type or handwrite your answers. However, you must submit your problem set to Canvas as an html or pdf, meaning any handwritten solutions must be scanned and uploaded. The onus is yours to deliver an organized, clear, and legible submission of the correct file type that loads properly on Canvas.1 Double-check your submissions!

Integrity: If you are suspected of cheating, you will receive a zero—for the assignment and possibly for the course. Cheating includes copying work from your classmates, from the internet, and from previous problem sets. You are encouraged to work with your peers, but everyone must submit their own answers. Remember, the problem sets are designed to help you study for the midterm. By cheating, you do yourself a disservice.

Questions

001. Simple Linear Regression without an Intercept

In a simple linear regression model, the intercept term represents the expected value of the dependent variable when the independent variable is zero. However, in some cases it might not make sense to include an intercept, for instance when the dependent variable should be zero whenever the independent variable is zero. In such situations, a simple linear regression without an intercept might be more appropriate. Suppose we have the following simple linear regression model without an intercept:

\[ Y_i = \beta X_i + u_i \]

where the corresponding residuals are written as:

\[ \hat{u}_i = Y_i - \hat{\beta} X_i \]

In this question, we will derive the OLS estimate of \(\beta\).2

a. Describe in words what the objective of the OLS estimator is and how the first order condition reaches that objective.

The objective of the OLS estimator, in the context of a model without an intercept, is to find the \(\hat{\beta}\) that minimizes the sum of squared residuals. The intuition behind the FOC is that the minimum occurs where the derivative of the RSS with respect to \(\beta\) equals zero. Because the RSS is a smooth function of \(\beta\), any minimum (or maximum) must occur at a point where this derivative is zero.

b. Set up and solve the first order condition. (i.e. Find \(\frac{\partial \text{RSS}}{\partial \hat{\beta}}=0\))

To set up the first order condition, we choose the \(\beta\) that minimizes the sum of squared residuals. Writing the RSS as a function of \(\beta\):

\[ \text{RSS}(\beta) = \sum_{i=1}^{n} (y_i - \beta x_i)^2 \]

We can differentiate the RSS function with respect to \(\beta\) and set the resulting expression to zero. This will help us find the minimum of the function.

\[ \frac{\partial \text{RSS}(\beta)}{\partial \beta} = -2 \sum_{i=1}^{n} x_i (y_i - \beta x_i) = 0 \]

c. Solve for the simple OLS estimator (i.e., Solve for \(\beta\) from the first order condition above).

Solving for \(\beta\):

\[ \sum_{i=1}^{n} x_i y_i - \beta \sum_{i=1}^{n} x_i^2 = 0 \]

Rearranging to isolate \(\beta\):

\[ \beta = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \]

Thus, the OLS estimator for a simple linear regression without an intercept is given by:

\[ \hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \]
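As a quick sanity check, the closed-form estimator above can be compared against R's lm() with the intercept suppressed. A minimal sketch using simulated data; the object names (x, y, b_formula, b_lm) are illustrative and not part of the problem set:

# Simulate data from a through-the-origin model (illustrative only)
set.seed(123)
x <- runif(50, 1, 10)
y <- 2 * x + rnorm(50)

# Closed-form estimate derived above: sum of x*y over sum of x^2
b_formula <- sum(x * y) / sum(x^2)

# lm() with the intercept suppressed (y ~ 0 + x) should agree
b_lm <- coef(lm(y ~ 0 + x))[["x"]]

all.equal(b_formula, b_lm)  # TRUE, up to floating-point tolerance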



002. OLS Estimation

Suppose we have the following data on a dependent variable (Y) and an independent variable (X):

Observation Number    Y    X
1                    10   12
2                     2   14
3                     6   16

We wish to estimate the simple linear regression model:

\[ Y_i = \beta_1 + \beta_2X_i + u_i \]

a. Calculate the OLS estimates \(\hat{\beta}_1\) and \(\hat{\beta}_2\). Show your work.

\[ \bar{Y} = \frac{1}{3}(10 + 2 + 6) = 6, \quad \bar{X} = \frac{1}{3}(12 + 14 + 16) = 14 \]

\[ \hat{\beta}_2 = \frac{(10 - 6)(12 - 14) + (2 - 6)(14 - 14) + (6 - 6)(16 - 14)}{(12 - 14)^2 + (14 - 14)^2 + (16 - 14)^2} \]

\[ = \frac{-8}{8} = -1 \]

\[ \hat{\beta}_1 = 6 - (-1)(14) = 6 + 14 = 20 \]
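These hand calculations can be verified in R. A minimal sketch, assuming the three observations from the table above (the object names Y, X, b1, and b2 are illustrative):

Y <- c(10, 2, 6)
X <- c(12, 14, 16)

# Slope: sum of cross-deviations over sum of squared deviations in X
b2 <- sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)  # -1
# Intercept: mean of Y minus slope times mean of X
b1 <- mean(Y) - b2 * mean(X)                                     # 20

coef(lm(Y ~ X))  # (Intercept) = 20, X = -1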

b. Calculate the \(R^2\) for this regression. Show your work.

\[ R^2 = \frac{ESS}{TSS} \]

TSS (Total Sum of Squares) calculation:

\[ TSS = (10 - 6)^2 + (2 - 6)^2 + (6 - 6)^2 = 32 \]

Estimated values (\(\hat{Y_i}\)) using the estimated regression equation:

\[ \hat{Y_1} = 20 - 12 = 8 \]

\[ \hat{Y_2} = 20 - 14 = 6 \]

\[ \hat{Y_3} = 20 - 16 = 4 \]

ESS (Explained Sum of Squares) calculation:

\[ ESS = (8 - 6)^2 + (6 - 6)^2 + (4 - 6)^2 = 8 \]

Finally, calculating \(R^2\):

\[ R^2 = \frac{8}{32} = \frac{1}{4} \]
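The same decomposition can be checked with a short R sketch, again assuming the three observations from the table (Y_hat, TSS, and ESS are illustrative names):

Y <- c(10, 2, 6)
X <- c(12, 14, 16)
Y_hat <- 20 - 1 * X                  # fitted values from part (a)

TSS <- sum((Y - mean(Y))^2)          # 32
ESS <- sum((Y_hat - mean(Y))^2)      # 8
ESS / TSS                            # 0.25

summary(lm(Y ~ X))$r.squared         # also 0.25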




003. Interpretation of OLS Estimates

Suppose you are trying to figure out how to make cars more fuel efficient, so you run a regression of miles per gallon on weight to estimate the effect of weight on MPG.

\[ mpg_i = \beta_0 + \beta_1 wt_i + u_i \] You run a regression, and get the following output:

# A tibble: 2 × 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)    37.3      1.88      19.9  8.24e-19
2 wt             -5.34     0.559     -9.56 1.29e-10

a. Give the interpretation of the two estimates.

A 1000 lb increase in weight is associated with a 5.34 mpg decrease in fuel efficiency, on average. The intercept implies that a car weighing zero gets 37.29 mpg on average; since no car weighs zero, this is an extrapolation with no practical meaning, but it pins down the level of the regression line.

b. The regression in a. produced an R-squared of \(0.75\). What does this value tell us?

The R-squared tells us that \(75\%\) of the variation in miles per gallon is explained by weight.
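The coefficient table and R-squared above are consistent with a regression of mpg on wt in R's built-in mtcars data (where wt is measured in 1,000 lbs). Assuming that is the data source, a sketch that reproduces the output:

library(broom)

fit <- lm(mpg ~ wt, data = mtcars)
tidy(fit)                  # coefficient table like the one shown above
summary(fit)$r.squared     # approximately 0.75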

Footnotes

  1. Do not simply change the file type and submit (e.g., editing the name of the document from .jpg \(\rightarrow\) .pdf).

  2. Hint: This derivation follows the derivation of OLS with an intercept, with a lot less algebra. The result will look different yet be functionally equivalent to the standard result.