EC 320, Set 03
Spring 2024
PS02:
Reading: (up to this point)
ItE: R, 1, 2
K06 & K06b
dplyr lesson
lm() lesson
We considered a simple linear regression of \(Y_i\) on \(X_i\):
\[ Y_i = \beta_0 + \beta_1X_i + u_i. \]
We derived the OLS estimator by minimizing \(\sum_{i=1}^n \hat{u}_i^2\).
Intercept:
\[ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \]
Slope:
\[ \begin{aligned} \hat{\beta}_1 &= \dfrac{\sum_{i=1}^n (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^n (X_i - \bar{X})^2} \end{aligned} \]
We used these formulas to obtain estimates of the parameters \(\beta_0\) and \(\beta_1\) in a regression of \(Y_i\) on \(X_i\).
With the OLS estimates, we constructed a regression line:
\[ \hat{Y_i} = \hat{\beta}_0 + \hat{\beta}_1X_i. \]
OLS still produces prediction errors: \(\hat{u}_i = Y_i - \hat{Y_i}\).
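As a quick illustration, here is a minimal R sketch that computes the OLS formulas "by hand" and compares them with `lm()`; the built-in `mtcars` data (regressing `mpg` on `wt`, which reappears later in these slides) is used as an assumed example.

```r
# Minimal sketch: OLS intercept and slope "by hand" vs. lm()
# (assumed example: mpg regressed on wt from the built-in mtcars data)
y <- mtcars$mpg
x <- mtcars$wt

b1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope formula
b0_hat <- mean(y) - b1_hat * mean(x)                                 # intercept formula

c(intercept = b0_hat, slope = b1_hat)
coef(lm(mpg ~ wt, data = mtcars))  # should match the manual estimates
```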
What is the equation for the regression model estimated below?
The estimated intercept is -11.13. What does this tell us?
With zero years of education, estimated hourly earnings would be –$11.13
The estimated slope is 2.32. How do we interpret it?
For each additional year of schooling, hourly earnings increase by $2.32
OLS has three important properties:
The point \((\bar{X}, \bar{Y})\) is always on the regression line.
\[ \begin{aligned} \hat{\beta}_0 + \hat{\beta}_1\bar{X} &= \bar{Y} - \hat{\beta}_1 \bar{X} + \hat{\beta}_1\bar{X} \\ &= \bar{Y} \end{aligned} \]
Residuals sum to zero: \(\sum_{i=1}^n \hat{u}_i = 0\)
Steps
\[ \sum_{i=1}^n \hat{u}_i = \sum_{i=1}^n \left( y_i - \hat{\beta}_0 - \hat{\beta}_1x_i \right) \]
Substituting \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}\):
\[ \sum_{i=1}^n \hat{u}_i = \sum_{i=1}^n y_i - \sum_{i=1}^n\bar{y} + \hat{\beta}_1\sum_{i=1}^n\bar{x} - \hat{\beta}_1 \sum_{i=1}^n x_i \]
\[\begin{align*} \sum_{i=1}^n \hat{u}_i & = \sum_{i=1}^n y_i - n \left( \frac{1}{n} \sum_{i=1}^n y_i \right) + \hat{\beta}_1 n \left(\frac{1}{n} \sum_{i=1}^n x_i\right) - \hat{\beta}_1 \sum_{i=1}^n x_i \\ &= \sum_{i=1}^n \left(y_i - y_i\right) + \hat{\beta}_1 \sum_{i=1}^n\left( x_i - x_i\right) \\ &=0 \end{align*}\]
Thus, the sum of the residuals is zero
Why is this true?
Intercept Alignment: the estimated intercept \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}\) forces the fitted line through \((\bar{x}, \bar{y})\).
Balanced Deviations: the positive and negative residuals exactly offset one another.
In one sentence:
The fitted line, centered through \(\bar{x}\) and \(\bar{y}\), balances the positive and negative deviations to minimize the total error
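A quick numerical check of this property, sketched in R under the assumption that we use the mpg-on-wt regression from the mtcars data introduced later in these slides:

```r
# Check: OLS residuals sum to zero (up to floating-point error)
fit <- lm(mpg ~ wt, data = mtcars)   # assumed example regression
sum(resid(fit))                      # ~ 0
# Property 1 as well: the fitted line passes through (x-bar, y-bar)
predict(fit, newdata = data.frame(wt = mean(mtcars$wt))) - mean(mtcars$mpg)  # ~ 0
```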
Using the covariance formula, and Property 2:
\[\begin{equation} \mathrm{Cov}(x_i,\hat{u}_i) = \frac{1}{n-1}\sum_{i=1}^n(x_i-\bar{x})(\hat{u}_i-\bar{\hat{u}}) \end{equation}\]
Since \(\sum_{i=1}^n \hat{u}_i = 0\), the mean residual \(\bar{\hat{u}} = 0\) and drops out of the formula.
\[\begin{align*} \mathrm{Cov}(x_i,\hat{u}_i) &= \frac{1}{n-1}\sum_{i=1}^n\left(x_i\hat{u}_i-\bar{x}\hat{u}_i\right)\\ &= \frac{1}{n-1}\sum_{i=1}^nx_i(y_i - \hat{\beta}_0 - \hat{\beta}_1x_i)-\frac{\bar{x}}{n-1}\sum_{i=1}^n\hat{u}_i\\ &= \frac{1}{n-1}\left(\sum_{i=1}^nx_iy_i - \hat{\beta}_0\sum_{i=1}^nx_i - \hat{\beta}_1\sum_{i=1}^nx_i^2\right) \\ &= \frac{1}{2(n-1)}\left(2\sum_{i=1}^nx_iy_i - 2\hat{\beta}_0\sum_{i=1}^nx_i - 2\hat{\beta}_1\sum_{i=1}^nx_i^2 \right)\\ &= \frac{1}{2(n-1)}\left(0 \right)\\ &= 0 \end{align*}\]
Where
\[ 2\sum_{i=1}^nx_iy_i - 2\hat{\beta}_0\sum_{i=1}^nx_i - 2\hat{\beta}_1\sum_{i=1}^nx_i^2=0 \]
follows from the first-order condition for \(\hat{\beta}_1\).
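The same regression can be used to check this property numerically; a minimal sketch:

```r
# Check: the regressor and the OLS residuals are uncorrelated
fit <- lm(mpg ~ wt, data = mtcars)   # assumed example regression
cov(mtcars$wt, resid(fit))           # ~ 0 up to floating-point error
```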
Regression 1 vs. Regression 2
Q: Which fitted regression line “explains” the data better?
The coefficient of determination, \(R^2\), is the fraction of the variation in \(Y_i\) “explained” by \(X_i\).
Residuals remind us that there are parts of \(Y_i\) we can’t explain.
\[ Y_i = \hat{Y_i} + \hat{u}_i \]
\[ \bar{\hat{u}} = 0 \implies \bar{Y} = \bar{\hat{Y}} \]
Total Sum of Squares (TSS) measures variation in \(Y_i\):
\[ {\color{#BF616A} \text{TSS}} \equiv \sum_{i=1}^n (Y_i - \bar{Y})^2 \]
Explained Sum of Squares (ESS) measures the variation in \(\hat{Y_i}\):
\[ {\color{#EBCB8B} \text{ESS}} \equiv \sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2 \]
Residual Sum of Squares (RSS) measures the variation in \(\hat{u}_i\):
\[ {\color{#D08770} \text{RSS}} \equiv \sum_{i=1}^n \hat{u}_i^2 \]
Goal: Show that \({\color{#BF616A} \text{TSS}} = {\color{#EBCB8B} \text{ESS}} + {\color{#D08770} \text{RSS}}\).
Step 1: Plug \(Y_i = \hat{Y_i} + \hat{u}_i\) into \({\color{#BF616A} \text{TSS}}\).
\[ \begin{align*} {\color{#BF616A} \text{TSS}} &= \sum_{i=1}^n (Y_i - \bar{Y})^2 \\ &= \sum_{i=1}^n ([\hat{Y_i} + \hat{u}_i] - [\bar{\hat{Y}} + \bar{\hat{u}}])^2 \end{align*} \]
Step 2: Recall that \(\bar{\hat{u}} = 0\) & \(\bar{Y} = \bar{\hat{Y}}\).
\[ \begin{align*} {\color{#BF616A} \text{TSS}} &= \sum_{i=1}^n \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right)^2 \\ &= \sum_{i=1}^n \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right) \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right) \\ &= \sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2 + \sum_{i=1}^n \hat{u}_i^2 + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right) \end{align*} \]
Step 3: Notice ESS and RSS.
\[ \begin{align*} {\color{#BF616A} \text{TSS}} &= \sum_{i=1}^n \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right)^2 \\ &= \sum_{i=1}^n \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right) \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right) \\ &= {\color{#EBCB8B}{\sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2}} + {\color{#D08770}{\sum_{i=1}^n \hat{u}_i^2}} + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right) \\ &= {\color{#EBCB8B} \text{ESS}} + {\color{#D08770} \text{RSS}} + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right) \end{align*} \]
Step 4: Simplify.
\[ \begin{align*} {\color{#BF616A} \text{TSS}} &= {\color{#EBCB8B} \text{ESS}} + {\color{#D08770} \text{RSS}} + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right) \\ &= {\color{#EBCB8B} \text{ESS}} + {\color{#D08770} \text{RSS}} + 2 \sum_{i=1}^n \hat{Y_i}\hat{u}_i - 2 \bar{Y}\sum_{i=1}^n \hat{u}_i \end{align*} \]
Step 5: Shut down the last two terms. Notice that
\[ \begin{align*} 2 \sum_{i=1}^n \hat{Y_i}\hat{u}_i - 2 \bar{Y}\sum_{i=1}^n \hat{u}_i = 0 \end{align*} \]
You will prove this in PS03.
Some visual intuition makes all the math seem a lot simpler.
Let’s regress mpg on wt using a new dataset, mtcars.
Figures: scatter plots of the mtcars data showing the sample mean \({\color{#A3BE8C} \overline{\text{MPG}_i}} = 20.09\), the fitted regression line, and the deviations that make up TSS, ESS, and RSS.
\[ {\color{#BF616A} \text{TSS}} \equiv \sum_{i=1}^n (Y_i - \bar{Y})^2 \]
\[ {\color{#EBCB8B} \text{ESS}} \equiv \sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2 \]
\[ {\color{#D08770} \text{RSS}} \equiv \sum_{i=1}^n \hat{u}_i^2 \]
What percentage of the variation in our \(Y_i\) is apparently explained by our model? The \(R^2\) term represents this percentage.
Total variation is represented by TSS and our model is capturing the ‘explained’ sum of squares, ESS.
Taking a simple ratio reveals how much variation our model explains.
\(R^2 = \frac{{\color{#EBCB8B} \text{ESS}}}{{\color{#BF616A} \text{TSS}}}\) varies between 0 and 1
\(R^2 = 1 - \frac{{\color{#D08770} \text{RSS}}}{{\color{#BF616A} \text{TSS}}}\), 100% less the unexplained variation
\(R^2\) is related to the correlation between the actual values of \(Y\) and the fitted values of \(Y\). Can show that \(R^2 = (r_{Y, \hat{Y}})^2\).
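Here is a minimal R check of the decomposition and of the equivalent ways to compute \(R^2\), assuming the mpg-on-wt regression from above:

```r
# TSS = ESS + RSS, and R^2 three ways (assumed example: mpg on wt)
fit   <- lm(mpg ~ wt, data = mtcars)
y     <- mtcars$mpg
y_hat <- fitted(fit)
u_hat <- resid(fit)

TSS <- sum((y - mean(y))^2)
ESS <- sum((y_hat - mean(y))^2)
RSS <- sum(u_hat^2)

all.equal(TSS, ESS + RSS)                       # TRUE
c(ESS / TSS, 1 - RSS / TSS, cor(y, y_hat)^2)    # all equal R^2
summary(fit)$r.squared                          # same number from lm()
```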
So what? In the social sciences, low \(R^2\) values are common.
Low \(R^2\) doesn’t mean that an estimated regression is useless.
High \(R^2\) doesn’t necessarily mean you have a “good” regression.
Policy Question: How much should the state subsidize higher education?
Empirical Question: What is the return to an additional year of education?
Step 1: Write down the population model.
\[ \log(\text{Earnings}_i) = \beta_1 + \beta_2\text{Education}_i + u_i \]
Step 2: Find data.
Step 3: Run a regression using OLS.
\[ \log(\hat{\text{Earnings}_i}) = \hat{\beta}_1 + \hat{\beta}_2\text{Education}_i \]
Logarithms are functions that are convenient for numbers that span several orders of magnitude.
The natural logarithm
The inverse of the exponential function with base \(e\), written as:
\[ \log{x} \quad \textit{or} \quad \ln{x} \]
Simplifies interpretation and stabilizes relationships between variables.
Our example model: \[\log(\text{Earnings}_i) = \beta_1 + \beta_2\text{Education}_i + u_i\]
Natural logarithm is applied to the dependent variable (Earnings).
The slope now has a percentage interpretation: \(100 \cdot \beta_2\) is (approximately) the percentage change in earnings for a one-unit increase in education.
\(\log(\hat{\text{Earnings}_i})\) \(=\) 5.97 \(+\) 0.06 \(\times\) \(\text{Education}_i\).
Additional year of school associated with a 6% increase in earnings.
\(R^2\) \(=\) 0.097.
Education explains 9.7% of the variation in wages.
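For reference, a hypothetical sketch of how such a regression is run in R; `earnings_df`, `earnings`, and `education` are placeholder names, not the actual dataset behind the estimates above.

```r
# Hypothetical sketch: log-level regression of earnings on education
# (earnings_df and its columns are placeholder names)
fit_log <- lm(log(earnings) ~ education, data = earnings_df)
coef(fit_log)               # slope * 100 ~ percent change in earnings per year of education
summary(fit_log)$r.squared  # share of variation in log earnings explained
```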
What must we assume to interpret \(\hat{\beta}_2\) \(=\) 0.06 as the return to schooling?
The most important assumptions concern the error term \(u_i\).
Important: An error \(u_i\) and a residual \(\hat{u}_i\) are related, but different.
Error:
Difference between the wage of a worker with 16 years of education and the expected wage with 16 years of education.
Residual:
Difference between the wage of a worker with 16 years of education and the average wage of workers with 16 years of education.
Population vs. sample
A residual tells us how a worker’s wages compare to the average wages of workers in the sample with the same level of education.
An error tells us how a worker’s wages compare to the expected wages of workers in the population with the same level of education.
A1. Linearity: The population relationship is linear in parameters with an additive error term.
Examples and violations of A1 (illustrated graphically).
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
Example and violation of A2 (illustrated graphically).
As we will see later, sample variation matters for inference as well.
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
Assumption
The \(X\) variable is exogenous:
\[ \mathop{\mathbb{E}}\left( u|X \right) = 0 \]
The assignment of \(X\) is effectively random.
Implication: no selection bias or omitted variable bias
Example
In the labor market, an important component of \(u\) is unobserved ability.
Graphically…
Valid exogeneity, i.e., \(\mathop{\mathbb{E}}\left( u \mid X \right) = 0\)
Invalid exogeneity, i.e., \(\mathop{\mathbb{E}}\left( u \mid X \right) \neq 0\)
When can we trust OLS?
An estimator is biased if its expected value is different from the true population parameter.
Unbiased estimator: \(\mathop{\mathbb{E}}\left[ \hat{\beta} \right] = \beta\)
Biased estimator: \(\mathop{\mathbb{E}}\left[ \hat{\beta} \right] \neq \beta\)
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
☛ (3) implies random sampling. Without it, our external validity becomes uncertain.
Let’s prove unbiasedness of OLS
Proving unbiasedness of simple OLS
Suppose we have the following model
\[ Y_i = \beta_1 + \beta_2 X_i + u_i \]
The slope parameter follows as:
\[ \hat{\beta}_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2} \]
As shown in section 2.3 of ItE, the estimator \(\hat{\beta}_2\) can be broken up into a nonrandom and a random component:
Proving unbiasedness of simple OLS
Substitute for \(y_i\):
\[ \hat{\beta}_2 = \frac{\sum((\beta_1 + \beta_2x_i + u_i) - \bar{y})(x_i - \bar{x})}{\sum(x_i - \bar{x})^2} \]
Substitute \(\bar{y} = \beta_1 + \beta_2\bar{x}\):
\[ \hat{\beta}_2 = \frac{\sum(u_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2} + \frac{\sum(\beta_2x_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2} \]
The non-random component, \(\beta_2\), is factored out:
\[ \hat{\beta}_2 = \frac{\sum(u_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2} + \beta_2\frac{\sum(x_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2} \]
Proving unbiasedness of simple OLS
Observe that the fraction in the second term equals 1, since \(\sum x_i(x_i - \bar{x}) = \sum(x_i - \bar{x})^2\). Thus, we have:
\[ \hat{\beta}_2 = \beta_2 + \frac{\sum(u_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2} \]
Taking the expectation,
\[ \mathrm{E}[\hat{\beta}_2] = \mathrm{E}[\beta_2] + \mathrm{E}\left[\frac{\sum u_i (x_i - \bar{x})}{\sum(x_i - \bar{x})^2} \right] \]
By Rules 01 and 02 of expected value and A3:
\[ \mathrm{E}[\hat{\beta}_2] = \beta_2 + \frac{\sum \mathrm{E}[u_i] (x_i - \bar{x})}{\sum(x_i - \bar{x})^2} = \beta_2 \]
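A Monte Carlo sketch of this result in R; the population parameters (2.53 and 0.57) and \(n = 30\) echo the simulation described later in these slides, while the exact error and \(X\) distributions below are assumptions.

```r
# Monte Carlo sketch: the OLS slope is unbiased under A1-A3
set.seed(320)                       # assumed seed, for reproducibility
beta_1 <- 2.53; beta_2 <- 0.57      # population parameters (as in the later simulation)
n <- 30; reps <- 10000

b2_hat <- replicate(reps, {
  x <- rnorm(n, sd = 1.5)
  u <- rnorm(n, sd = 1)             # E(u | x) = 0 holds by construction (A3)
  y <- beta_1 + beta_2 * x + u
  coef(lm(y ~ x))[2]
})

mean(b2_hat)                        # ~ 0.57: centered on the true slope
```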
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
☛ (3) implies random sampling. Without it, our external validity becomes uncertain.
Result: OLS is unbiased.
The following 2 assumptions are not required for unbiasedness…
But they are important for an efficient estimator.
Variance matters, too
Unbiasedness tells us that OLS gets it right, on average. But we can’t tell whether our sample is “typical.”
Variance tells us how far the OLS estimates can deviate from their expected value, the true population parameter.
The smaller the variance, the closer OLS gets, on average, to the true population parameters in any given sample.
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
A4. Homoskedasticity: The error term has the same variance for each value of the independent variable
The error term has the same variance for each value of the independent variable:
\[ \mathop{\mathrm{Var}}(u|X) = \sigma^2. \]
Example: homoskedastic errors (illustrated graphically).
Violation: Heteroskedasticity
Suppose we study the following relationship:
\[ \text{Luxury Expenditure}_i = \beta_1 + \beta_2 \text{Income}_i + u_i \]
As income increases, variation in luxury expenditures increases.
Common test for heteroskedasticity… Plot the residuals across \(X\)
There is more to be said about homo/heteroskedasticity in EC 421
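A sketch of that residual plot in R, using simulated income and luxury-expenditure data (all numbers below are made up for illustration):

```r
# Simulated heteroskedastic data: error variance grows with income
set.seed(320)
n      <- 500
income <- runif(n, 20, 200)
u      <- rnorm(n, sd = 0.5 * income)     # spread of u increases with income
luxury <- 5 + 0.3 * income + u

fit <- lm(luxury ~ income)
plot(income, resid(fit),
     xlab = "Income", ylab = "Residuals")  # residuals fan out as income rises
abline(h = 0, lty = 2)
```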
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
A4. Homoskedasticity: The error term has the same variance for each value of the independent variable
A5. Non-autocorrelation: The values of error terms have independent distributions
The values of error terms have independent distributions
\[ E[u_i u_j]=0, \forall i \text{ s.t. } i \neq j \]
Or…
\[ \begin{align*} \mathop{\text{Cov}}(u_i, u_j) &= E[(u_i - \mu_u)(u_j - \mu_u)]\\ &= E[u_i u_j] = E[u_i] E[u_j] = 0, \text{where } i \neq j \end{align*} \]
Let’s take a moment to talk about the variance of the OLS estimator
Proof of the population variance
By definition
\[ Var(\hat{\beta}_2) = E\left[\left(\hat{\beta}_2 - E(\hat{\beta}_2)\right)^2\right] = E\left[\left(\hat{\beta}_2 - \beta_2\right)^2\right] \]
Step 1: Derive an expression for \(\hat{\beta}_2 - \beta_2\):
\[ \hat{\beta}_2 - \beta_2 = \frac{\sum(y_i - \bar{y})(x_i - \bar{x})}{\sum(x_i - \bar{x})^2} - \beta_2 \]
Substitute for \(Y_i\):
\[ \hat{\beta}_2 - \beta_2 = \frac{\sum((\beta_1 + \beta_2x_i + u_i) - \bar{y})(x_i - \bar{x})}{\sum(x_i - \bar{x})^2} - \beta_2 \]
Proof of the population variance
Substitute \(\bar{y} = \beta_1 + \beta_2\bar{x}\):
\[ \hat{\beta}_2 - \beta_2 = \frac{\sum(u_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2} \]
Step 2: Take the variance of \(\hat{\beta}_2 - \beta_2\):
\[ Var(\hat{\beta}_2) = E\left[\left(\frac{\sum(u_i(x_i - \bar{x}))}{\sum(x_i - \bar{x})^2}\right)^2\right] \]
Square the numerator:
\[ Var(\hat{\beta}_2) = E\left[\frac{\sum(u_i(x_i - \bar{x}))\sum(u_j(x_j - \bar{x}))}{\left(\sum(x_i - \bar{x})^2\right)^2}\right] \]
Using A5, we can simplify:
\[ Var(\hat{\beta}_2) = \frac{1}{\left(\sum(x_i - \bar{x})^2\right)^2}\sum E(u_i^2(x_i - \bar{x})^2) \]
Using A4: \(Var(u_i) = \sigma^2\).
\[ Var(\hat{\beta}_2) = \frac{\sigma^2}{\left(\sum(x_i - \bar{x})^2\right)^2}\sum (x_i - \bar{x})^2 \]
Thus we arrive at the variance of the OLS slope parameter:
\[ Var(\hat{\beta}_2) = \frac{\sigma^2}{\sum(x_i - \bar{x})^2} \]
If A4. and A5. are satisfied, along with A1., A2., and A3., then OLS is the most efficient linear unbiased estimator.
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
A4. Homoskedasticity: The error term has the same variance for each value of the independent variable
A5. Non-autocorrelation: The values of error terms have independent distributions
A6. Normality: The population error term is normally distributed with mean zero and variance \(\sigma^2\)
The population error term is normally distributed with mean zero and variance \(\sigma^2\)
Or,
\[ u \sim N(0,\sigma^2) \]
Where \(\sim\) is read “is distributed as” and \(N\) denotes the normal distribution
However, A6. is not required for efficiency or unbiasedness.
OLS is the Best Linear Unbiased Estimator (BLUE) when the following assumptions hold:
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
A4. Homoskedasticity: The error term has the same variance for each value of the independent variable
A5. Non-autocorrelation: The values of error terms have independent distributions
OLS is the Best Unbiased Estimator (BUE) when the following assumptions hold:
A1. Linearity: The population relationship is linear in parameters with an additive error term.
A2. Sample Variation: There is variation in \(X\).
A3. Exogeneity: The \(X\) variable is exogenous.
A4. Homoskedasticity: The error term has the same variance for each value of the independent variable
A5. Non-autocorrelation: The values of error terms have independent distributions
A6. Normality: The population error term is normally distributed with mean zero and variance \(\sigma^2\)
Q: Why do we care about population vs. sample?
\[ \begin{align*} y_i &= 2.53 + 0.57 x_i + u_i \\ \end{align*} \]
\[ \begin{align*} y_i &= 2.53 + 0.57 x_i + u_i \\ \hat{y}_i &= 2.36 + 0.61 x_i \end{align*} \]
\[ \begin{align*} y_i &= 2.53 + 0.57 x_i + u_i \\ \hat{y}_i &= 2.79 + 0.56 x_i \end{align*} \]
\[ \begin{align*} y_i &= 2.53 + 0.57 x_i + u_i \\ \hat{y}_i &= 3.21 + 0.45 x_i \end{align*} \]
By now you should know where this is going… Repeat 10,000 times
Q: Why do we care about population vs. sample?
Answer: Uncertainty matters.
Variance of the slope estimator:
\[ \mathop{\mathrm{Var}}(\hat{\beta}_2) = \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar{X})^2} \]
Same as before \(u_i \sim (0,1)\), \(X_i \sim (0,1.5)\), \(n = 30\)
Increase variance in error term \(u_i \sim (0,2.5)\), \(X_i \sim (0,1.5)\), \(n = 30\)
Original simulation \(u_i \sim (0,1)\), \(X_i \sim (0,1.5)\), \(n = 30\)
Increase variance in \(X_i\) term \(u_i \sim (0,1)\), \(X_i \sim (0,3)\), \(n = 30\)
Original simulation \(u_i \sim (0,1)\), \(X_i \sim (0,1.5)\), \(n = 30\)
Increase the sample size: \(u_i \sim (0,1)\), \(X_i \sim (0,1)\), \(n = 60\)
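A compact R sketch of these comparisons; the \((0, 1.5)\)-style entries are treated as standard deviations (an assumption), and the number of repetitions is reduced from 10,000 for speed.

```r
# How the sampling spread of the OLS slope responds to sigma^2, Var(X), and n
set.seed(320)
sim_slope_sd <- function(n, sd_x, sd_u, reps = 2000, b1 = 2.53, b2 = 0.57) {
  slopes <- replicate(reps, {
    x <- rnorm(n, sd = sd_x)
    u <- rnorm(n, sd = sd_u)
    y <- b1 + b2 * x + u
    coef(lm(y ~ x))[2]
  })
  sd(slopes)   # spread of the slope estimates across samples
}

sim_slope_sd(n = 30, sd_x = 1.5, sd_u = 1)    # baseline
sim_slope_sd(n = 30, sd_x = 1.5, sd_u = 2.5)  # larger error variance -> more spread
sim_slope_sd(n = 30, sd_x = 3,   sd_u = 1)    # more variation in X -> less spread
sim_slope_sd(n = 60, sd_x = 1,   sd_u = 1)    # larger n -> less spread
```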
We will further this discussion next time to conduct inference.
EC320, Set 03 | Simple linear regression (again)