Warm Up

Andrew Dickinson

Published

May 30, 2024

Categorical Variables

Suppose you want to model the relationship between an individual’s pay (\(Pay\)) and their gender (\(Male\)) using the following model:

\[ Pay_i = \beta_0 + \beta_1 Male_i + u_i \]

where \(Male_i\) is a binary variable that equals 1 if the individual is male and 0 otherwise. Which of the following statements is correct?

  1. \(\beta_0\) represents the expected pay for males.
  2. \(\beta_1\) represents the expected difference in pay between males and females.
  3. \(\beta_0\) represents the expected pay for females.
  4. \(\beta_1\) represents the expected pay for males.

Consider the following model attempting to explain a person’s pay (\(Pay\)) as a function of their years of education (\(School\)), gender (\(Male\)), and their interaction:

\[ Pay_i = \beta_0 + \beta_1 School_i + \beta_2 Male_i + \beta_3 (School_i \times Male_i) + u_i \]

For this regression, which of the following statements is closest to being correct?

  1. \(\beta_2\) gives the expected difference in pay between males and females, holding years of education constant.
  2. \(\beta_3\) gives the expected difference in pay for an additional year of schooling for males compared to females.
  3. \(\beta_1\) gives the expected increase in pay for an additional year of schooling for females.
  4. \(\beta_0\) gives the expected pay for females with zero years of schooling.