18.S096 Problem Set 3

We have $\displaystyle H = X \left(X^T X\right)^{-1} X^T$.

1.a

Let’s first prove that $\left(X^T X\right)^{-1}$ is symmetric:

$\displaystyle \left(\left(X^T X\right)^{-1}\right)^T (X^T X) = \left(\left(X^T X\right)^{-1}\right)^T (X^T X)^T$

$\displaystyle = \left(X^T X \left(X^T X\right)^{-1}\right)^T$

$\displaystyle = I^T$

$\displaystyle = I$.

So we have $\left(X^T X\right)^{-1} = \left(\left(X^T X\right)^{-1}\right)^T$ by uniqueness of the inverse.

Now we prove that $H$ is a projection matrix, i.e. symmetric and idempotent:

$\displaystyle H^T = \left(X \left(X^T X\right)^{-1} X^T\right)^T$

$\displaystyle = \left(X^T\right)^T \left(\left(X^T X\right)^{-1}\right)^T X^T$

$\displaystyle = X \left(X^T X\right)^{-1} X^T$ (using the symmetry of $\left(X^T X\right)^{-1}$ proved above)

$\displaystyle = H$

$\displaystyle H^2 = X \left(X^T X\right)^{-1} X^T X \left(X^T X\right)^{-1} X^T$

$\displaystyle = X \left(X^T X\right)^{-1} (X^T X) \left(X^T X\right)^{-1} X^T$

$\displaystyle = X \left(X^T X\right)^{-1} X^T$

$\displaystyle = H$.
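As a quick numerical sanity check (not part of the proof), we can build the hat matrix for a random full-column-rank $X$ with NumPy and confirm both properties; the dimensions and seed here are arbitrary choices:

```python
import numpy as np

# Random tall matrix; full column rank with probability 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))

# Hat matrix H = X (X^T X)^{-1} X^T.
H = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(H, H.T)    # symmetric: H^T = H
assert np.allclose(H @ H, H)  # idempotent: H^2 = H
```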

1.b

$\displaystyle \frac{\partial\hat{y}_i}{\partial y_i} = \frac{\partial}{\partial y_i}\sum_{j=1}^n H_{ij} y_j = \sum_{j=1}^n H_{ij} \frac{\partial y_j}{\partial y_i} = \sum_{j=1}^n H_{ij} \delta_{j i} = H_{i i}$
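Since $\hat{y} = Hy$ is linear in $y$, perturbing $y_i$ changes $\hat{y}_i$ at exactly the rate $H_{ii}$, so the identity can be checked with a finite difference. A small NumPy sketch with random data (illustrative only; the index and step size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 15, 2
X = rng.standard_normal((n, p))
H = X @ np.linalg.inv(X.T @ X) @ X.T
y = rng.standard_normal(n)

# Perturb the i-th response and measure the change in the i-th fitted value.
i, eps = 4, 1e-6
y_pert = y.copy()
y_pert[i] += eps
deriv = ((H @ y_pert)[i] - (H @ y)[i]) / eps

assert np.isclose(deriv, H[i, i])  # dŷ_i/dy_i = H_ii
```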

1.c

$\displaystyle \mathrm{Avg}(H_{i i}) = \frac{1}{n}\mathrm{tr} H$

$\displaystyle = \frac{1}{n}\mathrm{tr} \left(X \left(X^T X\right)^{-1} X^T\right)$

$\displaystyle = \frac{1}{n}\mathrm{tr} \left(X^T X \left(X^T X\right)^{-1}\right)$ (cyclic property of the trace)

$\displaystyle = \frac{1}{n}\mathrm{tr} I_p$

$\displaystyle = \frac{p}{n}$
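A numerical check of the trace identity (random design, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 4
X = rng.standard_normal((n, p))
H = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.isclose(np.trace(H), p)           # tr(H) = p
assert np.isclose(np.diag(H).mean(), p / n) # average leverage = p/n
```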

1.d

$\displaystyle H' = X' \left(X'^T X'\right)^{-1} X'^T$

$\displaystyle = X G \left((X G)^T X G\right)^{-1} (X G)^T$

$\displaystyle = X G \left(G^T X^T X G\right)^{-1} G^T X^T$

$\displaystyle = X G G^{-1} \left(X^T X\right)^{-1} \left(G^T\right)^{-1} G^T X^T$

$\displaystyle = X \left(X^T X\right)^{-1} X^T$

$\displaystyle = H$
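In other words, the hat matrix is invariant under any invertible reparametrization of the columns of $X$. A quick NumPy check with a random $G$ (a random square Gaussian matrix is invertible with probability 1; sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 20, 3
X = rng.standard_normal((n, p))
G = rng.standard_normal((p, p))  # almost surely invertible
Xp = X @ G                       # X' = XG

H  = X  @ np.linalg.inv(X.T  @ X)  @ X.T
Hp = Xp @ np.linalg.inv(Xp.T @ Xp) @ Xp.T

assert np.allclose(H, Hp)  # H' = H
```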

1.e

(i)

$\displaystyle y = X \beta + \epsilon = X' \beta' + \epsilon$

$\displaystyle X \beta + \epsilon = X G \beta' + \epsilon$

$\displaystyle X \beta = X G \beta'$

$\displaystyle (X^T X)^{-1} X^T X \beta = (X^T X)^{-1} X^T X G \beta'$

$\displaystyle \beta = G \beta'$

$\displaystyle \beta' = G^{-1} \beta$

(ii)

We look for $G^{-1}$, where $G_{ij} = \delta_{i j} - \delta_{i 1} \bar{x}_{j-1}$ for $1 \le i, j \le p+1$, with the convention $\bar{x}_0 = 0$.

$\displaystyle G^{-1} G = I$

$\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} G_{k j} = \delta_{i j}$

$\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} \left(\delta_{k j} - \delta_{k 1} \bar{x}_{j-1}\right) = \delta_{i j}$

$\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} \delta_{k j} - \sum_{k=1}^{p+1} (G^{-1})_{i k} \delta_{k 1} \bar{x}_{j-1} = \delta_{i j}$

$\displaystyle (G^{-1})_{i j} - (G^{-1})_{i 1} \bar{x}_{j-1} = \delta_{i j}$

Now we have different cases:

First column (j = 1)

$\displaystyle (G^{-1})_{i 1} - (G^{-1})_{i 1} \bar{x}_0 = \delta_{i 1}$

$\displaystyle (G^{-1})_{i 1} = \delta_{i 1}$

So the first column has 1 in the first row, and 0 in all the others.

First row (i = 1)

$\displaystyle (G^{-1})_{1 j} - (G^{-1})_{1 1} \bar{x}_{j-1} = \delta_{1 j}$

$\displaystyle (G^{-1})_{1 j} = \delta_{1 j} + \bar{x}_{j-1}$

$\displaystyle (G^{-1})_{1 j} = \bar{x}_{j-1}\quad(j \ne 1)$

So the first row has 1 in the first column and $\bar{x}_{j-1}$ in the others.

Diagonal (i = j)

$\displaystyle (G^{-1})_{i i} - (G^{-1})_{i 1} \bar{x}_{i-1} = \delta_{i i}$

$\displaystyle (G^{-1})_{i i} - \delta_{i 1} \bar{x}_{i-1} = 1$

$\displaystyle (G^{-1})_{i i} = 1$

So the diagonal is all 1.

General values (i != j, i != 1, j != 1)

$\displaystyle (G^{-1})_{i j} - (G^{-1})_{i 1} \bar{x}_{j-1} = 0$

Since $(G^{-1})_{i 1} = 0$ for $i \ne 1$ (first-column case above),

$\displaystyle (G^{-1})_{i j} = 0$

General

$\displaystyle (G^{-1})_{i j} = \delta_{i j} + \delta_{i 1}\bar{x}_{j-1}$

Checking:

$\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} G_{k j} = \sum_{k=1}^{p+1} \left(\delta_{i k} + \delta_{i 1}\bar{x}_{k-1}\right)\left( \delta_{k j} - \delta_{k 1} \bar{x}_{j-1}\right)$

$\displaystyle = \sum_{k=1}^{p+1} \delta_{i k} \delta_{k j} - \delta_{i k}\delta_{k 1}\bar{x}_{j-1} + \delta_{i 1}\bar{x}_{k-1} \delta_{k j} - \delta_{i 1}\bar{x}_{k-1} \delta_{k 1} \bar{x}_{j-1}$

$\displaystyle = \delta_{i j} - \delta_{i 1}\bar{x}_{j-1} + \delta_{i 1}\bar{x}_{j-1} - \delta_{i 1}\bar{x}_0 \bar{x}_{j-1}$

$\displaystyle = \delta_{i j}$
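We can also verify the formula numerically. In matrix form, $G = I - e_1\bar{x}^T$ and $G^{-1} = I + e_1\bar{x}^T$, where $e_1$ is the first standard basis vector and $\bar{x} = (\bar{x}_0, \bar{x}_1, \ldots, \bar{x}_p)$ with $\bar{x}_0 = 0$. A small NumPy sketch with random column means (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
p = 3
d = p + 1

# Vector (x̄_0, x̄_1, ..., x̄_p) with the convention x̄_0 = 0.
xbar = np.concatenate([[0.0], rng.standard_normal(p)])
e1 = np.zeros(d)
e1[0] = 1.0

G    = np.eye(d) - np.outer(e1, xbar)  # G_{ij}        = δ_{ij} − δ_{i1} x̄_{j−1}
Ginv = np.eye(d) + np.outer(e1, xbar)  # (G^{-1})_{ij} = δ_{ij} + δ_{i1} x̄_{j−1}

assert np.allclose(Ginv @ G, np.eye(d))
assert np.allclose(G @ Ginv, np.eye(d))
```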

(iii)

$\displaystyle \beta' = G^{-1}\beta$

$\displaystyle \beta'_i = \sum_{j=1}^{p+1} (G^{-1})_{i j}\beta_j$

$\displaystyle \beta'_i = \sum_{j=1}^{p+1}\left(\delta_{i j} + \delta_{i 1}\bar{x}_{j-1}\right)\beta_j$

$\displaystyle \beta'_i = \sum_{j=1}^{p+1} \delta_{i j} \beta_j + \delta_{i 1}\sum_{j=1}^{p+1} \bar{x}_{j-1} \beta_j$

$\displaystyle \beta'_i = \beta_i + \delta_{i 1}\sum_{j=2}^{p+1} \bar{x}_{j-1} \beta_j$

So only the first (intercept) coefficient changes: $\beta'_1 = \beta_1 + \sum_{j=2}^{p+1}\bar{x}_{j-1}\beta_j$, while $\beta'_i = \beta_i$ for $i \ne 1$.
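Concretely: centering the predictors leaves the slope estimates unchanged and only shifts the intercept by $\sum_j \bar{x}_j$ times the corresponding slope. A NumPy sketch with random data (illustrative only; `lstsq` is one standard way to get the least-squares coefficients):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 40, 2
x = rng.standard_normal((n, p))
y = rng.standard_normal(n)

X  = np.column_stack([np.ones(n), x])              # raw design
Xc = np.column_stack([np.ones(n), x - x.mean(0)])  # centered design X' = XG

beta  = np.linalg.lstsq(X,  y, rcond=None)[0]
betac = np.linalg.lstsq(Xc, y, rcond=None)[0]

assert np.allclose(betac[1:], beta[1:])                      # slopes unchanged
assert np.isclose(betac[0], beta[0] + x.mean(0) @ beta[1:])  # shifted intercept
```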

(iv)

We want to analyze $X'^T X'$. Let’s first write the entries of $X'$ explicitly. We know that

$\displaystyle X_{i j} = \delta_{1 j} + x_{i, j-1}\quad(x_{i 0} = 0)$

$\displaystyle G_{ij} = \delta_{i j} - \delta_{i 1} \bar{x}_{j-1}\quad(\bar{x}_{0} = 0)$,

so we will have

$\displaystyle X'_{i j} = \sum_{k=1}^{p+1} X_{i k} G_{k j}$

$\displaystyle = \sum_{k=1}^{p+1} \left(\delta_{1 k} + x_{i, k-1}\right) \left(\delta_{k j} - \delta_{k 1} \bar{x}_{j-1}\right)$

$\displaystyle = \sum_{k=1}^{p+1} \delta_{1 k} \delta_{k j} - \sum_{k=1}^{p+1} \delta_{1 k} \delta_{k 1} \bar{x}_{j-1} + \sum_{k=1}^{p+1} x_{i, k-1} \delta_{k j} - \sum_{k=1}^{p+1} x_{i, k-1} \delta_{k 1} \bar{x}_{j-1}$

$\displaystyle = \delta_{1 j} - \bar{x}_{j-1} + x_{i, j-1} - x_{i, 1-1} \bar{x}_{j-1}$

$\displaystyle = \delta_{1 j} - \bar{x}_{j-1} + x_{i, j-1} - x_{i, 0} \bar{x}_{j-1}$

$\displaystyle = \delta_{1 j} + x_{i, j-1} - \bar{x}_{j-1}$.

Analyzing the product:

$\displaystyle (X'^T)_{i j} = X'_{j i} = \delta_{1 i} + x_{j, i-1} - \bar{x}_{i-1}$

$\displaystyle \left(X'^T X'\right)_{i j} = \sum_{k=1}^n (X'^T)_{i k} X'_{k j}$

$\displaystyle = \sum_{k=1}^n \left(\delta_{1 i} + x_{k, i-1} - \bar{x}_{i-1}\right)\left(\delta_{1 j} + x_{k, j-1} - \bar{x}_{j-1}\right)$.

Let’s analyze the cases separately:

i = 1 and j = 1

$\displaystyle (X'^T X')_{1 1} = \sum_{k=1}^n \left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)\left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)$

$\displaystyle = \sum_{k=1}^n \left(1 + 0 - 0\right)\left(1 + 0 - 0\right)$

$\displaystyle = \sum_{k=1}^n 1$

$\displaystyle = n$

i = 1 and j != 1

$\displaystyle \left(X'^T X'\right)_{1 j} = \sum_{k=1}^n \left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)\left(\delta_{1 j} + x_{k, j-1} - \bar{x}_{j-1}\right)$

$\displaystyle = \sum_{k=1}^n \left(1 + 0 - 0\right)\left(0 + x_{k, j-1} - \bar{x}_{j-1}\right)$

$\displaystyle = \sum_{k=1}^n \left(x_{k, j-1} - \bar{x}_{j-1}\right)$

$\displaystyle = \left(\sum_{k=1}^n x_{k, j-1}\right) - n \bar{x}_{j-1}$

$\displaystyle = \sum_{k=1}^n x_{k, j-1} - n \frac{1}{n}\sum_{k=1}^n x_{k, j-1}$

$\displaystyle = 0$

i != 1 and j = 1

$\displaystyle \left(X'^T X'\right)_{i 1} = \sum_{k=1}^n \left(\delta_{1 i} + x_{k, i-1} - \bar{x}_{i-1}\right)\left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)$

$\displaystyle = \sum_{k=1}^n \left(0 + x_{k, i-1} - \bar{x}_{i-1}\right)\left(1 + 0 - 0\right)$

$\displaystyle = \left(\sum_{k=1}^n x_{k, i-1}\right) - n \bar{x}_{i-1}$

$\displaystyle = 0$ (by the same argument as in the previous case)

i != 1 and j != 1

$\displaystyle \left(X'^T X'\right)_{i j} = \sum_{k=1}^n \left(\delta_{1 i} + x_{k, i-1} - \bar{x}_{i-1}\right)\left(\delta_{1 j} + x_{k, j-1} - \bar{x}_{j-1}\right)$

$\displaystyle \left(X'^T X'\right)_{i j} = \sum_{k=1}^n \left(0 + x_{k, i-1} - \bar{x}_{i-1}\right)\left(0 + x_{k, j-1} - \bar{x}_{j-1}\right)$

$\displaystyle = \sum_{k=1}^n \left(x_{k, i-1} - \bar{x}_{i-1}\right)\left(x_{k, j-1} - \bar{x}_{j-1}\right)$.

If we use

$\displaystyle \mathcal{X}_{i, j} = x_{i, j} - \bar{x}_{j}$,

then we get

$\displaystyle \left(X'^T X'\right)_{i, j} = \sum_{k=1}^n \mathcal{X}_{k, i-1} \mathcal{X}_{k, j-1}$

$\displaystyle = \sum_{k=1}^n \mathcal{X}^T_{i-1, k} \mathcal{X}_{k, j-1}$

$\displaystyle = \left(\mathcal{X}^T \mathcal{X}\right)_{i-1, j-1}\quad(i, j \ne 1)$.

Together with the previous cases, $X'^T X'$ is block diagonal: its $(1,1)$ entry is $n$, the rest of the first row and column are zero, and the lower $p \times p$ block is $\mathcal{X}^T \mathcal{X}$, finishing the proof.
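As a final numerical check, the Gram matrix of the centered design should indeed come out block diagonal, with $n$ in the corner and $\mathcal{X}^T\mathcal{X}$ in the lower block (random data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 30, 3
x = rng.standard_normal((n, p))

Xc = np.column_stack([np.ones(n), x - x.mean(0)])  # X' = XG
Gram = Xc.T @ Xc                                   # X'^T X'
Xcal = x - x.mean(0)                               # the centered matrix 𝒳

assert np.isclose(Gram[0, 0], n)                                  # (1,1) entry is n
assert np.allclose(Gram[0, 1:], 0) and np.allclose(Gram[1:, 0], 0)  # zero off-corner
assert np.allclose(Gram[1:, 1:], Xcal.T @ Xcal)                   # lower block is 𝒳ᵀ𝒳
```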