18.S096 Problem Set 3

We have \displaystyle H = X \left(X^T X\right)^{-1} X^T.

1.a

Let’s first prove that \left(X^T X\right)^{-1} is symmetric:

\displaystyle \left(\left(X^T X\right)^{-1}\right)^T (X^T X) = \left(\left(X^T X\right)^{-1}\right)^T (X^T X)^T

\displaystyle = \left(X^T X \left(X^T X\right)^{-1}\right)^T

\displaystyle = I^T

\displaystyle = I.

So we have \left(X^T X\right)^{-1} = \left(\left(X^T X\right)^{-1}\right)^T by uniqueness of the inverse.

Now let's prove that H is a projection matrix, i.e. that it is symmetric and idempotent:

\displaystyle H^T = \left(X \left(X^T X\right)^{-1} X^T\right)^T

\displaystyle = \left(X^T\right)^T \left(\left(X^T X\right)^{-1}\right)^T X^T

\displaystyle = X \left(X^T X\right)^{-1} X^T

\displaystyle = H

\displaystyle H^2 = X \left(X^T X\right)^{-1} X^T X \left(X^T X\right)^{-1} X^T

\displaystyle = X \left(X^T X\right)^{-1} (X^T X) \left(X^T X\right)^{-1} X^T

\displaystyle = X \left(X^T X\right)^{-1} X^T

\displaystyle = H.
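As a quick numerical sanity check (not part of the proof), both properties can be verified for a random full-rank X with NumPy:

```python
import numpy as np

# Random full-rank design matrix (full column rank with probability 1)
rng = np.random.default_rng(0)
n, p = 20, 4
X = rng.standard_normal((n, p))

# Hat matrix H = X (X^T X)^{-1} X^T
H = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(H, H.T)    # symmetry: H^T = H
assert np.allclose(H @ H, H)  # idempotence: H^2 = H
```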

1.b

\displaystyle \frac{d\hat{y}_i}{d y_i} = \frac{d}{d y_i}\sum_{j=1}^n H_{ij} y_j = \sum_{j=1}^n H_{ij} \frac{d y_j}{d y_i} = \sum_{j=1}^n H_{ij} \delta_{j i} = H_{i i}
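Since \hat{y} = H y is linear in y, this derivative is exact and can be checked with a finite difference (a small numerical sketch; the index i and step size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 15, 3
X = rng.standard_normal((n, p))
H = X @ np.linalg.inv(X.T @ X) @ X.T
y = rng.standard_normal(n)

# Perturb y_i and watch how the i-th fitted value moves
i, eps = 4, 1e-6
y2 = y.copy()
y2[i] += eps
deriv = ((H @ y2)[i] - (H @ y)[i]) / eps

assert np.isclose(deriv, H[i, i])  # d(yhat_i)/d(y_i) = H_ii
```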

1.c

\displaystyle \mathrm{Avg}(H_{i i}) = \frac{1}{n}\mathrm{tr} H

\displaystyle = \frac{1}{n}\mathrm{tr} \left(X \left(X^T X\right)^{-1} X^T\right)

\displaystyle = \frac{1}{n}\mathrm{tr} \left(X^T X \left(X^T X\right)^{-1}\right)

\displaystyle = \frac{1}{n}\mathrm{tr} I_p

\displaystyle = \frac{p}{n}
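A numerical check of the trace identity (again just a sanity check, not part of the argument):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 6
X = rng.standard_normal((n, p))
H = X @ np.linalg.inv(X.T @ X) @ X.T

# tr(H) = p, so the average leverage is p/n
assert np.isclose(np.trace(H), p)
assert np.isclose(np.mean(np.diag(H)), p / n)
```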

1.d

\displaystyle H' = X' \left(X'^T X'\right)^{-1} X'^T

\displaystyle = X G \left((X G)^T X G\right)^{-1} (X G)^T

\displaystyle = X G \left(G^T X^T X G\right)^{-1} G^T X^T

\displaystyle = X G G^{-1} \left(X^T X\right)^{-1} \left(G^T\right)^{-1} G^T X^T

\displaystyle = X \left(X^T X\right)^{-1} X^T

\displaystyle = H
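In other words, the hat matrix is invariant under any invertible change of basis of the columns of X. A quick NumPy check with a random G (invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 5
X = rng.standard_normal((n, p))
G = rng.standard_normal((p, p))  # random invertible change of basis

H = X @ np.linalg.inv(X.T @ X) @ X.T
Xp = X @ G
Hp = Xp @ np.linalg.inv(Xp.T @ Xp) @ Xp.T

assert np.allclose(H, Hp)  # H' = H
```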

1.e

(i)

\displaystyle y = X' \beta' + \epsilon

\displaystyle X \beta + \epsilon = X G \beta' + \epsilon

\displaystyle X \beta = X G \beta'

\displaystyle (X^T X)^{-1} X^T X \beta = (X^T X)^{-1}  X^T X G \beta'

\displaystyle \beta = G \beta'

\displaystyle \beta' = G^{-1} \beta

(ii)

We look for G^{-1} with G_{ij} = \delta_{i j} - \delta_{i 1} \bar{x}_{j-1} and \bar{x}_0 = 0.

\displaystyle G^{-1} G = I

\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} G_{k j} = \delta_{i j}

\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} \left(\delta_{k j} - \delta_{k 1} \bar{x}_{j-1}\right) = \delta_{i j}

\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} \delta_{k j} - \sum_{k=1}^{p+1} (G^{-1})_{i k} \delta_{k 1} \bar{x}_{j-1} = \delta_{i j}

\displaystyle (G^{-1})_{i j} - (G^{-1})_{i 1} \bar{x}_{j-1} = \delta_{i j}

Now we have different cases:

First column (j = 1)

\displaystyle (G^{-1})_{i 1} - (G^{-1})_{i 1} \bar{x}_0 = \delta_{i 1}

\displaystyle (G^{-1})_{i 1} = \delta_{i 1}

So the first column has 1 in the first row, and 0 in all the others.

First row (i = 1)

\displaystyle (G^{-1})_{1 j} - (G^{-1})_{1 1} \bar{x}_{j-1} = \delta_{1 j}

\displaystyle (G^{-1})_{1 j} = \delta_{1 j} + \bar{x}_{j-1}

\displaystyle (G^{-1})_{1 j} = \bar{x}_{j-1}\quad(j \ne 1)

So the first row has 1 in the first column and \bar{x}_{j-1} in the others.

Diagonal (i = j)

\displaystyle (G^{-1})_{i i} - (G^{-1})_{i 1} \bar{x}_{i-1} = \delta_{i i}

\displaystyle (G^{-1})_{i i} - \delta_{i 1} \bar{x}_{i-1} = 1

\displaystyle (G^{-1})_{i i} = 1

So the diagonal is all 1.

Remaining entries (i ≠ j, i ≠ 1, j ≠ 1)

Here \delta_{i j} = 0, and from the first-column case (G^{-1})_{i 1} = \delta_{i 1} = 0, so

\displaystyle (G^{-1})_{i j} = 0

General

\displaystyle (G^{-1})_{i j} = \delta_{i j} + \delta_{i 1}\bar{x}_{j-1}

Checking:

\displaystyle \sum_{k=1}^{p+1} (G^{-1})_{i k} G_{k j} = \sum_{k=1}^{p+1} \left(\delta_{i k} + \delta_{i 1}\bar{x}_{k-1}\right)\left( \delta_{k j} - \delta_{k 1} \bar{x}_{j-1}\right)

\displaystyle = \sum_{k=1}^{p+1} \delta_{i k} \delta_{k j} - \delta_{i k}\delta_{k 1}\bar{x}_{j-1} + \delta_{i 1}\bar{x}_{k-1} \delta_{k j} - \delta_{i 1}\bar{x}_{k-1} \delta_{k 1} \bar{x}_{j-1}

\displaystyle = \delta_{i j} - \delta_{i 1}\bar{x}_{j-1} + \delta_{i 1}\bar{x}_{j-1} - \delta_{i 1}\bar{x}_0 \bar{x}_{j-1}

\displaystyle = \delta_{i j}
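The inverse formula can also be checked numerically: build G and the claimed G^{-1} for a small p and arbitrary column means (the \bar{x} values here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 4
xbar = rng.standard_normal(p)  # stand-ins for the column means x̄_1, ..., x̄_p

# G_{ij} = δ_{ij} - δ_{i1} x̄_{j-1}: identity with -x̄ in the first row
G = np.eye(p + 1)
G[0, 1:] = -xbar

# Claimed inverse: (G^{-1})_{ij} = δ_{ij} + δ_{i1} x̄_{j-1}
Ginv = np.eye(p + 1)
Ginv[0, 1:] = xbar

assert np.allclose(Ginv @ G, np.eye(p + 1))
assert np.allclose(G @ Ginv, np.eye(p + 1))
```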

(iii)

\displaystyle \beta' = G^{-1}\beta

\displaystyle \beta'_i = \sum_{j=1}^{p+1} (G^{-1})_{i j}\beta_j

\displaystyle \beta'_i = \sum_{j=1}^{p+1}\left(\delta_{i j} + \delta_{i 1}\bar{x}_{j-1}\right)\beta_j

\displaystyle \beta'_i = \sum_{j=1}^{p+1} \delta_{i j} \beta_j + \sum_{j=1}^{p+1} \delta_{i 1}\bar{x}_{j-1} \beta_j

\displaystyle \beta'_i = \beta_i + \delta_{i 1}\sum_{j=1}^{p+1} \bar{x}_{j-1} \beta_j

Since \bar{x}_0 = 0, only the intercept changes: \beta'_1 = \beta_1 + \sum_{j=2}^{p+1} \bar{x}_{j-1}\beta_j, while \beta'_i = \beta_i for i \ne 1.
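This relation between the two coefficient vectors can be checked by fitting both designs with least squares (a sketch with NumPy; the data here is random and the variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 40, 3
x = rng.standard_normal((n, p))
X = np.column_stack([np.ones(n), x])  # design with intercept column
y = rng.standard_normal(n)

# Centering change of basis G and its inverse
xbar = x.mean(axis=0)
G = np.eye(p + 1); G[0, 1:] = -xbar
Ginv = np.eye(p + 1); Ginv[0, 1:] = xbar
Xp = X @ G  # centered design X' = XG

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
betap, *_ = np.linalg.lstsq(Xp, y, rcond=None)

assert np.allclose(betap, Ginv @ beta)             # β' = G^{-1} β
assert np.allclose(betap[1:], beta[1:])            # slopes unchanged
assert np.isclose(betap[0], beta[0] + xbar @ beta[1:])  # only intercept shifts
```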

(iv)

We want to analyze X'^T X'. Let's first write the entries of X' explicitly. We know that

\displaystyle X_{i j} = \delta_{1 j} + x_{i, j-1}\quad(x_{i 0} = 0)

\displaystyle G_{ij} = \delta_{i j} - \delta_{i 1} \bar{x}_{j-1}\quad(\bar{x}_{0} = 0),

so we will have

\displaystyle X'_{i j} = \sum_{k=1}^{p+1} X_{i k} G_{k j}

\displaystyle = \sum_{k=1}^{p+1} \left(\delta_{1 k} + x_{i, k-1}\right) \left(\delta_{k j} - \delta_{k 1} \bar{x}_{j-1}\right)

\displaystyle = \sum_{k=1}^{p+1} \delta_{1 k} \delta_{k j} - \sum_{k=1}^{p+1} \delta_{1 k} \delta_{k 1} \bar{x}_{j-1} + \sum_{k=1}^{p+1} x_{i, k-1} \delta_{k j} - \sum_{k=1}^{p+1} x_{i, k-1} \delta_{k 1} \bar{x}_{j-1}

\displaystyle = \delta_{1 j} - \bar{x}_{j-1} + x_{i, j-1} - x_{i, 1-1} \bar{x}_{j-1}

\displaystyle = \delta_{1 j} - \bar{x}_{j-1} + x_{i, j-1} - x_{i, 0} \bar{x}_{j-1}

\displaystyle = \delta_{1 j} + x_{i, j-1} - \bar{x}_{j-1}.

Analyzing the product:

\displaystyle (X'^T)_{i j} = X'_{j i} = \delta_{1 i} + x_{j, i-1} - \bar{x}_{i-1}

\displaystyle \left(X'^T X'\right)_{i j} = \sum_{k=1}^n (X'^T)_{i k} X'_{k j}

\displaystyle = \sum_{k=1}^n \left(\delta_{1 i} + x_{k, i-1} - \bar{x}_{i-1}\right)\left(\delta_{1 j} + x_{k, j-1} - \bar{x}_{j-1}\right).

Let’s analyze the cases separately:

i = 1 and j = 1

\displaystyle (X'^T X')_{1 1} = \sum_{k=1}^n \left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)\left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)

\displaystyle = \sum_{k=1}^n \left(1 + 0 - 0\right)\left(1 + 0 - 0\right)

\displaystyle = \sum_{k=1}^n 1

\displaystyle = n

i = 1 and j ≠ 1

\displaystyle \left(X'^T X'\right)_{1 j} = \sum_{k=1}^n \left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)\left(\delta_{1 j} + x_{k, j-1} - \bar{x}_{j-1}\right)

\displaystyle = \sum_{k=1}^n \left(1 + 0 - 0\right)\left(0 + x_{k, j-1} - \bar{x}_{j-1}\right)

\displaystyle = \sum_{k=1}^n \left(x_{k, j-1} - \bar{x}_{j-1}\right)

\displaystyle = \left(\sum_{k=1}^n x_{k, j-1}\right) - n \bar{x}_{j-1}

\displaystyle = \sum_{k=1}^n x_{k, j-1} - n \frac{1}{n}\sum_{k=1}^n x_{k, j-1}

\displaystyle = 0

i ≠ 1 and j = 1

\displaystyle \left(X'^T X'\right)_{i 1} = \sum_{k=1}^n \left(\delta_{1 i} + x_{k, i-1} - \bar{x}_{i-1}\right)\left(\delta_{1 1} + x_{k, 0} - \bar{x}_{0}\right)

\displaystyle = \sum_{k=1}^n \left(0 + x_{k, i-1} - \bar{x}_{i-1}\right)\left(1 + 0 - 0\right)

\displaystyle = \left(\sum_{k=1}^n x_{k, i-1}\right) - n \bar{x}_{i-1}

\displaystyle = 0,

by the same argument as in the previous case.

i ≠ 1 and j ≠ 1

\displaystyle \left(X'^T X'\right)_{i j} = \sum_{k=1}^n \left(\delta_{1 i} + x_{k, i-1} - \bar{x}_{i-1}\right)\left(\delta_{1 j} + x_{k, j-1} - \bar{x}_{j-1}\right)

\displaystyle = \sum_{k=1}^n \left(0 + x_{k, i-1} - \bar{x}_{i-1}\right)\left(0 + x_{k, j-1} - \bar{x}_{j-1}\right)

\displaystyle = \sum_{k=1}^n \left(x_{k, i-1} - \bar{x}_{i-1}\right)\left(x_{k, j-1} - \bar{x}_{j-1}\right).

If we use

\displaystyle \mathcal{X}_{i, j} = x_{i, j} - \bar{x}_{j},

then we get

\displaystyle \left(X'^T X'\right)_{i, j} = \sum_{k=1}^n \mathcal{X}_{k, i-1} \mathcal{X}_{k, j-1}

\displaystyle = \sum_{k=1}^n \mathcal{X}^T_{i-1, k} \mathcal{X}_{k, j-1}

\displaystyle = \left(\mathcal{X}^T \mathcal{X}\right)_{i-1, j-1},

finishing the proof.
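The full block structure of X'^T X' — n in the top-left corner, zeros in the first row and column elsewhere, and \mathcal{X}^T \mathcal{X} in the lower block — can be verified numerically (a sketch on random data):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 25, 3
x = rng.standard_normal((n, p))
X = np.column_stack([np.ones(n), x])  # design with intercept

# Centered design X' = XG and its Gram matrix
xbar = x.mean(axis=0)
G = np.eye(p + 1); G[0, 1:] = -xbar
Xp = X @ G
M = Xp.T @ Xp

Xc = x - xbar  # the centered matrix 𝒳

assert np.isclose(M[0, 0], n)            # top-left entry is n
assert np.allclose(M[0, 1:], 0)          # first row is otherwise zero
assert np.allclose(M[1:, 0], 0)          # first column is otherwise zero
assert np.allclose(M[1:, 1:], Xc.T @ Xc) # lower block is 𝒳^T 𝒳
```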
