VIII Some Distribution and Matrix Theorems

Definition: A k-dimensional random variable, y, is said to have a k-variate normal distribution if and only if every linear combination of y has a univariate normal distribution.

Definition: The density function for the univariate normal random variable, y, with mean μ and variance σ² is

$$ f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(y-\mu)^2}{2\sigma^2} \right) $$

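Definition: The density function for the multivariate normal random variable, y, with n-dimensional mean vector μ and nxn nonsingular covariance matrix Ω is

$$ f(y) = (2\pi)^{-n/2} \, |\Omega|^{-1/2} \exp\left( -\tfrac{1}{2} (y-\mu)' \Omega^{-1} (y-\mu) \right) $$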

Theorem: If y has a k-variate normal distribution then its mean vector, Ey = μ, and its variance-covariance matrix, E(y - μ)(y - μ)' = Ω, exist. Further, μ and Ω completely specify the distribution.

Theorem: If y has a k-variate normal distribution, y ~ N_k(μ,Ω), and P is any lxk matrix then the l-vector, Py, has the l-variate normal distribution N_l(Pμ,PΩP').
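
As a small numerical illustration of the theorem above, the sketch below computes the mean vector and covariance matrix of Py for a hypothetical 3-variate normal vector. The values in mu, omega and P, and the helper names matmul and transpose, are made up for illustration and do not come from these notes.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


# Hypothetical 3-variate example (numbers chosen only for illustration).
mu = [[1.0], [2.0], [3.0]]            # mean vector as a 3x1 column
omega = [[2.0, 0.5, 0.0],
         [0.5, 1.0, 0.3],
         [0.0, 0.3, 1.5]]             # covariance matrix, 3x3
P = [[1.0, -1.0, 0.0],                # first row forms y1 - y2
     [1.0, 1.0, 1.0]]                 # second row forms y1 + y2 + y3

mean_Py = matmul(P, mu)                           # mean of Py, 2x1
cov_Py = matmul(matmul(P, omega), transpose(P))   # covariance of Py, 2x2

print(mean_Py)
print(cov_Py)
```

The first row of P forms y1 - y2, whose variance 2 + 1 - 2(0.5) = 2 is the (1,1) element of cov_Py.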

Theorem: If y ~ N_k(μ,D) with D a diagonal matrix then the components of y, (y_1, y_2, ..., y_k), are independent and each is univariate normal.

Definition: A k-variate normal vector, y, is said to have rank M if y can be expressed as y = μ + Bz with B a kxM matrix of rank M and z an M-vector of independent N(0,1) random variables.

NOTE: In the above definition k ≥ M. (Why?) Also, if y has rank M it means that the variation in y is confined to an M-dimensional subspace of R^k.

Theorem: If y ~ N_k(μ,Ω) is of rank M, then r(Ω) = M and there exists a kxM matrix B such that BB' = Ω.

Theorem: If y ~ N_k(μ,Ω) is of rank k, then Ω is nonsingular and there exists a nonsingular kxk matrix B such that BB' = Ω.

Theorem: If y ~ N_k(μ,Ω) is of rank k, then Ω is nonsingular and there exists a kxk nonsingular matrix B such that BB' = Ω and B'Ω^{-1}B = I. Also, if we denote C = B^{-1} then C'C = Ω^{-1} and CΩC' = I.
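
The factorization in the theorem above can be checked numerically. The sketch below uses the Cholesky factor as one convenient choice of B. That choice is an assumption, since the notes do not say how B is constructed, and the 2x2 covariance matrix omega and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def cholesky(w):
    """Lower-triangular B with B B' = w, for symmetric positive definite w."""
    k = len(w)
    b = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(b[i][p] * b[j][p] for p in range(j))
            if i == j:
                b[i][j] = math.sqrt(w[i][i] - s)
            else:
                b[i][j] = (w[i][j] - s) / b[j][j]
    return b


def lower_inverse(b):
    """Inverse of a nonsingular lower-triangular matrix (forward substitution)."""
    k = len(b)
    c = [[0.0] * k for _ in range(k)]
    for j in range(k):                 # solve B c_j = e_j, one column at a time
        for i in range(j, k):
            rhs = 1.0 if i == j else 0.0
            s = sum(b[i][p] * c[p][j] for p in range(j, i))
            c[i][j] = (rhs - s) / b[i][i]
    return c


omega = [[4.0, 2.0],
         [2.0, 3.0]]                   # hypothetical nonsingular covariance matrix

B = cholesky(omega)
C = lower_inverse(B)                   # C = B^{-1}

BBt = matmul(B, transpose(B))                    # should reproduce omega
CtC = matmul(transpose(C), C)                    # should equal the inverse of omega
CWC = matmul(matmul(C, omega), transpose(C))     # should equal the identity

print(BBt)
print(CtC)
print(CWC)
```

Any other square root of the covariance matrix, for example one built from its eigenvectors, would satisfy the same identities.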

Theorem: a) If x ~ N(0,1) then x² ~ χ²_1.
b) If y_1 is χ²_m and y_2 is χ²_n and y_1 and y_2 are independent then y_1 + y_2 is χ²_{m+n}.

Theorem: a) If x ~ N(0,1) and x_1, x_2, ..., x_n is a random sample then Σ x_i² ~ χ²_n.
b) If y ~ N(μ,σ²) and y_1, y_2, ..., y_n is a random sample then Σ ((y_i - μ)/σ)² ~ χ²_n.

Theorem: If the nx1 vector x is N(0,I_n) then x'x is χ²_n.

Theorem: If the nx1 vector x is N(0,I_n) and A is an nxn idempotent matrix of rank r then x'Ax ~ χ²_r.
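
A quick simulation can make the theorem above concrete. The sketch below takes the centering matrix A = I - (1/n)11', which is symmetric and idempotent with rank n - 1, and compares the simulated mean and variance of x'Ax with the χ² values r and 2r. The choice of A, the dimension n = 5, the seed and the number of draws are all assumptions made for illustration.

```python
import random

n = 5
# Centering matrix A = I - (1/n) 1 1': symmetric and idempotent, rank n - 1.
A = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def quad_form(x, a):
    """Return the quadratic form x'ax for a vector x and a square matrix a."""
    m = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(m) for j in range(m))


# Check idempotency (A A = A); for an idempotent matrix the trace equals the rank.
AA = matmul(A, A)
max_gap = max(abs(AA[i][j] - A[i][j]) for i in range(n) for j in range(n))
rank_A = sum(A[i][i] for i in range(n))

# Simulate x ~ N(0, I_n) and collect the quadratic forms x'Ax.
rng = random.Random(0)
draws = [quad_form([rng.gauss(0.0, 1.0) for _ in range(n)], A)
         for _ in range(20000)]
mean_q = sum(draws) / len(draws)
var_q = sum((q - mean_q) ** 2 for q in draws) / (len(draws) - 1)

print(max_gap, rank_A)
print(mean_q, var_q)   # a chi-square with r degrees of freedom has mean r, variance 2r
```

With r = 4 the simulated mean and variance should be close to 4 and 8, up to sampling error.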

Theorem: Let x be an nx1 vector distributed as N(0,σ²I_n), A an nxn idempotent matrix of rank r, and B an nxn matrix such that BA = 0. Then Bx ~ N_n(0,σ²BB') is distributed independently of the quadratic form x'Ax, and x'Ax/σ² ~ χ²_r.

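Theorem: Let x ~ N_n(μ,σ²I_n). A and B are idempotent matrices of rank r and s respectively and AB = 0. Then x'Ax/σ² and x'Bx/σ² are χ² variables with r and s degrees of freedom respectively (non-central unless μ'Aμ = 0 and μ'Bμ = 0) and are independent of each other.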

Theorem: Let x ~ N_n(0,σ²I_n), r(A) = r, r(B) = s, AB = 0 and both are idempotent. Then

$$ \frac{x'Ax / r}{x'Bx / s} \sim F_{r,s} $$

Theorem: If x ~ F_{n,m} then y = 1/x ~ F_{m,n}.

Theorem: If y_1, y_2, ..., y_n are independent normal variables with means μ_1, μ_2, ..., μ_n and variances equal to unity, then Σ y_i² is distributed as χ² with n degrees of freedom and non-centrality parameter λ = (1/2) Σ μ_i².

Theorem: If the nx1 vector y is N(μ,σ²I_n) then y'Ay/σ², where A is idempotent of rank k, is distributed as χ² with k degrees of freedom and non-centrality parameter λ = μ'Aμ/(2σ²).
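
Under the convention used here, where the non-centrality parameter is μ'Aμ/(2σ²), the mean of the non-central χ² variable y'Ay/σ² is k + 2λ. The sketch below checks this by simulation. The mean vector, the value of sigma, the seed, and the choice A = diag(1, 1, 1, 0), an idempotent matrix of rank 3, are hypothetical values picked for illustration.

```python
import random

mu = [1.0, 2.0, 0.5, -1.0]    # hypothetical mean vector
sigma = 2.0                   # hypothetical standard deviation
k = 3                         # A = diag(1, 1, 1, 0) is idempotent with rank k

# With this A, mu'A mu is the sum of squares of the first k means.
mu_A_mu = sum(m * m for m in mu[:k])
lam = mu_A_mu / (2.0 * sigma ** 2)     # non-centrality parameter (1/2 convention)
expected_mean = k + 2.0 * lam          # mean of the non-central chi-square


def draw(rng):
    """One draw of y'Ay / sigma^2 with y ~ N(mu, sigma^2 I)."""
    y = [rng.gauss(m, sigma) for m in mu]
    return sum(v * v for v in y[:k]) / sigma ** 2


rng = random.Random(1)
reps = 40000
sim_mean = sum(draw(rng) for _ in range(reps)) / reps

print(lam, expected_mean, sim_mean)
```

Some software defines the non-centrality parameter without the factor 1/2, so it is worth checking the convention before comparing numbers.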

Theorem: If the nx1 vector y is N(μ,σ²I_n), the nx1 vector x is N(0,I_n), and A has rank k, then

with non-centrality parameter λ = μ'Aμ/(2σ²).

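Theorem: If the nx1 vector y is N(μ,σ²I_n), A is idempotent with rank k, B is idempotent with rank m, and AB = 0, then

$$ \frac{y'Ay / k}{y'By / m} \sim F''_{k,m}(\lambda_1, \lambda_2) $$

a doubly non-central F distribution with two non-centrality parameters
λ_1 = μ'Aμ/(2σ²) and λ_2 = μ'Bμ/(2σ²)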

NOTE: The effect of the non-centrality parameters in the χ² and F distributions is to shift them to the right. That is, for any given number in the domain of either the χ² or F distribution, the area in the upper tail is greater for the non-central distribution than for the central distribution.
Since the Student's t distribution has a normal in the numerator and a chi square in the denominator it too could be non-central.
Recall that in almost all situations we reject null hypotheses for extreme values of the observed test statistic. Thus, central and non-central distributions play a role in hypothesis testing. When we test a hypothesis we transform our data so that when the null is true our test statistics are constructed from N(0,1) random variables. But if the null is false then our test statistics will be constructed from random variables which are really N(μ,σ²), not N(0,1). The consequence is that when the null is false we are more likely to observe extreme values of the test statistics.

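Theorem: If B is lxn and A is nxn and y ~ N(μ,σ²I) then By is independent of y'Ay if BA = 0.
Proof: Suppose P:nxn is an orthogonal matrix such that P'AP = D, where D is diagonal. Let z = P'y ~ N(P'μ,σ²I). Let C = BP and partition

$$ D = \begin{bmatrix} D_1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}, \qquad z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} $$

where D_1 is diagonal with every diagonal element nonzero. If BA = 0 then BAP = 0. Since PP' = I, we can write this as BAP = BPP'AP = CD, so that

$$ CD = \begin{bmatrix} C_{11} D_1 & 0 \\ C_{21} D_1 & 0 \end{bmatrix} = 0 $$

Since CD = 0, it must be the case that C_11 D_1 = 0 and C_21 D_1 = 0. We know D_1 is nonsingular, therefore C_11 and C_21 are both null matrices. So

$$ C = \begin{bmatrix} 0 & C_{12} \\ 0 & C_{22} \end{bmatrix} $$

Now By = BPP'y = Cz, so

$$ By = \begin{bmatrix} C_{12} z_2 \\ C_{22} z_2 \end{bmatrix} $$

and y'Ay = y'PP'APP'y = z'Dz, so

$$ y'Ay = z_1' D_1 z_1 $$

Thus By depends only on z_2 and y'Ay depends only on z_1. Because the covariance matrix of z is σ²I, z_1 and z_2 are independent by construction, and therefore By and y'Ay are independent.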

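Example: In this example we consider the distribution of some sampling statistics computed from a random sample drawn from a normal distribution. Suppose that x_i ~ N(μ,σ²), i = 1, 2, ..., n, but neither μ nor σ² is known to us. We do have a hypothesis about the unknown value for the mean of these random variables. Namely, we believe the mean to be μ_0. From each of the x_i we construct z_i = (x_i - μ_0)/σ. When the hypothesis is true each of these random variables has a mean of 0 and a variance of 1. In order to test our hypothesis we must examine the following two random variables

$$ \frac{z'Az}{n-1} = \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \bar{z})^2 $$

where

$$ A = I_n - \frac{1}{n} \iota\iota' , \qquad \iota = (1, 1, \ldots, 1)' $$

and

$$ Bz = \sqrt{n}\, \bar{z} $$

where

$$ B = \frac{1}{\sqrt{n}} \iota' $$

To show that the two random variables are independent we need to check the product BA.

$$ BA = \frac{1}{\sqrt{n}} \iota' \left( I_n - \frac{1}{n} \iota\iota' \right) = \frac{1}{\sqrt{n}} \left( \iota' - \frac{\iota'\iota}{n} \iota' \right) = \frac{1}{\sqrt{n}} ( \iota' - \iota' ) = 0 $$

Since BA = 0 the two random variables are independent. The first is a χ² divided by its degrees of freedom, n-1. Since its construction does not depend on the null hypothesis it is a central χ². The second is a normal random variable, which is N(0,1) when the null hypothesis is true. Therefore

$$ t = \frac{Bz}{\sqrt{z'Az/(n-1)}} = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s}, \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$

has a Student's t distribution with n-1 degrees of freedom. The unknown σ cancels in the ratio, so the statistic can be computed from the data alone.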
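
The construction in this example can be sketched numerically. The code below assumes that the quadratic form uses the centering matrix A = I - (1/n)11' and that the linear form uses the row vector B = (1/√n)1'. It then checks that BA = 0 and that the ratio agrees with the usual t statistic (x̄ - μ_0)/(s/√n). The sample values, mu0 and sigma are hypothetical.

```python
import math

x = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3]   # hypothetical sample
mu0 = 5.0                            # hypothesized mean
sigma = 1.0                          # any positive value works: it cancels in t
n = len(x)

# Standardize under the null hypothesis.
z = [(xi - mu0) / sigma for xi in x]

# Assumed matrices: A is the centering matrix, B is the scaled row of ones.
A = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
B = [1.0 / math.sqrt(n)] * n

zAz = sum(z[i] * A[i][j] * z[j] for i in range(n) for j in range(n))   # z'Az
Bz = sum(b * zi for b, zi in zip(B, z))                                # sqrt(n) * mean(z)
BA = [sum(B[i] * A[i][j] for i in range(n)) for j in range(n)]         # should be a zero row

# t statistic built from the two independent pieces.
t_matrix = Bz / math.sqrt(zAz / (n - 1))

# The same statistic from the textbook formula, using only the data.
xbar = sum(x) / n
s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
t_textbook = (xbar - mu0) / (s / math.sqrt(n))

print(BA)
print(t_matrix, t_textbook)
```

Changing sigma to any other positive value leaves t_matrix unchanged, which is why the test does not require σ to be known.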