VIII Some Distribution and Matrix Theorems
Definition: A k-dimensional random variable, y, is said to have a k-variate normal distribution if and only if every linear combination of the elements of y has a univariate normal distribution.
Definition: The density function for the univariate normal random variable, y, with mean μ and variance σ² is

    f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y-\mu)^2}{2\sigma^2} \right\}
Definition: The density function for the multivariate normal random variable, y, with n-dimensional mean vector μ and n×n covariance matrix Ω is

    f(y) = (2\pi)^{-n/2} \, |\Omega|^{-1/2} \exp\left\{ -\tfrac{1}{2} (y-\mu)' \Omega^{-1} (y-\mu) \right\}
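As a check on this formula, the density can be evaluated directly from the definition and compared with a library routine. A minimal numpy/scipy sketch, where the particular μ, Ω, and evaluation point are arbitrary illustrative values:

    import numpy as np
    from scipy.stats import multivariate_normal

    # Arbitrary illustrative values for the mean vector and covariance matrix.
    mu = np.array([1.0, -0.5])
    Omega = np.array([[2.0, 0.3],
                      [0.3, 1.0]])
    y = np.array([0.7, 0.2])

    # Density evaluated directly from the formula
    # f(y) = (2*pi)^(-n/2) |Omega|^(-1/2) exp{-(1/2)(y-mu)' Omega^{-1} (y-mu)}.
    n = len(y)
    d = y - mu
    quad = d @ np.linalg.solve(Omega, d)
    f_formula = (2 * np.pi) ** (-n / 2) * np.linalg.det(Omega) ** (-0.5) * np.exp(-0.5 * quad)

    # The same density from scipy, for comparison.
    f_scipy = multivariate_normal(mean=mu, cov=Omega).pdf(y)
    print(f_formula, f_scipy)   # the two numbers should agree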
Theorem: If y has a k-variate normal distribution then its mean vector, Ey = μ, and its variance-covariance matrix, E(y − μ)(y − μ)' = Ω, exist. Further, μ and Ω completely specify the distribution.
Theorem: If y has a k-variate normal distribution, y ~ N_k(μ, Ω), and P is any l×k matrix, then the l-vector Py has the l-variate normal distribution N_l(Pμ, PΩP').
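A Monte Carlo sketch of this theorem (the μ, Ω, and P below are made-up values with k = 3 and l = 2): the sample mean and covariance of the transformed draws Py should be close to Pμ and PΩP'.

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up mean, covariance, and transformation matrix (k = 3, l = 2).
    mu = np.array([1.0, 2.0, -1.0])
    Omega = np.array([[2.0, 0.5, 0.0],
                      [0.5, 1.0, 0.3],
                      [0.0, 0.3, 1.5]])
    P = np.array([[1.0, -1.0, 0.0],
                  [0.5,  0.5, 2.0]])

    # Draw from N_k(mu, Omega) and transform each draw by P.
    y = rng.multivariate_normal(mu, Omega, size=200_000)   # 200000 x 3
    Py = y @ P.T                                           # 200000 x 2

    print(Py.mean(axis=0))           # should be close to P @ mu
    print(P @ mu)
    print(np.cov(Py, rowvar=False))  # should be close to P @ Omega @ P.T
    print(P @ Omega @ P.T)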
Theorem: If y ~ N_k(μ, D) with D a diagonal matrix, then the components of y: (y1, y2, ..., yk) are independent and each is a univariate normal.
Definition: A k-variate normal vector, y, is said to have rank M if y can be expressed as y = μ + Bz with B a k×M matrix of rank M and z a vector of independent N(0,1) random variables.
NOTE: In the above definition k ≥ M. (Why?) Also, if y has rank M it means that the variation in y is confined to an M-dimensional subspace of ℝ^k.
Theorem: If y ~ N_k(μ, Ω) is of rank M, then r(Ω) = M and there exists a k×M matrix B such that BB' = Ω.
Theorem: If y ~ N_k(μ, Ω) is of rank k, then Ω is nonsingular and there exists a nonsingular k×k matrix B such that BB' = Ω.
Theorem: If y ~ N_k(μ, Ω) is of rank k, then Ω is nonsingular and there exists a k×k nonsingular matrix B such that BB' = Ω and B'Ω⁻¹B = I. Also, if we denote C = B⁻¹ then C'C = Ω⁻¹ and CΩC' = I.
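When Ω is nonsingular, one concrete choice of such a B is the Cholesky factor of Ω. A small numpy sketch with an arbitrary positive definite Ω, verifying the identities above:

    import numpy as np

    # Arbitrary nonsingular (positive definite) covariance matrix.
    Omega = np.array([[4.0, 1.0, 0.5],
                      [1.0, 3.0, 0.2],
                      [0.5, 0.2, 2.0]])

    B = np.linalg.cholesky(Omega)   # lower triangular, with B B' = Omega
    C = np.linalg.inv(B)

    print(np.allclose(B @ B.T, Omega))                              # BB' = Omega
    print(np.allclose(B.T @ np.linalg.inv(Omega) @ B, np.eye(3)))   # B' Omega^{-1} B = I
    print(np.allclose(C.T @ C, np.linalg.inv(Omega)))               # C'C = Omega^{-1}
    print(np.allclose(C @ Omega @ C.T, np.eye(3)))                  # C Omega C' = I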
Theorem: a) If x ~ N(0,1) then x² ~ χ²_1.
b) If y1 is χ²_m and y2 is χ²_n and y1 and y2 are independent, then y1 + y2 is χ²_{m+n}.
Theorem: a) If x ~ N(0,1) and x1, x2, ..., xn is a random sample, then

    \sum_{i=1}^{n} x_i^2 \sim \chi^2_n .

b) If y ~ N(μ, σ²) and y1, y2, ..., yn is a random sample, then

    \sum_{i=1}^{n} \frac{(y_i - \mu)^2}{\sigma^2} \sim \chi^2_n .
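A simulation sketch of part (b), using made-up values μ = 2, σ = 3, and n = 5: the sum of squared standardized observations should behave like a χ²_5 variable.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    mu, sigma, n = 2.0, 3.0, 5          # made-up parameter values
    reps = 100_000

    y = rng.normal(mu, sigma, size=(reps, n))
    q = ((y - mu) ** 2).sum(axis=1) / sigma ** 2   # sum of squared standardized values

    # The chi-square(n) distribution has mean n and variance 2n.
    print(q.mean(), q.var())                        # roughly 5 and 10
    # A formal comparison against the chi-square(5) cdf.
    print(stats.kstest(q, stats.chi2(n).cdf))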
Theorem: If the n×1 vector x is N(0, I_n) then x'x is χ²_n.
Theorem: If the n×1 vector x is N(0, I_n) and A is an n×n idempotent matrix of rank r, then x'Ax ~ χ²_r.
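A standard example of such an A is the centering matrix I_n − (1/n)ιι', which is idempotent of rank n − 1. A simulation sketch (the choice n = 6 is arbitrary):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n = 6                                 # arbitrary dimension
    A = np.eye(n) - np.ones((n, n)) / n   # centering matrix: idempotent, rank n-1

    print(np.allclose(A @ A, A))          # idempotent
    print(np.linalg.matrix_rank(A))       # n - 1

    x = rng.standard_normal((100_000, n))
    q = np.einsum('ij,jk,ik->i', x, A, x)  # x'Ax for each draw

    print(q.mean(), q.var())               # roughly n-1 and 2(n-1)
    print(stats.kstest(q, stats.chi2(n - 1).cdf))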
Theorem: Let x be an n×1 vector distributed as N(0, σ²I_n), A an n×n idempotent matrix of rank r, and B an n×n matrix such that BA = 0. Then Bx ~ N_n(0, σ²BB') is distributed independently of the quadratic form x'Ax/σ² ~ χ²_r.
Theorem: Let x ~ N_n(μ, σ²I_n), let A and B be idempotent matrices of rank r and s respectively, and let AB = 0. Then the quadratic forms x'Ax and x'Bx are distributed independently of each other (and when μ = 0, x'Ax/σ² ~ χ²_r and x'Bx/σ² ~ χ²_s).
Theorem: Let x ~ N_n(0, σ²I_n), r(A) = r, r(B) = s, AB = 0, and both A and B idempotent. Then

    \frac{x'Ax / r}{x'Bx / s} \sim F_{r,s} .
Theorem: If x ~ F_{n,m} then y = 1/x ~ F_{m,n}.
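This is the familiar justification for tabulating only upper-tail F critical values. A quick scipy check with arbitrary degrees of freedom n = 4 and m = 9: the lower q-quantile of F_{n,m} equals the reciprocal of the upper q-quantile of F_{m,n}.

    from scipy import stats

    n, m, q = 4, 9, 0.05                 # arbitrary degrees of freedom and tail probability

    lower = stats.f(n, m).ppf(q)         # lower 5% point of F(4, 9)
    upper = stats.f(m, n).ppf(1 - q)     # upper 5% point of F(9, 4)

    print(lower, 1 / upper)              # the two numbers should agree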
Theorem: If y1, y2, ..., yn are independent normal variables with means μ1, μ2, ..., μn and variances equal to unity, then Σyi² is distributed as χ² with n degrees of freedom and non-centrality parameter λ = ½Σμi².
Theorem: If the n×1 vector y is N(μ, σ²I_n), then y'Ay/σ², where A is idempotent of rank k, is distributed as χ² with k degrees of freedom and non-centrality parameter λ = μ'Aμ/2σ².
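A simulation sketch of this theorem, again using the centering matrix as the idempotent A and made-up values for μ and σ. Note that scipy parameterizes the noncentral χ² by μ'Aμ/σ², which is 2λ in the notation used here.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    n, sigma = 6, 2.0                        # made-up dimension and sigma
    mu = np.linspace(-1.0, 2.0, n)           # made-up mean vector
    A = np.eye(n) - np.ones((n, n)) / n      # idempotent, rank k = n - 1

    k = n - 1
    lam = mu @ A @ mu / (2 * sigma ** 2)     # noncentrality as defined in these notes

    y = mu + sigma * rng.standard_normal((200_000, n))
    q = np.einsum('ij,jk,ik->i', y, A, y) / sigma ** 2   # y'Ay / sigma^2 for each draw

    # scipy's ncx2 uses nc = mu'A mu / sigma^2, i.e. 2*lambda in this notation.
    print(stats.kstest(q, stats.ncx2(k, 2 * lam).cdf))
    # The mean of the noncentral chi-square is k + 2*lambda.
    print(q.mean(), k + 2 * lam)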
Theorem: If the n×1 vector y is N(μ, σ²I_n), the n×1 vector x is N(0, I_n) and independent of y, and A is idempotent of rank k, then

    \frac{y'Ay / (\sigma^2 k)}{x'x / n} \sim F_{k,n}(\lambda) ,

a non-central F with noncentrality parameter λ = μ'Aμ/2σ².
Theorem: If the n×1 vector y is N(μ, σ²I_n), A is idempotent of rank k, B is idempotent of rank m, and AB = 0, then

    \frac{y'Ay / k}{y'By / m} \sim F''_{k,m}(\lambda_1, \lambda_2) ,

a doubly non-central F with the two non-centrality parameters λ1 = μ'Aμ/2σ² and λ2 = μ'Bμ/2σ².
NOTE: The effect of the noncentrality parameters in the χ² and F distributions is to shift probability mass to the right. That is, for any given number in the domain of either the χ² or F distribution, the area in the upper tail is greater for the noncentral distribution than for the central distribution. Since the Student's t distribution has a chi-square in the denominator, it too can be non-central.
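A small scipy illustration of this NOTE, with arbitrary degrees of freedom and λ: at every cutoff the noncentral χ² leaves more area in the upper tail than the central χ².

    import numpy as np
    from scipy import stats

    df, lam = 5, 3.0                       # arbitrary degrees of freedom and lambda
    x = np.linspace(0.5, 25, 50)           # grid of cutoff points

    central_tail = stats.chi2(df).sf(x)
    noncentral_tail = stats.ncx2(df, 2 * lam).sf(x)   # scipy's nc = 2*lambda in this notation

    # The upper-tail area of the noncentral distribution exceeds the central one everywhere.
    print(np.all(noncentral_tail > central_tail))     # True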
Recall that in almost all situations we reject null hypotheses
for extreme values of the observed test statistic. Thus, central
and non-central distributions play a role in hypothesis testing.
When we test an hypothesis we transform our data so that when
the null is true our test statistics are constructed from N(0,1)
random variables. But if the null is false then our test statistics
will be constructed from random variables which are really N(μ, σ²),
not N(0,1). The consequence is that when the null is false we
are more likely to observe extreme values of the test statistics.
Theorem: If B is l×n, A is n×n, and y ~ N(μ, σ²I), then By is independent of y'Ay if BA = 0.
Proof: Suppose P: n×n is an orthogonal matrix such that P'AP = D, where D is diagonal. Let P'y = z ~ N(P'μ, σ²I). Let C = BP and partition

    D = \begin{bmatrix} D_1 & 0 \\ 0 & 0 \end{bmatrix},
    \qquad
    C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix},

where D1 has nonzero terms only on the diagonal and the column blocks of C are conformable with the blocks of D. If BA = 0 then 0 = BAP = BPP'AP = CD. Since

    CD = \begin{bmatrix} C_{11}D_1 & 0 \\ C_{21}D_1 & 0 \end{bmatrix} = 0,

it must be the case that C11D1 = 0 and C21D1 = 0. We know D1 is nonsingular (its diagonal entries are nonzero), therefore C11 and C21 are both null matrices. So

    C = \begin{bmatrix} 0 & C_{12} \\ 0 & C_{22} \end{bmatrix}.

Now By = BPP'y = Cz
and y'Ay = y'PP'APP'y = z'Dz.
Partition z = (z1', z2')' conformably with D. The elements of z are independent by construction, so z1 and z2 are independent. But By = Cz involves only z2, while y'Ay = z'Dz = z1'D1z1 involves only z1. Therefore By and y'Ay are independent.
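A numerical sketch of the theorem, taking A to be the centering matrix and B the 1×n averaging vector, so that BA = 0 by construction. Zero sample correlation between By and y'Ay is of course only a symptom of independence, but it is easy to compute:

    import numpy as np

    rng = np.random.default_rng(4)

    n, sigma = 8, 1.5                      # arbitrary dimension and sigma
    mu = np.full(n, 0.7)                   # arbitrary mean vector

    A = np.eye(n) - np.ones((n, n)) / n    # idempotent, symmetric
    B = np.ones((1, n)) / n                # 1 x n averaging vector

    print(np.allclose(B @ A, 0))           # the condition BA = 0 of the theorem

    y = mu + sigma * rng.standard_normal((200_000, n))
    By = (y @ B.T).ravel()                 # the linear form By (here, the sample mean)
    yAy = np.einsum('ij,jk,ik->i', y, A, y)  # the quadratic form y'Ay

    print(np.corrcoef(By, yAy)[0, 1])      # close to 0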
Example: In this example we consider the distribution of some sampling statistics computed from a random sample drawn from a normal distribution. Suppose that xi ~ N(μ, σ²), i = 1, 2, ..., n, but neither μ nor σ² is known to us. We do have a hypothesis about the unknown value for the mean of these random variables. Namely, we believe the mean to be μ0.
From each of the xi we construct zi = (xi − μ0)/σ. Under the null hypothesis each of these random variables has a mean of 0 and a variance of 1. In order to test our hypothesis we must examine the following two random variables

    \frac{z'Az}{n-1}, \quad \text{where } A = I_n - \frac{1}{n}\iota\iota',

and

    Bz, \quad \text{where } B = \frac{1}{\sqrt{n}}\iota',

with ι an n×1 vector of ones, so that Bz = √n(x̄ − μ0)/σ.
To show that the two random variables are independent we need to check the product BA.

    BA = \frac{1}{\sqrt{n}}\iota'\left(I_n - \frac{1}{n}\iota\iota'\right)
       = \frac{1}{\sqrt{n}}\left(\iota' - \iota'\right) = 0 .

Since BA = 0 the two random variables are independent. The first is a χ² divided by its degrees of freedom: A is idempotent of rank n − 1, and z'Az = Σ(xi − x̄)²/σ² does not depend on the hypothesized value μ0, so it is a central χ². The second is a normal random variable. Therefore

    t = \frac{Bz}{\sqrt{z'Az/(n-1)}} = \frac{\sqrt{n}(\bar{x} - \mu_0)}{s},
    \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2,

has a Student's t distribution with n − 1 degrees of freedom. Note that the unknown σ cancels in this ratio, so the statistic can be computed from the data alone.
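A simulation sketch of the example (the values of μ0, σ, and n are made up): with the null true, the statistic √n(x̄ − μ0)/s is computed directly and matches the statistic produced by scipy's one-sample t test.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    mu0, sigma, n = 10.0, 2.0, 12          # made-up null mean, sigma, and sample size
    x = rng.normal(mu0, sigma, size=n)     # a sample generated with the null true

    xbar = x.mean()
    s = x.std(ddof=1)                      # s^2 = sum((x_i - xbar)^2) / (n - 1)
    t = np.sqrt(n) * (xbar - mu0) / s

    print(t)
    print(stats.ttest_1samp(x, mu0).statistic)    # the same statistic from scipy

    # Two-sided p-value from the t distribution with n-1 degrees of freedom.
    print(2 * stats.t(n - 1).sf(abs(t)))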