A Brief Introduction to Vector Autoregressions
Introduction
The basic premise is that structural systems models of the economy have done a poor job of forecasting. As an alternative, some modelers have proposed using atheoretic vector autoregressive (VAR) models. The idea is that we can use the interaction among several variables to improve the forecast of each individual variable: at time t the forecast of a variable y is a function of its own past values as well as the past values of the other variables in the system. A VAR does not come with the set of exclusion restrictions necessary to identify and estimate a structural model; as a consequence, both Keynesians and Monetarists could use the same VAR to forecast GDP.
As in the univariate analysis of the last chapter, the simplest model is a first order process, the VAR(1). Since this is the easiest specification to deal with, we'll use it almost exclusively; most of the results stated here generalize to higher order VAR processes. We'll write the first order process as

y_{t} = A y_{t-1} + ε_{t}.
The matrix A is an n×n matrix of unknown coefficients. The error terms ε_{t} have expectation zero and are independent over time, but they may be contemporaneously correlated, so the n×n error covariance matrix Σ = E(ε_{t}ε_{t}') need not be diagonal. A little thought suggests that the VAR system is just a reduced form for some unstated structural model.
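As a concrete illustration, a process of this form is easy to simulate. The following is a minimal sketch using NumPy; the coefficient matrix, the correlation r, and the sample size are purely illustrative choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2x2 coefficient matrix A (chosen so the process is stationary)
A = np.array([[0.5, 0.1],
              [0.2, 0.4]])

# Contemporaneously correlated errors: unit variances, covariance r
r = 0.6
Sigma = np.array([[1.0, r],
                  [r, 1.0]])

n = 500
eps = rng.multivariate_normal(np.zeros(2), Sigma, size=n)

y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = A @ y[t - 1] + eps[t]   # y_t = A y_{t-1} + eps_t
```

The errors are independent over time but correlated across equations, which is exactly the non-diagonal Σ described above.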
If y_{t} is a stationary random vector then an equivalent moving average representation can be derived. The process y_{t} is stationary if and only if the roots of the determinantal equation

|I − A L| = 0,

where L is the lag operator, lie outside the unit circle (equivalently, the eigenvalues of A lie strictly inside it). The infinite order MA representation is

y_{t} = B^{-1} ε_{t} = Σ_{i=0}^{∞} A^{i} ε_{t-i},

where B = (I − AL).
Conversely, if y_{t} is a finite MA and the roots of |B|=0 are outside the unit circle, then the process is invertible and can be rewritten as a possibly infinite order autoregression.
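For the VAR(1), the root condition is equivalent to requiring every eigenvalue of A to lie strictly inside the unit circle, which is easy to check numerically. A sketch (the matrices below are illustrative):

```python
import numpy as np

def is_stationary(A):
    # Roots of |I - A z| = 0 lie outside the unit circle
    # iff the eigenvalues of A lie strictly inside it.
    return bool(np.all(np.abs(np.linalg.eigvals(A)) < 1))

print(is_stationary(np.array([[0.5, 0.1],
                              [0.2, 0.4]])))   # True: stationary
print(is_stationary(np.array([[1.0, 0.0],
                              [0.0, 0.5]])))   # False: a unit root
```

The same eigenvalue check applied to the MA coefficient matrix tests invertibility.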
To pursue the different representations a bit further, the process might be a mixed AR(p) and MA(q). Hence

A(L) y_{t} = B(L) ε_{t},

where A(L) and B(L) are matrix polynomials of orders p and q in the lag operator.
If the AR part is stationary then the model can be rewritten as a MA, possibly of infinite order. Similarly if the MA part is invertible then the process can be written as an AR, possibly of infinite order.
To avoid the problem of determining p and q in the vector ARMA, we often write the model as a VAR of arbitrary finite order. Since the right hand side variables are all past realizations of the n×1 vector y, OLS can be applied equation by equation to estimate the unknown coefficient matrices.
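Because every equation has the same regressors (lags of y), equation-by-equation OLS is just one multivariate least squares problem. The sketch below recovers an illustrative coefficient matrix from simulated data; the values of A and the sample size are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.5, 0.1],
              [0.2, 0.4]])

# Simulate a bivariate VAR(1) with uncorrelated unit-variance errors
n = 2000
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = A @ y[t - 1] + rng.standard_normal(2)

# OLS: regress y_t on y_{t-1}; lstsq solves X B = Y, so A_hat = B'
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
print(np.round(A_hat, 2))   # close to A in large samples
```

With a longer lag order the regressor matrix simply stacks more lags of y; the mechanics are unchanged.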
Alternative Representations
To proceed we'll consider the two variable VAR(1) given by

x_{t} = a_{11} x_{t-1} + a_{12} y_{t-1} + u_{t}
y_{t} = a_{21} x_{t-1} + a_{22} y_{t-1} + v_{t},

where the errors have unit variances and contemporaneous covariance E(u_{t}v_{t}) = r. According to the model, an increase in u_{t} will cause x_{t} to increase by one unit and will cause y_{t} to increase by r units as well. Because of the contemporaneous covariance between u and v, a change in v cannot be attributed to a pure innovation in y.

The moving average representation of the system can be derived by recursive substitution and written as

(x_{t}, y_{t})' = Σ_{i=0}^{∞} A^{i} (u_{t-i}, v_{t-i})',

where A is the 2×2 matrix of the a_{ij}. Notice that the MA representation of x does not include the current value of v, nor does the MA representation of y include the current value of u. Is this contrary to the earlier assertion that a one unit innovation in x_{t} will cause y_{t} to increase by r units? No, since the contemporaneous correlation between u and v is r: a one unit change in u produces, on average, an r unit change in v, which does enter the MA representation of y. To pursue this, we proceed by first writing the model in matrix form:
(1)   (x_{t}, y_{t})' = A (x_{t-1}, y_{t-1})' + (u_{t}, v_{t})'
The innovations in this system of equations are given by

u_{t} = x_{t} − E(x_{t} | x_{t-1}, y_{t-1}),   v_{t} = y_{t} − E(y_{t} | x_{t-1}, y_{t-1}),

so the information set for each of x_{t} and y_{t} is {x_{t-1}, y_{t-1}}.
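Numerically, the MA coefficient matrices of a VAR(1) are just powers of A, and the lag zero matrix is the identity, which is why the current v does not enter x nor the current u enter y. A sketch with an illustrative A:

```python
import numpy as np

A = np.array([[0.5, 0.1],
              [0.2, 0.4]])

# MA representation y_t = sum_i A^i eps_{t-i}: the i-th coefficient is A^i
Psi = [np.linalg.matrix_power(A, i) for i in range(8)]

print(Psi[0])   # identity: no cross effects at lag zero
print(Psi[1])   # equals A
```

These matrices are exactly the impulse responses to the (correlated) reduced form innovations.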
The Cholesky decomposition of the error covariance matrix Σ, the lower triangular square root P such that Σ = PP', is

P = [[1, 0], [r, √(1−r²)]].

Multiply both sides of the VAR by H = P^{-1} = [[1, 0], [−r/√(1−r²), 1/√(1−r²)]] to get the recursive model

(2)   H (x_{t}, y_{t})' = HA (x_{t-1}, y_{t-1})' + (e_{1t}, e_{2t})'.

The new error terms are given by

e_{1t} = u_{t},   e_{2t} = (v_{t} − r u_{t}) / √(1−r²),

and they have zero correlation. The innovations in this second model are given by

e_{1t} = x_{t} − E(x_{t} | x_{t-1}, y_{t-1}),   e_{2t} ∝ y_{t} − E(y_{t} | x_{t}, x_{t-1}, y_{t-1}).
The information sets now differ from those of the first representation: the second equation of (2) conditions on the current x_{t} as well as the lagged values. In model (2) there are no endogenous variables on the right hand side of the x_{t} equation, but the current value of x_{t} appears in the y_{t} equation. Hence an innovation in x in the first equation has a direct effect on y, while a change in u_{t} affects y_{t} only through that x_{t} term, since the new error e_{2t} is orthogonal to u_{t} by construction.
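The orthogonalization is easy to verify numerically: with unit variances and covariance r, the lower triangular Cholesky factor P satisfies Σ = PP', and premultiplying the errors by H = P^{-1} makes their covariance matrix the identity. A sketch (r is an illustrative value):

```python
import numpy as np

r = 0.6
Sigma = np.array([[1.0, r],
                  [r, 1.0]])

# Lower triangular square root: Sigma = P P'
P = np.linalg.cholesky(Sigma)
H = np.linalg.inv(P)

# Transformed errors e = H (u, v)' have identity covariance
print(np.round(H @ Sigma @ H.T, 10))
```

Note that `np.linalg.cholesky` returns the lower triangular factor, which matches the convention used in the text.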
The moving average representation of the system has the same sort of characteristic. Deriving the MA model from (2) gives

(3)   (x_{t}, y_{t})' = Σ_{i=0}^{∞} A^{i} P (e_{1,t-i}, e_{2,t-i})'.

The current value of e_{2} does not appear in x_{t}, but the current value of e_{1} does appear in y_{t}, since the lag zero coefficient matrix P is lower triangular. We have a second triangularization of the same system. Now a change in e_{1t} by one unit produces a change of r units in y, although e_{1} and e_{2} are still uncorrelated.
Finally, these triangularizations are not unique: they depend on the ordering of the variables in the model. If we were to put y first, we would get a different set of relationships. The only circumstance under which the ordering doesn't matter is when u and v are uncorrelated.
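The ordering dependence can be demonstrated directly: factor Σ with x first, then with y first (permute, factor, permute back), and compare the implied lag zero impact matrices. The value of r is illustrative.

```python
import numpy as np

r = 0.6
Sigma = np.array([[1.0, r],
                  [r, 1.0]])

# Ordering (x, y): lower triangular factor of Sigma
P_xy = np.linalg.cholesky(Sigma)

# Ordering (y, x): permute, factor, permute back
J = np.array([[0.0, 1.0],
              [1.0, 0.0]])
P_yx = J @ np.linalg.cholesky(J @ Sigma @ J.T) @ J

# Both factorizations reproduce Sigma, yet the implied
# contemporaneous impact matrices differ whenever r != 0
print(np.allclose(P_xy @ P_xy.T, Sigma), np.allclose(P_yx @ P_yx.T, Sigma))
print(np.allclose(P_xy, P_yx))   # False when r != 0
```

Setting r = 0 makes Σ the identity, both factors collapse to the identity, and the ordering is irrelevant, exactly as stated above.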
Exogeneity
With all of these representations, an obvious question is whether x is exogenous with respect to y, or vice versa. In this usage we interpret the word 'exogenous' in the sense of a variable being determined outside the system. There are several different notions of exogeneity. In order to understand them we'll first consider the general vector ARMA model, then look at our specific two variable case.
(4)   c_{11}(L) y_{1t} + c_{12}(L) y_{2t} = d_{11}(L) ε_{1t} + d_{12}(L) ε_{2t}
      c_{21}(L) y_{1t} + c_{22}(L) y_{2t} = d_{21}(L) ε_{1t} + d_{22}(L) ε_{2t}

The n×1 random vector y has been partitioned into y_{1}: n_{1}×1 and y_{2}: n_{2}×1, and the error vector has been partitioned conformably. y_{1} is strictly exogenous with respect to y_{2} when it is not affected either directly or indirectly by lagged values of y_{2}. If c_{12}(L) = 0, then lagged values of y_{2} do not affect y_{1} directly. If d_{12}(L) = 0 and d_{21}(L) = 0, then the innovations ε_{2} do not enter the y_{1} equation, so there is no indirect channel through the errors. These three restriction matrices are necessary and sufficient. Let's impose the restrictions and multiply through by the inverse of the MA coefficient matrix to get the VAR representation:

D(L)^{-1} C(L) y_{t} = ε_{t},

in which the block of coefficients relating lagged y_{2} to the y_{1} equations is d_{11}(L)^{-1} c_{12}(L) = 0.
This gives us the necessary and sufficient condition for the exogeneity of y_{1} with respect to y_{2}: in the VAR representation, y_{1} is exogenous with respect to y_{2} if and only if the coefficients on lagged y_{2} in the y_{1} equations are all zero.
Now impose the exogeneity restrictions and multiply the ARMA in (4) by the inverse of the AR coefficient matrix to get the MA representation of the system:

y_{t} = C(L)^{-1} D(L) ε_{t},

in which the block of coefficients relating ε_{2} to y_{1} is c_{11}(L)^{-1} d_{12}(L) = 0. Hence, in the MA representation of the system, y_{1} is exogenous with respect to y_{2} if and only if the coefficients on ε_{2} in the y_{1} equations are all zero.
Sometimes you will see reference to the notion of weak exogeneity. The idea is similar to that of a sufficient statistic: a statistic is sufficient if the likelihood function can be factored into a part that depends on the statistic and the unknown parameter and a part that depends only on the data. The weak exogeneity of y_{1} with respect to y_{2} is defined in terms of the joint likelihood function for y_{1} and y_{2}, say L(y_{1}, y_{2}; θ_{1}, θ_{2}). If this can be factored into the product of two likelihood functions, L_{1}(y_{2} | y_{1}; θ_{1}) L_{2}(y_{1}; θ_{2}), such that θ_{1} can take any value in its parameter space given any admissible value for θ_{2}, then y_{1} is weakly exogenous for the parameters of interest θ_{1}.
There is also the notion of a variable being predetermined. The variable y_{2} is said to be predetermined if it is unaffected by present and future values of the disturbances in the equation of interest, although it may be affected by past values of those disturbances.
To cement these notions we again consider a simple model
(5)   y_{t} = β x_{t} + u_{t}
(6)   x_{t} = α y_{t} + δ y_{t-1} + v_{t}

The disturbances u_{t} and v_{t} are independent and serially uncorrelated. Since x_{t} is correlated with the error term in eqn (5), and similarly y_{t} with the error term in eqn (6), we need to solve for the reduced forms:

x_{t} = (δ y_{t-1} + α u_{t} + v_{t}) / (1 − αβ)
y_{t} = (βδ y_{t-1} + u_{t} + β v_{t}) / (1 − αβ)

Note that these reduced forms depend only on y_{t-1} and the current disturbances, are related through their error terms, and are a special case of the alternative representations that we considered earlier. If α = 0 then the reduced form for x_{t} becomes x_{t} = δ y_{t-1} + v_{t}. x_{t} is now no longer contemporaneously correlated with the error u_{t}, although it will be affected by past values of u through y_{t-1}. We say that x_{t} is predetermined.

Suppose that α = 0 and δ = 0, so that x_{t} = v_{t}. Now x_{t} is independent of past, present and future values of u_{t}, and it is said to be strictly exogenous.

Suppose that all we really needed was a consistent and asymptotically efficient estimate of β in equation (5). If α = 0 in equation (6) then x_{t} and u_{t} are contemporaneously uncorrelated, so the OLS estimate of β will be consistent and asymptotically efficient. In this case x_{t} is said to be weakly exogenous for β. If u and v had a joint normal distribution you could prove this using the factorization suggested above.

Suppose we now wish to use our estimate of β and equation (5) to forecast y_{t+1} for a given x_{t+1}. If δ ≠ 0, then y_{t} tells us what x_{t+1} must be, and the forecasted y_{t+1} tells us what x_{t+2} must be via equation (6). As a result of this feedback we are not able to pick any value we wish for x_{t+2}; we cannot use equation (5) in isolation from equation (6) for forecasting purposes. If δ = 0, then we are able to use equation (5) in isolation for forecasting. In this instance x_{t} is said to be strongly exogenous in equation (5).

Finally, suppose x_{t} in equation (5) is a policy variable. If β is invariant to changes in the process generating x_{t} then x_{t} is said to be super-exogenous.
Causality
To examine the notion of Wiener-Granger causality in a two variable system we introduce two summary tables. The first lays out the notation and the models; the second presents the results. Each pair of rows in Table 1 is a model, so there are four models.
Table 1: The Models

Model 1 (rows 1 and 2): each variable on its own past only
    x_{t} = Σ_{s=1}^{∞} a_{s} x_{t-s} + w_{1t}
    z_{t} = Σ_{s=1}^{∞} b_{s} z_{t-s} + u_{1t}

Model 2 (rows 3 and 4): each variable on the past of both
    x_{t} = Σ_{s=1}^{∞} c_{s} x_{t-s} + Σ_{s=1}^{∞} d_{s} z_{t-s} + w_{2t}
    z_{t} = Σ_{s=1}^{∞} e_{s} x_{t-s} + Σ_{s=1}^{∞} f_{s} z_{t-s} + u_{2t}

Model 3 (rows 5 and 6): model 2 with the current value of x added to the z equation
    x_{t} = Σ_{s=1}^{∞} c_{s} x_{t-s} + Σ_{s=1}^{∞} d_{s} z_{t-s} + w_{2t}
    z_{t} = (C/G_{2}) x_{t} + Σ_{s=1}^{∞} g_{s} x_{t-s} + Σ_{s=1}^{∞} h_{s} z_{t-s} + u_{3t}

Model 4 (rows 7 and 8): model 2 with the current value of z added to the x equation
    x_{t} = (C/T_{2}) z_{t} + Σ_{s=1}^{∞} k_{s} x_{t-s} + Σ_{s=1}^{∞} m_{s} z_{t-s} + w_{3t}
    z_{t} = Σ_{s=1}^{∞} e_{s} x_{t-s} + Σ_{s=1}^{∞} f_{s} z_{t-s} + u_{2t}

Var(u_{jt}) = T_{j}
Var(w_{jt}) = G_{j}
Cov(u_{2t}, w_{2t}) = C

The third model, the 5^{th} and 6^{th} rows, is found by premultiplying the second model, the 3^{rd} and 4^{th} rows, by [[1, 0], [−C/G_{2}, 1]]; this makes the new errors uncorrelated and gives T_{3} = T_{2} − C²/G_{2}.
Table 2: Wiener-Granger Causality

(1) z causes x if and only if d_{s} ≠ 0 for some s or, equivalently, G_{1} > G_{2}.
(2) x causes z if and only if e_{s} ≠ 0 for some s or, equivalently, T_{1} > T_{2}.
(3) z does not cause x if and only if d_{s} = 0 for all s or, equivalently, G_{1} = G_{2}.
(4) x does not cause z if and only if e_{s} = 0 for all s or, equivalently, T_{1} = T_{2}.
(5) There is instantaneous causality between x and z if and only if C ≠ 0 or, equivalently, T_{2} > T_{3}.
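The variance comparisons in Table 2 can be illustrated by simulation: fit each variable on its own past alone, then on the past of both series, and compare residual variances. The data generating process below is an assumption made for the example (z causes x, x does not cause z); a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000

# Simulate: lagged z enters x (z causes x); lagged x does not enter z
z = np.zeros(n)
x = np.zeros(n)
for t in range(1, n):
    z[t] = 0.5 * z[t - 1] + rng.standard_normal()
    x[t] = 0.4 * x[t - 1] + 0.3 * z[t - 1] + rng.standard_normal()

def resid_var(dep, regressors):
    # Residual variance from an OLS regression of dep on the regressors
    X = np.column_stack(regressors)
    b = np.linalg.lstsq(X, dep, rcond=None)[0]
    e = dep - X @ b
    return e @ e / len(e)

G1 = resid_var(x[1:], [x[:-1]])            # x on its own past
G2 = resid_var(x[1:], [x[:-1], z[:-1]])    # x on the past of both
T1 = resid_var(z[1:], [z[:-1]])            # z on its own past
T2 = resid_var(z[1:], [z[:-1], x[:-1]])    # z on the past of both

print(G1 > G2)            # z causes x: the variance falls noticeably
print((T1 - T2) / T1)     # x does not cause z: essentially no drop
```

In practice the variance comparison is carried out with an F test on the added lags rather than by eyeballing the drop, but the logic is the one in the table.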
Exogeneity and causality are not the same thing. Tests for the absence of a causal ordering can be used to refute strict exogeneity in a given specification, but such tests cannot be used to establish it. Furthermore, unidirectional causality is neither necessary nor sufficient for inference to proceed on a subset of variables; all we require is that the variables be weakly exogenous.