Time series analysis is about the identification, estimation and diagnostic checking
of stationary time series. By way of review we offer the following definitions:
Definition: The sequence {y_{t}} is said to be covariance stationary if for all t and t-s
E(y_{t}) = E(y_{t-s}) = m
Var(y_{t}) = Var(y_{t-s}) = s^{2}
Cov(y_{t}, y_{t-s}) = Cov(y_{t-j}, y_{t-j-s}) = g_{s}
That is, the mean, variance and covariances are invariant to the time origin.
Definition: Suppose we have the sequence {y_{t}} (t = 0, 1, 2, …) with mean m and variance s^{2}. Then the autocorrelation function, or correlogram, is given by
r_{s} = g_{s}/g_{0} = Cov(y_{t}, y_{t-s})/s^{2}
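As an illustration, here is a minimal numpy sketch (simulated data, illustrative parameter values) that generates a stationary AR(1) series and computes its sample correlogram; the geometric decay of r_s toward zero is the signature of a stationary series.

```python
import numpy as np

def correlogram(y, max_lag):
    """Sample autocorrelations r_s = g_s/g_0 for s = 0, ..., max_lag."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    g0 = d @ d / T                      # sample variance g_0
    return np.array([(d[s:] @ d[:T - s]) / T / g0 for s in range(max_lag + 1)])

# Simulate a stationary AR(1): y_t = 0.7*y_{t-1} + e_t, so |a_1| < 1.
rng = np.random.default_rng(0)
T = 500
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.7 * y[t - 1] + e[t]

r = correlogram(y, 5)   # r[0] = 1, then roughly 0.7, 0.49, 0.34, ...
```

For an AR(1) the theoretical correlogram is r_s = a_1^s, so the sample values should fall off geometrically.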
Suppose we have a series {y_{t}} which we know to have been generated by an AR(1) process, say,
y_{t} = a_{0} + a_{1}y_{t-1} + e_{t}     (1)
where |a_{1}| < 1 and e_{t} is white noise. We can estimate the parameters in (1) by OLS. Our estimator is efficient, and the series is stationary since |a_{1}| < 1. We could use a t-statistic to test a hypothesis about a_{1}. This is a legitimate test since the null is a refutable hypothesis, even though the power against a local alternative is negligible.
But suppose that the data was really generated by
y_{t} = a_{0} + y_{t-1} + e_{t}
Upon recursive substitution this can be rewritten as
y_{t} = y_{0} + a_{0}t + Σ e_{i}     (i = 1, …, t)
which is nonstationary since Var(y_{t}) = ts^{2} grows without bound as t gets large. Now we would want to test
H_{0}: a_{1} = 1
There is a problem, however, since under the null the sampling distribution of the usual OLS estimator is nonstandard, with its center of mass below 1. We would tend to err on the side of rejecting too many true H_{0}. The question of the presence of a unit root is particularly problematic in regression models of the sort
y_{t} = a_{0} + a_{1}z_{t} + e_{t}
We usually assume that {y_{t}} and {z_{t}} are both stationary and that e_{t} is white noise. If the two variables are nonstationary then we will likely get spurious results: high R^{2} and statistically significant coefficients even though there may not really be a meaningful relationship between y and z. There are four cases to consider:
1. Both {y_{t}} and {z_{t}} are stationary, and the classical regression model is appropriate.
2. {y_{t}} and {z_{t}} are integrated of different orders, and the regression is meaningless.
3. {y_{t}} and {z_{t}} are integrated of the same order but the residual sequence is nonstationary, and the regression is spurious.
4. Both {y_{t}} and {z_{t}} are unit root processes but some linear combination of them is stationary.
We will leave case 4 until the chapter on cointegration. For now we will concern ourselves with determining whether or not the series {y_{t}} has a unit root.
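The spurious regression problem is easy to see in a small Monte Carlo (a numpy sketch on simulated data, not the series discussed in this section): regress one random walk on another, independent, random walk and count how often a conventional t-test declares the slope significant at the nominal 5% level.

```python
import numpy as np

# Regress one random walk on an independent random walk, many times, and
# count how often a conventional t-test "finds" a relationship at 5%.
# Under classical assumptions this should happen about 5% of the time.
rng = np.random.default_rng(1)
T, reps = 100, 500
rejections = 0
for _ in range(reps):
    y = np.cumsum(rng.standard_normal(T))
    z = np.cumsum(rng.standard_normal(T))
    X = np.column_stack([np.ones(T), z])
    b, ssr = np.linalg.lstsq(X, y, rcond=None)[:2]
    s2 = ssr[0] / (T - 2)                          # error variance estimate
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    if abs(b[1] / se) > 1.96:
        rejections += 1

spurious_rate = rejections / reps   # far above the nominal 0.05
```

The rejection rate is far above 5% even though y and z are independent by construction, which is exactly the spurious regression phenomenon.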
Consider the data generating process
y_{t} = a_{1}y_{t-1} + e_{t}
and the associated question: is a_{1} = 1? Subtract y_{t-1} from both sides to get
Δy_{t} = (a_{1} - 1)y_{t-1} + e_{t} = g y_{t-1} + e_{t}
where g = a_{1} - 1. Then g = 0 implies a_{1} = 1, which implies a unit root in {y_{t}}.
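To see how the estimator of g behaves under the null, here is a small numpy Monte Carlo (illustrative sample size and replication count): with a true unit root, the OLS estimates of g pile up below zero, which is the downward shift in the center of mass noted above.

```python
import numpy as np

# Fit Δy_t = g*y_{t-1} + e_t by OLS when the data are a pure random walk
# (true g = 0). Across replications the estimates are centered below zero.
rng = np.random.default_rng(2)
T, reps = 100, 2000
g_hats = np.empty(reps)
for i in range(reps):
    y = np.cumsum(rng.standard_normal(T + 1))
    dy, ylag = np.diff(y), y[:-1]
    g_hats[i] = (ylag @ dy) / (ylag @ ylag)   # OLS slope, no intercept

mean_g = g_hats.mean()   # negative: the estimator is biased below g = 0
```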
We can allow for drift by including an intercept:
Δy_{t} = a_{0} + g y_{t-1} + e_{t}
Definition: The term stochastic drift comes from the following. Suppose that the process is
y_{t} = a_{0} + y_{t-1} + e_{t}
We can rewrite this as
y_{t} = y_{0} + a_{0}t + Σ e_{i}     (i = 1, …, t)
In the next period, i.e. t+1, the intercept is a_{0} larger, to which we add a stochastic term. We have seen this idea of a stochastic intercept elsewhere, namely in the random effects model.
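The recursive-substitution representation can be checked by simulation. The numpy sketch below (illustrative drift and sample sizes) confirms that for a random walk with drift the variance across realizations grows roughly like t·s^{2}:

```python
import numpy as np

# y_t = y_0 + a_0*t + sum_{i<=t} e_i, so Var(y_t) = t*s^2 (s^2 = 1 here).
rng = np.random.default_rng(3)
a0, T, reps = 0.1, 200, 2000
e = rng.standard_normal((reps, T))
y = a0 * np.arange(1, T + 1) + np.cumsum(e, axis=1)   # y_0 = 0, drift a_0

var_50 = y[:, 49].var()     # across replications: about 50
var_200 = y[:, 199].var()   # about 200: the variance grows with t
```

The drift shifts the mean path but contributes nothing to the variance; it is the accumulated shocks that make the series nonstationary.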
We can allow for a linear trend with drift:
Δy_{t} = a_{0} + g y_{t-1} + a_{2}t + e_{t}
In any event, our test of hypothesis is
H_{0}: g = 0 against H_{1}: g < 0
The test statistic we use for the test of hypothesis is constructed as a t-statistic. That is,
t = ĝ/se(ĝ)
where ĝ is the OLS estimate of g.
The critical values come from a set of tables prepared by Dickey and Fuller. The tables were generated empirically, by simulation; we are accustomed to doing tests with critical values we have determined analytically by integration of a known distribution function. The particular table to be used depends on whether the model has an intercept or a trend in it. However, the critical values are not changed by including additional lagged differences of y_{t} on the right hand side.
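The remark that the tables were generated empirically can be reproduced in miniature. The numpy sketch below (illustrative T and replication count) simulates random walks, computes the t-ratio on y_{t-1} in the intercept-only model, and takes the empirical 5% quantile; it lands near the tabulated value of about -2.89 rather than the normal-theory -1.65.

```python
import numpy as np

# Simulate the null (a pure random walk), fit Δy_t = a_0 + g*y_{t-1} + e_t,
# and record the t-ratio on y_{t-1}; its empirical 5% quantile is the
# Dickey-Fuller critical value for the intercept-only model.
rng = np.random.default_rng(4)
T, reps = 100, 4000
tstats = np.empty(reps)
for i in range(reps):
    y = np.cumsum(rng.standard_normal(T + 1))
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones(T), ylag])
    b, ssr = np.linalg.lstsq(X, dy, rcond=None)[:2]
    s2 = ssr[0] / (T - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    tstats[i] = b[1] / se

crit_5 = np.quantile(tstats, 0.05)   # near -2.89, not the normal -1.65
```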
To guide you in the testing procedure, consider the following flow diagram from Walter
Enders, Applied Econometric Time Series, Wiley, 1995.
One starts in the top left corner with the most general model, which includes a stochastic drift and a deterministic trend. Either the trend or the drift can produce the appearance of a unit root in its own right, so they must be included at the outset. Recall that an excluded relevant variable introduces bias, but an included irrelevant variable only has a cost in terms of efficiency. If the null of a unit root is not rejected, then proceed by testing for the significance of the trend term in the presence of a unit root. If the trend term is not significant, then test for the significance of the drift term. If along the way we find that either the trend or the drift is not zero, then we proceed immediately to test for the significance of g.
Example
The following models have been fit to the Federal Reserve Bank production index for the period 1950:1 - 1977:4, a total of 112 observations. In all three models the numbers in parentheses are standard errors. The most general model, corresponding to the start of the flow diagram, is fit first.
At the 5% level of test (2.5% in each tail) the critical value for the t-statistic on y_{t-1} in a model with drift and trend is -3.73, compared to an observed test statistic of -3.6, so we fail to reject the null. For the moment we believe there to be a unit root. Next we fit a model which imposes the restriction that g = 0,
and test to see if the trend coefficient is zero. Note that on the basis of a conventional t-test, the trend coefficient is highly significant.
A model with drift but no trend, and which supposes that there is a unit root, is fit next. Now the test of hypothesis is
H_{0}: unit root, no trend (g = a_{2} = 0)
H_{1}: one or both not true
The appropriate test statistic is constructed as though it were an F-test, but the critical value is read from a different set of tables.
The critical value at the 5% level is 6.49, so we fail to reject the null. Our conclusion
to this point is that there is a unit root and that the trend should be excluded.
A model with neither drift nor trend, but which presumes a unit root, is fit last. The test of hypothesis is
H_{0}: unit root, no trend, no drift (a_{0} = a_{2} = g = 0)
H_{1}: one or more belongs
The critical value at the 1% level of test is 6.50. Since our observed test statistic is smaller than the critical value, we fail to reject the null. Our conclusion is that there is a unit root and neither trend nor drift.
Extension of Dickey-Fuller
Suppose that the data generating process is the AR(p)
y_{t} = a_{0} + a_{1}y_{t-1} + a_{2}y_{t-2} + … + a_{p}y_{t-p} + e_{t}
This is quite a bit more general than the process we started with. It will also admit a multiplicity of roots. We need to augment Dickey-Fuller in order to test for this possibility. Let us consider the AR(3) process
y_{t} = a_{0} + a_{1}y_{t-1} + a_{2}y_{t-2} + a_{3}y_{t-3} + e_{t}
We will add and subtract a_{3}y_{t-2} to get
y_{t} = a_{0} + a_{1}y_{t-1} + (a_{2}+a_{3})y_{t-2} - a_{3}Δy_{t-2} + e_{t}
Now add and subtract (a_{2}+a_{3})y_{t-1} to get
y_{t} = a_{0} + (a_{1}+a_{2}+a_{3})y_{t-1} - (a_{2}+a_{3})Δy_{t-1} - a_{3}Δy_{t-2} + e_{t}
Finally, subtract y_{t-1} from both sides:
Δy_{t} = a_{0} + g y_{t-1} - (a_{2}+a_{3})Δy_{t-1} - a_{3}Δy_{t-2} + e_{t}
where g = a_{1} + a_{2} + a_{3} - 1. Now we can test for the presence of a unit root. We know that if the coefficients in a difference equation sum to one then at least one root is unity. In the present context this amounts to testing g = 0, as in the simpler case. The
critical values for this augmented model remain the same as before. Parenthetically, adding a time trend causes a headache when it comes time to derive the large sample properties of the OLS estimator, since the elements of X'X then grow at different rates and no single normalization of X'X has a finite elementwise limit.
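The AR(3) rearrangement is pure algebra, which we can verify numerically. The sketch below (numpy, illustrative coefficients) confirms that the level form and the differenced form with g = a_1 + a_2 + a_3 - 1 produce identical values of Δy_t term by term.

```python
import numpy as np

# Simulate a stationary AR(3), then compare Δy_t from the level form with
# Δy_t = a_0 + g*y_{t-1} - (a_2+a_3)*Δy_{t-1} - a_3*Δy_{t-2} + e_t.
rng = np.random.default_rng(5)
a0, a1, a2, a3 = 0.2, 0.5, 0.2, 0.1     # illustrative, sum < 1 (stationary)
T = 300
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(3, T):
    y[t] = a0 + a1 * y[t-1] + a2 * y[t-2] + a3 * y[t-3] + e[t]

g = a1 + a2 + a3 - 1.0
dy = np.diff(y)                          # dy[k] = y[k+1] - y[k]
lhs = dy[2:]                             # Δy_t for t = 3, ..., T-1
rhs = a0 + g * y[2:-1] - (a2 + a3) * dy[1:-1] - a3 * dy[:-2] + e[3:]
max_gap = np.abs(lhs - rhs).max()        # zero up to floating-point rounding
```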
Problems with DF and augmented DF
1. The error term may have a moving average term in it.
Suppose A(L)y_{t} = C(L)e_{t} and the roots of C(L) all lie outside the unit circle, so C(L) is invertible. Then
C(L)^{-1}A(L)y_{t} = D(L)y_{t} = e_{t}
Unfortunately D(L) will be of infinite order, but we can use our earlier procedure to write the model as a Dickey-Fuller regression with lagged differences on the right hand side. With our finite data sets we might be in trouble if not for the fact that it has been shown that a good approximation results from cutting off the distributed lag at about the T^{1/3} term.
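The invertibility argument can be illustrated directly (numpy sketch, illustrative MA coefficient): inverting an MA(1) gives autoregressive weights (-c)^j that die out geometrically, so a truncated autoregression recovers the innovations to good accuracy.

```python
import numpy as np

# Invert y_t = e_t + c*e_{t-1}, |c| < 1: e_t = sum_j (-c)^j * y_{t-j}.
c = 0.6
weights = np.array([(-c) ** j for j in range(20)])
ratio = abs(weights[10] / weights[0])     # weight 10 lags back: about 0.006

# Check: the truncated autoregression recovers the innovation at t = 150.
rng = np.random.default_rng(6)
e = rng.standard_normal(200)
y = e.copy()
y[1:] += c * e[:-1]                       # MA(1) series
t = 150
e_hat = sum(weights[j] * y[t - j] for j in range(20))
gap = abs(e_hat - e[t])                   # tiny truncation error
```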
2. How many lagged differences should be included?
Using too many lags reduces the efficiency of the estimator. This is a much less serious problem than using too few lags: as pointed out before, excluding relevant variables affects the bias and consistency of the OLS estimator.
3. DF tests for the presence of at least one unit root. What if there are more?
For example, one could estimate the parameters of the model
(1-L)^{2}y_{t} = b_{1}(1-L)y_{t-1} + e_{t}
One would then use the DF statistics, as appropriate to the case, to test b_{1} = 0. If b_{1} = 0 then there are two unit roots; if it is not zero then one must go on and test to see if there is a single unit root. The procedure is generalized in the obvious way.
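A numpy sketch of the double-root regression (simulated data, illustrative sample size): when y_t is I(2) the estimate of b_1 is near zero, while for a driftless random walk (Δy_t = e_t) it converges to -1, so the regression distinguishes the two cases.

```python
import numpy as np

# b1_hat from the regression (1-L)^2 y_t = b_1 (1-L) y_{t-1} + e_t.
def b1_hat(y):
    d1 = np.diff(y)            # (1-L) y_t
    d2 = np.diff(d1)           # (1-L)^2 y_t
    x = d1[:-1]                # (1-L) y_{t-1}, aligned with d2
    return (x @ d2) / (x @ x)

rng = np.random.default_rng(7)
T = 2000
e = rng.standard_normal(T)
y_one_root = np.cumsum(e)               # single unit root (random walk)
y_two_roots = np.cumsum(y_one_root)     # two unit roots, an I(2) series

b_two = b1_hat(y_two_roots)   # near 0: fail to reject b_1 = 0, two roots
b_one = b1_hat(y_one_root)    # near -1: reject, go test for a single root
```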
4. How do we know which deterministic regressors belong in the model?
The procedures used in the FRB production example and in problems 2 and 3 use cascading tests of hypothesis. As is shown in Theil, Principles of Econometrics, Wiley, 1971, this reduces the purported significance level of the test at each succeeding step. Along the same lines, Judge and his numerous coauthors would argue that the procedure outlined in the flow chart puts us in the realm of pretesting, and hence of higher squared error loss over a large part of the parameter space. Nonetheless, in applied work we often disregard these caveats and use the process in the flow chart.
Under PPP, the rate of currency depreciation is approximately equal to the difference between the domestic and foreign inflation rates. The PPP model implies
e_{t} = p_{t} - p^{*}_{t} + d_{t}
where p_{t} = log of the US price level
p^{*}_{t} = log of the foreign price level
e_{t} = log of the dollar price of foreign exchange
d_{t} = deviation from PPP at time t
The three data series are in logs, so that their first differences are (approximately) inflation and depreciation rates.
In certain PPP models it is possible for real shocks to either demand or supply to cause permanent deviations. Intuitively, though, deviations should not persist, for otherwise there would be substantial opportunities for profit taking, and such profit taking and arbitrage would restore PPP eventually. A popular procedure in the empirical modeling of PPP is to construct the series
construct the series
r_{t} = e_{t} + p_{t}^{*}  p_{t}
If PPP is to hold then r_{t} must be stationary with a zero mean. Furthermore, there can be neither trend nor stochastic drift. To digress and anticipate the material in another section: e_{t}, p_{t} and p^{*}_{t} are said to be cointegrated when the PPP model is true. This specific formulation of the model imposes a specific cointegrating vector on the three variables.
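The testing idea can be sketched on artificial data (numpy; the price and exchange-rate series below are simulated, not the Bretton Woods data used next): when the deviation d_t is stationary white noise, r_t = e_t + p*_t - p_t inherits that stationarity, and the Dickey-Fuller t-statistic lands deep in the rejection region.

```python
import numpy as np

# Simulated logs of US prices, foreign prices, and the exchange rate,
# constructed so that PPP holds with a white-noise deviation d_t.
rng = np.random.default_rng(8)
T = 300
p = np.cumsum(0.010 + 0.02 * rng.standard_normal(T))        # log US prices
p_star = np.cumsum(0.008 + 0.02 * rng.standard_normal(T))   # log foreign prices
d = 0.05 * rng.standard_normal(T)                           # stationary deviation
e_rate = p - p_star + d                                     # log exchange rate

r = e_rate + p_star - p              # equals d_t: stationary when PPP holds
dr, rlag = np.diff(r), r[:-1]
X = np.column_stack([np.ones(T - 1), rlag])
b, ssr = np.linalg.lstsq(X, dr, rcond=None)[:2]
s2 = ssr[0] / (T - 3)
tau = b[1] / np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])   # strongly negative
```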
Fit the model
to monthly data for the pre- (1960.1 - 1971.4, T = 136) and post- (1973.1 - 1986.11, T = 167) Bretton Woods eras to get the following results, with coefficient standard errors in parentheses:
Era        g             a_{0}    a_{2}    SD       SEE     t
1973-86    .047 (.074)   1.25     .297     10.44    2.81    .63
1960-71    .030 (.028)   .97      .017     .005     1.04
In this example we fail to reject the null of a unit root, so we cannot believe in the PPP model. But our testing procedure is predicated on the constant variance of the error term, which does not appear to be the case here.
Phillips and Perron have devised corrected test statistics for the instances in which the error has a moving average component, is perhaps heterogeneously distributed, or there is a structural break in the data.
How can we tell the difference between a series which has a structural break in it,
but would otherwise be stationary, and a series which is not stationary, but which due to
an impulse seems to evolve like the first series?
Consider a model in which there is a shift in the intercept:
y_{t} = .5y_{t-1} + e_{t} + D_{L}
where D_{L} is one for many consecutive periods and zero otherwise. An example is shown in the following figure.
The red line is the original series. The blue line is the simple regression of y_{t}
on time (a=3.543, b=.189). In the regression of y_{t} on y_{t1} we get
Apparently the structural break causes the coefficient on y_{t-1} to be biased toward one. To all appearances y_{t} is not stationary, although we know it to be stationary both before and after the break at t = 50. Even without doing the test for this case, we would not expect Dickey-Fuller to be very robust against models with a structural break in them. Indeed, the observed test statistic is t = .507.
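The level-shift experiment is easy to replicate (numpy sketch; the break date and shift size are illustrative, not those behind the figure): fit an AR(1) to a series that is stationary on each side of a shift in the intercept, and the estimated coefficient is pushed from the true .5 toward one.

```python
import numpy as np

# y_t = .5*y_{t-1} + e_t + D_L, with D_L = 5 from t = 100 on (illustrative).
rng = np.random.default_rng(9)
T, break_t, shift = 200, 100, 5.0
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    D_L = shift if t >= break_t else 0.0   # level-shift dummy
    y[t] = 0.5 * y[t - 1] + e[t] + D_L

X = np.column_stack([np.ones(T - 1), y[:-1]])
b = np.linalg.lstsq(X, y[1:], rcond=None)[0]
rho_hat = b[1]   # far above the true 0.5, pushed toward one by the break
```

The between-regime variation dominates the within-regime variation, so the pooled lag coefficient is close to unity even though each regime is a stationary AR(1).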
Now consider a nonstationary model in which there has been a once-and-done pulse:
y_{t} = y_{t-1} + e_{t} + D_{P}
where D_{P} is one in a given period and zero otherwise. An example is in the following figure:
following figure:
The red line is the original series. The blue line is the simple regression of y_{t}
on time (a=8.086, b=.233). There is an apparent break at t=50. The regression of y_{t}
on its lagged value gives us
Even without a formal test, the size of the coefficient leads us to suspect a unit
root, which is indeed the case. Without a statistical test we really cannot distinguish
this case from the prior instance.
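The pulse case can be simulated the same way (numpy sketch, illustrative pulse size): a random walk hit by a once-and-done pulse still yields an AR(1) coefficient near one, and nothing in the OLS output distinguishes it from the broken stationary series above.

```python
import numpy as np

# y_t = y_{t-1} + e_t + D_P, with a single pulse at t = 100 (illustrative).
rng = np.random.default_rng(10)
T, pulse_t = 200, 100
e = rng.standard_normal(T)
e[pulse_t] += 8.0                # once-and-done pulse D_P
y = np.cumsum(e)                 # unit root process

X = np.column_stack([np.ones(T - 1), y[:-1]])
b = np.linalg.lstsq(X, y[1:], rcond=None)[0]
rho_hat = b[1]                   # near one, just as for the level-shift case
```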
Phillips and Perron have developed a test for this problem. Consider the working model
y_{t} = a_{0} + a_{1}y_{t-1} + a_{2}t + m_{1}D_{P} + m_{2}D_{L} + e_{t}
in which D_{P} is a pulse equal to one in a single period and zero otherwise, and D_{L} is one for some consecutive periods and zero otherwise.
Step 1. Estimate the coefficients of the full model.
Step 2. Compare the t-statistics to the critical values tabulated by Perron. Of particular interest will be the coefficient a_{1}.
When Perron used this method to analyze the Nelson-Plosser data he found that most macro time series are trend stationary.