- At any point in time a variable, like household spending on Pringles, is a random variable.
- Time flows in only one direction.
- Since household spending on Pringles is a random variable and time flows in only one direction, we only ever get to see one realization of the data; the experiment cannot be repeated.
- In effect, our time series sample is composed of a sequence of individual draws from the probability distribution that exists at each point in time. There is an ensemble of distributions, and we get a data point from each member of the ensemble.
- In cross section data all the sample observations are drawn from the same distribution that exists at that point in time, so we can use the sample drawn from a single distribution to infer something about that distribution's parameters.
- In order to make headway in our job of inference we need to introduce the idea of stationarity.

There are two flavors of stationarity.

**Strong stationarity** - The variable's distribution function is the same at every point in time, with the same mean and variance. Therefore, our one realization of the data series can be used to make inferences.

**Weak stationarity** - Sometimes referred to as covariance stationarity. Whatever the distributions are at each point, they all have the same mean, the same variance, and the covariances between periods do not depend on the absolute point in time, but only on the relative positions of the two periods.

In the definition of weak stationarity we said that **Cov(z _{t}, z_{t-h})** depends only on the gap h between the two periods, not on the absolute date t.

Our model now has both contemporaneous and lagged observations on the independent variable, a finite distributed lag model of the sort $y_t = \alpha_0 + \delta_0 z_t + \delta_1 z_{t-1} + \delta_2 z_{t-2} + u_t$.

With reference to the above model, can you explain the *impact multiplier* and the *long-run propensity*?

What assumptions are necessary for unbiasedness and consistency of OLS? Is OLS efficient? What assumptions are necessary? What are the meanings of *contemporaneous* and *strict* exogeneity?

Does a model with a lagged dependent variable, like $y_t = \beta_0 + \beta_1 y_{t-1} + u_t$, satisfy strict exogeneity?

What is the meaning of **serial correlation** in the error term?

Is there any new news when it comes to the use of functional forms in time series specifications?

In our earlier examples of the use of dummy variables we used them to categorize, perhaps ordinally, groups in the data. What is their function in time series models?

**1. Trends**

Linear trend: $y_t = \alpha_0 + \alpha_1 t + e_t$. How do you interpret the 'slope' coefficient?

Exponential trend: $\log(y_t) = \beta_0 + \beta_1 t + e_t$. How do you interpret the 'slope' coefficient in this specification?
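The two interpretations can be sketched with simulated data (all numbers hypothetical): in the linear specification the slope is the average change in y per period; in the log specification it is approximately the per-period growth rate.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
t = np.arange(T)

# Hypothetical series growing about 2% per period
y = 100 * np.exp(0.02 * t + rng.normal(0, 0.05, T))

# Linear trend y_t = a0 + a1*t + e_t: a1 is the average change per period
a1, a0 = np.polyfit(t, y, 1)

# Exponential trend log(y_t) = b0 + b1*t + e_t: b1 approximates the growth rate
b1, b0 = np.polyfit(t, np.log(y), 1)

print(f"linear slope (units per period): {a1:.3f}")
print(f"log-linear slope (growth rate):  {b1:.4f}")  # close to the true 0.02
```

Note that the linear slope depends on the units of y, while the log-linear slope is unit-free, which is one reason the exponential specification is common for growing series.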

**2. Common or Shared Trends**

Don't mistake for causation the fact that two variables may exhibit trends in the same or opposite directions.

**3. Seasonality**

Retail sales of candy canes are always greater in December than in other months of the year. Therefore, in discussing the increase in candy cane sales we want to be sure to compare the change in sales after adjusting for the fact that sales are always higher in December.
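A minimal sketch of that adjustment with a seasonal dummy, using simulated candy cane sales (the baseline of 100 and December bump of 40 are hypothetical): the dummy coefficient soaks up the predictable December effect, so the intercept measures the non-December norm.

```python
import numpy as np

rng = np.random.default_rng(1)
n_years = 10
month = np.tile(np.arange(1, 13), n_years)   # month index 1..12, repeated
dec = (month == 12).astype(float)            # December dummy

# Hypothetical sales: baseline 100 plus a December bump of 40, plus noise
sales = 100 + 40 * dec + rng.normal(0, 5, month.size)

# OLS of sales on an intercept and the December dummy
X = np.column_stack([np.ones(month.size), dec])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

print(f"non-December mean: {beta[0]:.1f}")
print(f"December premium:  {beta[1]:.1f}")
```

With a full set of eleven monthly dummies the same logic deseasonalizes all months at once.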

A strongly dependent series is one that is said to have a unit root. Such series are not stationary. An example is a random walk, $y_t = y_{t-1} + e_t$, in which this period's realization of the variable is the best predictor of next period's realization of the random variable.

The logical implication of this is that there is no systematic component to the variable of interest that can be learned in order to gain an advantage over others trying to make the same prediction. In particular, see Burton Malkiel's **A Random Walk Down Wall Street**.

Another example of strong dependence is a random walk with drift, $y_t = \alpha_0 + y_{t-1} + e_t$.

There are transformations that can be used to solve the problem of strong dependence. The most common is differencing the data: $\Delta y_t = y_t - y_{t-1}$.
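A quick sketch of why differencing works (simulated data): differencing a pure random walk recovers the white-noise increments, which are stationary.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500

# Random walk y_t = y_{t-1} + e_t: strongly dependent, has a unit root
e = rng.normal(0, 1, T)
y = np.cumsum(e)

# First difference dy_t = y_t - y_{t-1} = e_t: white noise, stationary
dy = np.diff(y)

# The level series wanders; the differenced series hugs a constant mean of zero
print(f"variance of y:  {y.var():.1f}")
print(f"variance of dy: {dy.var():.2f}")  # close to 1, the variance of e
```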

To see how we test (we are, after all, in the business of testing hypotheses) for the presence of a unit root, strong dependence, you'll have to read Chapter 18 in Wooldridge.

In olden times we had a model of the sort

$$y_t = \beta_0 + \beta_1 x_t + u_t$$

and the error had the properties that

$$E(u_t) = 0, \qquad Var(u_t) = \sigma^2, \qquad Cov(u_t, u_s) = 0 \ \text{ for } t \neq s.$$

The times they are a changin' and we now believe

$$u_t = \rho u_{t-1} + e_t, \qquad |\rho| < 1.$$

We continue to assume the error is uncorrelated with the current period's regressors, $E(u_t \mid x_t) = 0$; this is not as stringent as strict exogeneity, which requires the error to be uncorrelated with the regressors in every period.

The first expression tells us that the error this period is a fraction of last period's error plus a white noise error, e. White noise simply means that the error term e has a mean of zero, is uncorrelated with the independent variables, and has constant variance.

In this brave new world OLS is still unbiased and consistent as long as the right hand side variables are strictly exogenous (See above or footnote 1). If we just rely on no correlation between the exogenous variables and e then OLS is still consistent.

To repeat something that we have said many times before, writing $x$ in deviations from its mean so that $SST_x = \sum_t x_t^2$,

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{t=1}^{T} x_t u_t}{\sum_{t=1}^{T} x_t^2}$$

And the variance of this estimator is

$$Var(\hat{\beta}_1) = \frac{\sigma^2}{SST_x}\left[1 + \frac{2}{SST_x}\sum_{t=1}^{T-1}\sum_{j=1}^{T-t}\rho^{j} x_t x_{t+j}\right]$$

The obvious consequence is that OLS is not efficient because it does not account for all of the stuff in square brackets. It is also paramount that you remember that by **default** any computer regression software reports the coefficient variance as $\sigma^2 / SST_x$, which is clearly wrong when $\rho \neq 0$.

Again, in this modern era we suppose $u_t = \rho u_{t-1} + e_t$ to be true, but hope that it is not. With this in mind, our null and alternate hypotheses are $H_0: \rho = 0$ and $H_1: \rho \neq 0$.

1. Run the OLS regression for your posited model and save the residual series as $\hat{u}_t$.

2. Run the regression $\hat{u}_t = \rho \hat{u}_{t-1} + e_t$.

3. Do a t-test on your estimate of rho.
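The three steps above can be sketched with simulated data (the true model and $\rho = 0.6$ are hypothetical choices):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 300

# Hypothetical model y_t = 1 + 2 x_t + u_t with AR(1) errors, rho = 0.6
x = rng.normal(0, 1, T)
e = rng.normal(0, 1, T)
u = np.zeros(T)
for s in range(1, T):
    u[s] = 0.6 * u[s - 1] + e[s]
y = 1 + 2 * x + u

# Step 1: OLS of y on x; save the residuals
X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
uhat = y - X @ b

# Step 2: regress uhat_t on uhat_{t-1} (no intercept)
u1, u0 = uhat[1:], uhat[:-1]
rho = (u0 @ u1) / (u0 @ u0)

# Step 3: t-test on rho
resid = u1 - rho * u0
se = np.sqrt((resid @ resid) / (len(u1) - 1) / (u0 @ u0))
t_stat = rho / se
print(f"rho = {rho:.3f}, t = {t_stat:.1f}")  # a large t rejects H0: rho = 0
```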

The null and alternate are as stated above. The Durbin-Watson (DW) statistic is computed for you and reported in the regression output of any software package. The DW statistic is centered on 2. There are both upper and lower critical values, as with a z-score or t-stat, notwithstanding the remarks in your text. The null is rejected for very large or very small observed values of the test statistic.

1. Run the OLS regression for your posited model and save the residual series as $\hat{u}_t$.

2. Run the regression of $\hat{u}_t$ on $x_t$ and $\hat{u}_{t-1}$, including the original regressors in case they are not strictly exogenous.

3. Do a t-test on your estimate of rho.

Now you believe that this period's error depends, say, on last period's error and the error before that. Your supposition can be written as $u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + e_t$.

Again we suppose this to be true, but hope that it is not. Now our null and alternate are $H_0: \rho_1 = \rho_2 = 0$ and $H_1:$ at least one of $\rho_1, \rho_2$ differs from zero.

1. Run the OLS regression for your posited model and save the residual series as $\hat{u}_t$.

2. Run the regression of $\hat{u}_t$ on $x_t$, $\hat{u}_{t-1}$, and $\hat{u}_{t-2}$.

3. Compute the coefficient of determination, $R^2$, for the auxiliary regression.

4. One's first guess might be to construct an F-test, but what we want is the LM statistic $(T-2)R^2$, which is distributed $\chi^2_2$ under the null.
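The four steps can be sketched as follows with simulated data (the model and the AR(2) coefficients 0.4 and 0.3 are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 400

# Hypothetical model y_t = 1 + 2 x_t + u_t with AR(2) errors
x = rng.normal(0, 1, T)
e = rng.normal(0, 1, T)
u = np.zeros(T)
for s in range(2, T):
    u[s] = 0.4 * u[s - 1] + 0.3 * u[s - 2] + e[s]
y = 1 + 2 * x + u

# Step 1: OLS, save residuals
X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
uhat = y - X @ b

# Step 2: auxiliary regression of uhat_t on x_t, uhat_{t-1}, uhat_{t-2}
Z = np.column_stack([np.ones(T - 2), x[2:], uhat[1:-1], uhat[:-2]])
g, *_ = np.linalg.lstsq(Z, uhat[2:], rcond=None)
resid = uhat[2:] - Z @ g

# Step 3: R^2 of the auxiliary regression
r2 = 1 - resid.var() / uhat[2:].var()

# Step 4: LM statistic (T-2)*R^2, chi-square with 2 df under H0
lm = (T - 2) * r2
print(f"R^2 = {r2:.3f}, LM = {lm:.1f}")  # compare with the chi2(2) 5% value, 5.99
```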

Once again our starting point is the error structure $u_t = \rho u_{t-1} + e_t$. If we wanted to get rid of the autocorrelation in the error we might try transforming the model. Suppose that we know rho. In that case we could lag our model, multiply through by rho, then subtract the result from our starting model:

$$y_t - \rho y_{t-1} = \beta_0(1-\rho) + \beta_1(x_t - \rho x_{t-1}) + e_t$$


In doing this operation we lose the first observation, so our dataset runs t = 2, ..., T. The consequence is that we need a little correction in order to get the first observation back into the data set, the details of which are in your textbook. The more serious question is how we come up with a best guess for rho. Operationally this is just some button clicks in any of the usual statistical programs, but we can describe how it is done. Begin by estimating the proposed model and save the residuals. Use the least squares residuals to estimate rho. Use the estimated rho in the partial difference equation to re-estimate the intercept and slope coefficients.
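A Cochrane-Orcutt-style sketch of that procedure on simulated data (the true parameters, including $\rho = 0.7$, are hypothetical; the first-observation correction is omitted):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 400
rho_true = 0.7

# Hypothetical model y_t = 1 + 2 x_t + u_t with AR(1) errors
x = rng.normal(0, 1, T)
e = rng.normal(0, 1, T)
u = np.zeros(T)
for s in range(1, T):
    u[s] = rho_true * u[s - 1] + e[s]
y = 1 + 2 * x + u

# Step 1: estimate the proposed model by OLS and save the residuals
X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
uhat = y - X @ b

# Step 2: use the LS residuals to estimate rho
rho = (uhat[:-1] @ uhat[1:]) / (uhat[:-1] @ uhat[:-1])

# Step 3: quasi-difference (losing the first observation) and re-estimate
ys = y[1:] - rho * y[:-1]
xs = x[1:] - rho * x[:-1]
Xs = np.column_stack([np.full(T - 1, 1 - rho), xs])  # intercept column is (1 - rho)
bs, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
print(f"rho_hat = {rho:.2f}, beta0 = {bs[0]:.2f}, beta1 = {bs[1]:.2f}")
```

Iterating steps 2 and 3 until rho settles down gives the full Cochrane-Orcutt procedure; adding the rescaled first observation back gives Prais-Winsten.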

________________________________________

1. Strict exogeneity requires that the error be uncorrelated not just with the past values of z, but also the future values of z.