Most economics data is time series data. GDP, inflation, interest rates, exchange rates and stock prices all unfold over time, and the order in which observations arrive matters enormously. The tools you use for cross-sectional data (where you observe many individuals at a single point in time) often fail completely when applied to time series. This guide explains why, and what to do about it.
The explanations here are written in the same style as the economaths video series: start with the intuition, then build the formal machinery. If something here is unclear, the videos are the best place to go next.
The single biggest mistake students make with time series data is treating it like cross-sectional data and running OLS without checking whether their series are stationary.
1. What makes time series data different?
In a cross-sectional dataset, you might have wages and education levels for 1,000 workers sampled randomly from the population. The key assumption that makes OLS work is that these observations are independent: knowing one person's wage tells you nothing about another's.
Time series data violates this immediately. If UK GDP was high last quarter, it is very likely to be high this quarter too. Economic variables have momentum: they tend to persist. This dependence across time is not a nuisance to be dismissed; it is the defining feature of time series, and understanding it is the whole challenge.
Key concept: the data generating process
We model a time series y_t as a sequence of random variables indexed by time t = 1, 2, ..., T. The joint distribution of all these random variables (that is, how they depend on each other across time) is called the data generating process (DGP). The goal of time series econometrics is to estimate the DGP from observed data and use it to draw inferences and make forecasts.
2. Stationarity, the most important concept in time series
A time series is stationary if its statistical properties do not change over time. More precisely, a series is covariance stationary (also called weakly stationary) if three conditions hold:
- The mean is constant: E[y_t] = μ for all t
- The variance is constant: Var(y_t) = σ² for all t
- The covariance between y_t and y_{t-k} depends only on the lag k, not on t
Stationarity matters for a simple reason: if the properties of a series change over time, any statistical summary you compute from the data (a mean, a regression coefficient) is only meaningful for the period you observe, not in general. You can't learn anything stable about the DGP from a non-stationary series without first transforming it.
Think about UK house prices. Between 1990 and 2025, average house prices rose from roughly £60,000 to over £280,000. The mean is not constant: it trends upwards. The variance is not constant either: price swings are much larger in absolute terms in 2025 than in 1990. If you regressed anything on house prices without accounting for this, your results would be meaningless.
Stationary vs non-stationary: a quick test
Plot the series and look at it. A stationary series should:
- Appear to fluctuate around a fixed level
- Have roughly constant spread (variance) throughout
- Not show any long-run upward or downward drift
Non-stationary series typically show a trend, explosive growth, or permanent shifts in level. Many economic variables (prices, GDP, stock indices) are non-stationary in levels but become stationary when you take first differences.
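The visual check has a rough numerical counterpart: split the sample in two and compare the mean and variance of each half. The sketch below uses only the Python standard library and a simulated random walk with drift. The drift of 1, the seed and the helper names `half_sample_stats` and `first_difference` are illustrative assumptions, not part of any standard procedure.

```python
import random
import statistics

def half_sample_stats(series):
    """Return (mean, variance) for the first and for the second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return ((statistics.fmean(first), statistics.pvariance(first)),
            (statistics.fmean(second), statistics.pvariance(second)))

def first_difference(series):
    """Compute dy_t = y_t - y_{t-1}; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]

rng = random.Random(0)  # fixed seed so the run is reproducible

# Assumed example process: random walk with drift, y_t = 1 + y_{t-1} + e_t
levels = [0.0]
for _ in range(400):
    levels.append(levels[-1] + 1.0 + rng.gauss(0, 1))

print("levels:     ", half_sample_stats(levels))
print("differences:", half_sample_stats(first_difference(levels)))
```

In levels the two halves should have very different means; after differencing, both halves should have a mean near the drift and a similar variance. This is only an informal check and does not replace a formal unit root test.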
3. Autoregressive processes and the AR(1)
The simplest model for a time series that captures persistence is the autoregressive model of order one, the AR(1):
y_t = ρ · y_{t-1} + ε_t
where ε_t is white noise: uncorrelated across time, with mean zero and constant variance. The parameter ρ controls how strongly the past feeds into the present.
The key result: the AR(1) is stationary if and only if |ρ| < 1. When this holds, the effect of any shock dies away over time and the series keeps returning towards its mean (zero in the specification above, which has no intercept). A shock's effect after k periods is ρ^k times its initial size, so the autocovariances decline geometrically as the lag increases.
When ρ = 1, you have a random walk, and the series is non-stationary. Shocks are permanent: once the series is pushed up, it stays up. The variance grows without bound over time.
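To see the geometric decline, you can simulate a stationary AR(1) and compare its sample autocorrelations with the theoretical value ρ^k. This is a minimal standard-library sketch; ρ = 0.7, the sample size, the burn-in length and the function names are all illustrative assumptions.

```python
import random

def simulate_ar1(rho, n, rng, burn_in=200):
    """Simulate y_t = rho * y_{t-1} + e_t with standard normal shocks."""
    y, out = 0.0, []
    for t in range(n + burn_in):
        y = rho * y + rng.gauss(0, 1)
        if t >= burn_in:          # discard early values that depend on y_0 = 0
            out.append(y)
    return out

def sample_autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return num / denom

rng = random.Random(1)
y = simulate_ar1(rho=0.7, n=5000, rng=rng)
for k in range(1, 5):
    print(f"lag {k}: sample {sample_autocorr(y, k):.3f}, theory {0.7 ** k:.3f}")
```

With a long simulated sample the two columns should be close; with the short samples typical of macroeconomic data, expect noticeably more noise.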
4. Unit roots, when ρ = 1
A series with ρ = 1 in its AR representation is said to have a unit root. The term comes from the characteristic equation of the autoregressive model: the value ρ = 1 corresponds to a root of exactly one.
Unit root processes have some deeply counterintuitive properties:
- The variance grows over time, so the series wanders ever further from its starting point
- Shocks are permanent: there is no tendency to revert to any long-run level
- The autocorrelations decline very slowly, barely at all for modest lags
- Standard OLS inference is invalid: t-statistics do not follow a t-distribution
Many economic variables appear to be unit root processes in levels: GDP, prices, exchange rates, interest rates, stock prices. Taking a first difference (computing Δy_t = y_t − y_{t-1}) often produces a stationary series. If y_t needs to be differenced once to achieve stationarity, it is called integrated of order one, written I(1).
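The growing variance is easy to check by simulation. For a random walk that starts at zero with shock variance σ², Var(y_t) = tσ², so the variance across many simulated paths should be roughly 100, 200 and 400 at t = 100, 200 and 400 when σ² = 1. The sketch below uses only the standard library; the number of paths and the checkpoints are arbitrary choices.

```python
import random
import statistics

rng = random.Random(2)
n_paths, horizon = 2000, 400
checkpoints = (100, 200, 400)

# For each simulated random walk, record its value at the checkpoints
values = {t: [] for t in checkpoints}
for _ in range(n_paths):
    y = 0.0
    for t in range(1, horizon + 1):
        y += rng.gauss(0, 1)       # y_t = y_{t-1} + e_t with Var(e_t) = 1
        if t in values:
            values[t].append(y)

# Across paths, Var(y_t) should be close to t * sigma^2 = t
variance_at = {t: statistics.pvariance(v) for t, v in values.items()}
for t in checkpoints:
    print(f"t = {t}: variance across paths = {variance_at[t]:.1f}")
```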
Testing for unit roots: the Augmented Dickey-Fuller test
You cannot simply inspect the autocorrelation function to determine whether a series has a unit root; you need a formal test. The Augmented Dickey-Fuller (ADF) test runs a regression of the form:
Δy_t = α + δy_{t-1} + Σ_{j=1}^{p} β_j Δy_{t-j} + ε_t
and tests H₀: δ = 0 (unit root) against H₁: δ < 0 (stationarity). The critical values are non-standard: you cannot use the usual t-distribution tables. Under the null of a unit root, the t-statistic for δ follows a Dickey-Fuller distribution, which is shifted to the left relative to the standard normal, so the critical values are more negative (roughly −2.86 rather than −1.65 at the 5% level in large samples, for the specification above with a constant).
Practical note: the augmentation lags (the Δy_{t-j} terms) are included to absorb serial correlation in the residuals. The number of lags is usually chosen by an information criterion such as AIC or BIC.
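To make the mechanics concrete, here is a sketch of the simplest case: a Dickey-Fuller regression with a constant and no augmentation lags, written with the standard library only. It is not a full ADF implementation (there is no lag selection and there are no p-values), and the 5% critical value of −2.86 is an approximate large-sample figure for the constant-only case. In practice you would use your econometrics package's own unit root routine.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on delta in  dy_t = alpha + delta * y_{t-1} + e_t  (no lagged differences)."""
    x = series[:-1]                                    # y_{t-1}
    dy = [b - a for a, b in zip(series, series[1:])]   # dy_t
    n = len(dy)
    x_bar, dy_bar = sum(x) / n, sum(dy) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    delta = sum((v - x_bar) * (d - dy_bar) for v, d in zip(x, dy)) / sxx
    alpha = dy_bar - delta * x_bar
    ssr = sum((d - alpha - delta * v) ** 2 for v, d in zip(x, dy))
    return delta / math.sqrt(ssr / (n - 2) / sxx)

DF_5PCT = -2.86  # approximate large-sample 5% critical value (constant, no trend)

rng = random.Random(3)
walk, stationary = [0.0], [0.0]
for _ in range(500):
    walk.append(walk[-1] + rng.gauss(0, 1))                    # rho = 1
    stationary.append(0.5 * stationary[-1] + rng.gauss(0, 1))  # rho = 0.5

for name, series in (("random walk", walk), ("AR(1), rho = 0.5", stationary)):
    t_stat = dickey_fuller_t(series)
    verdict = "reject unit root" if t_stat < DF_5PCT else "cannot reject unit root"
    print(f"{name}: t = {t_stat:.2f} -> {verdict}")
```

Note the direction of the test: a sufficiently negative statistic is evidence against a unit root. By construction, a true random walk will still be rejected about 5% of the time at this level.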
5. Spurious regression, why non-stationarity is dangerous
Here is one of the most important, and most counterintuitive, results in econometrics. If you regress one non-stationary I(1) series on another that is completely unrelated, you will typically find a statistically significant relationship. The R² will often be high, the t-statistic will be large, and everything will look convincing. But the relationship is entirely spurious: it is an artifact of the trending behaviour of the two series, not evidence of any genuine connection.
This was documented formally by Granger and Newbold (1974), who ran simulations showing regressions of one random walk on another giving apparently significant coefficients far more often than they should. They called these spurious regressions.
Two series can appear to move together simply because they both trend over time. This is not a relationship; it is a statistical illusion.
A common warning sign of spurious regression is a very low Durbin-Watson statistic. A value close to zero on the residuals of a regression indicates very high positive autocorrelation, which suggests the regression may be spurious (Granger and Newbold's rule of thumb was to be suspicious whenever R² exceeds the Durbin-Watson statistic). More formally, you should test your variables for unit roots before running any regression. If they are I(1), running OLS on levels is invalid unless they are cointegrated (more on that below).
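A small Monte Carlo experiment in the spirit of Granger and Newbold shows the problem. The sketch below is not a replication of their design. It regresses one simulated random walk on another independent one many times and counts how often the conventional t-test rejects at the nominal 5% level, first in levels and then in first differences. It uses only the standard library; the sample size, number of replications and helper names are assumptions.

```python
import math
import random

def slope_t_stat(x, y):
    """Conventional OLS t-statistic on the slope of y = a + b*x + u."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    b = sum((v - x_bar) * (w - y_bar) for v, w in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    ssr = sum((w - a - b * v) ** 2 for v, w in zip(x, y))
    return b / math.sqrt(ssr / (n - 2) / sxx)

def random_walk(n, rng):
    """Simulate y_t = y_{t-1} + e_t with standard normal shocks."""
    y, out = 0.0, []
    for _ in range(n):
        y += rng.gauss(0, 1)
        out.append(y)
    return out

def diff(series):
    """First difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]

rng = random.Random(4)
reps, n = 500, 100
reject_levels = reject_diffs = 0
for _ in range(reps):
    x, y = random_walk(n, rng), random_walk(n, rng)   # independent by construction
    reject_levels += abs(slope_t_stat(x, y)) > 1.96
    reject_diffs += abs(slope_t_stat(diff(x), diff(y))) > 1.96

print(f"levels:      {reject_levels / reps:.0%} of regressions look significant")
print(f"differences: {reject_diffs / reps:.0%} of regressions look significant")
```

If inference in levels were valid, about 5% of these regressions would look significant. In levels the share should be many times higher, while in first differences it should fall back to roughly the nominal rate.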
6. ARIMA models, modelling non-stationary series
Once you have established that a series needs differencing to achieve stationarity, you can model the differenced series using autoregressive moving average (ARMA) methods. An ARMA(p, q) model for a stationary series has the form:
y_t = φ₁y_{t-1} + ... + φ_p y_{t-p} + ε_t + θ₁ε_{t-1} + ... + θ_q ε_{t-q}
The AR part (lags of y) captures the persistence of the series. The MA part (lags of ε) captures how shocks propagate across time.
When the series needs to be differenced d times before applying an ARMA model, we call the full model ARIMA(p, d, q), short for Autoregressive Integrated Moving Average. In practice, most economic series require at most one difference (d = 1), and the orders p and q are usually small (0, 1, or 2).
Selecting ARMA orders: the Box-Jenkins approach
Box and Jenkins proposed a systematic procedure for identifying the correct ARMA specification:
- Identification: inspect the autocorrelation function (ACF) and partial autocorrelation function (PACF). A pure AR(p) has a PACF that cuts off after lag p. A pure MA(q) has an ACF that cuts off after lag q. Mixed ARMA processes show gradual decay in both.
- Estimation: estimate the chosen model by maximum likelihood or conditional least squares.
- Diagnostic checking: test whether the residuals are white noise (using the Ljung-Box test). If not, the model is misspecified and you need to revisit the identification step.
In practice, model selection criteria (AIC, BIC) are now commonly used alongside the ACF/PACF approach, particularly for automated selection.
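The identification step can be illustrated with simulated data. The sketch below computes the sample ACF directly and the sample PACF through the Durbin-Levinson recursion, then applies both to a simulated AR(1) and a simulated MA(1). It uses the standard library only; the coefficient of 0.6, the sample size and the function names are illustrative assumptions.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson recursion)."""
    r = acf(series, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the order-k AR coefficients from the order-(k-1) ones
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

rng = random.Random(8)
n = 4000
shocks = [rng.gauss(0, 1) for _ in range(n + 1)]
ma1 = [shocks[t] + 0.6 * shocks[t - 1] for t in range(1, n + 1)]  # MA(1)
ar1, level = [], 0.0
for e in shocks[1:]:                                              # AR(1)
    level = 0.6 * level + e
    ar1.append(level)

print("AR(1) PACF, lags 1-3:", [round(v, 2) for v in pacf(ar1, 3)])
print("MA(1) ACF,  lags 1-3:", [round(v, 2) for v in acf(ma1, 3)[1:]])
```

The AR(1) should show one sizeable partial autocorrelation followed by values near zero, and the MA(1) one sizeable autocorrelation followed by values near zero. These are the cut-off patterns described in the identification step.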
7. Autocorrelation, when OLS residuals are correlated
Even when working with stationary series, the error terms in a regression may be correlated across time. This is called autocorrelation (or serial correlation), and it violates one of the Gauss-Markov conditions that guarantee OLS is efficient.
The consequences of ignoring autocorrelation depend on what you are trying to do:
- OLS coefficient estimates remain unbiased provided the regressors are strictly exogenous, but they are no longer the minimum variance linear unbiased estimator. If the regressors include a lagged dependent variable, autocorrelated errors make OLS biased and inconsistent
- The OLS standard errors are wrong: with positive autocorrelation they will typically understate the true uncertainty, making t-statistics look more significant than they are
- This means hypothesis tests and confidence intervals are invalid
The classic tests for autocorrelation in OLS residuals are the Durbin-Watson test (for first-order autocorrelation) and the Breusch-Godfrey test (for higher orders). The practical fix is to use Newey-West standard errors: these are heteroskedasticity and autocorrelation consistent (HAC) standard errors that remain valid in the presence of both heteroskedasticity and serial correlation up to some specified lag.
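The Durbin-Watson statistic itself is simple to compute: it is the sum of squared changes in the residuals divided by the sum of squared residuals, and it is approximately 2(1 − r), where r is the first-order autocorrelation of the residuals. The sketch below applies it to two simulated series that stand in for OLS residuals, an assumption made to keep the example short.

```python
import random

def durbin_watson(residuals):
    """DW = sum (e_t - e_{t-1})^2 / sum e_t^2; near 2 means no first-order autocorrelation."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    return num / sum(e * e for e in residuals)

rng = random.Random(5)

# Stand-ins for regression residuals (simulated directly for illustration)
white = [rng.gauss(0, 1) for _ in range(2000)]
persistent = [0.0]
for _ in range(2000):
    persistent.append(0.8 * persistent[-1] + rng.gauss(0, 1))

print(f"white noise residuals:  DW = {durbin_watson(white):.2f}")
print(f"AR(1) residuals (0.8):  DW = {durbin_watson(persistent):.2f}")
```

Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation. The formal test compares the statistic with tabulated bounds, which this sketch does not do.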
8. Cointegration, when non-stationary series move together
Here is something remarkable. Two I(1) series, both non-stationary, can share a common stochastic trend, so that some linear combination of them is stationary. When this happens, the series are said to be cointegrated.
The economic intuition is that cointegrated series are tied together in the long run, even though they can diverge in the short run. Think about short-term and long-term interest rates: both trend over time (they are I(1)), but they tend to move together because of arbitrage. If they diverged too far, investors would shift between them. They are cointegrated.
Other classic examples of cointegration:
- Prices of a commodity in different markets (linked by trade)
- Wages and prices (linked by real wage determination)
- Household income and consumption (linked by the permanent income hypothesis)
When series are cointegrated, OLS on the levels is actually consistent, and the spurious regression problem does not apply. There is a genuine long-run relationship to estimate. But you still need to model the short-run dynamics separately, typically using an error correction model (ECM).
Testing for cointegration: Engle-Granger
The simplest cointegration test is the Engle-Granger two-step procedure:
- Step 1: Regress y_t on x_t using OLS. Save the residuals û_t.
- Step 2: Test whether û_t is stationary using an ADF test.
If the residuals are stationary (you reject the null of a unit root), the series are cointegrated. The critical values for the ADF test in Step 2 are non-standard: you must use the special Engle-Granger tables, not the usual Dickey-Fuller tables, because the residuals are estimated rather than observed.
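The two steps can be sketched with simulated data in which cointegration holds by construction: x_t is a random walk and y_t = 1 + 2x_t + u_t with stationary u_t. The code uses only the standard library, applies a Dickey-Fuller regression without augmentation lags to the residuals, and compares the statistic with −3.34, an approximate large-sample 5% Engle-Granger critical value for two variables with a constant in Step 1. The data-generating values and function names are assumptions for illustration.

```python
import math
import random

def ols(x, y):
    """Intercept, slope and residuals from y = a + b*x + u."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((v - x_bar) * (w - y_bar) for v, w in zip(x, y))
         / sum((v - x_bar) ** 2 for v in x))
    a = y_bar - b * x_bar
    return a, b, [w - a - b * v for v, w in zip(x, y)]

def df_t_no_constant(u):
    """t-statistic on delta in  du_t = delta * u_{t-1} + e_t  (no augmentation lags)."""
    lag = u[:-1]
    du = [b - a for a, b in zip(u, u[1:])]
    s_ll = sum(v * v for v in lag)
    delta = sum(v * d for v, d in zip(lag, du)) / s_ll
    ssr = sum((d - delta * v) ** 2 for v, d in zip(lag, du))
    return delta / math.sqrt(ssr / (len(du) - 1) / s_ll)

EG_5PCT = -3.34  # approximate 5% critical value: two variables, constant in Step 1

rng = random.Random(7)
x, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0, 1)          # x_t is a random walk, so I(1)
    x.append(level)
y = [1.0 + 2.0 * v + rng.gauss(0, 1) for v in x]   # cointegrated with x

a_hat, b_hat, resid = ols(x, y)       # Step 1: levels regression
eg_stat = df_t_no_constant(resid)     # Step 2: unit root test on residuals
print(f"estimated long-run slope: {b_hat:.3f}")
print(f"Engle-Granger statistic:  {eg_stat:.2f} (5% critical value about {EG_5PCT})")
```

Because the residuals here are white noise, the statistic should be far below the critical value. With real data you would normally add augmentation lags and take critical values from your software or published tables.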
For more than two series, the Johansen test is typically used, which can also determine how many cointegrating relationships exist.
9. Regression assumptions for time series data
The Gauss-Markov assumptions need revisiting for time series. The key modifications are:
- No strict exogeneity: in time series, it is common to include lagged dependent variables as regressors. This means the regressors cannot be strictly exogenous (uncorrelated with all past, present and future errors), but OLS may still be consistent under the weaker condition of contemporaneous exogeneity.
- Stationarity and weak dependence: instead of assuming i.i.d. errors, we need the series to be stationary and the autocorrelations to die away fast enough (weak dependence) for the law of large numbers and central limit theorem to apply.
- No unit roots: as discussed, non-stationary regressors invalidate standard OLS inference. Always check for unit roots first.
FAQ: What is a random walk?
A random walk is a process where each value is the previous value plus a random shock: y_t = y_{t-1} + ε_t. It is the simplest unit root process. It has no tendency to revert to any level: shocks are permanent. Many financial asset prices (particularly in efficient markets) are modelled as random walks, an idea closely associated with the "efficient market hypothesis".
FAQ: What is the difference between AR and ARMA?
An autoregressive (AR) model explains the current value using only past values of the same variable. A moving average (MA) model uses past values of the error term. An ARMA model combines both. In practice, many time series can be approximated well by a low-order ARMA process, and the Box-Jenkins methodology provides a systematic way to identify which specification fits best.
FAQ: How do I know if I need to difference my data?
Start by plotting the series. If it trends clearly, it is almost certainly non-stationary. Then run formal unit root tests (ADF or Phillips-Perron). Both take a unit root as the null hypothesis, so if you fail to reject the null, difference the series and test again. Most economic series in levels need one difference; very few need two.
FAQ: What are Newey-West standard errors?
Newey-West standard errors (also called HAC standard errors, for heteroskedasticity and autocorrelation consistent) are robust standard errors that remain valid when OLS residuals are autocorrelated or heteroskedastic. They use a weighted sum of sample autocovariances, with weights that decline as the lag increases, to estimate the true variance of the OLS estimator. Most econometrics software computes them with a single option.
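For a regression with one regressor and an intercept, the calculation can be written out by hand. The sketch below uses the standard library only and computes the conventional and Newey-West standard errors of the slope, using Bartlett weights 1 − l/(L + 1) on the autocovariances of (x_t − x̄)û_t. It omits the small-sample corrections that some packages apply, and the simulated data, the lag length of 12 and the function name are assumptions for illustration.

```python
import math
import random

def ols_and_hac_se(x, y, max_lag):
    """Slope of y = a + b*x + u with its conventional and Newey-West standard errors."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    xd = [v - x_bar for v in x]
    sxx = sum(d * d for d in xd)
    b = sum(d * (w - y_bar) for d, w in zip(xd, y)) / sxx
    a = y_bar - b * x_bar
    resid = [w - a - b * v for v, w in zip(x, y)]

    # Conventional standard error (assumes uncorrelated, homoskedastic errors)
    se_ols = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Newey-West: Bartlett-weighted autocovariances of g_t = (x_t - x_bar) * u_hat_t
    g = [d * e for d, e in zip(xd, resid)]
    s = sum(v * v for v in g)
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1)
        s += 2.0 * weight * sum(g[t] * g[t - lag] for t in range(lag, n))
    se_hac = math.sqrt(s) / sxx
    return b, se_ols, se_hac

# Simulated example: regressor and error are both persistent AR(1) processes
rng = random.Random(6)
x, u, x_t, u_t = [], [], 0.0, 0.0
for _ in range(2000):
    x_t = 0.8 * x_t + rng.gauss(0, 1)
    u_t = 0.8 * u_t + rng.gauss(0, 1)
    x.append(x_t)
    u.append(u_t)
y = [1.0 + 0.5 * xv + uv for xv, uv in zip(x, u)]

b, se_ols, se_hac = ols_and_hac_se(x, y, max_lag=12)
print(f"slope = {b:.3f}, conventional SE = {se_ols:.4f}, Newey-West SE = {se_hac:.4f}")
```

When both the regressor and the error are positively autocorrelated, as here, the Newey-West standard error should come out clearly larger than the conventional one. That is the sense in which conventional standard errors understate the uncertainty.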
FAQ: What is an error correction model?
An error correction model (ECM) is a way of modelling series that are cointegrated. It decomposes the relationship into a long-run component (the cointegrating relationship) and short-run dynamics. The "error correction" term is the lagged deviation from the long-run equilibrium: when the series drift apart, its coefficient shows how quickly they return. The Granger representation theorem states that any cointegrated system has an ECM representation.
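A two-step ECM can be sketched on simulated data. Here x_t is a random walk and y_t = 2x_t + u_t, where u_t is an AR(1) with coefficient 0.5, which implies Δy_t = −0.5·u_{t-1} + 2·Δx_t + ε_t. The code estimates the long-run regression, then regresses Δy_t on the lagged residual and Δx_t. It uses only the standard library; leaving the intercept out of the second step, the parameter values and the function names are all assumptions made to keep the example short.

```python
import random

def long_run_residuals(x, y):
    """Step 1: residuals from the levels regression y = a + b*x + u."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((v - x_bar) * (w - y_bar) for v, w in zip(x, y))
         / sum((v - x_bar) ** 2 for v in x))
    a = y_bar - b * x_bar
    return [w - a - b * v for v, w in zip(x, y)]

def fit_ecm(x, y):
    """Step 2: OLS of dy_t on u_hat_{t-1} and dx_t (no intercept, by assumption)."""
    ec = long_run_residuals(x, y)[:-1]               # u_hat_{t-1}
    dy = [b - a for a, b in zip(y, y[1:])]
    dx = [b - a for a, b in zip(x, x[1:])]
    # Solve the 2x2 normal equations in closed form
    s11 = sum(v * v for v in ec)
    s22 = sum(v * v for v in dx)
    s12 = sum(v * w for v, w in zip(ec, dx))
    s1y = sum(v * w for v, w in zip(ec, dy))
    s2y = sum(v * w for v, w in zip(dx, dy))
    det = s11 * s22 - s12 * s12
    alpha = (s1y * s22 - s2y * s12) / det            # speed of adjustment
    gamma = (s2y * s11 - s1y * s12) / det            # short-run effect of dx
    return alpha, gamma

rng = random.Random(9)
x, y, level, u = [], [], 0.0, 0.0
for _ in range(2000):
    level += rng.gauss(0, 1)             # x_t: random walk
    u = 0.5 * u + rng.gauss(0, 1)        # deviation from equilibrium: AR(1)
    x.append(level)
    y.append(2.0 * level + u)

alpha, gamma = fit_ecm(x, y)
print(f"adjustment coefficient = {alpha:.3f}, short-run effect = {gamma:.3f}")
```

A negative adjustment coefficient means deviations from the long-run relationship are corrected over time. A value near −0.5 would mean roughly half of any gap closes each period.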
Watch the videos
The economaths time series playlist covers these topics through worked video lectures. Each video is concise and focused on building understanding, not just mechanical computation.
Topics covered: stationarity, unit roots, the ADF test, spurious regression, ARIMA models, autocorrelation, Newey-West standard errors, cointegration.