Time Series Analysis

Topics

Stylized Facts
- Stylized Facts
- Log-Returns
Stationarity
- Stationarity
- Asymptotics of Stationary Sequences
- Standard Facts on Conditional Expectation
- MDS
- Wold Decomposition
AR Processes
- ACF
- Bartlett’s Formula
- Ljung-Box Test
- AR(1)
- Causal Processes
- AR(2)
- Weak Stationarity of AR(p)
- Partial Correlation Coefficients
- PACF
MA Processes
- MA(q)
- Invertibility of MA Processes
- Formal Notations
ARMA Models
- ARMA Models
- ARMA(1, 1)
- ARMA(p, q) Analysis
- ARIMA, Differencing to Obtain Stationarity
- Dickey-Fuller Test
- Parameter Estimation
- Yule Walker Equations
- Likelihood Methods
- statsmodels
- Forecasting
Non-Stationary to Stationary
- Box-Cox Transformation
- Trend and Seasonal Components
- Differencing
ARCH/GARCH Modeling
- Motivation
- ARCH(1)
- AR(1)/ARCH(1)
- ARCH(p)
- ARCH Properties
- ARCH and Stylized Facts
- Weaknesses of ARCH Model
- From ARCH to GARCH
- GARCH(1, 1)
- Fitting ARCH to S&P 500 Data
- GARCH(p, q)
- GARCH Forecasting
- Engle Test for ARCH Effects
- GARCH Forecasting Example in Risk Management
- Other Volatility Models
Multivariate Time Series
- Multivariate Time Series
- Vector Autoregressive Processes
- Stationarity of VAR(1) Processes
Cointegration
- Cointegration
- Johansen Test
- Cryptocurrency Example
State Space Modeling
- State Space Models
- Kalman Recursions: Kalman Prediction & Filtering
- Example (Linear Regression)
- AR, MA in State Space Form
- Bayesian Background to Kalman Methods
- Stochastic Volatility

Notes

Stylized Facts
- Stylized Facts
- Log-Returns
Stationarity
- Stationarity
  - Weakly stationary
    - mean and var are constant
    - Cov$(X_s, X_t)$ only depends on the lag $|s-t|$
    - weak stationarity $+$ jointly normal distributions $\implies$ strict stationarity
  - WN$(0, \sigma^2)$
    - weakly stationary process with mean 0
    - ACovF is $\{\sigma^2, 0, 0, \ldots\}$
- Asymptotics of Stationary Sequences
- Standard Facts on Conditional Expectation
- MDS
  - Martingale: $E(X_{t+1}|\mathcal F_t) = X_t\quad\forall t\ge 0$
  - MDS: $E(X_{t+1}|\mathcal F_t) = 0\quad\forall t\ge 0$, hence $E(X_{t+1}) = 0$.
  - 3 types of noise processes: iid, MDS, and white noise
  - {iid, zero mean} $\subset$ {MDS}
  - {Common Finite Variance MDS} $\subset$ {White Noise Processes}
  - MDS with common finite variance has CLT
- Wold Decomposition
  - If $\cap_{j=1}^\infty \mathcal F_{t-j} = \{\emptyset, \Omega\}$ (the trivial $\sigma$-field), every weakly stationary $X_t$ is MA($\infty$):
    \begin{align*}
    &X_t = \mu + \sum^\infty_{j=0} \psi_j\epsilon_{t-j},\\
    &\psi_0 = 1,\quad\sum_{j=0}^\infty \psi_j^2 < \infty.
    \end{align*}
AR Processes
- ACF
- Bartlett’s Formula
- Ljung-Box Test
  \begin{align*}
  &H_0: \rho(1)=\rho(2)=\cdots=\rho(m)=0\\
  &H_1: \text{at least one of $\rho(i)$ is nonzero, $1\le i\le m$}
  \end{align*}
- AR(1)
  \begin{align*}
  X_t = \phi_0 + \phi_1 X_{t-1} + \epsilon_t
  \end{align*}
  - stationary iff $|\phi_1| < 1$
  - $E(X_t) = \phi_0/(1-\phi_1), \quad |\phi_1| < 1$
  - $Var(X_t) = \sigma_{\epsilon}^2/(1-\phi_1^2), \quad |\phi_1| < 1$
  - $\gamma(h) = \phi_1^{|h|}\frac{\sigma_{\epsilon}^2}{1-\phi_1^2}, \quad |\phi_1| < 1$
  - $\rho(h) = \phi_1^{|h|}, \quad |\phi_1| < 1$
  - if either $E(X_0)$ or $Var(X_0)$ differs from its stationary value but $|\phi_1| < 1$, then the process is only asymptotically stationary
  - remove the mean: define $\mu = \phi_0/(1-\phi_1), Y_t = X_t - \mu$
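The AR(1) moment formulas and the remark on asymptotic stationarity above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names are illustrative, and it assumes $\epsilon_t$ is uncorrelated with $X_{t-1}$, so that $E(X_t) = \phi_0 + \phi_1 E(X_{t-1})$ and $Var(X_t) = \phi_1^2 Var(X_{t-1}) + \sigma_\epsilon^2$.

```python
def ar1_stationary_moments(phi0, phi1, sigma2):
    """Stationary mean and variance of X_t = phi0 + phi1 * X_{t-1} + eps_t."""
    if abs(phi1) >= 1:
        raise ValueError("stationarity requires |phi1| < 1")
    return phi0 / (1 - phi1), sigma2 / (1 - phi1 ** 2)


def ar1_moment_path(phi0, phi1, sigma2, mean0, var0, steps):
    """Iterate E(X_t) and Var(X_t) forward from arbitrary starting moments."""
    mean, var = mean0, var0
    path = [(mean, var)]
    for _ in range(steps):
        mean = phi0 + phi1 * mean       # E(X_t) = phi0 + phi1 * E(X_{t-1})
        var = phi1 ** 2 * var + sigma2  # eps_t assumed uncorrelated with X_{t-1}
        path.append((mean, var))
    return path


def ar1_acf(phi1, max_lag):
    """Stationary autocorrelations rho(h) = phi1^h for h = 0..max_lag."""
    return [phi1 ** h for h in range(max_lag + 1)]


# Start far from the stationary moments and watch them converge.
mu, gamma0 = ar1_stationary_moments(1.0, 0.5, 1.0)
path = ar1_moment_path(1.0, 0.5, 1.0, mean0=10.0, var0=0.0, steps=60)
print("stationary mean/variance:", mu, gamma0)
print("moments after 60 steps:  ", path[-1])
print("ACF at lags 0..3:        ", ar1_acf(0.5, 3))
```

The starting moments are deliberately non-stationary; the gap to the stationary values shrinks geometrically at rates $\phi_1$ (mean) and $\phi_1^2$ (variance).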
- Causal Processes
- AR(2)
  \begin{align*}
  X_t &= \phi_0 + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \epsilon_t, \\
  \mu &= \frac{\phi_0}{1 - \phi_1 - \phi_2}\\
  Y_t &= X_t - \mu
  \end{align*}
  - assume moment structure is constant
  - ACF: Multiply $Y_{t+h} = \phi_1Y_{t+h-1} + \phi_2Y_{t+h-2} + \epsilon_{t+h}$ by $Y_t$, take expectation, and divide by $\gamma(0)$:
    \begin{align*}
    \rho(h) = \begin{cases} 1 &\mbox{ if } h=0\\ \phi_1/(1-\phi_2) &\mbox{ if } h=1\\ \phi_1\rho(h-1) + \phi_2\rho(h-2) &\mbox{ if } h\ge 2 \end{cases}
    \end{align*}
  - AR polynomial: Plug $z=1/\lambda$ into the characteristic polynomial $\lambda^2 - \phi_1\lambda - \phi_2$ of the recurrence relation
    \begin{align*}
    \phi(z) = 1 - \phi_1 z - \phi_2 z^2
    \end{align*}
  - $X_t$ is stationary iff all roots of $\phi(z) = 0$ (the characteristic roots) have modulus strictly greater than 1
  - recurrence relation is $\phi(B)\rho(h) = 0$
  - matrix form:
    \begin{align*}
    \mathbf X_t &= (X_t, X_{t-1})^T, \quad\mathbf \mu = (\phi_0, 0)^T, \quad\mathbf\epsilon_t = (\epsilon_t, 0)^T, \\
    \mathbf X_t &= \mathbf \mu + \mathbf M \mathbf X_{t-1} + \mathbf\epsilon_t, \\
    \mathbf M &= \begin{pmatrix} \phi_1 & \phi_2\\ 1 & 0 \end{pmatrix}
    \end{align*}
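The AR(2) root condition and the ACF recursion above lend themselves to a small numerical check. This is a sketch under stated assumptions: the helper names are illustrative, $\phi_2 \ne 0$ is assumed so that the AR polynomial is a genuine quadratic, and the roots come from the quadratic formula rather than a general polynomial solver.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of phi(z) = 1 - phi1*z - phi2*z^2, assuming phi2 != 0."""
    a, b, c = -phi2, -phi1, 1.0           # a*z^2 + b*z + c = 0
    disc = cmath.sqrt(b * b - 4 * a * c)  # complex sqrt handles complex roots
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)


def ar2_is_stationary(phi1, phi2):
    """Stationary iff every root of phi(z) = 0 has modulus > 1."""
    return all(abs(z) > 1 for z in ar2_roots(phi1, phi2))


def ar2_acf(phi1, phi2, max_lag):
    """ACF from rho(0) = 1, rho(1) = phi1/(1 - phi2) and the recursion."""
    rho = [1.0, phi1 / (1 - phi2)]
    for h in range(2, max_lag + 1):
        rho.append(phi1 * rho[h - 1] + phi2 * rho[h - 2])
    return rho[: max_lag + 1]


roots = ar2_roots(0.5, 0.3)
rho = ar2_acf(0.5, 0.3, 5)
print("roots:", roots)
print("stationary (0.5, 0.3):", ar2_is_stationary(0.5, 0.3))
print("stationary (0.5, 0.6):", ar2_is_stationary(0.5, 0.6))
print("ACF:", rho)
```

The second parameter pair has $\phi_1 + \phi_2 > 1$, which puts one root inside the unit circle.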
- Weak Stationarity of AR$(p)$
  - assuming constant mean,
    \begin{align*}
    X_t &= \phi_0 + \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + \epsilon_t, \\
    \mu &= \frac{\phi_0}{1 - \sum_{i=1}^p\phi_i}, \quad\sum_{i=1}^p\phi_i < 1\\
    Y_t &= X_t - \mu
    \end{align*}
  - AR polynomial
    \begin{align*}
    \phi(z) = 1 - \sum_{i=1}^p \phi_i z^i
    \end{align*}
  - matrix form
    \begin{align*}
    \mathbf M = \begin{pmatrix} \phi_1 & \phi_2 & \cdots & \phi_{p-1} & \phi_p\\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}
    \end{align*}
- Partial Correlation Coefficients $\rho(X, Y|\vec Z)$
  1. regress $X$ on $\vec Z$
  2. regress $Y$ on $\vec Z$
  3. compute correlation coefficient of the residuals
- PACF for AR$(p)$
  - estimate $\hat \phi_{k, k}$ in
    \begin{align*}
    X_t = \phi_{0, k} + \phi_{1, k}X_{t-1} + \cdots + \phi_{k, k}X_{t-k} + \epsilon_{k, t}
    \end{align*}
  - $p+1$ is the smallest $k$ such that the test concludes $\phi_{k, k} = 0$
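The three-step recipe for the partial correlation coefficient $\rho(X, Y|\vec Z)$ can be sketched as follows. To keep the example dependency-free it assumes a single conditioning variable $Z$ (simple regression with intercept); the data and function names are made up for illustration. For a single $Z$, the result should agree with the closed form $(\rho_{XY} - \rho_{XZ}\rho_{YZ})/\sqrt{(1-\rho_{XZ}^2)(1-\rho_{YZ}^2)}$, which the last lines compute for comparison.

```python
import math


def mean(v):
    return sum(v) / len(v)


def residuals(y, z):
    """Residuals from an OLS regression of y on an intercept and z."""
    my, mz = mean(y), mean(z)
    szz = sum((zi - mz) ** 2 for zi in z)
    beta = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / szz
    alpha = my - beta * mz
    return [yi - alpha - beta * zi for yi, zi in zip(y, z)]


def corr(u, v):
    """Sample correlation coefficient."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(x, y, z):
    """Steps 1-3: regress x on z, regress y on z, correlate the residuals."""
    return corr(residuals(x, z), residuals(y, z))


# Illustrative data only.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.2]
y = [2.1, 2.0, 3.9, 4.2, 4.8, 6.9, 7.1, 7.7]

pc = partial_corr(x, y, z)
rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
closed_form = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
print(pc, closed_form)
```

For the PACF at lag $k$, the same idea applies with $X = X_t$, $Y = X_{t-k}$ and $\vec Z = (X_{t-1}, \ldots, X_{t-k+1})$, which needs a multiple regression rather than the single-regressor helper used here.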
MA Processes
- MA$(q)$
  \begin{align*}
  X_t = \mu + \sum_{i=1}^q \theta_i\epsilon_{t-i} + \epsilon_t
  \end{align*}
  - weakly stationary for all $\{\theta_i\}$
  - $E(X_t) = \mu$
  - $Var(X_t) = \sigma_\epsilon^2(1 + \sum_{i=1}^q \theta_i^2), \quad\forall t$
  - with $\theta_0 = 1$,
    \begin{align*}
    \gamma(h) &= \begin{cases} \sigma_\epsilon^2 \sum_{i=0}^{q-|h|} \theta_i\theta_{i+|h|} &\mbox{ if } |h|\le q\\ 0 &\mbox{ if } |h| > q \end{cases},\\
    \rho(h) &= \gamma(h)/\gamma(0)
    \end{align*}
- Invertibility of MA Processes
  - the two MA(1) processes have the same ACF:
    \begin{align*}
    Y^{(1)}_t &= \epsilon_t - \theta_1 \epsilon_{t-1}\\
    Y^{(2)}_t &= \epsilon_t - \frac{1}{\theta_1} \epsilon_{t-1}
    \end{align*}
  - write residuals as an AR process
    \begin{align*}
    \epsilon_t &= Y_t^{(1)} + \sum_{i=1}^\infty \theta_1^i Y_{t-i}^{(1)}\\
    \epsilon_t &= Y_t^{(2)} + \sum_{i=1}^\infty \frac{1}{\theta_1^i} Y_{t-i}^{(2)}
    \end{align*}
  - MA$(q)$ is invertible if the residuals can be represented by an AR process with convergent coefficients
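  - MA polynomial (under the sign convention $Y_t = \epsilon_t - \theta_1\epsilon_{t-1} - \cdots - \theta_q\epsilon_{t-q}$ used in this subsection)
    \begin{align*}
    \theta(z) = 1 - \theta_1 z - \theta_2 z^2 - \cdots - \theta_q z^q
    \end{align*}
  - An MA process is invertible iff all roots of $\theta(z) = 0$ have modulus greater than 1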
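The MA(1) invertibility discussion above can be illustrated with a short sketch. It uses the sign convention $Y_t = \epsilon_t - \theta_1\epsilon_{t-1}$ from this subsection; the function names, the seed, and the truncation length are arbitrary choices made for illustration, and $\epsilon_{-1}$ is taken to be 0 when generating the series.

```python
import random


def ma1_rho1(theta):
    """Lag-1 autocorrelation of Y_t = eps_t - theta * eps_{t-1}."""
    return -theta / (1 + theta ** 2)


def ma1_is_invertible(theta):
    """The root of 1 - theta*z is 1/theta, so invertible iff |theta| < 1."""
    return abs(theta) < 1


def recover_innovation(y, theta, t, terms):
    """Truncated AR(infinity) inversion: eps_t ~ sum_{i=0}^{terms} theta^i * y_{t-i}."""
    return sum(theta ** i * y[t - i] for i in range(terms + 1))


rng = random.Random(0)
theta = 0.5
eps = [rng.gauss(0.0, 1.0) for _ in range(200)]
# y[0] treats the unobserved eps_{-1} as 0.
y = [eps[0]] + [eps[t] - theta * eps[t - 1] for t in range(1, 200)]

recovered = recover_innovation(y, theta, t=199, terms=60)
print("rho(1) for theta and 1/theta:", ma1_rho1(theta), ma1_rho1(1 / theta))
print("true vs recovered innovation:", eps[199], recovered)
```

The two parameter values give the same lag-1 autocorrelation, but only the one with $|\theta_1| < 1$ yields a convergent inversion: the truncation error here is $\theta_1^{61}\epsilon_{138}$, which is negligible, whereas the same sum with $1/\theta_1$ would blow up.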
- Formal Notations
  - ARMA$(p, q): Y_t - \phi_1 Y_{t-1} - \cdots - \phi_pY_{t-p} = \epsilon_t + \theta_1\epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$
    \begin{align*}
    \phi(B)Y_t = \theta(B)\epsilon_t
    \end{align*}
  - ARIMA$(p, d, q)$
    \begin{align*}
    \phi(B)(1-B)^dY_t = \theta(B)\epsilon_t
    \end{align*}
ARMA Models
- ARMA Models
  \begin{align*}
  &X_t - \phi_0 - \phi_1 X_{t-1} - \cdots - \phi_pX_{t-p} = \epsilon_t + \theta_1\epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}\\
  &E(X_t) = \mu = \frac{\phi_0}{1-(\phi_1+\cdots+\phi_p)}
  \end{align*}
- ARMA(1, 1)
  \begin{align*}
  X_t = \phi_0 + \phi_1X_{t-1} + \theta_1\epsilon_{t-1} + \epsilon_t
  \end{align*}
  - assuming stationary,
    \begin{align*}
    E(X_t) &= \phi_0/(1-\phi_1)\\
    Var(X_t) &= \gamma(0) = \frac{(1 + \theta_1^2 + 2\phi_1\theta_1)\sigma_\epsilon^2}{1-\phi_1^2}\\
    \rho(1) &= \frac{\gamma(1)}{\gamma(0)} = \frac{(1 + \phi_1\theta_1)(\phi_1 + \theta_1)}{1 + \theta_1^2 + 2\phi_1\theta_1}\\
    \rho(h) &= \phi_1^{h-1}\rho(1), \quad h\ge 2
    \end{align*}
- ARMA$(p, q)$ Analysis
- ARIMA, Differencing to Obtain Stationarity
  - $X_t$ is $\mathcal I(k)$ if $\nabla^{k-1}X_t$ is non-stationary but $\nabla^{k}X_t$ is stationary, where $\nabla = (1-B)$
  - $\mathbf X_t$ is $\mathcal I(k)$ if at least one of its coordinates is $\mathcal I(k)$ and all the others are $\mathcal I(j)$ for some $j\le k$
- Dickey-Fuller Test
  \begin{align*}
  H_0 &: \text{a unit root is present}\\
  H_1 &: \text{no unit root}
  \end{align*}
- Parameter Estimation: OLS for AR$(p)$
  \begin{align*}
  Y_t = \phi_1Y_{t-1} + \phi_2Y_{t-2} + \cdots + \phi_pY_{t-p} + \epsilon_t
  \end{align*}
  - assuming the errors are white noise, the least squares estimate $\hat \phi$ is asymptotically normal
    \begin{align*}
    &\sqrt{n}(\hat\phi - \phi) \implies N_p(\mathbf 0, \sigma_\epsilon^2\mathbf \Gamma_p^{-1}), \\
    &\mathbf\Gamma_p = E(\mathbf Y^T\mathbf Y), \\
    &\mathbf Y = (Y_1, Y_2, \ldots, Y_p)
    \end{align*}
  - the $(i, j)$ element of the matrix is $E(Y_iY_j) = \gamma(i-j)$
- Yule Walker Equations
  - $\rho(k) = \phi_1\rho(k-1) + \phi_2\rho(k-2) + \cdots + \phi_p\rho(k-p), \quad\forall 1\le k\le p$
  - solve the $p\times p$ linear system to obtain an estimate $\hat \phi$:
    \begin{align*}
    \mathbf \rho &= \mathbf R\mathbf \phi, \\
    \mathbf \rho &= (\rho(1), \rho(2), \ldots, \rho(p))^T, \\
    \mathbf R_{i, j} &= \rho(i-j)
    \end{align*}
  - can be used as the initial guess for numerical root finding in MLE
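The Yule-Walker system $\mathbf\rho = \mathbf R\mathbf\phi$ can be solved with any linear solver. The sketch below is illustrative rather than a reference implementation: it uses a small hand-written Gaussian elimination to stay dependency-free, the function names are assumptions, and it feeds in the exact AR(2) autocorrelations so that the known coefficients should be recovered. In practice one would plug in sample autocorrelations $\hat\rho(k)$ instead.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def yule_walker(rho):
    """Given [rho(1), ..., rho(p)], return [phi_1, ..., phi_p]."""
    p = len(rho)
    full = [1.0] + list(rho)  # full[k] = rho(k), with rho(0) = 1
    R = [[full[abs(i - j)] for j in range(p)] for i in range(p)]
    return solve_linear(R, list(rho))


# Exact AR(2) autocorrelations for phi = (0.5, 0.3).
rho1 = 0.5 / (1 - 0.3)
rho2 = 0.5 * rho1 + 0.3
phi_hat = yule_walker([rho1, rho2])
print(phi_hat)
```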
- Likelihood Methods
- statsmodels
- Forecasting
Non-Stationary to Stationary
- Box-Cox Transformation
  - Box-Cox Transformation
    \begin{align*}
    X^{(\lambda)} = \begin{cases} (X^\lambda - 1)/\lambda &\mbox{ if }\lambda\ne 0\\ \log(X) &\mbox{ if } \lambda = 0 \end{cases}
    \end{align*}
  - Box-Cox only stabilizes the variance (for example, when the variance is larger at higher values of the series); it does not remove a trend in the mean
- Trend and Seasonal Components
  - $X_t = m_t + Y_t$
    - linear trend: $m_t = \beta_0 + \beta_1 t$
    - quadratic trend: $m_t = \beta_0 + \beta_1 t + \beta_2 t^2$
    - moving average smoother
      \begin{align*}
      \hat m_t &= \frac{1}{2q+1}\sum_{j=-q}^q X_{t+j}\\
      &= \frac{1}{2q+1}\sum_{j=-q}^q m_{t+j} + \frac{1}{2q+1}\sum_{j=-q}^q Y_{t+j}\\
      &\approx m_t + \text{small error}
      \end{align*}
  - seasonal component with period $d$
    \begin{align*}
    \hat X_t &= \beta_0 + \beta_1 t + \sum_{j=2}^d\beta_j l_j(t)\\
    l_j(t) &= \begin{cases} 1 &\mbox{if $t \equiv j \pmod d$}\\ 0 &\mbox{otherwise} \end{cases}\quad \forall 1\le j\le d
    \end{align*}
  - for monthly data ($d = 12$) there is a January indicator, a February indicator, and so on
  - one of the indicators ($l_1$ here) is omitted because the indicators sum to 1 and would otherwise be collinear with the intercept $\beta_0$
- Differencing
  - $\nabla = (1-B)$ can remove polynomial trends; for example $\nabla^2$ can remove quadratic trends
  - $\nabla_d = (1-B^d)$ can remove seasonal trend: if $X_t = \beta_0 + \beta_1 t + s_t + \epsilon_t$ where $s_t$ is the seasonal term such that $s_t = s_{t-d}$, then $\nabla_d X_t$ is weakly stationary
  - $\nabla_d \ne \nabla^d = (1-B)^d$
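The differencing facts above can be checked on noise-free toy series. The sketch below is only an illustration: `difference` is a hypothetical helper, and the trend coefficients and seasonal pattern are arbitrary. It applies $\nabla^2$ to a quadratic trend, $\nabla_4$ to a linear trend plus a period-4 seasonal term, and $\nabla^4$ to the same seasonal series to show that the two operators differ.

```python
def difference(x, lag=1):
    """Apply (1 - B^lag): returns x_t - x_{t-lag} for t >= lag."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Quadratic trend: two ordinary differences leave a constant.
quad = [1 + 2 * t + 3 * t * t for t in range(10)]
nabla2_quad = difference(difference(quad))

# Linear trend plus a period-4 seasonal pattern (s_t = s_{t-4}).
season = [3, -1, 0, -2]
x = [5 + 0.5 * t + season[t % 4] for t in range(16)]
seasonal_diff = difference(x, lag=4)  # nabla_4: removes s_t, leaves 4 * beta_1

# nabla^4 (four ordinary differences) is a different operator.
nabla4 = x
for _ in range(4):
    nabla4 = difference(nabla4)

print("nabla^2 of quadratic trend:", nabla2_quad)
print("nabla_4 of seasonal series:", seasonal_diff)
print("nabla^4 of seasonal series:", nabla4)
```

With noise added, $\nabla_d X_t$ would equal the constant $d\beta_1$ plus $\epsilon_t - \epsilon_{t-d}$, which is weakly stationary.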
ARCH/GARCH Modeling
- Motivation
  - ARIMA has non-constant $E(X_t|\mathcal F_{t-1})$ but constant $Var(X_t|\mathcal F_{t-1})$, GARCH is the opposite
  - deterministic models: $Var(X_t|\mathcal F_{t-1})$ is deterministic
  - stochastic volatility models: $Var(X_t|\mathcal F_{t-1})$ is a stochastic process
  - GARCH by itself does not explain the JPM GS situation
- ARCH(1)
  \begin{align*}
  a_t &= \sigma_t\epsilon_t\\
  \sigma_t &= \sqrt{\omega + \alpha a_{t-1}^2}, \quad\omega > 0, 0\le \alpha < 1
  \end{align*}
  - $\epsilon_t$ is iid with mean 0 and variance 1
  - $E(a_t|\mathcal F_{t-1}) = 0$
  - $Var(a_t|\mathcal F_{t-1}) = \sigma_t^2 = \omega + \alpha a_{t-1}^2$
  - assuming weak stationarity, ARCH(1) is a white noise: $\gamma_a(0) = E(\sigma_t^2) = E(\omega + \alpha a_{t-1}^2) = \omega + \alpha\gamma_a(0)$, so
    \begin{align*}
    \gamma_a(0) &= \frac{\omega}{1-\alpha}\\
    \gamma_a(h) &= 0, \quad h\ne 0
    \end{align*}
  - $\alpha$ controls the mean reversion of $\sigma^2_t$
- AR(1)/ARCH(1)
  \begin{align*}
  X_t = \mu + \beta(X_{t-1} - \mu) + a_t, \quad |\beta| < 1
  \end{align*}
  - $\rho_X(h) = \beta^{|h|}, \rho_{a^2}(h) = \alpha^{|h|}$
  - non-constant conditional mean and variance
- ARCH(p)
- ARCH(1) Properties
  - $a_t^2$ is an AR(1) if $E(\epsilon_t^4) < \infty$:
    \begin{align*}
    a_t^2 = \omega + \alpha a_{t-1}^2 + \sigma_t^2(\epsilon_t^2 - 1)
    \end{align*}
  - $\nu_t = \sigma_t^2(\epsilon_t^2 - 1)$ can be shown to be a white noise
  - when $\epsilon_t$ is iid $N(0, 1)$ and $0 < 3\alpha^2 < 1$, the unconditional kurtosis > 3: Following AR(1) properties, we have
    \begin{align*}
    E(a_t^2) &= \frac{\omega}{1-\alpha}, \\
    Var(a_t^2) &= \frac{2E(\sigma_t^4)}{1-\alpha^2}, \\
    E(\sigma_t^4) &= E((\omega + \alpha a_{t-1}^2)^2) \\
    &= \frac{\omega^2(1+\alpha)}{(1-3\alpha^2)(1-\alpha)},\\
    E(a_t^4) &= 3E(\sigma_t^4) = 3(E(a_t^2))^2\frac{1-\alpha^2}{1-3\alpha^2} > 3(E(a_t^2))^2
    \end{align*}
  - ARCH Effect: $a_t^2$ and $a_{t+h}^2$ are positively correlated
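The ARCH(1) moment results above can be sanity-checked numerically. The sketch below assumes Gaussian $\epsilon_t$ and $3\alpha^2 < 1$; the function names, seed, sample size, and burn-in length are arbitrary illustrative choices. It computes the unconditional variance and kurtosis from the formulas and compares the variance with a simulated path.

```python
import math
import random


def arch1_moments(omega, alpha):
    """Unconditional variance and kurtosis of ARCH(1) with N(0, 1) shocks."""
    if not (omega > 0 and 0 <= alpha and 3 * alpha ** 2 < 1):
        raise ValueError("need omega > 0, alpha >= 0 and 3*alpha^2 < 1")
    var = omega / (1 - alpha)
    e_sigma4 = omega ** 2 * (1 + alpha) / ((1 - 3 * alpha ** 2) * (1 - alpha))
    kurtosis = 3 * e_sigma4 / var ** 2  # E(a^4) = 3 E(sigma^4) for normal shocks
    return var, kurtosis


def simulate_arch1(omega, alpha, n, seed=0, burn=500):
    """Simulate a_t = sigma_t * eps_t with sigma_t^2 = omega + alpha * a_{t-1}^2."""
    rng = random.Random(seed)
    a_prev, out = 0.0, []
    for t in range(n + burn):
        sigma = math.sqrt(omega + alpha * a_prev ** 2)
        a_prev = sigma * rng.gauss(0.0, 1.0)
        if t >= burn:  # drop the burn-in so the start value matters less
            out.append(a_prev)
    return out


var, kurt = arch1_moments(1.0, 0.2)
sim = simulate_arch1(1.0, 0.2, 50000)
sample_var = sum(a * a for a in sim) / len(sim)  # the mean of a_t is 0
print("theoretical variance, kurtosis:", var, kurt)
print("sample variance:", sample_var)
```

The kurtosis exceeds 3 for any $\alpha > 0$ even though each conditional distribution is normal, which is the fat-tail stylized fact the notes refer to.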
- ARCH and Stylized Facts
  - ARCH does not support asymmetry or the leverage effect
- Weaknesses of ARCH Model
- From ARCH to GARCH
  \begin{align*}
  a_t &= \sigma_t\epsilon_t, \\
  \sigma^2_t &= \omega + \sum_{i=1}^p \alpha_i a_{t-i}^2 + \sum_{j=1}^q \beta_j\sigma_{t-j}^2, \quad\omega > 0, \alpha_i \ge 0, \beta_j \ge 0
  \end{align*}
  - $\epsilon_t$ is iid $N(0, 1)$
- GARCH(1, 1) squared is ARMA(1, 1)
  \begin{align*}
  a_t^2 - c &= (\alpha + \beta)(a_{t-1}^2 - c) - \beta\eta_{t-1} + \eta_t
  \end{align*}
  - $c = \omega/(1-\alpha-\beta), \eta_t = a_t^2 - \sigma_t^2$
  - ARMA(1, 1) with mean $c$ and coefficients $\phi_1 = \alpha + \beta, \theta_1 = -\beta$
- Fitting ARCH to S&P 500 Data
- GARCH$(p, q)$ squared is ARMA$(\max(p, q), q)$
  \begin{align*}
  a_t^2 - c &= \sum_{i=1}^{\max(p, q)}(\alpha_i + \beta_i)(a_{t-i}^2 - c) - \sum_{i=1}^{\max(p, q)}\beta_i\eta_{t-i} + \eta_t
  \end{align*}
  - with the convention $\alpha_i = 0$ for $i > p$ and $\beta_i = 0$ for $i > q$
  - $c = \omega/(1-\sum_{i=1}^{\max(p, q)}(\alpha_i + \beta_i)), \eta_t = a_t^2 - \sigma_t^2$
  - given $\alpha_i \ge 0, \beta_j \ge 0$, $a_t^2$ is weakly stationary if $\sum_{i=1}^p\alpha_i + \sum_{j=1}^q\beta_j < 1$
- GARCH Forecasting
  - 1-step ahead forecast of the conditional variance $\sigma_{t+1}^2$ is already given by the model
  - for GARCH(1, 1), let $\lambda = \alpha + \beta < 1$, the $k$-step ahead forecast is
    \begin{align*}
    \hat \sigma_{t+k}^2 &= \omega + \lambda \hat \sigma_{t+k-1}^2\\
    &= \omega(1 + \lambda + \cdots + \lambda^{k-2}) + \lambda^{k-1} \hat \sigma_{t+1}^2 \\
    &\rightarrow \frac{\omega}{1-\lambda}\quad \text{ as }k\rightarrow \infty
    \end{align*}
  - the deviation of the forecast from the long-run variance $\omega/(1-\lambda)$ shrinks by a factor $\lambda$ per step, so its half-life $T$ solves $\lambda^T = 1/2$, i.e. $T = -\frac{\log 2}{\log\lambda}$
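The GARCH(1, 1) forecast recursion and the half-life above can be written out directly. This is a minimal sketch with illustrative parameter values and function names; it takes the 1-step forecast $\hat\sigma_{t+1}^2$ as given, as the notes do.

```python
import math


def garch11_forecast(omega, alpha, beta, sigma2_next, k):
    """k-step ahead variance forecast, starting from the 1-step forecast."""
    lam = alpha + beta
    forecast = sigma2_next
    for _ in range(k - 1):
        forecast = omega + lam * forecast
    return forecast


def garch11_half_life(alpha, beta):
    """Number of steps T with (alpha + beta)^T = 1/2."""
    return -math.log(2) / math.log(alpha + beta)


omega, alpha, beta = 0.1, 0.1, 0.8
long_run = omega / (1 - alpha - beta)
forecasts = [garch11_forecast(omega, alpha, beta, 2.0, k) for k in (1, 5, 20, 500)]
half_life = garch11_half_life(alpha, beta)
print("long-run variance:", long_run)
print("forecasts at k = 1, 5, 20, 500:", forecasts)
print("half-life in steps:", half_life)
```

Starting from a 1-step forecast of twice the long-run level, the forecasts decay geometrically toward $\omega/(1-\lambda)$.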
- Engle Test for ARCH Effects
- GARCH Forecasting Example in Risk Management
- Other Volatility Models
  - GARCH-M
    \begin{align*}
    X_t &= \mu + c\sigma_t^2 + a_t\\
    a_t &= \epsilon_t\sigma_t\\
    \sigma_t^2 &= \omega + \alpha a_{t-1}^2 + \beta \sigma_{t-1}^2
    \end{align*}
  - EGARCH
    \begin{align*}
    g(\epsilon_t) &= \theta\epsilon_t + \gamma(|\epsilon_t| - E(|\epsilon_t|))\\
    &= \begin{cases} (\theta + \gamma)\epsilon_t - \gamma E(|\epsilon_t|) &\mbox{ if } \epsilon_t\ge 0\\ (\theta - \gamma)\epsilon_t - \gamma E(|\epsilon_t|) &\mbox{ if } \epsilon_t < 0 \end{cases},\\
    a_t &= \sigma_t\epsilon_t\\
    \log(\sigma_t^2) &= \omega + \sum_{i=1}^p\beta_i \log(\sigma_{t-i}^2) + \sum_{j=1}^q g_j(\epsilon_{t-j})
    \end{align*}
Multivariate Time Series
- Multivariate Time Series
  - weakly stationary: mean vector and autocovariance function (now a matrix) are independent of $t$
    \begin{align*}
    \mathbf X_t &= (X_{1,t}, X_{2,t}, \ldots, X_{m,t})^T\\
    \mathbf \Gamma(t+h, t) &= E((\mathbf X_{t+h}-\mathbf \mu_{t+h})(\mathbf X_{t}-\mathbf \mu_{t})^T)\\
    \rho_{i, j}(h) &= \frac{\gamma_{i, j}(h)}{\sqrt{\gamma_{i, i}(0)\gamma_{j, j}(0)}}
    \end{align*}
  - the diagonal elements are the ACovF of the individual component time series
  - white noise: weakly stationary + zero mean + zero ACF $\forall h\ne 0$
  - $\rho_{i, j}(h) = \rho(X_{i,(t+h)}, X_{j, t}) = \rho_{j, i}(-h)$
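  - under suitable additional conditions (for example, a linear process with absolutely summable coefficients and iid innovations), the sample mean of a weakly stationary process converges and is asymptotically normal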
- Vector Autoregressive Processes
  \begin{align*}
  \mathbf X_t = \mathbf a_0 + \sum_{i=1}^p\mathbf A_i\mathbf X_{t-i} + \mathbf\epsilon_t
  \end{align*}
  - stationarity condition: all roots of
    \begin{align*}
    \det\left(I - \sum_{i=1}^p \mathbf A_i x^i\right) = 0
    \end{align*}
    have modulus strictly larger than 1
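For a VAR(1), the root condition above is equivalent to all eigenvalues of $\mathbf A_1$ having modulus strictly less than 1, since $\det(I - \mathbf A_1 x) = 0$ exactly when $1/x$ is an eigenvalue of $\mathbf A_1$. The sketch below checks this for a bivariate VAR(1) using the closed-form eigenvalues of a $2\times 2$ matrix; the matrices and function names are illustrative assumptions, and a general-purpose eigenvalue routine would be needed for larger systems.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


def var1_is_stationary(a):
    """VAR(1) is stationary iff all eigenvalues of A_1 lie inside the unit circle."""
    return all(abs(lam) < 1 for lam in eigenvalues_2x2(a))


stable = [[0.5, 0.1], [0.2, 0.3]]
unit_root = [[1.0, 0.0], [0.0, 0.5]]  # first component is a random walk
print(eigenvalues_2x2(stable), var1_is_stationary(stable))
print(eigenvalues_2x2(unit_root), var1_is_stationary(unit_root))
```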
Cointegration
- Cointegration
  - the components of a multivariate time series $X_t$ are CI$(d, b)$ if
    1. all components are $\mathcal I(d)$
    2. there exists a nonzero $\vec\alpha$ (the cointegrating vector) such that $\vec\alpha^T X_t$ is $\mathcal I(d-b)$ with $b>0$
  - two cointegrated time series $X_t, Y_t$ can have small correlation:
    \begin{align*}
    W_t &= W_{t-1} + \epsilon_t\\
    X_t &= W_t + \epsilon_{X, t}\\
    Y_t &= W_t + \epsilon_{Y, t}
    \end{align*}
  - both are $\mathcal I(1)$ but $X_t - Y_t$ is stationary; with $W_0 = 0$ and mutually independent noise terms with variances $\sigma^2, \sigma_X^2, \sigma_Y^2$,
    \begin{align*}
    Corr(X_t, Y_t) = \frac{t\sigma^2}{\sqrt{(t\sigma^2 + \sigma_X^2)(t\sigma^2 + \sigma_Y^2)}}
    \end{align*}
    which is small when $\sigma_X^2$ and $\sigma_Y^2$ are large relative to $t\sigma^2$
  - if $m=2$, $\vec\alpha$ is unique up to scale
  - conversely, high correlation does not imply cointegration:
    \begin{align*}
    X_t &= X_{t-1} + \epsilon_{X, t}\\
    Y_t &= Y_{t-1} + \epsilon_{Y, t}\\
    Z_t &= X_t + Y_t
    \end{align*}
  - $X_t$ and $Z_t$ are not cointegrated but $\rho_{X, Z} = 1/\sqrt{1+\sigma_Y^2/\sigma_X^2}$ which will be large if $\sigma_Y/\sigma_X$ is small
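The two correlation formulas in the examples above are easy to evaluate for concrete values. The sketch below is a worked-arithmetic illustration only: it assumes $W_0 = 0$ and mutually independent noise terms, and the function names and parameter values are made up.

```python
import math


def corr_common_trend(t, sigma2, sigma2_x, sigma2_y):
    """Corr(X_t, Y_t) for X_t = W_t + eps_X, Y_t = W_t + eps_Y with W_0 = 0."""
    num = t * sigma2
    return num / math.sqrt((num + sigma2_x) * (num + sigma2_y))


def corr_walk_and_sum(sigma_x, sigma_y):
    """Corr(X_t, Z_t) for independent random walks X, Y and Z = X + Y."""
    return 1 / math.sqrt(1 + (sigma_y / sigma_x) ** 2)


# Cointegrated pair, but noisy observations and a short history: low correlation.
low = corr_common_trend(t=1, sigma2=1.0, sigma2_x=9.0, sigma2_y=9.0)
# Same pair after a long history: the common random walk dominates.
high = corr_common_trend(t=1000, sigma2=1.0, sigma2_x=9.0, sigma2_y=9.0)
# Not cointegrated, yet highly correlated because sigma_Y is small.
not_ci = corr_walk_and_sum(sigma_x=1.0, sigma_y=0.1)
print(low, high, not_ci)
```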
- Johansen Test
  - difference the time series until it’s $\mathcal I(1)$
  - $\mathbf X_t$ is VAR$(p)$
    \begin{align*}
    \nabla\mathbf X_t &= \mathbf a + (\mathbf A_1 - I) \nabla\mathbf X_{t-1} + (\mathbf A_1 + \mathbf A_2 - I)\mathbf X_{t-2} + \sum_{i=3}^p\mathbf A_i\mathbf X_{t-i} + \mathbf \epsilon_t = \cdots\\
    \mathbf B_i &= (\mathbf A_1 + \cdots + \mathbf A_i -I)
    \end{align*}
  - $\mathbf B_i\mathbf X_{t-i}$ is stationary iff the rows of $\mathbf B_i$ are cointegrating vectors or 0
  - $\mathbf B$ cannot be full rank; otherwise, multiplying by $\mathbf B^{-1}$ would imply that $\mathbf X_{t-i}$ is stationary
  - if $rank(\mathbf B) = 0$, there is no cointegrating vector
    \begin{align*}
    H_0 &: rank(\mathbf B)=0\\
    H_1 &: rank(\mathbf B)>0
    \end{align*}
- Cryptocurrency Example
State Space Modeling
- State Space Models
  \begin{align*}
  \mathbf X_{t+1} &= \mathbf F_t \mathbf X_t + \mathbf V_t\\
  \mathbf Y_t &= \mathbf G_t \mathbf X_t + \mathbf W_t
  \end{align*}
  - $\mathbf V_t$ and $\mathbf W_t$ are uncorrelated WN
- Kalman Recursions: Kalman Prediction & Filtering
  - Prediction: Estimate $\mathbf X_{t+1}$ or $\mathbf X_{t+k}$ using $\mathbf Y_0, \mathbf Y_1, \ldots, \mathbf Y_t$; denoted $\hat{\mathbf X}_{t+k|t}$
  - Filtering: Estimate $\mathbf X_t$ using $\mathbf Y_0, \mathbf Y_1, \ldots, \mathbf Y_t$; denoted $\hat{\mathbf X}_{t|t}$
  - Smoothing: Estimate $\{\mathbf X_t\}_{t=1}^{T-1}$ using $\mathbf Y_0, \mathbf Y_1, \ldots, \mathbf Y_T$; denoted $\hat{\mathbf X}_{t|T}$
- Example (Linear Regression)
- AR, MA in State Space Form
  - AR(2) with zero mean: $X_t = \phi_1X_{t-1} + \phi_2X_{t-2} + \epsilon_t$
    \begin{align*}
    &\mathbf X_t = (X_t, X_{t-1})^T, \quad\mathbf \epsilon_t = (\epsilon_t, 0)^T, \\
    &\mathbf X_t = \mathbf F\mathbf X_{t-1} + \mathbf\epsilon_t, \\
    &Y_t = (1, 0)\mathbf X_t, \\
    &\mathbf F = \begin{pmatrix} \phi_1 & \phi_2\\ 1 & 0 \end{pmatrix}
    \end{align*}
  - MA(2) with zero mean: $X_t = \theta_1\epsilon_{t-1} + \theta_2\epsilon_{t-2} + \epsilon_t$
    \begin{align*}
    &\mathbf X_t = (\epsilon_{t}, \epsilon_{t-1}, \epsilon_{t-2})^T, \quad\mathbf \epsilon_t = (\epsilon_t, 0, 0)^T, \\
    &\mathbf X_t = \mathbf F\mathbf X_{t-1} + \mathbf\epsilon_t, \\
    &Y_t = (1, \theta_1, \theta_2)\mathbf X_t, \\
    &\mathbf F = \begin{pmatrix} 0 & 0 & 0\\ 1 & 0 & 0\\ 0 & 1 & 0 \end{pmatrix}
    \end{align*}
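As a check on the AR(2) state space form above, the sketch below runs the companion-matrix recursion and the scalar AR(2) recursion on the same shocks and compares the outputs. The shock values, starting state, and function names are arbitrary illustrative choices.

```python
def matvec(m, v):
    """Matrix-vector product for small list-based matrices."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def ar2_state_space(phi1, phi2, eps, init_state):
    """Iterate X_t = F X_{t-1} + (eps_t, 0)^T and observe Y_t = (1, 0) X_t."""
    F = [[phi1, phi2], [1.0, 0.0]]
    G = [1.0, 0.0]
    state = list(init_state)  # (X_{t-1}, X_{t-2})
    observations = []
    for e in eps:
        state = matvec(F, state)
        state[0] += e  # only the first state component receives the shock
        observations.append(sum(g * s for g, s in zip(G, state)))
    return observations


def ar2_direct(phi1, phi2, eps, x_prev, x_prev2):
    """Scalar recursion X_t = phi1 X_{t-1} + phi2 X_{t-2} + eps_t."""
    out = []
    for e in eps:
        x = phi1 * x_prev + phi2 * x_prev2 + e
        out.append(x)
        x_prev2, x_prev = x_prev, x
    return out


shocks = [0.5, -1.0, 0.25, 0.0, 1.5]
via_state_space = ar2_state_space(0.5, 0.3, shocks, init_state=(0.0, 0.0))
via_recursion = ar2_direct(0.5, 0.3, shocks, x_prev=0.0, x_prev2=0.0)
print(via_state_space)
print(via_recursion)
```

The same pattern carries over to the MA(2) form: the shift matrix $\mathbf F$ moves past shocks down the state vector and the observation row $(1, \theta_1, \theta_2)$ forms the weighted sum.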
- Bayesian Background to Kalman Methods
- Stochastic Volatility