Choosing p and q for ARMA time series models

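For a stationary time series y_n, first compute its autocorrelation function (ACF) and partial autocorrelation function (PACF) sequences.

If the PACF cuts off after lag p while the ACF tails off (it decays to 0, bounded by a negative exponential), then y_n is an AR(p) series.

If the ACF cuts off after lag q while the PACF tails off in the same way, then y_n is an MA(q) series.

If neither the ACF nor the PACF cuts off, but both decay to 0 bounded by a negative exponential, then y_n is an ARMA series.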
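The cut-off versus tail-off behaviour can be checked on theoretical autocorrelations. The sketch below assumes the Durbin-Levinson recursion as the way to turn an ACF into a PACF; the function name `pacf_from_acf` is made up for illustration and is not part of statsmodels.

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion: partial autocorrelations from autocorrelations.

    rho[k] is the autocorrelation at lag k, with rho[0] == 1.
    Returns a list of the same length whose k-th entry is the PACF at lag k.
    """
    pacf = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1}, the AR coefficients of order k-1
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


# AR(1) with coefficient 0.5: the ACF is 0.5**k, so the PACF should cut off after lag 1.
print(pacf_from_acf([0.5 ** k for k in range(5)]))

# MA(1) with theta = 0.5: the ACF cuts off after lag 1 (rho1 = 0.4), the PACF tails off.
print(pacf_from_acf([1.0, 0.4, 0.0, 0.0, 0.0]))
```

With sample data the values past the cut-off are not exactly zero, which is why a confidence band is needed to decide what counts as zero.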

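For a series of n observations, compute the residual variance of the AR(p), MA(q), and ARMA(p, q) candidates. Within each model family, the order that gives the smallest residual variance is taken as the best order for that model.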
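In-sample residual variance typically keeps shrinking as the order grows, so the comparison is usually penalized by the number of parameters. The sketch below assumes the common Gaussian least-squares forms of AIC and BIC (additive constants dropped), which the post does not spell out; the helper names and the residual variances are hypothetical.

```python
import math


def information_criteria(resid_var, n_obs, n_params):
    """AIC and BIC computed from a residual variance (constants dropped)."""
    aic = n_obs * math.log(resid_var) + 2 * n_params
    bic = n_obs * math.log(resid_var) + n_params * math.log(n_obs)
    return aic, bic


def best_order(resid_vars, n_obs, criterion="aic"):
    """resid_vars maps (p, q) -> residual variance; return the lowest-scoring order."""
    idx = 0 if criterion == "aic" else 1
    return min(
        resid_vars,
        key=lambda order: information_criteria(resid_vars[order], n_obs, sum(order))[idx],
    )


# Hypothetical residual variances for three fitted candidates (illustrative numbers only).
candidates = {(1, 0): 1.30, (2, 0): 1.02, (2, 1): 1.015}
print(min(candidates, key=candidates.get))   # order with the smallest raw variance
print(best_order(candidates, n_obs=200))     # order preferred once parameters are penalized
```

In this made-up example the raw variance favours the largest model, while both criteria prefer the smaller (2, 0) model because the extra parameter barely reduces the variance.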

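The parameters of an AR model can be obtained from the matrix built from the ACF sequence and the relations between those matrices (the Yule-Walker equations).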
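As a rough illustration, the AR coefficients solve the linear system R φ = r, where R is the Toeplitz matrix of autocorrelations and r holds lags 1 to p. The sketch below assumes the convention y_t = φ1 y_{t-1} + ... + φp y_{t-p} + e_t and uses plain Gaussian elimination; the function name `yule_walker` is chosen here for illustration only.

```python
def yule_walker(rho, p):
    """Solve the Yule-Walker system R * phi = r for an AR(p) model.

    rho[k] is the autocorrelation at lag k (rho[0] == 1).
    R is the p x p matrix with entries rho[|i - j|]; r = [rho[1], ..., rho[p]].
    """
    # Augmented matrix [R | r].
    a = [[rho[abs(i - j)] for j in range(p)] + [rho[i + 1]] for i in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, p):
            factor = a[row][col] / a[col][col]
            for k in range(col, p + 1):
                a[row][k] -= factor * a[col][k]
    # Back substitution.
    phi = [0.0] * p
    for i in reversed(range(p)):
        s = sum(a[i][j] * phi[j] for j in range(i + 1, p))
        phi[i] = (a[i][p] - s) / a[i][i]
    return phi


# Theoretical autocorrelations of an AR(2) with phi1 = 0.5 and phi2 = 0.25.
rho1 = 0.5 / (1 - 0.25)
rho2 = 0.5 * rho1 + 0.25
print(yule_walker([1.0, rho1, rho2], p=2))
```

Feeding in the exact autocorrelations recovers the coefficients up to rounding; with sample autocorrelations the same system gives the Yule-Walker estimates.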

The parameters of an MA model can be found with a linear iteration method.
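The post does not say which iteration it means. One common reading, shown here for MA(1) only, is a fixed-point iteration on the autocovariance equations. The sketch assumes the convention y_t = e_t + θ e_{t-1}, so that γ0 = σ²(1 + θ²) and γ1 = σ²θ; the function name is hypothetical.

```python
def ma1_by_iteration(gamma0, gamma1, n_iter=200):
    """Fixed-point iteration for the MA(1) parameters theta and sigma2.

    Alternates between sigma2 = gamma0 / (1 + theta**2) and theta = gamma1 / sigma2,
    starting from theta = 0. A real solution requires |gamma1 / gamma0| <= 0.5.
    """
    theta = 0.0
    sigma2 = gamma0
    for _ in range(n_iter):
        sigma2 = gamma0 / (1.0 + theta ** 2)
        theta = gamma1 / sigma2
    return theta, sigma2


# Autocovariances of an MA(1) with theta = 0.5 and sigma2 = 1.
theta, sigma2 = ma1_by_iteration(gamma0=1.25, gamma1=0.5)
print(theta, sigma2)
```

Starting from zero, the iteration settles on the invertible root (|θ| < 1) rather than its reciprocal.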

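For an ARMA model, the AR part and the MA part are estimated separately with the two methods above, which together give the ARMA parameters.

Substituting the chosen model and the estimated parameters into the estimation function then gives the estimated values.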
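A minimal sketch of that last step, assuming a zero-mean ARMA(p, q) written as y_t = Σ φ_i y_{t-i} + e_t + Σ θ_j e_{t-j} and treating values before the start of the sample as zero. The function name `arma_one_step` is hypothetical; a library fit would handle the start-up values more carefully.

```python
def arma_one_step(y, phi, theta):
    """One-step-ahead fitted values and residuals for a zero-mean ARMA(p, q).

    phi and theta are lists of AR and MA coefficients (either may be empty).
    """
    fitted, resid = [], []
    for t in range(len(y)):
        # AR part: weighted sum of the most recent observed values.
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        # MA part: weighted sum of the most recent one-step residuals.
        ma_part = sum(theta[j] * resid[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        fitted.append(ar_part + ma_part)
        resid.append(y[t] - fitted[-1])
    return fitted, resid


print(arma_one_step([1.0, 2.0, 3.0], phi=[0.5], theta=[]))
print(arma_one_step([1.0, 2.0, 3.0], phi=[], theta=[0.5]))
```

The residuals returned here are the quantities whose variance is compared across candidate orders.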

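Recently at work I used ARMA (ARIMA) time series models and needed to choose p and q automatically. Everything I could find picked p and q by looking at the ACF and PACF plots and judging by eye where they tail off or cut off. At first I was stuck, then it occurred to me: why not look at the source code of the plotting functions in the statsmodels library I was already importing?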

import statsmodels
print(statsmodels.__file__)  # location of the installed package, to find the plotting source

acf_x, confint = pacf(x, nlags=nlags, alpha=alpha, method=method)
if use_vlines:
    ax.vlines(lags, [0], acf_x, **kwargs)
    ax.axhline(**kwargs)
# center the confidence interval TODO: do in acf?
confint = confint - confint.mean(1)[:, None]
kwargs.setdefault('marker', 'o')
kwargs.setdefault('markersize', 5)
kwargs.setdefault('linestyle', 'None')
ax.margins(.05)
ax.plot(lags, acf_x, **kwargs)
ax.fill_between(lags, confint[:, 0], confint[:, 1], alpha=.25)

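From the plotting code, the lower bound of the confidence band is confint[:, 0] and the upper bound is confint[:, 1]. The band is centered on zero by:

confint = confint - confint.mean(1)[:, None]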
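To see why the centering matters: each row of the interval is centered on the estimate itself, so subtracting the row mean leaves a band around zero, and an estimate counts as "zero" when it falls inside that band. The sketch below uses plain Python lists instead of NumPy arrays, with made-up numbers; `center_band` is a hypothetical helper.

```python
def center_band(confint):
    """Subtract each row's midpoint, turning [est - w, est + w] into [-w, +w]."""
    centered = []
    for lower, upper in confint:
        mid = (lower + upper) / 2.0
        centered.append((lower - mid, upper - mid))
    return centered


# Illustrative numbers only: estimates at lags 0..3, half-width 0.25 (0 at lag 0).
estimates = [1.0, 0.5, 0.125, -0.0625]
raw = [(1.0, 1.0), (0.25, 0.75), (-0.125, 0.375), (-0.3125, 0.1875)]

band = center_band(raw)
inside = [lower < est < upper for est, (lower, upper) in zip(estimates, band)]
print(band)
print(inside)
```

Here lags 0 and 1 lie outside the band and lags 2 and 3 lie inside it, so the first lag treated as zero is lag 2.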

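So one can write a loop that returns the current p or q as soon as a cut-off appears, which also settles whether to use an AR or an MA model. If both sequences tail off, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), or some other criterion is needed to decide further, but that is a separate matter. Apart from a somewhat long running time, there was nothing difficult about it.

The code is as follows: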

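# Checks whether a time series is stationary. The code uses the 5% critical value
# and a p-value threshold of 0.1; adjust both to suit the data.
from pandas import Series
# Older statsmodels releases exposed ARIMA under statsmodels.tsa.arima_model instead.
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller, acf, pacf

def teststationarity(timeser):
    stationarity = False
    dftest = adfuller(timeser)  # augmented Dickey-Fuller test
    dfoutput = Series(dftest[:4], index=[
        'test statistic', 'p-value', 'lags', 'nobs'])
    for key, value in dftest[4].items():  # critical values keyed by '1%', '5%', '10%'
        dfoutput['critical values (%s)' % key] = value
    if dfoutput['test statistic'] < dfoutput['critical values (5%)']:
        if dfoutput['p-value'] < 0.1:
            stationarity = True
    return stationarity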

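(Because the selection is automatic, the rate of decay is not checked. That was acceptable for me because, in my data, the first value falling inside the confidence band was already the local optimum. With other data the code may need some changes; this is only meant to offer one approach.)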

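def p_q_choice(timeser, nlags=40, alpha=.05):
    # Assumed reconstruction: the dict literal is missing in the source; nlags and
    # alpha are the only arguments available to forward to acf and pacf.
    kwargs = {'nlags': nlags, 'alpha': alpha}
    acf_x, confint = acf(timeser, **kwargs)
    acf_px, confint2 = pacf(timeser, **kwargs)
    # Center both confidence bands on zero, as the plotting code does.
    confint = confint - confint.mean(1)[:, None]
    confint2 = confint2 - confint2.mean(1)[:, None]
    p = q = None  # stay None if no lag falls inside its band
    # q: first lag whose ACF value lies inside the band.
    for key1, x, y, z in zip(range(nlags), acf_x, confint[:, 0], confint[:, 1]):
        if x > y and x < z:
            q = key1
            break
    # p: first lag whose PACF value lies inside its own band (confint2 for both bounds).
    for key2, x, y, z in zip(range(nlags), acf_px, confint2[:, 0], confint2[:, 1]):
        if x > y and x < z:
            p = key2
            break
    return p, q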

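Appendix:

Workflow for applying an ARIMA model

Assess stationarity from the scatter plot of the series and from its ACF and PACF plots.

Transform a non-stationary series until it is stationary, that is, until the ACF and PACF values of the transformed series are not significantly different from zero.

Build the time series model that matches the identified characteristics.

After the series has been made stationary, if the PACF cuts off and the ACF tails off, build an AR model; if the ACF cuts off and the PACF tails off, build an MA model;

if both the PACF and the ACF tail off, the series suits an ARMA model.

Estimate the parameters and check whether they are statistically significant.

Run hypothesis tests to diagnose whether the residual series is white noise.

Use the model that has passed these checks for forecasting.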
