On optimal adaptive prediction of a multivariate ARMA(1,1) process
We consider the problem of the asymptotic efficiency of adaptive one-step predictors of a stable multivariate ARMA(1,1) process with unknown dynamic parameters. The prediction is based on the truncated estimation method for the dynamic matrix. Truncated estimators are a modification of truncated sequential estimators that attains a prescribed accuracy on samples of fixed size. The optimality criterion for the predictors is based on a loss function defined as a linear combination of the sample size and the sample mean of the squared prediction error. Both the case of known and the case of unknown noise variance are studied; in the latter case the optimal observation duration is written as a stopping time.
According to Ljung's concept of constructing complete probabilistic models of dynamic systems, prediction is a crucial part of such a model (see [1, 2]). A model is said to be useful if it allows one to make predictions of high statistical quality. Models of dynamic systems often contain unknown parameters, which must be estimated in order to build adaptive predictors. The quality of adaptive prediction depends explicitly on the chosen estimators of the model parameters. There is a wide variety of possible estimation methods. For example, the sequential estimation method makes it possible to obtain estimators with guaranteed accuracy from samples of finite but random and unbounded size (see, e.g., [3] among others). The more recent truncated sequential estimation method yields estimators with prescribed accuracy from samples of random but bounded size (see, e.g., [4]). This work suggests predictors based upon the truncated estimators of parameters introduced in [5, 6] as a modification of the truncated sequential estimators. Truncated estimators were constructed for ratio-type functionals; they are designed to use samples of fixed (non-random) size and have guaranteed accuracy in the sense of the $L_{2m}$-norm, $m > 1$. The requirement of both good quality of predictions and reasonable duration of the observations needed to achieve it is formulated as a risk efficiency problem. The criterion is given by certain loss functions, and the optimization is performed with respect to them. The loss function combining the sample mean of squared prediction errors with the sample size, and the corresponding risk, were examined as applied to the scalar AR(1) process in [7]. It was shown that the least squares estimators of the dynamic parameter are asymptotically risk efficient. Later this result was refined and extended to other stochastic models in [8], using sequential estimators of the unknown parameters.
In this paper we construct and investigate real-time predictors based on truncated estimators for a more general model. We consider the problem of minimizing a risk associated with the sample size and with the prediction of the values of a stable multivariate ARMA(1,1) process with an unknown dynamic matrix parameter. The proposed procedure is shown to be asymptotically risk efficient as the cost of the prediction error tends to infinity.

№ 1 (30) ВЕСТНИК ТОМСКОГО ГОСУДАРСТВЕННОГО УНИВЕРСИТЕТА

The same problem was considered for the scalar AR(1) case in [9] and for the multivariate AR(1) case in [10]. The ARMA model was studied in [1, 2] among others. A thorough review of the risk efficient parameter estimation and adaptive prediction problem for autoregressive processes was recently given in [11] (see also the references therein).

1. Problem statement

Consider the multivariate stable ARMA(1,1) process satisfying the equation

$$x(k) = \Lambda x(k-1) + \xi(k) + M\xi(k-1), \quad k \ge 1, \eqno(1)$$

where $\Lambda$ and $M$ are $p \times p$ matrix parameters with eigenvalues inside the unit circle, which ensures the stability of the process (henceforth we shall refer to such matrices as "stable" ones). We assume the parameter $\Lambda$ to be unknown and $M$ to be known. The random vectors $\xi(k)$, $k \ge 1$, are independent and identically distributed (i.i.d.) with zero mean and finite variance $\sigma^2 = E\|\xi(1)\|^2$; we also assume the components $\xi_j(k)$, $j = 1, \dots, p$, to be uncorrelated and identically distributed, so that the covariance matrix $\Sigma = E\xi(1)\xi'(1)$ is diagonal with elements $\sigma^2/p$. Denote by $\Lambda_0 \subset \mathbb{R}^{p \times p}$ the stability region of $\Lambda$. It is known that the optimal one-step predictor in the mean square sense is the conditional expectation of the process given its past, i.e.

$$x_{opt}(k) = \Lambda x(k-1) + M\xi(k-1), \quad k \ge 1.$$

Since both the parameter $\Lambda$ and the value of $\xi(k-1)$ are unknown, it is natural to replace them with estimators $\hat\Lambda_k$ and $\hat\xi(k-1)$, which we specify in Section 2 below.
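As a concrete illustration of model (1) and the optimal predictor, the following sketch simulates a stable bivariate ARMA(1,1) path and checks that the error of $x_{opt}$ is exactly the innovation $\xi(k)$, so that the mean of its squared norm estimates $\sigma^2$. All matrices, sizes, and the noise scale are illustrative choices, not values from the paper.

```python
import numpy as np

# Simulation of model (1): x(k) = Lam x(k-1) + xi(k) + M xi(k-1).
# Lam, M, p, sigma2 are illustrative; both matrices are stable
# (eigenvalues inside the unit circle), and the noise covariance is
# (sigma2 / p) * I, matching the assumptions of the problem statement.
rng = np.random.default_rng(0)
p = 2
Lam = np.array([[0.5, 0.1],
                [0.0, 0.3]])      # eigenvalues 0.5, 0.3
M = np.array([[0.2, 0.0],
              [0.1, 0.4]])        # eigenvalues 0.2, 0.4
sigma2 = 1.0                      # sigma^2 = E ||xi(1)||^2

n = 5000
xi = rng.normal(scale=np.sqrt(sigma2 / p), size=(n + 1, p))
x = np.zeros((n + 1, p))
for k in range(1, n + 1):
    x[k] = Lam @ x[k - 1] + xi[k] + M @ xi[k - 1]

# Optimal one-step predictor: x_opt(k) = Lam x(k-1) + M xi(k-1).
# Its prediction error is exactly the innovation xi(k).
x_opt = x[:-1] @ Lam.T + xi[:-1] @ M.T
err = x[1:] - x_opt
print(np.allclose(err, xi[1:]))       # True: the error is the noise itself
print((err**2).sum(axis=1).mean())    # ≈ sigma2
```

This makes concrete why $\sigma^2$ is the irreducible part of the risk: no predictor can do better than an error with mean squared norm $\sigma^2$.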
Define the adaptive predictors as follows (see, e.g., [1, 12]):

$$\hat x(k) = \hat\Lambda_{k-1} x(k-1) + M\hat\xi(k-1), \quad k \ge 1, \eqno(2)$$

for which the corresponding prediction errors have the form

$$e(k) = x(k) - \hat x(k) = (\Lambda - \hat\Lambda_{k-1})x(k-1) + M(\xi(k-1) - \hat\xi(k-1)) + \xi(k).$$

Let $\bar e^2(n)$ denote the sample mean of the squared prediction errors,

$$\bar e^2(n) = \frac{1}{n}\sum_{k=1}^{n}\|e(k)\|^2.$$

Define the loss function

$$L_n = \frac{A}{n}\bar e^2(n) + n,$$

where the parameter $A > 0$ is the cost of the prediction error. The corresponding risk function is

$$R_n = E_\theta L_n = \frac{A}{n}E_\theta \bar e^2(n) + n, \eqno(3)$$

where $E_\theta$ denotes expectation under the distribution $P_\theta$ with the given parameter $\theta = (\lambda_{11}, \dots, \lambda_{pp}, \mu_{11}, \dots, \mu_{pp}, \sigma^2)$. Define the set $\Theta$ such that for $\theta \in \Theta$ the matrices $\Lambda$ and $M$ are stable and $\sigma^2 > 0$. The main aim is to minimize the risk $R_n$ over the sample size $n$. We consider the cases of known and unknown $\sigma^2$.

2. Main result

In this section we solve the stated optimization problem under different conditions on the model parameters. Similarly to [10], we use the truncated estimation method introduced in [5]. This method makes it possible to obtain ratio-type estimators with guaranteed accuracy using a sample of fixed size. This property may essentially simplify the investigation of the analytical properties of various adaptive procedures. Let the truncated estimators of the autoregressive parameter $\Lambda$ be based on the following Yule-Walker type estimators:

$$\tilde\Lambda_k = \Phi_k G_k^{-1}, \quad k \ge 2, \qquad \tilde\Lambda_0 = \tilde\Lambda_1 = 0, \eqno(4)$$

$$\Phi_k = \frac{1}{k-1}\sum_{i=2}^{k} x(i)x'(i-2), \qquad G_k = \frac{1}{k-1}\sum_{i=2}^{k} x(i-1)x'(i-2),$$

and have the form

$$\hat\Lambda_k = \tilde\Lambda_k\,\chi(|\Delta_k| > H_k), \quad k \ge 2. \eqno(5)$$

Here $\Delta_k = \det(G_k)$, the notation $\chi(B)$ means the indicator function of the set $B$, and

$$H_k = \log^{-1/2} k. \eqno(6)$$

We note that, according to [5], $H_k$ can be taken as any decreasing slowly varying positive function. We take the estimators of $\xi(k)$ in the form

$$\hat\xi(k) = \sum_{i=0}^{k-1}(-M)^i\bigl(x(k-i) - \hat\Lambda_k x(k-1-i)\bigr), \quad k \ge 1. \eqno(7)$$

In this way the prediction error can be rewritten as

$$e(k) = \xi(k) + (-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1} - \Lambda)x(k-1-i).$$
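A minimal sketch of the estimators (4)-(7) and the adaptive predictor (2) follows. The matrices, sizes, and noise scale are illustrative assumptions, chosen so that the limit matrix of $G_k$ is safely nondegenerate and the truncation in (5) is not triggered for large $k$.

```python
import numpy as np

# Sketch of the truncated Yule-Walker estimator (4)-(6), the noise
# estimate (7), and the adaptive predictor (2). Lam, M, sigma2, n are
# illustrative assumptions, chosen so that |det G_k| stays above the
# threshold H_k = log^{-1/2} k.
rng = np.random.default_rng(1)
p = 2
Lam = np.array([[0.5, 0.1],
                [0.0, 0.3]])
M = np.array([[0.2, 0.0],
              [0.1, 0.4]])
sigma2 = 2.0
n = 3000
xi = rng.normal(scale=np.sqrt(sigma2 / p), size=(n + 1, p))
x = np.zeros((n + 1, p))
for k in range(1, n + 1):
    x[k] = Lam @ x[k - 1] + xi[k] + M @ xi[k - 1]

def lam_hat(k):
    """Truncated estimator (5): the ratio (4) is kept only when
    |det G_k| exceeds H_k = log^{-1/2} k from (6)."""
    Phi = sum(np.outer(x[i], x[i - 2]) for i in range(2, k + 1)) / (k - 1)
    G = sum(np.outer(x[i - 1], x[i - 2]) for i in range(2, k + 1)) / (k - 1)
    if abs(np.linalg.det(G)) <= 1.0 / np.sqrt(np.log(k)):
        return np.zeros((p, p))            # truncation: estimate set to 0
    return Phi @ np.linalg.inv(G)

def xi_hat(k, L):
    """Noise estimate (7): sum_{i=0}^{k-1} (-M)^i (x(k-i) - L x(k-1-i))."""
    s, Mi = np.zeros(p), np.eye(p)
    for i in range(k):
        s += Mi @ (x[k - i] - L @ x[k - 1 - i])
        Mi = Mi @ (-M)
    return s

L_est = lam_hat(n - 1)                                # Lam-hat_{n-1}
x_pred = L_est @ x[n - 1] + M @ xi_hat(n - 1, L_est)  # predictor (2) for x(n)
print(np.linalg.norm(L_est - Lam))                    # small for large n
print(np.linalg.norm(x[n] - x_pred))                  # one-step prediction error
```

Note why the lag-2 Yule-Walker ratio works here: $E\,x(i)x'(i-2) = \Lambda\,E\,x(i-1)x'(i-2)$, since both $\xi(i)$ and $\xi(i-1)$ are uncorrelated with $x(i-2)$, so $\Phi_k G_k^{-1}$ is consistent for $\Lambda$ whenever the limit of $G_k$ is nondegenerate.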
2.1. Known $\sigma^2$ case

If the noise variance $\sigma^2$ is known, then instead of $\hat\Lambda_k$ in (2) we use the projection of the estimators (5) onto a closed ball $B \subset \mathbb{R}^{p\times p}$ such that $\Lambda_0 \subset B$:

$$\hat\Lambda^*_k = \mathrm{pr}_B\,\hat\Lambda_k, \quad k \ge 2,$$

ensuring

$$\|\hat\Lambda^*_k - \Lambda\| \le d_B, \eqno(8)$$

where $d_B$ is the diameter of $B$. Given that $\sigma^2$ is known, property (8) allows one to weaken the moment conditions on the noise compared to the more general case of unknown $\sigma^2$ (see Section 2.2 below). Rewrite the formulae accordingly:

$$\hat\xi^*(k) = \sum_{i=0}^{k-1}(-M)^i\bigl(x(k-i) - \hat\Lambda^*_k x(k-1-i)\bigr), \qquad \hat x^*(k) = \hat\Lambda^*_{k-1}x(k-1) + M\hat\xi^*(k-1),$$

$$e^*(k) = x(k) - \hat x^*(k) = \xi(k) + (-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda^*_{k-1} - \Lambda)x(k-1-i),$$

$$\bar e_*^2(n) = \frac{1}{n}\sum_{k=1}^{n}\|e^*(k)\|^2, \qquad L_n = \frac{A}{n}\bar e_*^2(n) + n, \qquad R_n = E_\theta L_n = \frac{A}{n}E_\theta\bar e_*^2(n) + n.$$

To minimize the risk $R_n$ we rewrite it in the form

$$R_n = \frac{A}{n}(\sigma^2 + D_n) + n, \eqno(9)$$

where

$$D_n = \frac{1}{n}\sum_{k=1}^{n}E_\theta\|\hat x^*(k) - x_{opt}(k)\|^2 = \frac{1}{n}\sum_{k=1}^{n}E_\theta\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda^*_{k-1} - \Lambda)x(k-1-i)\Bigr\|^2.$$

We shall use the properties of the estimators $\hat\Lambda_k$ given in Lemma 1 below. Define

$$k_0 = \max\{p, [e^{|\Delta|^{-2}}]_1\},$$

where $[a]_1$ denotes the integer part of $a$ and $\Delta = \lim_{k\to\infty}\Delta_k$, $P_\theta$-a.s.

Now we establish conditions on the system parameters under which $\Delta \ne 0$. It can be shown, similarly to, e.g., [13], that due to the ergodicity of the process $(x(k))_{k\ge 0}$,

$$G_k = \frac{1}{k-1}\sum_{i=2}^{k}x(i-1)x'(i-2) \xrightarrow[k\to\infty]{} G, \quad P_\theta\text{-a.s.},$$

where

$$G = \Lambda F + M\Sigma, \qquad F = \sum_{i\ge 0}\Lambda^i S\Lambda'^i, \qquad S = \Lambda\Sigma M' + M\Sigma\Lambda' + \Sigma + M\Sigma M'. \eqno(10)$$

The condition for $\Delta \ne 0$ is thus the nondegeneracy of $G$. For example, in the scalar case $p = 1$ we have

$$G = \frac{(\Lambda + M)(1 + \Lambda M)}{1 - \Lambda^2}\,\sigma^2,$$

which is the first-order autocovariance; the condition becomes $\Lambda + M \ne 0$, since the stability of the process implies $1 + \Lambda M \ne 0$.

From here on, $C$ denotes non-negative constants whose exact values are not critical.

Lemma 1. Assume model (1) and let, for some integer $m > 1$, the moment condition $E\|\xi(1)\|^{4pm} < \infty$ hold. …

… $W_n \to 0$, $P_\theta$-a.s. as $n \to \infty$. Similar arguments are used to show $v_n \to 0$, $P_\theta$-a.s.
The relation (28) follows from these facts, the representation (27), and the strong law of large numbers. From the definition (18) of $T_A$ it follows that $T_A \to \infty$ as $A \to \infty$ with $P_\theta$-probability one. Therefore, by (28),

$$\frac{T_A}{A^{1/2}\sigma} \xrightarrow[A\to\infty]{} 1, \quad P_\theta\text{-a.s.}$$

Denote

$$y_{A,n} = \frac{n}{A^{1/2}\sqrt{2\log A}}, \qquad m_n = \frac{1}{n}\sum_{k=1}^{n}\bigl(\|\xi(k)+M\xi(k-1)\|^2 - (\sigma^2 + \|M\|\sigma^2)\bigr).$$

By the definition of $T_A$ and (27) we have

$$E_\theta T_A \le n_A + \sum_{n\ge n_A}P_\theta\Bigl(n^2A^{-1} < \frac{1}{n}\sum_{k=1}^{n}\|\xi(k)+M\xi(k-1)\|^2 + W_n + v_n\Bigr) \le$$

$$\le n_A + \sum_{n\ge n_A}\Bigl\{P_\theta\bigl(n^2A^{-1} < \sigma^2 + 2y_{A,n}\bigr) + P_\theta\bigl(|v_n| > y_{A,n}/2\bigr) + P_\theta\bigl(W_n > y_{A,n}/2\bigr) + P_\theta\bigl(|m_n| > y_{A,n}\bigr)\Bigr\}. \eqno(29)$$

Note that

$$\sum_{n\ge n_A}P_\theta\bigl(n^2A^{-1} < \sigma^2 + 2y_{A,n}\bigr) \le A^{1/2}\sigma\Bigl(1 + \frac{C}{\log A}\Bigr) + 1.$$

The probability $P_\theta(W_n > y_{A,n}/2)$ is treated analogously to $P_\theta(|v_n| > y_{A,n}/2)$. As for the probability $P_\theta(|m_n| > y_{A,n})$, note that $m_n$ is a normalized sum of martingale differences; thus the Chebyshev inequality and the Burkholder inequality yield

$$P_\theta\bigl(|m_n| > y_{A,n}\bigr) \le C\,y_{A,n}^{-4}\,n^{-1} = 4CA^2\log^2 A\cdot n^{-5}.$$

Therefore, by the assumptions on $A$ and $n_A$,

$$A^{-1/2}\sum_{n\ge n_A}P_\theta\bigl(|m_n| > y_{A,n}\bigr) \le CA^{3/2}\log^2 A\sum_{n\ge n_A}n^{-5} \le CA^{3/2}\log^2 A\cdot n_A^{-4} \le CA^{-1/2}\log^{-6}A \xrightarrow[A\to\infty]{} 0.$$

Then from (29) it follows that

$$\varlimsup_{A\to\infty}\frac{E_\theta T_A}{A^{1/2}\sigma} \le 1.$$

The same arguments can be used to show

$$\varliminf_{A\to\infty}\frac{E_\theta T_A}{A^{1/2}\sigma} \ge 1, \eqno(30)$$

and thus, in view of (30), the assertion (24) holds.

Regarding (25), rewrite its left-hand side using (17) and (22):

$$\frac{R_{T_A}}{R_{n_0}} = \frac{A\,E_\theta\frac{1}{T_A}\bar e^2(T_A) + E_\theta T_A}{2A^{1/2}\sigma + O(A^{1/4}\log^{1/2}A)}. \eqno(31)$$

From (24) and (31) it follows that, to prove (25), it suffices to show the convergence

$$\frac{A^{1/2}}{\sigma}\,E_\theta\frac{1}{T_A}\bar e^2(T_A) \xrightarrow[A\to\infty]{} 1. \eqno(32)$$

Define $N' = [(\sigma - \varepsilon)A^{1/2}]_1$ and $N'' = [(\sigma + \varepsilon)A^{1/2}]_1 + 1$ for some $0 < \varepsilon < \sigma$. We first show that

$$P_\theta(T_A < N') \le CA^{-r}, \qquad P_\theta(T_A > N'') \le CA^{-1}. \eqno(33)$$

By the definition of $T_A$,

$$P_\theta(T_A < N') \le \sum_{n\ge n_A}P_\theta\bigl(|m_n| > \delta_\varepsilon/2\bigr) + \sum_{n\ge n_A}P_\theta\bigl(|v_n| > \delta_\varepsilon/2\bigr), \eqno(34)$$

where $\delta_\varepsilon = \sigma^2 - (\sigma - \varepsilon)^2$. Consider the first summand. By the Chebyshev inequality, (26), and the Burkholder inequality,

$$\sum_{n\ge n_A}P_\theta\bigl(|m_n| > \delta_\varepsilon/2\bigr) \le C\sum_{n\ge n_A}n^{-2} \le CA^{-r}\log^{-2}A. \eqno(35)$$

The second summand is bounded analogously to the treatment of (29), using the bound $\sup_n E_\theta\bigl(n\log^{-1}n\,|v_n|\bigr)^2 < \infty$:
$$\sum_{n\ge n_A}P_\theta\bigl(|v_n| > \delta_\varepsilon/2\bigr) \le C\sum_{n\ge n_A}n^{-2}\log^2 n \le C\,n_A^{-1}\log^2 A \le CA^{-r}. \eqno(36)$$

The first property of (33) follows from (34)-(36). Let us prove the second property of (33). Denote $\delta^2 = (\sigma + \varepsilon)^2 - \sigma^2$. Then, by the definition (18) of $T_A$ and (27),

$$P_\theta(T_A > N'') \le P_\theta\Bigl(\frac{1}{N''}\sum_{k=1}^{N''}\|\xi(k)+M\xi(k-1)\|^2 + W_{N''} + v_{N''} > A^{-1}(N'')^2\Bigr) \le$$

$$\le P_\theta\Bigl(\frac{1}{N''}\sum_{k=1}^{N''}\|\xi(k)+M\xi(k-1)\|^2 + |W_{N''} + v_{N''}| > (\sigma+\varepsilon)^2\Bigr) \le P_\theta\bigl(|m_{N''}| > \delta^2/2\bigr) + P_\theta\bigl(|W_{N''} + v_{N''}| > \delta^2/2\bigr).$$

By the Chebyshev and Burkholder inequalities,

$$P_\theta\bigl(|m_{N''}| > \delta^2/2\bigr) \le C(N'')^{-2} = O(A^{-1}), \qquad P_\theta\bigl(|W_{N''} + v_{N''}| > \delta^2/2\bigr) \le C(N'')^{-2} = O(A^{-1}).$$

Thus the second assertion in (33) holds true. To prove (32) we show that

$$A^{1/2}E_\theta\frac{1}{T_A}\bar e^2(T_A)\chi(T_A < N') \xrightarrow[A\to\infty]{} 0, \qquad A^{1/2}E_\theta\frac{1}{T_A}\bar e^2(T_A)\chi(T_A > N'') \xrightarrow[A\to\infty]{} 0, \eqno(37)$$

$$\frac{A^{1/2}}{\sigma}E_\theta\frac{1}{T_A}\bar e^2(T_A)\chi(N' \le T_A \le N'') \xrightarrow[A\to\infty]{} 1. \eqno(38)$$

Let us prove the first assertion in (37). By the definition of $\bar e^2(k)$ we get

$$A^{1/2}E_\theta\frac{1}{T_A}\bar e^2(T_A)\chi(T_A < N') = A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^2\chi(T_A < N') +$$

$$+\,2A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\xi'(k)\Bigl((-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr)\chi(T_A < N') + A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\|\xi(k)\|^2\chi(T_A < N'). \eqno(39)$$

Consider the first summand. By the Cauchy-Schwarz-Bunyakovsky inequality and the definition of $T_A$ we have

$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^2\chi(T_A < N') \le$$

$$\le A^{1/2}P_\theta^{1/2}(T_A < N')\,\frac{1}{n_A^2}\sum_{k=1}^{N'}\Bigl(E_\theta\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^4\Bigr)^{1/2}.$$

Examine the expression under the square root. The most significant summand is treated using the Cauchy-Schwarz-Bunyakovsky inequality:

$$E_\theta\Bigl\|\sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^4 \le \Bigl(E_\theta\|\hat\Lambda_{k-1}-\Lambda\|^8\Bigr)^{1/2}\Bigl(E_\theta\Bigl\|\sum_{i=0}^{k-1}(-M)^i x(k-1-i)\Bigr\|^8\Bigr)^{1/2}.$$

It can be easily shown, employing the Hölder inequality, that

$$E_\theta\Bigl\|\sum_{i=0}^{k-1}(-M)^i x(k-1-i)\Bigr\|^8 \le E_\theta\Bigl(\sum_{i=0}^{k-1}\|M^i\|\,\|x(k-1-i)\|\Bigr)^8 \le C\Bigl(\sum_{i\ge 0}\|M^{i/2}\|\Bigr)^8 \le C,$$

and hence, by the assumptions on $n_A$ and $r$, the properties (33) and Lemma 1, we have
$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^2\chi(T_A < N') \le CA^{1/2}P_\theta^{1/2}(T_A < N') \le CA^{-1/4}\log^{-2}A \xrightarrow[A\to\infty]{} 0.$$

Consider the second summand of (39). Doob's maximal inequality for martingales (see, e.g., [14]) and the Cauchy-Schwarz-Bunyakovsky inequality yield

$$A^{1/2}E_\theta\frac{1}{T_A^2}\Bigl|\sum_{k=1}^{T_A}\xi'(k)\Bigl((-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr)\Bigr|\chi(T_A < N') \le$$

$$\le \frac{CA^{1/2}}{n_A^2}\,E_\theta\max_{n\le N'}\Bigl|\sum_{k=1}^{n}\xi'(k)\Bigl((-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr)\Bigr| \le \frac{C}{\log A} \xrightarrow[A\to\infty]{} 0.$$

Consider the last summand of (39). We have

$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\|\xi(k)\|^2\chi(T_A < N') \xrightarrow[A\to\infty]{} 0.$$

The same reasoning applies to the second assertion of (37), with $\chi(T_A < N')$ replaced by $\chi(T_A > N'')$, to get

$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^2\chi(T_A > N'') \le CA^{-3} \xrightarrow[A\to\infty]{} 0,$$

$$A^{1/2}E_\theta\frac{1}{T_A^2}\Bigl|\sum_{k=1}^{T_A}\xi'(k)\Bigl((-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr)\Bigr|\chi(T_A > N'') \le CA^{-3}\log A \xrightarrow[A\to\infty]{} 0,$$

$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\|\xi(k)\|^2\chi(T_A > N'') \le CA^{-1} \xrightarrow[A\to\infty]{} 0,$$

and to (38) with $\chi(T_A < N')$ replaced by $\chi(N' \le T_A \le N'')$ to get

$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\Bigl\|(-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr\|^2\chi(N' \le T_A \le N'') \le CA^{-r}\log^2 A \xrightarrow[A\to\infty]{} 0,$$

$$A^{1/2}E_\theta\frac{1}{T_A^2}\sum_{k=1}^{T_A}\xi'(k)\Bigl((-M)^k\xi(0) - \sum_{i=0}^{k-1}(-M)^i(\hat\Lambda_{k-1}-\Lambda)x(k-1-i)\Bigr)\chi(N' \le \ldots$$
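The optimization that drives both cases can be checked numerically. A minimal sketch (all numbers are illustrative, and the plug-in stopping rule below is a simplified stand-in for the exact definition (18), which this excerpt does not reproduce) minimizes the dominant part of the risk (9), $r(n) = (A/n)\sigma^2 + n$, and mimics the unknown-variance case with a data-driven stopping time $T_A$ satisfying $T_A/(A^{1/2}\sigma) \to 1$:

```python
import numpy as np

# Numerical sanity check of the asymptotics. With sigma^2 known, the dominant
# part of the risk (9) is r(n) = (A/n) sigma^2 + n, minimized near
# n* = sigma * sqrt(A) with r(n*) ≈ 2 sigma sqrt(A). With sigma^2 unknown,
# a stopping rule in the spirit of (18) (simplified here: stop once
# n^2 >= A * running-variance-estimate) replaces n* by a data-driven T_A.
rng = np.random.default_rng(2)
A = 1.0e6           # cost of prediction error
sigma = 1.3         # true noise scale (illustrative)

# Known-variance case: minimize r(n) over n directly.
n = np.arange(1, 10_000)
r = A * sigma**2 / n + n
n_star = int(n[np.argmin(r)])
print(n_star)       # → 1300, i.e. sigma * sqrt(A)
print(r.min())      # ≈ 2600, i.e. 2 * sigma * sqrt(A)

# Unknown-variance case: stop at the first n with n^2 >= A * sigma_hat_n^2.
xi = rng.normal(scale=sigma, size=20_000)
csum = np.cumsum(xi**2)
T_A = next(k for k in range(2, xi.size) if k * k >= A * csum[k - 1] / k)
print(T_A / (sigma * np.sqrt(A)))   # close to 1, as in the limit result for T_A
```

The deterministic optimum and the stopping time land on the same $A^{1/2}\sigma$ scale, which is exactly the normalization appearing in the ratio of risks above.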
Keywords
adaptive predictors,
asymptotic risk efficiency,
multivariate ARMA,
optimal sample size,
stopping time,
truncated parameter estimators

Authors
Kusainov Marat Islambekovich | Tomsk State University | postgraduate student, Faculty of Applied Mathematics and Cybernetics | rjrltsk@gmail.com |
References
Ljung L. System Identification: Theory for the User. Upper Saddle River : Prentice Hall, 1987.
Ljung L., Soderstrom T. Theory and Practice of Recursive Identification. Cambridge : The MIT Press, 1983.
Konev V., Pergamenshchikov S. On the Duration of Sequential Estimation of Parameters of Stochastic Processes in Discrete Time // Stochastics. 1986. V. 18. Is. 2. P. 133-154.
Konev V., Pergamenshchikov S. Truncated Sequential Estimation of the Parameters in Random Regression // Sequential Analysis. 1990. V. 9. Is. 1. P. 19-41.
Vasiliev V.A. Truncated Estimation Method with Guaranteed Accuracy // Annals of the Institute of Statistical Mathematics. 2014. V. 66. P. 141-163.
Vasiliev V. Guaranteed Estimation of Logarithmic Density Derivative by Dependent Observations // Topics in Nonparametric Statistics: Proceedings of the First Conference of the International Society for Nonparametric Statistics / eds. by M.G. Akritas et al. New York : Springer, 2014.
Sriram T. Sequential Estimation of the Autoregressive Parameter in a First Order Autoregressive Process // Sequential Analysis. 1988. V. 7. Is. 1. P. 53-74.
Konev V., Lai T. Estimators with Prescribed Precision in Stochastic Regression Models // Sequential Analysis. 1995. V. 14. Is. 3. P. 179-192.
Vasiliev V., Kusainov M. Asymptotic Risk-Efficiency of One-Step Predictors of a Stable AR(1) // Proceedings of XII All-Russian Conference on Control Problems. Moscow, 2014.
Kusainov M., Vasiliev V. On Optimal Adaptive Prediction of Multivariate Autoregression // Sequential Analysis. 2015. V. 34. Is. 2. (to be published).
Sriram T., Iaci R. Sequential Estimation for Time Series Models // Sequential Analysis. 2014. V. 33. Is. 2. P. 136-157.
Box G., Jenkins G., Reinsel G. Time Series Analysis: Forecasting and Control. Hoboken : Wiley, 2008.
Pergamenshchikov S. Asymptotic Properties of the Sequential Plan for Estimating the Parameter of an Autoregression of the First Order // Theory of Probability and Its Applications. 1991. V. 36. Is. 1. P. 42-53.
Liptser R., Shiryaev A. Statistics of Random Processes. New York : Springer, 1977.
Gikhman I., Skorokhod A. Introduction to the Theory of Random Processes. Philadelphia : Saunders, 1969.