Abstract
In this paper we consider the large deviation problem for the method of empirical means in stochastic optimization with continuous time observations. For discrete time models this problem was studied in Knopov and Kasitskaya (Cybern Syst Anal 4:52–61, 2004; Cybern Syst Anal 5:40–45, 2010).
1 The Approach
Consider the following stochastic minimization problem: minimize the function
$$\displaystyle \begin{aligned} F\left(x\right)=\mathbb{E}\,f\left(x,\xi_0\right),\quad x\in X, \end{aligned} \qquad (1)$$
where X is a non-empty compact subset of \({\mathbb R}\), \(\left \{\xi _t,\, t\in {\mathbb R}\right \}\) is a stationary in the narrow sense stochastic process with continuous trajectories, defined on a complete probability space \(\left (\varOmega ,\mathcal {G},P\right )\) with values in some metric space \(\left (Y,\rho \right )\), and \(f:X\times Y\to {\mathbb R}\) is a continuous function.
Approximate the above problem by the following one: minimize the function
$$\displaystyle \begin{aligned} F_T\left(x\right)=\frac{1}{T}\int_{0}^{T}f\left(x,\xi_t\right)dt,\quad x\in X, \end{aligned}$$
where \(\left \{\xi _t,\,0\le t\le T\right \}\) are the observations of the process \(\xi_t\), T > 0.
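As a quick numerical illustration of the method of empirical means (my sketch, not from the paper), the code below minimizes such a time-average objective over a discretized trajectory. The loss f(x, y) = (x − y)² and the Ornstein-Uhlenbeck dynamics are hypothetical choices, made only so that the true minimizer x₀ = 𝔼ξ₀ = 0 is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

def trajectory(T, dt=0.01, theta=1.0):
    """Euler scheme for a (discretized) stationary Ornstein-Uhlenbeck process xi_t."""
    n = int(T / dt)
    xi = np.empty(n)
    xi[0] = rng.normal(scale=np.sqrt(1.0 / (2.0 * theta)))  # stationary law
    for k in range(1, n):
        xi[k] = xi[k - 1] - theta * xi[k - 1] * dt + np.sqrt(dt) * rng.normal()
    return xi

def F_T(x, xi):
    """Riemann-sum approximation of (1/T) * int_0^T f(x, xi_t) dt
    for the hypothetical loss f(x, y) = (x - y)**2."""
    return np.mean((x - xi) ** 2)

xi = trajectory(T=200.0)
grid = np.linspace(-1.0, 1.0, 201)            # compact set X = [-1, 1]
x_T = grid[int(np.argmin([F_T(x, xi) for x in grid]))]
# x_T should lie near the true minimizer x_0 = 0 when T is large
```

For growing T the empirical minimizer x_T drifts toward x₀, which is exactly the consistency statement formalized below.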
Clearly, the minimum point \(x_{T} =x_{T} \left (\omega \right )\) exists and is a measurable function.
Suppose that
Then the function (1) is continuous and has at least one minimum point \(x_0\). Let us assume that this point is unique.
Theorem 1 ([1])
Let \(\left \{\xi _t,\; t\in {\mathbb R}\right \}\) be a stationary in the narrow sense random process, defined on a probability space \(\left (\varOmega ,\mathcal {G},P\right )\), and assume that the function F(x) has a unique minimum point \(x_0 \in X\).
Then for all T > 0 and ω ∈ Ω′, \(P\left (\varOmega '\right )=1\), there exists at least one point \(x_T \in X\) at which the minimal value of the function \(F_T(x)\) is attained.
Moreover, for each T > 0 the function x T is \(\mathcal {G}^{\prime }_{T} \) -measurable, where \(\mathcal {G}^{\prime }_{T} =\mathcal {G}_{T} \bigcap \varOmega '\) , \(\mathcal {G}_{T} =\sigma \left \{\xi _t,\; 0\le t\le T\right \}\).
Then
Now we study the probabilities of large deviations of \(x_T\) and of the minimal value \(F_T\) from \(x_{0}\) and \(F\left (x_{0} \right )\), respectively.
For any y the function \(f\left (\circ ,y\right )\) belongs to the space of continuous functions \(C\left (X\right )\). Suppose that there exists a compact convex set \(K\subset C\left (X\right )\) such that
Then \(F_{T} \left (\circ \right )-F\left (\circ \right )\in K\).
In what follows \(F_T - F\) is considered as a random element on \(\left (\varOmega ,\mathcal {G},P\right )\) with values in the set K.
We use some well-known results from functional analysis.
Definition 1 ([2])
Let \(\left (V,\left \| \circ \right \| \right )\) be a normed linear space, and let \(B\left (x,r\right )\) be the closed ball of radius r centered at x. Let \(f:V\to \left (-\infty ,+\infty \right ]\) be some function, and let \(x_f\) be its minimum point on V. An improving function ψ for f at the point \(x_f\) is a monotone non-decreasing function \(\psi :\left [0,+\infty \right )\to \left [0,+\infty \right )\) with \(\psi \left (0\right )=0\), such that there exists r > 0 for which, for any \(x\in B\left (x_{f} ,r\right )\), we have
$$\displaystyle \begin{aligned} f\left(x\right)-f\left(x_{f}\right)\geq \psi\left(\left\| x-x_{f}\right\| \right). \end{aligned}$$
Let V 0 ⊂ V . Define
Theorem 2 ([2])
Let \(\left (V,\left \| \circ \right \| \right )\) be a normed linear space, let \(V_0 \subset V\) be closed, and let \(f_{0} ,g_{0} :V\to {\mathbb R}\) be functions continuous on V. Suppose that
Let \(f,g:V\to \left (-\infty ,+\infty \right ]:\)
Then
Let \(x_f\) be the minimum point of f on V, and let ψ be an improving function for f at the point \(x_f\) with constant r. If ε is small enough that
then for any \(x_{g} \in \arg \min \left \{g\left (x\right ),\,x\in B\left (x_{f} ,r\right )\right \}\) we have \(\psi \left (\left \| x_{f} -x_{g} \right \| \right )\le 2\varepsilon .\) For a function ψ that is convex and strictly increasing on \(\left [0,r\right ]\),
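A concrete hedged example (my illustration, not taken from [2]): if f grows quadratically near \(x_f\), the improving function can be taken quadratic, and the bound \(\psi \left (\left \| x_{f} -x_{g} \right \| \right )\le 2\varepsilon\) inverts to an explicit rate.

```latex
% Hypothetical example: suppose f satisfies the quadratic growth condition
%   f(x) \ge f(x_f) + c\,\|x - x_f\|^2 \quad \text{for all } x \in B(x_f, r),
% with some c > 0.  Then \psi(t) = c t^2 is monotone non-decreasing with
% \psi(0) = 0, hence an improving function; it is convex and strictly
% increasing on [0, r], so \psi(\|x_f - x_g\|) \le 2\varepsilon yields
\[
  \left\| x_f - x_g \right\| \;\le\; \sqrt{\frac{2\varepsilon}{c}}.
\]
```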
We need some statements from the large deviations theory.
Theorem 3 ([3, p. 53])
Let \(\mu _{\varepsilon }\), ε > 0, be a family of probability measures on a compact subset H of a separable Banach space E. Suppose that there exists
for any λ ∈ E ∗ —the dual space of E, where
for any probability measure μ on E, and \(\left \langle \lambda ,x\right \rangle \) is the duality relation. Define
Then Λ ∗ is non-negative, convex, lower semi-continuous, and for any compact set A ⊂ H
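To make the Legendre transform \(\varLambda^*\) in Theorem 3 concrete, here is a small numerical sketch in a toy scalar setting (my example, not the paper's): for the Gaussian log-moment generating function Λ(λ) = λ²σ²/2, the rate function Λ*(x) = sup_λ(λx − Λ(λ)) equals x²/(2σ²), and a grid maximization reproduces this closed form.

```python
import numpy as np

# Toy scalar case of the Legendre transform in Theorem 3 (hypothetical
# setting): Lambda(lam) = lam**2 * sigma**2 / 2, whose transform
# Lambda*(x) = sup_lam (lam * x - Lambda(lam)) is x**2 / (2 * sigma**2).
sigma = 1.5
lams = np.linspace(-50.0, 50.0, 200001)  # grid over which the sup is taken

def rate(x):
    """Approximate Lambda*(x) by maximizing lam*x - Lambda(lam) on the grid."""
    return np.max(lams * x - lams ** 2 * sigma ** 2 / 2)

closed_form = lambda x: x ** 2 / (2 * sigma ** 2)
errors = [abs(rate(x) - closed_form(x)) for x in (-2.0, 0.0, 1.0, 3.0)]
```

The computed rate function is non-negative and convex, as Theorem 3 asserts in general.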
Definition 2 ([3])
Let Σ be a separable Banach space, and let \(\left \{\xi _t,t\in {\mathbb R}\right \}\) be a stationary in the narrow sense stochastic process on \(\left (\varOmega ,\mathcal {G},P\right )\) with values in Σ. Denote \(B_{t_{1} t_{2} } =\sigma \left \{\xi _t,t_{1} \le t\le t_{2} \right \}.\) For τ > 0 the random variables \(\eta_1, \ldots, \eta_p\), p ≥ 2, are called τ-measurably separated if
where η j is \(B_{t_{j} s_{j} } \)–measurable.
Definition 3 ([3])
A stochastic process \(\left \{\xi _t\right \}\) from Definition 2 is said to satisfy Hypothesis (H-1) of hypermixing, if there exist \(\tau _{0} \in {\mathbb N}\cup \left \{0\right \}\) and a non-increasing \(\alpha :\left \{\tau >\tau _{0} \right \}\to \left [1,+\infty \right )\), such that
for any p ≥ 2, τ > τ 0, η 1, …, η p τ-measurably separated,
Let X be a compact subset of \({\mathbb R}\). It is known (cf. [4]) that \(\left (C\left (X\right )\right )^*=M\left (X\right )\), where M(X) is the space of signed measures on X, and for any \(g\in C\left (X\right )\), \(Q\in M \left (X\right )\)
$$\displaystyle \begin{aligned} \left\langle Q,g\right\rangle =\int_{X}g\left(x\right)Q\left(dx\right). \end{aligned}$$
We need the following auxiliary statement.
Theorem 4
Let \(\left \{\xi _t,t\in {\mathbb R}\right \}\) be a stationary in the narrow sense ergodic stochastic process with continuous trajectories, which satisfies the hyper-mixing hypothesis (H-1) on \(\left (\varOmega ,\mathcal {G},P\right ),\) with values in a compact convex set \(K\subset C\left (X\right )\), i.e., \(\xi _t(\cdot) \in K\), and \(\mathcal {G}_t\)-measurable. Then for any measure \(Q\in M \left (X\right )\) there exists
and for any closed A ⊂ K
where \(\varLambda ^*\left (g\right )=\sup \left \{\int _{X}g\left (x\right )Q\left (dx\right ) -\varLambda \left (Q\right ),Q\in M \left (X\right )\right \}\) is a non-negative convex lower semi-continuous function.
Proof
Fix \(Q\in M \left (X\right ).\) Let τ 0 be a constant from the hyper-mixing condition, τ > τ 0, S > τ, S < T. Then
Define
Denote (cf. also [4])
We have
By (2),
Therefore, for all ω
It follows from (4) that for any ω
For any ω denote
Further,
We have
Inequality (8) follows from the hyper-mixing hypothesis (H-1). Further, due to the stationarity of ξ t ,
for \(j=\overline {0,N_{T} -1}\). From (8), (9) we have
By (3),
By (10),
Then
Letting S →∞, we derive
Passing to the limit as τ →∞, we get
Therefore,
We use Theorem 3. We have
Further, \(\mu _{\varepsilon } = \mu _{1/T}\) is the probability measure on K defined by the distribution of \(\frac {1}{T} \int _{0}^{T}\xi _t\,dt\). We get
By (11), the proof follows from Theorem 3.
Let us come back to problem (1).
Theorem 5
Suppose that the process ξ t satisfies the hyper-mixing hypothesis (H-1). Then for any ε > 0
where \(I\left (z\right )=\varLambda ^*\left (z\right )=\sup \left \{\int _{X}z\left (x\right )Q\left (dx\right ) -\varLambda \left (Q\right ),\,\,Q\in M \left (X\right )\right \}\) is a non-negative lower semi-continuous convex function,
Proof
Note that \(A_{\varepsilon }\) is a closed subset of K. The process
taking values in K, is a measurable function of ξ t and hence, satisfies the conditions of Theorem 4. Therefore, the statement of the theorem follows from Theorem 4. Theorem is proved.
Theorem 6
Suppose that the conditions of Theorem 5 are satisfied. Then
where \(I\left (\circ \right ),A_{\varepsilon } \) is defined in Theorem 5.
Suppose that there exists an improving function ψ for F at the point x 0 with some constant r. Let x T be the minimum point of F T on the set \(B\left (x_{0} ,r\right ).\) If ε is small enough, such that the condition
is satisfied, then
Proof
By Theorem 2, for all ω
Then by Theorem 5 we derive (12).
Further, by Theorem 2, for any ω
and by Theorem 5 we get (13). Theorem is proved.
Remark 1
If, in addition to the conditions of Theorem 6, the function ψ is convex and strictly increasing on \(\left [0,r\right ],\) then we get
Indeed, by Theorem 2, for all ω
Then
2 Non-stationary Version
In this part we consider the non-stationary version of the method of empirical means with continuous time observations.
Let {ξ(t), t ∈ [0, T]} be a strictly stationary random process defined on a probability space \((\varOmega , \mathcal {F}, \mathbb {P})\) with values in some metric space (Y, ρ), let \(X=[a,b]\subset \mathbb {R}\), and let the function \(h(t,x,y): (0,\infty )\times X\times Y\to \mathbb {R}\) be convex with respect to the second variable and measurable with respect to the third one.
Consider the following problem:
Assume that the following conditions are satisfied.
1. \(\mathbb {E} |h(t,x,\xi (0))|<\infty \) for all t > 0, x ∈ X;
2. for all x ∈ X there exists
$$\displaystyle \begin{aligned} F(x)= \lim_{T\to\infty} F_T(x); \end{aligned}$$
3. there exists \(\bar x\in X\), c > 0 such that
$$\displaystyle \begin{aligned} F(x) \geq F(\bar x) + c|x-\bar x| \quad \text{for all} \ x\in X. \end{aligned}$$
From condition 3 it follows that there exists a unique solution to the minimization problem
and this solution is achieved at some point \(\bar x\). Moreover, for any T and ω the function \(F_T(x) = F_T(x, \omega)\) is convex, and for any T the function \(\mathbb {E} F_T(x)\) is convex.
For any function \(g: X\to \mathbb {R}\) define
Put \(g_T(x)=\mathbb {E} F_T(x)\), x ∈ X. Since by convexity of h(t, x, y) the limits in (16) and (17) exist, the following limits exist as well:
- for all t, y, for the function h(t, x, y);
- for each t, for the function \(\mathbb {E} h(t,\cdot , \xi (t))\);
- for any t, ω, for the function \(F_T(\cdot)\);
- for each t, for \(g_T(\cdot)\).
The following lemma holds true.
Lemma 1
Suppose that there exists a function \(u: X\times \varOmega \to \mathbb {R}\), convex with respect to the first argument and measurable with respect to the second one. Assume that \(\mathbb {E} |u(x,\omega )|<\infty \) for any x ∈ X. Denote \(v(x)= \mathbb {E} u(x,\omega )\). Then
Proof
We have
Since u is convex with respect to x for all ω,
the fractions on the right-hand sides of (18) and (19) decrease monotonically as Δ → +0. Then by the monotone convergence theorem
The same argument applies to \(v^{\prime }_-\). Lemma is proved.
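The monotonicity used in this proof can also be seen numerically: for a convex u, the difference quotients (u(x + Δ) − u(x))/Δ are non-decreasing in Δ, so they decrease monotonically as Δ → +0, which is what justifies the monotone convergence step. A minimal sketch with the hypothetical convex function u(x) = |x| + x², which has a kink at 0 with \(u^{\prime}_+(0) = 1\):

```python
# For convex u, the one-sided difference quotients (u(x + d) - u(x)) / d
# are non-decreasing in d, hence decrease monotonically to u'_+(x) as
# d -> +0.  Here u(x) = |x| + x**2 is a hypothetical convex example.
def u(x):
    return abs(x) + x * x

x0 = 0.0
deltas = [1.0, 0.5, 0.1, 0.01, 0.001]
quotients = [(u(x0 + d) - u(x0)) / d for d in deltas]
# quotients ~ [2.0, 1.5, 1.1, 1.01, 1.001], decreasing toward u'_+(0) = 1
```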
By Lemma 1 we have that
and for any small t ∈ [0, T], x ∈ X
Lemma 2
Suppose that conditions 1–3 and the statements a)–c) below hold true:
a) \(h^{\prime }_+(t,\bar x, \xi (t))- \mathbb {E} h^{\prime }_+(t,\bar x, \xi (t))\) and \(h^{\prime }_-(t,\bar x, \xi (t))- \mathbb {E} h^{\prime }_-(t,\bar x, \xi (t))\), t ∈ [0, T], satisfy the strong mixing condition with the mixing coefficient [1]
$$\displaystyle \begin{aligned} \alpha(\tau)\leq \frac{c_0}{ 1+ \tau^{1+\epsilon}}, \quad \epsilon>0,\quad \tau>0; \end{aligned}$$
b) there exists δ > 2∕ε such that for any t > 0
$$\displaystyle \begin{aligned} \mathbb{E} |h^{\prime}_+(t,\bar x, \xi(0))|{}^{2+\delta} <\infty, \quad \mathbb{E} |h^{\prime}_-(t,\bar x, \xi(0))|{}^{2+\delta} <\infty; \end{aligned}$$
c)
$$\displaystyle \begin{aligned} g^{\prime}_{T+} (\bar x) \to F^{\prime}_+(\bar x), \quad g^{\prime}_{T-}(\bar x) \to F^{\prime}_- (\bar x), \quad T\to \infty. \end{aligned}$$
Then
The proof is analogous to that of Lemma 2 in [5] for the discrete time setting.
Theorem 7
Suppose that conditions of Lemma 2 hold true. Then with probability 1 there exists T ∗ = T ∗(ω) such that for any T > T ∗ problem (15) has the unique solution x T and \(x_T= \bar x\).
Proof
By assumption 2,
By Lemma 2,
with probability 1, starting from some T ∗. Since the function F T (x) is convex, \(\bar x\) is the unique minimum point of the function F T (x). Theorem is proved.
Now we turn to the large deviation problem for (15).
Theorem 8
Suppose that condition 2 and the assumptions below hold true:
a) the family {ξ(t), t ∈ [0, T]} satisfies the conditions of hypothesis (H-1);
b) there exists L > 0 such that for all t ∈ [0, T] and y ∈ Y
$$\displaystyle \begin{aligned} |h^{\prime}_+(t,\bar x, y)|\leq L, \quad |h^{\prime}_-(t,\bar x, y)|\leq L. \end{aligned}$$
Then
where
The proof follows the same lines as that of Theorem 3 in [6], using Theorem 4.
Remark 2
Note that the statements of Theorems 5 and 6 also hold true for the non-stationary observation model, with analogous proofs.
References
Knopov, P.S., Kasitskaya, E.J.: Empirical Estimates in Stochastic Optimization and Identification. Kluwer, Dordrecht (2005)
Kaniovski, Y.M., King, A.J., Wets, R.J.-B.: Probabilistic bounds (via large deviations) for the solutions of stochastic programming problems. Ann. Oper. Res. 56, 189–208 (1995)
Deuschel, J.D., Stroock, D.W.: Large Deviations. Academic Press, Boston (1989)
Dunford, N., Schwartz, J.: Linear Operators. Part I: General Theory. Interscience, New York (1957)
Knopov, P.S., Kasitskaya, E.J.: Large deviations of empirical estimates in stochastic programming with non-stationary observations. Cybern. Syst. Anal. 5, 40–45 (2010)
Knopov, P.S., Kasitskaya, E.J.: On large deviations of empirical estimates in stochastic programming. Cybern. Syst. Anal. 4, 52–61 (2004)
Knopov, P.S., Kasitskaya, E.J. (2017). Large Deviations for the Method of Empirical Means in Stochastic Optimization Problems with Continuous Time Observations. In: Butenko, S., Pardalos, P., Shylo, V. (eds.) Optimization Methods and Applications. Springer Optimization and Its Applications, vol. 130. Springer, Cham. https://doi.org/10.1007/978-3-319-68640-0_13