1 Introduction

One of the most elegant results in the theory of approximation is Korovkin’s theorem, which generalizes the well-known proof of Weierstrass’s classical approximation theorem given by Bernstein.

Theorem 1

(Korovkin [18, 19]) Let \((L_{n})_{n}\) be a sequence of positive linear operators that map C([0, 1]) into itself. Suppose that the sequence \((L_{n}(f))_{n}\) converges to f uniformly on [0, 1] for each of the test functions 1,  x, and \(x^{2}\). Then \((L_{n}(f))_{n}\) converges to f uniformly on [0, 1] for every \(f\in C([0,1])\).

Simple examples show that the assumption concerning the positivity of the operators \(L_{n}\) cannot be dropped. What about the assumption on their linearity?
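
The positivity requirement is illustrated by the classical Bernstein operators \(B_{n}(f)(x)=\sum _{k=0}^{n}f(k/n)\binom{n}{k}x^{k}(1-x)^{n-k}\), the positive linear operators behind Bernstein’s proof. The following numerical sketch (NumPy assumed; the helper name `bernstein` is ours) exhibits the uniform convergence promised by Theorem 1 for a continuous, non-smooth function:

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """Evaluate the Bernstein operator B_n(f) at the points x in [0, 1]."""
    k = np.arange(n + 1)
    coeff = np.array([comb(n, j) for j in k], dtype=float)
    # Bernstein basis: C(n, k) x^k (1 - x)^(n - k), one column per k
    basis = coeff * np.power.outer(x, k) * np.power.outer(1 - x, n - k)
    return basis @ f(k / n)

x = np.linspace(0.0, 1.0, 201)
f = lambda t: np.abs(t - 0.5)   # continuous on [0, 1], not differentiable at 1/2
errs = {n: float(np.max(np.abs(bernstein(f, n, x) - f(x))))
        for n in (10, 100, 1000)}
print(errs)                     # sup-norm errors shrink as n grows
```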

Over the years, many generalizations of Theorem 1 have appeared in a variety of settings, including important Banach function spaces. A nice account of the present state of the art is offered by the authoritative monograph of Altomare and Campiti [3] and the excellent survey of Altomare [2]. The literature on Korovkin-type theorems is huge, a Google search offering more than 26,000 results. However, except for Theorem 2.7 in the 1973 paper of Bauer [4], the extension of this theory beyond the framework of linear functional analysis has remained largely unexplored.

Inspired by the Choquet theory of integrability with respect to a nonadditive measure, we prove in this paper that the restriction to the class of positive linear operators can be relaxed by considering operators that verify a mix of conditions characteristic of Choquet’s integral.

As usual, for X, a Hausdorff topological space, we will denote by \(\mathcal {F}(X)\) the vector lattice of all real-valued functions defined on X endowed with the pointwise ordering. Two important vector sublattices of it are:

$$\begin{aligned} C(X)= & {} \left\{ f\in \mathcal {F}(X):f\text { continuous}\right\} \end{aligned}$$

and

$$\begin{aligned} C_{b}(X)= & {} \left\{ f\in \mathcal {F}(X):f\text { continuous and bounded}\right\} . \end{aligned}$$

With respect to the sup norm, \(C_{b}(X)\) becomes a Banach lattice. See [22] for the theory of these spaces.

Suppose that X and Y are two Hausdorff topological spaces and E and F are, respectively, vector sublattices of C(X) and C(Y). An operator \(T:E\rightarrow F\) is called:

  • sublinear if it is both subadditive, that is:

    $$\begin{aligned} T(f+g)\le T(f)+T(g)\quad \text {for all }\quad f,g\in E, \end{aligned}$$

    and positively homogeneous, that is:

    $$\begin{aligned} T(af)=aT(f)\quad \text {for all }\quad a\ge 0\,\text { and }\, f\in E; \end{aligned}$$
  • monotonic if \(f\le g\) in E implies \(T(f)\le T(g);\)

  • comonotonic additive if \(T(f+g)=T(f)+T(g)\) whenever the functions \(f,g\in E\) are comonotone in the sense that:

    $$\begin{aligned} (f(s)-f(t))\cdot (g(s)-g(t))\ge 0 \quad \text { for all } \quad s,t\in X. \end{aligned}$$
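
On sampled data, comonotonicity can be tested directly from the defining inequality over all pairs of sample points. A small sketch (NumPy assumed; the helper name `comonotone` is ours):

```python
import numpy as np

def comonotone(f_vals, g_vals):
    """Check (f(s)-f(t))*(g(s)-g(t)) >= 0 for all pairs of sample points."""
    df = np.subtract.outer(f_vals, f_vals)
    dg = np.subtract.outer(g_vals, g_vals)
    return bool(np.all(df * dg >= 0))

s = np.linspace(0.0, 1.0, 101)
print(comonotone(s, s ** 2))              # True: both nondecreasing
print(comonotone(s, 1 - s))               # False: opposite monotonicity
print(comonotone(s, np.full_like(s, 3.0)))  # True: constants are comonotone with everything
```

The last line illustrates a fact used repeatedly below: any function is comonotone with every constant function.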

Our main result extends Korovkin’s theorem to the framework of operators, acting on vector lattices of functions of several variables, that enjoy the properties of sublinearity, monotonicity, and comonotonic additivity. We use families of test functions including the canonical projections on \(\mathbb {R}^{N}\):

$$\begin{aligned} \mathop {\mathrm{pr}}\nolimits _{k}:(x_{1},\ldots ,x_{N})\mapsto x_{k},\quad k=1,\ldots ,N. \end{aligned}$$

Theorem 2

(The nonlinear extension of Korovkin’s theorem: the several variables case) Suppose that X is a locally compact subset of the Euclidean space \(\mathbb {R}^{N}\) and E is a vector sublattice of \(\mathcal {F}(X)\) that contains the test functions \(1,~\pm \mathrm{pr}_{1},\ldots ,~\pm \mathrm{pr}_{N}\) and \(\sum _{k=1}^{N}\mathrm{pr}_{k}^{2}\).

  1. (i)

    If \((T_{n})_{n}\) is a sequence of monotone and sublinear operators from E into E, such that:

    $$\begin{aligned} \lim _{n\rightarrow \infty }T_{n}(f)=f\quad \text { uniformly on the compact subsets of }X \end{aligned}$$
    (1.1)

    for each of the \(2N+2\) aforementioned test functions, then the property (1.1) also holds for all nonnegative functions f in \(E\cap C_{b}(X)\).

  2. (ii)

    If, in addition, each operator \(T_{n}\) is comonotone additive, then \((T_{n}(f))_{n}\) converges to f uniformly on the compact subsets of X, for every \(f\in E\cap C_{b}\left( X\right) \).

Notice that in both cases (i) and (ii), the family of test functions can be reduced to \(1,~-\mathrm{pr}_{1},\ldots ,~-\mathrm{pr}_{N}\) and \(\sum _{k=1}^{N}\mathrm{pr}_{k}^{2}\) when X is included in the positive cone of \(\mathbb {R}^{N}\). Also, the convergence of \((T_{n}(f))_{n}\) to f is uniform on X whenever \(f\in E\) is uniformly continuous and bounded on X.

The details of this result are the subject of Sect. 2.

Theorem 2 extends not only Korovkin’s original result (which represents the particular case where \(N=1,\) \(X=[0,1],\) all operators \(T_{n}\) are bounded, linear, and monotone, and the function \(\mathrm{pr}_{1}\) is the identity of [0, 1]), but also its several variables version due to Volkov [25]. It also encompasses the technique of smoothing kernels, in particular Weierstrass’ argument for the Weierstrass approximation theorem: for every bounded uniformly continuous function \(f:\mathbb {R}\rightarrow \mathbb {R}:\)

$$\begin{aligned} \left( W_{h}f\right) (t)=\frac{1}{h\sqrt{\pi }}\int _{-\infty }^{\infty } f(s)e^{-\left( s-t\right) ^{2}/h^{2}}\,{\text {d}}s\longrightarrow f(t) \end{aligned}$$

uniformly on \(\mathbb {R}\) as \(h\rightarrow 0.\)
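
The displayed convergence can be checked numerically by truncating the Gaussian convolution to a window of a few standard deviations and approximating it by a Riemann sum. A sketch (NumPy assumed; helper name and grid sizes are our choices):

```python
import numpy as np

def weierstrass(f, h, t, m=4001, span=8.0):
    """Approximate (W_h f)(t) = (1/(h sqrt(pi))) * integral of
    f(s) exp(-(s-t)^2/h^2) ds by a Riemann sum on [t - span*h, t + span*h];
    the Gaussian tail outside this window is negligible."""
    out = np.empty_like(t, dtype=float)
    for i, ti in enumerate(t):
        s = np.linspace(ti - span * h, ti + span * h, m)
        kernel = np.exp(-((s - ti) / h) ** 2) / (h * np.sqrt(np.pi))
        out[i] = np.sum(f(s) * kernel) * (s[1] - s[0])
    return out

t = np.linspace(-2.0, 2.0, 81)
f = lambda s: np.cos(3 * s)   # bounded and uniformly continuous on R
errs = {h: float(np.max(np.abs(weierstrass(f, h, t) - f(t))))
        for h in (1.0, 0.1, 0.01)}
print(errs)                   # errors decrease as h -> 0
```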

Applications of Theorem 2 in the nonlinear setting are presented in Sect. 3. They are all based on Choquet’s theory on integration with respect to a capacity. Indeed, this theory, which was initiated by Choquet [6, 7] in the early 1950s, represents a major source of comonotonic additive, sublinear, and monotone operators.

It is worth mentioning that, nowadays, Choquet’s theory provides powerful tools in decision-making under risk and uncertainty, game theory, ergodic theory, pattern recognition, interpolation theory, and very recently on transport under uncertainty. See Adams [1], Denneberg [8], Föllmer and Schied [9], Wang and Klir [26], Wang and Yan [27], Gal and Niculescu [13], as well as the references therein.

For the convenience of the reader, we summarize in the Appendix at the end of this paper some basic facts concerning this theory.

Some nonlinear extensions of Korovkin’s theorem within the framework of compact spaces are presented in Sect. 4.

2 Proof of Theorem 2

Before detailing the proof of Theorem 2, we make some preliminary remarks on the behavior of operators \(T:C_{b}(X)\rightarrow C_{b}(Y)\).

If T is subadditive, then it verifies the inequality:

$$\begin{aligned} \left| T(f)-T(g)\right| \le T\left( \left| f-g\right| \right) \quad \text { for all } \quad f,g. \end{aligned}$$
(2.1)

Indeed, \(f\le g+\left| f-g\right| ~\)yields \(T(f)\le T(g)+T\left( \left| f-g\right| \right) ,\) i.e., \(T(f)-T(g)\le T\left( \left| f-g\right| \right) \), and interchanging the role of f and g, we infer that \(-\left( T(f)-T(g)\right) \le T\left( \left| f-g\right| \right) .\)

If T is linear, then the property of monotonicity is equivalent to that of positivity, whose meaning is:

$$\begin{aligned} T(f)\ge 0\quad \text { for all } \quad f\ge 0. \end{aligned}$$

If the operator T is monotone and positively homogeneous, then necessarily:

$$\begin{aligned} T(0)=0. \end{aligned}$$

Every positively homogeneous and comonotonic additive operator T verifies the formula:

$$\begin{aligned} T(f+a\cdot 1)=T(f)+aT(1)\quad \text { for all } \quad f\text { and all } \quad a\in [0,\infty ); \end{aligned}$$
(2.2)

indeed, every function f is comonotone with any constant function.

Proof of Theorem 2

(i) To simplify the handling of the test functions, we denote:

$$\begin{aligned} e_{0}=1,\text { }e_{k}=\mathop {\mathrm{pr}}\nolimits _{k}\text { (}k=1,\ldots ,N)\text { and }e_{N+1}=\sum _{k=1}^{N}\mathop {\mathrm{pr}}\nolimits _{k}^{2}. \end{aligned}$$

Replacing each operator \(T_{n}\) by \(T_{n}/T_{n}(e_{0}),\) we may assume that \(T_{n}(e_{0})=1\) for all n.

Let \(f\in E\cap C_{b}(X)\) and let K be a compact subset of X. Then, for every \(\varepsilon >0\), there is \(\tilde{\delta }>0\), such that:

$$\begin{aligned} |f(s)-f(t)|\le \varepsilon \quad \text { for every }t\in K\text { and }s\in X\text { with }\Vert s-t\Vert \le \tilde{\delta }; \end{aligned}$$

this can be easily proved by reductio ad absurdum.

If \(\Vert s-t\Vert \ge \tilde{\delta }\), then:

$$\begin{aligned} |f(s)-f(t)|\le \frac{2\Vert f\Vert _{\infty }}{\tilde{\delta }^{2}}\cdot \Vert s-t\Vert ^{2}, \end{aligned}$$

so that letting \(\delta (\varepsilon )=2\Vert f\Vert _{\infty }/\tilde{\delta }^{2}\), we obtain the estimate:

$$\begin{aligned} |f(s)-f(t)|\le \varepsilon +\delta (\varepsilon )\cdot \Vert s-t\Vert ^{2} \end{aligned}$$
(2.3)

for all \(t\in K\) and \(s\in X.\) Since K is a compact set, it can be embedded into an N-dimensional cube \([a,b]^{N}\) for suitable \(b\ge 0\ge a\), and the estimate (2.3) yields:

$$\begin{aligned} \left| f(s)-f(t)e_{0}(s)\right|\le & {} \varepsilon e_{0}(s)\\&+\delta (\varepsilon )\Big [ e_{N+1}(s)+2\sum _{k=1}^{N}\left( e_{k}(t)-a\right) \left( -e_{k}(s)\right) \\&-2a\sum _{k=1}^{N}e_{k}(s)+\left\| t\right\| ^{2}e_{0}(s)\Big ] . \end{aligned}$$

Taking into account the formula (2.1) and the fact that the operators \(T_{n}\) are subadditive and positively homogeneous, we infer that:

$$\begin{aligned} \left| T_{n}(f)(s)-f(t)\right|= & {} \left| T_{n}(f)(s)-T_{n} (f(t)e_{0})(s)\right| \le T_{n}\left( \left| f-f(t)e_{0} \right| \right) (s)\\\le & {} \varepsilon +\delta (\varepsilon )\Big [ T_{n}(e_{N+1})(s)+2\sum _{k=1}^{N}\left( e_{k}(t)-a\right) T_{n}(-e_{k})(s) \\&-2a\sum _{k=1}^{N}T_{n}(e_{k})(s) +\left\| t\right\| ^{2}\Big ] \end{aligned}$$

for every \(n\in \mathbb {N}\) and \(s,t\in K.\) Here, we used the assumption that f is nonnegative. By our hypothesis:

$$\begin{aligned} T_{n}(e_{N+1})(s)+2\sum _{k=1}^{N}\left( e_{k}(s)-a\right) T_{n} (-e_{k})(s)-2a\sum _{k=1}^{N}T_{n}(e_{k})(s) +\left\| s\right\| ^{2}\rightarrow 0 \end{aligned}$$

uniformly on K as \(n\rightarrow \infty .\) Therefore:

$$\begin{aligned} \limsup _{n\rightarrow \infty }\,\sup _{s\in K}\left| T_{n}(f)(s)-f(s)\right| \le \varepsilon \end{aligned}$$

whence we conclude that \(T_{n}(f)\rightarrow f\) uniformly on K, because \(\varepsilon \) was arbitrarily fixed.

(ii) Suppose, in addition, that each operator \(T_{n}\) is also comonotone additive. Since \(f+\Vert f\Vert e_{0}\) is nonnegative, the assertion (i) applies to it and yields:

$$\begin{aligned} T_{n}(f+\Vert f\Vert e_{0})\rightarrow f+\Vert f\Vert e_{0},\quad \text { uniformly on }K. \end{aligned}$$

Since a constant function is comonotone with any arbitrary function, using the comonotone additivity of \(T_{n}\), it follows that \(T_{n}(f+\Vert f\Vert e_{0})=T_{n}(f)+\Vert f\Vert \cdot T_{n}(e_{0})\). Therefore, \(T_{n} (f)\rightarrow f\) uniformly on K. \(\square \)

When K is included in the positive cone of \(\mathbb {R}^{N}\), it can be embedded into an N-dimensional cube \([0,b]^{N}\) for a suitable \(b>0\), and the estimate (2.3) yields:

$$\begin{aligned} \left| f(s)-f(t)e_{0}(s)\right|\le & {} \varepsilon e_{0}(s)\\&+\delta (\varepsilon )\left[ e_{N+1}(s)+2\sum _{k=1}^{N}e_{k}(t)\left( -e_{k}(s)\right) +\left\| t\right\| ^{2}e_{0}(s)\right] . \end{aligned}$$

Proceeding as above, we infer that:

$$\begin{aligned}&\left| T_{n}(f)(s)-f(t)\right| \\&\quad \le \varepsilon +\delta (\varepsilon )\left[ T_{n}(e_{N+1})(s)+2\sum _{k=1}^{N}e_{k}(t)T_{n}(-e_{k})(s)+\left\| t\right\| ^{2}\right] \end{aligned}$$

for every \(n\in \mathbb {N}\) and \(s,t\in K,\) provided that \(f\ge 0.\) As a consequence, in both cases (i) and (ii), the family of test functions can be reduced to \(e_{0},-e_{1},\ldots ,-e_{N}\), and \(e_{N+1}.\)

When \(f\in E\) is uniformly continuous and bounded on X, an inspection of the argument above shows that f verifies an estimate of the form (2.3) for all \(s,t\in X\), which implies the convergence of \((T_{n}(f))_{n}\) to f uniformly on X.

3 Applications of Theorem 2

We will next discuss several examples of operators illustrating Theorem 2. They are all based on Choquet’s theory of integration with respect to a capacity \(\mu \), in our case, the restriction of the submodular capacity:

$$\begin{aligned} \mu (A)=\left( \mathcal {L}(A)\right) ^{1/2} \end{aligned}$$

to various compact subintervals of \(\mathbb {R}\); here, \(\mathcal {L}\) denotes the Lebesgue measure on the real line. The necessary background on Choquet’s theory is provided by the Appendix at the end of this paper.

The one-dimensional case of Theorem 2 is illustrated by the following three families of nonlinear operators, first considered in [11]:

  • the Bernstein–Kantorovich–Choquet operators act on C([0, 1]) by the formula:

    $$\begin{aligned} K_{n,\mu }(f)(x)=\sum _{k=0}^{n}\frac{(C)\int _{k/(n+1)}^{(k+1)/(n+1)}f(t)d\mu }{\mu ([k/(n+1),(k+1)/(n+1)])}\cdot {\left( {\begin{array}{c}n\\ k\end{array}}\right) }x^{k}(1-x)^{n-k}; \end{aligned}$$
  • the Szász–Mirakjan–Kantorovich–Choquet operators act on \(C([0,\infty ))\) by the formula :

    $$\begin{aligned} S_{n,\mu }(f)(x)=e^{-nx}\sum _{k=0}^{\infty }\frac{(C)\int _{k/n}^{(k+1)/n} f(t)d\mu }{\mu ([k/n,(k+1)/n])}\cdot \frac{(nx)^{k}}{k!}; \end{aligned}$$
  • the Baskakov–Kantorovich–Choquet operators act on \(C([0,\infty ))\) by the formula:

    $$\begin{aligned} V_{n,\mu }(f)(x)=\sum _{k=0}^{\infty }\frac{(C)\int _{k/n}^{(k+1)/n}f(t)d\mu }{\mu ([k/n,(k+1)/n])}\cdot {\left( {\begin{array}{c}n+k-1\\ k\end{array}}\right) }\frac{x^{k}}{(1+x)^{n+k}}. \end{aligned}$$
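
To make the first family concrete, here is a numerical sketch of \(K_{n,\mu }\) (NumPy assumed; `choquet_sqrt_leb` and `K_choquet` are our names). The Choquet integrals are approximated by discretizing the layer-cake formula of Definition 2 for \(\mu =\mathcal {L}^{1/2}\):

```python
import numpy as np
from math import comb

def choquet_sqrt_leb(f, a, b, n_s=1000, n_t=1000):
    """(C)-integral of f over [a, b] for mu(A) = Lebesgue(A)**0.5,
    approximated by discretizing both integrals of the layer-cake formula."""
    s = np.linspace(a, b, n_s)
    fv = f(s)
    t = np.linspace(min(fv.min(), 0.0), max(fv.max(), 0.0), n_t)
    dt = t[1] - t[0]
    # Lebesgue measure of each level set {f >= t}, estimated by counting
    leb = (b - a) * np.count_nonzero(fv[None, :] >= t[:, None], axis=1) / n_s
    mu_level = np.sqrt(leb)
    mu_A = np.sqrt(b - a)
    return float(np.sum(mu_level[t >= 0]) * dt + np.sum(mu_level[t < 0] - mu_A) * dt)

def K_choquet(f, n, x):
    """Bernstein-Kantorovich-Choquet operator K_{n,mu} (numerical sketch)."""
    h = 1.0 / (n + 1)
    means = np.array([choquet_sqrt_leb(f, k * h, (k + 1) * h) / np.sqrt(h)
                      for k in range(n + 1)])
    k = np.arange(n + 1)
    coeff = np.array([comb(n, j) for j in k], dtype=float)
    basis = coeff * np.power.outer(x, k) * np.power.outer(1 - x, n - k)
    return basis @ means

x = np.linspace(0.0, 1.0, 51)
f = lambda t: t * t                     # the test function e_2
errs = {n: float(np.max(np.abs(K_choquet(f, n, x) - f(x)))) for n in (5, 40)}
print(errs)                             # sup-norm error decreases with n
```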

Since the Choquet integral with respect to a submodular capacity \(\mu \) is comonotone additive, sublinear and monotone, it follows that all above operators also have these properties.

Clearly, \(K_{n,\mu }(e_{0})(x)=1\) and by Corollary 3.6 (i) in [11], we immediately get that \(K_{n,\mu }(e_{2})(x)\rightarrow e_{2}(x)\) uniformly on [0, 1]. Again, by Corollary 3.6 (i),  it follows that \(K_{n,\mu } (1-e_{1})(x)\rightarrow 1-e_{1}\), uniformly on [0, 1]. Since \(K_{n,\mu }\) is comonotone additive:

$$\begin{aligned} K_{n,\mu }(1-e_{1})(x)=K_{n,\mu }(e_{0})(x)+K_{n,\mu }(-e_{1})(x), \end{aligned}$$

which implies that \(K_{n,\mu }(-e_{1})\rightarrow -e_{1}\) uniformly on [0, 1]. Therefore, the operators \(K_{n,\mu }\) satisfy the hypothesis of Theorem 2, whence the conclusion

$$\begin{aligned} K_{n,\mu }(f)(x)\rightarrow f(x)\quad \text {uniformly on }[0,1]\text { for every }f\in C([0,1]). \end{aligned}$$

Similarly, one can show that the operators \(S_{n,\mu }\) and \(V_{n,\mu }\) satisfy the hypothesis of Theorem 2 for \(N=1\) and \(X=[0,+\infty )\). In the first case, notice that the condition \(S_{n,\mu }(e_{0})=e_{0}\) is trivial. The convergence of the sequence of functions \(S_{n,\mu }(e_{2})(x)\) will be settled by computing the integrals \(\sqrt{n}\cdot (C)\int _{k/n}^{(k+1)/n}t^{2}d\mu \). We have:

$$\begin{aligned} \sqrt{n}\cdot (C)\int _{k/n}^{(k+1)/n}t^{2}{\text {d}}\mu= & {} \sqrt{n}\int _{0}^{\infty } \mu (\{t\in [k/n,(k+1)/n]:t\ge \sqrt{\alpha }\}){\text {d}}\alpha \\= & {} \sqrt{n}\int _{0}^{((k+1)/n)^{2}}\mu (\{t\in [k/n,(k+1)/n]:t\ge \sqrt{\alpha }\}){\text {d}}\alpha \\= & {} \sqrt{n}\int _{0}^{(k/n)^{2}}\mu (\{t\in [k/n,(k+1)/n]:t\ge \sqrt{\alpha }\}){\text {d}}\alpha \\&+\sqrt{n}\int _{(k/n)^{2}}^{((k+1)/n)^{2}}\mu (\{t\in [k/n,(k+1)/n]:t\ge \sqrt{\alpha }\}){\text {d}}\alpha \\= & {} \sqrt{n}\cdot \left( \frac{k}{n}\right) ^{2}\cdot \frac{1}{\sqrt{n}}+\sqrt{n}\cdot \int _{(k/n)^{2}}^{((k+1)/n)^{2}}\sqrt{(k+1)/n-\sqrt{\alpha }}\,{\text {d}}\alpha \\= & {} \left( \frac{k}{n}\right) ^{2}+2\sqrt{n}\cdot \int _{0}^{1/n}\beta ^{1/2}\left( (k+1)/n-\beta \right) {\text {d}}\beta \\= & {} \left( \frac{k}{n}\right) ^{2}+2\sqrt{n}\cdot \frac{k+1}{n}\cdot \frac{2}{3}\cdot \beta ^{3/2}\Big |_{0}^{1/n}-2\sqrt{n}\cdot \frac{2}{5}\beta ^{5/2}\Big |_{0}^{1/n}\\= & {} \frac{1}{15n^{2}}\left( 15k^{2}+20k+8\right) , \end{aligned}$$

where the sixth equality follows from the substitution \(\beta =(k+1)/n-\sqrt{\alpha }\), which contributes the factor \(2\left( (k+1)/n-\beta \right) \) coming from \({\text {d}}\alpha \).

This immediately implies:

$$\begin{aligned} S_{n,\mu }(e_{2})(x)=S_{n}(e_{2})(x)+\frac{4}{3n}S_{n}(e_{1})(x)+\frac{4}{3n^{2}}-\frac{4}{5n^{2}}\rightarrow e_{2}(x), \end{aligned}$$

uniformly on every compact subinterval [0, a]. Here, \(S_{n}\) denotes the classical Szász–Mirakjan operator.

It remains to show that \(S_{n,\mu }(-e_{1})(x)\rightarrow -e_{1}(x)\), uniformly on every compact subinterval [0, a]. To this end, we perform the following computation:

$$\begin{aligned}&\sqrt{n}\cdot (C)\int _{k/n}^{(k+1)/n}(-t){\text {d}}\mu \\&\quad =\sqrt{n}\int _{-\infty }^{0}\left\{ \mu (\{\omega \in [k/n,(k+1)/n]:-\omega \ge \alpha \})-\frac{1}{\sqrt{n} }\right\} {\text {d}}\alpha \\&\quad =\sqrt{n}\int _{-k/n}^{0}\left\{ \mu (\{\omega \in [k/n,(k+1)/n]:\omega \le -\alpha \})-\frac{1}{\sqrt{n}}\right\} {\text {d}}\alpha \\&\qquad +\sqrt{n}\int _{-(k+1)/n}^{-k/n}\left\{ \mu (\{\omega \in [k/n,(k+1)/n]:\omega \le -\alpha \})-\frac{1}{\sqrt{n}}\right\} {\text {d}}\alpha \\&\quad =-\frac{k}{n}+\sqrt{n}\cdot \int _{-(k+1)/n}^{-k/n}\left( \sqrt{-\alpha -k/n}-\frac{1}{\sqrt{n}}\right) {\text {d}}\alpha \\&\quad =-\frac{k}{n}+\sqrt{n}\int _{k/n}^{(k+1)/n}\sqrt{\beta -k/n}\,{\text {d}}\beta -\frac{1}{n}\\&\quad =-\frac{k}{n}+\sqrt{n}\int _{0}^{1/n}\beta ^{1/2}{\text {d}}\beta -\frac{1}{n}=-{(3k+1)}/{(3n)}. \end{aligned}$$

Consequently:

$$\begin{aligned} S_{n,\mu }(-e_{1})(x)=S_{n}(-e_{1})(x)-\frac{1}{3n}\rightarrow -x, \end{aligned}$$

uniformly on any compact interval [0, a].
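
The two closed forms can be double-checked numerically. The sketch below (pure Python; the helper name `riemann` is ours) recomputes the reduced one-dimensional \(\beta \)-integrals by the midpoint rule and compares them with \((15k^{2}+20k+8)/(15n^{2})\) and \(-(3k+1)/(3n)\):

```python
import math

def riemann(g, a, b, m=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return sum(g(a + (i + 0.5) * h) for i in range(m)) * h

n, k = 10, 3
# e_2:  (k/n)^2 + 2*sqrt(n) * integral_0^{1/n} beta^{1/2}((k+1)/n - beta) dbeta
e2 = (k / n) ** 2 + 2 * math.sqrt(n) * riemann(
    lambda b_: math.sqrt(b_) * ((k + 1) / n - b_), 0.0, 1.0 / n)
print(e2, (15 * k ** 2 + 20 * k + 8) / (15 * n ** 2))

# -e_1:  -k/n + sqrt(n) * integral_0^{1/n} beta^{1/2} dbeta - 1/n
me1 = -k / n + math.sqrt(n) * riemann(math.sqrt, 0.0, 1.0 / n) - 1.0 / n
print(me1, -(3 * k + 1) / (3 * n))
```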

In a similar way, one can prove that the Baskakov–Kantorovich–Choquet operators \(V_{n,\mu }\) satisfy the hypothesis of Theorem 2.

The several variables framework can be illustrated by the following special type of Bernstein–Durrmeyer–Choquet operators (see [14] for the general case) that act on the space of continuous functions defined on the N-simplex

$$\begin{aligned} \Delta _{N}=\{(x_{1},\ldots ,x_{N}):0\le x_{1},\ldots ,x_{N}\le 1,\,0\le x_{1} +\cdots +x_{N}\le 1\} \end{aligned}$$

via the formulas:

$$\begin{aligned} M_{n,\mu }(f)(\mathbf {x})=B_{n}(f)(\mathbf {x})-f(\mathbf {x})+x_{N}^{n}\left[ \frac{(C)\int _{\Delta _{N}}f(t_{1},\ldots ,t_{N})t_{N}^{n}d\mu }{(C)\int _{\Delta _{N} }t_{N}^{n}d\mu }-f(0,\ldots ,0,1)\right] . \end{aligned}$$

Here, \(\mathbf {x}=(x_{1},\ldots ,x_{N})\), \(B_{n}(f)(\mathbf {x})\) is the multivariate Bernstein polynomial and \(\mu =\sqrt{\mathcal {L}_{N}}\), where \(\mathcal {L}_{N}\) is the N-dimensional Lebesgue measure. The fact that these operators verify the hypotheses of Theorem 2 is an exercise left to the reader.

4 The Case of Spaces of Functions Defined on Compact Spaces

The alert reader has probably already noticed that the key ingredient in the proof of Theorem 2 is the estimate (2.3), characterized in [21] (see also [20]) as a property of absolute continuity. This estimate occurs in the larger context of the spaces C(M), where M is a metric space endowed with a separating function, that is, a nonnegative continuous function \(\gamma :M\times M\rightarrow \mathbb {R}\), such that:

$$\begin{aligned} \gamma (s,t)=0\text { implies }s=t. \end{aligned}$$

If M is a compact subset of \(\mathbb {R}^{N}\) and \(f_{1},\ldots ,f_{m}\in C(M)\) is a family of functions that separates the points of M (in particular, this is the case of the coordinate functions \(\mathrm{pr}_{1},\ldots ,\mathrm{pr}_{N}\)), then:

$$\begin{aligned} \gamma (s,t)=\sum _{k=1}^{m}\left( f_{k}(s)-f_{k}(t)\right) ^{2} \end{aligned}$$
(4.1)

is a separating function.

Lemma 1

(See [21]) If K is a compact metric space and \(\gamma :K\times K\rightarrow \mathbb {R}\) is a separating function, then every real-valued continuous function f defined on K verifies, for each \(\varepsilon >0\) and a suitable \(\delta (\varepsilon )>0\), an estimate of the following form:

$$\begin{aligned} \left| f(s)-f(t)\right| \le \varepsilon +\delta (\varepsilon )\gamma (s,t)\quad \text { for all }s,t\in K. \end{aligned}$$
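
Lemma 1 can be probed numerically: on a finite grid of K, the smallest admissible \(\delta (\varepsilon )\) is computable directly. A sketch (NumPy assumed; grid and \(\varepsilon \) are arbitrary choices) with \(K=[0,1]\), \(\gamma (s,t)=(s-t)^{2}\), and the non-Lipschitz function \(f(t)=\sqrt{t}\):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 401)
f = np.sqrt(x)                          # continuous, not Lipschitz at 0
df = np.abs(np.subtract.outer(f, f))    # |f(s) - f(t)| over all pairs
gamma = np.subtract.outer(x, x) ** 2    # separating function (s - t)^2

eps = 0.05
mask = gamma > 0
delta = float(np.max((df[mask] - eps) / gamma[mask]))  # smallest admissible delta(eps)
ok = bool(np.all(df <= eps + max(delta, 0.0) * gamma + 1e-12))
print(delta > 0, ok)                    # True True
```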

The separating functions play an important role in obtaining Korovkin-type theorems. A sample is as follows:

Theorem 3

Suppose that K is a compact metric space and \(\gamma \) is a separating function for K. If \(T_{n}:C(K)\rightarrow C(K)\) \((n\in \mathbb {N})\) is a sequence of comonotone additive, sublinear, and monotone operators, such that \(T_{n}(1)\rightarrow 1\) uniformly and:

$$\begin{aligned} T_{n}(\gamma (\cdot ,t))(t)\rightarrow 0\quad \text { uniformly in }t, \end{aligned}$$
(4.2)

then \(T_{n}(f)\rightarrow f\) uniformly for each \(f\in C(K).\)

The details are similar to those used for Theorem 2, so they will be omitted.

In a similar way, one can prove the following nonlinear extension of the Korovkin-type theorem (due in the linear case to Schempp [23] and Grossman [17]):

Theorem 4

Let X be a compact Hausdorff space and \(\mathcal {F}\) be a subset of C(X) that separates the points of X. If \((T_{n})_{n}\) is a sequence of comonotonic additive, sublinear, and monotone operators that map C(X) into C(X) and satisfy the conditions \(\lim _{n\rightarrow \infty }T_{n}(f^{k} )=f^{k}\) for each f in \(\mathcal {F}\) and \(k=0,1,2,\) then:

$$\begin{aligned} \lim _{n\rightarrow \infty }T_{n}(f)=f, \end{aligned}$$

for every f in C(X).

5 Appendix: Some Basic Facts on Capacities and Choquet Integral

For the convenience of the reader, we will briefly recall in this section some basic facts concerning the mathematical concept of capacity and the integral associated to it. Full details are to be found in the books of Denneberg [8], Grabisch [16], and Wang and Klir [26].

Let \((X,\mathcal {A})\) be an arbitrarily fixed measurable space, consisting of a nonempty abstract set X and a \(\sigma \)-algebra \({\mathcal {A}}\) of subsets of X.

Definition 1

A set function \(\mu :{\mathcal {A}}\rightarrow [0,\infty )\) is called a capacity if \(\mu (\emptyset )=0\) and:

$$\begin{aligned} \mu (A)\le \mu (B)\quad \text { for all }A,B\in {\mathcal {A}},\text { with }A\subset B. \end{aligned}$$

A capacity is called normalized if \(\mu (X)=1.\)

An important class of normalized capacities is that of probability measures (that is, the capacities possessing the property of \(\sigma \)-additivity). Probability distortions represent a major source of nonadditive capacities. Technically, one starts with a probability measure \(P:\mathcal {A}\rightarrow [0,1]\) and applies to it a distortion \(u:[0,1]\rightarrow [0,1],\) that is, a nondecreasing and continuous function, such that \(u(0)=0\) and \(u(1)=1;\) for example, one may choose \(u(t)=t^{\alpha }\) with \(\alpha >0.\) The distorted probability \(\mu =u(P)\) is a capacity with the remarkable property of being continuous along descending sequences; that is:

$$\begin{aligned} \lim _{n\rightarrow \infty }\mu (A_{n})=\mu \left( {\displaystyle \bigcap _{n=1}^{\infty }} A_{n}\right) \end{aligned}$$

for every nonincreasing sequence \((A_{n})_{n}\) of sets in \(\mathcal {A}.\) Upper continuity of a capacity generalizes the countable additivity of an additive measure: indeed, if \(\mu \) is additive, then upper continuity is equivalent to countable additivity. When the distortion u is concave (for example, when \(u(t)=t^{\alpha }\) with \(0<\alpha <1),\) \(\mu \) is also submodular in the sense that:

$$\begin{aligned} \mu (A\cup B)+\mu (A\cap B)\le \mu (A)+\mu (B)\quad \text { for all }A,B\in \mathcal {A}. \end{aligned}$$
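
For a finite ground set, the submodularity of a concavely distorted probability can be verified exhaustively over all pairs of subsets. A sketch in pure Python (the four-point probability is an arbitrary choice), with \(u(t)=t^{1/2}\):

```python
from itertools import combinations
from math import sqrt

points = (0, 1, 2, 3)
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}   # a probability on four points

def P(A):
    return sum(p[i] for i in A)

def mu(A):
    # distorted probability u(P(A)) with the concave distortion u(t) = sqrt(t)
    return sqrt(P(A))

subsets = [frozenset(c) for r in range(5) for c in combinations(points, r)]
submodular = all(mu(A | B) + mu(A & B) <= mu(A) + mu(B) + 1e-12
                 for A in subsets for B in subsets)
print(submodular)                       # True
```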

Another simple technique of constructing normalized submodular capacities \(\mu \) on a measurable space \(\left( X,\mathcal {A}\right) \) is by allocating to it a probability space \(\left( Y,\mathcal {B},P\right) \) via a map \(\rho :\mathcal {A\rightarrow B}\), such that:

$$\begin{aligned} \rho (\emptyset )= & {} \emptyset ,\text { }\rho (X)=Y\text { and}\\ \rho \left( {\displaystyle \bigcap \nolimits _{n=1}^{\infty }} A_{n}\right)= & {} {\displaystyle \bigcap \nolimits _{n=1}^{\infty }} \rho (A_{n})\quad \text { for every sequence of sets }A_{n}\in \mathcal {A}. \end{aligned}$$

This allows us to define \(\mu \) by the formula:

$$\begin{aligned} \mu (A)=1-P\left( \rho (X\backslash A)\right) . \end{aligned}$$

See Shafer [24] for details.

The next concept of integrability with respect to a capacity refers to the whole class of random variables, that is, to all functions \(f:X\rightarrow \mathbb {R}\), such that \(f^{-1}(A)\in {\mathcal {A}}\) for every Borel subset A of \(\mathbb {R}\).

Definition 2

The Choquet integral of a random variable f with respect to the capacity \(\mu \) is defined as the sum of two Riemann improper integrals:

$$\begin{aligned} (C)\int _{X}f{\text {d}}\mu =&\int _{0}^{+\infty }\mu \left( \{x\in X:f(x)\ge t\}\right) {\text {d}}t\\&+\int _{-\infty }^{0}\left[ \mu \left( \{x\in X:f(x)\ge t\}\right) -\mu (X)\right] {\text {d}}t. \end{aligned}$$

Accordingly, f is said to be Choquet integrable if both integrals above are finite.

If \(f\ge 0\), then the last integral in the formula appearing in Definition 2 is 0.

The inequality sign \(\ge \) in the above two integrands can be replaced by \(>;\) see [26], Theorem 11.1, p. 226.

Every bounded random variable is Choquet integrable. The Choquet integral coincides with the Lebesgue integral when the underlying set function \(\mu \) is a \(\sigma \)-additive measure.
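
On a finite set, the two improper integrals of Definition 2 collapse to the rank-based sum \(\sum _{i}f_{(i)}\left( \mu (A_{i})-\mu (A_{i-1})\right) \), where the values \(f_{(1)}\ge f_{(2)}\ge \cdots \) are arranged in decreasing order and \(A_{i}\) collects the points carrying the i largest values. A sketch (pure Python; `choquet` is our name) that also confirms the reduction to the ordinary expectation for an additive \(\mu \):

```python
import math

def choquet(values, mu):
    """Choquet integral of a function given as {point: value}, w.r.t. a
    capacity mu defined on frozensets of points (rank-based formula)."""
    pts = sorted(values, key=values.get, reverse=True)
    total, A, prev = 0.0, frozenset(), 0.0
    for x in pts:
        A = A | {x}
        total += values[x] * (mu(A) - prev)
        prev = mu(A)
    return total

p = {'a': 0.2, 'b': 0.3, 'c': 0.5}
prob = lambda A: sum(p[x] for x in A)   # additive: a probability measure
f = {'a': -1.0, 'b': 2.0, 'c': 4.0}
# ordinary expectation: 0.2*(-1) + 0.3*2 + 0.5*4 = 2.4 (up to rounding)
print(choquet(f, prob))

mu = lambda A: math.sqrt(prob(A))       # submodular distortion
double = choquet({x: 2 * v for x, v in f.items()}, mu)
print(math.isclose(double, 2 * choquet(f, mu)))  # True: positive homogeneity
```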

The integral of a function \(f:X\rightarrow \mathbb {R}\) on a set \(A\in \mathcal {A}\) is defined by the formula:

$$\begin{aligned} (C)\int _{A}f{\text {d}}\mu =(C)\int _{X}f{\text {d}}\mu _{A}, \end{aligned}$$

where \(\mu _{A}\) is the capacity defined by \(\mu _{A}(B)=\mu (B\cap A)\) for all \(B\in \mathcal {A}.\)

We next summarize some basic properties of the Choquet integral.

Remark 1

(a) If \(\mu :{\mathcal {A}}\rightarrow [0,\infty )\) is a capacity, then the associated Choquet integral is a functional on the space of all bounded random variables, such that:

$$\begin{aligned} f\ge & {} 0\text { implies }(C)\int _{A}f{\text {d}}\mu \ge 0\quad \text {(positivity)}\\ f\le & {} g\text { implies }\left( C\right) \int _{A}f{\text {d}}\mu \le \left( C\right) \int _{A}g{\text {d}}\mu \quad \text {(monotonicity)}\\ \left( C\right) \int _{A}af{\text {d}}\mu= & {} a\cdot \left( C\right) \int _{A} f{\text {d}}\mu \quad \text {for }a\ge 0\quad \text {(positive homogeneity)}\\ \left( C\right) \int _{A}1\,{\text {d}}\mu= & {} \mu (A)\quad \text {(calibration)}; \end{aligned}$$

see [8], Proposition 5.1 (ii), p. 64, for a proof of the property of positive homogeneity.

(b) In general, the Choquet integral is not additive, but, if the bounded random variables f and g are comonotonic, then:

$$\begin{aligned} \left( C\right) \int _{A}(f+g){\text {d}}\mu =\left( C\right) \int _{A}f{\text {d}}\mu +\left( C\right) \int _{A}g{\text {d}}\mu . \end{aligned}$$

This is usually referred to as the property of comonotonic additivity and was first noticed by Dellacherie [10]. An immediate consequence is the property of translation invariance:

$$\begin{aligned} \left( C\right) \int _{A}(f+c){\text {d}}\mu =\left( C\right) \int _{A}f{\text {d}}\mu +c\cdot \mu (A) \end{aligned}$$

for all \(c\in \mathbb {R}\) and all bounded random variables f. For details, see [8], Proposition 5.1, (vi), p. 65.
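
Both comonotonic additivity and translation invariance (as well as the failure of additivity for non-comonotone pairs) are easy to confirm on a finite set. The sketch below (pure Python; rank-based evaluation of the Choquet integral, names ours) uses the submodular capacity \(\mu (A)=\left( |A|/5\right) ^{1/2}\):

```python
import math

def choquet(vals, mu):
    """Rank-based evaluation of the Choquet integral on a finite set."""
    xs = sorted(vals, key=vals.get, reverse=True)
    total, A, prev = 0.0, frozenset(), 0.0
    for x in xs:
        A = A | {x}
        total += vals[x] * (mu(A) - prev)
        prev = mu(A)
    return total

pts = range(5)
mu = lambda A: math.sqrt(len(A) / 5)    # submodular, normalized
f = {x: float(x) for x in pts}          # nondecreasing
g = {x: float(x * x) for x in pts}      # nondecreasing: comonotone with f
h = {x: float(-x) for x in pts}         # decreasing: not comonotone with f

add_fg = math.isclose(choquet({x: f[x] + g[x] for x in pts}, mu),
                      choquet(f, mu) + choquet(g, mu))
add_fh = math.isclose(choquet({x: f[x] + h[x] for x in pts}, mu),
                      choquet(f, mu) + choquet(h, mu))
print(add_fg, add_fh)                   # True False

c = 3.7
shift = math.isclose(choquet({x: f[x] + c for x in pts}, mu),
                     choquet(f, mu) + c * mu(frozenset(pts)))
print(shift)                            # True: translation invariance
```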

(c) If \(\mu \) is an upper continuous capacity, then the Choquet integral is upper continuous in the sense that:

$$\begin{aligned} \lim _{n\rightarrow \infty }\left( \left( C\right) \int _{A}f_{n}{\text {d}}\mu \right) =\left( C\right) \int _{A}f{\text {d}}\mu \end{aligned}$$

whenever \((f_{n})_{n}\) is a nonincreasing sequence of bounded random variables that converges pointwise to the bounded random variable f. This is a consequence of the Beppo Levi monotone convergence theorem from the theory of the Lebesgue integral.

(d) Suppose that \(\mu \) is a submodular capacity. Then, the associated Choquet integral is a subadditive functional; that is:

$$\begin{aligned} \left( C\right) \int _{A}(f+g){\text {d}}\mu \le \left( C\right) \int _{A}f{\text {d}}\mu +\left( C\right) \int _{A}g{\text {d}}\mu \end{aligned}$$

for all bounded random variables f and g. See [8], Corollary 6.4, p. 78, and Corollary 13.4, p. 161. It is also a submodular functional in the sense that:

$$\begin{aligned} \left( C\right) \int _{A}\sup \left\{ f,g\right\} {\text {d}}\mu +\left( C\right) \int _{A}\inf \{f,g\}{\text {d}}\mu \le \left( C\right) \int _{A}f{\text {d}}\mu +(C)\int _{A}g{\text {d}}\mu \end{aligned}$$

for all bounded random variables f and g. See [5], Theorem 13 (c).

A characterization of Choquet integral in terms of additivity on comonotonic functions is provided by the following analogue of the Riesz representation theorem. See Zhou [28], Theorem 1, and Lemma 3, for a simple (and more general) argument.

Theorem 5

Suppose that \(I:C(X)\rightarrow \mathbb {R}\) is a comonotonically additive and monotone functional with \(I(1)=1\). Then, it is also upper continuous and there exists a unique upper continuous normalized capacity \(\mu :\mathcal {B}(X)\rightarrow [0,1]\), such that I coincides with the Choquet integral associated with it.

On the other hand, according to Remark 1, the Choquet integral associated with any upper continuous capacity is a comonotonically additive, monotone, and upper continuous functional.

Notice that under the assumptions of Theorem 5, the capacity \(\mu \) is submodular if and only if the functional I is submodular.