Abstract
Poker is a game of imperfect information (Rubin and Watson, Artif Intell, 175(5–6), [1]), in which each player holds cards hidden from the other players. In previous research (Billings et al., in American Association for Artificial Intelligence Press, [2]; Popyack, in Blackjack-playing agents in an advanced AI course, [3]), the Markov model has served as a powerful framework for designing a Poker agent. In this paper, we propose a new approach, named intuitive searching, to build an exploitative agent that plays against a Markov-model agent in the game of Blackjack. The correctness of our approach is proved by analyzing the predictive distribution of the winning rate. Our agent prevails over the Markov agent in a 10000-round experiment. The performance of the intuitive policy searching approach in general cases, together with its limitations, is also discussed.
References
Rubin J, Watson I (2011) Computer poker: a review. Artif Intell 175(5–6):958–987. https://doi.org/10.1016/j.artint.2010.12.005
Billings D, Castillo L, Schaeffer J, Szafron D (1999) Using probabilistic knowledge and simulation to play Poker. American Association for Artificial Intelligence Press, pp 697–703. http://poker.cs.ualberta.ca/publications/AAAI99.pdf
Popyack JL (2009) Blackjack-playing agents in an advanced AI course. In: ITiCSE ’09: proceedings of the 14th annual ACM SIGCSE conference on innovation and technology in computer science education, pp 208–212. https://doi.org/10.1145/1562877.1562944
Thorp EO (1966) Beat the dealer: a winning strategy for the game of twenty-one. Vintage Books. ISBN 0394703103
Bishop CM (2006) Pattern recognition and machine learning (Information science and statistics). Springer-Verlag New York, Inc., Secaucus, NJ
Qazaz CS, Williams CKI, Bishop CM (1997) An upper bound on the Bayesian error bars for generalized linear regression. In: Ellacott SW, Mason JC, Anderson IJ (eds) Mathematics of neural networks. Operations research/computer science interfaces series, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-6099-9_51
Spiegelhalter DJ, Lauritzen SL (1990) Sequential updating of conditional probabilities on directed graphical structures. Networks 20(5):579–605. https://doi.org/10.1002/net.3230200507
Appendix 1
1.1 Missing Proofs
1.1.1 Proof of Lemma 2
For any \(0 \le R\le 2T\), we have
Case 1: \(0 \le R \le T\)
Case 2: \(T \le R \le 2T\)
1.1.2 Proof of Lemma 3
When \(0\le R\le 2T\), we have
1.1.3 Proof of Lemma 4
We have
Using \(R=2Tx\) to replace R, we have
Similarly,
and
Using \(R=2Tx\) to replace R, we then have
As a result,
1.1.4 Proof of Theorem 1
Part 1: \(\displaystyle \frac{1}{n}\bigg (\sum _{1\le i\le n} f(\phi (y_i),\theta )\bigg ) \xrightarrow {P} \displaystyle \frac{1}{n}\bigg (\sum _{1\le i\le n} f(\phi (x_{k_i}),\theta )\bigg )\)
Proof
Let P be the probability that y and its nearest neighbor belong to different classes. Then we have
Because the random events \(f(\phi (y_i),\theta _x) \ne f(\phi (x_{k_i}),\theta _x)\) and \(|f(\phi (y_i),\theta _x) - f(\phi (x_{k_i}),\theta _x)|=1\) are identical, the conditional expectation given the training data \(x_1,\ldots ,x_n\) is
and the conditional variance given the training data \(x_1,\ldots ,x_n\) is
Denote by \(\varGamma _{\theta _x}\) the decision boundary of the classification function f with parameter \(\theta _x\), and by \(\Vert y\varGamma _{\theta _x} \Vert \) the distance from y to the decision boundary. Then we have
Because the decision boundary of the classification function f depends only on the training data \(x_1,\dots ,x_n\), which are independent of y, we have
We remark that \(\Vert \varGamma _{\theta _x} \Vert \) is affected by the complexity of the model.
Part 2: \(\frac{1}{n}\bigg (\sum _{1\le i\le n} f(\phi (x_{k_i}),\theta _x)\bigg )\xrightarrow {P} \frac{1}{n}\bigg (\sum _{1\le i\le n} f(\phi (x_i),\theta _x)\bigg )\)
Proof
Let \(\xi _i\) be the frequency of \(x_i\) in the sequence \(x_{k_1},\dots ,x_{k_n}\). Then we have
Applying the law of large numbers, we have
which completes the proof.
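Part 2 can be illustrated numerically. The following is a minimal sketch under assumptions not stated in the paper: a one-dimensional feature space with \(\phi \) taken as the identity, training points and fresh samples drawn uniformly, and a hypothetical threshold classifier f. The average of f over the nearest training neighbors of the \(y_i\) then matches the average of f over the training points themselves up to sampling noise:

```python
import bisect
import random

random.seed(0)

def f(x, theta=0.5):
    """Toy classification function: 1 if x lies above the threshold theta."""
    return 1 if x > theta else 0

n = 20000
xs = sorted(random.random() for _ in range(n))   # training points x_1..x_n
ys = [random.random() for _ in range(n)]         # fresh samples y_1..y_n

def nearest(y, pts):
    """Nearest training point to y (pts must be sorted)."""
    j = bisect.bisect_left(pts, y)
    cands = [pts[k] for k in (j - 1, j) if 0 <= k < len(pts)]
    return min(cands, key=lambda p: abs(p - y))

# (1/n) sum_i f(phi(x_{k_i})): average of f over nearest neighbors of the y_i
mean_nn = sum(f(nearest(y, xs)) for y in ys) / n
# (1/n) sum_i f(phi(x_i)): average of f over the training points themselves
mean_tr = sum(f(x) for x in xs) / n

print(abs(mean_nn - mean_tr))  # close to 0 for large n
```

The two means disagree only through samples landing near cell boundaries and through the frequencies \(\xi _i\) deviating from their expectations, both of which vanish as n grows.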
1.1.5 Proof of Lemma 8
Proof
Let \(\mathbf {v}^{*}= {<}v_1^{*},\dots ,v_n^{*}{>}^{T}\) be the optimal solution. If \(v_1^{*} \ne 0\) and \(v_1^{*} \ne 1\), i.e., \(v_1^{*}\in (0,1)\), then at the point \(\mathbf {v}^{*}= {<}v_1^{*},\dots ,v_n^{*}{>}^T\) the partial derivative of the cost function is equal to zero, i.e.,
We can divide the terms of \(f_C(\mathbf {v})\) into three groups: the terms ending with \((1-v_1)\), the terms beginning with \(v_1\), and the terms not containing \(v_1\), i.e.,
Let \(k=|\{c \mid c=(1-v_1) \wedge c\in C\}|\). At the point \({<}v_1^{*},\dots ,v_n^{*}{>}^T\), we have the equation
Therefore,
Note that the optimal value of the cost function at the point \(\mathbf {v}^{*}= {<}v_1^{*},\dots ,v_n^{*}{>}^T\) is
Hence, \(\mathbf {v}^{'}= {<}0,v_2^{*},\ldots ,v_n^{*}{>}^{T}\) is another optimal solution. Repeating this operation, we eventually obtain an optimal solution \(\mathbf {v}^{'}\) in which each \(v_i\) is either 0 or 1.
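The vertex-rounding argument can be checked numerically, assuming (as the grouping by \(v_1\) and \(1-v_1\) suggests) that \(f_C(\mathbf {v})\) is a sum of product terms over literals \(v_i\) and \(1-v_i\), and is therefore linear in each coordinate. A sketch with a randomly generated cost of this hypothetical form:

```python
import itertools
import random

random.seed(1)
n = 4

# Hypothetical cost: a random weighted sum of product terms, each term a
# product of literals v_i or (1 - v_i) -- multilinear in v.
terms = []
for _ in range(8):
    idx = random.sample(range(n), random.randint(1, n))
    signs = [random.choice([True, False]) for _ in idx]
    terms.append((random.uniform(-1, 1), list(zip(idx, signs))))

def cost(v):
    total = 0.0
    for w, lits in terms:
        p = w
        for i, pos in lits:
            p *= v[i] if pos else (1 - v[i])
        total += p
    return total

# Minimum over the 0/1 vertices of the hypercube.
best_vertex = min(cost(v) for v in itertools.product([0, 1], repeat=n))
# No interior point of [0,1]^n does better than the best vertex.
best_interior = min(cost([random.random() for _ in range(n)])
                    for _ in range(20000))
print(best_vertex <= best_interior + 1e-9)
```

Because a multilinear function restricted to any single coordinate is affine, sliding that coordinate to 0 or 1 never increases the minimum, which is exactly the repeated operation used in the proof.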
1.1.6 Proof of Lemma 12
Lemma 12
When \(m\ge 1\), \(g(x)=\frac{\ln (1-x^{m})}{\ln (1-x)}\) is a monotonically increasing function on the interval \(x\in (0,1)\).
Proof
For any increment dx and the functions \(g_1(x)=\ln (1-x^{m})\) and \(g_2(x)=\ln (1-x)\), we have
Note that
Furthermore, we have \(\xi (0)=1\) and \(\xi (1)=0\), where \(\xi (x)=1+(m-1)x^{m}-mx^{m-1}\). Besides, we have
Therefore,
Let \(c=-\frac{1}{1-x}\). We have
Because
We can find an \(\epsilon >0\) such that \(\forall dx( 0 \le dx\le \epsilon \rightarrow d<0)\) and
where \(d=cdx+o(dx)\).
1.1.7 Proof of Lemma 10
We have
When \(x \in (0,1)\), we have \((1-x)^k>(1-x^m)\) iff \(k < \frac{\ln (1-x^m)}{\ln (1-x)} \le 1\). For any \(y \in (0,1)\),
Let \(k=\frac{\ln (1-y^m)}{\ln (1-y)}\). Notice that
Therefore,
According to Lemma 12, we further have
When \(y=n^{-\frac{1}{m}}\), we have \(\frac{1}{2kn+1}\in O(n^{-\frac{1}{m}})\). Because \(\lim _{y\rightarrow 0}\frac{1-\sqrt{1-y}}{y}=\frac{1}{2}\), we have \(1-\sqrt{1-y}\in O(y) = O(n^{-\frac{1}{m}})\). Therefore,
Similarly, we have
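The inequality used at the start of this proof, \((1-x)^k>(1-x^m)\) iff \(k < \frac{\ln (1-x^m)}{\ln (1-x)}\), can be sanity-checked numerically (taking logs flips the inequality because \(\ln (1-x)<0\)):

```python
import math

def ratio(x, m):
    # the threshold exponent ln(1 - x^m) / ln(1 - x)
    return math.log(1 - x**m) / math.log(1 - x)

m = 3
for x in (0.1, 0.5, 0.9):
    r = ratio(x, m)
    # test one exponent below and one above the threshold
    for k in (0.5 * r, 1.5 * r):
        lhs = (1 - x) ** k
        rhs = 1 - x ** m
        assert (lhs > rhs) == (k < r)
print("inequality direction matches k vs ln(1-x^m)/ln(1-x)")
```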
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Pan, Z., Xue, J., Ge, T. (2022). Intuitive Searching: An Approach to Search the Decision Policy of a Blackjack Agent. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Sixth International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 236. Springer, Singapore. https://doi.org/10.1007/978-981-16-2380-6_77
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2379-0
Online ISBN: 978-981-16-2380-6
eBook Packages: Intelligent Technologies and Robotics (R0)