Toward a Theory of Vulnerability Disclosure Policy: A Hacker’s Game

Canann, Taylor J.

doi:10.1007/978-3-030-32430-8_8

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11836)

Included in the following conference series:

International Conference on Decision and Game Theory for Security

Abstract

A game between software vendors, heterogeneous software users, and a hacker is introduced in which software vendors attempt to protect software users by releasing updates, i.e. disclosing a vulnerability, and the hacker is attempting to exploit vulnerabilities in the software package to attack the software users. The software users must determine whether the protection offered by the update outweighs the cost of installing the update. Following the model is a description of why the disclosure of vulnerabilities can only be an optimal policy when the cost to the hacker of searching for a Zero-Day vulnerability is small. The model is also extended to discuss Microsoft’s new “extended support” disclosure policy.

I am grateful to Richard Evans, Kerk Phillips, the BYU MCL workshops, Brennan Platt, Brad Greenwood, Robert Mrkonich, Samuel Kaplan, Kenneth Judd, Chase Coleman, Ryne Belliston, Jan Werner, David Rahman, and Aldo Rustichini for very helpful comments and to Alexander Pingry for excellent research assistance. Additional comments and proofs can be found in the online mathematical appendix.

1 Introduction

In May of 2017 the WannaCry attacks infected over 300,000 systems in 150 countries, at an estimated cost of approximately $\$4$ billion. One month later, the NotPetya attacks, another major global attack that primarily targeted Ukrainian systems, began. The costs of the NotPetya attacks were even larger than those of the WannaCry attacks and have been estimated at around $\$10$ billion. Following the NotPetya attacks, the Retefe banking Trojan began leveraging the EternalBlue exploit in September. Finally, in August of 2018 the Taiwan Semiconductor Manufacturing Company, an Apple chip supplier, was hit by a new variant of the WannaCry attack that cost the company approximately $\$170$ million. The problem was not that Windows is an inherently flawed system; rather, these attacks could have been avoided if users and firms had only updated. Microsoft had patched the underlying vulnerability in March of 2017, in its monthly, second-Tuesday update.

This is not just a problem with Microsoft software: every piece of software, no matter what care is taken by the software vendor, is riddled with vulnerabilities, which leaves its users open to attack by hackers. To protect users, software vendors release patches to address the vulnerabilities they find, but this is a double-edged sword. Releasing updates, a.k.a. vulnerability disclosure, may in fact increase the susceptibility of current users to attack, in particular those who choose not to install the updates immediately. This is because an update can be reverse engineered quite easily by hackers. These types of hacks have been gaining in prevalence over the last few years.

In a seminal paper in the field of vulnerability disclosure, [1] asked if finding vulnerabilities is optimal for social welfare. Since then, vulnerability disclosure policy has been greatly debated in the literature. The model outlined in this paper explores the decisions made by both the network of users and a hacker given the different policy regimes that could be implemented by the vendor. The interaction between vendors and software users was first modeled by [2], in which they find that vendors will always want to delay the release of patches, but this action is not socially optimal. However, [2] do not pose an answer to whether a vendor should engage in disclosing vulnerabilities, which is the main focus of this paper.

One of the first papers on the economic modeling of hacker behavior was developed in [3], where the authors attempt to estimate the effects of the fixed costs of hacking on the incentives of a profit-maximizing hacker. Much of the recent literature that has attempted to model hacker behavior, e.g. see [4], follows models similar to the Becker model of criminal behavior [5], but this approach assumes that: (i) law enforcement can easily track and find a hacker and (ii) hackers can easily be prosecuted. These assumptions are the exception, not the rule. To break from this convention, the hacker is modeled as a profit-maximizing agent in order to incorporate hacker behavior into the vulnerability disclosure debate.

Both the network framework of software users and the hacker behaving as a profit maximizing agent are extensions of the work in [6], where they focus on the welfare effects of disclosure policy for a representative set of users with the vendor facing a monopolistically competitive market. This paper follows the notation in [6] rather closely so as to maintain a consistent notational scheme within the vulnerability disclosure literature.

Others have examined how attack propensity changes under different disclosure regimes (e.g. [7]), and have found that releasing patches tends to increase the number of attacks. This model identifies the reasons for this observed increase in attacks as being driven by the decisions of both the hacker and the software users. The hacker’s decision is driven by parameters such as the probability of a successful attack, as well as the costs of finding new vulnerabilities in the software package. The software users must balance the value they place on using the software against the expected damages of an attack and the cost of updating their machines. Therefore, this model is able to give a causal explanation of the relationship between attack propensity and disclosure regimes, which strengthens the story behind these correlations.

In order to describe the best type of disclosure policy, a model of a heterogeneous network made up of an interconnected set of software users that are attempting to defend themselves against a profit-maximizing hacker is developed in this paper. Within my model, there are three decisions to be made: (i) The strategy of attack to be played by the hacker, (ii) The optimal disclosure policy, and (iii) The updating decision made by the software user. Following the model setup, welfare maximizing policies are formulated to decrease a hacker’s efforts in infiltrating networks and increase the software users’ utility.

The optimal strategy for the software vendor, the optimal policy, is to maximize the egalitarian sum of utilities of the software users, i.e. the vendor acts as a social planner. The optimal policy is dependent both on the distribution of software users on the network and how costly finding a previously unknown vulnerability, i.e. a Zero-Day vulnerability, is for the hacker. Software users that do not expect to bear the majority of the burden of an attack, known as low-type users, do not want vulnerabilities to be disclosed, i.e. a Non-Disclosure policy, since they will not update their machines, deeming it too costly. Also, if the cost of searching for a Zero-Day exploit is higher than the expected payoff of the exploit, then the hacker is not willing to expend the energy searching for a Zero-Day, and Non-Disclosure is the optimal policy. Therefore, the only case in which Disclosure can be an optimal policy is when search costs are low and there are enough users that desire to update their machines.

Starting in January of 2020, Microsoft will no longer support Windows 7 unless users enroll in Extended Support. The final result of the paper is that Microsoft’s new policy increases the cost of exploiting the disclosed vulnerability and that, even though the policy increases the cost of updating, under certain parameter values the policy causes the software users to receive higher payoffs. This new approach to disclosure policy can increase overall welfare relative to the policy of disclosing all vulnerabilities.

The remainder of the paper is organized as follows. Section 2 introduces the model, and Sect. 3 presents the first main contribution: a discussion of optimal policy when the hacker is a decision-making agent. Section 4 analyzes the policy newly proposed by Microsoft, and Sect. 5 concludes.

2 Static Game

The players within this static game are the software vendor, a hacker, and the software users. The software vendor follows a welfare-maximizing disclosure policy, and thus determines the rules of the game. The hacker maximizes his profits by choosing among three strategies: searching for a Zero-Day, exploiting the patch released by the software vendor (an N-Day attack), or exiting the game. Lastly, each software user must decide whether or not to update her machine if a vulnerability is disclosed, i.e. an update is released.

Table 1. Notations used in the paper

The vendor of the software package is only concerned with maximizing software user welfare in an egalitarian manner, similar to a social planner. The vendor is unable to detect all vulnerabilities before selling the software, but the vendor, under a Disclosure policy, will attempt to find these vulnerabilities ex post. The probability of finding a vulnerability, $\alpha $, can be thought of as arising either from individual vendors searching for vulnerabilities by themselves or from bounty systems such as Microsoft’s Bounty System (see, e.g., [8,9,10,11]).

The software user’s weight parameter in Table 1 can be thought of as the network centrality of the software user, or as how desirable the information found on the software user’s machine is to the hacker. Because the hacker attempts to exploit the network and vendors are unable to solve all vulnerabilities ex ante, each software user is vulnerable to an attack. The hacker is only able to extract as much information as is available to software user i. This damage can also be thought of as a direct transfer from the software user to the hacker when the software user is hacked.

The vendor does not usually charge the software users to install the updates, but the updates are still costly in terms of opportunity costs, i.e. the time to install the update. Updates often require software users to stop working or even shut down their machines, thus $c_u>0$. For simplicity, this cost is assumed to be a fixed cost to be paid if the software user decides to update. To model the fact that some people do not update under any policy and that there exists at least one software user that might update, the following assumption is made:

Assumption 1

Let $\theta _1<\frac{c_u}{v+D}<\theta _m$.

A single hacker attempts to exploit vulnerabilities to maximize profits by gaining access to the network of software users. The hacker’s profit function depends on both the chosen policy and the software users’ optimal updating decision. The hacker has two types of exploitation available to him: he can hack via a known vulnerability, an N-Day exploit, or via a previously unknown vulnerability, a Zero-Day attack. The information available to the hacker consists of the policy played by the vendor, the distribution of software user weights, the strategies available to the software users, and the probability of a successful Zero-Day attack.

Hacking, however, is not costless. A constant search cost, or opportunity cost of searching for a Zero-Day, is assumed. If the hacker decides to exploit a known vulnerability, meaning to attack a vulnerability that was just patched by the vendor, then the hacker’s cost of hacking is assumed to be zero. This is to account for the relative ease of reverse engineering an update to find the vulnerability in the code.

Under a Non-Disclosure regime, the hacker is only able to search for Zero-Day exploits or exit the game; while the software user makes no decision under this regime. If policy dictates that a Disclosure regime is optimal, then the hacker can still search for Zero-Day exploits or exit the game as in the Non-Disclosure regime, but he can also choose to exploit the updated vulnerability on all machines that have not had the patch installed. Given a disclosed vulnerability, each software user must decide whether to install the update on her machine.

2.1 Non-Disclosure Regime

If the vendor chooses ND, or is forced to withhold this information, then software users do not make any decisions; they just use the software to gain, at most, $v\theta _i$. The hacker’s action set is defined as $A_{nd}\in \{S, X\}$. The utility payoff of software user i, $U^i_{nd}:A_{nd}\times \theta _i\rightarrow \mathbb {R}$, is equal to $v\theta _i$ if they are not exploited or $-D\theta _i$ if the hacker was successful in finding a Zero-Day.

Table 2. Hacker expected payoff functions: Non-Disclosure

Via Table 2, the hacker will only choose (S) if the cost of searching for a Zero-Day is low, i.e. the expected payoff is positive. If $c_s<\delta D\sum _{i\in I} \theta _i$, then in the Nash equilibrium the hacker searches for Zero-Days, $A^*_{nd}=(S)$. However, if $c_s>\delta D\sum _{i\in I}\theta _i$, then under Non-Disclosure, the unique Nash equilibrium is to exit the game, $A^*_{nd}=(X)$.
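The search-or-exit rule above can be illustrated with a minimal Python sketch. It assumes, consistent with the stated threshold, that the hacker's expected payoff from (S) is $\delta D\sum _{i\in I}\theta _i-c_s$ and that exiting pays zero; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def non_disclosure_equilibrium(thetas, delta, D, c_s):
    """Hacker's equilibrium action under Non-Disclosure.

    thetas : list of software user weights theta_i
    delta  : probability of a successful Zero-Day attack
    D      : damage scale parameter
    c_s    : cost of searching for a Zero-Day
    Returns "S" (search), "X" (exit), or "indifferent" at the knife edge.
    """
    # Assumed expected gain from searching, before the search cost.
    expected_gain = delta * D * sum(thetas)
    if c_s < expected_gain:
        return "S"
    if c_s > expected_gain:
        return "X"
    return "indifferent"


if __name__ == "__main__":
    weights = [0.2, 0.5, 1.0]  # illustrative weights only
    print(non_disclosure_equilibrium(weights, delta=0.3, D=10.0, c_s=4.0))
    print(non_disclosure_equilibrium(weights, delta=0.3, D=10.0, c_s=6.0))
```

With these illustrative numbers the expected gain from searching is about 5.1, so a search cost of 4 leads to (S) and a search cost of 6 leads to (X).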

2.2 Disclosure Regime

If the vendor chooses D, the vendor releases updates every time they find a vulnerability. Each software user then must choose whether to update; these choices endogenously define the two sets $\varGamma _{nu}$ and $\varGamma _u$ as the set of software users that do not update and the set of software users that do update, respectively, and $\xi =|\varGamma _u|$ as the number of software users that update under a Disclosure policy. When a software user chooses to update, she protects her machine from N-Day exploits, but is still vulnerable to Zero-Days. Due to the costly nature of updating, some software users may choose not to update, leaving their computers open to both Zero-Day and N-Day hacks (see, e.g., [12]).

Now there are two stages within the game, the first being the possible release of updates by the vendor, which happens with probability $\alpha $, followed by the game between the hacker and the software users. When the vendor is unable to find a vulnerability, the game is identical to that of the Non-Disclosure regime in Sect. 2.1. The hacker’s action set when the vendor is unable to find a vulnerability within the Disclosure regime is denoted as $A^{1-\alpha }_d\in \{S, X\}$. When the vendor finds a vulnerability and releases an update, then both the hacker and the software user must choose their actions, $A^{\alpha }_d\in \{E, S, X\}$ and $A^i\in \{u, nu\}$, respectively.

The expected utility of the software user i is defined as the function $U^i_d:A^{\alpha }_d\times A^{1-\alpha }_d\times A^i\times \theta _i\rightarrow \mathbb {R}$, where she receives $v\theta _i$ if her machine is not exploited, $-D\theta _i$ if the hacker is successful in attacking her machine, and $-c_u$ if she decides to update.

Table 3. Hacker expected payoff functions: Disclosure

There are three main drivers of the Nash equilibria under Disclosure:

(a) Do there exist any software users that choose not to update when an update is released? (Notice that this is always satisfied via Assumption 1.)
(b) If there is no update released, does the cost of finding a Zero-Day exceed the expected profits of searching? I.e.
$$\begin{aligned} c_s\lessgtr \delta D\sum _{i\in I}\theta _i. \end{aligned}$$
(1)
(c) If a vulnerability is disclosed, does the cost of finding a Zero-Day exceed the expected profits of searching? I.e.
$$\begin{aligned} c_s\lessgtr \widehat{\delta }D\sum _{i\in I}\theta _i. \end{aligned}$$
(2)

The first case to examine is when the cost of searching is high, i.e. $c_s>\delta D\sum _i\theta _i$. Since both the hacker and the software users know whether an update has been released, the solution can be split into the Non-Disclosure and the Disclosure sub-games. Similar to the Non-Disclosure case when there are high search costs, in the Disclosure game when no vulnerability is found, $A^{(1-\alpha )*}_d=(X)$ is the equilibrium of the sub-game.

Since the search costs are high for the hacker and there exists at least one software user that does not update, $A^{\alpha *}_d=(E)$ is the only strategy to survive elimination of strictly dominated strategies for the hacker, and is thus the only strategy in the hacker’s best response. Given the hacker strategy (E), the best response of software user i is to not update, i.e. $i\in \varGamma _{nu}^*$, if $\theta _i<\frac{c_u}{v+D}$. Otherwise, for software user j such that $\theta _j>\frac{c_u}{v+D}$, updating is optimal, $j\in \varGamma _u^*$. If $\theta _i=\frac{c_u}{v+D}$, then she is indifferent between any mixture $p_i\in [0,1]$ of Update and Not Update.

Therefore, the Nash equilibrium of the Disclosure game under high search costs is

$$\begin{aligned} ((A^{\alpha *}_d,A^{(1-\alpha )*}_d),(A^*_i)_{i\in I})=((E,X),(nu)_{i\in \varGamma _{nu}^*},(u)_{j\in \varGamma _{u}^*}) \end{aligned}$$

(3)

Where $\varGamma _{nu}^*=\left\{ i\in I| \theta _i<\frac{c_u}{v+D}\right\} $ and $\varGamma ^*_u=\left\{ j\in I| \theta _j>\frac{c_u}{v+D}\right\} $.
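The partition of software users behind Eq. 3 can be sketched in a few lines of Python. This is only an illustration of the cutoff rule $\theta _i\lessgtr \frac{c_u}{v+D}$ when the hacker plays (E); the function name and sample parameters are hypothetical.

```python
def updating_partition(thetas, c_u, v, D):
    """Split users by the cutoff c_u / (v + D), given the hacker plays (E).

    Returns (threshold, no_update, update, indifferent), where the last
    three entries are lists of user indices into `thetas`.
    """
    threshold = c_u / (v + D)
    no_update = [i for i, t in enumerate(thetas) if t < threshold]     # Gamma_nu*
    update = [i for i, t in enumerate(thetas) if t > threshold]        # Gamma_u*
    indifferent = [i for i, t in enumerate(thetas) if t == threshold]  # may mix
    return threshold, no_update, update, indifferent


if __name__ == "__main__":
    # Illustrative values only: cutoff = 3 / (5 + 5) = 0.3
    print(updating_partition([0.1, 0.4, 0.9], c_u=3.0, v=5.0, D=5.0))
```

In the illustrative call, the user with weight 0.1 stays unpatched while the users with weights 0.4 and 0.9 update, so $\xi =2$.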

The next case, denoted the medium search cost case, is when searching is profitable when the vendor is unable to find the vulnerability but not when the vulnerability is disclosed by the vendor, i.e. $\widehat{\delta } D\sum _{i\in I}\theta _i\le c_s<\delta D\sum _{i\in I}\theta _i$. If the vendor is unable to find a vulnerability, the expected profits of searching still exceed the cost of searching, and thus $A^{(1-\alpha )*}_d =(S)$ is his best response. However, when the vendor finds a vulnerability, the cost of (S) surpasses its expected profits, so playing (S) when a vulnerability is disclosed yields a strictly lower payoff than (X). Since there always exist software users that do not update, the best action for the hacker to play is $A^{\alpha *}_d=(E)$.

Then, notice that all software users such that $\theta _i<\frac{c_u}{v+D}$ will be in $\varGamma _{nu}^*$, and all software users $\theta _j>\frac{c_u}{v+D}$ will be in $\varGamma _u^*$. Therefore, the Nash equilibrium of the medium search cost case is

$$\begin{aligned} ((A^{\alpha *}_d,A^{(1-\alpha )*}_d),(A^*_i)_{i\in I})=((E,S),(nu)_{i\in \varGamma _{nu}^*},(u)_{j\in \varGamma _{u}^*}) \end{aligned}$$

(4)

Where $\varGamma _{nu}^*=\left\{ i\in I| \theta _i<\frac{c_u}{v+D}\right\} $ and $\varGamma ^*_u=\left\{ j\in I| \theta _j>\frac{c_u}{v+D}\right\} $.

The final case is to determine what happens when searching yields positive profits under both branches of the game, i.e. $c_s<\hat{\delta }D\sum _i\theta _i$. In this low search cost case, with probability $1-\alpha $, we obtain the same solution as in the Non-Disclosure game in Sect. 2.1, i.e. $A^{(1-\alpha )*}_d=(S)$.

The next step is to determine the best response of both the hacker and each software user when an update is released. The first thing to notice is that (X) is never a best response, since exiting gives a payoff of zero while (S) and (E) both yield positive expected payoffs. Given the hacker strategy (E), not updating, i.e. $i\in \varGamma _{nu}^*$, is software user i’s best response so long as $\theta _i<\frac{c_u}{v+D}$.

If $\theta _j>\frac{c_u}{v+D}$, then software user j’s best response is $j\in \varGamma _u^*$. Whenever the hacker plays (S), updating will not protect the software user from a hack, and thus, $i\in \varGamma _{nu}^*$ is the best response for all $i\in I$.

Allowing the hacker to use mixed strategies introduces the probability $\rho \in (0,1)$, where $\rho $ is the probability that the hacker chooses (E) and $(1-\rho )$ is the probability that he chooses (S). Using the expected payoffs of the software users given $\rho $, any software user i’s best response is $i\in \varGamma _{nu}^*$ when $\theta _i<\frac{c_u}{\rho (v+D)}$. Notice that for any $\rho \in [0,\frac{c_u}{\theta _{m}(v+D)})$, (nu) is the best response for all software users. For all software users j such that $\theta _j>\frac{c_u}{\rho (v+D)}$, $j\in \varGamma _{u}^*$ is their optimal action. For any software user k such that $\theta _k=\frac{c_u}{\rho (v+D)}$, the software user is indifferent between updating and not updating, and will mix with probability $p_k\in [0,1]$, where $p_k$ is the probability of choosing (u).
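As a hedged sketch of where this cutoff comes from, assume (the hacker payoff table is not reproduced here, so this is an assumption) that a user who has not updated is hacked with certainty when the hacker plays (E), and that every user is hacked with probability $\widehat{\delta }$ when the hacker plays (S) after a disclosure. Writing $EU^i(u)$ and $EU^i(nu)$ for user i’s expected utilities, labels introduced only for this illustration, gives

$$\begin{aligned} EU^i(u)&=\rho v\theta _i+(1-\rho )\bigl [(1-\widehat{\delta })v\theta _i-\widehat{\delta }D\theta _i\bigr ]-c_u,\\ EU^i(nu)&=-\rho D\theta _i+(1-\rho )\bigl [(1-\widehat{\delta })v\theta _i-\widehat{\delta }D\theta _i\bigr ],\\ EU^i(u)-EU^i(nu)&=\rho (v+D)\theta _i-c_u. \end{aligned}$$

Under these assumptions the Zero-Day terms cancel, so not updating is strictly better exactly when $\theta _i<\frac{c_u}{\rho (v+D)}$, and setting $\rho =1$ recovers the pure-strategy cutoff $\frac{c_u}{v+D}$.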

The next step is to examine the best response of the hacker when a vulnerability is disclosed, given the software users’ strategies. If all of the software users update, i.e. $\varGamma _u=I$, then the best response for the hacker is $A^{\alpha *}_d=(S)$. Similarly, if the software user strategy is $\varGamma _{nu}=I$, then $A^{\alpha *}_d=(E)$ is the only strategy in the best response for the hacker.

Due to the monotonicity of the software users, and thus their optimal actions, all that is left to do is to split I between high- and low-type users. Define $\varOmega \equiv \left\{ j\in I|\theta _j\ge \frac{c_u}{v+D}\right\} $ as the set of high-type software users, i.e. the users that will update if the hacker chooses (E). For some $k\in \varOmega $, define $\varGamma ^k_{nu}=\left\{ i\in I| \theta _i<\theta _k\right\} $ and $\varGamma ^k_u=\left\{ j\in I|\theta _j>\theta _k\right\} $. Given a software user strategy of $(\varGamma ^k_{nu},(p_k(u),(1-p_k)(nu)),\varGamma ^k_u)$, for some mixed strategy $p_k\in [0,1]$ for software user k, then the hacker’s expected payoff of mixing with $\rho \in [0,1]$ between exploiting and searching is

$$\begin{aligned} \rho \biggl [D\sum _{i\in \varGamma ^k_{nu}}\theta _i+(1-p_{k})D\theta _{k}\biggr ]+(1-\rho )\biggl [\widehat{\delta }D\sum _{i\in I}\theta _i-c_s\biggr ] \end{aligned}$$

(5)

For all $\rho \in [0,1]$, if

$$\begin{aligned} c_s>\widehat{\delta }D\sum _{i\in I}\theta _i-D\sum _{i\in \varGamma ^k_{nu}}\theta _i-\bigl (1-p_{k}\bigr )D\theta _{k} \end{aligned}$$

(6)

then $\rho ^*=1$ is the best response for the hacker given the software users’ strategy.

However, if for every value $\rho \in [0,1]$,

$$\begin{aligned} c_s<\widehat{\delta }D\sum _{i\in I}\theta _i-D\sum _{i\in \varGamma ^k_{nu}}\theta _i-\bigl (1-p_{k}\bigr )D\theta _{k} \end{aligned}$$

(7)

then the hacker’s best response is $\rho ^*=0$.

The last case is if there exists a $p_{k}\in [0,1]$ such that Inequality 6 holds with equality, i.e.

$$\begin{aligned} c_s=\widehat{\delta }D\sum _{i\in I}\theta _i-D\sum _{i\in \varGamma ^k_{nu}}\theta _i-\bigl (1-p_{k}\bigr )D\theta _{k} \end{aligned}$$

(8)

then any $\rho ^*\in [0,1]$ is the hacker’s best response to the software users’ strategy of $(\varGamma ^k_{nu},(p_{k}(u),(1-p_{k})(nu)),\varGamma ^k_u)$.
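The three cases above amount to comparing the two bracketed terms of the hacker’s expected payoff in Eq. 5. A minimal Python sketch of that comparison follows; it assumes distinct user weights, and the function name and sample values are hypothetical.

```python
def hacker_best_response(thetas, k, p_k, delta_hat, D, c_s):
    """Best-response probability rho of playing (E) after a disclosure.

    Users with theta below theta_k do not update, user k updates with
    probability p_k, and users above theta_k update.
    Returns 1.0 (exploit), 0.0 (search), or None when any rho in [0, 1]
    is a best response (the equality case).
    """
    theta_k = thetas[k]
    # Payoff from (E): damage collected from every unpatched machine.
    exploit_payoff = D * (sum(t for t in thetas if t < theta_k)
                          + (1.0 - p_k) * theta_k)
    # Payoff from (S): expected Zero-Day damage net of the search cost.
    search_payoff = delta_hat * D * sum(thetas) - c_s
    if exploit_payoff > search_payoff:
        return 1.0
    if exploit_payoff < search_payoff:
        return 0.0
    return None


if __name__ == "__main__":
    weights = [0.1, 0.4, 0.9]  # illustrative weights only
    print(hacker_best_response(weights, k=1, p_k=1.0, delta_hat=0.5, D=10.0, c_s=1.0))
    print(hacker_best_response(weights, k=1, p_k=1.0, delta_hat=0.5, D=10.0, c_s=6.5))
```

The first call corresponds to a cheap search (the hacker searches), and the second to an expensive one (the hacker exploits the N-Day).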

Theorem 1

Let $k_{min}\in \varOmega $ be the minimal software user in $\varOmega $. If Inequality 6 holds for $p_{k_{min}}=1$, then the Nash Equilibrium is

$$\begin{aligned} ((A^{\alpha *}_d,A^{(1-\alpha )*}_d),(A^*_i)_{i\in I})=((E,S),(nu)_{i\in \varGamma _{nu}^{*}},(u)_{j\in \varGamma _{u}^{*}}) \end{aligned}$$

(9)

Where $\varGamma _{nu}^*=\bigl \{i\in I|\theta _i<\frac{c_u}{v+D}\bigr \}$ and $\varGamma _{u}^*=\bigl \{i\in I|\theta _i>\frac{c_u}{v+D}\bigr \}$.

Otherwise, there exists a pivotal software user, $k^*\in \varOmega $, and a mixed strategy for software user $k^*$, $p^*_{k^*}\in [0,1]$, such that Eq. 8 holds, and the Nash equilibrium is

$$\begin{aligned}&((A^{\alpha *}_d,A^{(1-\alpha )*}_d),(A^*_i)_{i\in I})\\&=((\rho ^*(E,S),(1-\rho ^*)(S,S)),(nu)_{i\in \varGamma _{nu}^{k*}},(p^*_{k^*}(u),(1-p^*_{k^*})(nu)),(u)_{j\in \varGamma _{u}^{k*}})\nonumber \end{aligned}$$

(10)

Where $\rho ^*=\frac{c_u}{\theta _{k^*}(v+D)}$, $\varGamma _{nu}^{k*}=\{i\in I|\theta _i<\theta _{k^*}\}$, and $\varGamma _{u}^{k*}=\{i\in I|\theta _i>\theta _{k^*}\}$.
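One way to locate the pivotal software user numerically is sketched below in Python. This is not the paper’s procedure: it is a simple linear scan that assumes distinct weights, the low search cost case, and at least one high-type user, and that solves Eq. 8 for the pivotal user’s mixing probability. All names and sample values are hypothetical.

```python
def disclosure_low_cost_equilibrium(thetas, c_u, v, D, delta_hat, c_s):
    """Sketch of the Theorem 1 case split for the disclosure sub-game.

    Returns a dict with the hacker's probability `rho` of playing (E), the
    index `pivot` of the pivotal user in the sorted weight list (or None),
    and that user's update probability `p` (or None).
    """
    thetas = sorted(thetas)
    search_payoff = delta_hat * D * sum(thetas) - c_s  # hacker payoff from (S)
    cutoff = c_u / (v + D)
    omega = [k for k, t in enumerate(thetas) if t >= cutoff]  # high types
    if not omega:
        raise ValueError("no high-type user; Assumption 1 is violated")

    # Case 1: Inequality 6 holds with p = 1 for the minimal high-type user,
    # so exploiting the low types alone already beats searching.
    if D * sum(thetas[:omega[0]]) > search_payoff:
        return {"rho": 1.0, "pivot": None, "p": None}

    # Case 2: scan for the user k and probability p that solve Eq. 8,
    # i.e. D * (sum of weights below k + (1 - p) * theta_k) = search_payoff.
    for k in omega:
        below = sum(thetas[:k])
        p = 1.0 - (search_payoff / D - below) / thetas[k]
        if 0.0 <= p <= 1.0:
            rho = c_u / (thetas[k] * (v + D))  # makes user k indifferent
            return {"rho": rho, "pivot": k, "p": p}
    return None  # no solution under the stated assumptions


if __name__ == "__main__":
    print(disclosure_low_cost_equilibrium([0.1, 0.4, 0.9], 3.0, 5.0, 5.0, 0.5, 1.5))
```

In the illustrative call, the user with weight 0.4 is pivotal, updates with probability of about 0.25, and the hacker exploits with probability of about 0.75.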

3 Welfare Analysis

In this section, the “Optimal Disclosure Policy” is first defined followed by solving for the optimal policy for each of the different search cost scenarios found in Sect. 2.

Definition 1

The optimal policy $\varPsi ^*\in \{\text {Disclosure}, \text {Non-Disclosure}\}$ is chosen such that:

$$\begin{aligned} \varPsi ^*=argmax_{\psi \in \{d, nd\}}\left\{ \sum _{i\in I}U_d(A^{\alpha *}_d,A^{(1-\alpha ) *}_d,A^*_i,\theta _i), \sum _{i\in I}U_{nd}(A^*_{nd},\theta _i)\right\} \end{aligned}$$

(11)

Where $((A^{\alpha *}_d,A^{(1-\alpha ) *}_d), (A^*_i)_{i\in I})$ and $(A^*_{nd})$ are the Nash equilibria under Disclosure and Non-Disclosure, respectively.

Under High Search Costs, recall that in the Nash equilibrium the hacker chooses to exploit the N-Day under Disclosure and to exit the game under Non-Disclosure. Under Disclosure, all low-type software users, the software users in $\varGamma _{nu}^*$, are hacked if a vulnerability is found; while all other software users must pay the cost of updating, which is assumed to be strictly greater than zero. Under Non-Disclosure, the hacker exits the game, and all software users obtain $\theta _iv$. Then the optimal policy is Non-Disclosure.
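The comparison behind this conclusion can be written out as a short sketch. Assume that a vulnerability is found with probability $\alpha >0$, that no software user sits exactly at the cutoff $\frac{c_u}{v+D}$, and write $W_{nd}$ and $W_d$ (labels introduced here for illustration only) for the sums of software users’ utilities under Non-Disclosure and Disclosure. Then

$$\begin{aligned} W_{nd}&=v\sum _{i\in I}\theta _i,\\ W_d&=\alpha \biggl [-D\sum _{i\in \varGamma _{nu}^*}\theta _i+\sum _{j\in \varGamma _u^*}(v\theta _j-c_u)\biggr ]+(1-\alpha )v\sum _{i\in I}\theta _i,\\ W_{nd}-W_d&=\alpha \biggl [(v+D)\sum _{i\in \varGamma _{nu}^*}\theta _i+\xi ^*c_u\biggr ]>0. \end{aligned}$$

The difference is strictly positive because $c_u>0$ and, by Assumption 1, at least one software user updates and at least one does not.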

If $\widehat{\delta }D\sum _{i\in I}\theta _i\le c_s<\delta D\sum _{i\in I}\theta _i$, i.e. the medium search cost case, then under a Non-Disclosure regime the hacker searches for a Zero-Day. However, under Disclosure, the hacker chooses to exploit the released vulnerability. Then, solving for the optimal policy is dependent on

$$\begin{aligned} \sum _{i\in \varGamma _{nu}^*}\theta _i+\xi ^*\frac{c_u}{v+D}\lessgtr \delta \sum _{i\in I}\theta _i \end{aligned}$$

(12)

If the expected losses from a Zero-Day exceed the combined cost of the low-type software users being hacked because they did not update and the cost of updating for all $\xi ^*=|\varGamma ^*_u|$ of the high-type software users, then Disclosure is the optimal policy. Thus, the optimal policy under medium search costs is Disclosure if $\sum _{i\in \varGamma _{nu}^*}\theta _i+\xi ^*\frac{c_u}{v+D}<\delta \sum _{i\in I}\theta _i$, and Non-Disclosure if $\sum _{i\in \varGamma _{nu}^*}\theta _i+\xi ^*\frac{c_u}{v+D}>\delta \sum _{i\in I}\theta _i$.
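The medium search cost comparison in Eq. 12 can be checked numerically with a minimal Python sketch. The function does not verify that the parameters actually fall in the medium search cost case, which is left to the caller; the function name and sample values are hypothetical.

```python
def medium_cost_optimal_policy(thetas, c_u, v, D, delta):
    """Evaluate Eq. 12 and return the welfare-maximizing policy.

    Left-hand side: weights of the non-updating (low-type) users plus the
    normalized updating cost of the xi high-type users.
    Right-hand side: expected Zero-Day losses, delta times the sum of weights.
    """
    cutoff = c_u / (v + D)
    low_type_weight = sum(t for t in thetas if t < cutoff)
    xi = sum(1 for t in thetas if t > cutoff)  # number of updating users
    lhs = low_type_weight + xi * cutoff
    rhs = delta * sum(thetas)
    if lhs < rhs:
        return "Disclosure"
    if lhs > rhs:
        return "Non-Disclosure"
    return "indifferent"


if __name__ == "__main__":
    weights = [0.1, 0.4, 0.9]  # illustrative weights only
    print(medium_cost_optimal_policy(weights, c_u=3.0, v=5.0, D=5.0, delta=0.6))
    print(medium_cost_optimal_policy(weights, c_u=3.0, v=5.0, D=5.0, delta=0.4))
```

With these illustrative numbers the left-hand side is about 0.7, so a Zero-Day success probability of 0.6 favors Disclosure while 0.4 favors Non-Disclosure.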

The last case to examine is that of low search costs. Recall that the Nash equilibrium of the Non-Disclosure game is $A^{*}_{nd}=(S)$, while the Nash equilibrium of the Disclosure game takes the form of mixing between (E) and (S) for the hacker while the software users split into $(\varGamma _{nu}^{k*}, (p^*_k(u),(1-p^*_k)(nu)), \varGamma _u^{k*})$. The analysis begins with the optimal policy for all low-type software users, followed by the optimal policy for all high-type software users. To conclude the section, the combined results of both high- and low-type software users are used to find the optimal policy.

For all software users $i\in \varGamma _{nu}^{k*}$, then we are able to analyze which policy they would prefer by solving

(13)

Disclosure is the optimal policy for all software users that do not update so long as

(14)

Notice that both the left-hand side and the right-hand side are strictly positive. Thus, the software users that do not update, software users in $\varGamma _{nu}^{k*}$, will sometimes want the policy to be Disclosure.

High-type software users, $j\in \varGamma _{u}^{k*}$, then face the welfare decision of

(15)

For software users $i\in \varGamma _{nu}^{k*}$, Disclosure decreases the probability of being hacked by a Zero-Day, but it also increases their probability of being hacked since the hacker can exploit the N-Day vulnerability that these software users are not willing to defend against. However, software users $j\in \varGamma _{u}^{k*}$ are more likely to want a Disclosure regime since they obtain both the benefit of hackers having fewer vulnerabilities to search over and protection from the N-Day exploits, since they will sometimes update.

The next step is to examine welfare over all the software users by comparing the sums of all software users’ utility functions. The optimal policy condition can be written as

$$\begin{aligned} \sum _{i\in \varGamma _{nu}^{k*}}\theta _i+\biggl (\frac{D}{v+D}-\widehat{\delta }+\xi ^*\biggr )\theta _k\lessgtr \biggl (\frac{\delta -(1-\rho ^*)\widehat{\delta }}{\rho ^*}\biggr )\sum _{i\in I}\theta _i \end{aligned}$$

(16)

Hence, the optimal policy under low search costs is Disclosure if $\sum _{i\in \varGamma _{nu}^{k*}}\theta _i+\biggl (\frac{D}{v+D}-\widehat{\delta }+\xi ^*\biggr )\theta _k<\biggl (\frac{\delta -(1-\rho ^*)\widehat{\delta }}{\rho ^*}\biggr )\sum _{i\in I}\theta _i$, or the optimal policy is Non-Disclosure if $\sum _{i\in \varGamma _{nu}^{k*}}\theta _i+\biggl (\frac{D}{v+D}-\widehat{\delta }+\xi ^*\biggr )\theta _k>\biggl (\frac{\delta -(1-\rho ^*)\widehat{\delta }}{\rho ^*}\biggr )\sum _{i\in I}\theta _i$.

4 Microsoft’s “Extended Support”

This section contains an analysis of the forthcoming change to the updating procedures of Windows 7 and Windows 10 and how this change alters the game described in Sects. 2 and 3. The game is altered such that the software vendor, Microsoft, introduces a new monthly charge to receive updates. Microsoft intends to implement this policy starting on January 14$^{th}$, 2020, which is the same day that Windows 7 will no longer be supported. But with a large number of Windows users still using Windows 7, Microsoft needed to come up with a policy to protect these users and maintain its market share (Table 4).

Table 4. New notation for Microsoft’s “Extended Support”

Updating is no longer the only available choice to the software user. The software user can also choose to shift toward using a different version, i.e. Windows 10, for which the software user must pay a cost $c_v>0$. If the software user shifts toward using the new version of the software, then the hacker is not able to attack the software user, not even via Zero-Days.

Assumption 2

Let $\frac{c_v}{\delta (v+D)}\in (\theta _{1},\theta _{m})$.

If the hacker wants to gain access to the disclosure of the vulnerability, the hacker must pay the subscription fee for the “Extended Support”, $\phi _u$. However, the hacker does not have to pay $c_u$ since the hacker could easily enroll an old computer in the updating scheme in order to be notified of vulnerabilities. Consequently, the cost of exploiting N-Days has increased since $\phi _u>0$. To be clear, Microsoft’s new policy is fascinating since it has the potential to increase the cost of exploiting N-Days while also decreasing the effectiveness of Zero-Days against Windows 7.

Following the notation of the game in Sect. 2, this new policy can be explicitly defined. The first case to be described is the Non-Disclosure sub-game. The vendor was unable to find a vulnerability, and thus the hacker is only able to choose an action $A^{(1-\alpha )}_M\in \{S, X\}$. Searching for a Zero-Day is not as effective as in the above games because software users are now able to change their software version to avoid being attacked. The software user’s choice is an action $A^{(1-\alpha )}_{M,i}\in \{v, nu\}$. The utility of software user i is $U_{M;nd}^i:A^{(1-\alpha )}_M\times A^{(1-\alpha )}_{M,i}\times \theta _i$. All players that use the old software are contained in $\varGamma _{nu}$, and all software users that switch versions are in $\varGamma _v$.

The next step is to formalize the Disclosure sub-game. The hacker has the same set of actions in this case as in the Disclosure case above to pick from: $A^{\alpha }_M\in \{E, S, X\}$. The action set for the software users is now $A^{\alpha }_{M,i}\in \{v, u, nu\}$. The utility of software user i is now $U^i_{M;d}:A^{\alpha }_M\times A^{\alpha }_{M,i}\times \theta _i$.

Table 5. Hacker expected payoff functions: Microsoft

There are five main drivers of the Nash equilibria in this model: the three stated in Sect. 2 and the following two conditions.

(d)
Does the cost of updating exceed the cost of switching to the new version of the software package? I.e.
$$\begin{aligned} c_v\lessgtr c_u+\phi _u \end{aligned}$$
(17)
(e)
Does the cost of searching for an N-Day exceed the payoff?
$$\begin{aligned} \phi _u\lessgtr D\sum _{i\in I}\theta _i \end{aligned}$$
(18)

When search costs exceed the expected payoff of search under the Non-Disclosure sub-game, the hacker will always play (X). Given that the hacker exits the game, every software user keeps the old software, i.e. plays (nu). Therefore, the equilibrium is $\left( A^{(1-\alpha )*}_M,\left( A^{(1-\alpha )*}_{M,i}\right) _{i\in I}\right) =((X),(nu)_{i\in I})$.

If $c_s<\delta D\sum _{i\in I}\theta _i$, then, via the best responses of both the software users and the hacker, the Nash equilibria under medium search costs are given in Theorem 2. Define $\varOmega _M\equiv \left\{ k\in I|\theta _k\ge \frac{c_v}{\delta (v+D)}\right\} $.

Theorem 2

Let $k_{min}\in \varOmega _M$ be the minimal software user in $\varOmega _M$. Then under low search costs in the Non-Disclosure sub-game, if

$$\begin{aligned} c_s<\delta D\sum _{i\in I\setminus \varOmega _M}\theta _i \end{aligned}$$

(19)

then the Nash equilibrium is

$$\begin{aligned} \left( A^{(1-\alpha )*}_M,\left( A^{(1-\alpha )*}_{M,i}\right) _{i\in I}\right) =\left( (S),((nu)_{i\in \varGamma _{nu}^{k_{min},nd*}},(v)_{j\in \varGamma _v^{k_{min},nd*}})\right) \end{aligned}$$

(20)

Where $\varGamma ^{k_{min},nd*}_{nu}=\left\{ i\in I| \theta _i<\theta _{k_{min}}\right\} $, and $\varGamma ^{k_{min},nd*}_{v}=\left\{ j\in I| \theta _j\ge \theta _{k_{min}}\right\} $.

Otherwise, there exists a pivotal software user $k^*\in \varOmega _M$ and a mixed strategy for software user $k^*$, $p^{v*}_{k^*}\in [0,1]$, such that

$$\begin{aligned} c_s=\delta \left( D\sum _{i\in \varGamma _{nu}^{k^*,nd*}}\theta _i+(1-p^{v*}_{k^*})D\theta _{k^*}\right) \end{aligned}$$

(21)

Then the Nash equilibrium is

$$\begin{aligned}&\left( A^{(1-\alpha )*}_M,\left( A^{(1-\alpha )*}_{M,i}\right) _{i\in I}\right) \\&=\biggl (\bigl (\rho ^*(S),(1-\rho ^*)(X)\bigr ),\bigl ((nu)_{i\in \varGamma _{nu}^{k^*,nd*}}, (p^{v*}_{k^*}(v),(1-p^{v*}_{k^*})(nu)), (v)_{j\in \varGamma _v^{k^*,nd*}}\bigr )\biggr )\nonumber \end{aligned}$$

(22)

Where $\rho ^*=\frac{c_v}{\theta _{k^*}\delta (v+D)}$, $\varGamma ^{k^*,nd*}_{nu}=\left\{ i\in I| \theta _i<\theta _{k^*}\right\} $, and $\varGamma ^{k^*,nd*}_{v}=\big \{j\in I| \theta _j>\theta _{k^*}\big \}$.
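The case distinction around Theorem 2 can be traced numerically. The following is a minimal sketch, not the paper's own code: the function name `nondisclosure_equilibrium`, the dictionary layout, the assumption that user types are distinct and positive, and the tie-breaking rule that the hacker exits when exactly indifferent at the high-cost boundary are all illustrative choices. It computes $\varOmega _M$, checks Inequality 19, and otherwise solves Eq. 21 for the pivotal type and its switching probability.

```python
from typing import Dict, List


def nondisclosure_equilibrium(theta: List[float], c_v: float, c_s: float,
                              delta: float, v: float, D: float) -> Dict:
    """Classify the Non-Disclosure sub-game outcome for given parameters.

    Illustrative assumptions: user types are distinct and positive, and the
    hacker exits when exactly indifferent at the high-cost boundary.
    """
    types = sorted(theta)
    cutoff = c_v / (delta * (v + D))            # threshold defining Omega_M
    omega = [t for t in types if t >= cutoff]   # users in Omega_M
    rest = [t for t in types if t < cutoff]     # users in I \ Omega_M

    # Search never pays: the hacker plays (X) and every user plays (nu).
    if c_s >= delta * D * sum(types):
        return {"rho": 0.0, "nu": types, "v": [], "pivot": None}

    # Inequality (19): search pays even after all of Omega_M has switched.
    if c_s < delta * D * sum(rest):
        return {"rho": 1.0, "nu": rest, "v": omega, "pivot": None}

    # Otherwise find the pivotal type k* and p^{v*}_{k*} solving Eq. (21).
    target = c_s / (delta * D)
    for t in omega:
        below = sum(x for x in types if x < t)
        stay = (target - below) / t             # 1 - p^{v*}_{k*}
        if 0.0 <= stay <= 1.0:
            return {
                "rho": c_v / (t * delta * (v + D)),   # rho* from Theorem 2
                "nu": [x for x in types if x < t],
                "v": [x for x in types if x > t],
                "pivot": (t, 1.0 - stay),             # (theta_{k*}, p^{v*}_{k*})
            }
    raise ValueError("no pivotal user found; check that types are distinct")
```

For instance, with hypothetical values $\theta =(1,2,3,4)$, $c_v=2.5$, $\delta =0.5$ and $v=D=1$, the cutoff is $2.5$, so $\varOmega _M$ contains types 3 and 4. A search cost of $c_s=1$ satisfies Inequality 19 and gives the pure equilibrium, while $c_s=2$ makes type 3 pivotal, switching with probability $2/3$ while the hacker searches with probability $5/6$.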

The next step is to solve for the Nash equilibria of the Disclosure sub-game. Notice that the hacker and the software users each have three available actions. In Sect. 2.2, the equilibria cases followed from the relation between the cost of searching and the expected payoffs from searching. However, due to the new action available to the software users, (v), and the enrollment fee, $\phi _u$, there now exist extra cases dependent on Eqs. 17 and 18.

If there are high or medium search costs together with high exploitation costs, i.e. $c_s>\widehat{\delta } D\sum _{i\in I}\theta _i$ and $\phi _u>D\sum _{i\in I}\theta _i$, then searching for Zero-Days and exploiting N-Days are both too costly, so the hacker will always exit the game. Given this strategy, no software user will update. Hence, the Nash equilibrium is $(A^{\alpha *}_M,(A^{\alpha *}_{M,i})_{i\in I})=\left( (X),(nu)_{i\in I}\right) $.

The last case to examine is when the exploitation costs of the N-Day are low.

Theorem 3

If $c_s>\widehat{\delta }D\sum _{i\in I}\theta _i$ and $\phi _u\le D\sum _{i\in I}\theta _i$, while the software users face $c_v<c_u+\phi _u$, and

$$\begin{aligned} \phi _u<D\sum _{i\in I\setminus \varOmega _M}\theta _i \end{aligned}$$

(23)

then the Nash equilibrium is

$$\begin{aligned} (A^{\alpha *}_M,(A^{\alpha *}_{M,i})_{i\in I})=\biggl ((E),\bigl ((nu)_{i\in \varGamma ^{d*}_{nu}},(v)_{j\in \varGamma ^{d*}_v}\bigr )\biggr ) \end{aligned}$$

(24)

Where $\varGamma ^{d*}_{nu}=\left\{ i\in I\setminus \varOmega _M\right\} $ and $\varGamma ^{d*}_{v}=\left\{ j\in \varOmega _M\right\} $.

Otherwise if $c_s>\widehat{\delta }D\sum _{i\in I}\theta _i$ and $\phi _u\le D\sum _{i\in I}\theta _i$, while the software users face $c_v<c_u+\phi _u$, and there exists $k^*\in \varOmega _M$ and a mixed strategy for software user $k^*$, $p^{v*}_{k^*}\in [0,1]$, such that

$$\begin{aligned} \phi _u=D\sum _{i\in \varGamma ^{d*}_{nu}}\theta _i+(1-p^{v*}_{k^*})D\theta _{k^*} \end{aligned}$$

(25)

then the Nash equilibrium of the game is

$$\begin{aligned} (A^{\alpha *}_M,(A^{\alpha *}_{M,i})_{i\in I})=\biggl ((\rho ^*(E),(1-\rho ^*)(X)),\bigl ((nu)_{i\in \varGamma ^{d*}_{nu}},(p^{v*}_{k^*}(v),(1-p^{v*}_{k^*})(nu)),(v)_{j\in \varGamma ^{d*}_v}\bigr )\biggr ) \end{aligned}$$

(26)

Where $\varGamma ^{d*}_{nu}=\left\{ i\in I|\theta _i<\theta _{k^*}\right\} $, $\varGamma ^{d*}_{v}=\left\{ j\in I|\theta _j>\theta _{k^*}\right\} $, and $\rho ^*=\frac{c_v}{\theta _{k^*}(v+D)}$.
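The two mixing probabilities, $\rho ^*$ in Theorem 2 and $\rho ^*$ in Theorem 3, can be read as indifference conditions for the pivotal software user $k^*$. As a sketch only: the software user payoffs from Sect. 2 are not restated in this section, so assume here that a software user of type $\theta $ who is successfully attacked loses $(v+D)\theta $, that switching versions costs $c_v$ and rules out any attack, that a Zero-Day search succeeds with probability $\delta $, and that an N-Day exploit succeeds with certainty. Under these assumptions, equating the payoff from switching, (v), with the payoff from staying on the old software, (nu), gives

$$\begin{aligned} v\theta _{k^*}-c_v&=v\theta _{k^*}-\rho ^*\delta (v+D)\theta _{k^*}\;\Longrightarrow \;\rho ^*=\frac{c_v}{\theta _{k^*}\delta (v+D)}\quad \text {(Non-Disclosure, hacker mixes over } S\text { and } X\text {)}\\ v\theta _{k^*}-c_v&=v\theta _{k^*}-\rho ^*(v+D)\theta _{k^*}\;\Longrightarrow \;\rho ^*=\frac{c_v}{\theta _{k^*}(v+D)}\quad \text {(Disclosure, hacker mixes over } E\text { and } X\text {)} \end{aligned}$$

Setting $\rho ^*=1$ in the first line recovers the cutoff $\theta \ge \frac{c_v}{\delta (v+D)}$ that defines $\varOmega _M$: only software users at or above this type are willing to pay $c_v$ even when the hacker searches with certainty.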

4.1 Welfare Analysis

The next question is whether this new “Extended Support” policy is welfare improving. This section proceeds as follows: first, the optimal policy is defined; then, the welfare-improving policy is solved for in each of the different cost scenarios.

Definition 2

The optimal policy $\varPsi ^*\in \{Microsoft, Disclosure, Non-Disclosure\}$ is chosen such that:

(27)

Where $((A^{\alpha *}_d,A^{(1-\alpha ) *}_d), (A^*_i)_{i\in I})$, $(A^*_{nd})$, and $((A^{\alpha *}_M,A^{(1-\alpha ) *}_M), (A^{\alpha *}_{M,i}, A^{(1-\alpha )*}_{M,i})_{i\in I})$ are the Nash equilibria of the Disclosure, Non-Disclosure, and Microsoft policies, respectively.

Beginning with the high search cost case, recall that the equilibria of the Microsoft policy game are split into two sub-cases. These two cases can be identified by Inequality 18. If $\phi _u>D\sum _{i\in I}\theta _i$, then both Microsoft and Non-Disclosure are optimal policies. However, if $\phi _u\le D\sum _{i\in I}\theta _i$, then Non-Disclosure is the optimal policy.

Therefore, for the new policy to be effective under high search costs, the extended service fee must be large. Also notice that if $\phi _u\le D\sum _{i\in I}\theta _i$, i.e. the exploitation fee is low, then in the Nash equilibrium the hacker exits when a vulnerability is not found and mixes between exploiting the N-Day and exiting the game when a vulnerability is disclosed. In that case, Microsoft is preferred to Disclosure when

$$\begin{aligned} \rho ^*_M\left[ \sum _{i\in \varGamma ^{M*}_{nu}}\theta _i+(1-p^{M*}_{k^*})\theta _{k^*}\right] +\xi ^{M*}c_v<(v+D)\sum _{i\in \varGamma ^{d*}_{nu}}\theta _i \end{aligned}$$

(28)

Given medium search costs and high exploitation costs, the welfare equation for the software users is

$$\begin{aligned} \sum _{i\in I}U_M(A^{\alpha *}_M,A^{(1-\alpha ) *}_M,A^{\alpha *}_{M,i},A^{(1-\alpha )*}_{M,i},\theta _i)=v\sum _{i\in I}\theta _i \end{aligned}$$

(29)

Therefore, compared to Disclosure, the software users do not need to either update or be hacked via the released patch, and compared to Non-Disclosure, the hacker is not going to be searching for a Zero-Day, and thus the software users will not bear the burden of the expected damages. Hence, as stated in Theorem 4, the new policy proposed by Microsoft is optimal.

The next case to discuss is when the exploitation cost is low, $\phi _u\le D\sum _{i\in I}\theta _i$, and the cost of installing the new version is less than the cost of updating, $c_v\le c_u+\phi _u$. Comparing the new Microsoft policy to Disclosure and Non-Disclosure, the following inequality describes when the new Microsoft policy is optimal.

(30)

Finally, if $c_v>c_u+\phi _u$, then, under the Disclosure sub-game, the high-type software users will update, whereas in the Non-Disclosure sub-game, the high-type software users will install the new version of the software to protect their computers; hence $\rho ^{d*}_M\not =\rho ^{nd*}_M$. This yields the following condition for when “Extended Support” of Windows 7 is the optimal policy.

(31)

Theorem 4

Let $\widehat{\delta }D\sum _{i\in I}\theta _i\le c_s<\delta D\sum _{i\in I}\theta _i$. Then the cases satisfying Inequality 18 are

1.
If $\phi _u>D\sum _{i\in I}\theta _i$, then Microsoft is the optimal policy.
2.
If $\phi _u\le D\sum _{i\in I}\theta _i$, $c_v\le c_u+\phi _u$, and Inequality 30 is satisfied, then Microsoft is an optimal policy.
3.
If $\phi _u\le D\sum _{i\in I}\theta _i$, $c_v\le c_u+\phi _u$, and Inequality 30 is not satisfied, then Microsoft is not an optimal policy.
4.
If $\phi _u\le D\sum _{i\in I}\theta _i$, $c_v>c_u+\phi _u$, and Inequality 31 is satisfied, then Microsoft is an optimal policy.
5.
If $\phi _u\le D\sum _{i\in I}\theta _i$, $c_v>c_u+\phi _u$, and Inequality 31 is not satisfied, then Microsoft is not an optimal policy.
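The five cases of Theorem 4 form a small decision tree over Inequalities 17 and 18. The sketch below is only an illustration of that branching: the function name `theorem4_case` and its return convention are hypothetical, and because Inequalities 30 and 31 are not evaluated here, their truth values are passed in as boolean flags by the caller.

```python
from typing import List, Optional, Tuple


def theorem4_case(theta: List[float], c_s: float, c_u: float, c_v: float,
                  phi_u: float, D: float, delta: float, delta_hat: float,
                  ineq30_holds: bool,
                  ineq31_holds: bool) -> Optional[Tuple[int, bool]]:
    """Return (case number, Microsoft is optimal) following Theorem 4.

    Returns None when search costs are not in the medium range assumed by
    the theorem. ineq30_holds and ineq31_holds are supplied by the caller;
    this sketch does not compute Inequalities 30 and 31.
    """
    total = D * sum(theta)                      # D * sum of theta_i over I

    # Precondition of Theorem 4: medium search costs.
    if not (delta_hat * total <= c_s < delta * total):
        return None

    # Inequality (18): high exploitation costs price the hacker out.
    if phi_u > total:
        return (1, True)

    # Inequality (17): switching is no more costly than updating.
    if c_v <= c_u + phi_u:
        return (2, True) if ineq30_holds else (3, False)

    # Switching is more costly than updating.
    return (4, True) if ineq31_holds else (5, False)
```

Keeping Inequalities 30 and 31 as inputs makes explicit that, under low exploitation costs, the verdict depends on welfare comparisons that the cost parameters alone do not settle.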

Notice that $\phi _u$ can be used as a weapon to harm hackers. For Microsoft’s new policy to be effective under medium search costs, the extended service fee and the cost of installing the new version must be chosen jointly. The first way for Microsoft to maximize software user welfare is to pick a very large support fee, i.e. high exploitation costs. This prices the hacker out of the exploitation market, so software users need not pay to install updates or to switch software versions. However, under low exploitation costs, for the Microsoft policy to maximize software user welfare, Microsoft must choose $c_v$ such that either Inequality 30 or Inequality 31 holds.

5 Conclusion

Sun Tzu said: “Know thy self, know thy enemy. A thousand battles, a thousand victories.” This sentiment is just as relevant in cybersecurity as it was in the 5$^{th}$ century BC. The optimal policy debate should be centered around how policies influence both the hacker’s and software users’ behavior. The ease with which the hacker is able to infiltrate the network can be decreased via appropriate disclosure policies. Since the cost of searching for Zero-Days has drastically increased over the last couple of years, the hacker desires more disclosure to decrease his costs. Hence, Disclosure can only be an optimal policy in cases when the cost to the hacker of searching for a Zero-Day vulnerability is small. The policies of Non-Disclosure and Microsoft’s new policy both decrease hacker interference in the network as well as increase overall software user welfare.

The idea of this paper is to push the vulnerability disclosure literature toward thinking about the appropriate assumptions faced by hackers, software users, and software vendors. As the title implies, this is a simplified explanation of the problem that firms face. For example, the Equifax hack can be traced to an unpatched vulnerability; however, there is more at play than is discussed in this static model. Many firms do not immediately update their software packages since doing so may inadvertently negatively affect other software packages. This is beyond the scope of this paper, as this is an introduction to a theoretical approach to the problem, and will be a focus of future research.

Notes

1.
See https://www.microsoft.com/en-us/microsoft-365/blog/2018/09/06/helping-customers-shift-to-a-modern-desktop/.

References

Rescorla, E.: Is finding security holes a good idea? IEEE Secur. Priv. 3(1), 14–19 (2005)
Arora, A., Telang, R., Hao, X.: Optimal policy for software vulnerability disclosure. Manag. Sci. 54(4), 642–656 (2008)
Png, I.P.L., Tang, C.Q., Wang, Q.-H.: Hackers, users, information security. In: WEIS Conference Proceedings (2006)
Hong, Y., Neilson, W.: Cybercrime and punishment: a rational victim model. Working Paper (2018)
Becker, G.S.: Crime and punishment: an economic approach. In: Fielding, N.G., Clarke, A., Witt, R. (eds.) The Economic Dimensions of Crime, pp. 13–68. Springer, London (1968). https://doi.org/10.1007/978-1-349-62853-7_2
Choi, J.P., Fershtman, C., Gandal, N.: Network security: vulnerabilities and disclosure policy. J. Ind. Econ. 58(4), 868–894 (2010)
Arora, A., Nandkumar, A., Telang, R.: Does information security attack frequency increase with vulnerability disclosure? An empirical analysis. Inf. Syst. Front. 8(5), 350–362 (2006)
Ozment, A.: Bug auctions: vulnerability markets reconsidered. In: Workshop on the Economics of Information Security (2004)
Coyne, C., Leeson, P.: Who’s to protect cyberspace? J. Law Econ. Policy 2, 473–496 (2005)
Laszka, A., Zhao, M., Grossklags, J.: Banishing misaligned incentives for validating reports in bug-bounty platforms. In: Askoxylakis, I., Ioannidis, S., Katsikas, S., Meadows, C. (eds.) ESORICS 2016. LNCS, vol. 9879, pp. 161–178. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45741-3_9
Kuehn, A., Mueller, M.: Analyzing bug bounty programs: an institutional perspective on the economics of software vulnerabilities. In: TPRC Conference Paper (2016)
Ion, I., Reeder, R., Consolvo, S.: “...no one can hack my mind”: comparing expert and non-expert security practices. In: Symposium on Usable Privacy and Security (SOUPS) (2015)

Author information

Authors and Affiliations

CEPA, McCombs School of Business, The University of Texas at Austin, Austin, TX, 78712, USA
Taylor J. Canann

Corresponding author

Correspondence to Taylor J. Canann.

Editor information

Editors and Affiliations

University of Melbourne, Melbourne, VIC, Australia
Tansu Alpcan
Washington University in St. Louis, St. Louis, MO, USA
Yevgeniy Vorobeychik
University of Maryland, College Park, College Park, MD, USA
John S. Baras
KTH Royal Institute of Technology, Stockholm, Sweden
György Dán

Cite this paper

Canann, T.J. (2019). Toward a Theory of Vulnerability Disclosure Policy: A Hacker’s Game. In: Alpcan, T., Vorobeychik, Y., Baras, J., Dán, G. (eds) Decision and Game Theory for Security. GameSec 2019. Lecture Notes in Computer Science, vol 11836. Springer, Cham. https://doi.org/10.1007/978-3-030-32430-8_8

DOI: https://doi.org/10.1007/978-3-030-32430-8_8
Published: 23 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32429-2
Online ISBN: 978-3-030-32430-8
eBook Packages: Computer Science, Computer Science (R0)
