
1 Introduction

Tax control is one of the most important aspects of modeling taxation, and it has been the subject of continuous interest in recent research. The standard model used to describe the behavioral relationship between taxpayers and the tax authority is static. For example, [1, 17–19] formulate mathematical models of tax evasion and auditing. Two of the best-known works, [17] and [18], were the first to apply a game-theoretical approach to the taxation problem: they presented the interaction between the tax authority and taxpayers as a hierarchical “principal-to-agent” game. As in [1] and [19], the optimal strategies are defined as an optimal scheme or optimal contract, which includes the tax and penalty rates and the audit probabilities.

However, since tax collection is a periodic event, the problem can be considered as a dynamic process that occurs, for example, on a yearly basis. Moreover, a total tax audit is known to be quite expensive, so the tax authority must find a method that helps to collect taxes while minimizing the cost of inspections. In particular, spreading information about the positive social benefits of tax collection, or about possible penalties for tax evasion, through social networks and media provides a tool for controlling a large group of taxpayers.

The major difference between previously studied models and the current study is that we combine a game-theoretic approach with one that accounts for the process of spreading information. Recent studies have shown that the spread of information resembles an epidemic process, and hence a modification of the Susceptible-Infected-Recovered (SIR) model can be used to characterize the propagation of information.

As in the classical SIR model, we consider a large but finite population of taxpayers, which is divided into several subgroups according to the relevance of the information: Uninformed, Informed, Indifferent, and Resistant. The Uninformed subgroup consists of agents who have no information about a future tax auditing campaign. Informed agents have received the information and disseminate it if it is important to them. Indifferent taxpayers receive the information, but it is not interesting to them and they do not pass it on.

We suppose that information propagates through the population by pairwise contacts between spreaders and other taxpayers. The important point is that an agent who has received the information and considers it important, or believes it, is capable of spreading it to others.

In real life, social networks offer a good platform for interactions among agents in a population; many taxpayers have extensive social contacts and can disseminate information through their contact networks. In a social network, information spreads rapidly through different channels with few restrictions, so the Internet and social networks can be considered an effective tool for the propagation of information. However, people are more likely to believe news from their friends and relatives, while information from the network lacks credibility and has to be verified from time to time. Hence we have to take into account a group of agents who ignore the received news. The scale-free structure of the Internet implies that each agent with access to the social network has a statistically significant probability of having a very large number of contacts; the number of contacts can be estimated by the average connectivity of the network.

Over the past decades, different models for the propagation of viruses and information in networks have been developed. One of the first papers to apply epidemic processes to the spreading of rumors, ideas, and information is [3]. In [14], the spread of a computer virus over a network is treated as an epidemic process.

In this paper, we establish a complex control-theoretic model to design control strategies for the tax authority based on propagating information and advertising the bonuses of participating in the tax collection campaign, in order to mitigate the impact of nonpayment on society. Information transmission can be represented by dynamics on a graph whose vertices denote individuals and whose edges indicate interaction between pairs of individuals. Because a large population is involved in the process of spreading rumors and information, random graph models such as the scale-free networks in [7, 13] are convenient for capturing the heterogeneous patterns of a large-scale complex network.

This paper is organized as follows. In Sect. 2, we review a static game-theoretical model of tax control. In Sect. 3, we formulate and analyze the complex model of information propagation through the social network in a population of taxpayers; we formulate an optimal control problem and, employing Pontryagin’s maximum principle, present the structure of an optimal program for spreading information by the tax authority. In Sect. 4, we present a modification of an algorithm that forms a scale-free network. Finally, we present simulations and conclusions about the model and discuss the impact of the parameters on the system, so that we can suggest possible preventive or control methods.

2 Static Model

Based on the game-theoretical model presented in [1], in this section we present a static model of tax control. In this model the players are the tax authority (the high level of the hierarchy) and N taxpayers (the low level of the hierarchy). Each taxpayer has an income level \(i_{j}\), where \(j = \overline{1,N}\). At the end of every tax period the jth taxpayer declares her income as \(r_{j}\), which can be less than or equal to her true income \(i_{j}\) (\(r_{j} \leq i_{j}\) for each \(j = \overline{1,N}\)).

After collecting the tax returns, the tax authority audits taxpayers with probability \(\overline{p}\). The tax audit is assumed to be absolutely effective, that is, it always reveals an existing evasion.

Let ξ be the tax rate and π be the penalty rate (both rates are assumed to be constant). If evasion is revealed as a result of a tax audit, the evading taxpayer must pay the unpaid tax and a penalty, both of which depend on the level of evasion: \((\xi +\pi )(i_{j} - r_{j})\).

Then the jth taxpayer’s expected payoff is defined from the equation

$$\displaystyle{ \beta _{j} = i_{j} -\xi r_{j} -\overline{p}\ (\xi +\pi )(i_{j} - r_{j}), }$$
(1)

where \(\xi r_{j}\) is always paid by the taxpayer (pre-audit payment), and \((\xi +\pi )(i_{j} - r_{j})\) is paid as the result of a tax audit, which is made with probability \(\overline{p}\) (post-audit payment).

Then the tax authority’s net income can be defined as

$$\displaystyle{ J =\sum _{ j=1}^{N}\left (\xi r_{ j} + \overline{p}(\xi +\pi )(i_{j} - r_{j}) -\overline{p} \cdot \overline{c}\right ), }$$
(2)

where \(\overline{c}\) is the unit cost of one audit. We define it as

$$\displaystyle{ \overline{c} = \frac{B} {\nu }, }$$
(3)

where B is the tax authority’s budget and ν is the number of audited taxpayers, so that the audit probability is

$$\displaystyle{ \overline{p} = \frac{\nu } {N}. }$$
(4)

Naturally, each player aims to maximize her expected payoff.
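As an illustration, the payoffs (1) and (2) can be evaluated directly. The sketch below is only a minimal reading of these two formulas; the function names and the sample incomes are hypothetical, and the rates are illustrative values rather than results of the model.

```python
def taxpayer_payoff(income, declared, p_audit, xi, pi):
    """Expected payoff of one taxpayer, Eq. (1)."""
    return income - xi * declared - p_audit * (xi + pi) * (income - declared)


def authority_net_income(incomes, declared, p_audit, xi, pi, unit_audit_cost):
    """Expected net income of the tax authority, Eq. (2)."""
    total = 0.0
    for i_j, r_j in zip(incomes, declared):
        total += xi * r_j                              # pre-audit payment
        total += p_audit * (xi + pi) * (i_j - r_j)     # expected post-audit payment
        total -= p_audit * unit_audit_cost             # expected audit cost
    return total


# Illustrative values (assumptions of this sketch).
incomes = [30000.0, 20000.0, 50000.0]
xi, pi = 0.13, 0.13

for p_audit in (0.2, 0.5, 0.8):
    honest = taxpayer_payoff(incomes[0], incomes[0], p_audit, xi, pi)
    evading = taxpayer_payoff(incomes[0], 0.0, p_audit, xi, pi)
    print(p_audit, honest, evading)
```

For a low audit probability the evading payoff is larger than the honest one, and the ordering reverses once the audit probability is high enough.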

To obtain further results we use the following proposition, which was formulated for the model described above and proved in [1].

Proposition 1.

Let the inequality

$$\displaystyle{ (\xi +\pi )i_{j} \geq \overline{c}, }$$
(5)

be fulfilled for the subset \(\{1,\ldots ,N_{0}\}\) (\(N_{0} < N\)) of the N taxpayers. Then the optimal strategy of the tax authority is \(p^{{\ast}} = \frac{\xi } {\xi +\pi }\) for every \(j = \overline{1,N_{0}}\). The jth taxpayer’s optimal strategy is

$$\displaystyle{ r_{j}^{{\ast}}(\overline{p}) = \left \{\begin{array}{ll} 0, &\mbox{ if $\overline{p} <p^{{\ast}}$,} \\ i_{j},&\mbox{ if $\overline{p} \geq p^{{\ast}}$.} \end{array} \right. }$$

Let inequality (5) not be fulfilled for every \(j = \overline{N_{0} + 1,N}\). Then the optimal strategy of the tax authority is \(\overline{p} = 0\), and the jth taxpayer’s optimal strategy is \(r_{j}^{{\ast}}(0) = 0\).

The first case of Proposition 1 describes the optimal strategy of the tax authority when a tax audit is profitable for it (inequality (5) is satisfied). In response, the optimal strategy of the lower-level players is to pay taxes or not, depending on the audit probability chosen by the top player. This result is similar to the “threshold rule” obtained in [19] for another mathematical model of tax control.

The second case is a pessimistic situation. In this case, the tax authority does not have sufficient funds to carry out the necessary tax audits. Taxpayers are rational, and in these conditions they can afford not to pay anything.
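The threshold \(p^{{\ast}}\) can be traced back to the payoff (1). As a short check, (1) can be regrouped by \(r_{j}\):

$$\displaystyle{ \beta _{j} = i_{j}\left (1 -\overline{p}(\xi +\pi )\right ) + r_{j}\left (\overline{p}(\xi +\pi )-\xi \right ). }$$

The payoff is linear in \(r_{j}\) on \([0,i_{j}]\), so it is maximized at \(r_{j} = i_{j}\) when the coefficient \(\overline{p}(\xi +\pi )-\xi\) is positive, that is, when \(\overline{p} > \frac{\xi } {\xi +\pi } = p^{{\ast}}\), and at \(r_{j} = 0\) when it is negative. At \(\overline{p} = p^{{\ast}}\) the taxpayer is indifferent, and Proposition 1 assigns \(r_{j}^{{\ast}} = i_{j}\) in that case.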

3 Dynamic Model of Spreading Information

As we indicated above, taxation is a regular process and can be formulated as a dynamic model that takes into account the dissemination of information as a factor to stimulate tax compliance.

Let us assume that at the initial moment of the process the audit probability is \(\overline{p} = 0\). Then the entire population of rational taxpayers evades taxation, in accordance with the second case of Proposition 1.

In practice, there is no information about the relation between the parameters in (5) for every \(j = \overline{1,N}\); therefore, the tax authority does not know whether auditing is profitable or not. Moreover, as the tax authority’s budget B is strictly limited, auditing with probability p is practically impossible.

Therefore, the tax authority has to stimulate unaudited taxpayers to pay tax that corresponds to their true income level. The means of such stimulation is spreading information about future audits to the taxpayers (which can be irrelevant in general). This information makes rational taxpayers think that the audit probability is high enough that paying taxes is less costly than evading them and risking having to pay back-taxes along with the penalties. Within the framework of this model, this information is given by inequality

$$\displaystyle{ \overline{p} \geq p^{{\ast}}, }$$
(6)

where \(p^{{\ast}} = \frac{\xi } {\xi +\pi }\) (due to Proposition 1).

The tax authority spreads this information with the intensity u(t) (the share of the Informed taxpayers per unit time), \(u \in [0;\overline{u}]\), where \(\overline{u}\) is the possible maximum value of control function u.

3.1 Scheme of Spreading Information

Here we consider the process of spreading information over the network of contacts modeled as a scale-free network. Each node of such network represents a taxpayer, who gets the information and propagates it over her social contacts, internet, social networks, etc.

The entire population of taxpayers is divided into four subgroups according to their relation to spreading information (see [6, 8, 13]):

  • Uninformed taxpayers S (we denote the number of agents in this group as \(n_{S}\)). They do not have any information about future auditing and, therefore, do not pay taxes (\(r_{j} = 0\) due to Proposition 1).

  • Informed taxpayers I (group size \(n_{I}\)). These taxpayers receive the information and propagate it: first, they pay the taxes corresponding to their true income levels (\(\xi i_{j}\)); second, they spread the information among Uninformed taxpayers.

  • Indifferent taxpayers E (group size \(n_{E}\)). They get the information but do not propagate it: they do not spread the information among Uninformed taxpayers, and they pay the taxes and penalties (\((\xi +\pi )i_{j}\)) if and only if they are audited.

  • Resistant taxpayers R (group size \(n_{R}\)). The taxpayers from this subgroup are those who have lost interest in the information, because they paid taxes and propagated the information or, vice versa, did not propagate it and so paid penalties. In any case, the information has become irrelevant for them.

Let’s denote shares of Uninformed, Informed, Indifferent, and Resistant as

$$\displaystyle{S^{k} = \frac{n_{S}} {N},I^{k} = \frac{n_{I}} {N},E^{k} = \frac{n_{E}} {N},R^{k} = \frac{n_{R}} {N},}$$

where \(S^{k} + E^{k} + I^{k} + R^{k} = 1\) and k is the degree of each taxpayer’s connections at time t. The initial states are \(I^{k}(t_{0}) = I_{0}^{k} > 0\), \(E^{k}(t_{0}) = E_{0}^{k} > 0\), \(R^{k}(t_{0}) = R_{0}^{k} > 0\), \(S^{k}(t_{0}) = 1 - I_{0}^{k} - E_{0}^{k} - R_{0}^{k}\).

The scheme of information spreading in a population of taxpayers is presented in the following diagram. See Fig. 1.

Fig. 1 Scheme of information spreading in a taxpayer population

3.2 Constructing the Aggregated System Profit

In the current study, the aggregated system profit consists of two parts: first, the profit received by the tax authority from the propagation of information and, second, the profit from tax auditing. This leads to the following conclusions.

The first conclusion is that in the model examined there is a two-component budget

$$\displaystyle{ B =\int _{ 0}^{T}\left (b_{ 1}(\overline{p}(t)) + b_{2}(u(t))\right )dt, }$$
(7)

where \(b_{1}(\overline{p}(t))\) is the cost of auditing with probability \(\overline{p}\) and \(b_{2}(u(t))\) is the cost of activating the spread of information: it is a twice differentiable and increasing function of u(t) such that \(b_{2}(0) = 0\) and \(b_{2}(u) > 0\) when u(t) > 0.

The second conclusion is that the aggregated system profit also consists of two components:

$$\displaystyle{ J = J_{aud} + J_{inf}, }$$
(8)

where \(J_{aud}\) is the tax authority’s net income obtained as a result of auditing, and \(J_{inf}\) is the profit obtained from spreading information.

The first summand is the post-audit payments of the Indifferent taxpayers (from the subgroup E) net of the total audit cost:

$$\displaystyle{ J_{aud} = g_{E}(E^{k}(T)) - b_{ 1}(\overline{p}(T)), }$$
(9)

where \(b_{1}(\overline{p}(T))\) is defined from

$$\displaystyle{ b_{1}(\overline{p}(T)) = n_{E}\overline{p}(T)\overline{c}, }$$
(10)

and the post-audit payments of the taxpayers from \(E^{k}\) are

$$\displaystyle{ g_{E}(E^{k}(T)) = (\xi +\pi )\overline{p}\sum _{ j=1}^{n_{E} }i_{j}. }$$
(11)

To simplify the following reasoning, we replace \(g_{E}(E^{k}(T))\) with the following continuous estimate:

$$\displaystyle{ \widehat{g_{E}}(E^{k}(T)) = (\xi +\pi )\overline{p}NE^{k}(T)\hat{i}, }$$
(12)

where \(\hat{i}\) is the average taxpayers’ income.

The second summand of the aggregated system profit is the profit from the propagation of information:

$$\displaystyle{ J_{inf} =\int _{ 0}^{T}\left (f_{ R}(R^{k}(t)) - f_{ E}(E^{k}(t)) - b_{ 2}(u(t))\right )dt, }$$
(13)

where the integrand is the taxes \(f_{R}(R^{k}(t))\) collected from the Resistant taxpayers \(R^{k}\), minus the taxes unpaid by the Indifferent taxpayers from \(E^{k}\) and the cost of activating information spreading \(b_{2}(u(t))\).

The first summand under the integral in (13) is

$$\displaystyle{ f_{R}(R^{k}(t)) =\xi \left (\sum _{ j=1}^{n_{R} }i_{j}\right ). }$$
(14)

The second summand, taken with a minus sign, is the unpaid taxes

$$\displaystyle{ f_{E}(E^{k}(t)) =\xi \left (\sum _{ j=1}^{n_{E} }i_{j}\right ). }$$
(15)

We use continuous estimates for \(f_{R}(R^{k}(t))\) and \(f_{E}(E^{k}(t))\), as was done for \(g_{E}\) in (12):

$$\displaystyle{ \widehat{f_{R}}(R^{k}(t)) =\xi NR^{k}(t)\hat{i}, }$$
(16)
$$\displaystyle{ \widehat{f_{E}}(E^{k}(t)) =\xi NE^{k}(t)\hat{i}, }$$
(17)

where \(\hat{i}\) is the average taxpayers’ income.

The functions \(f_{R}(R^{k}(t))\) and \(f_{E}(E^{k}(t))\) (from (16) and (17), respectively) are non-decreasing and differentiable, with \(f_{R}(0) = 0\), \(f_{E}(0) = 0\), and \(f_{R}(R^{k}(t)) > 0\), \(f_{E}(E^{k}(t)) > 0\) for \(R^{k}(t) > 0\), \(E^{k}(t) > 0\).

The cost of activating information spreading \(b_{2}(u(t))\) can be defined as

$$\displaystyle{ b_{2}(u(t)) = Nu(t)\tilde{c}, }$$
(18)

where \(\tilde{c}\) is the unit cost of information spreading.

Thus, the aggregated system profit (the tax authority’s net income) is

$$\displaystyle{ J =\int \limits _{ 0}^{T}\left [f_{ R}(R^{k}(t)) - f_{ E}(E^{k}(t)) - b_{ 2}(u(t))\right ]dt + g_{E}(E^{k}(T)) - b_{ 1}(\overline{p}(T)). }$$
(19)
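For sampled trajectories, the functional (19) can be approximated numerically. The sketch below uses the continuous estimates (12), (16), (17), the costs (10) and (18), and a trapezoidal rule for the integral; the function name, the argument names, and the choice of quadrature are assumptions made here for illustration.

```python
def aggregated_profit(times, R, E, u, N, xi, pi, p_audit, mean_income,
                      unit_audit_cost, unit_info_cost):
    """Approximate J from Eq. (19).

    times, R, E, u are equally long sequences sampled on the same time grid;
    R and E are shares of the population, u is the control intensity.
    """
    def integrand(idx):
        f_R = xi * N * R[idx] * mean_income      # Eq. (16)
        f_E = xi * N * E[idx] * mean_income      # Eq. (17)
        b_2 = N * u[idx] * unit_info_cost        # Eq. (18)
        return f_R - f_E - b_2

    # Integral part J_inf, Eq. (13), by the trapezoidal rule.
    j_inf = 0.0
    for idx in range(len(times) - 1):
        dt = times[idx + 1] - times[idx]
        j_inf += 0.5 * dt * (integrand(idx) + integrand(idx + 1))

    # Terminal part J_aud, Eq. (9), with n_E = N * E^k(T).
    g_E = (xi + pi) * p_audit * N * E[-1] * mean_income   # Eq. (12)
    b_1 = N * E[-1] * p_audit * unit_audit_cost           # Eq. (10)
    j_aud = g_E - b_1
    return j_inf, j_aud, j_inf + j_aud
```

The function returns the two components separately, so that the contribution of information spreading and of auditing can be compared.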

3.3 Constructing the System of Equations

We define a process of spreading information as a system of nonlinear differential equations corresponding to the scheme (Fig. 1). In our study, we use a modification of a classical epidemic model (see [10, 14]):

$$\displaystyle{ \begin{array}{l} \dot{S^{k}} = -\delta _{I}I^{k}S^{k}\varTheta _{I} - uS^{k}; \\ \dot{I^{k}} =\delta _{I}I^{k}S^{k}\varTheta _{I} - (\sigma _{I}+\alpha )I^{k}; \\ \dot{E^{k}} =\alpha I^{k}; \\ \dot{R^{k}} =\sigma _{I}I^{k} + uS^{k};\\ \end{array} }$$
(20)

where control

$$\displaystyle{ \begin{array}{l} 0 \leq u(t) \leq \overline{u} \leq 1,\ \mbox{ for all}\ t \in [0,T],\\ \end{array} }$$
(21)

where \(\overline{u}\) is the upper bound of the control; \(\delta _{I}\) is the rate of spreading of information in subgroup I; \(\sigma _{I}\) is the rate of forgetting of information in subgroup I; and α is the probability that the received information is not important for an agent.
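To see how system (20) behaves for a given control, it can be integrated numerically. The following is a minimal sketch using the explicit Euler method; treating \(\varTheta _{I}\) as a constant, the step size, and the function names are simplifying assumptions of this sketch, not part of the model.

```python
def simulate(S0, I0, E0, R0, delta, sigma, alpha, theta, u_of_t, T, steps):
    """Integrate system (20) with the explicit Euler method.

    theta is held constant and u_of_t is any function t -> u(t);
    both are simplifying assumptions of this sketch.
    Returns a list of tuples (t, S, I, E, R).
    """
    dt = T / steps
    S, I, E, R = S0, I0, E0, R0
    path = [(0.0, S, I, E, R)]
    for n in range(steps):
        t = n * dt
        u = u_of_t(t)
        new_informed = delta * I * S * theta
        dS = -new_informed - u * S
        dI = new_informed - (sigma + alpha) * I
        dE = alpha * I
        dR = sigma * I + u * S
        S, I, E, R = S + dt * dS, I + dt * dI, E + dt * dE, R + dt * dR
        path.append((t + dt, S, I, E, R))
    return path


# Uncontrolled run versus a constant control (illustrative values).
free = simulate(0.9, 0.1, 0.0, 0.0, 0.7, 0.0083, 0.1, 0.33, lambda t: 0.0, 50.0, 5000)
ctrl = simulate(0.9, 0.1, 0.0, 0.0, 0.7, 0.0083, 0.1, 0.33, lambda t: 0.1, 50.0, 5000)
print("final R without control:", free[-1][4])
print("final R with control:   ", ctrl[-1][4])
```

Because the right-hand sides of (20) sum to zero, the four shares keep summing to one along the computed trajectory, which is a convenient sanity check.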

\(\varTheta _{I}(t)\) represents the probability that any given link points to an Informed or Indifferent taxpayer (see [7, 14]):

$$\displaystyle{ \varTheta _{I}(t) =\sum \limits _{k'}\frac{\tau (k')P(k'\vert k)I_{k'}^{I}} {k'}, }$$
(22)

where τ(k) denotes the infectivity of a node with degree k [7, 14]:

  1. τ(k) ≤ k;

  2. τ(k) is monotonically increasing;

  3. \(\lim \limits _{k\rightarrow \infty }\tau (k) = M> 0;\)

P(k′ | k) is the probability that a node with degree k points to a node with degree k′: \(P(k'\vert k) = \frac{k'P(k')} {\langle k\rangle }\), where the mean value is \(\langle k\rangle =\sum \limits _{k}kP(k)\).
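The sum (22) can be evaluated once a degree distribution and the Informed shares per degree are given. The sketch below substitutes \(P(k'\vert k) = k'P(k')/\langle k\rangle\) into (22); the dictionary-based data layout, the function names, and the saturating infectivity \(\tau (k) =\min (k,M)\) are assumptions chosen for illustration (this τ is non-decreasing rather than strictly increasing).

```python
def theta_I(degree_prob, informed_share, tau):
    """Evaluate Eq. (22) with P(k'|k) = k' P(k') / <k>.

    degree_prob:    dict k -> P(k)
    informed_share: dict k -> share of Informed nodes among nodes of degree k
    tau:            function k -> infectivity tau(k)
    """
    mean_k = sum(k * p for k, p in degree_prob.items())
    total = 0.0
    for k, p in degree_prob.items():
        cond = k * p / mean_k                    # P(k'|k)
        total += tau(k) * cond * informed_share[k] / k
    return total


# Illustrative degree distribution and shares (assumptions of this sketch).
degree_prob = {1: 0.5, 2: 0.3, 5: 0.2}
informed_share = {1: 0.1, 2: 0.1, 5: 0.1}
M = 3
print(theta_I(degree_prob, informed_share, lambda k: min(k, M)))
```

With \(\tau (k) = k\) and the same Informed share in every degree class, the sum reduces to that common share, which gives a simple check of the implementation.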

Within the framework of the model (20)–(21), we solve the optimal control problem: we find the optimal intensity of information spreading u(t) that maximizes the functional (13):

$$\displaystyle{ J_{inf} =\int \limits _{ 0}^{T}\left [f_{ R}(R^{k}(t)) - f_{ E}(E^{k}(t)) - b_{ 2}(u(t))\right ]dt \rightarrow \max. }$$
(23)

3.4 Optimal Control Problem of Propagation Information

We find the optimal propagation strategy u for the problem described above by applying Pontryagin’s maximum principle [5, 15]. We define the associated Hamiltonian H and adjoint functions \(\lambda _{S}\), \(\lambda _{I}\), \(\lambda _{E}\), \(\lambda _{R}\) as follows:

$$\displaystyle{ \begin{array}{l} H = -b_{2}(u) - f_{E}(E^{k}) + f_{R}(R^{k}) + (\lambda _{I} -\lambda _{S})\delta _{I}S^{k}I^{k}\varTheta _{I} + (\lambda _{R} -\lambda _{S})uS^{k}+ \\ (\lambda _{E} -\lambda _{I})\alpha I^{k} + (\lambda _{R} -\lambda _{I})\sigma _{I}I^{k}. \end{array} }$$
(24)

The adjoint system is

$$\displaystyle{ \begin{array}{l} \dot{\lambda }_{S}(t) = -\frac{\partial H} {\partial S^{k}} = (\lambda _{S} -\lambda _{I})\delta _{I}I^{k}\varTheta _{I} + (\lambda _{S} -\lambda _{R})u; \\ \dot{\lambda }_{I}(t) = -\frac{\partial H} {\partial I^{k}} = (\lambda _{S} -\lambda _{I})\delta _{I}S^{k}\varTheta _{I} + (\lambda _{I} -\lambda _{E})\alpha + (\lambda _{I} -\lambda _{R})\sigma _{I}; \\ \dot{\lambda }_{E}(t) = -\frac{\partial H} {\partial E^{k}} = f'_{E}(E^{k}); \\ \dot{\lambda }_{R}(t) = -\frac{\partial H} {\partial R^{k}} = -f'_{R}(R^{k});\\ \end{array} }$$
(25)

with the transversality conditions given by

$$\displaystyle{ \lambda _{S}(T) = 0,\ \lambda _{I}(T) = 0,\ \lambda _{E}(T) = 0,\ \lambda _{R}(T) = 0. }$$
(26)

According to Pontryagin’s maximum principle [15], there exist continuous and piecewise continuously differentiable co-state functions \(\overline{\lambda } = (\lambda _{S},\lambda _{I},\lambda _{E},\lambda _{R})\) that satisfy (25) and (26) at every point t ∈ [0, T] where u is continuous. In addition, we have

$$\displaystyle{ u \in \text{arg}\max \limits _{\underline{u}\in [0,\overline{u}]}H(\overline{\lambda },(S^{k},I^{k},E^{k},R^{k}),\underline{u}). }$$
(27)

The derivative of the Hamiltonian with respect to u is

$$\displaystyle{ \frac{\partial H} {\partial u} = -b'_{2}(u) + (\lambda _{R} -\lambda _{S})S^{k} \geq 0. }$$
(28)

It is easy to see that the Hamiltonian H reaches its maximum if condition (28) is satisfied.

According to the standard approach our main results are formulated in the following proposition and auxiliary lemmas:

Lemma 1.

Function ϕ is decreasing over the time interval [0,T).

Lemma 2.

For all t, 0 < t < T, the following condition holds: \((\lambda _{R} -\lambda _{S}) < 0\).

Proofs of Lemmas 1 and 2 follow the same technique as in [4, 11].

Proposition 2.

In the problem (20), (21), (23) the optimal control u(t) has the following structure:

  • When \(b_{2}(\cdot )\) is a concave function in (23), there exists a time moment \(\overline{t} \in [0,T]\) such that

    $$\displaystyle{ u(t) = \left \{\begin{array}{l} \overline{u},\ \mbox{ if}\ \phi> b_{2}(\overline{u})/\overline{u},\mbox{ }for\ 0 <t <\overline{t}; \\ 0,\ \mbox{ if}\ \phi <b_{2}(\overline{u})/\overline{u},\ \mbox{ }for\ \overline{t} <t <T.\end{array} \right. }$$
    (29)
  • When \(b_{2}(\cdot )\) is a strictly convex function, there exist time moments \(t_{0},\ \overline{t} \in [0,T]\), \(0 \leq t_{0} \leq \overline{t} \leq T\), such that

    $$\displaystyle{ u(t) = \left \{\begin{array}{@{}l@{\quad }l@{}} \overline{u}\, \mbox{ on} 0 <t \leq t_{0}; \quad \\ \mbox{ is continually decreasing function}\, \mbox{ on} t_{0} <t \leq \overline{t};\quad \\ 0\, \mbox{ on} \overline{t} \leq t \leq T; \quad \\ \quad \end{array} \right. }$$
    (30)

where \(\phi = (\lambda _{R} -\lambda _{S})S^{k}\) is the switching function for the control problem (20), (24), (25), and \(\overline{u}\) is defined in (21).

We rewrite the Hamiltonian in the following form:

$$\displaystyle{ \begin{array}{l} H = -f_{E}(E^{k}) + f_{R}(R^{k}) + (\lambda _{I} -\lambda _{S})\delta _{I}S^{k}I^{k}\varTheta _{I} + (\phi u - b_{2}(u))+ \\ (\lambda _{E} -\lambda _{I})\alpha I^{k} + (\lambda _{R} -\lambda _{I})\sigma _{I}I^{k}.\end{array} }$$
(31)

To prove the main statement of Proposition 2 we consider two cases:

  (1) Consider the case when \(b_{2}(\cdot )\) is concave.

Since the function \(b_{2}\) is concave (\(b_{2}'' \leq 0\)), \((u\phi - b_{2}(u))\) is a convex function of u in (31), and for any t ∈ [0, T] it reaches its maximum either at \(u(t) = \overline{u}\) or at u(t) = 0. From (31) we have that the optimal u(t) satisfies \(u\phi - b_{2}(u) \geq \underline{u}\phi - b_{2}(\underline{u})\), where \(\underline{u}\) is any admissible control, \(\underline{u} \in [0,\overline{u}]\). If \(u = \overline{u}\), then the switching function satisfies \(\phi \geq b_{2}(\overline{u})/\overline{u}\), and if u = 0, then \(\phi \leq b_{2}(\overline{u})/\overline{u}\).

By Lemma 1, ϕ is a decreasing function, so there can be at most one moment t ∈ [0, T] at which \(\phi (t) = b_{2}(\overline{u})/\overline{u}\). Moreover, if such a moment exists, say \(\overline{t}\), then \(\phi (t)> b_{2}(\overline{u})/\overline{u}\) on \(0 \leq t <\overline{t}\) and \(\phi (t) <b_{2}(\overline{u})/\overline{u}\) on \(\overline{t} <t \leq T\). Hence (29) is satisfied.

  (2) Let the cost function \(b_{2}(\cdot )\) be strictly convex.

If the function \(b_{2}\) is strictly convex (\(b_{2}'' > 0\)), then \((u\phi - b_{2}(u))\) is strictly concave in u and its maximizer is unique. Expression (28) implies that \(\frac{\partial H} {\partial u} = -b'_{2}(u)+\phi = 0\) at the optimal u whenever this point lies in \([0,\overline{u}]\); otherwise the optimal u is one of the bounds 0 or \(\overline{u}\).

Thus, from the continuity of the functions ϕ and \(b_{2}\) it follows that u is continuous at all t ∈ [0, T]. Since \(b_{2}\) is strictly convex, \(b'_{2}(\overline{u})> b'_{2}(0)\) for \(\overline{u}> 0\). Lemma 1 implies that there exist time moments \(t_{0}\), \(\overline{t}\), \(0 <t_{0} <\overline{t} <T\), which are defined from the following conditions:

$$\displaystyle{ u(t) = \left \{\begin{array}{@{}l@{\quad }l@{}} \overline{u}, &\mbox{ if}\ \phi \geq b'_{2}(\overline{u}); \\ (b'_{2})^{-1}(\phi ), &\mbox{ if}\ b'_{2}(0) <\phi <b'_{2}(\overline{u}); \\ 0, &\mbox{ if}\ \phi \leq b'_{2}(0). \end{array} \right. }$$
(32)
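The two control structures follow from maximizing the u-dependent part \(\phi u - b_{2}(u)\) of the Hamiltonian (31) over \([0,\overline{u}]\). The sketch below illustrates both cases; the quadratic cost \(b_{2}(u) = cu^{2}\) used for the convex case and the function names are assumptions made for illustration only.

```python
def control_concave(phi, b2, u_max):
    """Bang-bang rule of Eq. (29): compare phi with b2(u_max) / u_max."""
    return u_max if phi > b2(u_max) / u_max else 0.0


def control_convex_quadratic(phi, c, u_max):
    """Maximizer of u * phi - c * u**2 on [0, u_max] (assumed quadratic cost).

    The stationary point solves phi = b2'(u) = 2 * c * u and is then
    clipped to the admissible interval.
    """
    u_star = phi / (2.0 * c)
    return min(max(u_star, 0.0), u_max)


# A decreasing switching function produces the pattern
# "maximum, then decreasing, then zero" in the convex case.
for phi in (3.0, 1.5, 0.5, 0.0, -1.0):
    print(phi, control_convex_quadratic(phi, c=1.0, u_max=1.0))
```

In the concave case the control takes only the values 0 and \(\overline{u}\), whereas in the strictly convex case it passes continuously through the interior values as ϕ decreases.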

4 Scale-Free Network

Having considered a scale-free network as a tool to structure the population of taxpayers and as an engine for effective information spreading, we estimate the number of contacts by the average connectivity of the network 〈k〉 and suppose that each node has approximately the same number of connections. Usually a scale-free network is defined as a random graph whose degree distribution follows a power law and whose main characteristics do not depend on its size [7, 14]. The probability that a node of such a network has k connections follows a scale-free distribution \(P(k) \sim k^{-\gamma }\) with an exponent γ that ranges between 2 and 3.

We study the SEIR model of spreading information over a scale-free network (SF network), taking into account the impact of scale-free connectivity on the propagation process. Based on the algorithm studied in [14], in the present paper we introduce a modification of the algorithm for constructing an SF network. We developed a software product that allows observing the process of information dissemination and tracking changes in the network settings. Below we show the key points of the algorithm:

  • Initially (at the moment \(t_{0}\)), there is a small number \(m_{0}\) of unconnected nodes.

  • At any time \(t_{i} = t_{i-1} + 1\) we add a new node with m links that point to an existing node i with \(k_{i}\) links according to the probability

    $$\displaystyle{ P(i) = \frac{k_{i}} {\sum _{j}k_{j}}. }$$
    (33)

Here \(m_{0}\) and m are user-defined parameters, which are subject to some restrictions. The parameter m characterizes the average number of connections of a single individual, 〈k〉 = 2m. The parameter \(m_{0}\) must not exceed m for the following reason: if \(m_{0} > m\) and the next node is added, then, according to (33), nodes with zero probability of further connection remain in the network. The constructed graph is then disconnected and contains nodes with no neighbors. These nodes represent individuals who do not have any contacts in the population and, therefore, are not involved in the epidemic process. These individuals can then be eliminated. After iterating this process we obtain a network with N nodes with connectivity distribution \(P(k) \sim k^{-\gamma }\).

The detailed process of forming the SF network can be divided into two main stages:

Step 1.

We build \(m_{0}\) disconnected nodes. Then, while the number of available nodes is not more than m, we add nodes one by one, and each new node is immediately linked to the others. This approach helps to avoid the unacceptable situation in which two nodes are connected by two or more links.

Step 2.

When there are more than m nodes in the network we can use Eq. (33). Until the size of the network reaches N, nodes are added one by one. A new node is linked to one of the old nodes according to the calculated probability. At each iteration the denominator in Eq. (33) changes. Nodes that have already been connected to the added node are no longer involved in the process. When the size of the network reaches N, the algorithm stops.
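The two stages can be sketched in code as follows. This is only one possible reading of the description above, not the software product itself: the adjacency-set representation, the function name, and the `seed` argument are assumptions of this sketch.

```python
import random


def build_scale_free(n_nodes, m, m0, seed=None):
    """Grow a network by preferential attachment, Eq. (33).

    Returns a dict node -> set of neighbours.
    Assumes 1 <= m0 <= m < n_nodes.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(m0)}      # Step 1: m0 disconnected nodes

    # Step 1 (continued): while there are at most m nodes, each new node
    # is linked to every existing node, so no pair is linked twice.
    while len(adj) <= m:
        new = len(adj)
        adj[new] = set(adj.keys())
        for v in adj[new]:
            adj[v].add(new)

    # Step 2: add nodes one by one; each new node receives m links.
    while len(adj) < n_nodes:
        new = len(adj)
        targets = set()
        while len(targets) < m:
            # Nodes already linked to the new node are excluded, so the
            # denominator of Eq. (33) changes at every draw.
            pool = [v for v in adj if v not in targets]
            weights = [len(adj[v]) for v in pool]
            targets.add(rng.choices(pool, weights=weights, k=1)[0])
        adj[new] = targets
        for v in targets:
            adj[v].add(new)
    return adj


network = build_scale_free(n_nodes=200, m=3, m0=2, seed=1)
print("mean degree:", sum(len(nb) for nb in network.values()) / len(network))
```

Storing neighbours in sets rules out multiple links between the same pair of nodes, and the first stage guarantees that every node has a positive degree before Eq. (33) is applied.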

Using a scale-free network to assign the connections between agents in the population of taxpayers, we consider a process of propagating information that resembles an epidemic. We must then define the parameters of the proposed SEIR model: the transition coefficients \(\delta _{I}\) and \(\sigma _{I}\); the initial distribution of the states of the nodes; the audit probability \(\overline{p}\) and the audit cost \(\overline{c}\); the tax and penalty rates ξ and π; and the distribution of the population by income level.

As has been shown in previous research, an epidemic process can take place in different ways on a network, even for the same parameters. This occurs because the initial distribution of the states of the nodes introduces an element of chance: the user defines the number of nodes in a particular state, but does not choose exactly which nodes belong to each group. This also changes the initial value of \(\varTheta _{I}\), since in one case we can obtain a hub as an Informed node, which increases \(\varTheta _{I}\), and in another case a node with a small number of connections.

After all the parameters of the epidemic process are defined, the user can run it step by step. At each step, the following actions occur:

  • For each Uninformed node \(S^{k}\) that is connected to an Informed “neighbor” \(I^{k}\), we check whether it receives and adopts the information in accordance with the specified transition coefficient \(\delta _{I}\). If the transmission of information is successful, the Uninformed node changes its status from \(S^{k}\) to \(I^{k}\).

  • If a node has changed its status \(S^{k} \rightarrow I^{k}\), it is necessary to determine whether the received information is important to the agent. The information is important to the agent with probability (1 −α) and indifferent with probability α. If the node becomes indifferent, it belongs to \(E^{k}\).

  • Each Informed node \(I^{k}\) loses interest in the information in accordance with the transition coefficient \(\sigma _{I}\) and becomes resistant to the information (\(R^{k}\)).

  • After all transitions we recalculate the value of \(\varTheta _{I}\) and draw the new network.
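One step of this process can be sketched as a state update on the network. The sketch assumes that every Informed neighbor gives an Uninformed node one independent chance \(\delta _{I}\) of adopting the information and that all nodes are updated synchronously; both choices, as well as the function name and the data layout, are assumptions of this sketch. Recalculating \(\varTheta _{I}\) and drawing the network are omitted.

```python
import random


def epidemic_step(adj, state, delta, alpha, sigma, rng):
    """One synchronous update of node states 'S', 'I', 'E', 'R'.

    adj:   dict node -> set of neighbours
    state: dict node -> 'S', 'I', 'E' or 'R' (left unchanged; a new dict is returned)
    """
    new_state = dict(state)
    for v, s in state.items():
        if s == 'S':
            # Each Informed neighbour gets one chance to pass the information on.
            informed = sum(1 for w in adj[v] if state[w] == 'I')
            if any(rng.random() < delta for _ in range(informed)):
                # The information is indifferent to the agent with probability alpha.
                new_state[v] = 'E' if rng.random() < alpha else 'I'
        elif s == 'I':
            # An Informed node loses interest with probability sigma.
            if rng.random() < sigma:
                new_state[v] = 'R'
    return new_state


# A three-node chain 0 - 1 - 2 with one Informed node (illustrative).
adj = {0: {1}, 1: {0, 2}, 2: {1}}
state = {0: 'I', 1: 'S', 2: 'S'}
rng = random.Random(0)
for step in range(5):
    state = epidemic_step(adj, state, delta=0.4, alpha=0.1, sigma=0.05, rng=rng)
print(state)
```

Because the update reads only the old states, a node that becomes Informed in a given step starts spreading the information from the next step on.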

Below we depict an example of the process of information propagation in a small population of taxpayers. In Figs. 2–4 blue dots correspond to Uninformed taxpayers, red dots to Informed, orange dots to Indifferent, and green dots to Resistant.

Fig. 2 An example of the network at different moments t. t = 8 seconds, N = 45, \(S^{k}(t) = 15\), \(E^{k}(t) = 2\), \(I^{k}(t) = 25\), \(R^{k}(t) = 2\), \(\delta _{I} = 0.4\), α = 0.1, \(\sigma _{I} = 0.05\), \(\overline{p} = 0.2\)

Fig. 3 An example of the network at different moments t. t = 13 seconds, N = 45, \(S^{k}(t) = 0\), \(E^{k}(t) = 3\), \(I^{k}(t) = 33\), \(R^{k}(t) = 9\), \(\delta _{I} = 0.4\), α = 0.1, \(\sigma _{I} = 0.05\), \(\overline{p} = 0.2\)

Fig. 4 An example of the network at different moments t. T = 42 seconds, N = 45, \(S^{k}(T) = 0\), \(E^{k}(T) = 3\), \(I^{k}(T) = 7\), \(R^{k}(T) = 35\), \(\delta _{I} = 0.4\), α = 0.1, \(\sigma _{I} = 0.05\), \(\overline{p} = 0.2\). Results: \(S^{k}(T) = 0\), \(I^{k}(T) = 0\), \(E^{k}(T) = 3\), \(R^{k}(T) = 42\), aggregated system profit is J = 109,980 monetary units

5 Numerical Simulations

In this section, we present numerical simulations that corroborate the results of the main propositions. We study the model of spreading information with the following parameters: the tax and penalty rates are ξ = 0.13 and π = 0.13, respectively; the optimal audit probability is \(p^{{\ast}} = 0.5\) (according to the fixed values of ξ and π); the actual audit probability is \(\overline{p} = 0.2\).

We use the distribution of income among the population of the Russian Federation in April 2014 (see [2]) and calculate the average income level as the expected value of the uniform and Pareto distributions [9] (as was previously done in [12]) to illustrate the simulation results.

We estimate the average monthly income of taxpayers as \(\hat{i} = 30,000\) (rub) (see Table 1). According to the statistical data, the costs of an audit and of an information announcement are approximately \(\overline{c} = 7455\) (rub) and \(\tilde{c} = 200\) (rub), respectively. We assume that the duration of the period available for propagating information is T = 0.5 (130 days). We consider a population of size N = 1000 with initial fractions of Uninformed, Informed, Indifferent, and Resistant taxpayers \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\). As model parameters we use \(\sigma = \frac{1} {\overline{T}} = 0.0083\), where \(\overline{T} = 120\) is the period of obsolescence of information, and α = 0.1. We construct a scale-free network (for N = 1000) according to the algorithms presented in [14, 16], using the following parameters: 〈k〉 = 6, \(P(k) = \frac{2m^{2}} {k^{3}},m = 5,m_{0} = 4\), \(\varTheta _{I} \sim 0.33\). Examples of the networks are presented in Figs. 2, 3, and 4.
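The derived quantities quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the threshold probability and the forgetting rate from the stated parameters; evaluating inequality (5) for a taxpayer with the average income is an assumption of this sketch, added only as an illustration, and the variable names are hypothetical.

```python
xi, pi = 0.13, 0.13            # tax and penalty rates
mean_income = 30000.0          # average monthly income, rub
unit_audit_cost = 7455.0       # cost of one audit, rub
T_bar = 120.0                  # period of obsolescence of information

p_star = xi / (xi + pi)        # threshold audit probability from Proposition 1
sigma = 1.0 / T_bar            # forgetting rate

# Inequality (5) evaluated at the average income (illustrative assumption).
audit_worthwhile = (xi + pi) * mean_income >= unit_audit_cost

print(p_star, round(sigma, 4), audit_worthwhile)
```

Since the actual audit probability 0.2 is below the threshold, auditing alone does not induce compliance in this parameter setting, which is the situation the information-spreading control is meant to address.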

Table 1 The distribution of income among the taxpayers

In Figs. 5, 6, and 7 we estimate the impact of the transition rate α on the SEIR system and compare the fractions \(I^{k}\) in five different cases. A higher value of α results in a smaller fraction \(I^{k}\), and the application of control decreases \(I^{k}\) for the same value of α. This shows that spreading information guarantees an increase in the number of taxpayers in group \(R^{k}\), who pay taxes.

Fig. 5 The fraction of Informed taxpayers. Controlled case. Initial states: \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), α = 0.1. (1) \(\delta _{I} = 0.1\), \(I_{max}^{k} = 0.1\), \(t_{max} = 0\). (2) \(\delta _{I} = 0.4\), \(I_{max}^{k} = 0.21985\), \(t_{max} = 7\). (3) \(\delta _{I} = 0.7\), \(I_{max}^{k} = 0.38927\), \(t_{max} = 6\)

Fig. 6 The fraction of Informed taxpayers. Uncontrolled case. Initial states: \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), α = 0.1. (1) \(\delta _{I} = 0.1\), \(I_{max}^{k} = 0.1\), \(t_{max} = 0\). (2) \(\delta _{I} = 0.4\), \(I_{max}^{k} = 0.425751\), \(t_{max} = 11\). (3) \(\delta _{I} = 0.7\), \(I_{max}^{k} = 0.619338\), \(t_{max} = 7\)

Fig. 7 Fractions of \(I^{k}\) in the SEIR model, controlled and uncontrolled cases. \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), \(\delta _{I} = 0.7\). (1) Uncontrolled system. α = 0.1, \(I_{max}^{k} = 0.6193\), \(t_{max} = 7\). (2) Controlled system. α = 0.1, \(I_{max}^{k} = 0.3892\), \(t_{max} = 6\). (3) Controlled system. α = 0.3, \(I_{max}^{k} = 0.1831\), \(t_{max} = 4\). (4) Controlled system. α = 0.5, \(I_{max}^{k} = 0.1138\), \(t_{max} = 2\). (5) Controlled system. α = 0.3, \(I_{max}^{k} = 0.1\), \(t_{max} = 0\)

Figures 8 and 9 show the aggregated system profit and demonstrate the influence of the parameter δ on the collected taxes. We observe that the total system costs, which consist of \(J_{inf}\) and \(J_{aud}\), grow steadily as α increases at a large value of δ (Figs. 10 and 11).

Fig. 8 Aggregated system profit when the tax authority spreads information in the taxpayer population. \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), α = 0.7. (1) \(\delta _{I} = 0.1\), \(J_{inf}\) = 282,717, J = 282,945 monetary units. (2) \(\delta _{I} = 0.4\), \(J_{inf}\) = 158,526, J = 159,171 monetary units. (3) \(\delta _{I} = 0.7\), \(J_{inf}\) = 71,178, J = 72,101 monetary units

Fig. 9 Aggregated system profit when the tax authority spreads information in the taxpayer population. \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), \(\delta _{I} = 0.7\). (1) Uncontrolled case. α = 0.1, J = −98,018 monetary units. (2) Controlled case. α = 0.1, J = 701,013. (3) Controlled case. α = 0.3, J = 111,655 monetary units. (4) Controlled case. α = 0.5, J = 171,343 monetary units. (5) Controlled case. α = 0.7, J = 211,925 monetary units

Fig. 10 SEIR model without application of control. \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), \(\delta _{I} = 0.7\). \(I_{max}^{k} = 0.6193\), \(t_{max} = 7\)

Fig. 11 SEIR model without application of control. \(S^{k}(0) = 0.9\), \(E^{k}(0) = 0\), \(I^{k}(0) = 0.1\), \(R^{k}(0) = 0\), \(\sigma _{I} = 0.0083\), \(\delta _{I} = 0.7\). \(I_{max}^{k} = 0.3892\), \(t_{max} = 4\)

By studying the impact of various parameters on a population of taxpayers in which the tax authority propagates information about future tax audits, we can conclude that the amount of collected taxes increases as the number of Informed taxpayers and spreaders grows. At the same time, even if the probability α is high, spreading information increases tax collection at minimum cost. Therefore, this approach can be considered an effective and reasonable method for improving the taxation system.

6 Conclusion

In the present paper we have investigated a complex model that combines a game-theoretical approach to tax control with a dynamic model of information propagation over a structured population of taxpayers. We formulated an optimal control problem for a tax auditing policy and analyzed the behavior of agents depending on social contacts and specific cost functions. All theoretical results are supported by numerical simulations with real statistical data. Connections between taxpayers are modeled as a scale-free network constructed by a specially developed algorithm. As a result of our research, we have attempted to provide a new method of tax collection that can be more efficient and cost-effective.