
1 Introduction

Natural computing is a discipline that studies and implements the dynamic processes that occur in living nature and that can be interpreted as computational procedures. Membrane computing (P systems), initiated by Păun, is a class of powerful computing models abstracted from the way living cells process chemical compounds, energy, and information in their compartmental structures [1].

As a universal model of computation, membrane computing is efficient enough to solve NP-complete problems in polynomial time by creating exponentially many membranes [2]. The SAT problem has been solved with the splitting rule of a membrane system [3]. Reference [4] gives a family of P systems that solves the All-SAT problem with a simplified membrane structure and few evolution rules, exploiting membrane division and parallel processing in P systems.

Artificial immune systems (AIS) can be defined as computational systems inspired by theoretical immunology and by observed immune functions, principles, and mechanisms, and applied to problem solving [5]. Negative selection is one of the most discussed algorithms in artificial immune systems [6].

In this paper, we combine membrane systems and artificial immune systems and propose an immune classification algorithm implemented as a membrane system. In this system, a set of detectors is first created by the given rules; this collection of detectors, called the classifier, is then used to recognize the category of unknown objects. The approach can also be applied to some pattern identification and combinatorial optimization problems.

The paper is organized as follows. Section 2 describes the foundations of membrane systems and the negative selection algorithm. Section 3 presents the details of \(\mathrm {\Pi }_\mathrm {NS}\), including the membrane structure, the evolution rules, and an analysis. Section 4 gives the conclusions and discusses further research.

2 Related Works

Membrane Computing devices are generically called P systems. They constitute a theoretical computing model of a distributed, parallel and non-deterministic type.

2.1 Cell-Like P System

The main syntactic ingredients of a cell-like P system are the membrane structure shown in Fig. 1, the multi-sets, and the evolution rules. The semantics of the systems are defined through a nondeterministic and synchronous model, by introducing the concepts of configuration, transition step, and computation.

Fig. 1. The structure of a cell-like P system

A basic transition P system of degree \(m\ge 1\) is a tuple,

$$\begin{aligned} \mathrm {\Pi } = (O, \mu , \omega _1, \cdots , \omega _m, (R_1, \rho _1), (R_2, \rho _2), \cdots , (R_m, \rho _m), i_o) \end{aligned} \tag{1}$$

where,

  (i) O is the alphabet of the system;

  (ii) \(\mu \) is a membrane structure consisting of m membranes, labeled by the numbers in the set {1,\(\cdots \), m};

  (iii) \(\omega _1, \cdots , \omega _m\) are multi-sets representing the objects initially present in the regions 1,\(\cdots \), m of the system;

  (iv) \(\mathrm {R}_1,\cdots ,\mathrm {R}_m\) are finite sets of evolution rules associated with the regions 1,\(\cdots \), m of \(\mu \), and \(\rho _1,\cdots ,\rho _m\) are strict partial order relations defined over \(\mathrm {R}_1,\cdots ,\mathrm {R}_m\) respectively, specifying a priority relation among the evolution rules. A rule can be written in the form \((u\rightarrow v, \rho _i)\), where \(u\rightarrow v\) is a rewriting rule and \(\rho _i\) \((1 \le i \le m)\) indicates its priority;

  (v) \(i_o\) indicates the output region.
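The ingredients of the tuple can be made concrete with a small data structure. The following Python fragment is only a sketch under stated assumptions: the class names `Rule` and `Region` and the helper functions are hypothetical, multi-sets are modelled with `collections.Counter`, and the priority is stored as an integer where a smaller number means a higher priority.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Rule:
    lhs: Counter          # multi-set u consumed by the rule
    rhs: Counter          # multi-set v produced by the rule
    priority: int = 1     # assumption: smaller number = higher priority


@dataclass
class Region:
    objects: Counter = field(default_factory=Counter)   # omega_i
    rules: list = field(default_factory=list)           # R_i with its priorities


def applicable(rule, objects):
    """A rule u -> v is applicable when the region holds every object of u."""
    return all(objects[sym] >= n for sym, n in rule.lhs.items())


def apply_once(rule, objects):
    """Consume u and produce v once; returns a new multi-set."""
    result = objects.copy()
    result.subtract(rule.lhs)
    result.update(rule.rhs)
    return +result        # unary plus drops symbols whose count fell to zero
```

For instance, a region holding the multi-set a, a, b with the rule ab → c ends up holding a, c after one application.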

Cell division in P systems plays a crucial role in generating exponential workspace in linear time. The membrane operations used in this paper, including division, are formalized as rules in Sect. 3.1.
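The effect of division on the workspace can be illustrated with a short Python sketch. It assumes the usual SAT-style construction in which every membrane divides once per variable; the function names and the dictionary representation of a membrane are hypothetical and not taken from the cited systems.

```python
def divide_all(membranes, variable):
    """One division step: every membrane splits into two copies,
    one recording variable=True and the other variable=False."""
    next_generation = []
    for contents in membranes:
        next_generation.append({**contents, variable: True})
        next_generation.append({**contents, variable: False})
    return next_generation


def build_workspace(variables):
    """Start from one empty membrane and divide once per variable."""
    membranes = [{}]
    for variable in variables:       # n variables -> n division steps
        membranes = divide_all(membranes, variable)
    return membranes
```

After n steps the sketch holds 2^n membranes, one per truth assignment, which is the sense in which the workspace grows exponentially while the number of steps stays linear.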

2.2 Negative Selection

The immune algorithm called negative selection was first developed by Forrest [6] for real-time detection of computer viruses. The algorithm mimics the maturation of T cells in the thymus: randomly generated T cells that react to self-cells are eliminated, so the mature T cells leaving the thymus do not match self-cells and therefore match only non-self cells. The basic concept of a negative selection algorithm is shown in Fig. 2.

Fig. 2. Negative selection algorithm
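The two phases of negative selection can be sketched in plain, sequential Python. This is a minimal illustration, not the membrane implementation proposed in this paper: cells are tuples of discrete attribute values, two cells are assumed to match when at least `threshold` attributes are equal, and all function and parameter names are hypothetical.

```python
import random


def matches(detector, cell, threshold):
    """Two cells match when at least `threshold` attributes are equal."""
    same = sum(1 for d, c in zip(detector, cell) if d == c)
    return same >= threshold


def generate_detectors(self_set, value_ranges, n_detectors, threshold, rng,
                       max_tries=10_000):
    """Censoring phase: keep random candidates that match no self cell."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = tuple(rng.choice(values) for values in value_ranges)
        if not any(matches(candidate, s, threshold) for s in self_set):
            detectors.append(candidate)
    return detectors


def is_nonself(cell, detectors, threshold):
    """Monitoring phase: a cell matched by any detector is flagged non-self."""
    return any(matches(d, cell, threshold) for d in detectors)
```

The `max_tries` bound is a design choice of the sketch: it prevents an endless loop when the self set covers so much of the value space that too few valid detectors exist.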

3 \(\mathrm {\Pi }_\mathrm {NS}\) for Classification

The P system named \(\mathrm {\Pi }_\mathrm {NS}\) is proposed to create a classifier that classifies unlabeled objects. The details of the system are given in this section.

3.1 Definition

\(\mathrm {\Pi }_\mathrm {NS}\) for classification can be defined as:

$$\begin{aligned} \mathrm {\Pi }_\mathrm {NS} = (O, \mu , \omega _1, \cdots , \omega _m, R, i_o) \end{aligned} \tag{2}$$

Fig. 3. Initial membrane structure

  I. O is a finite and non-empty alphabet of objects:

    O = \(\{\varPsi _1, \varPsi _2,\beta ,\gamma ,\gamma ',\lambda ,\delta ,\phi ,\phi _{s},\phi _{ss},\phi _{c},\phi _{cc},\eta ,\eta _0,\eta ',\eta _i,\eta _{ii}\}\)

    \(\cup \,\{\xi _i,0 \le i \le 26\}\,\cup \,\{a_j,b_j,\cdots ,z_j, 1 \le j \le k\}\)

  II. \(\mu \) is the membrane structure, composed of four main membranes inside the skin membrane, as shown in Fig. 3. Membrane G represents the gene pool and stores the different gene segments as initial multi-sets, which must be placed in advance. Membrane S represents the self-set and stores the autologous cells, which must also be placed in advance; the number of these cells is defined as \(2^k-1\). Membrane C denotes the classifier, which retains the detectors; it is empty at the beginning. Membrane T is the template membrane from which a new membrane u is generated to store the unlabeled cells.

  III. \(\omega _i\) is the multi-set of objects representing the i-th record of the data set, placed in region i.

  IV. The rules in R have priorities. Some rule forms are explained here, where k indicates the priority:

    • \([_h u \delta ]_h \rightarrow \mathrm{v},k\)

      Membrane h is dissolved, and the object u is replaced with the object v.

    • \([_h u \rightarrow v ]_h,k\)

      In region h, the multi-set of objects specified by u is removed and the objects specified by v are introduced.

    • \([_h u \rightarrow (v,in_j)]_h,k\)

      The object u is removed from membrane h, and the object v is moved into membrane j, which is the upper membrane that has not been dissolved.

    • \([_h u \rightarrow (v,out)]_h,k\)

      The object v is moved to the region immediately outside membrane h.

    • \([_h u \rightarrow [_h v]_h]_h,k\)

      The object u is replaced with a new membrane containing the object v.

    • \([_{h1} u ]_{h1} \rightarrow [_{h2} u ]_{h2} [_{h2} u ]_{h2}\)

      All objects and membranes in the original membrane are duplicated and appear in both new membranes.

    • \(u_j[_h ]_h [_j ]_j \rightarrow [_j v [_h]_h]_j\)

      Membrane h is moved into the j-th membrane by \(u_j\), and \(u_j\) is replaced with the symbol v.

  V. \(i_o\) is the skin membrane, which outputs the result.

The smaller the value of k, the higher the priority of the corresponding rule; a rule with k = 1 has the highest priority.

All of the rules that can be applied must be applied simultaneously. The marked output membranes are never dissolved.
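The interplay of priorities and simultaneous application can be sketched as follows. The fragment assumes one common reading of priorities, namely that a rule fires only when no applicable rule has a smaller k, and it applies the selected rules as many times as the region allows. Rules are hypothetical `(lhs, rhs, k)` tuples over `Counter` multi-sets, and the letters `b` and `g` stand in for \(\beta \) and \(\gamma \).

```python
from collections import Counter


def times_applicable(lhs, objects):
    """How many copies of the multi-set `lhs` fit into `objects`."""
    return min(objects[sym] // n for sym, n in lhs.items())


def step(objects, rules):
    """One step: among the applicable rules, fire only those with the
    smallest k, each as many times as the remaining objects allow."""
    ready = [r for r in rules if times_applicable(r[0], objects) > 0]
    if not ready:
        return objects, False
    best = min(k for _, _, k in ready)
    remaining, produced = objects.copy(), Counter()
    for lhs, rhs, k in ready:
        if k != best:
            continue
        n = times_applicable(lhs, remaining)
        for sym, count in lhs.items():
            remaining[sym] -= count * n
        for sym, count in rhs.items():
            produced[sym] += count * n   # products become usable next step
    return remaining + produced, True


# Rule forms borrowed from R_S, with the exponent M - 1 taken as 2 for illustration.
rules = [
    (Counter(b=2), Counter(b=1, g=1), 1),       # beta^2 -> beta gamma, k = 1
    (Counter(b=1, g=2), Counter(out_b=1), 2),   # beta gamma^2 -> (beta, out), k = 2
]
```

Starting from two \(\beta \) and one \(\gamma \), the first step fires only the k = 1 rule and leaves one \(\beta \) and two \(\gamma \); the second step then fires the k = 2 rule and emits a single \(\beta \).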

3.2 Rule Set

The rules of P system \(\mathrm {\Pi }_\mathrm {NS}\) in each membrane are as follows:

  (1) The rule set in the skin membrane is called \(R_M\). In this set, \(r_1 \sim r_7\) are used in the generation phase and \(r_8 \sim r_{37}\) in the testing phase. In the generation phase, the number of new cells is determined by the number of symbols \(\phi \), denoted N. The symbol \(\phi _{ss}\) is used to send the candidate detectors into the self set. The key point of this process is that only one cell is left in the skin membrane, while all the others are sent into membrane S. In the testing phase, the symbol L represents the number of detectors to be generated, and the symbol K represents the number of values that each attribute can take. The priority of \(r_9\) is lower than that of \(r_{12} \sim r_{37}\); in other words, the objects are sent into membrane u before membrane u is duplicated.

    • \(r_1\)\( \varPsi _1 \rightarrow \phi ^N(\xi _0\xi _1\cdots \xi _M,in_G)\)

    • \(r_2\)\( \phi [_0 ]_0\rightarrow \phi _s[_0 ]_0[_0 ]_0\)

    • \(r_3\)\( \phi _s^N\rightarrow \phi _{ss}^N \)

    • \(r_4\)\( \phi _{ss}^N[_0 ]_0\rightarrow (\eta [_0 ]_0,in_s)\)

    • \(r_5\)\( \gamma \rightarrow \gamma '\varPsi _1([_0 ]_0,in_C),2\)

    • \(r_6\)\( \gamma '^L\varPsi _1\rightarrow \sharp ,1 \)

    • \(r_7\)\( \beta \rightarrow \varPsi _1(\eta '',in_0) \)

    • \(r_8\)\( \varPsi _2[_T ]_T\rightarrow \phi ^{L-1}[_T ]_T[_u ]_u\)

    • \(r_9\)\( \phi [_u ]_u\rightarrow \phi _c[_u ]_u[_u ]_u,2 \)

    • \(r_{10}\)\( \phi _c^{L-1}\rightarrow \phi _{cc}^L\)

    • \(r_{11}\)\( \phi _{cc}[_u ]_u\rightarrow (\eta _0[_u ]_u,in_C)\)

    • \(r_{12}\)\( a_j\rightarrow (a_j,in_u),1;(1\le j\le K)\)

    • \(r_{13}\)\( b_j\rightarrow (b_j,in_u),1;(1\le j\le K)\)

    • \(\cdots \)

    • \(r_{37}\)\( z_j\rightarrow (z_j,in_u),1;(1\le j\le K)\)

  (2) The rule set in membrane G is called \(R_G\). The symbol M represents the number of attributes in the data set; its maximum value is 26. The symbol \(\xi _0\) generates a new candidate cell, membrane 0. Membrane 0 leaves membrane G after all the attribute values have been sent into it by \(r_4 \sim r_{29}\).

    • \(r_1\)\(\xi _i \rightarrow (\xi _i,in_i),1; (1\le i\le M) \)

    • \(r_2\)\( \xi _0\rightarrow [_0 ]_0,1 \)

    • \(r_3\)\( \phi ^M[_0 ]_0\rightarrow ([_0 ]_0,out_G) \)

    • \(r_4\)\( a_j\rightarrow \phi (a_j,in_0),1;(1\le j\le K)\)

    • \(r_5\)\( b_j\rightarrow \phi (b_j,in_0),1;(1\le j\le K)\)

    • \(\cdots \)

    • \(r_{29}\)\( z_j\rightarrow \phi (z_j,in_0),1;(1\le j\le K)\)

  (3) The rule set in gene membrane i of membrane G is called \(R_{gi}\), \(1\le i \le M\). The symbol \(\xi _i\) releases a randomly chosen attribute value from the corresponding membrane i.

    • \(r_1\)\( \xi _1 a_j\rightarrow a_j(a_j,out_1);(1\le j\le K)\)

    • \(r_2\)\( \xi _2 b_j\rightarrow b_j(b_j,out_2);(1\le j\le K)\)

    • \(\cdots \)

    • \(r_{26}\)\( \xi _{26} z_j\rightarrow z_j(z_j,out_{26});(1\le j\le K)\)

  (4) The rule set in membrane S is called \(R_S\). The symbol N represents the number of self cells. The symbol \(\beta \) indicates that the corresponding attributes of the two cells match, and the symbol \(\gamma \) indicates that the two cells do not match.

    • \(r_1\)\( \eta \eta _i\rightarrow \eta _{ii};(1\le i\le N) \)

    • \(r_2\)\( \eta _{ii}[_i]_i \rightarrow [_i]_i\eta _i([_i\eta ']_i,in_0);(1\le i\le N)\)

    • \(r_3\)\( \beta ^2\rightarrow \beta \gamma ,1\)

    • \(r_4\)\( \beta \gamma ^{M-1} \rightarrow (\beta ,out_S),2 \)

    • \(r_5\)\( \gamma ^M \rightarrow (\gamma ,out_S)\)

  (5) The rule set in self membrane i of membrane S is called \(R_{si}\), \(1\le i \le N\). It dissolves the membrane to release its objects into the outer membrane.

    • \(r_1\)\( \eta '\rightarrow \gamma \delta \)

  (6) The rule set in membrane 0 is called \(R_0\). Rules \(r_1 \sim r_{26}\) are the matching rules for the affinity measure. The symbol m is the affinity threshold. If the number of matched attributes reaches the threshold, the symbol \(\beta \) is released from the membrane; otherwise the symbol \(\gamma \) is released.

    • \(r_1\)\( a_j^2\rightarrow \beta ;(1\le j\le K) \)

    • \(r_2\)\( b_j^2\rightarrow \beta ;(1\le j\le K) \)

    • \(\cdots \)

    • \(r_{26}\)\( z_j^2\rightarrow \beta ;(1\le j\le K) \)

    • \(r_{27}\)\(\gamma \rightarrow \gamma ^2,3 \)

    • \(r_{28}\)\(\beta ^m\gamma ^2 \rightarrow \eta ''(\beta ,out_0),1\)

    • \(r_{29}\)\( \gamma ^2\rightarrow \eta ''(\gamma ,out_0),1\)

    • \(r_{30}\)\( \beta \rightarrow \lambda |_{\eta ''},1\)

    • \(r_{31}\)\( a_j\rightarrow \lambda |_{\eta ''},1; (1\le j\le K) \)

    • \(r_{32}\)\( b_j\rightarrow \lambda |_{\eta ''},1; (1\le j\le K) \)

    • \(\cdots \)

    • \(r_{56}\)\( z_j\rightarrow \lambda |_{\eta ''},1; (1\le j\le K) \)

    • \(r_{57}\)\(\eta '' \rightarrow \delta ,2 \)

    • \(r_{58}\)\(\eta ' \rightarrow \gamma \delta \)

  (7) The rule set in membrane C is called \(R_C\). Copies of the unlabeled cell are sent into each detector in this membrane. The symbol Z represents a self cell and the symbol Y represents a non-self cell. The result is sent out of this membrane into the skin membrane.

    • \(r_1\)\( \eta _0[_0]_0\rightarrow [_0]_0([_0\eta ']_0,in_u) \)

    • \(r_2\)\( \beta ^2\rightarrow \beta \gamma ,1\)

    • \(r_3\)\( \beta \gamma ^{N-1}\rightarrow (Y,out_C),2 \)

    • \(r_4\)\( \gamma ^N\rightarrow (Z,out_C) \)

  (8) The rule set in membranes T and u is called \(R_T\). Each unknown cell is sent into a copy of membrane T, saved as membrane u. The matching procedure is the same as in \(R_0\). If the affinity of the two cells reaches the threshold, the symbol \(\beta \) is released from the membrane; otherwise the symbol \(\gamma \) is released.

    • \(r_1\)\( a_j^2\rightarrow \beta ; (1\le j\le K) \)

    • \(r_2\)\( b_j^2\rightarrow \beta ; (1\le j\le K) \)

    • \(\cdots \)

    • \(r_{26}\)\( z_j^2\rightarrow \beta ; (1\le j\le K) \)

    • \(r_{27}\)\(\gamma \rightarrow \gamma ^2,3 \)

    • \(r_{28}\)\(\beta ^m\gamma ^2 \rightarrow \eta ''(\beta ,out_u),1\)

    • \(r_{29}\)\( \gamma ^2\rightarrow \eta ''(\gamma ,out_u),1\)

    • \(r_{30}\)\( \beta \rightarrow \lambda |_{\eta ''},1\)

    • \(r_{31}\)\( a_j\rightarrow \lambda |_{\eta ''},1; (1\le j\le K) \)

    • \(r_{32}\)\( b_j\rightarrow \lambda |_{\eta ''},1; (1\le j\le K) \)

    • \(\cdots \)

    • \(r_{56}\)\( z_j\rightarrow \lambda |_{\eta ''},1; (1\le j\le K) \)

    • \(r_{57}\)\(\eta '' \rightarrow \delta ,2 \)

3.3 Algorithm Implementation

\(\mathrm {\Pi }_\mathrm {NS}\) contains three important phases: generating the candidate detectors randomly; calculating the affinity of each candidate to decide whether it is deleted or converted into an immune cell in the classifier; and testing the performance of the classifier.

In the skin membrane M, when the starting symbol \(\varPsi _1\) is received, rule (\(r_1 \in R_M\)) sends symbols into membrane G to produce a new elementary membrane called the candidate cell. In membrane G, the new membrane, labeled 0, is created for the candidate detector by rule (\(r_2 \in R_G\)). Each symbol \(\xi _i\) (\(1 \le i \le M\)) is sent into gene membrane i, where it releases a random attribute value by the rules (\(r_1 \sim r_{26} \in R_{gi}\)); the detector then leaves the gene pool by rule (\(r_3 \in R_G\)).
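The random generation of a candidate cell from the gene pool can be sketched in Python. The sketch assumes that attribute values are encoded as the objects \(a_j, b_j, \cdots \) of the alphabet O, written here as strings such as `a1` or `b3`; the function names are hypothetical.

```python
import random
import string


def build_gene_pool(n_attributes, n_values):
    """Gene membrane i holds the K possible values of attribute i,
    written a1..aK, b1..bK, and so on (at most 26 attributes)."""
    letters = string.ascii_lowercase[:n_attributes]
    return [[f"{letter}{j}" for j in range(1, n_values + 1)]
            for letter in letters]


def new_candidate(gene_pool, rng):
    """Each xi_i releases one randomly chosen value from gene membrane i;
    the values are collected in a fresh membrane 0, the candidate cell."""
    return [rng.choice(values) for values in gene_pool]
```

Calling `new_candidate(build_gene_pool(4, 3), random.Random())` returns one value per attribute, for example a list of the form `["a2", "b1", "c3", "d1"]`.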

The candidate cell is replicated by membrane division, because its affinity with every self cell in the self set must be calculated. Suppose the number of self cells is N (N = \(2^k-1\)). The symbol \(\phi \) controls the number of replications, producing N copies that are sent into membrane S by the rules (\(r_2 \sim r_4 \in R_M\)). Only one membrane is left in the skin membrane, waiting for the next operation.
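One way to read the choice N = \(2^k-1\) is sketched below. The reading is an assumption rather than a statement from the rule set: each symbol \(\phi \) triggers one division, and every copy that already exists can divide in the same step. The function name is hypothetical.

```python
def replicate(n_tokens):
    """Simulate division driven by phi symbols.
    Returns (number of membranes, number of parallel steps)."""
    membranes, tokens, steps = 1, n_tokens, 0
    while tokens > 0:
        used = min(tokens, membranes)   # each existing copy divides at most once
        membranes += used
        tokens -= used
        steps += 1
    return membranes, steps
```

Under this reading, `replicate(7)` gives 8 membranes after 3 steps: seven copies for the seven self cells in membrane S and one copy left in the skin membrane.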

In membrane S, once the symbol \(\eta \) is sent in, the rules (\(r_1,r_2\in R_S\)) send a copy of each self cell into a candidate cell. Membrane i then dissolves and releases its objects into membrane 0. Next, the rules (\(r_1 \sim r_{26} \in R_0\)) may or may not fire. There are (M + 1) possible outcomes, since between 0 and M attributes can match; each matching attribute produces one \(\beta \). The symbol \(\gamma \) acts as a time slice, and rule \(r_{27}\) increases its count to two. The threshold m (\(1 \le m \le M\)) determines the similarity of two cells and is chosen according to the specific problem. The final output is \(\beta \) or \(\gamma \), representing matched or not matched, respectively.
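The matching inside membrane 0 can be mimicked with multi-sets. In this sketch, which uses the hypothetical string encoding `a1`, `b2`, and so on for attribute values, the objects of both cells are merged; a value that occurs twice corresponds to one application of a rule of the form \(x_j^2\rightarrow \beta \), and the threshold is read as "at least m matches", in line with the left-hand side \(\beta ^m\gamma ^2\) of \(r_{28}\).

```python
from collections import Counter


def match_in_membrane(candidate, self_cell, m):
    """Return 'beta' if at least m attributes coincide, else 'gamma'."""
    contents = Counter(candidate) + Counter(self_cell)   # both cells share membrane 0
    # Rules x_j^2 -> beta: a value present twice came from both cells.
    betas = sum(count // 2 for count in contents.values())
    return "beta" if betas >= m else "gamma"
```

With four attributes and m = 3, two cells that differ in a single attribute yield `beta`, while the same pair yields `gamma` when m = 4.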

In the skin membrane, the symbol \(\gamma \) means that the new cell is not recognized by the self set and joins the classifier by rule (\(r_5 \in R_M\)). The symbol \(\beta \) means that the new cell is recognized and must be deleted, and the next round of search is started by the rules (\(r_7 \in R_M\) and \(r_{57} \in R_0\)). L indicates the number of detectors to be generated; the search continues until the termination symbol \(\sharp \) is produced.

Once the classifier has been generated, its classification performance must be tested. In the testing process, the symbol \(\varPsi _2\) is input from the environment to start the detection. Rule (\(r_8 \in R_M\)) generates membrane u, and the unlabeled cell is stored in it by the rules (\(r_{12} \sim r_{37} \in R_M\)). Membrane u is then replicated and the copies are sent into membrane C by the rules (\(r_9 \sim r_{11} \in R_M\)). The replication and matching processes are similar to those used for affinity testing during detector generation.

In membrane C, the rules (\(r_3 \sim r_4 \in R_C\)) release the symbol Y, which means that the cell is allogeneic (non-self), or the symbol Z, which means that the cell is autologous (self).
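The way membrane C combines the detector results can be sketched as follows. The function `aggregate` follows the shape of the rules in \(R_C\): surplus \(\beta \) symbols are turned into \(\gamma \) until at most one remains, and the final multi-set decides between Y and Z. The sketch takes the exponent in \(r_3\) and \(r_4\) to be the number of results received, which is an assumption; the function names and the tuple encoding of cells are hypothetical.

```python
def aggregate(outputs):
    """Combine the beta/gamma symbols released by the detectors."""
    n = len(outputs)
    betas = outputs.count("beta")
    gammas = outputs.count("gamma")
    while betas >= 2:                    # r_2: beta^2 -> beta gamma
        betas -= 1
        gammas += 1
    if betas == 1 and gammas == n - 1:   # r_3: beta gamma^(n-1) -> Y
        return "Y"
    if gammas == n:                      # r_4: gamma^n -> Z
        return "Z"
    raise ValueError("unexpected symbols in detector outputs")


def classify_cell(cell, detectors, m):
    """Match one unlabeled cell against every detector, then aggregate."""
    outputs = []
    for detector in detectors:
        same = sum(1 for d, c in zip(detector, cell) if d == c)
        outputs.append("beta" if same >= m else "gamma")
    return aggregate(outputs)
```

In the P system the detector comparisons happen in parallel on copies of membrane u; the loop in `classify_cell` is a sequential stand-in that yields the same symbol, Y as soon as any detector matches and Z otherwise.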

3.4 Analyses

\(\mathrm {\Pi }_\mathrm {NS}\) is a P system that implements an immune algorithm for classification. The key parts of the classification problem are implemented in three membranes, so the data of a practical problem can be encoded as objects in the corresponding membranes. Membrane G represents the gene pool and stores the different gene segments as initial multi-sets; in other words, it contains the range of values of each attribute, and the values must be discrete. Membrane S represents the self-set and stores the autologous cells. Typically, all the data is divided into two sets: a training set and a testing set. The records of the training set are encoded and placed in membrane S in advance. If a new candidate cell is recognized by the cells of the training set, the two are of the same kind; otherwise, they belong to different kinds, and the candidate cell is converted into a detector and sent to membrane C, which acts as the classifier. Once the classifier has been generated, it can be used to differentiate objects without a class label. The performance of the classifier can be tested with the data of the testing set.

The simulation of \(\mathrm {\Pi }_\mathrm {NS}\) based on negative selection was validated on the iris data set from the UCI Machine Learning Repository [7]. The initial data of the iris data set was preprocessed. In the experiments, the affinity threshold for two different cells is 75%. Following the idea of membrane computing, in both the detector generation phase and the testing phase, the matching of the self cells with a detector and of the testing cell with the detectors is executed in parallel. The system thus converts the serial execution of the basic algorithm into parallel execution, which decreases the execution time of the algorithm. The classification accuracy improves as the number of detectors increases and is comparable to that of other leading classification algorithms.
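The preprocessing is not described in detail, so the following Python sketch only illustrates one possible scheme: equal-width binning of a numeric attribute into K discrete values, and the conversion of an affinity ratio into a threshold m on the number of matching attributes. Both functions and their parameters are hypothetical.

```python
import math


def discretize(value, low, high, n_bins):
    """Map a numeric value to a bin index 1..n_bins (equal-width bins)."""
    if high <= low:
        raise ValueError("high must exceed low")
    position = (value - low) / (high - low)
    return min(n_bins, max(1, int(position * n_bins) + 1))


def threshold_from_ratio(ratio, n_attributes):
    """Smallest number of equal attributes that reaches the given ratio."""
    return math.ceil(ratio * n_attributes)
```

For the four numeric attributes of the iris data set, a 75% affinity would correspond to `threshold_from_ratio(0.75, 4)`, that is, m = 3 matching attributes under this reading.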

4 Conclusion

In this paper, we have proposed \(\mathrm {\Pi }_\mathrm {NS}\) for classification, based on the negative selection algorithm of artificial immune systems. The important advantage of the system is its parallelism, which is an inherent characteristic of membrane systems. In the standard negative selection algorithm, the affinity of two cells is calculated serially, both when searching for immune cells and when testing unlabeled cells. In \(\mathrm {\Pi }_\mathrm {NS}\), however, the candidate cell is duplicated so that it can be matched against all the self cells or detectors at the same time, which can increase the search speed. Some problems remain to be solved, the most important of which are handling noise in the data and increasing the efficiency of the algorithm on data sets whose patterns are not uniformly distributed among the classes.