The modern literature contains a large body of experimental data on the role of genotype in the development of animal cognitive abilities [1–3]. In most cases, however, these studies examine differences in cognitive test performance that result from modulating the expression of certain genes which determine particular properties of neuronal networks (e.g., [4–6]). It should also be noted that in the majority of such studies the term “cognitive” behavior covers a wide range of plasticity functions, based mainly on different forms of associative learning [7, 8]. In this work, following the ideas of L.V. Krushinsky, the term “animal cognitive abilities” denotes the ability to grasp the laws that connect objects and events of the external world and to build subsequent behavior on this basis [9]. The neurobiological analysis of such cognitive abilities requires adequate genetic models, in particular animal strains that differ in specific cognitive abilities. This paper presents data on the first generations of artificial selection of laboratory mice for high and low scores in solving a cognitive test that reveals cognitive ability in the sense described above.

The selection experiment is based on variable success in the puzzle-box test (PBT), which relies on the animal’s drive to hide in the dark from a brightly lit area. This paradigm tests animal cognitive ability per se, since solving the test means solving an elementary logic task that does not require previous learning. In the experiments presented here, the test consisted of four task presentations (stages). The initial version of this test was introduced as part of a test battery for evaluating factor g [10]. Factor g, or the factor of general intelligence, was introduced into experimental psychology by Ch. Spearman, who demonstrated that the scores from a range of cognitive tests correlate with one another. In recent animal experiments, the g factor is used as an integrative index of cognitive abilities derived by statistical evaluation of data from a large test battery [10]. The logic structure of the PBT is based on the animal’s ability to grasp the “object permanence” rule as treated by Piaget [11].

The mouse is placed in the brightly lit part of the experimental box (30 × 28 × 27.5 cm) and seeks to hide in the dark part (14 × 28 × 27.5 cm), which it can reach via an underpass recessed into the floor of the box. During the first test stage, the underpass leading into the dark compartment is open, and the animal meets no obstacles on its way. During the second stage, the underpass is masked with clean wood shavings up to the level of the box floor, while at stages 3 and 4 it is blocked with a light plug (made of cardboard and plastic), which the mouse can remove by pulling it out with its teeth or pushing it aside with its muzzle [11]. The animal was given 180 s to solve stages 1 and 2 and 240 s to solve stages 3 and 4. After the animal entered the dark part of the box, it was left there for 15–20 s and then placed in a separate cage for 45–60 s before the next stage of the test. The latency of entering the dark part of the box was recorded, as well as whether each test stage was solved or not. At the stages when the underpass was blocked with a plug, movements aimed at removing the plug (plug “manipulations,” i.e., attempts to enter the dark compartment by seizing the plug with the teeth or by raising it) were also recorded. When the animal failed to solve such a stage within 240 s, the presence or absence of these “manipulations” was considered important for evaluating the behavioral differences (see below).
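The four-stage protocol described above can be summarized as a small data structure. The following is only an illustrative sketch: the names `Stage`, `STAGES`, `score_stage`, and `proportion_solved` are hypothetical and do not come from the original study. It is also an assumption that a stage counts as solved when the mouse enters the dark compartment within the stage time limit, and that no latency is kept for unsolved stages.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    number: int
    obstacle: str
    time_limit_s: int  # time allowed for the stage, as stated in the text


# Stage definitions taken from the protocol description.
STAGES = (
    Stage(1, "underpass open", 180),
    Stage(2, "underpass masked with wood shavings", 180),
    Stage(3, "underpass blocked with a plug", 240),
    Stage(4, "underpass blocked with a plug", 240),
)


def score_stage(stage: Stage, latency_s: Optional[float]) -> Tuple[bool, Optional[float]]:
    """Return (solved, latency) for one stage.

    Assumption: latency_s is None when the mouse never entered the dark
    compartment; an entry later than the time limit also counts as unsolved.
    """
    solved = latency_s is not None and latency_s <= stage.time_limit_s
    return solved, (latency_s if solved else None)


def proportion_solved(stage: Stage, latencies: Sequence[Optional[float]]) -> float:
    """Percentage of animals that solved the given stage."""
    solved = sum(score_stage(stage, lat)[0] for lat in latencies)
    return 100.0 * solved / len(latencies)


if __name__ == "__main__":
    # Hypothetical latencies (s) for five mice at stage 3; not real data.
    demo = [35.0, 120.5, None, 239.0, None]
    print(f"Stage 3 solved by {proportion_solved(STAGES[2], demo):.0f}% of mice")
```

Keeping the time limit on the stage record rather than in the scoring function makes the difference between stages 1–2 (180 s) and stages 3–4 (240 s) explicit in one place.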

The criterion for selection into the “plus” sub-strain was successful solution of the test at its most “difficult” stages, when the underpass was blocked with a plug (see below), while the criterion for selection into the “minus” sub-strain was the inability to solve these stages. F20 mice of the EX strain selection experiment served as the base population for the new selection [11]. Strain EX had been bred in a previous selection experiment for high scores of extrapolation ability, in a paradigm in which the stimulus disappeared from the animal’s view. In the first generations of the selected EX strain, the proportion of animals able to solve this task significantly exceeded the 50% chance level [11], while in later generations (starting from F9–10) this prevalence was expressed irregularly. In the present work, results on PBT performance are presented for several groups of animals: F20 mice of the EX strain selection (parents of the F1 of the present selection, n = 20), F1–F3 mice of the new selection (sub-strains “plus” and “minus,” n = 378), and animals from the unselected control population CoEX (n = 34), which served as a control group during the selection for high scores of extrapolation ability [11].

The PBT performance at stages 1 and 2 (the underpass open or masked with wood shavings) was practically the same in the “plus” and “minus” groups, and the reactions were relatively quick; Figure 2 shows the respective latencies. Distinct differences between the “plus” and “minus” groups were found in the scores for stages 3 and 4, when the underpass was blocked with a plug (Figs. 1, 2), and this could be regarded as a response to selection in these generations. The solution score at stage 4 (the proportion of mice that solved the task) was higher than at stage 3 (with the exception of the “plus” group in F2), and the mean latencies were shorter, which indicates that the animals used their memory of the previous task presentation.

Fig. 1.

Proportions (ordinate, %) of successful puzzle-box solutions by F20 strain EX mice, the parents of the F1 of the new selection (see text) (1), by F1–F3 mice from the new sub-strains (“plus” and “minus” groups) (2–4), and by mice from the unselected control population (F23) (5) in test presentations when the underpass leading to the dark box compartment was blocked with a plug. The gray columns show the 1st test stage with the plug, and the black columns show the 2nd test stage with the plug. *, **, *** Scores are significantly different from the respective proportion in the “minus” sub-strain of the same generation, p < 0.05, 0.01, and 0.001, respectively (Fisher φ method for the difference between alternative proportions). #, & Significant differences from the proportions of F1–F3 “plus” sub-strain mice (with different p values, details not shown). Numerals above the columns are the numbers of mice tested in each group.

Fig. 2.

Mean latencies (± standard error) of entering the dark part of the box during successive puzzle-box test stages in mice of different groups (see Fig. 1): (a) the underpass is open, (b) the underpass is masked with wood shavings, (c, d) the underpass is blocked with the plug. The gray columns show the mice of the sub-strain “plus,” and the black columns show the mice of the sub-strain “minus.” Designations for (1–5) are the same as in Fig. 1. *, **, *** Significant differences from the respective values for the “minus” sub-strain, p < 0.05, 0.01, and 0.001, respectively (one-factor ANOVA, Fisher post hoc test).

CoEX mice were much less successful at test stages 3 and 4 (the underpass blocked with a plug) than mice of the three selection generations (in F2 and F3, less successful even than the mice of the “minus” sub-strain).

When the underpass was blocked with a plug and the animal failed to solve the task, observations showed that in the majority of cases the animals attempted to grip the plug with their teeth or to shift it aside, that is, the mice “manipulated” this object. This was found in F20 mice as well as in mice of the selected sub-strains. This behavior indicates that the mouse tried to overcome the obstacle but was unable to perform the required action. Mice that did not “manipulate” the plug were rare: there was only one “non-manipulating” mouse in F3 of the “plus” sub-strain (out of 83 animals) and only two in F3 of the “minus” sub-strain (out of 73 animals). At the same time, 8 out of 34 mice in the F23 CoEX unselected control population (tested at the same time as all F3 animals) did not manipulate the plug. The differences in the success of solving the “most difficult” test stages, which were more clearly expressed in the last selection generation (F3), might be evidence of a positive effect of selection for these contrasting traits. It should be recalled, however, that both sub-strains originated from mice of the EX strain, which had been selected for 20 generations for successful solution of the task of extrapolating the direction of movement [11]. The logic structure of the extrapolation test implies that, in order to solve the task, an animal should understand that the object (a milk cup) that disappeared from view still exists and can be searched for. Thus, mice of both new sub-strains understood the “object permanence” rule to a certain degree: the majority of these animals tried to get into the underpass by pushing and trying to raise the plug (i.e., by “manipulating” the plug).
Thus, the selection for successful and unsuccessful PBT solution at the stage when the plug blocked the underpass revealed differences not in the ability to solve the task but in the ability to carry the solution through to a definite result (i.e., in the expression of the so-called “executive functions” in these animals [12, 13]). That success in task performance reflects the expression of “executive functions” was demonstrated in one of the first papers in which this test was used [13]. At the PBT stages when the underpass was open or masked with wood shavings, differences in the expression of “executive functions” could presumably be seen in solution latencies (with quicker reactions in mice of the “plus” groups) rather than in the scores of successful test solutions (Figs. 2a, 2b). The behavior of mice from the two sub-strains was also compared in the hyponeophagia test, in which an animal was given a new food (cheese) in a new (although not frightening) environment. The scores of this test, which evaluates the response to novelty (also a component of animal cognitive abilities), were more pronounced in mice of the “plus” groups (data not shown). It should be noted that the available literature on the cognitive ability of rodents (i.e., their ability to solve elementary logic tasks) contains no analogues of the present study. In both fields (neurogenetics as a whole and studies of the role of genotype in the expression of cognitive traits), researchers mostly attempt to analyze the effects of gene expression modulation in genetically modified animals, and the respective list of papers includes many dozens of experimental reports. Studies of the origin of human CNS diseases are especially numerous [14–16, etc.], and experimental approaches using batteries of cognitive tests are very popular as well [10, 17, 18, etc.]. At the same time, no selection experiments have been performed.
This area of neurobiology as a whole remains largely unexplored, despite the great progress in revealing the brain structures, signal pathways, and specific neuronal groups that are crucial for cognitive functions.
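As an illustration of the proportion comparison named in the caption of Fig. 1 (the Fisher φ method, i.e., the angular transformation of proportions), the counts of “non-manipulating” mice given above (1 of 83, 2 of 73, and 8 of 34) can be compared as in the sketch below. This is a minimal sketch and not the authors’ analysis: the source does not report a test for these counts, the function names are hypothetical, and the two-sided normal approximation for the p value is an assumption.

```python
import math


def phi(p: float) -> float:
    """Fisher angular transformation of a proportion p in [0, 1]."""
    return 2.0 * math.asin(math.sqrt(p))


def fisher_phi_test(k1: int, n1: int, k2: int, n2: int):
    """Compare proportions k1/n1 and k2/n2.

    Returns (phi_star, p_two_sided), where
    phi_star = |phi1 - phi2| * sqrt(n1 * n2 / (n1 + n2)).
    Assumption: phi_star is treated as approximately standard normal
    under the null hypothesis, and the p value is two-sided.
    """
    diff = abs(phi(k1 / n1) - phi(k2 / n2))
    phi_star = diff * math.sqrt(n1 * n2 / (n1 + n2))
    p_two_sided = math.erfc(phi_star / math.sqrt(2.0))
    return phi_star, p_two_sided


# Non-manipulating mice / animals tested, as stated in the text.
NON_MANIPULATORS = {
    "F3 plus": (1, 83),
    "F3 minus": (2, 73),
    "CoEX F23": (8, 34),
}

if __name__ == "__main__":
    k_ctrl, n_ctrl = NON_MANIPULATORS["CoEX F23"]
    for group in ("F3 plus", "F3 minus"):
        k, n = NON_MANIPULATORS[group]
        phi_star, p = fisher_phi_test(k, n, k_ctrl, n_ctrl)
        print(f"{group} vs CoEX F23: phi* = {phi_star:.2f}, p = {p:.4f}")
```

The angular transformation stabilizes the variance of a proportion, which is why the same formula can be applied to proportions close to 0, as is the case here.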

In conclusion, the results of this study demonstrate for the first time significant differences in the solution of an elementary logic task (i.e., in the expression of “executive functions”) in mice that had undergone three generations of selection for successful and unsuccessful solution of the “object permanence” task.