1 Introductory Remarks

The word science originates from the Latin word scientia, which means knowledge. Science is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the Universe. Modern science is a discovery as well as an invention. It is a discovery that Nature generally acts regularly enough to be described by laws and even by mathematics; and it required invention to devise the techniques, abstractions, apparatus, and organization for exhibiting the regularities and securing their law-like descriptions [1, 2]. The institutional goal of science is to expand certified knowledge [3]. This expansion relies on the ability of science to produce and communicate scientific knowledge. We stress especially the communication of new knowledge, since communication is an essential social feature of scientific systems [4]. This social function of science has long been recognized [5–9].

Research is creative work undertaken on a systematic basis in order to increase the stock of knowledge, including knowledge of humans, culture, and society, and the use of this stock of knowledge to devise new applications [10]. Scientific research is one of the forms of research. Usually, modern science is connected to research organizations. In most cases, the dynamics of these organizations are nonlinear, which means that small influences may lead to large changes. Because of this, the evolution of such organizations must be managed very carefully and on the basis of sufficient knowledge of the laws that govern the corresponding structures and processes. This knowledge may be obtained by studying research structures and processes. Two important goals of such studies are (i) adequate modeling of the dynamics of the corresponding structures and (ii) design of appropriate tools for evaluating the production of researchers.

This chapter contains the minimum amount of knowledge needed for a better understanding of indicators, indexes, and mathematical models discussed in the following chapters. We consider science as an open system and stress the dissipative nature of research systems. Dissipativity of research systems means that they need continuous support in the form of inflows of money, equipment, personnel, etc. The evolution of research systems is similar to that of other open and dissipative systems: it happens through a sequence of instabilities that lead to transitions to more (or less) organized states of corresponding systems.

Science may play an important role in a national economic system. This is shown on the basis of the Triple Helix model of a knowledge-based economy. Competition is an important feature of modern economics and society. Competition has many faces, and one of them is scientific competition among nations. This kind of competition is connected to the academic diamond: in order to be successful in globalization, a nation has to possess an academic diamond and use it effectively.

In order to proceed to the methods for quantitative assessment of research and research organizations and to mathematical models of science dynamics, one needs some basic information about assessment of research. A minimum of such basic information is presented in the second part of the chapter. The discussion begins with remarks about quality and measurement of processes by process indicators. Measurement can be qualitative and quantitative, and four kinds of measurement scales are described. The discussion continues with remarks on the non-Gaussianity that occurs frequently as a feature of social processes. Research also has characteristics of a social process, and many components and processes connected to research possess non-Gaussian statistical characteristics.

If one wants to measure research, one needs quantitative tools for measurement. Scientometrics, bibliometrics, and informetrics provide such tools, and a brief discussion of quantities that may be measured and analyzed is presented further in the text. In addition, another useful tool for analysis of research and research structures, the knowledge landscape, is briefly discussed. Next, research production is discussed in more detail. Special attention is devoted to publications and citations, since they contain important information that is useful for assessment of research production. The discussion continues with remarks on methods and systems for assessment of research and research organizations. Tools for assessment of basic research as well as the method of expert evaluation and several systems for assessment of research organizations applied in countries from continental Europe are briefly mentioned. The discussion ends with a description of the English–Czerwon method for quantification of performance of research units, which makes it possible to combine qualitative and quantitative information in order to compare the research results of research groups or research organizations.

2 Science, Technology, and Society

Knowledge is our most powerful engine of production

Alfred Marshall

Science, innovation, and technology have led some countries to a state of developed societies and economies [11–16]. Thus science is a driving force of positive social evolution, and the neglect of this driving force may turn a state into a laggard [17]. Basic research is an important part of the driving force of science. This kind of research may have large economic consequences, since it produces scientific information that has certain characteristic features of goods [18] such as use value and value. The use value of scientific information is large if the obtained scientific information can be applied immediately in practice or for generation of new information. One indicator for the measure of this value is the number of references of the corresponding scientific publication. The value of scientific information is large when it is original, general, coherent, valid, etc. The value of scientific information is evaluated usually in the “marketplace” such as scientific journals or scientific conferences.

The lag between basic research and its economic consequences may be long, but the economic impact of science is indisputable [19, 20]. This is an important reason to investigate the structures, laws, processes, and systems connected to research [21–26]. The goals of such studies are [27]: better management of the scientific substructure of society [28–30], increase of the effectiveness of scientific research [31–34], and efficient use of science for rapid and positive social evolution. The last goal is connected to the fact that science is the main factor in the increase of productivity. In addition, science is a sociocultural factor, for it directly influences the social structures and systems connected to education, culture, professional structure of society, social structure of society, distribution of free time, etc. The societal impact of science as well as many aspects of scientific research may be measured [35–43].

Science is an information-producing system [44, 45]. That information is contained in scientific products. The most important of these products are scientific publications, and the evaluation of results of scientific research is usually based on scientific publications and on their citations. Scientific information is very important for technology [46–48] and leads to the acceleration of technological progress [49–59]. Science produces knowledge about how the world works. Technology contains knowledge of some production techniques. There are knowledge flows directed from the area of science to the area of technology [60, 61]. In addition, technological advance leads to new scientific knowledge [62], and in the process of technological development, many new scientific problems may arise. New technologies lead also to better scientific equipment. This allows research in new scientific fields, e.g., the world of biological microstructures. Advances in science may reduce the cost of technology [63–66]. In addition, advances in science lead to new cutting-edge technologies, e.g., laser technologies, nanoelectronics, gene therapy, quantum computing, some energy technologies [67–74]. But the cutting-edge technologies do not remain cutting-edge for long. Usually, there are several countries that are the most advanced technologically (technology leaders), and the cutting-edge technologies are concentrated in those countries. And those countries generally possess the most advanced research systems.

In summary, what we observe today is a scientifically driven technological advance [75–81]. And in the long run, technological progress is the major source of economic growth.

The ability of science to speed up achievement of national economic and social objectives makes the understanding of the dynamics of science and the dynamics of research organizations an absolute necessity for decision-makers. Such an understanding can be based on appropriate systems of science and technology indicators and on tools for measurement of research performance [82–87]. Because of this, science and technology indicators are increasingly used (and misused) in public debates on science policy at all levels of government [88–96].

3 Remarks on Dissipativity and the Structure of Science Systems

The following point of view exists about the evolution of open systems in thermodynamics [97, 98]:

The evolution of an open thermodynamic system is a sequence of transitions between states with decreasing entropy (increasing level of organization) with an initial state sufficiently far from equilibrium. If the parameters of such systems change and the changes are large enough, the system becomes unstable, and there exists the possibility that some fluctuation of the parameters may push the system to a new state with smaller entropy. Thus the transition takes place through an instability.

This type of development may be observed in scientific systems too. This is not a surprise, since scientific systems are open (they interact with a complex natural and social environment), and they are able to self-organize [99]. In addition, crises exist in these systems, and often these crises are solved by the growth of an appropriate fluctuation that pushes the scientific system to a new state (which can be more or less organized than the state before the crisis). Hence instabilities are important for the evolution of science, and it is extremely important to study the instabilities of scientific (and social) systems [100–102]. The time of instability (crisis) is a critical time, and the regime of instability is a critical regime. The exit from this time and this regime may lead to a new, more organized, and more efficient state of the system or may lead to degradation and even to destruction of the system.

3.1 Financial, Material, and Human Resource Flows Keep Science in an Organized State

Dissipative structures: In order to keep a system far from equilibrium, flows of energy, matter, and information have to be directed toward the system. These flows ensure the possibility for self-organization, i.e., the sequence of transitions toward states of smaller entropy (and larger organization). The corresponding structures are called dissipative structures, and they can exist only if they interact intensively with the environment. If this interaction stops and the above-mentioned flows cease to exist, then the dissipative structures cannot exist, and the system will end at a state of thermodynamic equilibrium where the entropy is at a maximum and organization is at a minimum.

Science structures are dissipative. In order to exist, they need inflows of information (since scientific information becomes outdated relatively fast), people (since the scientists retire or leave and have to be replaced), money (needed for paying scientists, for building and supporting the scientific infrastructure), materials (for running experiments, machines, etc.), etc. The weak point of the dissipative structures is that they can be degraded or even destroyed by decreasing their supporting flows [103]. In science, this type of development to retrograde states may be observed when the flows of financial and material support decrease and flows of information decrease or cease.

3.2 Levels, Characteristic Features, and Evolution of Scientific Structures

Researchers act in two directions: (i) they produce new knowledge and information [104, 105] and decrease the disorder as current knowledge becomes better organized; (ii) the work of researchers leads to new problems and the possibility for new research directions and thus opens the way to new states with an even higher level of organization. By means of these actions, researchers influence the structure of science. There exist three levels and four characteristic features of the scientific structure [106]. The three levels are:

  1. Level of material structure: Here are the scientific institutes, material conditions for scientific work, etc.

  2. Level of social structure: This includes the scientists and other personnel as well as the different kinds of social networks connected to scientific organizations.

  3. Level of intellectual structure: This includes the structures connected to scientific knowledge and the field of scientific research. There are differences in the intellectual structures connected to the social sciences in comparison to the intellectual structures connected to the natural sciences.

The four characteristic features of the scientific structure are:

  1. Dependence on material, financial, and information flows. These flows are directed mainly to the material level of the scientific structure. They include the flows of money and materials that are needed for the scientific work. But there are also flows to other levels of the scientific structure. An important type of such flows is motivation flows. For example, there exist (i) psychological motivation flow: connected to the social level of the scientific structure. This motivation flow is needed to support each scientist to be an active member of scientific networks and to be an expert in the area of his or her scientific work; (ii) intellectual motivation flow: connected to the intellectual level of the scientific structure. This flow supports scientists to learn constantly and to absorb the newest scientific information from their research area.

  2. Cyclical behavior of scientific productivity. At the beginning of research in a new scientific area, there are many problems to be solved, and scientists deal with them (highly motivated, for example, by the intellectual motivation flow and possibly by material flows that the corresponding wise national government assigns to support the research in this area). In the course of time, the simple scientific problems are solved, and what remains are more complex unsolved problems. The corresponding scientific production (the number of publications, for example) usually decreases. Some scientists change their field of research, and then a new scientific area or subarea may arise in this new field of research.

  3. Homeostatic feature. Homeostasis is the property of a system to regulate its variables in such a way that internal conditions remain stable and relatively constant. This feature of science is supported by the system of education, the set of traditions and institutional norms, the books and other material and intellectual tools that ensure the translation of knowledge from one generation of scientists to the next, etc. All this contributes to the stable functioning of scientific systems and helps them to overcome unfavorable environmental conditions.

  4. Limiting factors. Limiting factors can be (i) material factors that decrease the intensity of work of the scientific organizations (such as decreased funding, for example); (ii) factors connected to decreasing the speed of the process of exchange of scientific information (closing access to an important electronic scientific journal, for example); (iii) factors that decrease the speed of obtaining new scientific results (for example, the constant pressure to increase the paperwork of scientists).

Scientific structures evolve. This evolution is connected to the evolution of scientific research [107–109]. Usually, the evolution of scientific structures has four stages: normal stage, network stage, cluster stage, specialty stage. Institutional forms of research evolve, for example, as follows. At the normal stage, these forms are informal; then small symposiums arise at the network stage. At the cluster stage, the symposiums evolve to formal meetings and congresses, and at the specialty stage, one observes institutionalization (research groups and departments at research institutes and universities). Cognitive content evolves too. At the normal stage, a paradigm is formulated. At the network stage, this paradigm is applied, and in the cluster stage, deviations from the paradigm (anomalies) are discovered. Then at the specialty stage, one observes exhaustion of the paradigm, and the cycle begins again by formulation of a new paradigm.

Now let us consider a more global point of view on research systems and structures and let us discuss briefly two additional aspects connected to these systems:

  • The place of research in the economic subsystem of society from the point of view of the Triple Helix model of the knowledge-based economy;

  • Relations among different national research systems: we discuss the competition among these systems from the point of view of the concept of the academic diamond.

4 Triple Helix Model of the Knowledge-Based Economy

Research priorities should be selected by taking into account primarily the requirements of the national economics and society, traditions and results previously attained, possible present and future human and financial potential, international relationships, trends in the world’s economic and social growth, and trends of science.

Peter Vinkler

The Triple Helix model of the knowledge-based economy defines the main institutions in this economy as university (academia), industry, and government [110–119]. The Triple Helix has the following basic features:

  1. A more prominent role for the university (and research institutes) in innovation, where the other main actors are industry and government.

  2. Movement toward collaborative relationships among the three major institutional spheres, in which innovation policy should be increasingly an outcome of interaction rather than a prescription from government.

  3. Any of the three spheres may take the role of the other, thus performing new roles in addition to their traditional function. This taking of nontraditional roles is viewed as a major source of innovation.

Organized knowledge production adds a new coordination mechanism in social systems (knowledge production and control) in addition to the two classical coordination mechanisms (economic exchanges and political control). In the Triple Helix model, the economic system, the political system, and the academic system are considered relatively autonomous subsystems of society that operate with different mechanisms. Despite their autonomy, however, these subsystems are interconnected and interdependent. Amended versions of the Triple Helix model have been proposed, and there even exist models of the helix with more than three branches [120].

The Triple Helix model allows for the evolution of the branches of the helix. At the beginning of operation of the Triple Helix:

  1. Industry operates as a concentration point of production.

  2. Government operates as the source of contractual relations and has to be a guarantor for stable interactions and exchange.

  3. The academy operates as a source of new knowledge and technology, thus generating the base for establishing a knowledge-based economy.

With increasing time, the place of academia (universities and research institutes) in the helix changes. Initially, the academy is a source of human resources and knowledge, and the connection between academia and industry is relatively weak. Then academia develops organizational capabilities to transfer technologies, and instead of serving only as a source of new ideas for existing firms, academia becomes a source of new firm formation in the area of cutting-edge technologies and in advanced areas of science. Academia becomes a source of regional economic development, and this leads to the establishment of new mechanisms of economic activity and community formation (such as business incubators, science parks, and different kinds of networks between academia and industry). Government supports all this by its traditional regulatory role in setting the rules of the game and also by actions as a public entrepreneur.

The Triple Helix model is a useful model that helps researchers, managers, and others to picture the place of research structures in the complex structure of the modern economy and society. Let us mention that the Triple Helix can be modeled on the basis of the evolutionary “lock-in” model of innovations [121] connected to the efforts of adoption of competing technologies [122, 123]. In addition, various concepts from time series analysis, such as the concept of mutual information [119], can be used to study the Triple Helix dynamics.
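To make the notion of mutual information among three spheres more concrete, the following is a minimal sketch, not the method of the cited works. It assumes one widely used three-dimensional generalization of mutual information, T = H(u) + H(i) + H(g) - H(u,i) - H(u,g) - H(i,g) + H(u,i,g), where H denotes Shannon entropy in bits. The function names and the toy records (each record marks whether university, industry, and government take part in some activity) are hypothetical and serve only as an illustration.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as a Counter of counts."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def triple_mutual_information(records):
    """Three-dimensional mutual information of (u, i, g) records.

    Assumed sign convention: H_u + H_i + H_g - H_ui - H_ug - H_ig + H_uig.
    """
    records = list(records)

    def h(*dims):
        # Entropy of the marginal distribution over the selected dimensions.
        return entropy(Counter(tuple(r[d] for d in dims) for r in records))

    return h(0) + h(1) + h(2) - h(0, 1) - h(0, 2) - h(1, 2) + h(0, 1, 2)


# Hypothetical toy data: 1 = the sphere participates, 0 = it does not.
fully_coupled = [(0, 0, 0), (1, 1, 1)]            # the three spheres always move together
independent = [(u, i, g) for u in (0, 1) for i in (0, 1) for g in (0, 1)]

t_coupled = triple_mutual_information(fully_coupled)
t_independent = triple_mutual_information(independent)
```

Under this convention, the quantity can be positive, zero, or negative, so its sign as well as its magnitude would have to be interpreted with care when it is tracked over time.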

5 Scientific Competition Among Nations: The Academic Diamond

It is not enough to do your best. You must know what to do and then do your best

W. Edwards Deming

Globalization creates markets of huge size, and every nation wants to be well represented in these markets with respect to exports of goods, etc. This can happen if a nation has competitive advantages. One important such advantage is the existence of effective national research and development (R&D) systems. Let us note that the scientific production by researchers, research groups, and countries is an object of absolute competition regardless of possible poor equipment, low salaries, or lack of grants for some of the participants in this competition. From this point of view, the evaluation of scientific results may be regarded as unfair if one compares scientists from different nations [4]. Poor working conditions for scientists are clearly a competitive disadvantage to the corresponding nation. In order to export high-tech production, the scientific and technological system of a nation has to work smoothly and be effective enough. A nation that has such a system and uses it effectively for cooperation [124, 125] and competition has a competitive advantage in the global markets. And in order to have such a system, a country should invest wisely in the development of its scientific system and in the processes of strengthening the connection between the national scientific, technological, and business systems and structures [126–130]. In particular, the four parts of the so-called academic diamond [131] should be cultivated.

Each of the four parts of the academic diamond is connected to the other three parts. The parts are:

  1. Factor conditions: human resources (quantity of researchers, skills levels [132], etc.), knowledge resources (government research institutes, universities, private research facilities, scientific literature, etc.), physical and basic resources (land, water and mineral resources, climatic conditions, location of the country, proximity to other countries with similar research profiles, size of country, etc.), capital resources (government funding of scientific structures and systems, cost of capital available to finance academia, private funding for research projects, etc.), infrastructure (quality of life, attractiveness of country for skilled scientists, telecommunication systems, etc.).

  2. Strategy, structure, and rivalry: goals and strategies of the research organizations (research profile, positioning and key faculties or research areas, internationalization path in terms of staff, campuses, and student body, etc.), local rules and incentives (salaries, promotion system, incentives for publication, etc.), local competition (number of research universities, research institutes, research centers, existing research clusters, territorial dynamics of scientific organizations, etc.).

  3. Demand conditions: public and private sectors (demand for training and job positions for researchers, etc.), student population (trained students), other academics in country and abroad (active research scientists outside the government research institutes and universities).

  4. Related and supporting industries: publication industry, information technology industry, other research institutions.

In addition, the academic diamond has two more components: chance and government. There are different aspects of chance connected to the research organizations. If we consider chance as the possibility for something to happen, then some countries have elites that ensure a good chance with respect to the positive development of science and technology. Government may contribute to the development of scientific and technological systems of a country. This contribution can be made through appropriate policies with respect to (higher) education; government research institutes; basic research [133, 134]; funding of research and development; economic development; etc.

6 Assessment of Research: The Role of Research Publications

Research is an important process in complex scientific systems. Research production is a result of this process that can be assessed. Quantitative assessment of research (at least of publicly funded basic research) has increased greatly in the last decade [135–138]. Some important reasons for this are economic and societal [134]: constraints on public expenditures, including the field of research and development; growing costs of instrumentation and infrastructure; requirements for greater public accountability; etc. Another reason is connected to the development of information technologies, bibliometrics, and scientometrics in the last fifty years. Several goals of quantitative assessment of research are [4]: to obtain information for granting research projects; to determine the quantity and impact of information production for monitoring research activities; to analyze national or international standing of research organizations and countries’ organizations for scientific policy; to obtain information for personnel decisions; etc.

In addition to the rise of quantitative assessment of research, one observes a process of the increasing use of mathematics in different areas of knowledge [139]. This process also concerns the field of knowledge about science. In the process of human evolution, more and more scientific facts have been accumulated, and these facts have been ordered by means of different methods that include also methods of mathematics. In addition, the use of mathematics (which means also the use of mathematical methods beyond the simplest statistical methods) is important and much needed for supporting decisions in the area of research policy.

Many mathematical methods in the area of assessment of research focus on the study of research publications and their citations. This is because publications are an important form of the final results of research work [140–142]. There is a positive correlation between the number of research publications and the importance that society attaches to the scientific achievements of the corresponding researcher. There exists also a positive correlation between the number of a researcher’s publications and the expert evaluation of his/her scientific work [143]. Senter [144] mentions five factors that may positively influence the research productivity of a researcher:

  1. Education level: has important positive impact on productivity;

  2. Rank of the scientist: has immediate positive impact on scientific productivity;

  3. Years in service: positive impact on productivity but more modest in comparison to the impact of education and rank;

  4. Influence of the scientist on his/her research endeavor: positive impact but modest in comparison with the above three factors;

  5. Psychological factors: usually they have small effect on productivity (if the problems that influence the psychological condition of the researcher are not too big).

In recent years, the requirements on the quality of research have increased. Because of this, we shall discuss briefly below several characteristics of quality, performance, quality management systems, and performance management systems, since they are important for the assessment of the quality of the results of basic and applied research [145–148].

7 Quality and Performance: Processes and Process Indicators

Scientific research and its product, scientific information, are multidimensional, and because of this, the evaluation of scientific research must also be multidimensional and based on quantitative indexes and indicators accompanied by qualitative tools of analysis. One important characteristic of research activity is its quality, because the performance of any organization is connected to the quality of its products [149–153]. A simple definition of quality is this: Quality is the ability to fulfill a set of requirements with concrete and measurable actions. The set of requirements can include social requirements, economic requirements, productive requirements, and specific scientific requirements. The set of requirements depends on the stakeholders’ needs and on the needs of producers. These needs should be fulfilled effectively, and an important tool for achieving this is a quality management system. In order to manage quality, one introduces different quality management systems (QMS), which are sets of tools for guiding and controlling an organization with respect to quality aspects of human resources; working procedures, methodologies and practices; technology and know-how.

Research production is organized as a set of processes. A simple definition of a process is as follows: A process is an integrated system of activities that uses resources to transform inputs into outputs [149]. We can observe and assess processes by means of appropriate indicators. An indicator is the quantitative and/or qualitative information on an examined phenomenon (or process or result), which makes it possible to analyze its evolution and to check whether (quality) targets are met, driving actions and decisions [154]. Let us note that we do not need simply to use some indicators. We have to identify the indicators that properly reflect the observed process. These indicators are called key performance indicators.

The main functions of indicators are as follows.

  1. Communication. Indicators communicate performance to the internal leadership of the organization and to external stakeholders.

  2. Control. Indicators help the leadership of an organization to evaluate and control performance of the corresponding resources.

  3. Improvement. Indicators show ways for improvement by identifying gaps between performance and expectations.

Indicators supply us with information about the state, development, and performance of research organizations. Performance measurements are important for taking decisions about development of research organizations [155]. In general, performance measurements supply information about meeting the goals of an organization and about the state of the processes in the organization (for example, whether the processes are in control or there are some problems in their functioning). In more detail, performance measurement supplies information about the effectiveness of the processes (the degree to which the process output conforms to the requirements) and about the efficiency of the processes (the degree to which the process produces the required output at minimal resource cost). Finally, the performance measurements supply information about the need for process improvement.
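The distinction between effectiveness and efficiency can be illustrated by a small numerical sketch. The ratios below are only one possible operationalization and are assumptions made for illustration: effectiveness is taken as the share of outputs that conform to the requirements, and efficiency as the ratio of a benchmark (minimal) resource cost to the resources actually used. The class name, field names, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProcessRecord:
    """Hypothetical summary of one research process over some period."""
    outputs_total: int        # all outputs produced (e.g., reports delivered)
    outputs_conforming: int   # outputs that meet the stated requirements
    resources_used: float     # resources actually consumed (e.g., person-months)
    resources_minimal: float  # assumed benchmark: minimal resources for the same output


def effectiveness(p: ProcessRecord) -> float:
    """Assumed measure: fraction of outputs that conform to the requirements."""
    if p.outputs_total <= 0:
        raise ValueError("outputs_total must be positive")
    return p.outputs_conforming / p.outputs_total


def efficiency(p: ProcessRecord) -> float:
    """Assumed measure: benchmark minimal cost divided by the actual cost."""
    if p.resources_used <= 0:
        raise ValueError("resources_used must be positive")
    return p.resources_minimal / p.resources_used


# Hypothetical example: 15 of 20 outputs conform; 120 person-months were used
# where 90 would have sufficed according to the assumed benchmark.
record = ProcessRecord(outputs_total=20, outputs_conforming=15,
                       resources_used=120.0, resources_minimal=90.0)
eff = effectiveness(record)
effi = efficiency(record)
```

A process can score high on one measure and low on the other, which is why the two are reported separately rather than merged into a single number.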

8 Latent Variables, Measurement Scales, and Kinds of Measurements

Latent features of the studied objects and subjects are often the features we want to measure. One such feature is the scientific productivity of a researcher [156, 157]. Latent features are characterized by latent variables. Latent variables may reflect real characteristics of the studied objects or subjects, but a latent variable is not directly measurable. The indicators are what we measure in practice, e.g., the number of publications or the number of citations. Many latent variables can be operationally defined by sets of indicators. In the simplest case, a latent variable is represented by a single indicator. For example, the production of a researcher may be represented by the number of his/her publications. If we want a more complete characterization of the latent variables, we may have to use more than one indicator for their representation, i.e., one should avoid (if possible) reducing the representation of a latent variable to a single indicator. Instead, a set of at least two indicators should be used.

A measurement means that certain items are compared with respect to some of their features. There are four scales of measurement:

  1.

    Nominal scale: Differentiates between items or subjects based only on their names or other qualitative classifications they belong to. Examples are language, gender, nationality, ethnicity, form. A quantity connected to the nominal scale is mode: this is the most common item, and it is considered a measure of central tendency.

  2.

    Ordinal scale: Here not only are the items and subjects distinguished, but they are also ordered (ranked) with respect to the measured feature. Two quantities connected to this scale are the mode and the median. The median is the middle-ranked item or subject, and it is an additional measure of central tendency.

  3.

    Interval scale: For this scale, distinguishing and ranking are available too. In addition, a degree of difference between items is introduced by assigning a number to the measured feature. This number has a precision within some interval. An example for such a scale is the Celsius temperature scale. The quantities connected with the interval scale are mode, median, arithmetic mean, and range (the difference between the largest and smallest values in the set of measured data). Range is a measure of dispersion. An additional quantity connected to this kind of scale is standard deviation: a measure of the dispersion from the (arithmetic) mean.

  4.

    Ratio scale: Here, in addition to distinguishing, ordering, and assigning a number (with some precision) to the measured feature, there is also estimation of the ratio between the magnitude of a continuous quantity and a unit magnitude of the same kind. An example of ratio scale measurement is the measurement of mass. If a body’s mass is 10 kg and the mass of another body is 20 kg, one can say that the second body is twice as heavy. If the temperature of a body is 20 \(^{\circ }\)C and the temperature of another body is 40 \(^{\circ }\)C, one cannot say that the second body is twice as warm, because the measurement of temperature in degrees Celsius is a measurement by interval scale and not by ratio scale. The measurement of temperature by a ratio scale is the measurement in kelvins.

    In addition to all quantities connected to the interval scale of measurement, for the ratio scale of measurement one has the following quantities: geometric mean, harmonic mean, coefficient of variation, etc.
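
The correspondence between the four scales and the quantities that may be computed on them can be illustrated by a minimal Python sketch. All data below are hypothetical and chosen only for illustration, and the variable names are assumptions of this sketch. The last lines repeat the temperature example: the ratio of two Celsius readings is not meaningful, whereas the ratio of the same temperatures in kelvins is.

```python
import statistics

# Hypothetical data, invented only to illustrate the four scales.
nationality = ["BG", "DE", "BG", "FR", "BG"]   # nominal scale
rank = [1, 2, 2, 3, 5]                          # ordinal scale (ranks)
temp_celsius = [10.0, 20.0, 40.0]               # interval scale
mass_kg = [10.0, 20.0, 40.0]                    # ratio scale

# Nominal scale: only the mode is meaningful.
nominal_mode = statistics.mode(nationality)

# Ordinal scale: mode and median.
ordinal_median = statistics.median(rank)

# Interval scale: additionally arithmetic mean, range, standard deviation.
interval_mean = statistics.mean(temp_celsius)
interval_range = max(temp_celsius) - min(temp_celsius)
interval_stdev = statistics.pstdev(temp_celsius)

# Ratio scale: additionally geometric mean, harmonic mean,
# and coefficient of variation (standard deviation divided by mean).
ratio_gmean = statistics.geometric_mean(mass_kg)
ratio_hmean = statistics.harmonic_mean(mass_kg)
ratio_cv = statistics.pstdev(mass_kg) / statistics.mean(mass_kg)

# 40 C / 20 C = 2 suggests "twice as warm", but on the kelvin (ratio)
# scale the same two temperatures differ by a factor of only about 1.07.
celsius_ratio = 40.0 / 20.0
kelvin_ratio = (40.0 + 273.15) / (20.0 + 273.15)

print(nominal_mode, ordinal_median, interval_range, ratio_gmean)
print(celsius_ratio, kelvin_ratio)
```

The sketch uses the population standard deviation (`pstdev`); the sample standard deviation (`stdev`) would serve equally well for the illustration.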

With respect to the four scales, there are the following two kinds of measurements:

  1.

    Qualitative measurements: measurements on the basis of nominal or ordinal scales.

  2.

    Quantitative measurements: measurements on the basis of interval or ratio scales.

Before the start of a measurement, a researcher has to perform:

  1.

    qualitative analysis of the measured class of items or subjects in order to select features that are appropriate for measurement from the point of view of the solved problems;

  2.

    choice of the methodology of measurement.

After the measurements are made, it is again time for qualitative analysis of the adequacy of the results to the goals of the study: some measurement can be adequate for one problem, and other measurements can be adequate for another problem. The adequacy depends on the choice of the features that will be measured.

Fig. 1.1 Gaussian distributions are much used for description of natural systems and structures. Many distributions used for describing social systems and structures are non-Gaussian.

9 Notes on Differences in Statistical Characteristics of Processes in Nature and Society

Let us assume that measurements have led us to some data about a research organization of interest. Research systems are also social systems, and because of this, we have to know some specific features of these systems and especially the characteristics connected to the possible non-Gaussianity of the system.

A large number of processes in nature and society are random. These processes have to be described by random variables. If x is a random variable, it is characterized by a probability distribution that gives the probability of occurrence of each value of x. Probability distributions are characterized by a probability distribution function \(P(x \le X)\) or a probability density function \(p(x)=dP/dx\).

If we want to study the statistical characteristics of some population of items, we study statistical characteristics of samples of the population. We have to be sure that if the sample size is large enough, then the results will be close to the results that would be obtained by studying the entire population.

For the case of a normal (Gaussian) distribution, the central limit theorem guarantees this convergence. For the case of non-Gaussian distributions, however, there is no such guarantee.

Let us discuss this in detail. We begin with the central limit theorem. The central limit theorem of mathematical statistics is the cornerstone of the part of the world described by Gaussian distributions. It is connected to the moments of a probability distribution p(x) with respect to some value X:

$$\begin{aligned} M^{(n)} = \int dx \ (x-X)^n p(x). \end{aligned}$$
(1.1)

The following two moments are of interest for us here:

  1.

    The first moment (\(n=1\)) with respect to \(X=0\): this is the mean value \(\overline{x}\) of the random variable;

  2.

    The second moment (\(n=2\)) with respect to the mean (\(X=\overline{x}\)): dispersion of the random variable (denoted also by \(\sigma ^2\)).
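
For a finite sample, the integral in (1.1) can be replaced by an average over the observed values. The following minimal Python sketch computes the two moments listed above. The function name `moment` and the data are assumptions made for illustration only.

```python
def moment(sample, n, about=0.0):
    """Sample analogue of Eq. (1.1): the average of (x - about)**n."""
    return sum((x - about) ** n for x in sample) / len(sample)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]            # hypothetical observations

mean = moment(sample, 1)                       # first moment about X = 0
dispersion = moment(sample, 2, about=mean)     # second moment about the mean

print(mean, dispersion)                        # 3.0 2.0
```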

The central limit theorem answers the following question. We have a population of items or subjects characterized by the random variable x. We construct samples from this population and calculate the mean \(\overline{x}\). If we take a large enough number of samples, then what will be the distribution of the mean values of those samples?

The central limit theorem states that if the probability density function p(x) has a finite mean and a finite dispersion, then the distribution of the sample means converges to the Gaussian distribution as the sample size increases. The distributions that have this property are called Gaussian.

But what will be the situation if a distribution does not have the Gaussian property (for example, the second moment of this distribution is infinite)? Such distributions exist [158–160]. They are called non-Gaussian distributions, and some of them play an important role in mathematical models of social systems, and in particular in the models connected to science dynamics. There exists a theorem (called the Gnedenko–Doeblin theorem) that states the central role of one distribution in the world of non-Gaussian distributions. This distribution is called the Zipf distribution. Non-Gaussian distributions (and the Zipf distribution) will be discussed in Part III of this book.

Most distributions that arise in the natural sciences are Gaussian. Many distributions that arise in the social sciences are non-Gaussian (Fig. 1.1). Such distributions arise very often in the models of science dynamics [161, 162]. We do not claim that only Gaussian distributions are observed in the natural sciences and that the distributions that are observed in the social sciences are all non-Gaussian. Non-Gaussian distributions arise frequently in the natural sciences, and Gaussian distributions exist also in the social sciences. The point is that the majority of continuous distributions observed in the natural sciences are Gaussian, and many distributions observed in the social sciences are non-Gaussian [163].

Many distributions in the social sciences are non-Gaussian. Several important consequences of this are as follows.

  1.

    Heavy tails. The tails of non-Gaussian distributions are heavier than the tails of Gaussian distributions. Thus the probability of extreme events becomes larger, and the moments of the distribution may depend considerably on the size of the sample. Then the conventional statistics based on Gaussian distributions may not be applicable.

  2.

    The limit distribution of the sample means for large values of the mean is proportional (up to a slowly varying term) to the Zipf distribution (and not to the Gaussian distribution). This is the statement of the Gnedenko–Doeblin theorem.

  3.

    In many natural systems, the distribution of the values of some quantity is sharply concentrated around its mean value. Thus one can perform the transition from a probabilistic description to a deterministic description. This is not the case for non-Gaussian distributions. There is no such concentration around the mean, and because of this, a probabilistic description is appropriate for all problems of the social sciences in which non-Gaussian distributions appear.
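
The first consequence can be made concrete by a small numerical comparison. The sketch below is an illustration under stated assumptions, not a model taken from the text: the non-Gaussian distribution is assumed to be a power law with density \(p(x)=\alpha x^{-\alpha -1}\) for \(x \ge 1\), with the exponent \(\alpha =1.5\) chosen arbitrarily so that the second moment is infinite. The code compares tail probabilities with those of a standard Gaussian distribution and shows that the second moment (about \(X=0\)), computed only up to a cutoff, keeps growing as the cutoff grows. This mimics the dependence of the moments on the sample size.

```python
import math


def gaussian_tail(x):
    """P(X > x) for a standard Gaussian variable (mean 0, dispersion 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def power_law_tail(x, alpha):
    """P(X > x) for the assumed power law p(x) = alpha * x**(-alpha - 1), x >= 1."""
    return x ** (-alpha)


def truncated_second_moment(cutoff, alpha):
    """Integral of x**2 * p(x) from 1 to cutoff, valid for alpha < 2."""
    return alpha / (2.0 - alpha) * (cutoff ** (2.0 - alpha) - 1.0)


alpha = 1.5  # assumed exponent; for alpha < 2 the second moment is infinite

# Extreme values are far more probable under the power law.
for x in (2.0, 5.0, 10.0):
    print(x, gaussian_tail(x), power_law_tail(x, alpha))

# The truncated second moment does not settle down as the cutoff grows.
for cutoff in (1e2, 1e4, 1e6):
    print(cutoff, truncated_second_moment(cutoff, alpha))
```

For a Gaussian distribution the analogous truncated moment would converge quickly to a finite value, which is why estimates of the dispersion stabilize as the sample grows.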

There exist differences between the objects and processes studied in the natural and social sciences. Several of these differences are as follows.

  1.

    The number of factors. The objects and processes studied in the social sciences usually depend on many more factors than the objects and processes studied in the natural sciences. Let us connect this to the non-Gaussian distributions in the social sciences [164]. Let y be a variable that characterizes the influences on the studied object. Let n(y)dy be the number of influences in the interval \((y, y+dy)\). Then n(y) is the distribution of the influences. In order to define (a discrete) factor, we separate the area of values of y into subareas each of width \(\varDelta y\). Then if the area of values of y has length L, the number of factors will be \(L/\varDelta y\). Thus n(y) has now the meaning of a distribution of factors. This distribution is Gaussian in most cases in the natural sciences and non-Gaussian in many cases of the social sciences. As we have mentioned above, the non-Gaussian distributions are not very concentrated around the mean value as compared to the Gaussian distributions. In other words, many more factors have to be taken into account when one analyzes items or subjects that are described by non-Gaussian distributions. Thus the analysis of many kinds of social objects or processes must be a multifactor analysis.

  2.

    Dominance of parameters. In the case of systems from the natural sciences, usually there are several dominant latent parameters. In the case of social systems, usually there is no dominant latent parameter. The links among parameters are weak, and in addition, many latent parameters can be important.

  3.

    Subjectivity of the results of measurements. The measurements in the study of social problems must be made very carefully. The main reasons for this are as follows: the measured system often cannot be reproduced; the researchers can easily influence the measurement process; the measurement can be very complicated.

  4.

    Mathematics should be applied with care. The quantities that obey the laws of arithmetic are additive. There are two kinds of measurement scales that are used in the social sciences, and only one of them leads to additive quantities in most cases (i.e., to quantities that can be successfully studied by mathematical methods): closed measurement scales and open measurement scales. The closed measurement scales have a maximum upper value. Such a scale is, for example, the scale of school-children’s grades. Closed scales may lead to nonadditive quantities. The open measurement scales do not have a maximum upper value. Open scales lead in most cases to additive quantities. The measurement scales in the natural sciences are mostly open scales. Thus mathematical methods are generally applicable there. Open scales must be used also in the social sciences if one wants to apply mathematical methods of analysis successfully. The application of mathematical methods (developed for analysis of additive quantities) to nonadditive quantities may be useless. One can also use closed measurement scales, of course. The results of these measurements, however, have to be analyzed mostly qualitatively.

10 Several Notes on Scientometrics, Bibliometrics, Webometrics, and Informetrics

The term scientometrics was introduced in [44]. Scientometrics was defined in [44] as the application of those quantitative methods which are dealing with the analysis of science viewed as an information process. Thus fifty years ago, scientometrics was restricted to the measurement of science communication. Today, the area of research of scientometrics has increased. This can be seen from a more recent definition of scientometrics:

Scientometrics is the study of science, technology, and innovation from a quantitative perspective [165–170].

In more detail, by means of scientometrics one analyzes the quantitative aspects of the generation, propagation, and utilization of scientific information in order to contribute to a better understanding of the mechanism of scientific research activities [171]. The research fields of scientometrics include, for example, production of indicators for support of policy and management of research structures and systems [172–177]; measurement of impact of sets of articles, journals, and institutes as well as understanding scientific citations [178–189]; mapping scientific fields [190–192]. Scientometrics is closely connected to bibliometrics [193–201] and webometrics [202–210]. The term bibliometrics was introduced in 1969 (in the same year as the definition of scientometrics in [44]) as application of mathematical and statistical methods to books and other media of communication [211]. Thus fifty years ago, bibliometrics was used to study general information processes, whereas (as noted above) scientometrics was restricted to the measurement of scientific communication. Bibliometrics has received much attention [212–215], e.g., in the area of evaluation of research programs [216] and in the area of analysis of industrial research performance [217]. Today, the border between scientometrics and bibliometrics has almost vanished, and the terms scientometrics and bibliometrics are used almost synonymously [218]. The rapid development of information technologies and global computer networks has led to the birth of webometrics. Webometrics is defined as the study of the quantitative aspects of the construction and use of information resources, structures, and technologies on the Web, drawing on bibliometric and informetric approaches [209, 210]. Informetrics is a term for a more general subfield of information science dealing with mathematical and statistical analysis of communication processes in science [219, 220]. Informetrics may be considered an extension of bibliometrics, since informetrics deals also with electronic media and because of this, includes, e.g., the statistical analysis of text and hypertext systems, models for production of information, information measures in electronic libraries, and processes and quantitative aspects of information retrieval [221, 222].

Many researchers have made significant contributions to scientometrics, bibliometrics, and informetrics. We shall mention several names in the following chapters. Let us mention here the name of Eugene Garfield, who started the Science Citation Index (SCI) in 1964 at the Institute for Scientific Information in the USA. SCI was important for the development of bibliometrics and scientometrics and was a response to the information crisis in the sciences after World War II (when the quantity of research results increased rapidly, and it became difficult for scientists to play their main social role, i.e., to produce new knowledge). SCI used experience from earlier databases (such as Shepard’s citations [223, 224]). In 1956, Garfield founded the company Eugene Garfield Associates and began publication of Current Contents, a weekly containing bibliographic information from the area of pharmaceutics and biomedicine (the number of covered areas increased very rapidly). In 1960, Garfield changed the name of the company to Institute for Scientific Information. Let us note that the success of Current Contents was connected to the use of Bradford’s law for “scattering” of research publications around research journals (Bradford’s law will be discussed in Chap. 4 of the book) [225]. According to Bradford’s law, the set of publications from some research area can be roughly separated into three subsets: a small subset of core journals, a larger subset of journals connected to the research area, and a large set of journals in which papers from the research area could occur. Bradford’s law was used in the selection of journals contributing to the multidisciplinary index SCI. In the following years, the SCI and ISI became the world leaders in the area of scientific information. This position remained unchallenged for almost fifty years, even after the rise of the Internet.

Below we consider three topics from the area of scientometrics that are of interest for our discussion. These topics are:

  1.

    Quantities that may be analyzed in the process of study of research dynamics;

  2.

    Inequality of scientific achievements;

  3.

    Knowledge landscapes.

10.1 Examples of Quantities that May Be Analyzed in the Process of the Study of Research Dynamics

Below we present a short list of some quantities, kinds of time series, and other units of data that may be used in the process of assessment of research and research organizations. The list is as follows.

  1.

    Time series for the number of published papers in groups of journals (for example in national journals).

  2.

    Time series for the total number and for the percentage of coauthored papers [226]. Coauthorship is an important phenomenon, since the development of modern science is connected to a steady increase in the number of coauthors, especially in the experimental branches of science. Coauthorship contributes to the increase of the length of an author’s publication list, and this length is important for the quality of research [227], for a scientific career, and for the process of approval of research projects.

    The percentage of coauthored publications varies in the different sciences. In the social sciences, it is very low, and in the natural sciences it can reach 90 % and even more. There are interesting notes of Price and Rousseau with respect to coauthorship [228, 229]. Price notes that important factors for the growth of coauthorship of publications are (i) the expansion of the material base of scientific research, e.g., new equipment stimulates coauthorship; (ii) in times of expansion, the number of very good scientists increases at a slower rate than the number of scientists. In such conditions, the most productive authors increase their productivity further by becoming leaders of scientific collectives. In these collectives, scientists can be found who want to have publications but are unable to publish alone (because they are inexperienced PhD students, for example). Let us note here that in recent years, one observes frequently the phenomenon of hyperauthorship (a very large number of coauthors of a publication) [230].

  3.

    Network analysis of coauthorship groups [231–240] and especially detection of dense and very productive coauthorship networks, the “invisible colleges” [241–246]. An invisible college has a core and a periphery. The core usually consists of researchers from the same research structure or from a few research structures, e.g., from the same research institute or from several universities where productive groups exist.

  4.

    Cluster analysis of research publications [247, 248].

  5.

    Time series for the number of patents and discoveries. What can be expected in times of fast growth of the number of scientific discoveries is that their period of doubling is about ten years [249].

  6.

    Distribution of publications among research organizations [250].

  7.

    Distribution of patents and discoveries among the countries from a group of countries (for example, EU countries or the entire world).

  8.

    Statics and dynamics of landscapes of scientific discoveries and engineering patents for different scientific or engineering fields.

  9.

    Time series for the number of scientists (in a country). When a country’s research structures grow, one may expect doubling of the number of researchers every fifteen years. When the scientific structure becomes mature, the growth slows and may come to a halt.

  10.

    Territorial distribution of scientists—national and international [251, 252]. Distribution of scientists with respect to their qualifications.

  11.

    Dynamics of the age structure of scientists at the national level and comparison of the dynamics among countries from a group of countries.
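
The doubling periods mentioned in the list above (about ten years for discoveries, fifteen years for the number of researchers) can be translated into annual growth rates. The sketch below assumes pure exponential growth, which is a simplification holding only in the phase of fast growth. The function names and the initial number of scientists are hypothetical.

```python
def annual_growth_rate(doubling_years):
    """Annual relative growth implied by a doubling period, assuming exponential growth."""
    return 2.0 ** (1.0 / doubling_years) - 1.0


def projected_count(initial, years, doubling_years):
    """Size after `years` if the quantity doubles every `doubling_years`."""
    return initial * 2.0 ** (years / doubling_years)


discoveries_rate = annual_growth_rate(10.0)   # roughly 7 % per year
scientists_rate = annual_growth_rate(15.0)    # roughly 5 % per year

# Hypothetical country with 1000 scientists and a fifteen-year doubling period:
# after thirty years (two doublings) the number is four times larger.
scientists_after_30 = projected_count(1000.0, 30.0, 15.0)

print(discoveries_rate, scientists_rate, scientists_after_30)
```

Once the scientific structure becomes mature, the exponential assumption no longer holds, and a saturating growth model would be needed instead.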

Other kinds of quantities are connected to another important characteristic of research work: the citations of research publications [253–260]. One may analyze:

  1.

    Time series for citations of individual scientists, scientific groups, or scientific organizations [261–268]. We note that the number of citations depends on the number of researchers who work in the corresponding scientific area [267], and there can be also negative citations of the publications of a researcher. Citation analysis allows us to identify different categories of researchers such as identity-creators and image-makers [269]. The number of citations depends on the rate of aging of research information [270]. This rate of aging may be different for different scientific disciplines.

  2.

    Distribution of journals with respect to the citations of the papers in these journals (the impact factor is one possible indicator that can be constructed on the basis of such studies [271]).

  3.

    Distribution of scientific organizations with respect to the citations of the publications of the organization. One has to be very careful here, since in some areas of science there are many more citations than in other areas of science.

  4.

    Citation networks [272–276]. Usually there are subnetworks of leading scientists in some scientific areas, and every leading scientist cites predominantly the other leading scientists. The nonleading scientists cite the leading scientists much more than other nonleading scientists.

  5.

    Distribution of scientists with respect to the number of citations of their publications. Here different possibilities exist, e.g., the study of the distribution of citations of the most cited papers of scientists from a scientific group or scientific organization or the study of the distribution of the number of citations of the papers that contribute to the h-factors or g-indexes of the researchers from the assessed research groups or research organizations.

  6.

    Distribution of publications of a scientific group or scientific organization with respect to the number of citations they have.

  7.

    Distribution of citations among scientific fields [277, 278].

  8.

    Distribution of the time interval between appearance of a publication and its first citation.

  9.

    Landscapes of citations [279, 280] with respect to scientific discipline; countries; kind of publications; research organizations in a country, etc.

  10.

    Distributions of self-citations for scientific disciplines and in research groups and research organizations.
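
The list above mentions the papers that contribute to the h-factors or g-indexes of researchers. As a minimal sketch, the code below uses the commonly used definitions, which are an assumption here since the text does not state them at this point: the h-index is the largest h such that h papers have at least h citations each, and the g-index is the largest g such that the g most cited papers have together at least g squared citations (with g limited to the number of papers). The citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g (at most the number of papers) such that the g most
    cited papers have together at least g**2 citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


citations = [10, 8, 5, 4, 3]     # hypothetical citation counts of five papers
h = h_index(citations)
g = g_index(citations)

# Papers contributing to the h-index: the h most cited ones.
h_core = sorted(citations, reverse=True)[:h]

print(h, g, h_core)              # 4 5 [10, 8, 5, 4]
```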

In addition, one may analyze other characteristics of science dynamics such as interdisciplinarity of scientific journals on the basis of the betweenness centrality measure used in social networks analysis [281]; aging of scientific literature [282, 283]; dynamics of scientific communication [284], etc.

10.2 Inequality of Scientific Achievements

Different researchers have different scientific achievements. Many factors influence the achievement of individual researchers or groups of researchers. If we consider individual researchers, four main factors may be considered [218]: the subject matter; the author’s age; the author’s social status; the observation period. Experienced researchers usually have larger scientific production and larger scientific achievements in comparison to the newcomers without research experience. Chemists usually have larger research production than mathematicians. An established professor with high social status usually has a much larger network of collaborations and contributes to more scientific achievements in comparison to a young researcher without such social status.

One of the tasks of the assessment of research organizations is the measurement of the inequality of scientific achievements of researchers and research collectives. One may use the notion of Coulter [285] that the distribution of some characteristics of research productivity is the division of the units of these characteristics among the components of the corresponding structure of the research organization. Inequality then may be defined as variation of the above division. Observation of large research groups shows that researchers are distributed usually within three classes with respect to a part of their production (the part that can be measured by the number of authored and coauthored papers in the international databases such as ISI Web of Science or SCOPUS): a small class of very productive scientists; a large class of very unproductive scientists; a large class of scientists who fall between the above two extreme classes.

In the study of inequality in research organizations, one may use not only quantitative methods but also qualitative methods such as expert evaluations [286–288], surveys, and content analysis. We shall discuss numerous indexes of inequality of scientific achievements in the following chapters and especially in Chap. 4. Now we shall focus our attention on the concept of knowledge landscape.

10.3 Knowledge Landscapes

An understanding of the evolution of research organizations requires research complementary to mathematical investigations [289–302]. Very useful tools for such research are knowledge maps [303–325] and knowledge landscapes [326–333]. They may be used for identification of potential collaborators [334]; for creation of document-level maps of research fields [335]; for international comparisons of research systems [336]; for modeling of science [337]; etc.

The concept of knowledge landscape is as follows: Describe the corresponding field of science or technology through a function of parameters such as height, weight, size, technical data, etc. Then a virtual knowledge landscape can be constructed from empirical data in order to visualize and understand innovation and other processes in science and technology.

One example of a technological knowledge landscape can be given by the function \(E=E(S,v)\), where E denotes the expenses for developing a new car, and S and v are the size and velocity of the car. The landscape is constructed by plotting the values of S and v on the horizontal axes and the values of E on the vertical axis.
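
A landscape of this kind can be tabulated on a grid before it is plotted. In the minimal sketch below, the functional form of \(E(S,v)\), the units, and the grid values are all invented for illustration. In practice the heights would come from empirical data and not from a formula.

```python
def development_expense(size, velocity):
    """Hypothetical E(S, v); the functional form is invented for illustration."""
    return 1.0 + 0.5 * size + 0.01 * velocity ** 2


sizes = [3.0, 4.0, 5.0]              # S, e.g. car length in metres (assumed units)
velocities = [100.0, 150.0, 200.0]   # v, e.g. km/h (assumed units)

# landscape[i][j] is the height E above the grid point (sizes[i], velocities[j]).
landscape = [[development_expense(s, v) for v in velocities] for s in sizes]

# Locate the highest point (peak) of the tabulated landscape.
peak_value, peak_size, peak_velocity = max(
    (landscape[i][j], sizes[i], velocities[j])
    for i in range(len(sizes))
    for j in range(len(velocities))
)

for row in landscape:
    print(row)
print("peak:", peak_value, "at S =", peak_size, "and v =", peak_velocity)
```

The nested list can be passed to any surface or contour plotting tool. Peaks and valleys of the resulting surface are what make the landscape useful for visual comparison.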

Knowledge landscapes may be used for evaluation and tracking evolution of research structures and systems [338]. Two selected examples are:

  1.

    Application of knowledge landscapes for evaluating national research strategies. The national research structures can be considered to be groups of researchers who compete for scientific results following optimal research strategies. The efforts of each research structure or the efforts of each country become visible, comparable, and measurable by means of appropriate landscapes connected, for example, to the number of publications. The aggregate research strategies of a country can thereby be represented by the distribution of publications in the various scientific disciplines. In so doing, within a two-dimensional space, i.e., axes being the scientific disciplines and number of publications, different countries occupy different locations. Various political discussions can follow, and evolution strategies invented thereafter. In addition, one can track scientific areas of strategic importance on the basis of journal mappings [339] or can construct global maps of science [340] that can be very useful as elements of national research decision support systems.

  2.

    Landscapes for research evaluation based on scientific citations. Citations are important in the retrieval and evaluation of information in scientific communication systems [341–344]. This is based on the objective nature of the citations as components of a global expert evaluation system, as represented by the Science Citation Index. Thus the importance of the citation landscapes increases in the process of formation of a research policy. One example of this is personnel management decisions, which influence individual research careers or investment strategies.

11 Notes on Research Production and Research Productivity

Researchers produce numerous kinds of items as a result of their work: research publications, technical reports, patents, etc. This research production may be counted, and the corresponding numbers can be divided by units of time (month, year, etc.). The obtained numbers are characteristics of the researcher’s research productivity. The values of the characteristics of the research productivity depend on the considered time interval. If the time interval for calculation of the characteristics of the research productivity is the same as the career length of the researcher, then the characteristics of the research productivity coincide with the characteristics of the research production of the researcher.
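
The relation between research production and research productivity can be sketched in a few lines of Python. The publication counts are hypothetical, and the function name is an assumption of this sketch. The last line reflects one reading of the statement above: when the whole career is taken as the single unit of time, productivity and production coincide.

```python
def productivity(production, n_time_units):
    """Research production divided by the number of time units in the interval."""
    if n_time_units <= 0:
        raise ValueError("the interval must contain a positive number of time units")
    return production / n_time_units


# Hypothetical publication counts for each year of a four-year career.
papers_by_year = {2019: 2, 2020: 5, 2021: 1, 2022: 4}

production = sum(papers_by_year.values())              # 12 papers in total

per_year_whole_career = productivity(production, 4)    # 3.0 papers per year
per_year_last_two = productivity(1 + 4, 2)             # 2.5 papers per year
per_career = productivity(production, 1)               # 12.0 papers per career

print(production, per_year_whole_career, per_year_last_two, per_career)
```

The two per-year values differ, which illustrates how the characteristics of productivity depend on the considered time interval.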

Research structures and their system of functioning are extremely important for research productivity [345–349]. Research organizations provide the material conditions for research work [350], but they also support a research environment that may stimulate or may influence negatively the research work [351]. A part of this environment are the interactions among researchers. These interactions influence the external motivation for research work (external means that the source of motivation is rooted in the actions of other persons). In addition, the above interactions influence the internal motivation for research work (internal means that the source of motivation is rooted in the ideas of the researcher). It is observed that the more productive researchers search for and establish more connections with colleagues. Thus the presence of at least one productive researcher in any of the small units of a research organization (laboratories, sections, or chairs) creates more stimuli for research work.

Research productivity is age-dependent [352–355]. In most cases, the research productivity of an individual researcher decreases as he/she ages and comes close to retirement age. In most cases, research productivity of a research organization decreases with increasing average age of the researchers in the organization (if the average age is large enough). There exist other factors that may affect research productivity [356]. Eleven of these factors [357] are persistence, resource adequacy (adequate funds for research, adequate equipment, etc.), access to literature, professional commitment, intelligence, initiative, creativity, learning capability, concern for advancement, stimulative leadership, and external orientation (adequate contacts with superior researchers and participation in seminars and conferences).

Research production is an important quantity [358] with a complex structure. There exist research collectives, and these research collectives produce publications (but not only publications). Resources are spent for organizing researchers into structures and for supporting a system of functioning for those structures. One result of the actions of researchers is the set of their research publications, which is an important quantitative measure of scientific productivity. If additional analysis of the content of the publications is made, the set of publications can also be used as a qualitative measure of research productivity. The connection between the research structures and the set of publications is relatively complicated for two reasons: there are scientists who don’t have publications, and some publications are produced by more than one author. It is very interesting [143] that a large number of researchers (about \(50\,\%\)) do not have publications. This will be further discussed in Chap. 4. But the absence of publications does not mean that the corresponding researchers are lazy or incompetent. It may mean that their contribution to the publications is not large enough for their names to be included in the list of coauthors of the publication. In order to evaluate the scientific production of such researchers, one may have to use units that are smaller than a research publication. Such a smaller unit may be, for example, the time spent for support of research work, the number of collaborations with researchers who are authors of publications, etc. The current units used widely for measurement of scientific performance are (i) unit for information: research publication (scientific paper); (ii) unit for impact of the unit for information: a citation of the research publication. It seems that these units are too crude or at least they are too large to measure the research performance of large classes of researchers. Thus for researchers with low productivity (in terms of research publications), one may think about other units for measurement of their performance, whereas research papers and their citations may be used to measure the performance of the top class of researchers. The research community is greatly interconnected, and it needs all kinds of “workers”: full “workers” who produce scientific publications and partial “workers” who support the full “workers” in the process of producing the scientific publications.

Statistical methods are much used for evaluation of research work [359]. One has to be careful in applying such methods, because the number of components of a scientific organization may be large, but it is not very large in comparison with the number of molecules in a glass of water, for example. In addition, in contrast to molecules (which are weakly connected), the elements of a research organization may be strongly connected, and these connections are an important part of their structure and system of working. Thus one has to be careful about the direct application of methods from thermodynamics (such as temperature or entropy) to scientific systems and organizations. Such methods may be useful, but they reflect only some of the properties of the research organizations, since a research organization is not only a statistical collection of elements each of which is in random motion. For example, in addition to the random processes in research organizations, there exist also deterministic processes that may lead to the rise of complex research structures.

This book is focused on research publications and their citations. Because of this, we shall discuss below in more detail the importance of research publications for assessment of research production. The importance of citations of research publications for assessment of research production will be demonstrated in Chap. 2, where we shall discuss numerous indexes for evaluation of research production constructed on the basis of citations of research publications.

Research production has the form of books, monographs, reports, theses, articles in journals, etc. For the purposes of scientometric studies, papers published in refereed scientific journals seem to be the most suitable unit (at least today and for some time in the future). Usually, scientific papers are not subdivided into smaller units (this may cause some problems, since the scientific contribution of large groups of research personnel is not enough to put the names of the corresponding researchers as coauthors of research articles). Other elements used in the evaluation of research are, e.g., the number of citations of a paper, the number of references, the number of coauthors of the paper, etc. The number of publications of a researcher may be used as an indicator of latent characteristics of researchers and scientific organizations such as prestige of the researcher; prestige of the research organization of the researcher; contribution to science of researchers or research organization; productivity of researcher and research organization; eliteness of the researcher or research organization. The speed of increase of the number of research publications can be used as an indicator for the currency of a scientific research area; perspectives of a scientific research area; phase of development of a research group or organization (at the initial phase of development, there is a rapid increase in the number of publications; then the speed of increase of publications falls, and finally one observes maturity of the research group or organization with an almost constant number of research publications per year).

A significant number of publications are published jointly, and the tendency for joint publications increases with the increasing complexity of scientific research. But how to count joint publications? There are many ways to do this. The three most popular of them are [360]:

  1. Normal count: Every author of a publication receives full credit. For example, if a publication has four authors, it counts as one publication for every author [266].

  2. Straight count: Only the first author receives credit for the publication. This count discriminates against the second and subsequent authors.

  3. Adjusted count: Every author of a publication receives an equal fraction of the total credit of one unit [361]. For example, if a publication has four authors, every author is given one-fourth credit. By this measure, the relative contribution of every author to the article is ignored.
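The three counts above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `count_publications`, the representation of a paper as a list of author names in byline order, and the sample data are assumptions made for the example and do not come from the cited sources.

```python
from collections import defaultdict
from fractions import Fraction


def count_publications(papers, method="normal"):
    """Return {author: credit} for a list of papers.

    Each paper is a list of author names in byline order (an assumed
    representation). `method` is "normal", "straight", or "adjusted".
    """
    credit = defaultdict(Fraction)  # missing authors start at 0
    for authors in papers:
        if method == "normal":
            # every author receives full credit
            for author in authors:
                credit[author] += 1
        elif method == "straight":
            # only the first author receives credit
            credit[authors[0]] += 1
        elif method == "adjusted":
            # one unit of credit is split equally among the authors
            share = Fraction(1, len(authors))
            for author in authors:
                credit[author] += share
        else:
            raise ValueError(f"unknown counting method: {method}")
    return dict(credit)


# Hypothetical data: three papers with 4, 2, and 1 authors.
papers = [["A", "B", "C", "D"], ["B", "A"], ["C"]]
normal = count_publications(papers, "normal")
straight = count_publications(papers, "straight")
adjusted = count_publications(papers, "adjusted")
```

Under the adjusted count, the credits of all authors add up to the number of papers, whereas the normal count gives a total that grows with the number of coauthors.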

Several additional formulas for counting joint publications are as follows (N is the number of coauthors in all relationships below and \(S_n\) is the score assigned to the nth author)

  • HCM-count (Howard–Cole–Maxwell) [362]

    $$\begin{aligned} S_n = \frac{(3/2)^{N-n}}{\sum \limits _{i=1}^N (3/2)^{i-1}}. \end{aligned}$$
    (1.2)
  • EKW-count (Ellwein–Khahab–Waldman) [363]

    $$\begin{aligned} S_n = \frac{b^{n-1}}{\sum \limits _{i=1}^N b^{i-1}}. \end{aligned}$$
    (1.3)

    In [363], \(b=0.8\).

  • LV-count (Lukovits–Vinkler) [364]

    $$\begin{aligned} S_1 = \frac{N+1}{2 N F}; \ \ S_n = \frac{n + T}{2 n F T} \ \ (n \ge 2), \end{aligned}$$
    (1.4)

    where

    $$ T = \frac{100}{A}; \ \ F = \frac{1}{2} \left( \frac{1}{N} + \frac{N-1}{T} + \sum \limits _{n=1}^N \frac{1}{n}\right) $$

    and A is the authorship threshold (percentage as the lowest share of contribution to a paper: five or ten percent of total credit).

  • TG-count (Trueba–Guerrero) [365]

    $$\begin{aligned} S_n = \frac{2(2N-n+2)}{3N(N+1)}(1-f) + C_n f, \end{aligned}$$
    (1.5)

    where f is the share for crediting favored authors (usually the favored authors are the first, the second, and the last author), \(0< f < 1\), and \(C_n\) is the rate of favoring the nth author, with \(\sum \limits _{n=1}^N C_n=1\).
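The four formulas (1.2)–(1.5) can be checked numerically with the following sketch. The function names and the sample values of N, A, f, and \(C_n\) are assumptions chosen for illustration; exact rational arithmetic is used so that one can verify that the scores of all coauthors add up to one unit of credit.

```python
from fractions import Fraction


def hcm(N):
    """HCM-count, Eq. (1.2): scores S_1..S_N for N coauthors."""
    denom = sum(Fraction(3, 2) ** (i - 1) for i in range(1, N + 1))
    return [Fraction(3, 2) ** (N - n) / denom for n in range(1, N + 1)]


def ekw(N, b=Fraction(4, 5)):
    """EKW-count, Eq. (1.3), with b = 0.8 as in [363]."""
    denom = sum(b ** (i - 1) for i in range(1, N + 1))
    return [b ** (n - 1) / denom for n in range(1, N + 1)]


def lv(N, A=10):
    """LV-count, Eq. (1.4); A is the authorship threshold in percent."""
    T = Fraction(100, A)
    harmonic = sum(Fraction(1, n) for n in range(1, N + 1))
    F = Fraction(1, 2) * (Fraction(1, N) + Fraction(N - 1) / T + harmonic)
    first = Fraction(N + 1) / (2 * N * F)                      # S_1
    rest = [(n + T) / (2 * n * F * T) for n in range(2, N + 1)]  # S_n, n >= 2
    return [first] + rest


def tg(N, f, C):
    """TG-count, Eq. (1.5); C[n-1] is the favoring rate C_n, sum(C) == 1."""
    return [
        Fraction(2 * (2 * N - n + 2), 3 * N * (N + 1)) * (1 - f) + C[n - 1] * f
        for n in range(1, N + 1)
    ]


# Hypothetical example: a paper with three coauthors.
hcm_scores = hcm(3)
ekw_scores = ekw(3)
lv_scores = lv(3, A=10)
tg_scores = tg(3, Fraction(1, 4), [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)])
```

For three coauthors, the HCM scores are 9/19, 6/19, and 4/19, so the first author receives the largest share; each of the four schemes distributes exactly one unit of credit.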

There are additional systems for distribution of scores among coauthors [366, 367]. The problem of coauthorship will be discussed again in Chap. 2 in connection with the h-index.

The following additional indicators based on publications may be used for assessment of publication activity of single researchers, research groups, or research organizations:

  1. Distribution of number of publications in research organizations or in research groups of a research organization.

  2. Distribution of publications in research journals.

  3. Distribution of researchers with respect to the number of publications for the three kinds of counts mentioned above.

  4. Distribution of publications with respect to gender of the authors.

  5. Distribution of publications with respect to language.

  6. Distribution of publications with respect to their kind (articles, papers in conference proceedings, book chapters, books, etc.).

  7. Distribution of publications with respect to the kind of research work (theoretical publications, experimental publications, reviews, etc.).

Finally, let us note here an interesting effect occurring when funding is linked to publication counts [368]. In this case, the publication numbers may jump dramatically, but with the highest percentage increase in the lower-impact journals. And the jump is larger as a percentage in universities than in government research institutes, which could mean that there is an unused research potential in universities, whereas research institutes already produce many publications and can’t increase their percentage by as much as is the case for universities.

12 Notes on the Methods of Research Assessment

12.1 Method of Expert Evaluation

Assessment of the production and productivity of researchers can be made by expert evaluation of the work of any researcher, e.g., for the last three or five years. The method of expert evaluation judges mainly the quality of the work of a researcher. One variant of realization of this method is to use a commission of five to seven experts. These experts evaluate a researcher with respect to two criteria: contribution of the researcher to the corresponding area of science (external criterion) and importance and usefulness of the researcher for his/her scientific organization (internal criterion). The commission of experts ranks researchers with respect to the above two criteria. After the ranking, the ranks of the researchers can be converted to points according to some appropriate conversion scheme [143]. Evaluated researchers are divided into groups, since the productivity of a researcher depends on the scientific environment (conditions of work and relationships among the scientists from the scientific group to which the researcher belongs) and on his/her status in the scientific organization. There are five groups of researchers:

  1. Researchers without a doctoral degree who perform assistant work in the research or applied units of the scientific organization.

  2. Researchers without a doctoral degree who perform research work in the applied units of the research organization.

  3. Researchers without a doctoral degree who perform research work in the research units of the research organization.

  4. Researchers with a doctoral degree who perform research work in the applied units of the research organization.

  5. Researchers with a doctoral degree who perform research work in the research units of the research organization.

There can be subgroups of these groups. For example, the researchers with doctoral degrees who perform research work in the research units of the research organization can be assistant professors, associate professors, and full professors.

Expert evaluation may be a part of the complex evaluation of a researcher. Such a complex evaluation may contain, for example [369]:

  1. Evaluation of production by number of publications, number of internal and external scientific reports, number of developed methods, discovered effects or other significant achievements of the researcher.

  2. Further evaluation of production by total number of pages of publications, reports, manuscripts, etc.

  3. Evaluation of importance of produced knowledge by number of citations. An important problem here is the scaling of citations from different years or different subject categories [370, 371].

  4. Evaluation by prestige of the journals and publishing houses that published the articles, book chapters, or books of the scientist.

  5. Evaluation on the basis of obtained national and international awards.

  6. Expert evaluation of the significance of the achievements for solving real problems.

  7. Expert evaluation of the influence of the scientist on the basis of the following categories: influence in the area of research, influence in the corresponding scientific discipline, influence in other scientific disciplines.

  8. Evaluation of influence by the number of colleagues from the scientific organization who think that this researcher is a high-quality scientist.

  9. Further evaluation of influence on the basis of the number of publications that are judged as important by the colleagues, and the number of times each of the important publications is pointed to as important by the colleagues of the researcher.

In addition to the above evaluation, there may also be an evaluation of the distribution of the time spent by a researcher as follows [143]: time for the main scientific work (time for the research work of the scientist, time for scientific research of general interest that leads to the solution of large classes of problems, time for scientific research of special interest that leads to the solution of specific problems, time for transfer of scientific research for improving products or processes or for obtaining new kinds of products or processes); time for directing research work of other researchers; time for collaboration with other researchers; time for consultations and expert evaluations; time for pedagogical work; time for administrative work (time for internal administrative work within the scientific unit, i.e., laboratory, section, or chair; time for communication with the higher levels of the administrative hierarchy; time for relations with other scientific groups and clients). The list of evaluations may be enlarged, e.g., by numerous indexes presented in Chap. 2.

12.2 Assessment of Basic Research

Assessment of basic research is a problem for research administrators, since there are no simple measures of the contribution to scientific knowledge made by researchers. Many partial indicators exist, and each of them accounts for one (or several) factors that influence the basic research and is influenced by other factors that are not connected to the basic research. Thus in order to obtain reliable results, one has to minimize the influence of the factors that are not connected to research [372]. This can be done on the basis of the concept according to which basic research is considered as a process with inputs, process body (scientific production), and outputs [373]. The elements of the process of basic research are

  1. Inputs: stock of scientific knowledge and existing techniques; financial resources; institutional scientific resources (scientific instruments, skilled personnel, etc.); recruited personnel (untrained students, etc.); environmental conditions (such as diverse natural influences).

  2. Scientific production: conceptual, experimental and technical work of scientists; support work by engineers, technicians, etc.; dissemination of research results; education work for development and reproduction of scientific skills (training young scientists, etc.); administrative support work (including organizing adequate inputs).

  3. Outputs: scientific contributions to the discipline, new techniques and new scientific knowledge; scientific contributions to other areas of science; educational contribution: trained students, PhD students, and scientists; economic contribution (such as engineers and workers with increased skills for industry, technological spin-offs, commercial benefits for equipment suppliers, etc.); cultural contributions.

In order to minimize the influence of factors not connected to basic research, Martin and Irvine [372] have proposed the method of converging partial indicators for assessment of basic research. This method is based on the following five principles:

  1. Indicators are applied to research groups rather than to the individual scientists;

  2. Citation-based indicators are seen as reflecting the impact (and not the quality or importance) of the research work [374, 375];

  3. A range of indicators is used, each of which focuses on different aspects of the group performance;

  4. As far as possible, indicators are applied to a matched group (comparing like with like principle);

  5. As the indicators used have an imperfect or partial nature, only in those cases where they yield convergent results can it be assumed that the influence of peripheral factors has been kept small. In these cases, it can be assumed that the indicators provide a reliable estimate of the contribution of the different groups to scientific progress.

An algorithm for scientometric assessment that may be used also for assessment of basic research has been proposed by Moravcsik [376]. This algorithm includes

  1. specifying the purpose of the assessment;

  2. specifying the system to be assessed;

  3. deciding on the level of assessment;

  4. setting criteria;

  5. selecting methods and specifying indicators for each criterion;

  6. determining the links among the components within and outside the system (scientific political issues, type of the subject field and activity, etc.);

  7. carrying out measurements;

  8. interpreting the results obtained;

  9. drawing conclusions of the assessment.

The evaluators should combine the scientometric assessment and peer reviews [377, 378] in order to obtain a complete picture of the evaluated research group or research organization [379–384].

The application of indicators for assessment of basic research is associated with certain problems. Two of these problems are as follows [372]: (i) the contribution of each publication to scientific knowledge is different; (ii) publication rates are different for different research fields. Four important problems connected to indicators based on citation analysis are [385, 386]: (i) technical problems with databases such as authors with identical names, variation of names, incomplete coverage of journals; (ii) variation of the citation rate of a paper during its lifetime; (iii) presence of critical citations or halo-effect citations (halo effect means that an observer’s overall impression of other researchers influences the observer’s feelings and thoughts about the properties of their publications); (iv) self-citations and in-house citations. Finally, there are problems connected to peer evaluation, e.g.: (i) individuals evaluate scientific contributions on the basis of their cognitive and social status (which can be quite different); (ii) perceived implications of results for one’s own center and for competitors may affect evaluation; (iii) conformist assessments (for example, triggered by halo effect or by lack of knowledge about the contributions of different centers).

12.3 Evaluation of Research Organizations and Groups of Research Organizations

Evaluation of research and technology programs [387], research organizations, and groups of research organizations becomes increasingly important, especially when limited resources for research have to be distributed. Below we mention three examples of such evaluation systems used in continental Europe.

  1. The SEP system of The Netherlands

    SEP (Standard Evaluation Protocol) has two key objectives: (i) to improve research quality based on an external peer review, including scientific and societal relevance of research, research policy and management; (ii) to be accountable to the board of the research organization, and toward funding agencies, government, and society. SEP has two levels: evaluation of the research institute as a whole, and evaluation of specific research groups or programs. The criteria for evaluation are quality, productivity, (social) relevance, and vitality and feasibility. Each of these criteria contains several subcriteria.

  2. AERES system (France)

    The Evaluation Agency for Research and Higher Education (AERES) (www.aeres-evaluation.com) is an independent administrative authority whose task is to evaluate French research organizations and institutions, research and higher education institutions, scientific cooperation foundations and institutions as well as the French National Research Agency. Usually each year, 25 % of all institutions are evaluated, and thus the national evaluation cycle is four years. The evaluation criteria of AERES are: scientific quality and output; academic reputation and drawing power; interactions with the social, economic, and cultural environment; organization and life of the institution; involvement in training by research; strategy and scientific prospects for the next period.

  3. Evaluation of national research policy: OECD indicators

    Various systems of indicators for evaluation of national research policy can be constructed (for example, see [388, 389]). One example is the system of eight categories of indicators applied by OECD for benchmarking of national research policies [390]. These categories of indicators are: human resources in research and technology development; public and private investment in research and technology development; scientific and technological production; impact of research and technology development on economic competitiveness and employment; human resources, knowledge creation; transmission and application of knowledge; innovation finance, output, and market.

13 Mathematics and Quantification of Research Performance. English–Czerwon Method

Egghe [391] discusses performance in twelve diverse information production systems. Below, we shall focus our attention on a mathematical method for quantification of research performance: the English–Czerwon method, which may be used for assessment of research performance within the scope of the following kinds of information production systems discussed in [391]: papers–citations system; authors–publications system; authors–citations system. The English–Czerwon method [392] is an interesting example of a combination of evaluation on the basis of quantitative indicators with evaluation on the basis of peer review. Its description is as follows.

Suppose there are k research units, and let the performance of the ith unit (\(i=1,\dots ,k\)) for the nth year of evaluation be denoted by \(p_i(n)\ge 0\). The performance has two components,

$$\begin{aligned} p_i(n)=o_i(n)+s_i(n), \end{aligned}$$
(1.6)

where:

  • \(o_i(n)\) is the “objective” part of the performance, calculated on the basis of quantitative indicators;

  • \(s_i(n)\) is the “subjective” part of performance, calculated on the basis of the judgment of all evaluated units about the research of the ith unit for the nth year of evaluation.

There are two variants of the method: (i) weighting without accounting for the current performance; (ii) weighting with accounting for the current performance.

13.1 Weighting Without Accounting for the Current Performance

In this variant of the method, the “subjective” part of the performance is calculated as follows:

$$\begin{aligned} s_i(n)=\sum \limits _{j=1}^{k} q_{ij} w_j(n), \end{aligned}$$
(1.7)

where the weight \(w_j(n)\) is defined as \(w_j(1)=o_j(1)\) for the first year of evaluation, and

$$\begin{aligned} w_j(n)=o_j(n)+\frac{\sum \limits _{l=1}^{n-1} p_j(l)}{n-1} \end{aligned}$$
(1.8)

for the years \(n>1\). We note that in this variant of the methodology, the performance for the current year \(p_i(n)\) is not taken into account. The methods for calculation of the parameters \(o_j\), \(q_{ij}\), and \(w_j(n)\) will be discussed in a separate paragraph below.
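A minimal sketch of this first variant, following (1.6)–(1.8), is given below. The function name, the data layout (`o_by_year[n-1][i]` for \(o_i(n)\) and `Q[i][j]` for \(q_{ij}\)), and the sample numbers are assumptions made for illustration; in particular, setting the self-evaluations \(q_{ii}\) to zero is a choice of the example and is not prescribed by the method.

```python
def performance_without_current(o_by_year, Q):
    """Return p_by_year, where p_by_year[n-1][i] is p_i(n).

    o_by_year[n-1][i] holds o_i(n) and Q[i][j] holds q_ij (assumed layout).
    """
    k = len(Q)
    p_by_year = []
    for year_index, o in enumerate(o_by_year):
        n = year_index + 1
        if n == 1:
            # w_j(1) = o_j(1)
            w = list(o)
        else:
            # Eq. (1.8): w_j(n) = o_j(n) + mean of p_j(1), ..., p_j(n-1)
            w = [
                o[j] + sum(p[j] for p in p_by_year) / (n - 1)
                for j in range(k)
            ]
        # Eq. (1.7): s_i(n) = sum_j q_ij w_j(n)
        s = [sum(Q[i][j] * w[j] for j in range(k)) for i in range(k)]
        # Eq. (1.6): p_i(n) = o_i(n) + s_i(n)
        p_by_year.append([o[i] + s[i] for i in range(k)])
    return p_by_year


# Hypothetical data: two research units evaluated over two years.
Q = [[0.0, 0.1],
     [0.2, 0.0]]
o_by_year = [[10.0, 20.0],
             [12.0, 18.0]]
p_by_year = performance_without_current(o_by_year, Q)
```

With these numbers, the first year gives \(p(1)=(12, 22)\): each unit's objective score is increased by the other unit's weighted evaluation. In the second year the weights also contain the average of the past performances.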

13.2 Weighting with Accounting for the Current Performance

The performance in the current year \(p_i(n)\) is taken into account by defining the weight as follows:

$$\begin{aligned} \overline{w}_j(n)= \frac{1}{n} \sum \limits _{l=1}^n p_j(l). \end{aligned}$$
(1.9)

Let us use the weight (1.9) in (1.7) and substitute the result in (1.6). We obtain an equation for \(p_i(n)\) as follows:

$$\begin{aligned} p_i(n)=o_i(n)+ \sum \limits _{j=1}^{k} q_{ij} \overline{w}_j(n) = \nonumber \\ o_i(n)+\sum \limits _{j=1}^{k} q_{ij} \frac{p_j(1)+\dots +p_j(n-1)}{n} + \sum \limits _{j=1}^{k} q_{ij} \frac{p_j(n)}{n}. \end{aligned}$$
(1.10)

We have the following relationship from (1.9):

$$\begin{aligned} p_j(1)+ \dots + p_j(n-1) = (n-1)\overline{w}_j(n-1). \end{aligned}$$
(1.11)

The substitution of (1.11) in (1.10) leads to the following equation for \(p_i(n)\):

$$\begin{aligned} p_i(n) = o_i(n) + \sum \limits _{j=1}^k q_{ij} \frac{(n-1)\overline{w}_j(n-1) + p_j(n)}{n}. \end{aligned}$$
(1.12)

We note that (1.12) defines a system of equations for \(p_i(n)\), and this system still has to be solved.

The solution of (1.12) can be presented in matrix form. Let I be the identity matrix (which contains 1 in all diagonal positions and 0 elsewhere) and Q the matrix whose elements are the evaluations \(q_{ij}\). Let the vectors \(\mathbf {p}(n)\), \(\mathbf {o}(n)\), and \(\mathbf {\overline{w}}(n-1)\) have components \(p_i(n)\), \(o_i(n)\), and \(\overline{w}_i(n-1)\) respectively. Then the solution of (1.12) is

$$\begin{aligned} \mathbf {p}(n) = (nI - Q)^{-1}[n \mathbf {o}(n)+(n-1)Q\mathbf {\overline{w}}(n-1)]. \end{aligned}$$
(1.13)

In order to use this solution, we have to impose an additional requirement on the matrix Q that ensures positive values of \(p_i(n)\). The requirement is that the eigenvector of Q with positive elements have an eigenvalue less than n. Then each eigenvalue of Q is less than n in absolute value (Frobenius–Perron theorem), and the power series expansion of \((nI - Q)^{-1}\) (where the exponent \(-1\) means inversion of the matrix \((nI - Q)\)) converges to a matrix with nonnegative elements. This will ensure positive values of \(p_i(n)\). For small values of the elements of Q, we have the approximation

$$\begin{aligned} (nI - Q)^{-1} \approx \frac{I}{n} + \frac{Q}{n^2}, \end{aligned}$$
(1.14)

and the solution (1.13) for \(\mathbf {p}(n)\) becomes

$$\begin{aligned} \mathbf {p}(n) \approx \mathbf {o}(n) + \frac{Q \mathbf {o}(n) + (n-1) Q \mathbf {\overline{w}}(n-1)}{n}. \end{aligned}$$
(1.15)
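The relation between the exact solution and the approximation (1.15) can be illustrated with a short sketch. Instead of inverting \(nI - Q\) explicitly, the code below solves the system (1.12) by fixed-point iteration, which is a substituted technique chosen here for simplicity; it converges under the same eigenvalue condition as the series expansion of \((nI - Q)^{-1}\). The function names and the sample data are assumptions made for the example.

```python
def mat_vec(Q, v):
    """Product of the matrix Q (list of rows) with the vector v."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]


def solve_current_year(o, Q, w_prev, n, tol=1e-12, max_iter=10_000):
    """Solve Eq. (1.12) for p(n) by iterating p <- o + Q((n-1) w_prev + p)/n."""
    k = len(o)
    p = list(o)  # start from the objective part
    for _ in range(max_iter):
        inner = [(n - 1) * w_prev[j] + p[j] for j in range(k)]
        q_inner = mat_vec(Q, inner)
        new_p = [o[i] + q_inner[i] / n for i in range(k)]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    raise RuntimeError("no convergence; the eigenvalues of Q may not be below n")


def approx_current_year(o, Q, w_prev, n):
    """First-order approximation, Eq. (1.15)."""
    q_o = mat_vec(Q, o)
    q_w = mat_vec(Q, w_prev)
    return [o[i] + (q_o[i] + (n - 1) * q_w[i]) / n for i in range(len(o))]


# Hypothetical data: two units. For n = 1 the factor (n - 1) removes w_prev.
Q = [[0.0, 0.1],
     [0.2, 0.0]]
o1 = [10.0, 20.0]
p1 = solve_current_year(o1, Q, [0.0, 0.0], n=1)
w1 = p1  # Eq. (1.9) for n = 1: the weight equals p(1)

o2 = [12.0, 18.0]
p2_exact = solve_current_year(o2, Q, w1, n=2)
p2_approx = approx_current_year(o2, Q, w1, n=2)
```

For these small values of \(q_{ij}\), the approximate and the iterated solutions for the second year differ by a few tenths on values of order 10 to 20; the difference comes from the neglected terms of second and higher order in Q.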

13.3 How to Determine the Values of Parameters

In order to use the two variants of the English–Czerwon method, we have to determine the values \(o_i(n)\) of the “objective” part of the performance as well as the values of the evaluation coefficients \(q_{ij}\) from the “subjective” part of the performance. The simplest way to set the values of \(o_i\) is just to consider the value of one indicator: the number of citations of the papers of the research organization for the current year, for example. Of course, more than one indicator can be incorporated in \(o_i\) by means of appropriate weighting.

The values of \(q_{ij}\) can be set as follows. We note that these values have to be small (in order to satisfy the assumption for small Q on which basis we obtained the relationship (1.15) above). The determination of the values can be made as follows:

  1. Evaluation of performance by ranking research organizations.

    One takes several (five to ten) leading scientists from each research organization and asks each of them to rank the research organizations with respect to their performance. The ranking is between rank 1 and rank k. Then the average rank \(\overline{r}_{ij}\) is calculated by averaging the ranks assigned to the ith organization by the scientists of the jth organization:

    $$\begin{aligned} \overline{r}_{ij} = \frac{1}{L} \sum \limits _{l=1}^{L} r_{ij}^{(l)}, \end{aligned}$$
    (1.16)

    where \(r_{ij}^{(l)}\) is the rank assigned to the ith organization by the lth evaluating scientist of the jth organization, and L is the number of evaluating scientists.

  2. Calculation of \(q_{ij}\).

    The \(q_{ij}\) are calculated as follows:

    $$\begin{aligned} q_{ij}=\frac{k-\overline{r}_{ij}}{M}, \end{aligned}$$
    (1.17)

    where M is a large number (of order of 5k or larger), which ensures that the values of \(q_{ij}\) are small.

  3. Obtaining \(p_i(n)\).

    The \(p_{i}(n)\) are then obtained on the basis of (1.15). The final step is to perform a normalization

    $$\begin{aligned} \overline{p}_i(n) = \frac{p_i(n)}{\sum \limits _{j=1}^k p_j(n)}, \end{aligned}$$
    (1.18)

    and one can rank the institutions with respect to the values of \(\overline{p}_i\) for the corresponding year.
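The determination of \(q_{ij}\) from the rankings, (1.16) and (1.17), and the normalization (1.18) can be sketched as follows. The function names, the nested-list layout of the ballots, the default \(M=5k\), and the sample data are assumptions made for the example; it is also assumed here that every scientist ranks all k organizations, including his/her own.

```python
def evaluation_matrix(ranks, M=None):
    """Return Q with Q[i][j] = q_ij computed from rank ballots.

    ranks[j][l][i] is the rank (1 = best, k = worst) given to organization i
    by the l-th scientist of organization j (assumed layout).
    """
    k = len(ranks)
    if M is None:
        M = 5 * k  # "of order of 5k or larger"
    Q = [[0.0] * k for _ in range(k)]
    for j, panel in enumerate(ranks):
        L = len(panel)
        for i in range(k):
            r_bar = sum(ballot[i] for ballot in panel) / L  # Eq. (1.16)
            Q[i][j] = (k - r_bar) / M                       # Eq. (1.17)
    return Q


def normalize(p):
    """Eq. (1.18): divide each p_i(n) by the sum over all units."""
    total = sum(p)
    return [x / total for x in p]


# Hypothetical data: three organizations, two evaluating scientists each.
ranks = [
    [[1, 2, 3], [1, 3, 2]],  # ballots of the scientists of organization 0
    [[2, 1, 3], [2, 1, 3]],  # ballots of the scientists of organization 1
    [[1, 2, 3], [2, 1, 3]],  # ballots of the scientists of organization 2
]
Q = evaluation_matrix(ranks)
p_bar = normalize([14.0, 20.0, 6.0])
```

An organization that is always ranked first by the scientists of the jth organization obtains the largest coefficient \((k-1)/M\), and an organization that is always ranked last obtains zero, so all \(q_{ij}\) remain small when M is large.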

14 Concluding Remarks

At the end of Part I of this book, the reader may already have an impression of the complexity of science and research organizations; of the importance of science for society; of the features of research production and the non-Gaussianity of some statistical characteristics of quantities used for assessment of research; and of the quantities, methods, and systems used for assessment of research and research organizations. Thus the reader is prepared to move to the world of quantities and models used for the study of science dynamics and for assessment of research, researchers, and research organizations.