
1 Introduction

At an early stage, risk analysis was concerned with political or military games, aiming at decision making with minimum risk. The pioneering work of Quincy Wright [40] on the study of war was devoted to this line of thought. The mathematics and statistics involved could be considered low-level in our day, although he eventually applied differential equation theory with success.

In principle, Risk is defined as an exposure to the chance of injury or loss, i.e. a hazard or dangerous chance for the event under consideration. Therefore the probability of damage for the considered phenomenon (in Politics, Economy, Epidemiology, Food Science, Industry etc.), caused by external or internal factors, has to be evaluated, especially for the essential factors influencing the Risk. That is why we eventually refer to Relative Risk (RR), as each factor influences the Risk in a different way. In principle, the relative risk (RR) is the ratio of the probability of an outcome in an exposed group to the probability of that outcome in an unexposed group. Thus a value of RR = 1 means that the exposure does not affect the outcome, while a “risk factor” is identified when RR\(>\)1, i.e. when the risk of the outcome is increased by the exposure.

This is clear in Epidemiological studies, where in principle one needs to identify and quantitatively assess the susceptibility of a portion of the population to specific risk factors; hence we refer to RR. For a nice introduction to the statistical terminology for RR see Everitt [13, Chapter 12].
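As a minimal illustration of the definition, the following sketch (with purely hypothetical counts) computes RR from an exposed and an unexposed group:

```python
def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Ratio of outcome probabilities: exposed group over unexposed group."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)

# Hypothetical cohort: 30 of 100 exposed vs 10 of 100 unexposed show the outcome.
rr = relative_risk(30, 100, 10, 100)
print(f"RR = {rr:.1f}")  # RR = 3.0 > 1, so the exposure acts as a risk factor
```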

Such an early attempt was made by John Graunt (1620–1674), founder of Demography, who tried to evaluate the “Bills of Mortality”, as he explained in his work “Observations”, while almost at the same time his friend Sir William Petty (1623–1687), economist and philosopher, published the “Political Anatomy of Ireland”. These were early attempts to evaluate Social and Political Risk.

Still, there is a line of thought loyal to the idea that Risk Analysis is related only to political problems through Decision Theory; William Playfair (1759–1823) was among the first to work with empirical data, publishing in 1796 “For the Use of the Enemies of England: A Real Statement of the Finances and Resources of Great Britain”. Quincy Wright (1890–1970), in his excellent book “A Study of War”, develops simple indexes that evaluate Risk successfully, as has been pointed out, for a problem as important as war.

The statistical work of Florence Nightingale (1820–1910) is essential, as her “Notes on Matters Affecting the Health, Efficiency and Hospital Administration of the British Army” opened the problem of analysing Epidemiological Data, adopting the statistical methods of that time.

It was really Armitage and Doll [2] who introduced the modern statistical framework to the Cancer Problem. Later, Crump et al. [11] can be cited for their work on the carcinogenic process, while [14] provided a global treatment of Bioassays, and Megill [34] worked on Risk Analysis (RA) for economic data. It was emphasised that, at the early stages, the fundamental task in RA was to isolate the involved variables. Still, the statistical background required was not advanced, but the adoption of the triangular distribution was essentially useful. The triangular distribution has recently been approached under a different statistical background, but the triangle obtained from the mode, the minimum and the maximum value of the data can still prove very useful, as a special case of the trapezoidal distributions; see also Appendix 1 for a compact new presentation, while a more general framework was developed by Nguyen and McLachlan [35]. The main characteristic of the triangular distribution is its simplicity, and it can easily be adopted in practice. There are excellent examples with no particular mathematical difficulty in Megill [34].
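Since the triangular distribution is praised above for its simplicity, a small sketch may be useful: the density below is determined entirely by the minimum a, the mode c and the maximum b of the data (the numbers used are hypothetical):

```python
import random

def triangular_pdf(x, a, c, b):
    """Density of the triangular distribution with minimum a, mode c, maximum b."""
    if x < a or x > b:
        return 0.0
    if x <= c:
        return 2 * (x - a) / ((b - a) * (c - a))
    return 2 * (b - x) / ((b - a) * (b - c))

# Hypothetical data summarised by min = 2, mode = 5, max = 12.
print(triangular_pdf(5, 2, 5, 12))   # peak density 2/(b - a) = 0.2
print(random.triangular(2, 12, 5))   # one simulated value (args: low, high, mode)
```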

In Food Science the Risk Assessment problem is easier to understand for those who are not familiar with RA. In the next section we discuss the existing Practical Background, which is not that easy to develop, despite the characterisation “practical”. Most of the ideas presented are from the area of Food Science, where RA is very clear under a chemical analysis orientation.

In Sect. 3 the existing theoretical insight is discussed briefly and therefore the Discussion in Sect. 4 is based on Sects. 2 and 3.

2 Practical Background

Risk factors can increase during food processing, and food can be contaminated by filtering and cleaning agents or during packaging and storage. Therefore, in principle, chemical hazards can be divided into two primary categories:

  (i) Naturally occurring chemical hazards (mycotoxins, pyrrolizidine alkaloids, polychlorinated biphenyls etc.)

  (ii) Added chemical hazards (pesticides, antibiotics, hormones, heavy metals etc.)

The effect of each chemical as a Risk factor has been studied; we refer briefly to mycotoxins, as dairy products are among the foodstuffs most susceptible to contamination by them (humidity being one possible reason, among others); see Kitsos and Tsaknis [28], among others. Contamination may occur in two ways:

  1. Direct contamination

  2. Indirect contamination

Example 1 (Indirect Contamination)

Recall that, due to decontamination, bacteria become resistant, and therefore interhospital variants can appear. Moreover, a number of countries have introduced or proposed regulations for the control and analysis of aflatoxins in food.

As far as milk is concerned, the EU requires a maximum level of aflatoxin \(M_1\), \(\max M_1\) say, with \(\max M_1=0.05\) µg/kg. The maximum tolerated level for aflatoxin \(M_1\) in dairy products is not the same all over the world and is therefore regulated separately in some countries.

The problem of mixtures has been discussed from a statistical point of view, for the cancer problem, in Kitsos and Edler [23]. In practice the highly carcinogenic polychlorinated biphenyls (PCBs) are mixtures of chlorinated biphenyls with varying percentages of chlorine per weight. It has been noticed, Blüthgen et al. [3], that PCBs have led to a worldwide contamination of the environment due to their physical/chemical properties. Moreover, PCBs have been classified as probable human carcinogens, while no Tolerable Daily Intake (TDI), the main safety standard, has been established for them. Eventually the production of PCBs was banned in the USA in 1979 and internationally in 2001.

Example 2

Dioxins occur as complex mixtures, Kitsos and Edler [23], and the mixtures act through a common mechanism but vary in their toxic potency. As an example, tetrachlorodibenzo-p-dioxin (TCDD) has been classified as a human carcinogen, as there are epidemiological studies on exposure to 2,3,7,8-tetrachlorodibenzo-p-dioxin and cancer risk. It might not be responsible for producing substantial chronic disability in humans, but there is experimental evidence for its carcinogenicity, McConnell et al. [33].

The TDI for dioxins is 1–4 pg TEQ/kg body-weight/day, which is exceeded in industrialised countries. Recall that the Toxic Equivalence Quotient (TEQ) is used by the USA Environmental Protection Agency (EPA), with the threshold for safe dioxin exposure set at a toxicity equivalence of \(0.7\) picograms per kilogram of body weight per day.

The Lethal Dose is an index of the percentage P of the lethal toxicity LD\({ }_P\) of a given toxic substance or type of radiation. LD\({ }_{0.5}\) is the amount of material, given at once, that causes the death of 50% of the group of animals (usually rats and mice) under investigation. Furthermore, the median lethal dose LD\({ }_{0.5}\) is widely used as a measure of the same effect in toxicity studies. Not only the lethal dose but also the low percentiles need special consideration; see Kitsos [18], who suggested a sequential approach to face the problem.
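To make the percentile idea concrete: under a logit dose-response model \(P(d)=(1+e^{-(\alpha +\beta d)})^{-1}\), the dose producing response probability p is obtained by inverting the model. The sketch below uses hypothetical fitted values of \(\alpha ,\beta \) (the sequential approach of Kitsos [18] would be one way to refine such estimates):

```python
import math

def ld_percentile(p, alpha, beta):
    """Dose at which the logit model 1/(1 + exp(-(alpha + beta*d))) predicts p."""
    return (math.log(p / (1 - p)) - alpha) / beta

# Hypothetical fitted logit parameters (dose in mg/kg): alpha = -5.0, beta = 1.5.
print(ld_percentile(0.50, -5.0, 1.5))  # median lethal dose LD_{0.5}: ~3.33
print(ld_percentile(0.01, -5.0, 1.5))  # low percentile LD_{0.01}: ~0.27
```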

Now, if we assume that two components \(C_1\) and \(C_2\) are identical, except that \(C_1\) is thinned by a factor \(T<1\), then we can replace a dose \(d_1\) of \(C_1\) by an appropriate dose of \(C_2\) having the same effect as dose \(d_1\). In such a case the effect of a combination of doses \(d_1\) and \(d_2\) of the components under consideration, \(C_1\) and \(C_2\), equals that of \(Td_1+d_2\) of \(C_2\), or of \(d_1+(1/T)d_2\) of \(C_1\), respectively. The factor T is known as the relative potency of \(C_1\) to \(C_2\), and \(\lambda =1/T\) is called the relative potency of \(C_2\) to \(C_1\). Such simple but practical rules are appreciated by experimenters, as in the sketch below. Another practical problem in RA appears with the study of the involved covariates. The role of covariates, in this context, is of great interest and has been discussed by Kitsos [17].
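A minimal sketch of the relative potency rule just stated, with T assumed known from a previous assay and all numbers hypothetical:

```python
def dose_as_C2(d1, d2, T):
    """Combination (d1 of C1, d2 of C2) expressed as a single dose of C2."""
    return T * d1 + d2

def dose_as_C1(d1, d2, T):
    """The same combination expressed as a single dose of C1 (lambda = 1/T)."""
    return d1 + d2 / T

# Hypothetical relative potency T = 0.5 of C1 to C2, with doses d1 = 4, d2 = 1.
print(dose_as_C2(4, 1, 0.5))  # 3.0 units of C2
print(dose_as_C1(4, 1, 0.5))  # 6.0 units of C1; indeed 0.5 * 6.0 = 3.0
```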

Therefore, in principle, to cover as many sources of risk as possible, we can say that the target in human risk assessment is the estimation of the probability of an adverse effect on a human being, and the identification of the source of such an effect.

3 Theoretical Insight

In Biostatistics, and in particular in Risk Analysis for the Cancer problem, the evolution of the statistical applications can be traced in the over 1000 references in Edler and Kitsos [12]. The development of methods, the application of particular probabilistic models, Kopp-Schneider et al. [30], and statistical analyses underwent extensive development after 1980. Recently, Stochastic Carcinogenesis Models and Dose Response Models for Lung Cancer Screening are medical ideas with a strong statistical insight which have been adopted by the scientific community, Kitsos [21].

The variance-covariance matrix is related to Fisher’s information matrix, and it is the basis for evaluating optimal designs in chemical kinetics, Kitsos and Kolovos [24], while for a recent review of the mathematical models facing breast cancer see Mashekova et al. [32]. Fisher’s information measure appears either in a parametric form or in an entropy type. The former plays an important role in a number of statistical applications, such as optimal experimental design, the calibration problem, and the variance estimation of the parameters of the logit model in RA, Cox [9, 10], etc. The latter is fundamental to Information Theory. Both have been extended by Kitsos and Tavoularis [26].
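As a small illustration of the parametric form, the sketch below evaluates the Fisher information matrix \(X^{\top }WX\) of a simple logit model at a hypothetical design; its determinant is the criterion a D-optimal experimental design would maximise:

```python
import numpy as np

def logit_fisher_information(doses, alpha, beta):
    """Fisher information matrix of the logit model at the given design points."""
    X = np.column_stack([np.ones_like(doses), doses])  # design matrix (1, d)
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * doses)))  # response probabilities
    W = np.diag(p * (1.0 - p))                         # Bernoulli variances
    return X.T @ W @ X

# Hypothetical design and parameter values.
doses = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
M = logit_fisher_information(doses, alpha=-2.0, beta=1.0)
print(np.linalg.det(M))  # D-optimality criterion for this design
```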

Indeed, with the use of an extra parameter, which influences the “shape” of the distribution, the generalised Normal distribution was introduced. This is useful in cases where “fat tails” exist: the Normal distribution assigns about 0.05 probability to its tails, but there are cases where the underlying distribution places more than 0.05 probability in the tails. Such cases are covered by the generalised Normal distribution.

The \(\gamma \)-ordered Normal Distribution emerged from the Logarithmic Sobolev Inequality, and it is a generalisation of the multivariate Normal distribution, with an extra parameter \(\gamma \), as in the following; see also Appendix 2. It can be useful in RA to adopt the general cumulative hazard function, see (2) and (3) below. Therefore a strong mathematical background exists, which is certainly difficult to follow for the toxicologists, medical doctors, etc. who mainly work on RA. Still, appropriate software for it has not yet been developed.

The Normal distribution has been extended by Kitsos and Tavoularis [26], with a rather complicated form, quite the opposite of the easy-to-handle triangular distribution, see Appendix 1.

Let, as usual, \(\Gamma (a)\) be the gamma function and \(\Gamma (x,a)\) the upper incomplete gamma function. Then the cumulative distribution function (cdf) of the \(GN(\mu ,\sigma ^2;\gamma )\) is

$$\displaystyle \begin{aligned} {} \Phi_G(x)=1-\frac{\Gamma(\gamma_0,\gamma_0z^{\frac{1}{\gamma_0}})}{2\Gamma(\gamma_0)}\,,\,\,\, \gamma_0=\frac{\gamma-1}{\gamma}\,,\,\,\,\gamma\in\mathbb{R}-[0,1] ,\,\, z=\frac{x-\mu}{\sigma} \end{aligned} $$
(1)

with \(\mu \) the position parameter, \(\sigma \) the scale parameter and \(\gamma \) an extra, shape parameter. In this line of thought Kitsos and Toulias [25] as well as Toulias and Kitsos [37] worked on the Generalised Normal Distribution \(GN(\mu ,\sigma ^2 ; \gamma )\) with \(\gamma \in \mathbb {R}-[0,1]\) being the extra shape parameter. This extra parameter \(\gamma \) makes the difference: when \(\gamma =2\) the usual Normal is obtained, while other admissible values of \(\gamma \) yield a normal-like distribution with “heavy tails”.

Under this foundation the cumulative hazard function, \(H(\cdot )\) say, of a random variable \(X\sim GN(\mu ,\sigma ^2 ; \gamma )\) can be proved equal to

$$\displaystyle \begin{aligned} {} H(x)= -\log {A(\gamma_0,z)}\,,\,\,x>\mu \end{aligned} $$
(2)

while

$$\displaystyle \begin{aligned} {} H(x)= -\log (1-{A(\gamma_0,\vert z \vert)})\,,\,\,x\le\mu \end{aligned} $$
(3)

with

$$\displaystyle \begin{aligned} z=\frac{x-\mu}{\sigma}\,,\,\, A(\gamma_0, z )=\frac{\Gamma(\gamma_0,\gamma_0z^{\frac{1}{\gamma_0}})}{2\Gamma(\gamma_0)}\,. \end{aligned}$$

with \(\Gamma (a)\) being the gamma function and \(\Gamma (x,a)\) the upper incomplete gamma function.
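A short computational sketch of (1)–(3), assuming SciPy, whose `gammaincc(a, x)` is the regularised upper incomplete gamma \(\Gamma (a,x)/\Gamma (a)\), so that \(A(\gamma _0,z)\) is `gammaincc(g0, g0*z**(1/g0))/2`; the check uses the fact that \(\gamma =2\) recovers the usual Normal:

```python
import numpy as np
from scipy.special import gammaincc  # regularised upper incomplete gamma

def gn_A(gamma, z):
    """A(gamma_0, z) of the text, for z >= 0, with gamma_0 = (gamma - 1)/gamma."""
    g0 = (gamma - 1.0) / gamma
    return gammaincc(g0, g0 * z ** (1.0 / g0)) / 2.0

def gn_cumulative_hazard(x, mu, sigma, gamma):
    """Cumulative hazard H(x) of GN(mu, sigma^2; gamma), Eqs. (2)-(3)."""
    z = (x - mu) / sigma
    if x > mu:
        return -np.log(gn_A(gamma, z))
    return -np.log(1.0 - gn_A(gamma, abs(z)))

# Sanity check: for gamma = 2, 1 - A(., z) is the standard Normal cdf at z = 1.
print(1.0 - gn_A(2.0, 1.0))                      # ~0.8413, i.e. Phi(1)
print(gn_cumulative_hazard(1.0, 0.0, 1.0, 2.0))  # -log(1 - Phi(1)) ~ 1.841
```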

Example 3

As \(\gamma \rightarrow \pm \infty \) the Generalised Normal Distribution tends to Laplace, \(L(\mu ,\sigma )\). Then it can be proved that:

$$\displaystyle \begin{aligned} H(x)=\log 2+\frac{x-\mu}{\sigma}\,,\,\,x>\mu \end{aligned} $$
(4)

while

$$\displaystyle \begin{aligned} {} H(x)=-\log\left(1-\frac{1}{2}e^{\frac{x-\mu}{\sigma}}\right)\,,\,\,x\le\mu \,. \end{aligned} $$
(5)

See Toulias and Kitsos [37] for more examples.
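A quick numerical check of this limit, again assuming SciPy’s `gammaincc`: as \(\gamma \) grows, \(\gamma _0\rightarrow 1\) and the cumulative hazard (2) approaches the Laplace form (4):

```python
import math
from scipy.special import gammaincc  # regularised upper incomplete gamma

def gn_H_upper(z, gamma):
    """H(x) for x > mu, with z = (x - mu)/sigma, via Eq. (2)."""
    g0 = (gamma - 1.0) / gamma
    return -math.log(gammaincc(g0, g0 * z ** (1.0 / g0)) / 2.0)

z = 1.5
print(gn_H_upper(z, gamma=200.0))  # ~2.19, already close to the limit
print(math.log(2.0) + z)           # Laplace value log 2 + z = 2.1931...
```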

Let X be the rv denoting the time of death. Recall that the future lifetime at a given time \(x_0\) is the remaining time \(X-x_0\) until death. Therefore the expected value of the future lifetime can be evaluated; in principle it has to be a function of the involved survival function, Breslow and Day [5]. This idea can be extended to the \(\gamma \)-order Generalised Normal: for the future lifetime rv \(X_0\) at point \(x_0\), with \( X\sim GN(\mu ,\sigma ^2 ; \gamma )\), the density function (df) and the cdf can be evaluated, and the corresponding expected future lifetime is

$$\displaystyle \begin{aligned} {} E(X)=\frac{2(\mu-x_0) }{A(\gamma_0,z_0)}\,,\,\,z_0=\frac{x_0-\mu}{\sigma} \,\,\text{.} \end{aligned} $$
(6)

The above-mentioned results, among others, provide evidence that the theoretical insight is moving faster than the applications that need such results. These comments need special consideration and further analysis; we attempt this in Sect. 4.

4 Discussion

To emphasise how difficult the evaluation of Risk might be, we recall Simpson’s paradox, Blyth [4], where three events \(A,B,C\) are considered. If we assume

$$\displaystyle \begin{aligned} {} \begin{matrix} P(A \vert BC) \ge P(A\vert \bar{B}C)\,\text{,}\\ \\ P(A \vert B\bar{C}) \ge P(A\vert \bar{B}\bar{C})\,\text{,} \end{matrix} \end{aligned} $$
(7)

we might have

$$\displaystyle \begin{aligned} {} P(A \vert B) \le P(A\vert \bar{B}) \,\,\text{.} \end{aligned} $$
(8)
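A numerical illustration of (7) and (8), using the classic kidney-stone treatment counts often quoted for Simpson’s paradox (here A = recovery, B = treatment, C = small stones):

```python
# (recovered, total) in each treatment-by-stratum cell.
data = {
    ("B", "C"): (81, 87),        # treatment, small stones
    ("notB", "C"): (234, 270),   # no treatment, small stones
    ("B", "notC"): (192, 263),   # treatment, large stones
    ("notB", "notC"): (55, 80),  # no treatment, large stones
}

def rate(*cells):
    """Pooled recovery rate over the given cells."""
    r = sum(data[c][0] for c in cells)
    n = sum(data[c][1] for c in cells)
    return r / n

# Condition (7): B does at least as well within each stratum ...
print(rate(("B", "C")) >= rate(("notB", "C")))        # True: 0.93 vs 0.87
print(rate(("B", "notC")) >= rate(("notB", "notC")))  # True: 0.73 vs 0.69
# ... and yet (8): B looks worse once the strata are pooled.
print(rate(("B", "C"), ("B", "notC"))
      <= rate(("notB", "C"), ("notB", "notC")))       # True: 0.78 vs 0.83
```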

Therefore there is, a priori, a scepticism about how “sure” a procedure might be. In Epidemiological studies it is necessary to identify and quantitatively assess the susceptibility of a portion of the population to specific Risk factors. It is assumed that all have been exposed to the same possible hazardous factors; the difference, at the early stage of the study, lies only in a particular factor which acts as a susceptibility factor. In such cases Statistics provides the evaluation of the RR. That is why J. Graunt was mentioned in Sect. 1.

Concerning the \(2 \times 2\) setup for correlated binary responses, the backbone of medical doctors’ research, a very practical line of thought with a theoretical background was adopted by Mandal et al. [31], and this is exactly the spirit we would like to encourage, following Cox’s beliefs, Kitsos [22], on how Statistics can support other Sciences, especially medicine. They provide the appropriate proportions and their variances in a \(2 \times 2\) setup, so that a 95% confidence interval can be constructed. The Binary Response problem was discussed early on by Cox [9], while for a theoretical approach to cancer problems see Kitsos [20].
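A sketch in that spirit, using the standard Katz log-scale construction for an RR confidence interval from a \(2\times 2\) table (a textbook method, not necessarily the exact construction of Mandal et al. [31]); the counts are hypothetical:

```python
import math

def rr_confint(a, n1, c, n0, z=1.96):
    """Relative risk from a 2x2 table with a 95% CI on the log scale.
    a of n1 events in the exposed row, c of n0 in the unexposed row."""
    rr = (a / n1) / (c / n0)
    se = math.sqrt(1/a - 1/n1 + 1/c - 1/n0)  # standard error of log(RR)
    lo, hi = rr * math.exp(-z * se), rr * math.exp(z * se)
    return rr, lo, hi

# Hypothetical 2x2 table: 30/100 exposed vs 15/120 unexposed.
print(rr_confint(30, 100, 15, 120))  # RR = 2.4, 95% CI roughly (1.37, 4.20)
```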

The area of interest of RA is wide; it covers a number of fields, sometimes with completely different backgrounds, such as Politics and Food Science. But Food Science is related to Cancer problems, as we briefly discussed.

Excellent economic studies with “elementary” statistical work are covered by Megill [34], who provides useful results, as Wright [40] did earlier. Therefore we oscillate between Practice and Theory: we now have theoretical results waiting to be applied, as in the 60s we had Cancer problems waiting for statistical consideration.

The cancer problem was eventually the problem under consideration, and Sir David Cox provided a number of examples working on this, Cox [8,9,10], offering ideas on how we can proceed with medical data analysis, Kitsos [22], trying to keep it simple. In contrast, Tan [36] offers a completely theoretical approach, understandable perhaps only by mathematicians; Kopp-Schneider [29] reviews the theoretical stochastic models, and to a lesser extent Kitsos [19, 21] and Kitsos and Toulias [25] the appropriate modeling, which are difficult to follow for medical doctors, and not only for them.

A compromise between theory and practice has been attempted in Edler and Kitsos [12], where different approaches facing cancer are discussed, while Cogliano et al. [7] discuss more toxicologically oriented cases. The logit method took some time to be appreciated, but provides a nice tool for estimating Relative Risk, Kitsos [20], among others. The role of covariates in such studies, and not only for cancer, is of great interest and, we believe, needs to be investigated, Kitsos [17]. In this paper we provided food for thought through a comparison of the easy-to-understand work with the triangular distribution and the rather complicated Generalised Normal, see Appendix 2. It is not only a matter of choice; it depends heavily on the structure of the data. We would say: graph your data and then proceed with your analysis.

The logit methods can be applied in different applied areas. Certain guidelines have been adopted for different areas by international organisations, see IARC [16], WHO [39], US EPA [38] among others. As mentioned in Sect. 2, as far as Food Risk Assessment is concerned, Fisher et al. [15], Kitsos and Tsaknis [28], and Blüthgen et al. [3], among others, provide more chemical results and guidance for the involved risk, while Amaral-Mendes and Pluyger [1] offer an extensive list of Biomarkers for Cancer Risk Assessment in humans.

In Cancer problems, and not only there, the identification of the hazard function is crucial, and only Statistical Analysis can be adopted, Armitage and Doll [2], Crump et al. [11], Cogliano et al. [7], Kitsos [17, 18]. The extended work based on the generalised Normal distribution, presented in Sect. 3 in a global form generalising the hazard function, certainly needs not only appropriate software support but also to bridge the differences between the statistical line of thought and the applications.

Meanwhile, recent methods can be applied to face cancer, Carayanni and Kitsos [6], where the existing software offers great support. More geometrical knowledge is needed, or even fractals, to describe a tumour; but the communication with Medicine might be difficult.

We need to keep the balance of how “Statistics in Action” has to behave, offering solutions to crucial problems of Risk Analysis, see Mandal et al. [31], while the theoretical work of Tan [36] adds a strong background but is not useful for practical problems. Since Cox [10] provided a general solution for hazard functions, there has been an extensive development of Statistical Theory for Risk problems.

It might eventually be helpful to offer results, but we believe it is now also very crucial to offer solutions to the corresponding fields working in Risk Analysis. That is, the practical background needs, we believe, to be widely known, as it is more easily absorbed by practitioners, and the theoretical framework needs to be supported by appropriate software, so as to bridge the gap with practical applications.