As pointed out in the text above, the findings of a single study may well be due to chance despite formal statistical significance. This should not prevent publication, but further information is needed before a final decision is made about the risk associated with the drug exposure. Such replications can be made by other scientists or by the original authors in a new study using different material.

We can illustrate this with a concrete problem: the possible association between maternal use of the antihistamine loratadine and an increased risk of hypospadias in male offspring. The recording of maternal drug use in the Swedish Medical Birth Register, which began on July 1, 1994, was initiated by a committee that also included representatives of the drug industry. When data had been collected for a couple of years, it was suggested that the system be tested using a relatively new antihistamine, loratadine, as the test drug. This was done, and the first analysis was made in June 1997. At that time there were 355 exposed infants, and the total number of congenital malformations was as expected, but there were two cases of hypospadias (0.7 expected). This could obviously be a chance finding, and nothing known about the mechanism of action of the drug suggested an antiandrogenic effect. The monitoring continued, as seen in Fig. 12.1. In May 1998 there were seven infants with hypospadias against 1.6 expected, and real concern was raised about a possible causal association. Over the next 3 years, however, only two further cases were seen, which supported the idea that the original cluster was random. Then a new burst of cases occurred, so that by November 2001 there were 15 cases against an expected number of 5.6; since the first cluster observed in May 1998, a further eight cases had occurred against an expected number of four. Numbers were small and the malformation not very serious, but it was felt necessary to report the finding (Källén and Otterblad Olausson 2001), even though the article stressed that the finding could be random.

Fig. 12.1
Observed and expected numbers of hypospadias after maternal use of loratadine in early pregnancy at eight occasions (month, year) (Data from Källén and Otterblad Olausson (2001))
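
To get a feeling for how unusual each count in the monitoring series is when looked at in isolation, one can compute the probability of seeing at least the observed number of cases if the expected number is taken as the mean of a Poisson distribution. The sketch below does this for three of the occasions mentioned in the text. The Poisson model and the function name are assumptions made for illustration; the source does not state how observed and expected numbers were compared.

```python
import math

def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

# (occasion, observed cases, expected cases) as given in the text
occasions = [
    ("June 1997", 2, 0.7),
    ("May 1998", 7, 1.6),
    ("November 2001", 15, 5.6),
]

tail_probabilities = {
    label: poisson_upper_tail(obs, exp) for label, obs, exp in occasions
}

for label, p in tail_probabilities.items():
    print(f"{label}: P(at least observed) = {p:.4f}")
```

Each of these probabilities refers to a single look at the data. Because the series was inspected repeatedly, and many drugs and malformations are monitored at once, the chance that some count somewhere looks extreme is considerably larger than any single-look value suggests.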

Studies from other scientists were published relatively quickly. Some were small (Diav-Citrin et al. 2003; Moretti et al. 2003) and had no power to detect a doubling of the rate of hypospadias (a total of 210 and 161 exposed infants, respectively). A third study came from the CDC in Atlanta and was based on the National Birth Defects Prevention Study. In this study only penile or more severe hypospadias were studied, whereas in our material all but one of the cases were of the most common type, with the urethral opening in the sulcus coronarius of the penis. Two somewhat larger studies were published from Denmark (Pedersen et al. 2006a, b): one was based on linkage with a prescription register and the other on the prospective Danish National Birth Cohort. No increased risk for hypospadias was detected, but the confidence intervals were wide and based on few exposures. Upper confidence limits were 10.4 and 6.9, respectively. These studies illustrate how difficult it is to falsify a statement when it concerns a malformation: very large studies are needed.

In the meantime we had continued the monitoring of loratadine and hypospadias using further sources of malformation identification (Källén and Otterblad Olausson 2006). For the period 2002–2004, we identified 1911 infants exposed to loratadine – only two had hypospadias and the expected number was 4.6, and the RR was 0.47 with a 95 % CI of 0.06–1.68. The rates of hypospadias after loratadine exposure in the two periods were 25/2780 and 2/1911. These two rates differ significantly (p < 0.001). Our conclusion was that most likely the high number during the first period of observation was due to the multiple testing situation which exists in the monitoring process.
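
The comparison of the two rates can be checked with a short calculation. The sketch below applies a two-sided Fisher exact test to 25/2780 versus 2/1911. The choice of test is an assumption: the source reports only p < 0.001 and does not say which test was used, and the function name is hypothetical.

```python
from math import comb

def fisher_two_sided(a: int, n1: int, c: int, n2: int) -> float:
    """Two-sided Fisher exact p-value for a/n1 versus c/n2 affected infants."""
    cases = a + c
    total = n1 + n2
    denom = comb(total, n1)

    def prob(k: int) -> float:
        # Probability that k of all cases fall in the first group when the
        # margins are fixed (hypergeometric distribution).
        return comb(cases, k) * comb(total - cases, n1 - k) / denom

    k_min = max(0, cases - n2)
    k_max = min(cases, n1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        p
        for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )

p_value = fisher_two_sided(25, 2780, 2, 1911)
print(f"two-sided p = {p_value:.5f}")
```

A significant difference between the two periods says only that the rates are unlikely to be the same; it does not by itself say which period reflects the true risk.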

This conclusion is supported by the continued monitoring. During the years 2005–2013, there were 4315 loratadine exposures and 11 cases of hypospadias, OR = 0.85 (95 % CI 0.47–1.53).

This example illustrates how even a strong association may arise by chance and how large materials are needed to eliminate a suspicion of causality. In hindsight, one may think that it was unnecessary to publish the first cluster, notably as it referred to a malformation which was not very severe. A likely mechanism of action was unknown, since no antiandrogenic effect of the drug was known, but one hypothesis was that the drug could affect the fetal testes. This could have had effects on the future reproductive capacity of exposed boys, something which would not be evident until decades later.

Another example concerns the observation of an association between maternal use of erythromycin and infant cardiovascular defects. This was first noticed in a study of maternal drug use and infant cardiovascular defects (Källén and Otterblad Olausson 2003) based on Swedish Medical Birth Register data from 1995 to 2001. The study tabulated 68 different drugs or drug groups and found 14 with a “statistically significant” increased risk and one with a decreased risk. Some of these were previously known or suspected associations, such as those with insulin, antihypertensives, fertility drugs, and anticonvulsants, but some had not been known or suspected before, including those with macrolides and erythromycin. The erythromycin association was scrutinized in a further paper (Källén et al. 2005) which contained data from one more year (2002). The OR for any cardiovascular defect was 1.84 (95 % CI 1.29–2.62) based on 31 exposed cases. Eighteen of them were ventricular or atrial septal defects. A hypothesis was presented to explain the association: a side effect of erythromycin is inhibition of a specific cardiac potassium current (IKr) channel, which according to animal experiments could result in a cardiac malformation. The possibility that the observation was a result of multiple testing was also stressed.
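
The size of the multiple testing problem in a tabulation of 68 drugs or drug groups can be sketched with simple arithmetic. The calculation below assumes a 5 % significance level, that none of the drugs has any real effect, and that the 68 tests are independent. All three are simplifying assumptions for illustration, not statements from the source.

```python
n_tests = 68   # drugs or drug groups tabulated in the study
alpha = 0.05   # assumed significance level for each test

# Number of "significant" results expected by chance alone
expected_false_positives = n_tests * alpha

# Probability that at least one test is "significant" by chance
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"expected chance findings: {expected_false_positives:.1f}")
print(f"P(at least one chance finding): {p_at_least_one:.3f}")
```

Under these assumptions a few nominally significant results are to be expected even when no drug has an effect, which is why a new and unexpected association found in such a table needs confirmation in independent data.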

A follow-up study using the same source of data for a few more years (Källén K 2005) found a lower and nonsignificant OR, but the two estimates did not differ significantly. A few studies from other parts of the world could not verify the association (e.g., Bérard et al. 2015). Extended Swedish data were examined again in a paper by Källén and Danielsson (2014). The OR for a cardiovascular defect for the period 1996–2011 was 1.70 (95 % CI 1.26–2.29). When the observation period was divided into two halves (1996–2003, 2004–2011), the OR estimates were nearly identical: 1.69 and 1.71. What had happened in the meantime was that the use of erythromycin in early pregnancy had decreased drastically: from 2.7 per 1000 in the first 8-year period to 0.7 per 1000 in the second. The estimate for the second period was therefore not statistically significant (95 % CI 0.78–3.25). One thus had an estimate (1.71) which can be compared with two values. One is 1.0, which indicates no effect; the other is 1.69, which was the significant estimate for the first 8 years. Obviously, the second comparison is more relevant than the first. This way of reasoning is related to Bayesian statistics. It can be pointed out that one study which declared that erythromycin lacked teratogenic capacity actually found an odds ratio of 1.6, which did not differ significantly from 1.0 but neither did it differ from our estimate of 1.7 (Romøren et al. 2012).
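
The two comparisons can be put side by side numerically. The sketch below recovers an approximate standard error for the second-period estimate from the width of its confidence interval on the log scale and then tests the estimate against each reference value. Two simplifications are assumed: the reported interval is treated as symmetric on the log scale (it is only roughly so), and the first-period value of 1.69 is treated as a fixed reference without uncertainty of its own. The function names are hypothetical.

```python
import math

def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error of log(OR) from a 95 % confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def compare_to_reference(estimate: float, reference: float, se: float):
    """Return (z, two-sided p) for log(estimate) versus log(reference)."""
    z = (math.log(estimate) - math.log(reference)) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

se_second = se_from_ci(0.78, 3.25)  # second period, OR 1.71

z_null, p_null = compare_to_reference(1.71, 1.0, se_second)     # versus no effect
z_first, p_first = compare_to_reference(1.71, 1.69, se_second)  # versus first period

print(f"versus 1.00: z = {z_null:.2f}, p = {p_null:.2f}")
print(f"versus 1.69: z = {z_first:.2f}, p = {p_first:.2f}")
```

Neither comparison reaches significance, so the second-period data alone cannot separate "no effect" from "the same effect as before". What changes the interpretation is the prior information from the first period.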

A more formalized method of analyzing data from repeated studies of the same problem is a meta-analysis. The idea is simple: if a series of studies present risk estimates which are all estimates of one true risk, one gets a better risk assessment by pooling the various estimates, weighted according to the size of the studies. In the ideal situation, the meta-analysis should include all studies (published or not) which were performed with identical methodology; all studies should report the same experiment or type of observation. This is often the case in clinical studies, which is also where meta-analyses have played the largest role.
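
A minimal sketch of such pooling is given below, using inverse-variance weights on the log odds ratio scale (a fixed-effect model). This is one common way of weighting by study size, chosen here as an assumption; the text does not prescribe a specific method. The three input studies are hypothetical numbers invented for illustration and do not come from any of the studies discussed.

```python
import math

def pool_fixed_effect(studies, z=1.96):
    """Pool (odds_ratio, ci_lower, ci_upper) tuples with inverse-variance weights.

    Returns the pooled OR, its 95 % confidence limits, and each study's
    share of the total weight.
    """
    log_ors, weights = [], []
    for odds_ratio, lower, upper in studies:
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        log_ors.append(math.log(odds_ratio))
        weights.append(1 / se**2)

    total = sum(weights)
    pooled_log = sum(w * x for w, x in zip(weights, log_ors)) / total
    pooled_se = math.sqrt(1 / total)
    shares = [w / total for w in weights]
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
        shares,
    )

# Hypothetical studies: two small ones and one large one
hypothetical_studies = [
    (2.0, 0.50, 8.00),
    (1.5, 0.75, 3.00),
    (1.1, 0.88, 1.375),
]

pooled_or, pooled_low, pooled_high, weight_shares = pool_fixed_effect(hypothetical_studies)
print(f"pooled OR = {pooled_or:.2f} ({pooled_low:.2f} to {pooled_high:.2f})")
print("weight shares:", [round(s, 3) for s in weight_shares])
```

The weight shares show how strongly a single large study can drive the pooled value: the pooled estimate lies close to that of the study with the narrowest confidence interval.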

When epidemiological studies are to be compared, the basic prerequisites for a meta-analysis are seldom fulfilled. For instance, methods of ascertaining drug use and the presence of malformations vary, and various biases may affect the studies differently. Typically one or a few studies are much larger than the others and will dominate the common risk estimate, which means that the end result will depend to a large extent on the methodology and quality of these large studies.

As a part of the meta-analysis, a selection of studies based on quality is usually made, dismissing, for instance, studies without controls. In spite of formal guidelines, the selection ultimately rests on the subjective judgment of the persons who select data for the analysis. Let us take a simple example: the US National Birth Defects Research Program is probably regarded by many as being of a high standard (otherwise results would not get published in major journals) in spite of the fact that exposure data are retrieved retrospectively and that the percentage of nonresponders is high, about 30 %, facts which make other researchers regard their results with suspicion.

Unfortunately, the fact that a paper is published in a large medical journal does not mean that the results are believable; sometimes it only means that they are spectacular!

Personally, I think that a detailed discussion of the pros and cons of the individual studies is more important than obtaining a weighted common risk estimate. If large studies differ in results, this is more likely an effect of design than of random variation around the true risk. In the next chapter, I will summarize some questions which can be asked when one wants to scrutinize the validity of a published study. A critical discussion of the results of large and methodologically acceptable studies should be carried out, and a conclusion can be drawn, perhaps without an effort to pinpoint a specific risk level.