Background

Patients with subarachnoid hemorrhage (SAH) are commonly felt to be admitted in clusters. The suspicion that external factors might play a role has persisted for decades [30]. More specifically, meteorological variables have been suspected to contribute to the rupture of brain aneurysms. The influence of weekday, moon phase, or major public events has also occasionally been considered [15]. Regarding meteorological factors, reports from various locations appear to conflict, and so far no consensus has been reached on whether meteorological and seasonal factors play a role, and what they might be.

From our climatic zone, two analyses were published about a decade ago: one from Frankfurt and one from Düsseldorf [5, 32]. The authors of the Frankfurt report concluded that an atmospheric pressure change of more than 10 hPa within 24 h is an independent predictor of clustering of patients with SAH and that arterial hypertension is an independent risk factor for the occurrence of SAH on days with such a pressure change [32]. The report from Düsseldorf found a peak incidence of aneurysmal SAH during the month of April, but no influence of short-term weather fluctuations on the incidence of aneurysmal SAH [5].

The present reanalysis was motivated by two factors: first, climate change, which may render formerly subtle patterns more pronounced, and second, the advent of machine learning and more refined statistical tools to identify patterns.

In the current analysis, we followed a pragmatic concept: we studied the number of admissions for SAH with the aim of developing a prognostic tool that would ideally help to allocate the necessary workforce. In other words, can we identify meteorological conditions with a risk so low that the vascular on-call service could be spared? Therefore, we focused on the date of admission for SAH rather than the time of occurrence of the hemorrhage; with our referral pattern, the two are usually identical.

Methods

Data collection

On May 6, 2019, the admission dates of all hospital admissions with the admission diagnosis I60.* (all subarachnoid hemorrhage) from January 1, 2009 until December 31, 2018 were retrieved from our hospital administration system. Historical meteorological data for the nearest meteorological station, Düsseldorf Airport, were retrieved for the same period from the download archive of the Deutscher Wetterdienst (DWD, https://www.dwd.de/DE/leistungen/klimadatendeutschland/klarchivstunden.html). Average daily temperature (T D0), barometric pressure (QNH D0), and precipitation (Precip. D0), as well as the corresponding values from the preceding days, were selected as primary parameters (Table 1). The differences in temperature from the previous days (T D0 − T D-1, T D0 − T D-2, T D0 − T D-3) and in barometric pressure (QNH D0 − QNH D-1) were calculated as secondary parameters.
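The secondary parameters are simple lagged differences of the daily series. A minimal sketch in plain Python, assuming the daily values are already loaded into lists ordered by date; the function name `lagged_difference` and the sample values are hypothetical and not taken from the study:

```python
from typing import List, Optional


def lagged_difference(values: List[float], lag: int) -> List[Optional[float]]:
    """Return values[d] - values[d - lag] for each day d.

    The first `lag` days have no earlier observation and are set to None.
    """
    if lag < 1:
        raise ValueError("lag must be a positive number of days")
    return [
        None if d < lag else values[d] - values[d - lag]
        for d in range(len(values))
    ]


# Hypothetical daily means for five consecutive days (not study data).
temperature = [10.0, 12.0, 9.0, 9.0, 13.0]            # T D0, in °C
pressure = [1015.0, 1009.0, 1011.0, 1020.0, 1018.0]   # QNH D0, in hPa

t_diff_1 = lagged_difference(temperature, 1)   # T D0 − T D-1
t_diff_2 = lagged_difference(temperature, 2)   # T D0 − T D-2
t_diff_3 = lagged_difference(temperature, 3)   # T D0 − T D-3
qnh_diff_1 = lagged_difference(pressure, 1)    # QNH D0 − QNH D-1
```

Days at the start of the series, for which no earlier value exists, are left undefined here; how such days were handled in the original analysis is not stated.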

Table 1 Abbreviations and definitions of meteorological variables used

Data analysis

Data analysis and graphical work were done using open-source Python and the Matplotlib library (https://www.python.org). A Pearson correlation matrix of the retrieved meteorological factors and admissions for SAH was calculated, and the 8 variables most strongly correlated with SAH admissions were selected for further evaluation.
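The selection step can be illustrated with a short sketch. It assumes that "most correlated" means the largest absolute Pearson coefficient, which the text does not state explicitly; the helper names `pearson_r` and `top_correlated` and the toy data are hypothetical:

```python
import math
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def top_correlated(features: Dict[str, List[float]],
                   target: List[float], k: int) -> List[str]:
    """Names of the k features with the largest |r| against the target."""
    scores = {name: pearson_r(col, target) for name, col in features.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)[:k]


# Hypothetical toy data (not study data): six days, three variables.
features = {
    "T D0": [12.0, 8.0, 10.0, 3.0, 15.0, 5.0],
    "QNH D0": [1012.0, 1015.0, 1010.0, 1016.0, 1013.0, 1011.0],
    "Precip. D-1": [0.0, 4.0, 0.0, 6.0, 0.0, 3.0],
}
sah_admitted = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # 1 = one or more admissions

selected = top_correlated(features, sah_admitted, k=2)
```

Ranking by absolute value keeps inverse relations, such as the one with temperature, in the candidate set.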

The data set was split 80% to 20% into a training and a test set. Prior to further analysis, the training set was corrected for the imbalance between the two classes “No SAH” and “One or more SAH” using the Synthetic Minority Over-sampling Technique (SMOTE) [8].
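The order of these two steps matters: oversampling is applied to the training set only, so that the test set contains real days exclusively. The following is a simplified re-implementation of the SMOTE idea for illustration, not the implementation used in the study; a maintained library would normally be preferred. The function names, the random seeds, the neighbour count `k`, and the toy data are assumptions:

```python
import random
from typing import List, Tuple

Sample = List[float]


def split_80_20(rows: List[Sample], labels: List[int], seed: int = 0
                ) -> Tuple[List[Sample], List[int], List[Sample], List[int]]:
    """Shuffle the days and split them 80/20 into training and test sets."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = int(0.8 * len(order))
    train, test = order[:cut], order[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def smote(minority: List[Sample], n_new: int, k: int = 5,
          seed: int = 0) -> List[Sample]:
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours by squared Euclidean distance, excluding base.
        others = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s, base)),
        )[:k]
        neighbour = rng.choice(others)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 20 days, 2 features, 7 days with SAH admissions.
data_rng = random.Random(1)
rows = [[data_rng.gauss(10, 5), data_rng.gauss(1013, 8)] for _ in range(20)]
labels = [1 if i % 3 == 0 else 0 for i in range(20)]

x_train, y_train, x_test, y_test = split_80_20(rows, labels)

# Oversample the minority class of the training set until both classes match.
minority = [x for x, y in zip(x_train, y_train) if y == 1]
n_missing = y_train.count(0) - y_train.count(1)
x_balanced = x_train + smote(minority, n_missing)
y_balanced = y_train + [1] * n_missing
```

Because every synthetic sample lies on a line segment between two real minority days, the oversampled class stays within the range of the observed values.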

A deep learning neural network based on the open-source TensorFlow software was used for analysis and prediction (https://www.tensorflow.org). Following preliminary testing with several network architectures, we chose a feedforward network (FFN) with one hidden layer of 12 neurons and an output layer consisting of a single neuron for the Boolean output (admission of one or more SAH cases on that day: yes (1) or no (0)) [6, 12]. Keras’ Adam optimizer was used with default parameters, i.e., a learning rate of 0.001, and dropout with a rate of 0.2 was applied to prevent overfitting [33].
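To make the size of this architecture concrete, the sketch below runs a forward pass through a network of the described shape in plain Python. It is not the TensorFlow model of the study: the weights are random and untrained, the ReLU and sigmoid activations and the 0.5 decision threshold are assumptions, and the input count of 8 is taken from the number of selected variables. All names are hypothetical:

```python
import math
import random
from typing import List

N_INPUTS, N_HIDDEN = 8, 12   # 8 selected variables, 12 hidden neurons


def init_layer(n_in: int, n_out: int, rng: random.Random) -> dict:
    """Small random weights and zero biases for one dense layer."""
    return {
        "w": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
              for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def dense(layer: dict, x: List[float]) -> List[float]:
    """Weighted sum plus bias for every neuron of the layer."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def predict_probability(hidden: dict, output: dict, x: List[float]) -> float:
    """Forward pass: inputs -> 12 hidden neurons (ReLU) -> 1 sigmoid neuron.

    Dropout is only active during training, so it does not appear here.
    """
    h = [max(0.0, z) for z in dense(hidden, x)]   # ReLU (assumed)
    z_out = dense(output, h)[0]
    return 1.0 / (1.0 + math.exp(-z_out))         # sigmoid (assumed)


rng = random.Random(0)
hidden_layer = init_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, 1, rng)

# Trainable parameters: weights plus biases of both layers.
n_parameters = sum(len(layer["w"]) * len(layer["w"][0]) + len(layer["b"])
                   for layer in (hidden_layer, output_layer))

# Hypothetical standardized inputs for one day (not study data).
x_day = [0.3, -1.2, 0.0, 0.8, -0.4, 1.5, -0.7, 0.1]
p_sah = predict_probability(hidden_layer, output_layer, x_day)
sah_predicted = int(p_sah >= 0.5)   # Boolean output: 1 = one or more SAH
```

With 8 inputs, such a network has 8 × 12 + 12 weights and biases in the hidden layer and 12 + 1 in the output neuron, 121 parameters in total, which is small relative to the roughly 3650 days of data.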

For comparison, predictive accuracy was also evaluated with the Logistic Regression Classifier (LRC) algorithm of the scikit-learn library (https://scikit-learn.org/stable/) [4, 12, 29]. The LRC algorithm was chosen because it compared favorably with the other tools we tried, namely K-Neighbors Classifier (KNN), Gaussian NB (NB), Support Vector Classifier (SVC), Linear SVC (LSVC), Random Forest Classifier (RFC), Decision Tree Regressor (DTR), and bootstrapped Bagging Classifier (BAG).

The accuracy of the models was calculated from the correct predictions for the days of the test set regarding the admission of one or more patients with the diagnosis SAH. Stability of the predictions was assessed with k-fold cross validation. Calibration curves were calculated relating the predicted probabilities to the fraction of positives. Finally, prognostications for hypothetical meteorological data were generated by the two algorithms and compared.
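A calibration curve of this kind can be computed by grouping the test days into bins of predicted probability and comparing each bin's mean prediction with the observed fraction of days with admissions. A minimal sketch, assuming equal-width bins; the function name, the bin count, and the toy predictions are hypothetical:

```python
from typing import List, Tuple


def calibration_curve(y_true: List[int], y_prob: List[float],
                      n_bins: int = 5) -> List[Tuple[float, float]]:
    """Return (mean predicted probability, fraction of positives) per bin.

    Bins are of equal width on [0, 1]; empty bins are skipped.
    """
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)   # prob == 1.0 -> last bin
        bins[index].append((label, prob))
    curve = []
    for members in bins:
        if members:
            mean_prob = sum(p for _, p in members) / len(members)
            frac_pos = sum(lab for lab, _ in members) / len(members)
            curve.append((mean_prob, frac_pos))
    return curve


# Hypothetical predictions for eight test days (not study data).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.30, 0.35, 0.45, 0.55, 0.65, 0.70, 0.90]
curve = calibration_curve(y_true, y_prob, n_bins=2)
```

For a well-calibrated model, the points lie close to the diagonal, i.e., the fraction of positives in each bin matches the mean predicted probability.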

Results

Compared with the period of our previous analysis 2003–2005, the climatic conditions in the Düsseldorf area changed only modestly and without statistical significance (see Table 2). Temperature was higher by 0.4 °C during the second period, which might be explained by the fact that the first period included only the first half of the year 2005. Descriptive weather analysis for the 10-year period confirmed the mild climate in the Düsseldorf area with limited temperature spread (Fig. 1).

Table 2 Meteorological key parameters during the 2 study periods
Fig. 1 Histograms of the distribution of the meteorological key parameters and the admission of one or more SAH cases. A total of 1569 patients with SAH were admitted during the 10-year period. No SAH was admitted on 2400 days and one or more cases on 1252 days

For the 10-year period from January 1, 2009 until December 31, 2018, a total of 1569 patients with SAH were admitted. No SAH was admitted on 2400 days (65.7%), 1 case on 979 days (26.7%), 2 cases on 233 days (6.4%), 3 cases on 37 days (1.0%), 4 cases on 2 days (0.05%), and 5 cases on 1 day (0.03%). The distribution corresponded to a Poisson distribution with a lambda value of 0.43 SAH per day (https://en.wikipedia.org/wiki/Poisson_distribution). The stochastic simulation with the Poisson distribution is given in Table 3.
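The lambda value and the day counts expected under a Poisson model follow directly from the figures in the preceding paragraph. The sketch below computes the analytical expectation rather than reproducing the stochastic simulation of Table 3; the variable and function names are hypothetical:

```python
import math

# Observed number of days by daily SAH admissions (from the text above).
observed_days = {0: 2400, 1: 979, 2: 233, 3: 37, 4: 2, 5: 1}

total_days = sum(observed_days.values())
total_patients = sum(k * n for k, n in observed_days.items())
lam = total_patients / total_days   # mean admissions per day


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events for a Poisson rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Expected number of days with exactly k admissions under the Poisson model.
expected_days = {k: total_days * poisson_pmf(k, lam) for k in observed_days}

for k in observed_days:
    print(f"{k} SAH/day: observed {observed_days[k]:4d}, "
          f"expected {expected_days[k]:7.1f}")
```

The daily counts sum to the 3652 days and 1569 patients of the study period, which gives the stated rate of about 0.43 SAH per day.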

Table 3 Clustering of SAH daily admissions and stochastic simulation with a Poisson distribution and an expected Lambda value of 0.43 SAH per day

Pearson’s correlation matrix suggested a weak positive correlation of admissions for SAH with precipitation on the previous day, and weak inverse relations with the actual mean daily temperature, the temperature change from the previous days, and with barometric pressure on the index day and the day before (Figs. 2 and 3).

Fig. 2 Heatmap of Pearson correlation factors between the meteorological variables and admission of patients after SAH. The Pearson correlation matrix suggested a weak positive correlation of admissions for SAH with precipitation on the previous day and weak inverse relations with the actual mean daily temperature, the temperature change from the previous days, and barometric pressure on the index day and the day before

Fig. 3 3-D scatterplot illustrating the relation between the temperature increase over 2 days, precipitation on the previous day, scaled barometric pressure, and the admissions for SAH. Blue dots represent days without admissions and red dots show days with one or more admissions for SAH

The deep learning algorithm with the feedforward network achieved a predictive accuracy of 53%. For comparison, analysis using the Logistic Regression Classifier (LRC) achieved a predictive accuracy of 62%. The mean area under the curve (AUC) score for both models was 52%. The standard deviation of the AUC score with k-fold cross validation was ± 2% for the FFN and ± 6% for the LRC. The detailed metrics are given in Table 4 and the calibration curves in Fig. 4.

Table 4 Metrics of the two models on the test data set
Fig. 4 Calibration curves of the two models (upper graph) on the test data set and histograms of predicted values (lower graph). Predictions appear somewhat erratic and center around 0.5 with both models

The two models made concordant prognostications for seven of the nine hypothetical meteorological constellations and discordant prognostications for two (Table 5).

Table 5 Hypothetical meteorological values and the modeled prognostication of the chance to admit one or more patients with SAH at the University Hospital Düsseldorf

Discussion

Meteorological factors for the occurrence of SAH have been suspected for more than 30 years [30], and several reports have addressed the issue over the last decades. Apart from a seasonal variation with a peak in spring and fall, which is acknowledged by most analyses from various climatic zones, results regarding potential meteorological variables influencing SAH have disagreed widely. Colder temperatures, higher or lower humidity, high or low barometric pressure, or changing barometric pressure have been suspected [1–3, 7, 13–23, 25–28, 32–37]. In some reports, different correlations were seen for women and men [9]. Our former analysis from the greater Düsseldorf area published in 2006, as well as several other reports, particularly the more recent ones, deny relevant meteorological factors for the occurrence of SAH [5, 10, 11, 24, 31].

The potential interaction between weather and the occurrence of SAH may be complex and may be mediated by the activity level of the population, which obviously depends on meteorological conditions. The rationale for studying the correlation of weather conditions and the incidence of SAH in the greater Düsseldorf area again was based mainly on the hope that newer analytic methods, i.e., machine learning using deep neural networks, might identify hitherto occult complex interactions between meteorological variables.

Our results confirmed some weak correlation between meteorological variables and the probability of admission of patients with SAH, but random factors dominate. The accuracy of the algorithms reached only 53% and 62%, which is not much better than flipping a coin. Although we have to admit that these values do not allow clinically relevant prognostication, the result is interesting from a theoretical point of view. The 3 to 12% gain over flipping a coin is likely to correspond to the size of the meteorological influence on the occurrence of SAH; in other words, the importance of meteorological factors for the occurrence of SAH is 3–12%, while the rest is random or due to other unknown factors. The AUC scores of both models were 52%, suggesting that meteorological influences can be held responsible for some 2%.

Our correlation analysis identified a weak positive correlation of admission for SAH with precipitation on the previous day, and weak inverse associations with the actual mean daily temperature, the temperature change from the previous days, and barometric pressure on the index day and the day before. In other words, higher and increasing temperature and higher barometric pressure appear to protect against SAH, whereas colder and decreasing temperature and rain on the previous day appear to increase the risk. Comparing this result with other reports from neighboring cities, we see a contrast to the report from Frankfurt published in 2006, where atmospheric pressure change within the previous 24 h was identified as the main meteorological risk factor [32]. In our analysis, the absolute value of barometric pressure was more relevant than its change. A recent analysis by Herten and coauthors from the University Hospital Essen found that days with a preceding 5-day mean temperature below 0 °C had a significantly higher rate of SAH and days with a mean temperature above 22 °C had a significantly lower rate of aneurysmal SAH than average [15]. Considering the somewhat different approach to data selection, the association of SAH with colder temperature is in line with our current results.

In addition to meteorological and seasonal factors, Herten and coauthors also analyzed circaseptan influences and found the highest frequency on Sundays and the lowest on Wednesdays. We had previously reported a diurnal rhythm of SAH incidence with two peaks, in the morning and in the evening, and a statistically significant nadir during forenoon and midday [5]. These reports again suggest that the general activity level in the population is an important factor. The activity level is partly related to the weather, but the interaction is complex. For example, our finding that precipitation on the previous day is more important than precipitation on the day of hemorrhage itself may also be related to an increased activity level after a rainy day.

Just as the number of apples falling from a tree during a storm in fall obviously depends on the number and ripeness of the apples on the tree, and thus on the quality of the previous summer, the incidence of SAH certainly depends not only on specific triggering events but also on the preceding timeline. These timelines are difficult to grasp statistically and are therefore usually reduced to a few key values, such as the average temperature during the previous week. To our knowledge, no analysis of external factors for the incidence of SAH has so far included meteorological factors over a timeline longer than a few days before admission. The situation is further complicated by the fact that the longer the delay between cause and effect, the more the effect spreads over several days. For example, a drop in temperature over one, two, three, or even more days appears to be important for the occurrence of SAH. It is quite possible that the difference from the previous week is more important than the difference from the previous day. However, after 1 week, any effect spreads over a few days. We found that the effect of the difference T D0 − T D-2 was most pronounced.

In summary, although our results confirmed a weak correlation between meteorological variables and admissions for SAH, even deep neural networks appear unable to produce a useful prognostic model. Therefore, we need to keep the necessary services constantly available. However, the present analysis allowed us to appreciate the size of the meteorological influence on the occurrence of SAH.

Conclusion

Although the probability of admitting one or more cases of SAH showed a weak positive correlation with precipitation on the previous day and weak inverse relations with the actual mean daily temperature, the temperature change from the previous days, and barometric pressure on the index day and the day before in the analyzed time period, no helpful prognostic model could be deduced with current machine learning methods. The meteorological influence on the occurrence of SAH appears to account for only a few percent, while the chance of admission of one or more patients with SAH is mainly random or due to unknown factors.