Abstract
Background
The COVID-19 pandemic has plunged the global community, and African countries in particular, into an alarmingly difficult situation, with consequences such as economic recession, political instability and job losses. The disease spreads exponentially and causes loss of life. Following the outbreak of the Omicron variant of concern, forecasting and identifying COVID-19 infection cases is vital for governments at all levels. With knowledge of the spread at a given point in time, governments can act swiftly and formulate new policies and measures to minimize the consequences of the pandemic for both public health and the economy.
Methods
This work proposes a hybrid deep learning algorithm, Convolutional Neural Network - Long Short Term Memory (CNN-LSTM), which combines a Convolutional Neural Network (CNN) with a Long Short Term Memory (LSTM) network, for forecasting COVID-19 infection cases in Nigeria, South Africa and Botswana. Ten-day forecasting models for the three countries were developed with three deep learning approaches: CNN, LSTM and CNN-LSTM.
Results
The models were evaluated with four standard performance metrics: accuracy, MSE, MAE and RMSE. The CNN-LSTM forecasting model achieved the best accuracy, 98.30%, 97.60% and 97.74% for Nigeria, South Africa and Botswana respectively, and likewise achieved lower MSE, MAE and RMSE values than the models developed with CNN and LSTM alone.
Conclusions
Taken together, the CNN-LSTM deep learning-based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana surpasses the two other DL-based forecasting models (CNN and LSTM), not only in accuracy (98.30%, 97.60% and 97.74% respectively) but also in achieving lower MSE, MAE and RMSE.
1 Introduction
Coronaviruses are a group of viruses capable of infecting and killing mammals and birds; they include Middle East Respiratory Syndrome Coronavirus (MERS-CoV), Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and the 2019 Novel Coronavirus (2019-nCoV) [1,2,3]. The disease associated with the 2019 Novel Coronavirus is called COVID-19 and is caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [4]. The 2019-nCoV outbreak started in Wuhan, Hubei Province, China, in late December 2019. Since the beginning of the COVID-19 pandemic, many variants have emerged as SARS-CoV-2 continues to mutate and evolve [5, 6]. Many of these mutations have little effect on how SARS-CoV-2 affects humans. The mutations of other variants, however, such as the genetic changes in the Delta variant, have made SARS-CoV-2 more transmissible and contagious than the original version of the virus discovered in late 2019 [7, 8].
In late November 2021, a new variant of SARS-CoV-2 emerged and was named Omicron by the World Health Organization (WHO). The variant was first reported on 24th November, 2021 in South Africa, from a specimen collected in the country on 9th November, 2021. On 26th November, 2021 WHO designated it a “variant of concern” (VOC) because of how easily it spreads, the severity of illness it causes and its higher risk of reinfection compared with previously identified variants [7]. WHO therefore called on countries to strengthen monitoring and follow-up efforts to better understand the transmission of the new variant; to report cases of the variant; to submit complete genetic sequences and related metadata to public and multilateral websites; and, in co-operation with the international community, to conduct laboratory tests and field investigations to understand the potential impact of the new mutations on COVID-19 epidemiology, the effectiveness of public health and social measures, the sensitivity of diagnostic methods, antibody neutralization, immune responses and other relevant characteristics [7, 9]. WHO also asked individuals to take measures to minimize the risk posed by the new variant, especially public health and social measures such as hand hygiene, wearing a well-fitting mask, improving the ventilation of indoor spaces, getting vaccinated, physical distancing and avoiding crowded spaces. Sir Jeremy Farrar, a British scientist, warned that “the emergence and spread of the Omicron variant of coronavirus shows that the world is closer to the start of a pandemic than the end”. According to WHO, 57 countries across the different regions of the world had reported Omicron cases, and 46% of the cases were from African countries, particularly South Africa.
As of March 10, 2022, confirmed cases of COVID-19 across 55 African countries had reached 11,453,505, while more than 273,418,843 vaccinations had been administered across the continent. South Africa, Nigeria and Botswana are the African countries with the highest numbers of cases of the Omicron variant of concern.
Many artificial intelligence, statistical and mathematical approaches have been developed for predicting and forecasting the severity and spread of the COVID-19 pandemic in many countries [10,11,12]. In [1], an LSTM was designed to predict COVID-19 cases 1 day ahead, 7 days ahead, or both. The study in [13] developed a COVID-19 Essential Supplies Forecasting Tool for predicting different categories of COVID-19 cases in Northwest Syria. A web-based predictive method that captures COVID-19 data in real time from the NCDC (Nigeria Centre for Disease Control) was developed in [2]. The work in [14] created a model for predicting ICU bed demand in Switzerland for individual hospitals at the subregional level. In [15], two models, ML (machine learning) and SEIR (susceptible-exposed-infectious-removed), were developed for predicting the spread of the COVID-19 pandemic in China under public health interventions. The work in [16] developed a DM (data mining) model for predicting 2019-nCoV incidence in Iran using Google Trends data. LSTM and linear regression models were employed to estimate the number of 2019-nCoV cases, helping health care managers and policymakers to plan, allocate and implement health care resources appropriately. In [17], a real-time ML method for the 2019-nCoV outbreak was introduced that combines news alerts, official health reports from the Chinese Center for Disease Control and Prevention, online search activity from Baidu and daily case forecasts from GLEAM (an agent-based model). The method appears robust, generating accurate forecasts of COVID-19 cases up to two days in advance.
The study in [18] introduced an ML-based method for studying the early infection dynamics of the 2019-nCoV epidemic, in particular the rate of infection over time, using US data from the first case on January 20, 2020. It reported actionable public health insights, including the infectious force, the rates of mild and severe infections, the rate of untreated disease and predictions of new infections over time.
Following the outbreak of Omicron, the VOC, forecasting and identifying COVID-19 infection cases is invaluable for governments at all levels [15]. With knowledge of the spread at a given point in time, governments can act swiftly and formulate new policies and measures to mitigate the implications of the pandemic. In this work, we propose a new hybrid deep learning algorithm and use it to develop forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana. We combine a Convolutional Neural Network (CNN) with a Long Short Term Memory (LSTM) network to obtain the hybrid Convolutional Neural Network - Long Short Term Memory (CNN-LSTM) algorithm. For each of the three countries we developed three deep learning forecasting models, using the CNN, LSTM and CNN-LSTM algorithms. The CNN-LSTM models proved highly accurate, with lower MSE, MAE and RMSE values than the individual CNN and LSTM models. Although many studies on either CNN or LSTM models have been reported, as cited above, to our knowledge the combined CNN-LSTM DL algorithm has not been reported for any of the three countries. The present study aims to fill this gap, with a view to informing the redesign of public health policies and strategies that mitigate the implications of the COVID-19 pandemic, not only in Nigeria, South Africa and Botswana but also in other African countries that may face similar consequences.
2 Materials and methods
Figure 1 gives a diagrammatic representation of the methodology adopted in this research, that is, the materials and methods employed to develop the CNN-LSTM DL-based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana.
2.1 Dataset collection
The dataset used in this study was sourced from the public dataset of the “Our World in Data” organization. Our World in Data is a project of the Global Change Data Lab, a non-profit organization based in the United Kingdom (Registered Charity Number 1186433). In collaboration with the University of Oxford, the organization has developed a reliable repository of datasets on many topics, focusing on problems affecting the world. The dataset used in this study is the up-to-date world COVID-19 dataset, which contains sixty-seven (67) features.
2.2 Feature extraction
In this section, we extract the relevant features of the dataset to build the time series forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana. Of the 67 features in the dataset, only 9 were extracted and used: date, total_cases, new_cases, total_deaths, new_deaths, people_vaccinated_per_hundred, people_fully_vaccinated_per_hundred, new_deaths_per_million and new_cases_per_million.
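The selection step can be sketched as follows. This is a minimal illustration rather than the authors' code: it uses only the Python standard library, the `location` column used to pick a country is an assumption about the file layout, and the two data rows are made-up placeholders standing in for the real 67-column file.

```python
import csv
import io

# The nine columns named in the text.
FEATURES = [
    "date", "total_cases", "new_cases", "total_deaths", "new_deaths",
    "people_vaccinated_per_hundred", "people_fully_vaccinated_per_hundred",
    "new_deaths_per_million", "new_cases_per_million",
]


def extract_features(csv_text, country, country_column="location"):
    """Return one dict per row of `country`, restricted to FEATURES.

    `country_column` is an assumed column name, not taken from the paper.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        if row[country_column] == country:
            rows.append({name: row[name] for name in FEATURES})
    return rows


# Made-up sample with one extra column, standing in for the full dataset.
SAMPLE = (
    "location,extra_column," + ",".join(FEATURES) + "\n"
    "Nigeria,x,2022-03-01,100,5,2,0,4.1,3.9,0.0,0.02\n"
    "Botswana,x,2022-03-01,50,3,1,1,60.2,55.0,0.4,1.2\n"
)

nigeria_rows = extract_features(SAMPLE, "Nigeria")
```

Columns outside `FEATURES`, such as `extra_column` here, are dropped, so each country's series carries only the nine selected fields.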
2.3 Deep learning algorithm
2.3.1 Convolutional neural network
A Convolutional Neural Network (CNN) is a discriminative deep learning architecture built from convolutional layers and pooling layers [19, 20], stacked on top of each other [21]. The convolutional layer shares many weights, while the pooling layer subsamples the output of the convolutional layer and reduces the data rate from the layer below [12]. Weight sharing in the convolutional layer, together with appropriately chosen pooling schemes, endows the CNN with some invariance properties [7, 22,23,24]. CNNs have been found to be among the most effective DL algorithms for image recognition and computer vision [25, 26]. In a CNN's convolutional layer, the feature maps of the previous layer are convolved with learnable kernels and passed through a linear or non-linear activation function, such as the hyperbolic tangent, sigmoid or softmax, to produce the output feature maps [27, 28]. Each output feature map may combine more than one input feature map, as expressed in Eq. (1):
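A standard form of this convolutional-layer update, consistent with the symbols defined just after it, can be written as below. It is an assumed reconstruction rather than the authors' own typesetting: the layer index is written as l, f denotes the activation function and * the convolution operator, and both of those symbols are notational assumptions.

$$x_{j}^{l}=f\left(\sum_{i\in M_{j}}x_{i}^{l-1}*k_{ij}^{l}+b_{j}^{l}\right)$$(1)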
where \({x}_{j}^{l}\) is the output of the current layer l; \({x}_{i}^{l-1}\) is the output of the previous layer; \({k}_{ij}^{l}\) is the kernel of the current layer; and \({b}_{j}^{l}\) is the bias of the current layer. \({M}_{j}\) denotes a selection of input maps, and each output map is given an additive bias b.
In the pooling layer of a CNN, the number of feature maps does not change: if there are n input maps, there are exactly n output maps [11]. The size of each dimension of an output map is, however, reduced by the down-sampling operation, which can be formulated with the equation below:
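A commonly used form of the down-sampling step is sketched here under stated assumptions: down(·) denotes the sub-sampling function (for example, taking the maximum over each pooling window), and the multiplicative coefficient \(\beta_{j}^{l}\) is a symbol introduced for this sketch that does not appear elsewhere in the text.

$$x_{j}^{l}=f\left(\beta_{j}^{l}\,\mathrm{down}\left(x_{j}^{l-1}\right)+b_{j}^{l}\right)$$(2)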
Besides convolution and pooling, a CNN applies activation functions to introduce non-linearity. Figure 2 depicts the overall architecture of the CNN.
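To make the convolution and pooling operations concrete, the following pure-Python sketch applies one kernel to a short one-dimensional series and then down-samples the result. The series, the kernel values, the ReLU activation and the pool size of 2 are illustrative assumptions, not values from the study.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` without padding.

    This is cross-correlation, the form most deep learning
    libraries implement under the name "convolution".
    """
    k = len(kernel)
    return [
        sum(signal[start + offset] * kernel[offset] for offset in range(k)) + bias
        for start in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise non-linearity: negative responses become zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, pool_size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[start:start + pool_size])
        for start in range(0, len(values) - pool_size + 1, pool_size)
    ]


daily_cases = [1.0, 2.0, 4.0, 3.0, 5.0, 8.0, 6.0]   # made-up series
trend_kernel = [-1.0, 0.0, 1.0]                     # responds to rising values

feature_map = relu(conv1d_valid(daily_cases, trend_kernel))
pooled = max_pool1d(feature_map, pool_size=2)
print(feature_map)  # [3.0, 1.0, 1.0, 5.0, 1.0]
print(pooled)       # [3.0, 5.0]
```

The number of maps is unchanged by pooling (one in, one out), while the length of the map shrinks from 5 to 2, which is the size reduction described above.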
2.3.2 Long short term memory
Long Short-Term Memory (LSTM) was introduced to address the difficulty of training Recurrent Neural Networks (RNNs) and has produced promising results on a variety of tasks, including prediction and language modelling [29,30,31]. LSTM can solve complex tasks with long time lags that earlier recurrent network algorithms could not solve [32,33,34,35]. The LSTM learning algorithm is able to learn long-term relationships by replacing the units of the RNN hidden layer with memory cells [36]. The key idea behind LSTM is the cell state, the horizontal line running through the top of Fig. 3.
The algorithm adds information to, or removes it from, the cell state through structures called gates: an input gate (\(i_{t}\)), a forget gate (\(f_{t}\)) and an output gate (\(o_{t}\)), as shown in Fig. 2. They can be defined as follows (see Fig. 4):
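For reference, the widely used textbook form of the LSTM gate equations is given below. This is the standard formulation and is offered as an assumption about what the definitions look like, not as the authors' exact equations. Here \(x_{t}\) is the input at time t, \(h_{t-1}\) the previous hidden state, \(C_{t}\) the cell state, \(\sigma\) the logistic sigmoid, \(\odot\) element-wise multiplication, and the W and b terms are learned weights and biases.

$$f_{t}=\sigma\left(W_{f}\left[h_{t-1},x_{t}\right]+b_{f}\right)$$
$$i_{t}=\sigma\left(W_{i}\left[h_{t-1},x_{t}\right]+b_{i}\right)$$
$$\tilde{C}_{t}=\tanh\left(W_{C}\left[h_{t-1},x_{t}\right]+b_{C}\right)$$
$$C_{t}=f_{t}\odot C_{t-1}+i_{t}\odot\tilde{C}_{t}$$
$$o_{t}=\sigma\left(W_{o}\left[h_{t-1},x_{t}\right]+b_{o}\right)$$
$$h_{t}=o_{t}\odot\tanh\left(C_{t}\right)$$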
2.3.3 CNN-LSTM
The CNN-LSTM (Convolutional Neural Network - Long Short-Term Memory) combination is proposed to pair the CNN's ability to extract useful features from the input with the LSTM's efficiency at modelling the internal representation of a time series, so that both the short- and long-term temporal relationships contained in the data sequence can be detected and modelled. To achieve this aim, the proposed CNN-LSTM is made up of two main components, as follows:
i. A one-dimensional CNN, made up of convolutional and pooling layers, which performs the mathematical operations on the input data and thereby generates the features.
ii. LSTM and dense layers, which make use of the generated features.
The proposed CNN-LSTM algorithm uses the CNN as an encoder and the LSTM as a decoder. The encoder learns features from the input data, which are fed into the decoder (LSTM). The decoder then identifies and models the short- and long-term temporal relationships inherent in the dataset. The sequence of steps is briefly as follows (see Fig. 5):
i. Input layer: receives the input data;
ii. 1st convolution layer: scans the input from (i) and projects the results onto feature maps;
iii. 2nd convolution layer: repeats the operation on the feature maps in order to amplify any salient features; each convolution layer uses 32 feature maps and a kernel size of 3 time steps to read the input sequence;
iv. Max pooling layer: simplifies the feature maps from (iii) by discarding some of their values, producing a matrix of smaller dimension;
v. Dropout layer: regularizes the network to protect the model against overfitting;
vi. Flatten layer: flattens the distilled feature maps into a single long vector that can be used as input for decoding;
vii. Repeat vector layer: repeats the internal representation of the input sequence once for each time step in the output sequence;
viii. LSTM decoder: a hidden layer of 100 units that outputs the full sequence; at each output time step (one per day) the 100 units each provide a value, and these serve as the basis for forecasting the following days in the output sequence;
ix. Fully connected layer: interprets each time step in the output sequence before the output layer; the same layers are applied to every step produced by the LSTM decoder, so that a single output sequence is forecast;
x. Output layer: forecasts the number of new COVID-19 infection and death cases for 10 days.
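The layer sequence above can be traced as a sequence of tensor shapes. The sketch below is a shape walkthrough only, not a trainable model. The 32 feature maps, kernel size of 3, 100 LSTM units and 10 output days come from the text, while the 14-day input window, unpadded stride-1 convolutions, a pool size of 2 and a single value per output day are assumptions made for illustration.

```python
def cnn_lstm_shapes(n_input_steps, n_features, n_output_steps=10,
                    n_filters=32, kernel_size=3, pool_size=2, lstm_units=100):
    """Trace (time steps, channels) through the encoder-decoder stack.

    Assumes unpadded convolutions with stride 1 and non-overlapping pooling.
    """
    shapes = [("input", (n_input_steps, n_features))]

    steps = n_input_steps - kernel_size + 1          # 1st convolution
    shapes.append(("conv1d_1", (steps, n_filters)))

    steps = steps - kernel_size + 1                  # 2nd convolution
    shapes.append(("conv1d_2", (steps, n_filters)))

    steps = steps // pool_size                       # max pooling
    if steps < 1:
        raise ValueError("input window too short for this layer stack")
    shapes.append(("max_pool", (steps, n_filters)))

    shapes.append(("dropout", (steps, n_filters)))   # shape unchanged

    flat = steps * n_filters                         # one long vector
    shapes.append(("flatten", (flat,)))

    # The encoded vector is copied once per forecast day.
    shapes.append(("repeat_vector", (n_output_steps, flat)))

    # The decoder returns its full sequence: 100 values per forecast day.
    shapes.append(("lstm_decoder", (n_output_steps, lstm_units)))

    # The same dense layers are applied to every day, giving one value each.
    shapes.append(("output", (n_output_steps, 1)))
    return shapes


layer_shapes = cnn_lstm_shapes(n_input_steps=14, n_features=1)
for name, shape in layer_shapes:
    print(f"{name:14s} {shape}")
```

With a 14-day window, the two convolutions shorten the sequence to 10 steps, pooling halves it to 5, and the flattened vector of 160 values is repeated 10 times before decoding.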
2.4 Evaluation metric performance
The proposed CNN-LSTM model, the CNN model and the LSTM model were assessed with standard performance evaluation metrics for deep learning models: accuracy, Mean Square Error (MSE), Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE).
i. Accuracy determines the overall performance of the forecasting model. It is calculated as the percentage ratio of the summed difference between the actual and forecasted values to the total of the actual values [37]. The accuracy is determined with the following equation:
$$Accuracy= \frac{Tdiff}{Tactual}*100$$(9)
ii. Mean Square Error (MSE) is the mean of the squared differences between the forecasted and observed values [38], as shown in the equation below. The smaller the MSE, the better the forecasting model performs.
$$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left({x}_{i}-{\overline{x}}_{i}\right)}^{2}$$(10)
iii. Mean Absolute Error (MAE) determines the closeness of the forecasted cases of infection to the confirmed cases of COVID-19 [35, 38]. The MAE is determined with the following equation:
$$MAE= \frac{1}{n}\sum_{i=1}^{n}\left|{x}_{i}- {\overline{x}}_{i}\right|$$(11)
iv. Root Mean Squared Error (RMSE) shows the standard deviation of the difference between the forecasted cases of infection and the confirmed cases of COVID-19. The RMSE is determined with the following equation:
$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({x}_{i}- {\overline{x}}_{i}\right)}^{2}}$$(12)
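The three error metrics can be computed directly from their definitions, as in the sketch below. The function names and the sample values are illustrative assumptions, and MSE is taken here as the mean of the squared errors, so that RMSE is its square root.

```python
import math


def _check_lengths(actual, forecast):
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")


def mse(actual, forecast):
    """Mean of the squared differences."""
    _check_lengths(actual, forecast)
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mae(actual, forecast):
    """Mean of the absolute differences."""
    _check_lengths(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Square root of the MSE, in the same units as the data."""
    return math.sqrt(mse(actual, forecast))


confirmed = [10.0, 12.0, 15.0, 11.0]    # made-up confirmed cases
forecasted = [11.0, 12.0, 13.0, 11.0]   # made-up forecasts

print(mse(confirmed, forecasted))   # 1.25
print(mae(confirmed, forecasted))   # 0.75
print(rmse(confirmed, forecasted))
```

Because the errors are squared before averaging, the single 2-case miss contributes four times as much to MSE as the 1-case miss, which is why MSE and RMSE penalize large deviations more heavily than MAE.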
3 Experimental setup
The experiments were conducted on Google Colab using Python 3.0 with open source libraries such as Pandas, NumPy and Keras, among others. The local working environment was a Core i5 machine with 8 GB of RAM and a 2.8 GHz processor, running the 64-bit Windows 10 operating system. All data analysis, the development of the forecasting models with the three deep learning algorithms (CNN, LSTM and CNN-LSTM) and the evaluation of the models were done on Google Colab.
4 COVID-19 trend analysis
4.1 COVID-19 infection and death cases trend of the entire world
Before forecasting the future occurrence of COVID-19 infection in Nigeria, South Africa and Botswana, we analyzed the behavior of the pandemic in the entire world from 1st January, 2020 to 15th March, 2022, adding up the numbers collected from all countries, grouped by date. Figures 6 and 8 plot the new cases and new deaths in the entire world from 1st January, 2020 to 15th March, 2022.
Four (4) COVID-19 waves, corresponding to the original variant, Beta, Delta and Omicron, can be seen in Fig. 6. The figure shows that the Omicron variant of concern has become the dominant one, with a peak more than 4 times higher than the former peak, which corresponds to the Beta wave. This occurs mainly because Omicron is “infectious as measles spreading in a non-immune population, with a much shorter incubation time, therefore much faster doubling time”, as stated by Dr. Tom Frieden, former Director of the U.S. Centers for Disease Control and Prevention and former Commissioner of the New York City Health Department. The Omicron variant of concern has been described as the fastest-spreading virus known to humankind, and the epidemiologist William Hanage likewise called it certainly the most rapidly spreading virus among those studied. Figure 7 shows SARS-CoV-2 sequences by variant for South Africa, Botswana and Nigeria; Omicron has displaced the other variants of COVID-19, accounting for 99%, 99% and 97% of sequences in South Africa, Botswana and Nigeria respectively. The data on SARS-CoV-2 sequencing and variants in Fig. 7 is sourced from GISAID, a global science initiative that provides open access to genomic data of SARS-CoV-2.
Figure 8 depicts the timeline of deaths due to COVID-19 in the world. At the beginning of the emergence of the Omicron variant there was only a relatively small increase in the curve of new deaths, which led specialists to formulate many hypotheses, most prominently: is Omicron less severe than earlier variants? There is high confidence that Omicron is relatively mild for most adults, and COVID-19 vaccines are believed to provide high protection against serious illness caused by the Omicron variant of concern, which affects many African countries, especially Nigeria, South Africa and Botswana. However, these three African countries have difficulty getting access to COVID-19 vaccines. In this work we therefore analyzed the behavior of the COVID-19 pandemic in Nigeria, South Africa and Botswana in depth, in order to recommend a strategic template that these three countries can use to curb the pandemic with more certainty.
4.2 Cases of infection, death and vaccination trend of COVID-19 in Nigeria, South Africa and Botswana
The trends in COVID-19 infection cases, deaths and vaccination in Nigeria, South Africa and Botswana were each analyzed in depth. From the dataset, 9 features were selected for the three countries: date, total_cases, new_cases, total_deaths, new_deaths, people_vaccinated_per_hundred, people_fully_vaccinated_per_hundred, new_deaths_per_million and new_cases_per_million. From these 9 features we derived new features, namely moving averages of new_cases, new_deaths, new_deaths_per_million and new_cases_per_million, because the raw series have sudden peaks that may arise from counting mistakes, non-working days and massive testing, among other factors.
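A moving average of this kind can be sketched in a few lines. The 7-day window matches the one used later in the comparative analysis; the trailing (rather than centred) window, the `None` placeholders for the first incomplete windows and the sample counts are assumptions made for illustration.

```python
def moving_average(values, window=7):
    """Trailing mean over the last `window` values.

    Positions that do not yet have a full window are returned as None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for index, value in enumerate(values):
        running += value
        if index >= window:
            running -= values[index - window]   # drop the value leaving the window
        smoothed.append(running / window if index >= window - 1 else None)
    return smoothed


# Made-up daily counts: a backlog of 70 cases reported on a single day.
new_cases = [0, 0, 70, 0, 0, 0, 0, 7, 7]
smoothed_cases = moving_average(new_cases, window=7)
print(smoothed_cases)
# [None, None, None, None, None, None, 10.0, 11.0, 12.0]
```

The one-day spike of 70 is spread across the window as an average of 10 per day, which is the smoothing effect that motivates the derived features.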
4.2.1 Cases of infection, death and vaccination trend of COVID-19 in Nigeria
We analyzed the trends of COVID-19 infection cases, death cases and vaccination in Nigeria before forecasting the future occurrence of COVID-19 infection cases in the country. The analysis covers 1st January, 2020 to 15th March, 2022. Figures 9, 10 and 11 depict the timelines of new COVID-19 infection cases, deaths due to COVID-19 and the percentage of people vaccinated in Nigeria from 1st January, 2020 to 15th March, 2022.
In the timeline of new COVID-19 infection cases in Nigeria in Fig. 9, the curve is flat from January to April 2020, meaning there were few infections. From May to September 2020 the curve went up, and from November 2020 it rose until March 2021. From April 2021 the curve went down until July 2021, and from August 2021 to early November 2021 it went up again. At the end of November 2021, when the Omicron variant of concern was first reported in South Africa (from a specimen collected there on 9th November, 2021), COVID-19 infection in Nigeria reached its highest level, and the curve rose to its highest peak ever, continuing until March 2022.
In the timeline of deaths due to COVID-19 in Fig. 10, there were few deaths in Nigeria from January to March 2020. From April 2020 the death curve went up until October 2020, after which it went down until December 2020. From January 2021 the curve went up until April 2021, and from June to November 2021 it fluctuated. At the end of November 2021, when the Omicron variant of concern was first reported in South Africa (from a specimen collected there on 9th November, 2021) and COVID-19 infection in Nigeria reached its highest level, the curve went up again, until March 2022.
Figure 11 shows the timeline of the percentage of people vaccinated in Nigeria; few people were vaccinated. Throughout 2020 and up to April 2021, no one was vaccinated in the country. Vaccination is visible only between May and June 2021 and from November 2021 to March 2022. This is a setback, and an unfortunate and worrisome situation for Nigeria.
4.2.2 COVID-19 infection cases, death cases and vaccination trend in South Africa
Likewise, we analyzed the trends of COVID-19 infection cases, death cases and the percentage of people vaccinated in South Africa before forecasting the future occurrence of COVID-19 infection cases in the country. The analysis covers 1st January, 2020 to 15th March, 2022. Figures 12, 13 and 14 show the timelines of new COVID-19 infection cases, deaths due to COVID-19 and the percentage of people vaccinated in South Africa from 1st January, 2020 to 15th March, 2022.
Figure 12 shows the timeline of new COVID-19 infection cases in South Africa. The curve is flat from March to June 2020, meaning there were few infections. From July 2020 the curve went up until September 2020, when it went down. From December 2020 the curve went down until March 2021; from May 2021 it went up until November 2021, when it went down. At the end of November 2021, when the Omicron variant of concern was first reported in South Africa (from a specimen collected in the country on 9th November, 2021), COVID-19 infection in the country reached its highest level, and the curve rose to its highest peak ever, continuing until March 2022.
Figure 13 shows the timeline of deaths due to COVID-19; there were few deaths in South Africa from January to June 2020. From June 2020 the death curve went up and down until early November 2021, when it went down. At the end of November 2021, when the Omicron variant of concern was first reported in South Africa (from a specimen collected in the country on 9th November, 2021), the curve went up again, until March 2022.
Figure 14 shows the timeline of the percentage of people vaccinated in South Africa; people were vaccinated in the country from March 2021 to date (15th March, 2022).
4.2.3 Cases of infection, death and vaccination trend of COVID-19 in Botswana
The trends of COVID-19 infection cases, deaths and the percentage of people vaccinated in Botswana were analyzed before forecasting the future occurrence of COVID-19 infection in the country. The analysis covers 1st January, 2020 to 15th March, 2022. Figures 15, 16 and 17 show the timelines of new COVID-19 infection cases, deaths due to COVID-19 and the percentage of people vaccinated in Botswana from 1st January, 2020 to 15th March, 2022.
Figure 15 shows the timeline of new COVID-19 infection cases in Botswana. The curve is flat from April to October 2020, meaning there were few infections. At the end of November 2021, when the Omicron variant of concern was first reported in South Africa (from a specimen collected there on 9th November, 2021), COVID-19 infection in Botswana reached its highest level, and the curve rose to its highest peak ever, continuing until March 2022.
Figure 16 shows the timeline of deaths due to COVID-19; there were few deaths in Botswana from April 2020 to January 2021. From February 2021 the death curve went up and down until October 2021, after which it went down until December 2021. After the Omicron variant of concern was first reported in South Africa at the end of November 2021 (from a specimen collected there on 9th November, 2021), the death curve in Botswana went up again, until March 2022.
Figure 17 shows the timeline of the percentage of people vaccinated in Botswana; few people were vaccinated in the country. Throughout 2020 and up to July 2021, no one was vaccinated in the country; vaccination is visible only from August 2021. This is a setback, and an unfortunate and worrisome situation for the country.
4.3 Comparative analysis of COVID-19 new infection and death cases trend in Nigeria, Botswana and South Africa
A comparative analysis of the trends in new COVID-19 infection cases and death cases in Nigeria, South Africa and Botswana was conducted using the 7-day moving average of new cases per million. Figures 18 and 19 show the comparative analysis.
In Fig. 18, the curves of new COVID-19 cases for South Africa and Botswana have somewhat similar patterns, shifted in time, whereas the curve for Nigeria has a different pattern. The comparison of new COVID-19 infections shows that Botswana has the highest number of cases, followed by South Africa, while Nigeria has the fewest new COVID-19 cases among the three countries.
In Fig. 19, the curves of COVID-19 deaths for South Africa and Botswana have somewhat similar patterns, shifted in time, whereas the curve for Nigeria has a different pattern. The comparison of COVID-19 deaths shows that South Africa has the highest number, followed by Botswana, while Nigeria has the fewest COVID-19 deaths among the three countries.
5 CNN-LSTM deep learning based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana
Three different deep learning-based forecasting models of COVID-19 infection were developed for each of Nigeria, South Africa and Botswana, each forecasting 10 days ahead. The dataset was split into 80% training and 20% testing sets. The three models were developed with the CNN, LSTM and CNN-LSTM deep learning algorithms respectively. Figures 20, 21 and 22 show the CNN-LSTM based forecasts of COVID-19 infection cases in Nigeria, South Africa and Botswana respectively.
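The data preparation described here can be sketched as an 80/20 split followed by the construction of input and output windows. The 80% fraction and the 10-day horizon come from the text; splitting chronologically (no shuffling), the 14-day input window and the placeholder series are assumptions made for illustration.

```python
def chronological_split(series, train_fraction=0.8):
    """Keep time order: the earliest part trains, the latest part tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, n_input, n_output=10):
    """Pair each run of `n_input` days with the `n_output` days that follow."""
    samples = []
    last_start = len(series) - n_input - n_output
    for start in range(last_start + 1):
        history = series[start:start + n_input]
        target = series[start + n_input:start + n_input + n_output]
        samples.append((history, target))
    return samples


series = list(range(100))            # placeholder for a daily case series
train, test = chronological_split(series)
train_samples = make_windows(train, n_input=14)

print(len(train), len(test))         # 80 20
print(len(train_samples))            # 57
print(train_samples[0][1])           # the 10 days following the first window
```

Each sample's target is exactly the 10-day span the models are asked to forecast, and building windows from the training part only keeps the held-out 20% unseen during fitting.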
The CNN-LSTM based forecasting model in Fig. 20 shows that COVID-19 infection cases would continue rising in Nigeria. This could be due to the low vaccination rate of the population and to Omicron BA.2, which is known to be even more contagious than the original Omicron variant.
Likewise, the CNN-LSTM DL forecasting model in Fig. 21 shows that COVID-19 infection cases in South Africa will continue going down slowly and may possibly rise again. This could be due to Omicron BA.2, which is known to be even more contagious than the original Omicron variant, and to the low vaccination rate of the population.
Meanwhile, the CNN-LSTM based forecast in Fig. 22 shows that new COVID-19 infection cases in Botswana will continue going down slowly and may possibly rise again. This could be due to Omicron BA.2, which is known to be even more contagious than the original Omicron variant, and to the low vaccination rate of the population.
The performance of the three forecasting models for each country was evaluated with the four standard performance metrics: accuracy, MSE, MAE and RMSE. Table 1 shows the performance evaluation results of the three learning-based forecasting models of COVID-19 infection cases in Nigeria, South Africa and Botswana.
Based on the accuracy evaluation metric, the overall performance of the forecasting models namely CNN-LTSM, CNN, and LTSM was determined. The CNN-LTSM DL forecasting models for infection cases of COVID-19 in Nigeria, South Africa and Botswana achieved the best accuracy of 98.30%, 97.60%, and 97.74% respectively, compared to CNN and LSTM DL based forecasting models of the countries as shown in Fig. 23. All the models converge from about 600 epochs and before to eventually achieved their accuracies.
Likewise based on MSE evaluation metric, the CNN-LTSM DL forecasting model for the infection cases of COVID-19 in Nigeria, South Africa and Botswana got the smallest error values compared to CNN and LSTM DL based models respectively, as shown in Fig. 24. Hence, the LTSM DL based model for COVID-19 Infection Cases in South Africa, got the highest MSE value of 0.54432 as shown in Fig. 24.
With respect to MAE evaluation metric capable of determining the closeness of the forecasted infection cases of the COVID-19 to the confirmed cases of the disease, the CNN-LTSM DL forecasting models of Nigeria, South Africa and Botswana, also got the smallest error values compared to CNN and LSTM DL forecasting models for COVID-19 respectively as shown in Fig. 25. Hence, the LTSM DL model for COVID-19 Infection Cases in Botswana got the highest value of 0.64321 MAE as shown in Fig. 25.
Likewise based RMSE evaluation metric capable of determining the standard deviation of the difference between the forecasted cases of infection and the confirmed cases of COVID-19, the CNN-LTSM DL forecasting models of cases of infection for COVID-19 in Nigeria, South Africa and Botswana, also got the smallest error values compared to CNN and LSTM DL forecasting models of cases of infection for COVID-19 in Nigeria, South Africa and Botswana respectively. Hence, the LTSM DL based model for COVID-19 Infection Cases in Nigeria, got the highest RMSE value of 0.11321 as shown in Fig. 26.
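The three error metrics used above follow standard definitions, which the minimal sketch below computes. The function names and the four-value series are hypothetical and purely illustrative, not taken from the dataset. The accuracy metric is omitted because its formula is not given in this section.

```python
import math


def mse(actual, predicted):
    """Mean squared error: average of the squared forecast errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error: average of the absolute forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Illustrative values only (not real case counts).
confirmed = [100.0, 120.0, 130.0, 150.0]
forecast = [110.0, 115.0, 135.0, 140.0]

errors = {
    "MSE": mse(confirmed, forecast),
    "MAE": mae(confirmed, forecast),
    "RMSE": rmse(confirmed, forecast),
}
```

Because RMSE is the square root of MSE, the two metrics always rank models in the same order on the same data. MAE can rank them differently, since it penalizes large errors less heavily.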
6 Result and discussion
The COVID-19 pandemic spreads exponentially and causes loss of lives, plunging the world community, and African countries in particular, into an alarmingly difficult situation that has culminated in catastrophes such as economic recession, political instability and loss of jobs. With the emergence and spread of the Omicron variant of concern, forecasting and identification of COVID-19 infection and death cases is vital for government at various levels. With knowledge of the spread at a particular point in time, governments can take swift action and formulate new policies and modalities to minimize the consequences of the COVID-19 pandemic for both the public health and economic sectors. Therefore, in this work, we combined the Convolutional Neural Network (CNN) learning algorithm with the Long Short-Term Memory (LSTM) learning algorithm to propose a hybrid learning algorithm, Convolutional Neural Network - Long Short-Term Memory (CNN-LSTM), and used it to forecast COVID-19 infection cases in Nigeria, South Africa and Botswana. We collected the world COVID-19 dataset from the "Our World in Data" organization and developed the models. Before developing the forecasting models, we analyzed the behavior of the pandemic in the entire world from 1 January 2020 to 15 March 2022 by adding up the numbers collected from all countries, grouped by date. The trends of COVID-19 infection cases, death cases and vaccination in the three African countries (Nigeria, South Africa and Botswana) were also analyzed in depth.
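The world-level aggregation described above (adding up the numbers from all countries, grouped by date) could look like the following minimal sketch. The column names `date` and `new_cases` are assumed to match the Our World in Data CSV export. The function name and the sample rows are hypothetical, and the numbers are made up for illustration.

```python
from collections import defaultdict


def world_totals_by_date(rows):
    """Sum new cases over all countries for each date.

    `rows` is an iterable of dicts such as those produced by
    csv.DictReader. Rows with a missing case count are skipped.
    """
    totals = defaultdict(float)
    for row in rows:
        value = row.get("new_cases")
        if value in (None, ""):
            continue  # no count reported for this country and date
        totals[row["date"]] += float(value)
    # ISO-formatted dates sort chronologically as plain strings.
    return dict(sorted(totals.items()))


# Made-up rows for illustration only.
sample_rows = [
    {"location": "Nigeria", "date": "2022-03-14", "new_cases": "10"},
    {"location": "Botswana", "date": "2022-03-14", "new_cases": "5"},
    {"location": "South Africa", "date": "2022-03-15", "new_cases": "7"},
    {"location": "Nigeria", "date": "2022-03-15", "new_cases": ""},
]

world = world_totals_by_date(sample_rows)
```

Skipping empty counts rather than treating them as zero is a design choice of this sketch; either way the daily total is unchanged, but skipping makes it explicit that the value was not reported.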
The results show that the COVID-19 infection curve went up and down in all three countries until the end of November 2021, when the Omicron variant was first reported. After that point, infections in all three countries appeared likely to rise again, possibly to the highest levels yet in Nigeria and South Africa, but not in Botswana. The death curve likewise went up and down in all three countries. Regarding the percentage of people vaccinated, only a small share of the population had been vaccinated in all three countries. A comparative analysis of the trends of COVID-19 infection and death cases in Nigeria, South Africa and Botswana was also conducted. It shows that Botswana has the highest number of new COVID-19 cases, followed by South Africa, while Nigeria has the fewest among the three countries. However, the comparative analysis of COVID-19 death cases shows that South Africa has the highest number, followed by Botswana, while Nigeria has the fewest COVID-19 death cases among the three countries.
Following this analysis, DL forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana were developed for a 10-day horizon with the CNN, LSTM and CNN-LSTM DL algorithms respectively. The models were evaluated on the basis of four standard performance evaluation metrics: accuracy, MSE, MAE and RMSE. The CNN-LSTM DL based forecasting models achieved not only the best accuracy but also lower MSE, MAE and RMSE values than the models developed with CNN and LSTM. According to the CNN-LSTM DL based forecasting models, COVID-19 infection cases in Nigeria would continue rising; this could be due to the low vaccination rate of the population in the country and the emergence of the Omicron variant of concern. Likewise, according to the models, COVID-19 infection cases in South Africa and Botswana would continue going down slowly and may possibly rise again; this could be due to the emergence of Omicron and the low vaccination rate of the population in these countries.
Therefore, the three countries (Nigeria, South Africa and Botswana) should take all necessary measures and formulate policies and strategies aimed at eradicating the COVID-19 pandemic. Public awareness of COVID-19, especially of COVID-19 vaccination, is also very important and should be taken seriously at all levels of government in these countries. The governments of these countries should focus on longer-term testing and surveillance approaches that can be integrated into, and will strengthen, their national health systems.
7 Conclusion
In this work, we combined the CNN learning algorithm with the LSTM learning algorithm to propose a hybrid CNN-LSTM learning algorithm. The hybrid algorithm was used to develop DL based forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana over a 10-day horizon. Two other DL based forecasting models (CNN and LSTM) were developed for the same countries and the same horizon. The models were limited to 10 days, a horizon that can be extended to one month or more in future work. The models were evaluated on four standard performance evaluation metrics: accuracy, MSE, MAE and RMSE. The CNN-LSTM DL based forecasting models achieved not only the best accuracy, 98.30%, 97.60% and 97.74% for Nigeria, South Africa and Botswana respectively, but also lower MSE, MAE and RMSE values than the models developed with CNN and LSTM. According to the CNN-LSTM deep learning based forecasting models, COVID-19 infection cases in Nigeria would continue rising. Likewise, according to the models, COVID-19 infection cases in South Africa and Botswana would continue going down slowly and may possibly rise again; this could be due to the emergence of Omicron and the low vaccination rate of the population in these countries [39,40,41,42,43].
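One common way to extend a forecast beyond a fixed horizon is recursive multi-step forecasting, in which each predicted value is fed back as input for the next step. The sketch below illustrates the idea under stated assumptions: this work does not say which multi-step strategy was used, and `predict_next` is a hypothetical stand-in for any trained one-step model. Here it is a simple window mean, used only so the example runs.

```python
def recursive_forecast(history, predict_next, lookback, horizon=10):
    """Forecast `horizon` future values one step at a time.

    `predict_next` maps a list of `lookback` recent values to the next
    value. Each prediction is appended to the window so that later
    steps are conditioned on earlier forecasts.
    """
    window = list(history[-lookback:])
    forecast = []
    for _ in range(horizon):
        next_value = predict_next(window)
        forecast.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward
    return forecast


def mean_of_window(window):
    """Hypothetical placeholder predictor: mean of the recent values."""
    return sum(window) / len(window)


# Illustrative history only (not real case counts).
history = [10.0, 12.0, 14.0]

ten_day = recursive_forecast(history, mean_of_window, lookback=3, horizon=10)
one_month = recursive_forecast(history, mean_of_window, lookback=3, horizon=30)
```

With this strategy a longer horizon only changes the `horizon` argument, but forecast errors accumulate as predictions are fed back in, so accuracy typically degrades as the horizon grows.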
Availability of data and material
Not applicable.
Code availability
Not applicable.
References
Arora P, Kumar H, Panigrahi BK. Prediction and analysis of COVID-19 Positive cases using deep learning models: a descriptive case study of India. Chaos, Solitons and Fractals 2020. https://doi.org/10.1016/j.chaos.2020.110017.
Abdulmajeed K, Adeleke M, Popoola L. Online forecasting of Covid-19 cases in Nigeria using limited data. Data Brief 2020;30 105683. https://doi.org/10.1016/j.dib.2020.105683.
Islam M, Mahmud S, Muhammad LJ, et al. Wearable Technology to Assist the Patients Infected with Novel Coronavirus (COVID-19). SN Comput Sci. 2020;1:320. https://doi.org/10.1007/s42979-020-00335-4.
Muhammad LJ, Islam MM, Usman SS, et al. Predictive Data Mining Models for Novel Coronavirus (COVID-19) Infected Patients’ Recovery. SN Comput Sci. 2020;1. https://doi.org/10.1007/s42979-020-00216-w
Muhammad LJ, Algehyne EA, Usman SS, et al. Supervised Machine Learning Models for Prediction of COVID-19 Infection using Epidemiology Dataset. SN Comput Sci. 2020. https://doi.org/10.1007/s42979-020-00394-7.
Pathan RK, Biswas M, Khandaker MU. Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model. Chaos Solit Fractals 2020;138 110018. https://doi.org/10.1016/j.chaos.2020.110018.
Muhammad LJ, et al. Deep Learning Models for Classification and Localization of COVID-19 Abnormalities on Chest Radiographs. 2021 4th Int Conf Comput Inform Sci (ICCIS). 2021, pp. 1–6. https://doi.org/10.1109/ICCIS54243.2021.9676401.
Sarumi OA, Aouedi O. Potential of Deep Learning Algorithms in Mitigating the Spread of COVID-19. In: J. Nayak, B. Naik, A. Abraham (eds) Understanding COVID-19: The Role of Computational Intelligence. Stud Comput Intell. 2021;963.
Muhammad LJ, et al. Deep Learning Models for Predicting COVID-19 Using Chest X-Ray Images. In: Johri P, Diván MJ, Khanam R, Marciszack M, Will A (eds) Trends and Advancements of Image Processing and Its Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. 2022. https://doi.org/10.1007/978-3-030-75945-2_6
AlShourbaji I, Kachare P, Zogaan W, et al. Learning Features Using an optimized Artificial Neural Network for Breast Cancer Diagnosis. SN Comput Sci. 2022;3:229. https://doi.org/10.1007/s42979-022-01129-6.
Algehyne EA, Jibril ML, Algehainy NA, Alamri OA, Alzahrani AK. Fuzzy Neural Network Expert System with an Improved Gini Index Random Forest-Based Feature Importance Measure Algorithm for Early Diagnosis of Breast Cancer in Saudi Arabia. Big Data Cogn Comput. 2022;6:13. https://doi.org/10.3390/bdcc6010013.
Alsayed O, Rahim MSM, Bidewi IA. Selection of the Right Undergraduate Major by Students Using Supervised Learning Techniques. Appl Sci. 2021;11:10639. https://doi.org/10.3390/app112210639.
Hariri M, Obaid W, Rihawi H, Safadi S, McGlasson MA. The Covid-19 Forecast in Northwest Syria. medRxiv preprint. 2020. https://doi.org/10.1101/2020.05.07.20085365.
Cheng Z, Burcu T, Nicola GC, Perdo DWG, Matthias PH, et al. ICUmonitoring.ch: a platform for short-term forecasting of intensive care unit occupancy during the COVID-19 epidemic in Switzerland. Swiss Med Wkly 2020;150:w20277. https://doi.org/10.4414/smw.2020.20277.
Yang Z, Zeng Z, Wang K, Wong SS, Liang W, Zanin M, Liu P, Cao X, Gao Z, Mai Z, Liang J, Liu X, Li S, Li Y, Ye F, Guan W, Yang Y, Li F, Luo S, Xie Y, Liu B, Wang Z, Zhang S, Wang Y, Zhong N, He J. Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions. J Thorac Dis. 2020;12(3):165–74. https://doi.org/10.21037/jtd.2020.02.64.
Ayyoubzadeh SM, Ayyoubzadeh SM, Zahedi H, Ahmadi M, R Niakan Kalhori S. Predicting COVID-19 Incidence Through Analysis of Google Trends Data in Iran: Data Mining and Deep Learning Pilot Study. JMIR Public Health Surveillance. 2020;6(2):e18828. https://doi.org/10.2196/18828
Liu D, Clemente L, Poirier C, Ding X, Chinazzi M, Davis JT, Vespignani A, Santillana M. A machine learning methodology for real-time forecasting of the 2019–2020 COVID-19 outbreak using Internet searches, news alerts, and estimates from mechanistic models, https://arxiv.org/abs/2004.04019
Magdon-Ismail M. Machine Learning the Phenomenology of COVID-19 From Early Infection Dynamics. https://arxiv.org/abs/2003.07602
Hussain S, Muhammad LJ, Ishaq FS, Yakubu A, Mohammed IA. Performance Evaluation of Various Data Mining Algorithms on Road Traffic Accident Dataset. In Information and Communication Technology for Intelligent Systems. Smart Innovation, Systems and Technologies; Satapathy, S., Joshi, A., Eds.; Springer Nature, Singapore, Singapore, 2019.
Badi I, et al. Measuring Sustainability Performance Indicators Using FUCOM-MARCOS Methods. Operational Research in Engineering Sciences: Theory and Applications. 2022;5(2):99–116.
Haruna AA, et al. An Improved C4.5 Data Mining Driven Algorithm for the Diagnosis of Coronary Artery Disease. In Proceedings of the International Conference on Digitization (ICD), Sharjah, United Arab Emirates, 18–19 Nov. 2019; pp. 48–52.
Ullah A, Ahmad J, Muhammad K, Sajjad M, Baik SW. Action recognition in video sequences using deep bi-directional LSTM with CNN features. IEEE Access 2018;6:1155–66.
Zhao R, Yan R, Wang J, Mao K. Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors 2017;17(2):273–90.
He K, Sun J. Convolutional neural networks at constrained time cost. IEEE Conf. on Computer Vision and Pattern Recognition 2015:5353–60.
Islam MZ, Islam MM, Asraf A. A Combined Deep CNN-LSTM Network for the Detection of Novel Coronavirus (COVID-19) Using X-ray Images. Inform Med Unlocked. 2020, 100412. https://doi.org/10.1016/j.imu.2020.100412
Kim T-Y, Cho S-B. Predicting Residential Energy Consumption using CNN-LSTM Neural Networks. Energy. 2019. https://doi.org/10.1016/j.energy.2019.05.23.
Muhammad LJ, Algehyne EA, Usman SS. Predictive Supervised Machine Learning Models for Diabetes Mellitus. SN Comput Sci. 2020;1:240. https://doi.org/10.1007/s42979-020-00250-8.
Muhammad LJ, Badi I, Haruna AA, Mohammed IA, Dada OS. Deep Learning Models for Classification of Brain Tumor with Magnetic Resonance Imaging Images Dataset. In: Raza K. (eds) Computational Intelligence in Oncology. Stud Comput Intell. 2022;1016. Springer, Singapore. https://doi.org/10.1007/978-981-16-9221-5_9
Agethen S, Hsu WH. Deep multi-kernel convolutional LSTM networks and an attention-based mechanism for videos. IEEE Trans Multimedia. 2020;22(3):819–29. https://doi.org/10.1109/TMM.2019.2932564.
Yu Y, Xi S, Hu C, Zhang J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019;31:1235–70. https://doi.org/10.1162/neco_a_01199
Sainath TN, Vinyals O, Senior A, Sak H. Convolutional, long short-term memory, fully connected deep neural networks. IEEE Int Conf on Acoustics, Speech and Signal Processing. 2015:4580–4.
Breiman L. Bagging predictors. Mach Learn. 1996;24(2):123–40.
Rawat W, Wang Z. Deep convolutional neural networks for image classification: a comprehensive review. Neural Comput. 2017;29:9.
Muhammad LJ, Badi I, Haruna AA, Mohammed IA, Dada OS. Deep Learning Models for Classification of Brain Tumor with Magnetic Resonance Imaging Images Dataset. In: Raza, K. (eds) Computational Intelligence in Oncology. Stud Comput Intell. 2022;1016. Springer, Singapore. https://doi.org/10.1007/978-981-16-9221-5_9
Liu T, Bao J, Wang J, Zhang Y. A Hybrid CNN-LSTM Algorithm for Online Defect Recognition of CO2 Welding. Sensors (Basel, Switzerland). 2018;18(12):4369. https://doi.org/10.3390/s18124369.
Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. Proc. of the IEEE Conf. on Comput Vis Pattern Recognit. 2016:2921–9.
Lu W, Li J, Li Y, Sun A, Wang J. A CNN-LSTM-Based Model to Forecast Stock Prices. Complexity. 2020;2020:1–10. https://doi.org/10.1155/2020/6622927.
Liang S, Zhu B, Zhang Y, Cheng S, Jin J. A Double Channel CNN-LSTM Model for Text Classification. 2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th Int Conf Data Sci Syst (HPCC/SmartCity/DSS). 2020. https://doi.org/10.1109/hpcc-smartcity-dss50907.2020.00169
Muhammad LJ, Algehyne EA. Fuzzy based expert system for diagnosis of coronary artery disease in Nigeria. Health Technol. 2021;11:319–29. https://doi.org/10.1007/s12553-021-00531-z.
Muhammad LJ, Ahmad AH, Ibrahim AM, Mansir A, Bature B, Jamila MA. Performance Evaluation of Classification Data Mining Algorithms On Coronary Artery Disease Dataset. In Proceedings of the IEEE 9th International Conference on Computer and Knowledge Engineering (ICCKE 2019), Ferdowsi University of Mashhad, Mashhad, Iran, 24–25 Oct. 2019.
Muhammad LJ, Jibrin MB, Yahaya BZ, Jibrin IAMB, Ahmad A, Amshi JM. An Improved C4.5 Algorithm using Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute for Large Dataset. In Proceedings of the 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 2020; pp. 006–010.
Wang P, Zheng X, Ai G, Liu D, Zhu B. Time series prediction for the epidemic trends of COVID-19 using the improved LSTM deep learning method: case studies in Russia, Peru and Iran. Chaos, Solitons, and Fractals, 2020;140. Article ID 110214.
Our World in Data. Retrieved from www.ourworldindata.org. Accessed 1 Jan 2022.
Ethics declarations
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Not applicable.
Consent to publish
Not applicable.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
This article is part of the Topical Collection on COVID-19 Health Technology: Design, Regulation, Management, Assessment Production
About this article
Cite this article
Muhammad, L.J., Haruna, A.A., Sharif, U.S. et al. CNN-LSTM deep learning based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana. Health Technol. 12, 1259–1276 (2022). https://doi.org/10.1007/s12553-022-00711-5
DOI: https://doi.org/10.1007/s12553-022-00711-5