1 Introduction

Coronaviruses are a group of viruses capable of infecting and killing mammals and birds; notable examples are Middle East Respiratory Syndrome Coronavirus (MERS-CoV), Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), and the 2019 Novel Coronavirus (2019-nCoV) [1,2,3]. The disease associated with the 2019 Novel Coronavirus is designated COVID-19, and it is caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [4]. The outbreak started in Wuhan, Hubei Province, China, in late December, 2019. Since the beginning of the COVID-19 pandemic, many variants of the novel coronavirus have emerged as SARS-CoV-2 continues to mutate and evolve [5, 6]. Many of these mutations have little or no effect on how SARS-CoV-2 affects humans. Nevertheless, the mutations in some variants, such as the genetic changes in the Delta variant, have made SARS-CoV-2 more transmissible and contagious than the original version of the virus discovered in late 2019 [7, 8].

In late November, 2021, a new variant of SARS-CoV-2 emerged and was named Omicron by the World Health Organization (WHO). The variant was first reported on 24th November, 2021 in South Africa, from a specimen collected in the country on 9th November, 2021. On 26th November, 2021, WHO designated it a “variant of concern” (VOC) because of how easily it spreads, the severity of the illness it causes and its higher risk of reinfection compared with previously identified variants [7]. WHO therefore called on countries to strengthen monitoring and follow-up efforts to better understand the transmission of the new variant; to report cases of the variant; to submit complete genome sequences and related metadata to public databases; and to cooperate with the international community in conducting laboratory tests and field investigations to understand the potential impact of the new mutations on COVID-19 epidemiology, the effectiveness of public health and social measures, diagnostic methods, antibody neutralization and immune responses [7, 9]. WHO also asked individuals to reduce their risk from the new variant to the barest minimum through public health and social measures such as hand hygiene, wearing a well-fitting mask, improving the ventilation of indoor spaces, getting vaccinated, keeping physical distance and avoiding crowded spaces, among others. The British scientist Sir Jeremy Farrar warned that “the emergence and spread of the Omicron variant of coronavirus shows that the world is closer to the start of a pandemic than the end”. According to WHO, 57 countries across the different regions of the world had reported Omicron cases, and 46% of the cases were from African countries, particularly South Africa.
As of March 10, 2022, confirmed cases of COVID-19 in 55 African countries had reached 11,453,505, while more than 273,418,843 vaccine doses had been administered across the continent. South Africa, Nigeria and Botswana are the African countries with the highest numbers of cases of the Omicron variant of concern.

Consequently, many artificial intelligence, statistical and mathematical approaches have been developed for predicting and forecasting the severity and spread of the COVID-19 pandemic in many countries [10,11,12]. In [1], an LSTM model was designed for predicting COVID-19 cases 1 day ahead, 7 days ahead, or both. The study in [13] developed a COVID-19 Essential Supplies Forecasting Tool for predicting different cases of COVID-19 in Northwest Syria. A web-based predictive method that captures COVID-19 data in real time from the NCDC (Nigeria Centre for Disease Control) was developed in [2]. The work in [14] created a model for predicting ICU bed demand in Switzerland for individual hospitals at the subregional level. In [15], two models, ML (machine learning) and SEIR (susceptible-exposed-infectious-removed), were developed for predicting the spread of the COVID-19 pandemic in China under public health interventions. The work in [16] developed a DM (data mining) model for predicting 2019-nCoV incidence in Iran using Google Trends data; LSTM and linear regression models were employed to estimate the number of 2019-nCoV cases, thereby helping health care managers and policymakers to plan, allocate and implement health care resources appropriately. In [17], a real-time ML method for the 2019-nCoV outbreak was presented that combines news alerts collected from Media Cloud, official health reports from the Chinese Center for Disease Control and Prevention, online search activity from Baidu and forecasts from GLEAM (an agent-based model) to forecast daily cases of COVID-19. This method appears robust in generating accurate forecasts of COVID-19 cases two days in advance.
The study in [18] introduced an ML-based method for studying early fluctuations in 2019-nCoV infection data, especially the rate of growth over time, using US data from the first case on January 20, 2020. The resulting public health insights included infectivity, the rates of mild and severe infections, the rate of untreated cases and predictions of new cases over time.

Following the outbreak of the Omicron VOC, forecasting and identifying COVID-19 infection cases is invaluable for government at various levels [15]. With knowledge of the spread at a particular point in time, governments can act swiftly to formulate new policies and modalities that mitigate the implications of the pandemic. In this work, we propose a hybrid deep learning algorithm and use it to develop forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana. We combine the Convolutional Neural Network (CNN) learning algorithm with the Long Short Term Memory (LSTM) learning algorithm into a hybrid learning algorithm, Convolutional Neural Network-Long Short Term Memory (CNN-LSTM). For each of the three countries we developed three deep learning forecasting models for COVID-19 infection cases, using the CNN, LSTM and CNN-LSTM learning algorithms. The CNN-LSTM models proved the most accurate, with lower MSE, MAE and RMSE values than the individual CNN and LSTM models. Although many studies on either CNN or LSTM models have been reported, as cited above, to our knowledge the combined CNN-LSTM DL algorithm has not been reported for any of the three countries. The present study aims to fill this gap, with a view to supporting the redesign of public health policies and strategies that mitigate the implications of the COVID-19 pandemic not only in Nigeria, South Africa and Botswana but also in other African countries that may follow a similar trajectory.

2 Materials and methods

Figure 1 presents a diagrammatic representation of the methodology adopted in this research, that is, the materials and methods employed to develop the CNN-LSTM DL-based forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana.

Fig. 1 Methodology

2.1 Dataset collection

The dataset used in this study was sourced from the public dataset of the “Our World in Data” organization. Our World in Data is a project of the Global Change Data Lab, a non-profit organization based in the United Kingdom (Registered Charity Number 1186433). The organization, in collaboration with the University of Oxford, has developed a reliable repository of datasets on many topics, focusing on problems affecting the world. The dataset used in this study is the up-to-date world COVID-19 dataset, which contains sixty-seven (67) features.

2.2 Feature extraction

In this section, we extract the relevant features of the dataset to build the time-series forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana. Out of the 67 features of the dataset, only 9 were extracted and used: date, total_cases, new_cases, total_deaths, new_deaths, people_vaccinated_per_hundred, people_fully_vaccinated_per_hundred, new_deaths_per_million and new_cases_per_million.
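The selection step can be sketched with pandas. This is a minimal illustration rather than the authors' code: the helper name `extract_features`, the country column name `location` and the small synthetic frame are assumptions made for the example; only the nine feature names come from the text.

```python
import pandas as pd

# The nine features named in the text.
FEATURES = [
    "date", "total_cases", "new_cases", "total_deaths", "new_deaths",
    "people_vaccinated_per_hundred", "people_fully_vaccinated_per_hundred",
    "new_deaths_per_million", "new_cases_per_million",
]


def extract_features(df, country, country_col="location"):
    """Return the nine selected features for one country, sorted by date.

    `country_col` is an assumed column name; adjust it to the actual file.
    """
    subset = df.loc[df[country_col] == country, FEATURES].copy()
    subset["date"] = pd.to_datetime(subset["date"])
    return subset.sort_values("date").reset_index(drop=True)


# Tiny synthetic frame standing in for the real dataset (values are made up).
demo = pd.DataFrame({
    "location": ["Nigeria", "Botswana", "Nigeria"],
    "date": ["2022-01-02", "2022-01-01", "2022-01-01"],
    **{name: [1.0, 2.0, 3.0] for name in FEATURES[1:]},
    "extra_feature": [0, 0, 0],
})
nigeria = extract_features(demo, "Nigeria")
```

Calling the helper once per country yields three per-country frames that share the same nine columns.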

2.3 Deep learning algorithm

2.3.1 Convolutional neural network

Convolutional Neural Network (CNN) is one of the most widely used discriminative algorithms; it is built from convolutional layers and pooling layers [19, 20], which are stacked on top of each other [21]. The convolutional layer shares many weights, while the pooling layers subsample the output of the convolutional layer and reduce the amount of data passed up from the lower layer [12]. Weight sharing in the convolutional layer, together with appropriately chosen pooling schemes, endows the CNN with some invariance properties [7, 22,23,24]. CNN has been found to be one of the most effective DL algorithms for image recognition and computer vision [25, 26]. In a convolutional layer, the feature maps from the previous layer are convolved with learnable kernels and passed through a linear or non-linear activation function, such as hyperbolic tangent, sigmoid or softmax, to produce the output feature maps [27, 28]. Each output feature map may therefore combine more than one input feature map, as shown in Eq. (1):

$${x}_{j}^{l}=f \left({\sum}_{i \in {M}_{j}}{x}_{i}^{l-1}*{k}_{ij}^{l}+ {b}_{j}^{l}\right),$$
(1)

where \({x}_{j}^{l}\) = output of the current layer; \({x}_{i}^{l-1}\) = previous layer output; \({k}_{ij}^{l}\) = kernel for the present layer; \({b}_{j}^{l}\) = bias for the current layer. \({M}_{j}\) is the selection of input maps, and each output map is given an additive bias b.

At the pooling layer of a CNN, the number of feature maps does not change: if there are n input maps, there are exactly n output maps [11]. However, the size of each dimension of the output maps is reduced by the down-sampling operation, which can be formulated as:

$${x}_{j}^{l}=down \left({x}_{j}^{l-1}\right),$$
(2)

In a CNN, the convolution and pooling operations are combined with activation functions, which introduce non-linearity into the network. Figure 2 depicts the overall architecture of the CNN.

Fig. 2 Architecture of CNN
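To make Eqs. (1) and (2) concrete, the following NumPy sketch computes one output feature map from a set of one-dimensional input maps and then down-samples it. The function names are hypothetical, and using max pooling over non-overlapping windows for the `down` operation is an assumption; the text does not fix the pooling scheme at this point.

```python
import numpy as np


def conv1d_feature_map(inputs, kernels, bias, activation=np.tanh):
    """Eq. (1) for one output map j: f(sum_i x_i * k_ij + b_j).

    inputs  : list of 1-D arrays, the selected input maps M_j
    kernels : list of 1-D kernels, one per input map
    np.convolve performs a true convolution (the kernel is flipped).
    """
    total = sum(np.convolve(x, k, mode="valid") for x, k in zip(inputs, kernels))
    return activation(total + bias)


def downsample_max(feature_map, pool=2):
    """Eq. (2): shrink a map by taking the max over non-overlapping windows."""
    usable = len(feature_map) - len(feature_map) % pool
    return feature_map[:usable].reshape(-1, pool).max(axis=1)


# One input map, one difference kernel, identity activation for readability.
x = np.arange(8, dtype=float)
k = np.array([1.0, 0.0, -1.0])
fmap = conv1d_feature_map([x], [k], bias=0.0, activation=lambda v: v)
pooled = downsample_max(fmap, pool=2)
```

The number of maps is unchanged by pooling (one in, one out), while the length of the map is halved.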

2.3.2 Long short term memory

Long short-term memory (LSTM) is a Recurrent Neural Network (RNN) architecture that produces promising results on a variety of tasks, including prediction and language modelling [29,30,31]. LSTM solves complex tasks with long time lags that earlier recurrent network algorithms could not solve [32,33,34,35]. The LSTM learning algorithm is able to learn long-term relationships by replacing the RNN hidden layer with memory cells [36]. The key idea behind LSTM is the cell state, the horizontal line running through the top of Fig. 3.

Fig. 3 Gated Recurrent Unit

The algorithm adds information to, or removes information from, the cell state through structures called gates: an input gate \({i}_{t}\), a forget gate \({f}_{t}\) and an output gate \({o}_{t}\), as shown in Fig. 4. They are defined as follows:

$${f}_{t}=\sigma \left({W}_{f}\cdot \left[{h}_{t-1},{x}_{t}\right]+{b}_{f}\right),$$
(3)
$${i}_{t}=\sigma \left({W}_{i}\cdot \left[{h}_{t-1},{x}_{t}\right]+{b}_{i}\right),$$
(4)
$${\tilde{C}}_{t}=tanh\left({W}_{c}\cdot \left[{h}_{t-1},{x}_{t}\right]+{b}_{c}\right),$$
(5)
$${C}_{t}={f}_{t}*{C}_{t-1}+{i}_{t}*{\tilde{C}}_{t},$$
(6)
$${o}_{t}=\sigma \left({W}_{o}\cdot \left[{h}_{t-1},{x}_{t}\right]+{b}_{o}\right),$$
(7)

Fig. 4 LSTM Diagram

$${h}_{t}={o}_{t}*tanh\left({C}_{t}\right)$$
(8)
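The gate equations (3) to (8) translate almost line by line into code. The sketch below performs a single LSTM time step in NumPy; the function name `lstm_step` and the dictionary layout of the weights are illustrative assumptions, not part of the original study.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs. (3)-(8).

    W maps gate name -> matrix of shape (hidden, hidden + inputs);
    b maps gate name -> vector of shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # Eq. (3): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # Eq. (4): input gate
    c_hat = np.tanh(W["c"] @ z + b["c"])     # Eq. (5): candidate cell state
    c_t = f_t * c_prev + i_t * c_hat         # Eq. (6): new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])       # Eq. (7): output gate
    h_t = o_t * np.tanh(c_t)                 # Eq. (8): new hidden state
    return h_t, c_t


# With all-zero weights every gate equals 0.5 and the candidate state is 0,
# so the cell state is simply halved.
hidden, inputs = 2, 1
W = {g: np.zeros((hidden, hidden + inputs)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}
h_t, c_t = lstm_step(np.ones(inputs), np.zeros(hidden), np.ones(hidden), W, b)
```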

2.3.3 CNN-LSTM

The combination of CNN and LSTM (Convolutional Neural Network-Long Short-Term Memory, CNN-LSTM) is proposed in order to pair the ability of CNN to extract useful features and learn an internal representation of time-series data with the efficiency of LSTM in detecting and modelling the short- and long-term temporal relationships contained in the ordered dataset. To achieve this aim, the proposed CNN-LSTM is made up of two primary components:

  i. A one-dimensional CNN, made up of convolutional and pooling layers, which performs the mathematical operations on the input data and thus generates the features.

  ii. LSTM and dense layers, which make use of the generated features.

The proposed CNN-LSTM algorithm uses CNN as an encoder, while LSTM serves as the decoder. The encoder learns features from the input data and feeds them into the decoder (LSTM). The decoder then identifies and models the short- and long-term temporal relationships inherent in the dataset. The sequence of steps is briefly given as follows (see Fig. 5):

  i. Input layer: receives the input data;

  ii. 1st convolution layer: scans the input sequence from (i) and projects the results onto feature maps;

  iii. 2nd convolution layer: repeats the operation on the feature maps from (ii) in order to amplify any salient features; each convolution layer uses 32 feature maps and a kernel size of 3 time steps to read the input sequence;

  iv. Max pooling layer: simplifies the feature maps from (iii) by keeping only the strongest signals, producing a matrix of smaller dimension;

  v. Dropout layer: regularizes the network to protect the model from overfitting;

  vi. Flatten layer: flattens the distilled feature maps into a single long vector that can be used as input to the decoder;

  vii. Repeat vector layer: repeats the internal representation of the input sequence once for each time step in the output sequence;

  viii. LSTM decoder: has a hidden layer of 100 units and outputs the full sequence, so that each of the 100 units provides a value for each day, serving as the basis for forecasting the following days in the output sequence;

  ix. Fully connected layer: interprets each time step in the output sequence; the same fully connected and output layers are applied to every time step produced by the LSTM decoder, so that a single output sequence is forecast;

  x. Output layer: forecasts the number of new COVID-19 infection and death cases for the next 10 days.

Fig. 5 CNN-LSTM Architecture
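The layer sequence (i) to (x) can be assembled in Keras roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the function name `build_cnn_lstm`, the ReLU activations, the pool size of 2, the dropout rate of 0.2, the Adam optimizer and the MSE loss are all illustrative choices. Only the 32 feature maps, the kernel size of 3, the 100 LSTM units and the 10-day output horizon are taken from the text.

```python
def build_cnn_lstm(n_steps_in, n_features=1, n_steps_out=10):
    """Assemble the encoder-decoder stack listed in steps (i)-(x)."""
    # Imported lazily so the module can be read without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(n_steps_in, n_features)),     # (i) input layer
        layers.Conv1D(32, 3, activation="relu"),         # (ii) 1st convolution
        layers.Conv1D(32, 3, activation="relu"),         # (iii) 2nd convolution
        layers.MaxPooling1D(pool_size=2),                # (iv) pool size assumed
        layers.Dropout(0.2),                             # (v) rate assumed
        layers.Flatten(),                                # (vi) single long vector
        layers.RepeatVector(n_steps_out),                # (vii) one copy per output day
        layers.LSTM(100, return_sequences=True),         # (viii) LSTM decoder
        layers.TimeDistributed(layers.Dense(1)),         # (ix)-(x) one value per day
    ])
    model.compile(optimizer="adam", loss="mse")          # optimizer and loss assumed
    return model
```

For example, `build_cnn_lstm(30)` would map a 30-day input window (an assumed length) to a 10-day forecast, one value per day.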

2.4 Evaluation metric performance

The proposed CNN-LSTM DL model, the CNN model and the LSTM model were assessed using standard performance evaluation metrics for deep learning algorithms: accuracy, Mean Square Error (MSE), Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE).

  i. Accuracy determines the overall performance of the forecasting model. It is calculated as the percentage ratio of the summed difference between the actual and forecasted values to the total of the actual values [37]:

    $$Accuracy= \frac{Tdiff}{Tactual}*100$$
    (9)

  ii. Mean Square Error (MSE) is the mean of the squared differences between the forecasted and observed values [38], as shown in the equation below. The smaller the MSE, the better the performance of the forecasting model.

    $$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left({x}_{i}-{\overline{x}}_{i}\right)}^{2}$$
    (10)

  iii. Mean Absolute Error (MAE) determines the closeness of the forecasted cases of infection to the confirmed cases of COVID-19 [35, 38]:

    $$MAE= \frac{1}{n}\sum_{i=1}^{n}\left|{x}_{i}- {\overline{x}}_{i}\right|$$
    (11)

  iv. Root Mean Squared Error (RMSE) shows the standard deviation of the difference between the forecasted cases of infection and the confirmed cases of COVID-19:

    $$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({x}_{i}- {\overline{x}}_{i}\right)}^{2}}$$
    (12)

Here \({x}_{i}\) and \({\overline{x}}_{i}\) denote the confirmed and forecasted values for observation i, and n is the number of observations.
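The three error metrics described above can be computed with a few lines of NumPy. The sketch below is illustrative (the function names and sample numbers are made up for the example); the accuracy measure is left out because the quantities in Eq. (9) are not specified in enough detail to implement without guessing.

```python
import numpy as np


def mse(actual, forecast):
    """Mean of the squared differences between actual and forecast values."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(np.mean((a - f) ** 2))


def mae(actual, forecast):
    """Mean of the absolute differences between actual and forecast values."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(a - f)))


def rmse(actual, forecast):
    """Square root of the MSE, in the same units as the data."""
    return float(np.sqrt(mse(actual, forecast)))


# Made-up values: the differences are 1, 0 and -2.
actual = [2.0, 4.0, 6.0]
forecast = [1.0, 4.0, 8.0]
scores = {"MSE": mse(actual, forecast),
          "MAE": mae(actual, forecast),
          "RMSE": rmse(actual, forecast)}
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE can rank them differently when a few large errors dominate.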

3 Experimental setup

The experiment was conducted on Google Colab using Python 3.0 with open-source libraries such as Pandas, NumPy and Keras, among others. The working environment was a Core i5 machine with 8 GB of RAM and a 2.8 GHz processor running the 64-bit Windows 10 operating system. All data analysis, the development of the forecasting models with the three deep learning algorithms (CNN, LSTM and CNN-LSTM) and the evaluation of the models were done on Google Colab.

4 COVID-19 trend analysis

4.1 COVID-19 infection and death cases trend of the entire world

Before forecasting the future occurrences of COVID-19 infection in Nigeria, South Africa and Botswana, we analyze the behavior of the pandemic in the entire world from 1st January, 2020 to 15th March, 2022, summing the numbers collected from all countries grouped by date. Figures 6 and 8 plot the new cases and new deaths in the entire world from 1st January, 2020 to 15th March, 2022.

Fig. 6 Timeline of the new COVID-19 infection cases in the world

In Fig. 6, the four (4) COVID-19 waves, namely the original variant, Beta, Delta and Omicron, can be seen. Figure 6 shows that the Omicron variant of concern has become the dominant one and that its peak was over 4 times higher than the former peak, which corresponds to the Beta wave. This phenomenon occurs mainly because the Omicron variant of concern is “infectious as measles spreading in a non-immune population, with a much shorter incubation time, therefore much faster doubling time”, as stated by Dr. Tom Frieden, a former Director of the U.S. Centers for Disease Control and Prevention and former Commissioner of the New York City Health Department. The Omicron variant of concern has been described as the fastest-spreading virus known to humankind, and the epidemiologist William Hanage likewise said that Omicron is certainly the most rapidly spreading virus among those investigated. Figure 7 shows the SARS-CoV-2 sequences by variant for South Africa, Botswana and Nigeria; Omicron has displaced the other variants of COVID-19, accounting for 99%, 99% and 97% of sequences in South Africa, Botswana and Nigeria respectively. The data on SARS-CoV-2 sequencing and variants in Fig. 7 are sourced from GISAID, a global science initiative that provides open access to genomic data of SARS-CoV-2.

Fig. 7 SARS-CoV-2 sequences by variants. Source: www.ourworldindata.org

Figure 8 depicts the timeline of deaths due to COVID-19 in the world. At the beginning of the emergence of the Omicron variant, there was a relatively small increase in the curve of new deaths, which led specialists to formulate many hypotheses, one standing out: is Omicron less severe than earlier variants? It is believed with high confidence that Omicron is relatively mild for most adults, and that COVID-19 vaccines provide high protection against serious illness caused by the Omicron variant of concern, which affects many African countries, especially Nigeria, South Africa and Botswana. However, these three African countries have had problems obtaining access to COVID-19 vaccines. Therefore, in this work we analyze in depth the behavior of the COVID-19 pandemic in Nigeria, South Africa and Botswana, in order to recommend a strategic template to these three countries for bringing the pandemic down with more certainty.

Fig. 8 Timeline of the new deaths due to COVID-19 in the world

4.2 Cases of infection, death and vaccination trend of COVID-19 in Nigeria, South Africa and Botswana

The infection, death and vaccination trends of COVID-19 in Nigeria, South Africa and Botswana were analyzed in depth. From the dataset, 9 features were selected for the three countries: date, total_cases, new_cases, total_deaths, new_deaths, people_vaccinated_per_hundred, people_fully_vaccinated_per_hundred, new_deaths_per_million and new_cases_per_million. From these we derived new features, namely moving averages of new_cases, new_deaths, new_deaths_per_million and new_cases_per_million, because these series have sudden peaks that may arise from counting mistakes, non-working days and massive testing, among other factors.
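A moving average of this kind can be derived with a rolling window in pandas. The sketch below is an illustration, not the authors' code: the helper name, the `_ma7` column suffix and the choice of `min_periods=1` are assumptions, while the 7-day window follows the 7-day moving average used in the comparative analysis later in the paper.

```python
import pandas as pd

# The four daily series that the text smooths.
SMOOTHED = ["new_cases", "new_deaths",
            "new_deaths_per_million", "new_cases_per_million"]


def add_moving_averages(df, window=7):
    """Append a trailing moving average column for each smoothed series."""
    out = df.copy()
    for col in SMOOTHED:
        # min_periods=1 keeps the first days instead of producing NaN.
        out[f"{col}_ma{window}"] = out[col].rolling(window, min_periods=1).mean()
    return out


# Made-up series with a single reporting spike of 70 on the seventh day.
demo = pd.DataFrame({col: [0, 0, 0, 0, 0, 0, 70, 0] for col in SMOOTHED})
smoothed = add_moving_averages(demo)
```

In the demo the isolated spike of 70 is spread out to a value of 10 per day over the window, which is the damping effect the derived features are meant to provide.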

4.2.1 Cases of infection, death and vaccination trend of COVID-19 in Nigeria

We analyzed the trends of COVID-19 infection cases, death cases and vaccination in Nigeria before forecasting the future occurrences of COVID-19 infection cases in the country. The analysis covers the period from 1st January, 2020 to 15th March, 2022. Figures 9, 10 and 11 depict the timelines of the new COVID-19 infection cases, the deaths due to COVID-19 and the percentage of people vaccinated in Nigeria over this period.

Fig. 9 Timeline of the new COVID-19 infection cases in Nigeria

Fig. 10 Timeline of the new deaths due to COVID-19 in Nigeria

Fig. 11 Timeline of the percentage of people vaccinated in Nigeria

From the timeline of the new COVID-19 infection cases in Nigeria in Fig. 9, it can be seen that the curve from January to April, 2020 is flat, which means there were few infections. From May to September, 2020 the curve went up, and from November, 2020 it rose again until March, 2021. From April, 2021 the curve went down until July, 2021, but from August, 2021 to early November, 2021 it went up. However, at the end of November, 2021, when the Omicron variant of concern was first reported in South Africa from the specimen collected there on 9th November, 2021, COVID-19 infection in Nigeria reached its highest level, and the curve rose to its highest peak in the period up to March, 2022.

From the timeline of deaths due to COVID-19 in Fig. 10, it can be seen that there were few deaths in Nigeria from January to March, 2020. From April, 2020 the death curve went up until October, 2020, and then went down until December, 2020. From January, 2021 the curve went up until April, 2021, but from June to November, 2021 the curve of deaths in Nigeria went up and down. However, at the end of November, 2021, when the Omicron variant of concern was first reported in South Africa from the specimen collected there on 9th November, 2021, the curve went up again and stayed elevated until March, 2022.

Figure 11 shows the timeline of the percentage of people vaccinated in Nigeria; it can be seen that few people were vaccinated. Throughout 2020 and up to April, 2021, no vaccinations were recorded in the country. Vaccination is visible only between May and June, 2021 and from November, 2021 to March, 2022. This is a setback and a worrisome situation for Nigeria.

4.2.2 COVID-19 infection cases, death cases and vaccination trend in South Africa

Likewise, we analyzed the trends of COVID-19 infection cases, death cases and the percentage of people vaccinated in South Africa before forecasting the future occurrences of COVID-19 infection cases in the country. The analysis covers the period from 1st January, 2020 to 15th March, 2022. Figures 12, 13 and 14 show the timelines of the new COVID-19 infection cases, the deaths due to COVID-19 and the percentage of people vaccinated in South Africa over this period.

Fig. 12 Timeline of the new COVID-19 infection cases in South Africa

Fig. 13 Timeline of the new deaths due to COVID-19 in South Africa

Fig. 14 Timeline of the percentage of people vaccinated in South Africa

Figure 12 shows the timeline of the new COVID-19 infection cases in South Africa. It can be seen that the curve from March to June, 2020 is flat, which means there were few infections. From July, 2020 the curve went up until September, 2020, when it went down. From December, 2020 the curve went down until March, 2021, but from May, 2021 it went up until November, 2021, when it went down. However, at the end of November, 2021, when the Omicron variant of concern was first reported in South Africa from the specimen collected on 9th November, 2021, COVID-19 infection in the country reached its highest level, and the curve rose to its highest peak in the period up to March, 2022.

Figure 13 shows the timeline of deaths due to COVID-19 in South Africa. It can be seen that there were few deaths from January to June, 2020. From June, 2020 the death curve went up and down until early November, 2021, when it went down. However, at the end of November, 2021, when the Omicron variant of concern was first reported in South Africa from the specimen collected on 9th November, 2021, the curve went up again and stayed elevated until March, 2022.

Figure 14 shows the timeline of the percentage of people vaccinated in South Africa; it can be seen that people were vaccinated in the country from March, 2021 up to the end of the study period (15th March, 2022).

4.2.3 Cases of infection, death and vaccination trend of COVID-19 in Botswana

The trends of infection cases, deaths and the percentage of people vaccinated for COVID-19 in Botswana were analyzed before forecasting the future occurrences of COVID-19 infection in the country. The analysis covers the period from 1st January, 2020 to 15th March, 2022. Figures 15, 16 and 17 show the timelines of the new COVID-19 infection cases, the deaths due to COVID-19 and the percentage of people vaccinated in Botswana over this period.

Figure 15 shows the timeline of the new COVID-19 infection cases in Botswana. It can be seen that the curve from April to October, 2020 is flat, which means there were few infections. However, at the end of November, 2021, when the Omicron variant of concern was first reported in South Africa from the specimen collected there on 9th November, 2021, COVID-19 infection in Botswana reached its highest level, and the curve rose to its highest peak in the period up to March, 2022.

Fig. 15 Timeline of the new COVID-19 infection cases in Botswana

Figure 16 shows the timeline of deaths due to COVID-19 in Botswana. It can be seen that there were few deaths from April, 2020 to January, 2021. From February, 2021 the death curve went up and down until October, 2021, and then went down until December, 2021. However, after the Omicron variant of concern was first reported in South Africa at the end of November, 2021, from the specimen collected there on 9th November, 2021, the curve in Botswana went up again and stayed elevated until March, 2022.

Fig. 16 Timeline of the new deaths due to COVID-19 in Botswana

Fig. 17 Timeline of the percentage of people vaccinated in Botswana

Figure 17 shows the timeline of the percentage of people vaccinated in Botswana; it can be seen that few people were vaccinated in the country. Throughout 2020 and up to July, 2021, no vaccinations were recorded. Vaccination appears only in August, 2021. This is a setback and a worrisome situation for the country.

4.3 Comparative analysis of COVID-19 new infection and death cases trend in Nigeria, Botswana and South Africa

A comparative analysis of the COVID-19 new infection and death case trends in Nigeria, South Africa and Botswana was conducted using the 7-day moving average of new cases per million. Figures 18 and 19 show the comparative analysis.

Fig. 18 Comparative analysis of COVID-19 Infection Cases Trend in Nigeria, South Africa and Botswana

Fig. 19 Comparative analysis of COVID-19 Death Cases Trend in Nigeria, South Africa and Botswana

In Fig. 18, the curves of new COVID-19 cases for South Africa and Botswana have broadly similar patterns that are shifted in time, whereas the curve of new COVID-19 cases in Nigeria has a different pattern. The comparative analysis of new COVID-19 infections shows that Botswana has the highest number of cases, followed by South Africa, while Nigeria has the fewest new COVID-19 cases among the three countries.

In Fig. 19, the curves of COVID-19 deaths for South Africa and Botswana have broadly similar patterns that are shifted in time, whereas the curve of COVID-19 deaths in Nigeria has a different pattern. The comparative analysis of COVID-19 deaths shows that South Africa has the highest number, followed by Botswana, while Nigeria has the fewest COVID-19 deaths among the three countries.

5 CNN-LSTM deep learning based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana

Three different learning-based forecasting models for COVID-19 infection were developed for each of Nigeria, South Africa and Botswana, with a forecast horizon of 10 days. The dataset was split into 80% training and 20% testing sets. The three models were developed with the CNN, LSTM and CNN-LSTM deep learning algorithms respectively. Figures 20, 21 and 22 show the CNN-LSTM based forecasts of COVID-19 infection cases in Nigeria, South Africa and Botswana respectively.
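The data preparation implied by this setup can be sketched as a chronological 80/20 split followed by sliding windows that pair a block of past days with the next 10 days. The function names and the 14-day input window are assumptions made for illustration; the text specifies only the split ratio and the 10-day horizon, and it does not state whether the split was chronological.

```python
import numpy as np


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series without shuffling (assumed ordering)."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, n_in, n_out=10):
    """Build (past window, next n_out days) pairs for supervised training."""
    X, y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        y.append(series[start + n_in:start + n_in + n_out])
    return np.array(X), np.array(y)


# Made-up series of 100 daily values; 14 input days is an assumed window.
series = np.arange(100, dtype=float)
train, test = chronological_split(series)
X_train, y_train = make_windows(train, n_in=14, n_out=10)
```

Keeping the test block strictly after the training block avoids leaking future values into training, which matters for any forecasting model evaluated on held-out data.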

Fig. 20 CNN-LSTM Based Forecasting COVID-19 Infection Cases in Nigeria

Fig. 21 CNN-LSTM Based Forecasting COVID-19 Infection Cases in South Africa

Fig. 22 CNN-LSTM Based Forecasting COVID-19 Infection Cases in Botswana

The CNN-LSTM based forecasting model in Fig. 20 shows that COVID-19 infection cases would continue rising in Nigeria. This could be due to the low vaccination rate in the country and to Omicron BA.2, which is known to be even more contagious than the original Omicron variant.

Likewise, the CNN-LSTM DL forecasting model in Fig. 21 shows that COVID-19 infection cases in South Africa would continue going down slowly and may rise again. This could be due to Omicron BA.2, which is known to be even more contagious than the original Omicron variant, and to the low vaccination rate in the country.

Meanwhile, the CNN-LSTM based forecast in Fig. 22 shows that new COVID-19 infection cases in Botswana would continue going down slowly and may rise again. This could be due to Omicron BA.2, which is known to be even more contagious than the original Omicron variant, and to the low vaccination rate in the country.

The performance of the three learning-based forecasting models for each country was evaluated using the four standard performance evaluation metrics: accuracy, MSE, MAE and RMSE. Table 1 shows the performance evaluation results of the three learning-based forecasting models of COVID-19 infection cases in Nigeria, South Africa and Botswana.

Table 1 Performance Evaluation Result of the Learning Based Forecasting Models for Infection Cases
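For reference, the three error metrics named above can be computed as in the following minimal sketch, which uses their standard definitions. The function names and the sample values are illustrative assumptions, not the paper's data. The source does not state how accuracy was computed for a regression forecast, so it is left out here.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error: average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: square root of the MSE, in the units of the data."""
    return math.sqrt(mse(actual, predicted))


# Invented values for illustration only.
confirmed = [3.0, 5.0, 2.0, 7.0]
forecast = [2.0, 5.0, 4.0, 8.0]

print(mse(confirmed, forecast))   # 1.5
print(mae(confirmed, forecast))   # 1.0
print(rmse(confirmed, forecast))  # about 1.2247
```

MSE and RMSE penalize large individual errors more heavily than MAE does, so comparing all three gives a fuller picture of how a model's errors are distributed.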

The overall performance of the CNN-LSTM, CNN and LSTM forecasting models was first assessed using the accuracy metric. The CNN-LSTM DL forecasting models of COVID-19 infection cases in Nigeria, South Africa and Botswana achieved the best accuracies of 98.30%, 97.60% and 97.74% respectively, compared to the CNN and LSTM DL based forecasting models of the same countries, as shown in Fig. 23. All the models converged at or before about 600 epochs, at which point they reached these accuracies.

Fig. 23 Accuracy Performance Evaluation Result of the Forecasting Models

Likewise, based on the MSE evaluation metric, the CNN-LSTM DL forecasting models of COVID-19 infection cases in Nigeria, South Africa and Botswana obtained the smallest error values compared to the CNN and LSTM DL based models, as shown in Fig. 24. In contrast, the LSTM DL based model of COVID-19 infection cases in South Africa obtained the highest MSE value, 0.54432.

Fig. 24 MSE Performance Evaluation Result of the Forecasting Models

With respect to the MAE evaluation metric, which measures how close the forecasted COVID-19 infection cases are to the confirmed cases, the CNN-LSTM DL forecasting models of Nigeria, South Africa and Botswana also obtained the smallest error values compared to the CNN and LSTM DL forecasting models, as shown in Fig. 25. In contrast, the LSTM DL model of COVID-19 infection cases in Botswana obtained the highest MAE value, 0.64321.

Fig. 25 MAE Performance Evaluation Result of the Forecasting Models

Likewise, based on the RMSE evaluation metric, which measures the standard deviation of the differences between the forecasted and the confirmed COVID-19 infection cases, the CNN-LSTM DL forecasting models of Nigeria, South Africa and Botswana also obtained the smallest error values compared to the CNN and LSTM DL forecasting models of the same countries. In contrast, the LSTM DL based model of COVID-19 infection cases in Nigeria obtained the highest RMSE value, 0.11321, as shown in Fig. 26.

Fig. 26 RMSE Performance Evaluation Result of the Forecasting Models

6 Results and discussion

The COVID-19 pandemic spreads exponentially and causes loss of life, plunging the world community, and African countries in particular, into an alarmingly difficult situation that has culminated in catastrophes such as economic recession, political instability and loss of jobs. With the emergence and spread of the new Omicron variant of concern, forecasting and identifying COVID-19 infection and death cases is vital for government at all levels. With knowledge of the spread at a particular point in time, governments can take swift action and formulate new policies and modalities to minimize the consequences of the COVID-19 pandemic for both the public health and economic sectors. Therefore, in this work, we combined the Convolutional Neural Network (CNN) learning algorithm with the Long Short-Term Memory (LSTM) learning algorithm to propose a hybrid learning algorithm, Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), and used it to forecast COVID-19 infection cases in Nigeria, South Africa and Botswana. We collected the world COVID-19 dataset from the “Our World in Data” organization and developed the models. Before developing the forecasting models, we analyzed the behavior of the pandemic in the entire world from 1st January, 2020 to 15th March, 2022, adding up the numbers collected from all countries grouped by date. The infection, death and vaccination trends of COVID-19 in three African countries, namely Nigeria, South Africa and Botswana, were also analyzed in depth.
The results show that the COVID-19 infection curve went up and down in all three countries until the end of November, 2021, when the Omicron variant was first reported. From that point, COVID-19 infections in all three countries appear likely to rise again, possibly to the highest levels ever in Nigeria and South Africa, though not in Botswana. Likewise, the curve of death cases went up and down in all three countries. Regarding the percentage of people vaccinated, it was found that few people were vaccinated in all three countries. A comparative analysis of the COVID-19 infection and death case trends in Nigeria, South Africa and Botswana was also conducted. The results show that Botswana has the highest number of new COVID-19 cases, followed by South Africa, while Nigeria has the fewest among the three countries. However, the comparative analysis of COVID-19 death cases shows that South Africa has the highest number of cases, followed by Botswana, while Nigeria has the fewest COVID-19 death cases among the three countries.

Following the above analysis, DL forecasting models of COVID-19 infection cases in Nigeria, South Africa and Botswana were developed for a 10-day horizon with the CNN, LSTM and CNN-LSTM DL algorithms respectively. The learning models were evaluated on the basis of four standard performance evaluation metrics: accuracy, MSE, MAE and RMSE. The CNN-LSTM DL based forecasting models not only achieved the best accuracy but also obtained lower MSE, MAE and RMSE values than the models developed with CNN and LSTM. According to the CNN-LSTM DL based forecasting models, COVID-19 infection cases in Nigeria would continue rising. This could be due to the low vaccination rate of the population in the country and the emergence of the Omicron variant of concern. Likewise, according to the models, COVID-19 infection cases in South Africa and Botswana would continue going down slowly and may possibly rise again. This could be due to the emergence of Omicron and the low vaccination rate of the population in these countries.

Therefore, the three countries (Nigeria, South Africa and Botswana) should take all necessary measures and formulate policies and strategies that would ensure total eradication of the COVID-19 pandemic. Public awareness of COVID-19, especially of COVID-19 vaccination, is also very important and should be taken very seriously at all levels of government in these countries. The governments of these countries should focus on longer-term testing and surveillance approaches that can be integrated into, and will strengthen, their national health systems.

7 Conclusion

In this work, we combined the CNN learning algorithm with the LSTM learning algorithm to propose a hybrid CNN-LSTM learning algorithm. The hybrid algorithm was used to build DL based forecasting models of COVID-19 infection cases in Nigeria, South Africa and Botswana for a 10-day horizon. Two other DL based forecasting models (CNN and LSTM) of COVID-19 infection cases in the same countries were developed for the same 10-day horizon. The models were thus limited to 10 days, a horizon that can be extended to one month or more in future work. The learning models were evaluated using four standard performance evaluation metrics: accuracy, MSE, MAE and RMSE. The CNN-LSTM DL based forecasting models not only achieved the best accuracies of 98.30%, 97.60% and 97.74% for Nigeria, South Africa and Botswana respectively, but also obtained lower MSE, MAE and RMSE values than the models developed with CNN and LSTM. According to the CNN-LSTM deep learning based forecasting models, COVID-19 infection cases in Nigeria would continue rising. Likewise, according to the models, COVID-19 infection cases in South Africa and Botswana would continue going down slowly and may possibly rise again. This could be due to the emergence of Omicron and the low vaccination rate of the population in these countries [39,40,41,42,43].