The Prediction Analysis of COVID-19 Cases Using ARIMA and KALMAN Filter Models: A Case of Comparative Study

Iyyanki, Murali Krishna; Prisilla, Jayanthi

doi:10.1007/978-981-15-8097-0_7

Part of the book series: Studies in Big Data (SBD, volume 80)

Abstract

Time series analysis is an important area of machine learning for analysis and prediction, and it includes many approaches for forecasting quantities that involve a time component. In this chapter, two approaches, the autoregressive integrated moving average (ARIMA) model and the KALMAN filter model, are demonstrated on COVID-19 data for India obtained from the Ministry of Health and Family Welfare Web site. The ARIMA model was found to perform better than the KALMAN filter model. ARIMA (1, 1, 0) gave an approximate value of 35,303 for May 1, 2020 with sigma equal to 199.32, whereas the state-space model and error model of the KALMAN filter generated a value of 33,116 with variance equal to 1356.18. The key purpose of the study is to estimate the number of hospital beds and nursing care beds needed for COVID-19 (CV-19) patients, so that the necessary arrangements for treatment can be made without delay. In recovered cases, the highest difference observed is 1153 on April 27, 2020, whereas the increase in reported cases is 2082 on April 28, 2020. The most cases are reported in Maharashtra, with a peak of 9915 (confirmed) and 1593 (recovered) on April 30, 2020. COVID-19 data were visualized using a geographic information system, with red marking danger, i.e., areas or states with a high number of COVID-19 cases, green marking normal, and blue marking a safe zone with no cases or single-digit case counts.

1 Introduction

In machine learning, time series analysis is a popular and standard task that can be performed with various models. The analysis of experimental data observed at different points in time leads to new and unique problems in statistical modeling and inference [1]. In this chapter, ARIMA and KALMAN filter models are discussed for predicting COVID-19 cases. Predicting events through a time sequence is referred to as time series forecasting: by analyzing historical trends, assumptions are made about future trends. Time series (TS) are used in every field from medicine to finance, business, inventory planning, and dynamic system theory. Modern applications of TS forecasting use computer technologies that include machine learning, artificial neural networks, support vector machines, and so on. One data scientist has remarked that “time series forecasting is something of a dark horse in data science.” According to Tealab [2], time series forecasting is a general problem of great practical interest in various disciplines. A TS carries information about the predictor variables that determine the dynamic behavior of a system. It is a sequence of values of a system y(t) over time, recorded as experimental values y(t₁), y(t₂), y(t₃), …, y(t_n) at time points t₀ < t₁ < … < t_n. The aim of the study is to use the predictions to determine how many hospital beds and nursing beds should be made available, so that delays and last-minute rushes are avoided. This would help healthcare centers to prepare and remain vigilant.

2 Predictive Modeling

Predictive modeling (PM) is the practice of using data and mathematics to predict outcomes with data models. Machine learning (ML) algorithms build a mathematical model from training data for prediction; they use statistical techniques to allow a computer to construct predictive models. PM draws on ML, pattern recognition, and data mining, and it includes much more than the tools and techniques for unveiling patterns within data. PM training is the process of developing a model in a way that makes it possible to understand and quantify the model’s prediction accuracy on future, yet-to-be-seen data. The primary aim of PM is to produce accurate predictions; the secondary aim is to interpret the model and understand how it works. The unfortunate reality is that as a model is pushed toward higher accuracy, it becomes more complex and harder to interpret [3]. PM may use curve and surface fitting, TS regression, and/or ML methods. In TS regression, for example, the key assumption is that the patterns in the past data will be repeated in the future [4]. In this work, a time series approach is carried out using the ARIMA and KALMAN filter models, and the predictive results for CV-19 were analyzed, showing that the ARIMA model gave results nearest to the confirmed cases in India. The objective of this prediction study is to understand the need for hospital beds and nursing care beds for CV-19 patients. The study helps to make the necessary arrangements in advance for the number of patients to be cared for.

3 Time Series Using COVID-19 Datasets

A time series (TS) is a series of data points listed in time order, most commonly a sequence taken at successive, equally spaced points in time. TS analysis encompasses methods for analyzing TS data to extract meaningful statistics and other characteristics of the data. TS forecasting uses a model to predict future values from previously observed values. The components of time series data are trend, seasonal variation, cyclical variation, and other irregular fluctuations.

In their case study on the analysis and modeling of CV-19, Elmousalami [5] applied single exponential smoothing (SES) to the datasets of international confirmed cases. Figure 1 shows the resulting SES graph, and the SES model is given by Eq. 1 as

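$$F_{t+1} = \left(1 - \alpha\right)F_{t} + \alpha\,D_{t} \tag{1}$$

where F_t is the forecast for period t, D_t is the observed value in period t, and α is the smoothing constant with 0 ≤ α ≤ 1.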
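The recursion in Eq. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in [5]: the function name, the choice to initialize the first forecast with the first observation, and the sample values are all assumptions made here for demonstration.

```python
from typing import List, Sequence


def single_exponential_smoothing(observed: Sequence[float], alpha: float) -> List[float]:
    """Apply Eq. 1, F_{t+1} = (1 - alpha) * F_t + alpha * D_t, to a series.

    Returns n + 1 forecasts for n observations: one forecast per observed
    period plus one for the next, not yet observed, period.
    """
    if len(observed) == 0:
        raise ValueError("observed must contain at least one value")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Assumption: the first forecast F_1 is set to the first observation D_1.
    forecasts = [float(observed[0])]
    for d_t in observed:
        # Blend the previous forecast with the latest observation.
        forecasts.append((1.0 - alpha) * forecasts[-1] + alpha * d_t)
    return forecasts


# Illustrative values only, not actual COVID-19 counts.
sample = [10, 20, 30]
print(single_exponential_smoothing(sample, alpha=0.5))
```

A larger α makes the forecast follow recent observations more closely, while a smaller α gives a smoother forecast that reacts more slowly to change.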

The results in Table 1 show that SES is the most accurate model for forecasting recovered cases of CV-19, with a mean absolute deviation (MAD) of 517.54, a mean square error (MSE) of 523335.16, a root mean square error (RMSE) of 723.42, and a mean absolute percentage error (MAPE) of 16.38%, compared with the moving average (MA) and weighted moving average (WMA) models.
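The four error measures named above can be computed as in the following sketch, which uses their standard textbook definitions. The function name and the sample numbers are hypothetical, and the exact formulas used in [5] are assumed to match these definitions rather than confirmed.

```python
import math
from typing import Dict, Sequence


def forecast_errors(actual: Sequence[float], forecast: Sequence[float]) -> Dict[str, float]:
    """Return MAD, MSE, RMSE and MAPE (in percent) for paired series."""
    if len(actual) == 0 or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)

    mad = sum(abs(e) for e in errors) / n   # mean absolute deviation
    mse = sum(e * e for e in errors) / n    # mean square error
    rmse = math.sqrt(mse)                   # root mean square error
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Illustrative values only, not the data behind Table 1.
print(forecast_errors(actual=[100, 200], forecast=[110, 180]))
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: the square root of 523335.16 rounds to 723.42.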

Table 1 Forecasting models for international confirmed cases [5]
