Revealing COVID-19 Data by Data Mining and Visualization

  • Conference paper
Advances in Intelligent Networking and Collaborative Systems (INCoS 2021)

Abstract

In the current era of big data, huge volumes of valuable data are generated and collected at a rapid velocity from a wide variety of rich data sources. Examples include disease and epidemiological data, such as privacy-preserving statistics on patients who suffered from epidemic diseases like coronavirus disease 2019 (COVID-19). Embedded in the huge volumes of COVID-19 data for the large numbers of COVID-19 cases around the world is implicit, previously unknown, and potentially useful information and knowledge, which can be discovered by data mining. As “a picture is worth a thousand words”, a pictorial representation further enhances this knowledge discovery process. Visualization of COVID-19 data helps users discover useful information and knowledge related to COVID-19 cases, such as popular features and their associative relationships. Moreover, visualization of the discovered knowledge helps users better understand and interpret it. Hence, in this paper, we present a data science solution that makes good use of both data mining and visualization for conducting data analytics and visual analytics of COVID-19 data to reveal important information and knowledge about COVID-19. Evaluation on real-life COVID-19 data demonstrates the effectiveness of our solution in revealing useful information and knowledge about COVID-19 by data mining and visualization.

Notes

  1. https://covid19.who.int/.

  2. https://coronavirus.jhu.edu/map.html.

  3. https://qap.ecdc.europa.eu/public/extensions/COVID-19/COVID-19.html.

  4. https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection.html.

  5. https://newsinteractives.cbc.ca/coronavirustracker/.

  6. https://www.ctvnews.ca/health/coronavirus/tracking-every-case-of-covid-19-in-canada-1.4852102.

  7. https://beta.ctvnews.ca/content/dam/common/exceltojson/COVID-19-Canada-New.txt.

  8. https://en.wikipedia.org/wiki/Template:COVID-19_pandemic_data/Canada_medical_cases.

  9. https://www150.statcan.gc.ca/n1/pub/13-26-0003/132600032020001-eng.htm.

  10. https://www.ctvnews.ca/health/coronavirus/tracking-variants-of-the-novel-coronavirus-in-canada-1.5296141.

  11. https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/.

Acknowledgments

This project is partially supported by NSERC (Canada) and the University of Manitoba.

Author information

Corresponding author

Correspondence to Carson K. Leung.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Leung, C.K., Kaufmann, T.N., Wen, Y., Zhao, C., Zheng, H. (2022). Revealing COVID-19 Data by Data Mining and Visualization. In: Barolli, L., Chen, HC., Miwa, H. (eds) Advances in Intelligent Networking and Collaborative Systems. INCoS 2021. Lecture Notes in Networks and Systems, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-030-84910-8_8
