Abstract
The COVID-19 pandemic has created a pressing need to integrate information from disparate sources to assist decision-makers. Social media is important in this respect; however, making sense of the textual information it provides, and automating the processing of large amounts of data, requires natural language processing methods. Social media posts are often noisy, yet they may provide valuable insights into the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could help clarify what social media may offer in this respect. To provide researchers and other interested parties with new knowledge on the symptoms, severity, and prevalence of COVID-19, this study aims to construct an end-to-end natural language processing pipeline for the triage and diagnosis of COVID-19 from patient-authored social media posts. The pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients’ posts using conditional random fields. In the next step, an unsupervised rule-based algorithm establishes relations between the concepts. The extracted concepts and relations are then used to construct two alternative vector representations of each post. These vectors are used separately to build gradient boosting machine learning models that triage patients into three categories and diagnose COVID-19. When the models are trained on human-labelled data, we report Macro- and Micro-averaged F1 scores in the range of 71–96 percent and 61–87 percent, respectively, for the triage and diagnosis of COVID-19. Our results show that comparable performance can be attained when the models are trained on predicted labels from the concept extraction and rule-based classifiers, which makes end-to-end machine learning possible.
We also compare the most common symptoms reported in another COVID-19 dataset with the most important features identified by our diagnostic machine learning models. We found that the most important features are not usually the most common ones. Our preliminary findings demonstrate that a machine learning pipeline can automatically triage and diagnose COVID-19 from natural language narratives, allowing us to provide further knowledge on the disease’s severity and prevalence through the lens of social media.
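The abstract reports both Macro- and Micro-averaged F1 scores, and the two can diverge sharply when the triage categories are imbalanced. The following is a minimal sketch of how the two averages are computed for a three-class problem. The class names (low, medium, high) and the toy labels are hypothetical and are not taken from the chapter's data; the function name f1_scores is likewise only illustrative.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted wrongly
            fn[t] += 1      # class t was missed

    def f1(tp_, fp_, fn_):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes, so every post counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical, imbalanced toy data: the rare "high" class is never predicted.
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"]
y_pred = ["low"] * 6 + ["medium", "medium", "low"] + ["low"]

macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro_f1:.3f}, micro F1 = {micro_f1:.3f}")
```

In this toy case the missed rare class pulls the macro average well below the micro average, which is one reason to report both figures when class sizes differ.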
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Gupta, R.K. (2022). Utilization of Digital Network Learning and Healthcare for Verbal Assessment and Counselling During Post COVID-19 Period. In: Hamdan, A., Hassanien, A.E., Mescon, T., Alareeni, B. (eds) Technologies, Artificial Intelligence and the Future of Learning Post-COVID-19. Studies in Computational Intelligence, vol 1019. Springer, Cham. https://doi.org/10.1007/978-3-030-93921-2_7
DOI: https://doi.org/10.1007/978-3-030-93921-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93920-5
Online ISBN: 978-3-030-93921-2
eBook Packages: Intelligent Technologies and Robotics (R0)