Abstract
A discharge summary contains voluminous information about the patient, such as history, symptoms, investigations, treatment, and medication. Although discharge summaries follow a general structure, they are still not structured in a way that clinical systems can process. Various natural language processing (NLP) and machine learning techniques have been applied to discharge summaries to extract information of interest. Text mining has been carried out on both public and private discharge summaries. This survey discusses the different tasks performed on discharge summaries and the existing tools that have been explored. The major dataset used in existing research is also discussed. A common outline of the system architectures used across various studies is explored. The major challenges in extracting information from discharge summaries are also detailed.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nair, P.C., Gupta, D., Devi, B.I. (2021). A Survey of Text Mining Approaches, Techniques, and Tools on Discharge Summaries. In: Gao, XZ., Tiwari, S., Trivedi, M., Mishra, K. (eds) Advances in Computational Intelligence and Communication Technology. Advances in Intelligent Systems and Computing, vol 1086. Springer, Singapore. https://doi.org/10.1007/978-981-15-1275-9_27
DOI: https://doi.org/10.1007/978-981-15-1275-9_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1274-2
Online ISBN: 978-981-15-1275-9
eBook Packages: Intelligent Technologies and Robotics (R0)