Knowledge-Based Approaches to Intelligent Data Analysis

  • Conference paper
Towards Digital Intelligence Society (DISA 2020)

Abstract

Big Data is one of the most fundamental phenomena shaping the digital society. It is crucial not only to collect and analyze vast amounts of data but also to do so intelligently. We believe this requires a suitable interplay between the knowledge already available in the given application domain (background knowledge) and the knowledge gained inductively from data by means of various data analysis techniques. We call this a knowledge-based approach to data analysis, or intelligent data analysis. In this chapter, we focus on two main types of the knowledge-based approach. We start by introducing the semantic modelling of data analytics processes, which can efficiently cover background knowledge available in explicit form. The main focus here is on the conceptualization of domain knowledge shared between the domain expert and the data scientist, and on the modelling of data mining workflows in order to achieve reproducibility and reusability. The second type is typical for medical applications, where most of the background knowledge tends to stay tacit. In such a situation, the human-in-the-loop approach is a way to perform data analysis intelligently. For both types of knowledge-based data analysis, specific case studies are presented to show how intelligent data analysis works in practice.

Acknowledgements

This work was supported by the Slovak Research and Development Agency under grants no. APVV-16-0213 and APVV-17-0550.

Author information

Corresponding author

Correspondence to Peter Bednár.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bednár, P., Paralič, J., Babič, F., Sarnovský, M. (2021). Knowledge-Based Approaches to Intelligent Data Analysis. In: Paralič, J., Sinčák, P., Hartono, P., Mařík, V. (eds) Towards Digital Intelligence Society. DISA 2020. Advances in Intelligent Systems and Computing, vol 1281. Springer, Cham. https://doi.org/10.1007/978-3-030-63872-6_4
