
How Scientists Are Brought Back into Science—The Error of Empiricism

A Critical Reflection on Automated Science

Part of the book series: Human Perspectives in Health Sciences and Technology ((HPHST,volume 1))

Abstract

This chapter contributes to a critical investigation of whether human-made scientific knowledge, and the scientist’s role in developing it, will remain crucial—or whether data-models automatically generated by machine-learning technologies can replace scientific knowledge produced by humans. Influential opinion-makers claim that the human role in science will be taken over by machines. Chris Anderson’s (2008) provocative essay, The End of Theory: The Data Deluge Makes the Scientific Method Obsolete, is taken as an exemplary expression of this opinion. The claim that machines will replace human scientists can be investigated from several perspectives (e.g., ethical, ethical-epistemological, practical, and technical). This chapter focuses on epistemological aspects concerning ideas and beliefs about scientific knowledge. The approach is to point out the epistemological views that support the idea that machines can replace scientists, and to propose a plausible alternative that explains the role of scientists and human-made science, especially in view of the multitude of epistemic tasks in practical uses of knowledge. Whereas philosophical studies of machine learning often focus on reliability and trustworthiness, this chapter focuses on the usefulness of knowledge for epistemic tasks. This requires distinguishing between epistemic tasks for which machine learning is useful and those that require human scientists. In analyzing Anderson’s claim, a kind of double stroke is made. First, it is made plausible that the fundamental presuppositions of empiricist epistemologies give reason to believe that machines will ultimately make scientists superfluous. Next, it is argued that empiricist epistemologies are deficient, because they neglect the multitude of epistemic tasks for which humans need knowledge that is comprehensible to them. The character of machine-learning technology is such that it does not provide such knowledge.

It will be concluded that machine learning is useful for specific types of epistemic tasks such as prediction, classification, and pattern-recognition, but for many other types of epistemic tasks—such as asking relevant questions, problem-analysis, interpreting problems as of a specific kind, designing interventions, and ‘seeing’ analogies that help to interpret a problem differently—the production and use of comprehensible scientific knowledge remains crucial.


Notes

  1.

    In this chapter, ‘theory’ is taken in a broad sense, encompassing different kinds of scientific knowledge such as concepts, laws, models, etc. The more general term ‘scientific knowledge’ encompasses different kinds of specific epistemic entities such as theories, models, laws, concepts, (descriptions of) phenomena and mechanisms, etc., each of which can be used in performing different kinds of epistemic tasks (e.g., prediction, explanation, calculation, hypothesizing, etc.).

  2.

    On the terminology used in this chapter: in the semantic view of theories, patterns in data are also called data-models (see section “Empiricist epistemologies”), which are mathematical representations of empirical data sets (e.g., Suppe 1974; McAllister 2007). This chapter adopts the term data-model in this sense. In machine learning textbooks, data-models are also referred to as mathematical functions. Abu-Mostafa et al. (2012), for instance, speak of the unknown target function f: X → Y, where X is the input space (the set of all possible inputs x) and Y is the output space (e.g., y1 is ‘yes’ for x1; y2 is ‘no’ for x2; etc.). The machine learning algorithm aims to find a mathematical function g that ‘best’ fits the data and that supposedly approximates the unknown target function f. Abu-Mostafa et al. call the function g generated by machine learning ‘the final hypothesis.’ Alpaydin (2010), on the other hand, uses the notions of model and function interchangeably. An example (Alpaydin 2010, 9) is predicting the price of a car based on historical data (e.g., past transactions). Let X denote the car attributes (i.e., properties considered relevant to the price of a car) and Y the price of the car (i.e., the outcome of a transaction). Surveying past transactions, we can collect a training data set, and the machine learning program fits a function to these data to learn Y as a function of X. For instance, the fitted function may be of the form y = w1·x + w0. In this example, the data-model is a linear equation, and w1 and w0 are the parameters (weight factors) whose values are determined by the machine learning algorithm to best fit the training data. Alpaydin (2010, 35) calls this equation ‘a single input linear model.’ Hence, in this example, the data-model includes only one property to predict the price of a car.
Notably, the machine learning program involves a learning algorithm, chosen by human programmers, that confines the space in which a data-model can be found – in this example, the learning algorithm assumes the linear equation, while the data-model consists of the linear equation together with the fitted values of the parameters (w0 and w1).
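The single-input linear model just described can be sketched in a few lines of Python. This is a minimal illustration, not Alpaydin's own code; the engine-power and price figures are made-up. The point is that the learning algorithm fixes the hypothesis form y = w1·x + w0, and fitting only determines the weights w0 and w1 from the training data.

```python
def fit_linear(xs, ys):
    """Closed-form least-squares fit for the single-input linear model
    y = w1 * x + w0 (the 'data-model' is the equation plus the fitted weights)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
         sum((x - mean_x) ** 2 for x in xs)
    w0 = mean_y - w1 * mean_x
    return w0, w1

# Hypothetical training set: one car attribute (engine power) vs. sale price.
xs = [60, 75, 90, 110, 130]
ys = [9000, 11000, 13500, 16000, 19000]

w0, w1 = fit_linear(xs, ys)

def predict(x):
    """The 'final hypothesis' g: apply the fitted data-model to a new input."""
    return w1 * x + w0
```

Note that the hypothesis space (all straight lines) is chosen by the human programmer; the algorithm only searches within it.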

  3.

    Current machine learning practices show that machine learning algorithms depend, in varying degrees, on our theoretical and practical background knowledge. Another option regarding Anderson’s assumptions, therefore, is that the current state of knowledge suffices for this purpose. Yet, in the context of this chapter, it will be assumed that he means that machine learning technology will eventually develop to the extent that such knowledge becomes superfluous in the construction of machine learning algorithms.

  4.

    The notion of epistemic opaqueness of a process has been introduced by Humphreys (2009, 618): “a process is epistemically opaque relative to a cognitive agent X at time t just in case X does not know at t all of the epistemically relevant elements of the process. A process is essentially epistemically opaque to X if and only if it is impossible, given the nature of X, for X to know all of the epistemically relevant elements of the process.”

  5.

    Frederick Suppe (1974, Chapter One) presents a comprehensive outline of the historical background to the so-called Received View, which developed from positivism to logical positivism (e.g., Carnap) and logical empiricism (e.g., Hempel).

  6.

    The clarifying phrase “to save the phenomena,” which captures the empiricist idea of how knowledge is obtained from data, was originally introduced by Duhem (2015/1908) and later adopted by, among others, Van Fraassen (1977, 1980) and Bogen and Woodward (1988).

  7.

    McAllister (2007) presents an in-depth technical discussion of how to find patterns in data (i.e., data-models). He argues that “the assumption that an empirical data set provides evidence for just one phenomenon is mistaken. It frequently occurs that data sets provide evidence for multiple phenomena, in the form of multiple patterns that are exhibited in the data with differing noise levels” (Ibidem, 886). McAllister (2007, 885) also critically investigates how researchers in various disciplines, including philosophy of science, have proposed quantitative techniques for determining which data-model is the best, where ‘the best’ is usually interpreted as ‘the closest to the truth,’ ‘the most likely to be true,’ or ‘the best-supported by the data.’ According to McAllister, these “[data-]model selection techniques play an influential role not only in research practice, but also in philosophical thinking about science. They seem to promise a way of interpreting empirical data that does not rely on judgment or subjectivity” (Ibidem, 885, my emphasis), which he disputes.
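McAllister's two points, that one data set can exhibit multiple patterns with differing noise levels, and that ranking them involves judgment, can be illustrated with a small sketch. This is my own toy example, not McAllister's: the same data support a one-parameter linear pattern and a richer pattern with lower residual noise, and which counts as 'best' flips depending on the penalty one chooses for extra parameters.

```python
import math

def sse(model, data):
    """Sum of squared residuals: the 'noise level' of a candidate data-model."""
    return sum((y - model(x)) ** 2 for x, y in data)

# Hypothetical data set: a linear trend with a small oscillation on top.
data = [(x, 2.0 * x + math.sin(x)) for x in range(10)]

linear = lambda x: 2.0 * x                           # 1-parameter pattern
linear_plus_wave = lambda x: 2.0 * x + math.sin(x)   # 2-component pattern

# Both patterns are 'exhibited in the data', but with differing noise levels.
noise_linear = sse(linear, data)            # nonzero residual
noise_wave = sse(linear_plus_wave, data)    # essentially zero residual

def score(noise, n_params, penalty=2.0):
    """A crude AIC-style selection score: fit plus a per-parameter penalty.
    The choice of penalty weight is exactly the kind of judgment call
    McAllister argues such techniques cannot eliminate."""
    return noise + penalty * n_params
```

With a small penalty the richer pattern wins; with a large penalty the linear pattern wins, so the 'best' data-model is relative to a chosen criterion rather than read off from the data alone.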

  8.

    Affirmative answers to these questions can be taken as an interpretation of Anderson’s position. Notably, even machine learning scientists and textbooks recommend that knowledge of any sort related to the application (e.g., knowledge of concepts, of relevant and irrelevant aspects, and of more abstract rules such as symmetries and invariances) be incorporated into the learning network structure whenever possible (Alpaydin 2010, 261). Abu-Mostafa (1995) calls this knowledge hints, which are properties of the target function that are known to us independently of the training examples – i.e., hints are auxiliary information that can be used to guide the machine’s learning process. The use of hints is tantamount to combining rules and data in the learning network structure – hints are needed, according to Abu-Mostafa, to pre-structure data sets, because without them it is more difficult to train the machine. In image recognition, for instance, there are invariance hints: the identity of an object does not change when it is rotated, translated, or scaled.
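One common way to implement an invariance hint is to add transformed copies of the training examples, so-called virtual examples. The sketch below is a toy illustration of this idea, not Abu-Mostafa's own construction: it encodes a rotation-invariance hint for tiny 2×2 binary 'images' by expanding each labeled example with its rotations.

```python
def rotate90(img):
    """Rotate a square binary image (a tuple of row-tuples) by 90 degrees."""
    return tuple(zip(*img[::-1]))

def with_rotation_hint(examples):
    """Encode the invariance hint 'rotation does not change the label' by
    adding all four rotations of each training image as virtual examples."""
    augmented = set()
    for img, label in examples:
        for _ in range(4):
            augmented.add((img, label))
            img = rotate90(img)
    return augmented

# One labeled example: a 'corner' mark in the top-left cell.
training = [(((1, 0), (0, 0)), "corner")]
augmented = with_rotation_hint(training)
# The hint yields four distinct images, all carrying the same label.
```

The hint itself (rotation invariance) is background knowledge supplied by humans; the machine only ever sees the augmented data.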

  9.

    Notably, ‘phenomena’ in the sense of Bogen and Woodward (1988) do not occur in this view. Rather than phenomena, as B&W claim, it is the model of data that mediates between the measured data and the model of the theory, which is a specific instantiation (interpretation, concretization) of the theory (see Schema 1).

  10.

    This claim only holds for anti-realist interpretations (as in Duhem and Van Fraassen) of the semantic view. Yet, the semantic view of theories also allows for realist interpretations of theories (e.g., Suppe 1989).

  11.

    In other work, I have explained, from a range of different philosophical angles, the crucial role of phenomena in ‘applied’ research practices and what this means for our philosophical understanding both of scientific knowledge and of the aim of science (Boon 2011, 2012a, b, 2015, 2017a, c, forthcoming). The idea that these application-oriented scientific research practices aim at scientific knowledge in view of epistemic tasks—tasks aimed at learning how to do things with (often unobservable, and even not yet existing) physical phenomena—has led to the notion of scientific knowledge as epistemic tool (Boon and Knuuttila 2009; Knuuttila and Boon 2011; Boon 2015, 2017b, c; see also Nersessian 2009; Feest 2010; Andersen 2012). The original idea of scientific knowledge (or, originally more narrowly, ‘scientific models’) as epistemic tools proposes to view scientific knowledge—such as descriptions, concepts, and models of physical phenomena—first of all as representations of scientists’ conceptions of aspects of reality, rather than representations in the sense of a two-way relationship between knowledge and reality (as in anti-realist empiricist epistemologies as well as in scientific realism). The point of this (anti-realist) view is that someone can represent her conception (comprehension, understanding, interpretation) of aspects of reality by means of representational means such as text, analogies, pictures, graphs, diagrams, mathematical formulas, and also 3D material entities. Notably, therefore, scientists’ conceptions of observable as well as unobservable phenomena—arrived at by intricate reasoning processes (creative, inductive, deductive, hypothetical, mathematical, analogical, etc.) that employ all kinds of available epistemic resources—can be represented. By being represented, scientists’ conceptions become epistemic constructs that are public and transferable.
Knuuttila and I have called these constructs epistemic tools, that is, conceptually meaningful tools that guide and enable humans in performing all kinds of different epistemic tasks.

References

  • Abu-Mostafa, Y. 1995. Hints. Neural Computation 7: 639–671. https://doi.org/10.1162/neco.1995.7.4.639.

  • Abu-Mostafa, Y.S., M. Magdon-Ismail, and H.-T. Lin. 2012. Learning from Data. AMLbook.com. ISBN 978-1-60049-006-4.

  • Alpaydin, E. 2010. Introduction to Machine Learning. Cambridge: The MIT Press.

  • Andersen, H. 2012. Concepts and Conceptual Change. In Kuhn’s The Structure of Scientific Revolutions Revisited, ed. V. Kindi and T. Arabatzis, 179–204. Routledge.

  • Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired Magazine, June 23. https://www.wired.com/2008/06/pb-theory/

  • Bogen, J. 2011. ‘Saving the Phenomena’ and Saving the Phenomena. Synthese 182 (1): 7–22. https://doi.org/10.1007/s11229-009-9619-4.

  • Bogen, J., and J. Woodward. 1988. Saving the Phenomena. The Philosophical Review 97 (3): 303–352. https://doi.org/10.2307/2185445.

  • Boon, M. 2011. In Defense of Engineering Sciences: On the Epistemological Relations Between Science and Technology. Techné: Research in Philosophy and Technology 15 (1): 49–71. http://doc.utwente.nl/79760/.

  • ———. 2012a. Scientific Concepts in the Engineering Sciences: Epistemic Tools for Creating and Intervening with Phenomena. In Scientific Concepts and Investigative Practice, ed. U. Feest and F. Steinle, 219–243. Berlin: De Gruyter.

  • ———. 2012b. Understanding Scientific Practices: The Role of Robustness Notions. In Characterizing the Robustness of Science After the Practical Turn of the Philosophy of Science, ed. L. Soler, E. Trizio, Th. Nickles, and W. Wimsatt, 289–315. Dordrecht: Springer, Boston Studies in the Philosophy of Science.

  • ———. 2015. Contingency and Inevitability in Science – Instruments, Interfaces and the Independent World. In Science as It Could Have Been: Discussing the Contingent/Inevitable Aspects of Scientific Practices, ed. L. Soler, E. Trizio, and A. Pickering, 151–174. Pittsburgh: University of Pittsburgh Press.

  • ———. 2017a. Measurements in the Engineering Sciences: An Epistemology of Producing Knowledge of Physical Phenomena. In Reasoning in Measurement, ed. N. Mößner and A. Nordmann, 203–219. London/New York: Routledge.

  • ———. 2017b. Philosophy of Science in Practice: A Proposal for Epistemological Constructivism. In Logic, Methodology and Philosophy of Science – Proceedings of the 15th International Congress (CLMPS 2015), ed. H. Leitgeb, I. Niiniluoto, P. Seppälä, and E. Sober, 289–310. College Publications.

  • ———. 2017c. An Engineering Paradigm in the Biomedical Sciences: Knowledge as Epistemic Tool. Progress in Biophysics and Molecular Biology 129: 25–39. https://doi.org/10.1016/j.pbiomolbio.2017.04.001.

  • ———. forthcoming. Scientific Methodology in the Engineering Sciences. Chapter 4 in The Routledge Handbook of Philosophy of Engineering, ed. D. Michelfelder and N. Doorn. Routledge.

  • Boon, M., and T. Knuuttila. 2009. Models as Epistemic Tools in Engineering Sciences: A Pragmatic Approach. In Philosophy of Technology and Engineering Sciences. Handbook of the Philosophy of Science, ed. A. Meijers, vol. 9, 687–720. Elsevier/North-Holland.

  • Chang, H. 2004. Inventing Temperature: Measurement and Scientific Progress. Oxford: Oxford University Press.

  • Craig, E. 1998. Duhem, Pierre Maurice Marie. In Routledge Encyclopedia of Philosophy, vol. 3, 142–145. London/New York: Routledge.

  • Da Costa, N.C.A., and S. French. 2003. Science and Partial Truth: A Unitary Approach to Models and Scientific Reasoning. Oxford: Oxford University Press.

  • Dai, W., et al. 2015. Prediction of Hospitalization Due to Heart Diseases by Supervised Learning Methods. International Journal of Medical Informatics 84: 189–197.

  • Duhem, P. 1954/[1914]. The Aim and Structure of Physical Theory. Princeton: Princeton University Press.

  • ———. 2015/[1908]. To Save the Phenomena: An Essay on the Idea of Physical Theory from Plato to Galileo. Trans. E. Dolland and C. Maschler. Chicago: University of Chicago Press.

  • Esteva, A., B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau, and S. Thrun. 2017. Dermatologist-level Classification of Skin Cancer with Deep Neural Networks. Nature 542: 115. https://doi.org/10.1038/nature21056.

  • Feest, U. 2010. Concepts as Tools in the Experimental Generation of Knowledge in Cognitive Neuropsychology. Spontaneous Generations 4 (1): 173–190.

  • Giere, R.N. 1988. Explaining Science. Chicago/London: The University of Chicago Press.

  • ———. 2010. An Agent-Based Conception of Models and Scientific Representation. Synthese 172 (2): 269–281. https://doi.org/10.1007/s11229-009-9506-z.

  • Glymour, B. 2002. Data and Phenomena: A Distinction Reconsidered. Erkenntnis 52: 29–37.

  • Hempel, C.G. 1962. Explanation in Science and Philosophy. In Frontiers of Science and Philosophy, ed. R.G. Colodny, 9–19. Pittsburgh: University of Pittsburgh Press.

  • ———. 1966. Philosophy of Natural Science. Englewood Cliffs: Prentice-Hall.

  • Hofree, M., J.P. Shen, H. Carter, A. Gross, and T. Ideker. 2013. Network-based Stratification of Tumor Mutations. Nature Methods 10: 1108. https://doi.org/10.1038/nmeth.2651.

  • Humphreys, P. 2009. The Philosophical Novelty of Computer Simulation Methods. Synthese 169 (3): 615–626. https://doi.org/10.1007/s11229-008-9435-2.

  • Knuuttila, T., and M. Boon. 2011. How Do Models Give Us Knowledge? The Case of Carnot’s Ideal Heat Engine. European Journal for Philosophy of Science 1 (3): 309–334. https://doi.org/10.1007/s13194-011-0029-3.

  • Kourou, K., et al. 2015. Machine Learning Applications in Cancer Prognosis and Prediction. Computational and Structural Biotechnology Journal 13: 8–17.

  • Lemm, S., et al. 2011. Introduction to Machine Learning for Brain Imaging. NeuroImage 56: 387–399.

  • Libbrecht, M.W., and W.S. Noble. 2015. Machine Learning Applications in Genetics and Genomics. Nature Reviews Genetics 16: 321–332.

  • Lima, A.N., et al. 2016. Use of Machine Learning Approaches for Novel Drug Discovery. Expert Opinion on Drug Discovery 11: 225–239.

  • Mayo, D.G. 1996. Error and the Growth of Experimental Knowledge. Chicago: University of Chicago Press.

  • McAllister, J.W. 1997. Phenomena and Patterns in Data Sets. Erkenntnis 47 (2): 217–228. https://doi.org/10.1023/A:1005387021520.

  • ———. 2007. Model Selection and the Multiplicity of Patterns in Empirical Data. Philosophy of Science 74 (5): 884–894. https://doi.org/10.1086/525630.

  • ———. 2011. What Do Patterns in Empirical Data Tell Us About the Structure of the World? Synthese 182 (1): 73–87. https://doi.org/10.1007/s11229-009-9613-x.

  • Mena, J. 2011. Machine Learning Forensics for Law Enforcement, Security, and Intelligence. Boca Raton: CRC Press.

  • Nersessian, N.J. 2009. Creating Scientific Concepts. Cambridge, MA: MIT Press.

  • Odone, F., M. Pontil, and A. Verri. 2009. Machine Learning Techniques for Biometrics. In Handbook of Remote Biometrics: Advances in Pattern Recognition, ch. 10, ed. M. Tistarelli, S.Z. Li, and R. Chellappa. London: Springer.

  • Olszewska, J.I. 2016. Automated Face Recognition: Challenges and Solutions. In Pattern Recognition – Analysis and Applications, ed. S. Ramakrishnan, 59–79. InTechOpen. https://doi.org/10.5772/62619.

  • Phua, C., et al. 2010. A Comprehensive Survey of Data Mining-Based Fraud Detection Research. https://arxiv.org/abs/1009.6119

  • Suppe, F. 1974. The Structure of Scientific Theories. 2nd printing 1979. Urbana: University of Illinois Press.

  • ———. 1989. The Semantic Conception of Theories and Scientific Realism. Urbana/Chicago: University of Illinois Press.

  • Suppes, P. 1960. A Comparison of the Meaning and Uses of Models in Mathematics and the Empirical Sciences. Synthese 12: 287–301.

  • Tcheng, D.K., A.K. Nayak, C.C. Fowlkes, and S.W. Punyasena. 2016. Visual Recognition Software for Binary Classification and Its Application to Spruce Pollen Identification. PLoS ONE 11 (2): e0148879. https://doi.org/10.1371/journal.pone.0148879.

  • Van Fraassen, B.C. 1977. The Pragmatics of Explanation. American Philosophical Quarterly 14: 143–150.

  • ———. 1980. The Scientific Image. Oxford: Clarendon Press.

  • ———. 2008. Scientific Representation. Oxford: Oxford University Press.

  • ———. 2012. Modeling and Measurement: The Criterion of Empirical Grounding. Philosophy of Science 79 (5): 773–784.

  • Van Liebergen, B. 2017. Machine Learning: Revolution in Risk Management and Compliance? The Capco Institute Journal of Financial Transformation 45: 60–67.

  • Woodward, J.F. 2011. Data and Phenomena: A Restatement and Defense. Synthese 182 (1): 165–179. https://doi.org/10.1007/s11229-009-9618-5.

  • Wuest, T., et al. 2016. Machine Learning in Manufacturing: Advantages, Challenges, and Applications. Production & Manufacturing Research 4 (1): 23–45.


Acknowledgements

This work was financed by an Aspasia grant (2012–2017, nr. 409.40216) of the Dutch National Science Foundation (NWO) for the project “Philosophy of Science for the Engineering Sciences.” I want to acknowledge Koray Karaca for critical suggestions and fruitful discussions; I am indebted to him for introducing me to the machine learning literature, which I have gratefully used in writing the introduction to this theme. I also wish to acknowledge Marta Bertolaso and her team for organizing the expert meeting (March 5–6, 2018, in Rome) on the theme “Will Science Remain Human? Frontiers of the Incorporation of Technological Innovations in the Biomedical Sciences,” and the participants of this meeting for their fruitful comments, especially Melissa Moschella.

Correspondence to Mieke Boon.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Boon, M. (2020). How Scientists Are Brought Back into Science—The Error of Empiricism. In: Bertolaso, M., Sterpetti, F. (eds) A Critical Reflection on Automated Science. Human Perspectives in Health Sciences and Technology, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-25001-0_4
