Abstract
This chapter critically investigates whether human-made scientific knowledge, and the scientist’s role in developing it, will remain crucial, or whether data-models automatically generated by machine-learning technologies can replace scientific knowledge produced by humans. Influential opinion-makers claim that the human role in science will be taken over by machines. Chris Anderson’s (2008) provocative essay, The End of Theory: The Data Deluge Makes the Scientific Method Obsolete, will be taken as an exemplary expression of this opinion. The claim that machines will replace human scientists can be investigated from several perspectives (e.g., ethical, epistemological, practical, and technical). This chapter focuses on epistemological aspects concerning ideas and beliefs about scientific knowledge. The approach is to point out epistemological views supporting the idea that machines can replace scientists, and to propose a plausible alternative that explains the role of scientists and human-made science, especially in view of the multitude of epistemic tasks in practical uses of knowledge. Whereas philosophical studies of machine learning often focus on reliability and trustworthiness, the focus of this chapter is on the usefulness of knowledge for epistemic tasks. This requires distinguishing between epistemic tasks for which machine learning is useful and those that require human scientists. In analyzing Anderson’s claim, the argument proceeds in two steps. First, it is made plausible that the fundamental presuppositions of empiricist epistemologies give reason to believe that machines will ultimately make scientists superfluous. Next, it is argued that empiricist epistemologies are deficient because they neglect the multitude of epistemic tasks for which humans need knowledge that is comprehensible to them. The character of machine learning technology is such that it does not provide such knowledge.
It will be concluded that machine learning is useful for specific types of epistemic tasks such as prediction, classification, and pattern-recognition, but for many other types of epistemic tasks—such as asking relevant questions, problem-analysis, interpreting problems as of a specific kind, designing interventions, and ‘seeing’ analogies that help to interpret a problem differently—the production and use of comprehensible scientific knowledge remains crucial.
Notes
- 1.
In this chapter, ‘theory’ is taken in a broad sense, encompassing different kinds of scientific knowledge such as concepts, laws, models, etc. The more general term ‘scientific knowledge’ encompasses different kinds of specific epistemic entities such as theories, models, laws, concepts, (descriptions of) phenomena and mechanisms, etc., each of which can be used in performing different kinds of epistemic tasks (e.g., prediction, explanation, calculation, hypothesizing, etc.).
- 2.
On the terminology used in this chapter. In the semantic view of theories, patterns in data are also called data-models (see section “Empiricist epistemologies”), which are mathematical representations of empirical data sets (e.g., Suppe 1974; McAllister 2007). This chapter will adopt the term data-model in this very sense. In machine learning textbooks, data-models are also referred to as mathematical functions. Abu-Mostafa et al. (2012), for instance, speak of the unknown target function f: X → Y, where X is the input space (the set of all possible inputs x), and Y is the output space (e.g., y1 is ‘yes’ for x1; y2 is ‘no’ for x2; etc.). The machine learning algorithm aims to find a mathematical function g that ‘best’ fits the data, and that supposedly approximates the unknown target function f. Abu-Mostafa et al. call the function g generated by machine learning ‘the final hypothesis.’ Alpaydin (2010), on the other hand, uses the notions of model and function interchangeably. An example (Alpaydin 2010, 9) is predicting the price of a car based on historical data (e.g., past transactions). Let X denote the car attributes (i.e., properties considered relevant to the price of a car) and Y be the price of the car (i.e., the outcome of a transaction). Surveying the past transactions, we can collect a training data set, and the machine learning program fits a function to this data to learn Y as a function of X. An example is when the fitted function is of the form y = w1·x + w0. In this example, the data-model is a linear equation, and w1 and w0 are the parameters (weight factors) whose values are determined by the machine learning algorithm to best fit the training data. Alpaydin (2010, 35) calls this equation ‘a single input linear model.’ Hence, in this example, the data-model fitted to the training data includes only one property to predict the price of a car.
Notably, the machine learning program involves a learning algorithm, chosen by human programmers, that confines the space in which a data-model can be found – in this example, the learning algorithm assumes the linear equation, while the data-model consists of the linear equation together with the fitted values of the parameters (w0 and w1).
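Alpaydin’s single-input linear model can be made concrete in a few lines of code. The following sketch fits y = w1·x + w0 to a toy training set by ordinary least squares; the car attribute (engine power) and the prices are invented for illustration and are not taken from Alpaydin (2010):

```python
import numpy as np

# Hypothetical training data: one car attribute x (engine power, kW)
# and the observed transaction price y (EUR). Illustrative values only.
x = np.array([50.0, 70.0, 90.0, 110.0, 130.0])
y = np.array([8000.0, 11000.0, 14500.0, 17000.0, 20500.0])

# The learning algorithm assumes the linear form y = w1*x + w0 and
# determines the parameter values by least squares. np.polyfit returns
# the coefficients ordered from highest degree down: [w1, w0].
w1, w0 = np.polyfit(x, y, deg=1)

def predict_price(attr):
    """The fitted data-model g: a linear function of the car attribute."""
    return w1 * attr + w0

print(f"fitted data-model: y = {w1:.2f}*x + {w0:.2f}")
print(f"predicted price for x = 100 kW: {predict_price(100.0):.2f}")
```

The example mirrors the point of the note: the human choice of the linear form confines the space of possible data-models, while the algorithm only fixes the parameter values w1 and w0.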
- 3.
Current machine learning practices show that machine learning algorithms are dependent in varying degrees on our theoretical and practical background knowledge. Therefore, another option regarding Anderson’s assumptions is that the current state of knowledge suffices for this purpose. Yet, in the context of this article, it will be assumed that he means to say that machine learning technology will eventually develop to the extent that such knowledge will become superfluous in the construction of machine learning algorithms.
- 4.
The notion of epistemic opaqueness of a process has been introduced by Humphreys (2009, 618): “a process is epistemically opaque relative to a cognitive agent X at time t just in case X does not know at t all of the epistemically relevant elements of the process. A process is essentially epistemically opaque to X if and only if it is impossible, given the nature of X, for X to know all of the epistemically relevant elements of the process.”
- 5.
Frederick Suppe (1974, Chapter One) presents a comprehensive outline on the historical background to the so-called Received View, which develops from positivism to logical positivism (e.g., Carnap) and logical empiricism (e.g., Hempel).
- 6.
- 7.
McAllister (2007) presents an in-depth technical discussion of how to find patterns in data (i.e., data-models). He argues that “the assumption that an empirical data set provides evidence for just one phenomenon is mistaken. It frequently occurs that data sets provide evidence for multiple phenomena, in the form of multiple patterns that are exhibited in the data with differing noise levels” (Ibidem, 886). McAllister (2007, 885) also critically investigates how researchers in various disciplines, including philosophy of science, have proposed quantitative techniques for determining which data-model is the best, where ‘the best’ is usually interpreted as ‘the closest to the truth,’ ‘the most likely to be true,’ or ‘the best-supported by the data.’ According to McAllister, these “[data-]model selection techniques play an influential role not only in research practice, but also in philosophical thinking about science. They seem to promise a way of interpreting empirical data that does not rely on judgment or subjectivity” (Ibidem, 885, my emphasis), which he disputes.
- 8.
Affirmative answers to these questions can be taken as an interpretation of Anderson’s position. Notably, even machine learning scientists and textbooks emphasize that knowledge of any sort related to the application (e.g., knowledge of concepts, of relevant and irrelevant aspects, and of more abstract rules such as symmetries and invariances) must be incorporated into the learning network structure whenever possible (Alpaydin 2010, 261). Abu-Mostafa (1995) calls this knowledge hints, which are the properties of the target function that are known to us independently of the training examples – i.e., hints are auxiliary information that can be used to guide the machine’s learning process. The use of hints is tantamount to combining rules and data in the learning network structure – hints are needed, according to Abu-Mostafa, to pre-structure data sets because without them it is more difficult to train the machine. In image recognition, for instance, there are invariance hints: the identity of an object does not change when it is rotated, translated, or scaled.
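One common way to encode an invariance hint of this kind is to augment the training data with transformed copies that all carry the same label, so that the rule “identity does not change under rotation” is built into the data the machine learns from. A minimal sketch, in which the 4×4 “image” and its label are invented placeholders:

```python
import numpy as np

# Toy "image": a 4x4 array standing in for the pixel data of some object.
image = np.arange(16).reshape(4, 4)
label = "object_A"  # the identity of the depicted object

# Rotation-invariance hint: the label must not change when the image is
# rotated. Encode the hint by adding rotated copies to the training set,
# each paired with the unchanged label.
training_set = [(np.rot90(image, k), label) for k in range(4)]

for rotated, lbl in training_set:
    # Every augmented example keeps the original label.
    assert lbl == "object_A"

print(f"{len(training_set)} training examples, all labeled {label!r}")
```

Translation and scaling hints can be encoded the same way, by pairing shifted or rescaled copies with the unchanged label; the hint itself is human background knowledge, supplied independently of the training examples.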
- 9.
Notably, ‘phenomena’ in the sense of Bogen and Woodward (1988) do not occur in this view. Rather than phenomena, as B&W claim, the model of data mediates between the measured data and the model of the theory, which is a specific instantiation (interpretation, concretization) of the theory (see Schema 1).
- 10.
This claim only holds for anti-realist interpretations (as in Duhem and Van Fraassen) of the semantic view. Yet, the semantic view of theories also allows for realist interpretations of theories (e.g., Suppe 1989).
- 11.
In other work, I have explained, in relation to a range of different philosophical issues, the crucial role of phenomena in ‘applied’ research practices and what this means for our philosophical understanding both of scientific knowledge and of the aim of science (Boon 2011, 2012a, b, 2015, 2017a, c, forthcoming). The idea that these application-oriented scientific research practices aim at scientific knowledge in view of epistemic tasks aimed at learning how to do things with (often unobservable, and even not yet existing) physical phenomena has led to the notion of scientific knowledge as epistemic tool (Boon and Knuuttila 2009; Knuuttila and Boon 2011; Boon 2015; Boon 2017b, c; also see Nersessian 2009; Feest 2010; Andersen 2012). The original idea of scientific knowledge (or, originally more narrowly stated, ‘scientific models’) as epistemic tools proposes to view scientific knowledge—such as descriptions, concepts, and models of physical phenomena—firstly as representations of scientists’ conceptions of aspects of reality, rather than representations in the sense of a two-way relationship between knowledge and reality (as in anti-realist empiricist epistemologies as well as in scientific realism). The point of this (anti-realist) view is that someone can represent her conception (comprehension, understanding, interpretation) of aspects of reality by means of representational means such as text, analogies, pictures, graphs, diagrams, mathematical formulas, and also 3D material entities. Notably, therefore, scientists’ conceptions of observable as well as unobservable phenomena, arrived at by intricate reasoning processes (creative, inductive, deductive, hypothetical, mathematical, analogical, etc.) that employ all kinds of available epistemic resources, can be represented. By representing, scientists’ conceptions become epistemic constructs that are public and transferable.
Knuuttila and I have called these constructs epistemic tools, that is, conceptually meaningful tools that guide and enable humans in performing all kinds of different epistemic tasks.
References
Abu-Mostafa, Y. 1995. Hints. Neural Computation 7: 639–671. https://doi.org/10.1162/neco.1995.7.4.639.
Abu-Mostafa, Y. S., Magdon-Ismail, M., and Lin, H.- T. (2012). Learning from data. AMLbook.com. ISBN 10:1-60049-006-9, ISBN 13:978-1-60049-006-4
Alpaydin, E. 2010. Introduction to Machine Learning. Cambridge: The MIT Press.
Andersen, H. 2012. Concepts and Conceptual Change. In Kuhn’s The Structure of Scientific Revolutions Revisited, ed. V. Kindi and T. Arabatzis, 179–204. Routledge.
Anderson, C. (2008). The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired Magazine, June 23. Retrieved from: https://www.wired.com/2008/06/pb-theory/
Bogen, J. 2011. ‘Saving the Phenomena’ and Saving the Phenomena. Synthese 182 (1): 7–22. https://doi.org/10.1007/s11229-009-9619-4.
Bogen, J., and J. Woodward. 1988. Saving the Phenomena. The Philosophical Review 97 (3): 303–352. https://doi.org/10.2307/2185445.
Boon, M. 2011. In Defense of Engineering Sciences: On the Epistemological Relations Between Science and Technology. Techné: Research in Philosophy and Technology 15 (1): 49–71. Retrieved from http://doc.utwente.nl/79760/.
———. 2012a. Scientific Concepts in the Engineering Sciences: Epistemic Tools for Creating and Intervening with Phenomena. In Scientific Concepts and Investigative Practice, ed. U. Feest and F. Steinle, 219–243. Berlin: De Gruyter.
———. 2012b. Understanding Scientific Practices: The Role of Robustness Notions. In Characterizing the Robustness of Science After the Practical Turn of the Philosophy of Science, ed. L. Soler, E. Trizio, Th. Nickles, and W. Wimsatt, 289–315. Dordrecht: Springer: Boston Studies in the Philosophy of Science.
———. 2015. Contingency and Inevitability in Science – Instruments, Interfaces and the Independent World. In Science as It Could Have Been: Discussing the Contingent/Inevitable Aspects of Scientific Practices, ed. L. Soler, E. Trizio, and A. Pickering, 151–174. Pittsburgh: University of Pittsburgh Press.
———. 2017a. Measurements in the Engineering Sciences: An Epistemology of Producing Knowledge of Physical Phenomena. In Reasoning in Measurement, ed. N. Mößner and A. Nordmann, 203–219. London/New York: Routledge.
———. 2017b. Philosophy of Science in Practice: A Proposal for Epistemological Constructivism. In Logic, Methodology and Philosophy of Science – Proceedings of the 15th International Congress (CLMPS 2015), ed. H. Leitgeb, I. Niiniluoto, P. Seppälä, and E. Sober, 289–310. College Publications.
———. 2017c. An Engineering Paradigm in the Biomedical Sciences: Knowledge as Epistemic Tool. Progress in Biophysics and Molecular Biology 129: 25–39. https://doi.org/10.1016/j.pbiomolbio.2017.04.001.
———. forthcoming. Scientific methodology in the engineering sciences. Chapter 4. In The Routledge Handbook of Philosophy of Engineering, ed. D. Michelfelder and N. Doorn. Routledge. Publication scheduled for 2019.
Boon, M., and T. Knuuttila. 2009. Models as Epistemic Tools in Engineering Sciences: A Pragmatic Approach. In Philosophy of Technology and Engineering Sciences. Handbook of the Philosophy of Science, ed. A. Meijers, vol. 9, 687–720. Elsevier/North-Holland.
Chang, H. 2004. Inventing Temperature: Measurement and Scientific Progress. Oxford: Oxford University Press.
Craig, E. 1998. Duhem, Pierre Maurice Marie. In Routledge Encyclopedia of Philosophy: Descartes to Gender and Science, vol. 3, 142–145. London/New York: Routledge.
Da Costa, N.C.A., and S. French. 2003. Science and Partial Truth. A Unitary Approach to Models and Scientific Reasoning. Oxford: Oxford University Press.
Dai, W., et al. 2015. Prediction of Hospitalization Due to Heart Diseases by Supervised Learning Methods. International Journal of Medical Informatics 84: 189–197.
Duhem, P. 1954/[1914]. The Aim and Structure of Physical Theory. Princeton: Princeton University Press.
———. 2015/[1908]. To Save the Phenomena: An Essay on the Idea of Physical Theory from Plato to Galileo. Trans. E. Doland and C. Maschler. Chicago: University of Chicago Press.
Esteva, A., B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau, and S. Thrun. 2017. Dermatologist-level Classification of Skin Cancer with Deep Neural Networks. Nature 542: 115. https://doi.org/10.1038/nature21056.
Feest, U. 2010. Concepts as Tools in the Experimental Generation of Knowledge in Cognitive Neuropsychology. Spontaneous Generations 4 (1): 173–190.
Giere, R.N. 1988. Explaining Science. Chicago/London: The University of Chicago Press.
———. 2010. An Agent-Based Conception of Models and Scientific Representation. Synthese 172 (2): 269–281. https://doi.org/10.1007/s11229-009-9506-z.
Glymour, B. 2002. Data and Phenomena: A Distinction Reconsidered. Erkenntnis 52: 29–37.
Hempel, C.G. 1962. Explanation in Science and Philosophy. In Frontiers of Science and Philosophy, ed. R.G. Colodny, 9–19. Pittsburgh: University of Pittsburgh Press.
———. 1966. Philosophy of Natural Science. Englewood Cliffs: Prentice-Hall.
Hofree, M., J.P. Shen, H. Carter, A. Gross, and T. Ideker. 2013. Network-based Stratification of Tumor Mutations. Nature Methods 10: 1108. https://doi.org/10.1038/nmeth.2651. https://www.nature.com/articles/nmeth.2651#supplementary-information.
Humphreys, P. 2009. The Philosophical Novelty of Computer Simulation Methods. Synthese 169 (3): 615–626. https://doi.org/10.1007/s11229-008-9435-2.
Knuuttila, T., and M. Boon. 2011. How Do Models Give Us Knowledge? The Case of Carnot’s Ideal Heat Engine. European Journal for Philosophy of Science 1 (3): 309–334. https://doi.org/10.1007/s13194-011-0029-3.
Kourou, K., et al. 2015. Machine Learning Applications in Cancer Prognosis and Prediction. Computational and Structural Biotechnology Journal 13: 8–17.
Lemm, S., et al. 2011. Introduction to Machine Learning for Brain Imaging. NeuroImage 56: 387–399.
Libbrecht, M.W., and W.S. Noble. 2015. Machine Learning Applications in Genetics and Genomics. Nature Reviews Genetics 16: 321–332.
Lima, A.N., et al. 2016. Use of Machine Learning Approaches for Novel Drug Discovery. Expert Opinion on Drug Discovery 11: 225–239.
Mayo, D.G. 1996. Error and the Growth of Experimental Knowledge. Chicago: University of Chicago Press.
McAllister, J.W. 1997. Phenomena and Patterns in Data Sets. Erkenntnis 47 (2): 217–228. https://doi.org/10.1023/A:1005387021520.
———. 2007. Model Selection and the Multiplicity of Patterns in Empirical Data. Philosophy of Science 74 (5): 884–894. https://doi.org/10.1086/525630.
———. 2011. What do Patterns in Empirical Data Tell Us About the Structure of the World? Synthese 182 (1): 73–87. https://doi.org/10.1007/s11229-009-9613-x.
Mena, J. 2011. Machine Learning Forensics for Law Enforcement, Security, and Intelligence. Boca Raton: CRC Press.
Nersessian, N.J. 2009. Creating Scientific Concepts. Cambridge, MA: MIT Press.
Odone, F., M. Pontil, and A. Verri. 2009. Machine Learning Techniques for Biometrics. In Handbook of Remote Biometrics. Advances in Pattern Recognition, ch. 10, ed. M. Tistarelli, S.Z. Li, and R. Chellappa. London: Springer.
Olszewska, J.I. 2016. Automated face Recognition: Challenges and Solutions. In Pattern Recognition – Analysis and Applications, ed. S. Ramakrishnan, 59–79. InTechOpen. https://doi.org/10.5772/62619.
Phua, C., et al. (2010). A comprehensive survey of data mining-based fraud detection research. https://arxiv.org/abs/1009.6119
Suppe, F. (1974). The Structure of Scientific Theories (1979 second printing ed.). Urbana: University of Illinois Press.
———. 1989. The Semantic Conception of Theories and Scientific Realism. Urbana/Chicago: University of Illinois Press.
Suppes, P. 1960. A Comparison of the Meaning and Uses of Models in Mathematics and the Empirical Sciences. Synthese 12: 287–301.
Tcheng, D.K., A.K. Nayak, C.C. Fowlkes, and S.W. Punyasena. 2016. Visual Recognition Software for Binary Classification and Its Application to Spruce Pollen Identification. PLoS ONE 11 (2): e0148879. https://doi.org/10.1371/journal.pone.0148879.
Van Fraassen, B.C. 1977. The Pragmatics of Explanation. American Philosophical Quarterly 14: 143–150.
———. 1980. The Scientific Image. Oxford: Clarendon Press.
———. 2008. Scientific Representation. Oxford: Oxford University Press.
———. 2012. Modeling and Measurement: The Criterion of Empirical Grounding. Philosophy of Science 79 (5): 773–784.
Van Liebergen, B. 2017. Machine Learning: Revolution in Risk Management and Compliance? The Capco Institute Journal of Financial Transformation 45: 60–67.
Woodward, J.F. 2011. Data and Phenomena: A Restatement and Defense. Synthese 182 (1): 165–179. https://doi.org/10.1007/s11229-009-9618-5.
Wuest, T., et al. 2016. Machine Learning in Manufacturing: Advantages, Challenges, and Applications. Production & Manufacturing Research 4 (1): 23–45.
Acknowledgements
This work is financed by an Aspasia grant (2012–2017, nr. 409.40216) of the Dutch National Science Foundation (NWO) for the project “Philosophy of Science for the Engineering Sciences.” I want to acknowledge Koray Karaca for critical suggestions and fruitful discussions; I am indebted to him for introducing me to the machine learning literature, which I have gratefully used in writing the introduction to this theme. I also wish to acknowledge Marta Bertolaso and her team for organizing the expert meeting (March 5–6, 2018 in Rome) on the theme “Will Science Remain Human? Frontiers of the Incorporation of Technological Innovations in the Biomedical Sciences,” and the participants of this meeting for their fruitful comments, especially Melissa Moschella.
© 2020 Springer Nature Switzerland AG
Cite this chapter
Boon, M. (2020). How Scientists Are Brought Back into Science—The Error of Empiricism. In: Bertolaso, M., Sterpetti, F. (eds) A Critical Reflection on Automated Science. Human Perspectives in Health Sciences and Technology, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-25001-0_4
DOI: https://doi.org/10.1007/978-3-030-25001-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25000-3
Online ISBN: 978-3-030-25001-0