1 Introduction

1.1 On the Growing Popularity of Artificial Intelligence and Machine Learning

In recent years, projects related to Artificial Intelligence (AI) and Machine Learning (ML) have grown in number, both in science and in industry. Any developing research field sparks scientific interest on its own because much knowledge is yet to be explored (a quick Internet search even turns up websites proposing research ideas [1,2,3,4]). AI and ML have also proved very attractive to different markets because of how much value companies get, and are forecast to keep getting, from AI- and ML-related products. This is shown by the growing market size for these kinds of technologies, predicted to reach US$160 Bn by 2026 [5], and by how the stock markets for these fields have behaved positively in recent years and are predicted to keep doing so [6].

Solutions such as customer behavior analysis, chatbots, business forecasting tools and behavior-based cybersecurity systems, to name a few, have turned out to be highly profitable businesses [7, 8]. This in turn has made investment in these areas grow, to the point that it is now possible to find crowdfunding websites for these kinds of technologies [9].

This strong trend has led many of the world’s largest companies to actively develop AI and ML technologies. Interestingly enough, many of the companies leading the AI and ML markets are also active players in the Cloud Computing industry, often developing hybrid solutions that use AI and ML technologies on cloud services, which promotes even more investment in these fields [10].

Current market analyses forecast that AI and ML could lead to a weighted average of 1.7% across 16 different industries, as well as increase the economic output of those industries up to US$4 trillion by 2035. What’s more, at country level, AI and ML are forecast to double economic growth rates in the 12 countries sampled [11, 12]. It is also noteworthy that AI and ML are already creating new jobs, with industries requiring roles such as AI engineer, machine learning scientist and AI developer, among others [13].

AI and ML enjoy well-justified popularity in both the scientific and industrial communities. That being said, it is important to understand the difference between the two, which is explained next in Sect. 6.1.2. Later, Sect. 6.2 covers the different classification criteria used for ML algorithms, and Sect. 6.3 presents and explains the Unified Vision Proposal, followed by conclusions and suggestions for future work.

1.2 Defining Artificial Intelligence and Machine Learning

While the previous section mentioned AI and ML together, these are two different – but closely related – terms.

Although it is possible to oversimplify by stating that ML is a subset of the AI field, brief but proper definitions are provided and referenced as follows:

  • Artificial Intelligence (AI). AI represents a set of complex, cutting-edge technologies capable of interacting with their environment by simulating human intelligence [14], and is considered the core of the so-called “Fourth Industrial Revolution” [15].

  • Machine Learning (ML). ML is the optimization of performance in a certain task through computational means, following a certain criterion and using referential data and/or results from previous iterations [16]. ML arises from the need to tackle problems beyond the reach of traditional, hardcoded AI solutions. It is technically a specialized subset of AI, focused on real-world knowledge applied to machines capable of making “subjective” decisions [17].

In a broad sense, the basic machine learning process involves building an ML-based model by “training” the machine using referential data [18].
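The idea of optimizing performance against a criterion, iteration by iteration and using referential data, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the cited literature: the task (fitting a straight line), the criterion (mean squared error), the sample values, and the names `train` and `mean_squared_error`.

```python
# Reference ("training") data: inputs xs and their known answers ys.
# Assumption: illustrative values that follow y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mean_squared_error(w, b):
    """Criterion to optimize: average squared gap between prediction and data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps=2000, learning_rate=0.05):
    """Adjust w and b a little on every iteration, guided by the reference data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Each step builds on the result of the previous iteration.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
print(f"learned line: y = {w:.3f} * x + {b:.3f}")
print(f"error before training: {mean_squared_error(0.0, 0.0):.3f}")
print(f"error after training:  {mean_squared_error(w, b):.6f}")
```

The "trained" result is just the pair of numbers `w` and `b`. Nothing about the line was hardcoded; it was recovered from the reference data alone.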

2 Classifications and Selection Criteria in Machine Learning

2.1 Machine Learning Algorithm Classifications

Many authors concur in classifying ML algorithms based on a cognitive criterion, meaning that each ML algorithm belongs to a certain group depending on how it “learns”. This approach identifies three main categories: supervised learning, unsupervised learning and reinforcement learning [18, 19], although some authors reduce these categories to the first two [17, 20].

  • Supervised Learning. As the name suggests, these ML algorithms are “guided”. The guidance takes the form of referential data, usually called “target” data, which tells the algorithm what kind of data it must identify. The algorithm is then trained using that target data so that, once ready, it can tell whether it is being shown the target data or something else. This kind of algorithm is usually seen in tasks that require identifying what kind of input data (an image, for example) is being presented to the algorithm.

  • Unsupervised Learning. Unlike the previous type, algorithms in the unsupervised learning category do not have the help of target data, so they rely on identifying patterns and structures in the input data they work with. For example, an unsupervised ML algorithm given a set of pictures of pencils, apples and cars will, after iterating over that information, eventually be able to separate the pencils from the apples and the cars.

  • Reinforcement Learning. These ML algorithms work similarly to their counterpart in psychology. The result of a task is rewarded or penalized depending on whether the answer is right or wrong, so the algorithm learns from its previous experiences to answer correctly. Instead of using a target data set, it works with goals it aims to achieve. Some video games provide a good example of this kind of learning: a character needs to go from point A to point B and has many possible paths, but only one is optimal [21].
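The three categories above can be contrasted in a small, self-contained sketch. It is only an illustration under stated assumptions, not a method from the cited literature. The single "length in centimetres" feature and its values are hypothetical. A nearest-centroid rule stands in for supervised learning, k-means for unsupervised learning, and tabular Q-learning on a five-cell corridor for the point A to point B example.

```python
import random


# --- Supervised: labelled "target" data guide the learner -------------------
def fit_centroids(values, labels):
    """Average the training values of each label (nearest-centroid rule)."""
    grouped = {}
    for value, label in zip(values, labels):
        grouped.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in grouped.items()}


def classify(centroids, value):
    """Return the label whose centroid lies closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# --- Unsupervised: no labels, the learner looks for structure ---------------
def k_means(values, k, rounds=10):
    """Split values into k groups; the first k values seed the centres."""
    centres = list(values[:k])
    groups = []
    for _ in range(rounds):
        groups = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(centres[i] - value))
            groups[nearest].append(value)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups


# --- Reinforcement: rewards, not targets, guide the learner -----------------
def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn to walk from point A (state 0) to point B (the last state)."""
    rng = random.Random(seed)
    goal = n_states - 1
    actions = (-1, +1)  # step left or step right
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        state = 0
        while state != goal:
            action = rng.choice(actions)  # assumption: purely random exploration
            nxt = min(max(state + action, 0), goal)
            reward = 1.0 if nxt == goal else 0.0  # only reaching B is rewarded
            best_next = max(q[(nxt, a)] for a in actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    # Greedy policy: the action with the highest learned value in each state.
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in range(goal)}


# Hypothetical feature: object length in centimetres.
lengths = [18.0, 8.0, 450.0, 17.5, 7.5, 430.0, 19.0, 8.5, 470.0]
kinds = ["pencil", "apple", "car"] * 3

centroids = fit_centroids(lengths, kinds)
print(classify(centroids, 16.0))  # supervised: answers with a known label
print(k_means(lengths, k=3))      # unsupervised: groups, but no names for them
print(q_learning())               # reinforcement: preferred step in each state
```

The supervised learner can name what it sees because it was given names during training. The unsupervised learner can only report that three groups exist. The reinforcement learner is never shown the correct path: it infers it from the reward received on reaching point B.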

Other classification methods are based on the type of problem to be solved, the type of data to be handled, or even the type of statistical procedure required to achieve a solution. These strategies are closer to the decision criteria used to decide when to use a given algorithm, so we’ll cover them in the next section.

2.2 Criteria for Choosing the Right ML Algorithm

While ML algorithms can be very flexible, some are more suited to certain scenarios than others.

Current literature states that the following variables are to be considered when deciding which algorithm is to be used:

  • Data size: Some algorithms have longer execution times than others, so a very large dataset can rule out some options.

  • Data quality: Some algorithms rely heavily on the accuracy of the data presented to them (as is the case with supervised learning types). When the available data isn’t reliable enough, algorithms without this heavy dependency should be preferred.

  • Available time: Closely related to the data size variable. When confronted with a short deadline, some algorithms can be an actual obstacle to the research.

  • Data type: Discrete and continuous data are to be approached differently, thus the algorithm must be chosen with this variable in mind as well.
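One way to read the four variables above is as successive filters over a list of candidate algorithms. The sketch below shows that reading only. The `Candidate` fields, the `shortlist` function, the placeholder names `algorithm_a` to `algorithm_c`, and every number are hypothetical and do not describe real algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical profile of an algorithm, one field per selection variable."""
    name: str
    max_rows: int              # data size it handles in acceptable time
    needs_reliable_data: bool  # True if it depends heavily on accurate data
    hours_to_train: float      # rough time needed on the data at hand
    data_types: frozenset      # subset of {"discrete", "continuous"}


def shortlist(candidates, rows, data_is_reliable, hours_available, data_type):
    """Return the names of the candidates that pass all four criteria."""
    kept = []
    for c in candidates:
        if rows > c.max_rows:                             # data size
            continue
        if c.needs_reliable_data and not data_is_reliable:  # data quality
            continue
        if c.hours_to_train > hours_available:            # available time
            continue
        if data_type not in c.data_types:                 # data type
            continue
        kept.append(c.name)
    return kept


# Placeholder candidates with made-up profiles.
CANDIDATES = [
    Candidate("algorithm_a", 10_000, True, 0.5, frozenset({"discrete"})),
    Candidate("algorithm_b", 10_000_000, False, 6.0, frozenset({"discrete", "continuous"})),
    Candidate("algorithm_c", 1_000_000, False, 1.0, frozenset({"continuous"})),
]

print(shortlist(CANDIDATES, rows=500_000, data_is_reliable=False,
                hours_available=2.0, data_type="continuous"))
```

In practice these variables interact (a larger dataset lengthens training time, for instance), so a real selection would weigh them together instead of applying hard cut-offs.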

3 Classification Filtering and Unified Vision Proposal

As previously stated, current literature provides a myriad of classification names for ML algorithms, often redundantly, giving a new but similar name to something already classified under a different one.

We gathered all the definitions we found in the aforementioned literature, filtered out repeated results, and sorted them into a unified vision. Given the large number of concepts and relationships involved, the full scheme is presented in four parts, as shown in Figs. 6.1, 6.2, 6.3 and 6.4. We then sorted those relationships into a cleaner, shorter version, which is shown in Fig. 6.5.
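The filtering of repeated names can be pictured with a short sketch. It is an illustration under assumptions and not the procedure actually followed to build the figures: the `unify` function, the sample list of names and the alias table are all hypothetical.

```python
def unify(classifications, aliases):
    """Map every name to a canonical form and drop repeats, keeping first-seen order."""
    seen = set()
    unified = []
    for name in classifications:
        key = name.strip().lower()
        canonical = aliases.get(key, key)  # fall back to the cleaned name itself
        if canonical not in seen:
            seen.add(canonical)
            unified.append(canonical)
    return unified


# Hypothetical input: names as they might appear across different sources.
found = [
    "Supervised learning",
    "supervised learning",
    "Reinforced learning",
    "Reinforcement learning",
    "Unsupervised learning",
]
# Hypothetical alias table: variant name -> canonical name.
aliases = {"reinforced learning": "reinforcement learning"}

print(unify(found, aliases))
```

The alias table is where the judgment lies: deciding that two names denote the same category is a manual, literature-based decision that code can record but not make.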

Fig. 6.1 Raw scheme, part 1. (Source: Prepared)

Fig. 6.2 Raw scheme, part 2. (Source: Prepared)

Fig. 6.3 Raw scheme, part 3. (Source: Prepared)

Fig. 6.4 Raw scheme, part 4. (Source: Prepared)

Fig. 6.5 Filtered scheme. (Source: Prepared)

4 Conclusions and Future Work

This research went through different views on how to classify ML algorithms, as well as views on which factors to include when deciding which algorithm is the best option for a given scenario. We then found some common ground among the various views, from which we graphically described how each view integrates into a greater scheme. It was possible to filter out “repeated” views and, as a result, we produced a refined version of this graphic perspective in the form of a conceptual map, which we propose as a tool to contribute to a better, more structured understanding of ML.

Nevertheless, by the time our research finished, new literature had added even more views [22], with up to 14 different ML algorithm types [23], so this proposal still has room for improvement. Future work should review those new classification proposals in order to find a way to integrate them into this new, greater classification scheme.

Of course, as ML is a relatively young field, it might lead to new algorithms and classifications that are currently unexplored and that may or may not integrate seamlessly into this scheme, so our proposal might require deep revision.