1 Introduction

The recycling business has been widely discussed around the world and has attracted the attention of countries, industries, scholars and city managers during the last decade. Recycling used to cost developed countries much less, because before 2018 these countries simply exported their wastes to China. In 2018 China started the “Blue Sky” program to replace its “National Sword” policy. Since then, the permitted contamination rate has been set to the strictest level of 0.5%, and China’s imports have fallen by 99% for waste plastics and by 33% for mixed papers (Staub 2019). China’s decreased ability to process wastes places a huge burden on these developed countries. In England, more than half a million tons of plastics were burnt in 2019; in Australia, 1.3 million tons of domestic wastes that had previously been shipped to China were beyond the country’s ability to handle. Thus, waste processing has become an extra burden on governmental administrations and has also caused economic losses to recycling businesses in these countries. In the USA, the processing price fell from 36 to 4.69 dollars per ton, according to the National Waste & Recycling Association. The price fall is better understood from the way the recycling business works. In general, the government pays private recycling companies through contracts to deal with wastes, and then recovers its payment from the value of the recycled materials, of which it receives the larger portion. When the price of the wastes drops due to China’s recycling ban, recycling companies have to make 70% of their profits from landfilling and incineration rather than from turning wastes into renewable items in order to contain their costs. The UK Government Statistical Service reported that the UK recycling rate was 45.5% in 2017 and around 45% in 2018, while the rate in the USA was only 37.5% in 2017 (Brucker 2014).

The recycling landscape in developing countries is far from pleasant. Most of the jobs in recycling businesses are still done by highly exposed people and low-income families, who pick recyclable materials by hand out of mixed rubbish bins or in recycling centres. This not only increases their health vulnerability but also decreases the recycling percentage in the long run. The cost to people’s health largely erodes the value of the recycled materials. In addition, the quality of the recycled materials is not guaranteed because systematic separation is absent in these countries.

To improve the situation of the recycling business and unlock the full value of recycling, machine learning (ML), a branch of Artificial Intelligence (AI), stands out with its strong abilities in accurate classification (Ryan and Bernard 2006), image identification (Scavino et al. 2009) and data processing (Yogeswari et al. 2019). As its name implies, ML learns by itself from examples offered by humans rather than passively taking specific orders from them. By repeatedly collecting input information, it is able to sort the complex materials on the conveyor belt of a recycling centre and respond spontaneously to the recycling volume, as humans do. ML therefore helps greatly in reducing manpower, producing materials with a higher resale value, and adding flexibility to sorting facilities. Besides, increased computational power and the availability of larger data sets will make ML-based waste prediction possible at a municipal level.

One ML example on the market comes from Denver-based AMP Robotics (amprobotics.com), a leading company in the innovation of recycling businesses. It has developed the AMP Neuron software platform, which uses computer vision and machine learning. Its robots can recognize different colours, textures, shapes, sizes and patterns to identify the characteristics of target materials and then sort wastes as humans do. Robots have also been installed at the single-stream recycling plants in Sarasota, Florida, where they are able to pick 70–80 items a minute, twice as fast as a human worker and with greater accuracy (Bennett 2019). This is a great improvement in recycling efficiency for the period before dual-stream recycling takes the place of the single-stream approach that currently dominates the recycling business across the world.

In response to the market use of ML, academic studies of it have grown rapidly, as seen in the results of a preliminary bibliographic search of the Web of Science database. To ascertain the extent of ML coverage in the recycling literature, “machine learning” and “recycling” were used as the key words. Figure 1 is built on the query results retrieved from the Web of Science database. As shown in Fig. 1, ML publications grew rapidly, mainly during the period from 2005 to 2019, and the growing trend coincided with the popularity of ML (Ni et al. 2019). Fig. 1 also shows that the studies published between 2018 and 2019 made up half of all the ML publications and that growth was fastest in this period. The rapid growth was very likely related in part to the new waste disposal systems requested and the recycling bans issued by some governments, and in part to the development of recent vision techniques such as deep learning.

Fig. 1 The query results presented by Web of Science database

In the literature, a body of studies on the recycling business is based on ML algorithms of different kinds, such as Neural Networks (Gupta et al. 2019), Support Vector Machines (Mukherjee 2017) and k-Nearest Neighbour (Ghosh et al. 2019). However, ML has no clear-cut classification of its algorithms, mainly because of their number and their multiple variations. As a result, tailoring an ML algorithm to one’s needs becomes difficult and even confusing when a researcher is designing research questions in recycling studies. Researchers may also find it challenging to track the use of ML algorithms in the recycling business.

A close reading of the literature also shows several research gaps in the applications of ML in recycling business (Liu et al. 2017; Gundupalli et al. 2017). First, practitioners and researchers in this field lack explicit guidance about the literature they should read and the journals to which they should submit their studies. Meanwhile, the literature reviewed by previous studies in this field was generally focused on one specific area of a single industry, and a systematic review of the applications of ML in recycling business is lacking. Second, there is no statistical panorama of the overall usage of the ML algorithms that have already been applied in recycling business. The scarcity of such statistics leads practitioners and researchers to copy the ML algorithms adopted by previous researchers instead of turning to ones that are seldom used but might be more specific and helpful. Third, although most domains of recycling business were claimed in theory to be suitable for ML, only a few ML algorithms have been used in practice, and only in some domains of recycling business.

In consideration of the above-mentioned gaps, this paper conducts a comprehensive review of the academic literature on the applications of ML in recycling business. It aims, first, to explore the potential domains ignored in preceding studies, to help practitioners and researchers learn more about applying ML to the recycling business; second, to evaluate the current situation of ML applications in recycling business against the recommendations raised in previous studies; and finally, to identify the potential obstacles to ML and its future trends from the recycling standpoint.

In other words, this paper contributes to the knowledge of ML applications and their influence on recycling business by addressing the following three research questions: (1) What is the current status of ML applications in recycling business? (2) What ML algorithms have been applied in recycling business so far and what benefits are perceived? (3) What are the obstacles to and future directions of ML applications in recycling business?

In order to answer these questions, the following objectives are set:

  • To review and classify the literature on the applications of ML in recycling business in terms of publication year, influential authors, journals, and recycling types

  • To find out the ML algorithms employed practically and examine their benefits to provide the practitioners and researchers with an explicit guidance map

  • To depict the domains of recycling business that have adopted ML

  • To identify the possible obstacles and then explore more space for the use of the ML in recycling industries for the future practitioners and researchers

This paper is organized as follows. Sect. 2 explains the methodology used in analysing the articles obtained and the topics that have been specified. Section 3 discusses topic one, which is to classify the literature based on different variables. Section 4 (topic two) uses overall statistics to examine the ML algorithms in recycling business and their benefits, in order to discover both the popular and the under-focused (but potentially value-adding) ML algorithms that deserve the attention of practitioners and researchers. The domains of recycling business that have applied ML (topic three) are depicted in Sect. 5. Obstacles to ML applications in recycling business are discussed in Sect. 6, and future trends of applying ML (topic four) are also listed there.

2 Methodology

Up to the end of 2019, little literature offered a systematic analysis of ML applications in recycling business. To our knowledge, only two reviews (Liu et al. 2017; Gundupalli et al. 2017) addressed ML algorithms in recycling business. These two papers analysed the publications of different industries from 1991 to 2015 and from 1995 to 2015, respectively. Gundupalli et al. (2017) introduced the applications of 13 automatic sorting methods in recycling 6 types of municipal solid wastes, and ML algorithms such as Naïve Bayesian (NB), SVM, and NN were claimed to be used to classify metal and plastic wastes. Liu et al. (2017) mentioned NN among the future trends of optimization techniques used in the domain of composite recycling. However, both reviews lacked a systematic perspective on the applications of ML algorithms in recycling waste materials. Also, as only rubbish sorting and strategy optimization problems were discussed in these two reviews, a full picture of the domains where ML has been used in recycling business is still needed.

Other published studies reported ML applications in recycling business in their literature review sections. For example, Qi et al. (2018) suggested that regression trees, decision trees (DT) and boosting were able to predict the strength of recycled waste cemented paste; Meza et al. (2019) recommended ML algorithms such as decision trees, regression trees and boosting for predicting urban waste generation; Kannangara et al. (2018) reviewed NN and DT for municipal solid waste and claimed that the prediction rate could be raised as high as 72%. In general, most of these reviews were based on the authors’ own knowledge of the algorithms rather than on an overall statistical analysis of all the ML algorithms in application. Last but not least, as seen from the publication dates, most of the articles concerned were completed before 2017, and publications have surged since 2017. Thus, a new systematic review is required to cover the recent surge.

2.1 Timespan for literature analysed

This paper analyses the literature of 20 consecutive years, from 2000 to 2019. Starting the selection from 2000 follows the reasoning in Ni et al. (2019), namely that ML research became popular toward the end of the 1990s. The choice of 2000 as the starting year is also justified by the results of the bibliographic query shown in Fig. 1, in which the studies concerned were first retrieved from the database after 2000.

2.2 Literature search

The academic literature reviewed in this paper was acquired from the leading databases of journal articles, with the purpose of covering the relevant domains as comprehensively as possible. Nine scholarly databases in total were used: Science Direct, SCOPUS, Wiley Online Library, Springer Nature, JSTOR, Taylor & Francis, IEEE Xplore, Emerald, and Google Scholar.

2.3 Database query

Although the retrieval scheme varies across databases, the same basic method was adopted in each case: «(Title OR Abstract) CONTAINS (recycling business OR recycling) AND (Title OR Abstract) CONTAINS (Machine learning) AND (Timespan) = 2000–2019 AND (Article Type) = (Peer Reviewed)». The retrieval yielded 404 articles in total. Articles published in languages other than English were removed first, followed by news items from trade magazines and duplicate entries caused by overlap between the databases. After categorizing and analysing, 256 articles were left following a practical screening carried out by two independent researchers. Discrepancies in their judgement were settled in a group discussion joined by a third researcher. Following the group discussion, another 195 articles were removed because they did not focus on ML applications in recycling business, although they did match the retrieval scheme. For instance, some of the removed articles referred to ML briefly or incidentally but did not carry out research of any significance in this domain. In the end, 51 articles were obtained for analysis in this paper.
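The retrieval scheme above can be read as a filter over bibliographic records. The following is a minimal sketch under stated assumptions: the record fields (`title`, `abstract`, `year`, `peer_reviewed`, `language`) and the sample records are hypothetical, duplicates are detected by title only, and each real database exposes its own query syntax.

```python
# Illustrative only: the record fields and sample records below are hypothetical.
def matches_scheme(record):
    """Check one record against the retrieval scheme described in the text."""
    text = (record["title"] + " " + record["abstract"]).lower()
    return (
        "recycling" in text                    # also covers "recycling business"
        and "machine learning" in text
        and 2000 <= record["year"] <= 2019
        and record["peer_reviewed"]
    )


def screen(records):
    """Keep matching English-language records and drop duplicate titles."""
    seen_titles = set()
    kept = []
    for record in records:
        key = record["title"].strip().lower()
        if not matches_scheme(record) or record["language"] != "English":
            continue
        if key in seen_titles:                 # duplicate from overlapping databases
            continue
        seen_titles.add(key)
        kept.append(record)
    return kept


sample_records = [
    {"title": "Machine learning for plastic recycling", "abstract": "",
     "year": 2018, "peer_reviewed": True, "language": "English"},
    {"title": "Machine Learning for Plastic Recycling", "abstract": "",
     "year": 2018, "peer_reviewed": True, "language": "English"},   # duplicate
    {"title": "Recycling policy overview", "abstract": "No ML content.",
     "year": 2015, "peer_reviewed": True, "language": "English"},
    {"title": "Waste sorting", "abstract": "Machine learning in recycling plants.",
     "year": 1998, "peer_reviewed": True, "language": "English"},   # outside timespan
]
print(len(screen(sample_records)))
```

Such an automatic filter only mirrors the first retrieval step; the later relevance screening by researchers, as described above, is a manual judgement and is not modelled here.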

2.4 Analysis

Guided by the three research questions, this paper intends to remedy the shortage of systematic literature reviews on ML applications in recycling business. First, it presents an overview of reference sources for academics, e.g. types of journals and the most influential authors and countries. Second, a statistical analysis is done to describe the application scenarios of the ML algorithms. This analysis is motivated by the finding that the use of ML is strongly related to the authors’ own experience and knowledge and that no exact explanation is given for the choice of algorithm, although Koutschan (2015) identified 32 commonly used algorithms. The aim is to provide a guidance map of the commonly used and the underexplored algorithms in the previous literature. Besides, a framework of how ML algorithms can be applied to recycling business is developed in this paper. Thus, the rest of the paper is divided into four topics of discussion plus a conclusion as an extended discussion:

  • Discussion One: The general classification of literature

  • Discussion Two: The ML algorithms applied in recycling business and their characteristics

  • Discussion Three: The application domains of ML in recycling business and their benefits

  • Discussion Four: A matrix for examining how the ML algorithms and application domains are matched

  • Conclusion: A discussion of the current obstacles in ML application to recycling business and its future directions

3 Discussion one: descriptive analysis of the literature

All 51 articles were classified in accordance with the scheme mentioned above. The selected articles, drawn from 37 journals, were analysed by year of publication, country, journal title, author, and waste material type. This analysis provides a guidance map for future research on ML and its applications in recycling business by clarifying the chronological growth of ML over the years, the countries and journals that paid attention to ML theories and applications, the authors, and the classification of recycling material types.

3.1 Distribution of articles by the year of publication

Figure 2 depicts the number of publications per year and contextualizes the ML applications in recycling business. It clearly indicates that the articles published in this field increased sharply in the most recent years: those from 2017 onwards account for 66.67% of all the publications, with an average of 11.7 publications per year as against less than one per year before 2017. This situation was very likely triggered by the waste ban that started in China and sent a wave of “recycling revolution” across the world. Research on the topic increased in response.

Fig. 2 Distribution of articles by the year of publication

3.2 Countries

Figure 3 displays the broad coverage of publications on ML in recycling business across 25 countries around the world. Among the 51 selected articles, China, the USA and India lead the research with 13, 6 and 6 articles published, respectively. This tendency is consistent with the idea that a larger economy goes with more recycling requirements. The three countries together account for 38.4% of world GDP and 40.6% of the world population, which puts pressure on them to take a leading position in waste processing, including the recycling business. The countries following China, the USA and India are Iran, Canada, and Germany, which contributed a lot to building the theories concerning ML applications in recycling business. The remaining countries (or regions) listed in Fig. 3, such as the UK, Finland, Italy and Hong Kong, were also involved, but their contribution is relatively low. It is noted that not all the large economies are among the leaders in this field; it can also be seen from the number of countries that only a small proportion of the world is practising ML in recycling business. This might result from the demanding requirements of ML applications.

Fig. 3 The coverage of publications for ML over the world

3.3 Distribution of articles by journal

Journals of various disciplines (e.g. IS, IT, engineering, management, business, and networking) may publish articles on ML applications in recycling business. Table 1 lists the journals that published two or more relevant articles during the target period from 2000 to 2019. As shown in Table 1, Waste Management is the biggest publisher in this field so far, with 9 relevant papers; this journal has long focused on recycling research. Journal of Cleaner Production takes second place with 4 relevant papers; it is an international, transdisciplinary journal whose main concerns are research and practice in cleaner production, environment, and sustainability. Construction and Building Materials, Environmental Progress & Sustainable Energy, and Recycling each published 2 articles. The other 32 journals have only one each (see Appendix).

Table 1 Distribution of articles by journal

3.4 Co-citation network of the cited authors

Figure 4 shows the co-citation network of the cited authors. Since none of the first authors or corresponding authors was published more than once, the co-citation network of all the authors of the 51 publications was built with VOSviewer. As indicated in Fig. 4, some authors are very frequently cited and quite influential, being either top ML researchers or pioneers in applying ML to recycling business. For example, Abbasi, M. is an expert in wastewater treatment; Behnood, A. is one of the researchers who first predicted concrete strength with machine learning; Mullai, P. first applied the Adaptive Network-based Fuzzy Inference System (ANFIS) to wastewater production in 2011; Dubey, R. is the first expert to employ predictive analysis with big data in the green supply chain; and Fall, M. has registered patents in the recycling of WEEE. Meanwhile, Yann, L. is the founder of deep learning; Vapnik is the inventor of SVM; Haykin, S. is an expert in neural computation; and Breiman, L. has made important contributions to classification and regression trees, and to fitting regression trees to bootstrap samples. As seen in Fig. 4, the connections between the authors are normally single-directional. This implies that a contributing author network has not been formed yet and that the corresponding research foundation is not solid enough. In other words, the application of ML to recycling business is still a niche area.

Fig. 4 Co-citation network of the cited authors

3.5 Types of recycling materials

Research seems incomplete until it affects industrial practice (Dubey et al., 2017). The studies report that ML used in treating municipal waste greatly improves the prediction accuracy of waste generation and saves policy makers and recycling plants overtime work; this is especially true of Deep Learning, which is able to classify plastic types accurately with less labour. But which recycling materials have been effectively handled by ML so far?

It can be seen in Fig. 5 that urban and plastic waste processing take first and second place in applying ML, followed by the treatment of building and concrete materials. It is thought-provoking that only 12 types of recycling materials are studied across all 51 articles; ML applications in recycling business are thus rather limited in terms of the variety of recycling materials.

Fig. 5 Types of recycled materials

4 Discussion two: machine learning algorithms

From the selected literature, it is identified that 11 of the 32 commonly used ML algorithms have been applied to recycling business. The distribution of the 11 ML algorithms is presented in Fig. 6, based on how frequently they are discussed in the literature. Specifically, these ML algorithms (with their percentages) are Artificial Neural Network (ANN, 32%), Support Vector Machine (SVM, 20%), Deep Learning (DL, 15%), Decision Tree (DT, 9%), K-Nearest Neighbour (KNN, 6%), Random Forest (RF, 5%), Reinforcement Learning (RL, 4%), K-means (4%), Extreme Learning Machine (ELM, 2%), Logistic Regression (LR, 2%) and Naïve Bayesian (NB, 1%).

Fig. 6 Eleven ML algorithms applied to recycling business. ANN Artificial Neural Network, K-means K-means, ELM Extreme Learning Machine, SVM Support Vector Machine, KNN k-Nearest Neighbour, LR Logistic Regression, DL Deep Learning, RL Reinforcement Learning, NB Naïve Bayesian, DT Decision Tree, RF Random Forest
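As a quick consistency check, the shares reported for the 11 algorithms can be tallied. The sketch below only restates the percentages given in the text; the variable names are illustrative.

```python
# Percentages as reported in the text for the 11 ML algorithms.
shares = {
    "ANN": 32, "SVM": 20, "DL": 15, "DT": 9, "KNN": 6, "RF": 5,
    "RL": 4, "K-means": 4, "ELM": 2, "LR": 2, "NB": 1,
}

total = sum(shares.values())                           # should cover all mentions
top_three = sum(sorted(shares.values(), reverse=True)[:3])

print(f"total share: {total}%")
print(f"share of the three most discussed algorithms: {top_three}%")
```

The shares add up to 100%, and ANN, SVM and DL together account for about two thirds of the algorithm mentions.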


Artificial Neural Network (ANN) ANN is an ML algorithm invented in the 1980s that has gained great popularity over the past 30 years. It is a powerful algorithm used to predict nonlinear relationships. The underlying idea is to simulate the human brain by passing information along connected neurons. An ANN model normally includes three layers: input neurons, middle (hidden) neurons and output neurons. Each layer passes the information it receives on to the layers that follow it, ending with the output layer. This simple strategy gives ANN strong robustness as well as a strong nonlinear fitting ability. Kannangara et al. (2018) used ANN to predict the generation of regional solid wastes with parameters such as leaf and yard waste, household kitchen organics, and blue box waste. According to their report, ANN reaches an accuracy of 72% in terms of R², which is superior to traditional linear models.
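The three-layer structure described above can be illustrated with a single forward pass. This is a minimal sketch, not the model of any cited study: the weights, the input values and the choice of a tanh hidden activation are all assumptions made for illustration, and no training step is shown.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Pass inputs through one tanh hidden layer to a single linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical network: 3 input neurons, 2 hidden neurons, 1 output neuron.
w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b_hidden = [0.0, 0.1]
w_out = [1.2, -0.7]
b_out = 0.05

prediction = forward([1.0, 0.5, 2.0], w_hidden, b_hidden, w_out, b_out)
print(prediction)
```

The nonlinear activation in the hidden layer is what allows such a network to fit nonlinear relationships; with a purely linear hidden layer the whole model would collapse to a linear one.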


Support Vector Machine (SVM) SVM is an algorithm suitable for both classification and regression. It looks for the separating hyperplane that maximizes the distance to the closest points of each class. The distance between the hyperplane and the closest point of each class is defined as the margin (Cortes and Vapnik 1995). Thus, SVM produces classifications that maximize the margin. In this way, SVM can make up for some disadvantages of ML, since it has strong generalization ability and mathematical interpretability. Liu et al. (2019) used SVM to optimize the recycling process of wastepaper based on deinked-pulp (DIP) properties. Their research raised the recycling rate by 2.16% and raised the acceptable proportion of DIP properties from 56.07% to 100%. This was a huge success in terms of the amount of paper recycled every day. They attributed their success in applying SVM to the fact that the mathematical theory behind it was understandable and that SVM’s parameters could be optimized programmatically.
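The notion of the margin can be made concrete by comparing two separating hyperplanes on the same data. The sketch below does not train an SVM; it only computes the margin as defined above. The data points and the two candidate hyperplanes are hypothetical.

```python
import math


def decision_value(w, b, x):
    """Signed value of w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin(w, b, points):
    """Smallest distance from any point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(abs(decision_value(w, b, p)) / norm for p in points)


# Hypothetical 2-D samples of two classes.
positives = [(2.0, 2.0), (3.0, 3.0)]
negatives = [(0.0, 0.0), (-1.0, 0.0)]
points = positives + negatives

# Two hyperplanes that both separate the classes correctly.
margin_a = margin((1.0, 1.0), -2.0, points)   # x + y - 2 = 0
margin_b = margin((1.0, 0.0), -1.5, points)   # x - 1.5 = 0

print(margin_a, margin_b)
```

Both hyperplanes classify every sample correctly, but the first one leaves a wider margin, so a margin-maximizing method such as SVM would prefer it.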


Deep Learning (DL) DL is a new branch of ML developed in recent years (LeCun 2015). A DL network is the product of growing computational power and the development of algorithms, which together solve the coordination issues that arise when large-scale neural networks run together. It normally has from more than one up to hundreds of layers of neural networks and many artificial neurons. This computational capacity gives DL the ability to recognize images. DL has recently been found useful in recognizing materials in recycling business. Hayashi (2019) used ML to identify the labels of WEEE, which greatly accelerated recycling. DL has also added great value to electrical equipment such as cameras, cellphones and laptops.


Decision Tree (DT) DT is a supervised ML algorithm that is widely used in regression and classification tasks. There are three major constituents in a DT: root nodes, internal nodes and leaf nodes. The internal nodes represent tests on the values of the attributes, and each leaf node of the tree represents a class and its probability. This characteristic gives DTs great interpretability. DTs are often incorporated into other ML algorithms in recycling business. Gruber et al. (2019) put DTs into their ensemble models, using them to do the basic classification of 400 particles of 14 types of plastics. Their final results yielded an accuracy of 93.5%.
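The roles of the root, internal and leaf nodes can be sketched with a small hand-written tree. Everything in this example is invented for illustration: the feature names, thresholds, class labels and probabilities are hypothetical and are not taken from any cited study, and the tree is not learned from data.

```python
# Hypothetical tree: internal nodes test one attribute, leaves hold a class
# label together with its probability.
tree = {
    "feature": "density", "threshold": 1.0,
    "left": {"label": "class_A", "probability": 0.90},        # density <= 1.0
    "right": {
        "feature": "reflectance", "threshold": 0.4,
        "left": {"label": "class_B", "probability": 0.80},    # reflectance <= 0.4
        "right": {"label": "class_C", "probability": 0.95},
    },
}


def predict(node, sample):
    """Walk from the root to a leaf, following the test at each internal node."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"], node["probability"]


print(predict(tree, {"density": 0.9, "reflectance": 0.7}))
print(predict(tree, {"density": 1.3, "reflectance": 0.2}))
```

Because every prediction corresponds to one readable path of tests, the reasoning behind a classification can be inspected directly, which is the interpretability referred to above.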


K-means K-means is an unsupervised ML algorithm whose basic function is clustering. It divides the data into k clusters by minimizing the squared errors within each cluster (Lloyd 1982). K-means has the distinct advantage of being time-efficient owing to its low computational complexity. In recycling business, K-means is used for profiling recyclables. Niska and Serkkola (2018) used the standard k-means algorithm (Darken and Moody 1967) in combination with the self-organizing map (SOM) algorithm (Kohonen 1997) to cluster the data and calculate cluster-specific type profiles of waste generation. Their algorithm clearly increased profiling efficiency compared with manual profiling.
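The clustering step can be sketched as the plain alternating procedure usually attributed to Lloyd: assign each point to its nearest centroid, then move each centroid to the mean of its points. The data points and starting centroids below are hypothetical, and the fixed number of iterations is an assumption made to keep the sketch short.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and updating centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D observations forming two visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (8.0, 8.0), (8.2, 7.8), (7.8, 8.2)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

print(centroids)
```

Each iteration only needs distances between points and k centroids, which is why the method stays cheap even on large data sets.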


k-Nearest Neighbour (KNN) KNN is an ML algorithm that predicts the class or value of a sample from its K nearest neighbours in the training data. Unlike K-means, KNN is a supervised learning method, which makes it relatively stable in the face of noise. With KNN as a basic classifier, Abbasi and Hanandeh (2016) successfully predicted the monthly averages of waste quantities in the Logan City Council region, Queensland, Australia, in 2016 with an R² of 0.98; they also forecasted that the peak of waste in the region would come in 2020.
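A minimal sketch of KNN classification follows: the label of a query sample is decided by a majority vote among its k closest labelled samples. The training samples and labels are hypothetical, and squared Euclidean distance is an assumed choice of metric.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical labelled samples: (features, label).
train = [
    ((1.0, 1.0), "paper"), ((1.0, 2.0), "paper"), ((2.0, 1.0), "paper"),
    ((8.0, 8.0), "plastic"), ((8.0, 9.0), "plastic"), ((9.0, 8.0), "plastic"),
]

print(knn_predict(train, (2.0, 2.0)))
print(knn_predict(train, (7.0, 8.0)))
```

For a numeric target such as a monthly waste quantity, the vote would be replaced by an average of the neighbours' values.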


Random Forest (RF) RF is a commonly used ML algorithm for classification. As its name suggests, RF is an ensemble algorithm that creates multiple DTs and classifies new data by a voting process that gives the consensus of the trees. RF is often used as a baseline or benchmark for ML in recycling business. Arora et al. (2019) used RF as a comparison baseline in accurately predicting the performance of recycled aggregate concretes containing mineral admixtures.
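The voting idea can be sketched with a deliberately simplified forest. As assumptions for brevity, each "tree" here is a one-feature threshold rule (a stump) fitted on a bootstrap sample, and the random feature selection used by full RF implementations is omitted. The data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Fit a single-threshold rule on (x, label) pairs by minimizing errors."""
    labels = sorted({y for _, y in sample})
    xs = sorted({x for x, _ in sample})
    majority = Counter(y for _, y in sample).most_common(1)[0][0]
    if len(labels) == 1 or len(xs) == 1:       # nothing to split on
        return lambda x: majority
    best = None
    for left, right in zip(xs, xs[1:]):
        threshold = (left + right) / 2         # midpoint between neighbours
        for low in labels:
            for high in labels:
                errors = sum((low if x <= threshold else high) != y for x, y in sample)
                if best is None or errors < best[0]:
                    best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high


def fit_forest(data, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(data) for _ in data]) for _ in range(n_trees)]


def forest_predict(forest, x):
    """Classify by majority vote over all trees."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


# Hypothetical one-feature data set.
data = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
        (7.0, "high"), (8.0, "high"), (9.0, "high")]
forest = fit_forest(data)
print(forest_predict(forest, 1.5), forest_predict(forest, 8.5))
```

Individual stumps trained on unlucky bootstrap samples may be poor, but the vote across many of them is stable, which is one reason RF is a convenient baseline.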


Reinforcement Learning (RL) RL is a field of ML that emphasizes how to act on the basis of the environment so as to maximize the expected benefits. It was inspired by the behaviourist theory in psychology, that is, how an organism, stimulated by rewards or punishments from the environment, gradually forms expectations of the stimuli and develops the habitual behaviour that obtains the maximum benefit. RL has the advantage of generating near-optimal solutions to stochastic dynamic programs without requiring transition probabilities. Shah et al. (2010) used RL to solve the switching problem in the recycling environment. Their results showed that RL could cut the cost by 65%.
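The point that RL needs no transition probabilities can be illustrated with tabular Q-learning, one common RL method (the text does not state which method the cited study used). The four-state chain environment, the reward and the learning parameters below are hypothetical; the learner only observes sampled outcomes of `step` and never reads a transition model.

```python
import random

N_STATES, GOAL = 4, 3
ACTIONS = ("left", "right")


def step(state, action):
    """Hypothetical environment: move along a chain, reward 1 on reaching GOAL."""
    nxt = max(0, state - 1) if action == "left" else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from sampled transitions only."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(ACTIONS)          # purely exploratory behaviour
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The learned greedy policy moves toward the rewarding state from every position, even though the behaviour during learning was random.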


Extreme Learning Machine (ELM) ELM is a feed-forward ANN, and this structure helps ELM cut the calculation done by a normal ANN by half, which means most of its training is done in milliseconds, seconds, or minutes (Huang et al. 2006). ELM is thus able to analyse big data sets for recycling business within a short period of time. Xiao et al. (2019) used ELM as a classifier to separate construction waste. The proposed method could precisely identify 180 samples of 6 waste types: woods, plastics, bricks, concretes, rubbers, and black bricks.


Logistic Regression (LR) LR is an extended version of the classic Linear Regression Model (LRM) (Ghasri et al. 2016). LR is capable of a nonlinear description of inputs and outputs, because it employs a mathematical function to replace the constant coefficients in LRM. Thus, LR achieves better results when predicting with continuous data, and it has been used in waste forecasting in recycling business. Like RF, LR is often set as a baseline in ML research. Rutqvist et al. (2019) used LR as a nonlinear model with standard features; it reached an accuracy of 96.6%, beating the manually engineered model, whose accuracy was only 86.8%, in automated recycling for waste management.
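In its most common formulation, which is assumed here, LR passes a linear combination of the inputs through the logistic (sigmoid) function and is fitted by minimizing the log loss. The sketch below uses plain batch gradient descent on hypothetical one-feature data; the learning rate and number of epochs are illustrative choices.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: small x values belong to class 0, large ones to class 1.
xs = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(xs, ys)

print(sigmoid(w * 2.0 + b), sigmoid(w * 8.0 + b))
```

The output is a probability rather than a hard label, so a threshold (0.5 is the usual default) turns it into a classification.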


Naïve Bayesian (NB) NB is an ML algorithm based on Bayes' theorem. Unlike other ML algorithms, NB assumes that the variables are independent of each other given the class, so that the conditional probability of each variable can be calculated separately for each assumed class. This lets NB learn quickly and makes it well suited to real-time prediction. Grochowski and Tang (2009) used NB to optimize the disassembly of a city's obsolete products in a simulation of real-time conditions.
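Because of the independence assumption, training NB reduces to counting. The sketch below is a categorical NB with add-one (Laplace) smoothing, which is an assumed detail not discussed in the text; the features, labels and samples are hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_nb(samples):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)     # (label, position) -> value counts
    values = defaultdict(set)                 # position -> distinct values seen
    for features, label in samples:
        for i, value in enumerate(features):
            feature_counts[(label, i)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict_nb(model, features):
    """Pick the class with the highest log posterior under independence."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / total)
        for i, value in enumerate(features):
            # Add-one smoothing avoids zero probability for unseen values.
            numerator = feature_counts[(label, i)][value] + 1
            log_p += math.log(numerator / (count + len(values[i])))
        scores[label] = log_p
    return max(scores, key=scores.get)


# Hypothetical samples: (material, weight class) -> processing decision.
samples = [
    (("metal", "heavy"), "dismantle"), (("metal", "light"), "dismantle"),
    (("metal", "heavy"), "dismantle"), (("plastic", "light"), "shred"),
    (("plastic", "heavy"), "shred"),
]
model = fit_nb(samples)
print(predict_nb(model, ("metal", "light")), predict_nb(model, ("plastic", "heavy")))
```

Since fitting is a single pass of counting and prediction is a handful of multiplications, the model can be updated and queried quickly, which fits the real-time use described above.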

4.1 The distribution of the research focus through the years

Figure 7 further presents the distribution of research attention in each publication year. It shows that, among the 11 listed algorithms, ANN and SVM were the two most reported during the target period from 2000 to 2019. They appeared in over half (54.65%) of the 51 selected articles, while DL saw a surge in the last two years. Fig. 7 also shows an obvious publication peak in the period between 2018 and 2019. Compared with the other ML algorithms, ANN, SVM and KNN have been the predominant research focus across many publication years. It is worth noting that some studies employed more than one ML algorithm.

Fig. 7 The distribution of the research focus in each publication year

4.2 The Benefits

To provide a guidance map for researchers and practitioners in recycling business to evaluate the gains of applying ML, the 51 articles listed in this paper were analysed in terms of how the benefits of ML have been acknowledged. First, the articles were reviewed carefully to produce a list of benefits along with the ML algorithms discussed previously. The benefits matrix was then obtained and is laid out in Table 2. As seen in Table 2, each benefit is noted when it is mentioned by a reviewed article, and the majority of the articles reviewed in this paper acknowledged more than one benefit of ML.

Table 2 The benefits matrix of ML applications in recycling business

For a more in-depth analysis, the percentage of articles reporting each benefit is calculated, covering both conceptually focused and implementation-focused studies: 54.90% of the studies reported accurate prediction, 25.49% image recognition, 9.80% multiple features, 21.57% accurate classification, 5.88% hidden indicator mining, 23.35% less labour, and 9.80% optimization. Once again, the benefits included are those clearly declared in the articles, excluding those merely assumed as suggestions for further research in each reviewed article. This is to guarantee the quality and reliability of the analysis with fewer biases. It also has to be mentioned that the most covered benefit at present is the accurate prediction of waste generation, whereas image recognition and optimization are actually more needed in order to make a significant transformation in the waste recycling business.

Table 2 shows that many articles perceived more than one benefit from each ML algorithm. This suggests that the gains from applying ML in recycling business are interconnected across domains, so that multiple benefits can be attained simultaneously. For instance, when a better strategy is implemented, more labour can be saved. The benefits table can thus give researchers more confidence and serve as a guide map for exploring a broader scope of benefits of ML application in waste recycling.

5 Discussion three: the application domains of machine learning in recycling business

This section analyses the ML publications in recycling business according to Fig. 2. Six main domains of recycling business have benefited from ML applications, namely waste generation prediction, object recognition, condition evaluation, waste sorting, strategic planning, and robotics. They are introduced in detail as follows:

5.1 Waste generation prediction

Waste generation is an inevitable result of human activities. It is also a major challenge for the city managers in charge of recycling business, because both the total quantity and the generation pattern are extremely hard to predict when so many factors are at play. If waste generation can be predicted in advance, accurate information will help managers allocate the resources needed for waste disposal effectively. Traditional statistical methods struggle with large data sets, because adding new features to a data set may violate the rule that the sample size should be larger than the number of features. ML can overcome this weakness, help to uncover patterns in waste generation behaviour, and support the design of incentives that encourage recycling and composting. In this application domain, Kumar et al. (2018) used ANN, SVM and RF to predict the plastic generation of municipal solid waste; Kontokosta et al. (2018) employed ML algorithms to predict weekly and daily waste at the building level; Meza et al. (2019) carried out predictive analysis of urban waste generation with DT-based ML, SVM and ANN.
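The rule that the sample size should exceed the number of features can be illustrated numerically. The following is a minimal sketch under stated assumptions: the data matrix `X` is hypothetical (2 samples, 3 features), and ridge regularization is used here only as one well-known remedy from the ML toolbox, not as the method of any reviewed study. With fewer samples than features, the matrix X^T X in the ordinary least-squares normal equations is singular, so no unique solution exists; adding a small penalty to its diagonal makes it invertible again.

```python
# Why classical least squares breaks down with more features than samples,
# and how a ridge penalty (an assumed illustration) restores a unique solution.

def gram(X):
    """Return X^T X for a matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def add_ridge(M, lam):
    """Add the penalty lam to every diagonal entry of M."""
    n = len(M)
    return [[M[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]


# Hypothetical data: 2 samples (rows) described by 3 features (columns).
X = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]

G = gram(X)
det_plain = det3(G)                   # zero: the normal equations are singular
det_ridge = det3(add_ridge(G, 0.1))   # positive: a unique solution exists

print("det(X^T X)          =", det_plain)
print("det(X^T X + 0.1 I)  =", det_ridge)
```

Regularized models are only one way of handling such data sets; tree ensembles such as RF are another option among the algorithms reviewed here.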

5.2 Object recognition

Object detection has been a mature area of computer science for years (Arman et al. 1993). In recycling, however, the traditional way of recognizing objects relies on density, shape, and texture, using facilities such as the “star screen”, a series of concentric spinning metal disks that lift the lighter materials, or magnets that capture metal. This kind of recognition is usually not accurate, and the facilities require a large area. With the recent development of DL, ML-based object recognition has become achievable, and DL can recognize objects with high accuracy. Wang et al. (2019a, b) used R-CNN methods to locate scattered nails and screws for recycling. The success rate of their experiment was 100% and the repetition rate was over 12%. Xue et al. (2019) used convolutional neural networks to recognize the maturity of agricultural waste compost from images. The accuracy in their 4 experiments averaged 99.6%.

5.3 Waste sorting

The rapid growth of municipal waste has caused various problems, such as resource depletion and environmental pollution. Recycling is an important strategy for dealing with these problems, and sorting is a primary task in recycling. In general, waste sorting is highly automated, but some sorting tasks still rely on human judgment and manual picking. Workers monitor the quality of recovered materials and the separation of bulky waste, and they are exposed to risk through direct contact with the waste. The repetitive work is also a heavy burden on their health. ML can improve both quality control and the health conditions of workers by reducing their exposure. In the domain of waste sorting, Wang et al. (2019a, b) used SVM with the Relief algorithm to classify different kinds of plastic bottles by colour and material, reaching an accuracy as high as 94.7%. Xiao et al. (2019) used the random forest method and reported an accuracy of 100% in identifying 180 samples of 6 material types, including woods, plastics, bricks, concretes, rubbers and black bricks. Costa et al. (2018) developed a computer vision approach to separate glass, paper, metal and plastic. The models used in their study included pre-trained VGG-16 (VGG16), AlexNet, SVM, KNN, and RF, and reached an accuracy of 93% in discerning the waste materials.
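The accuracy figures quoted above are most commonly computed as the share of samples whose predicted class matches the true class. The sketch below assumes that definition; the labels are hypothetical and are not taken from any of the reviewed studies. A per-class recall is added because an overall accuracy can hide a material type that the sorter handles poorly.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Share of samples whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_recall(true_labels, predicted_labels):
    """For each true class, the share of its samples that were recognized."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical sorting run over eight items of four material types.
truth = ["glass", "paper", "metal", "plastic", "paper", "plastic", "glass", "metal"]
preds = ["glass", "paper", "metal", "paper", "paper", "plastic", "glass", "plastic"]

print("accuracy:", accuracy(truth, preds))
print("recall by material:", per_class_recall(truth, preds))
```

In this toy run the overall accuracy is acceptable while metal and plastic are each recognized only half of the time, which is the kind of detail a single accuracy value does not show.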

5.4 Condition evaluation

Composting can transform waste into stabilized and pollution-free materials. However, the condition and composition of the waste need to be tested before composted waste can be used safely. These biological and chemical tests take a long time and have to be done in laboratories, so a rapid and direct assessment method for condition evaluation is urgently needed. With the development of digital photography and computer vision technologies, DL can deliver end-to-end prediction in real time. In this domain, Xue et al. (2019) succeeded in developing a fast and easy method for predicting the maturity of agricultural waste compost. The accuracy of their proposed method was 99.7%, and the method also demonstrated the possibility of large-scale application. Liu et al. (2019) combined SVM with DL, and the combination raised the accuracy from 56 to 100% in wastepaper selection. Vu et al. (2019) assessed waste characteristics with an ANN model. Their assessment of the wastepaper condition resulted in mean absolute percentage errors ranging from 10.92 to 16.51% and therefore greatly reduced the purchase cost of wastepaper. Their study also proposed an intelligent model for scheduling the displacement of mixed wastepaper.
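The mean absolute percentage error mentioned above is the average of the absolute errors, each expressed as a percentage of the corresponding actual value. The sketch below assumes this standard definition; the measured and predicted values are hypothetical and are not data from Vu et al. (2019).

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Hypothetical laboratory measurements and model predictions.
measured = [12.0, 15.0, 10.0, 20.0]
estimated = [10.8, 16.5, 11.0, 18.0]

print(f"MAPE = {mape(measured, estimated):.2f}%")
```

Because every error is scaled by the actual value, the measure is comparable across quantities of different magnitude, but it cannot be used when an actual value is zero.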

5.5 Strategic planning

Strategic planning is a well-studied domain. In recycling business, a strategic plan requires all the stakeholders to serve the final goal of achieving an integrated sustainable waste management system. Strategic planning provides sustainable improvements to the practice of local waste management, because it can respond quickly to the ever-changing waste situation and keep the whole recycling process in pace with waste generation, contributing to a sanitary environment. ML can support strategic planning by monitoring more data at the same time and transferring the data synchronously. In this domain, Liu et al. (2002) built a recyclability assessment model based on an ANN; ANN can help overcome difficulties in recycling business by informing the choice among different product designs. Shah et al. (2010) designed a smart system for choosing the usage of materials based on the design. Tuncel et al. (2014) applied reinforcement learning to large-scale disassembly line balancing problems with uncertainty.

5.6 Robotics

Traditional recycling business depends on automatic sorting systems to pick out recyclables such as bottles, cans, and plastics. However, these systems are imperfect, so human workers are needed to catch the items the systems miss, yet fewer and fewer workers are willing to spend long hours handling the endless waste stream. Moreover, because outsourcing recycling to China was so cheap before 2018, many countries still apply single-stream recycling methods, which leads to highly contaminated recyclables such as greasy pizza boxes or shredded plastic bags. These highly contaminated wastes impair the operation and efficiency of automatic sorting systems. As a result, more human labour is required; otherwise, wrongly identified items may damage the facilities or interrupt the recycling stream. Robots have demonstrated superhuman classification speed with a low error rate. RoCycle, from the MIT Computer Science and Artificial Intelligence Laboratory, can achieve 63% accuracy using a soft PTFE "finger" that detects objects through fingertip sensors. However, as stated by Wang et al. (2019a, b), robotic sorters controlled by ML have not been well accepted because of job displacement and trust issues.

6 Discussion four: a matrix of ML algorithms in recycling domains

As presented in Table 3, the matrix of 11 frequently used ML algorithms in 6 recycling domains shows the following distribution features:

1) ML algorithms are applied very unevenly across the 6 recycling domains. Generation prediction, reported in 21 studies, is where ML is used most frequently; robotics comes second with 16 studies, followed by waste sorting with 15, condition evaluation with 14, strategic planning with 13, and object recognition with only 7.

2) As also shown in Table 3, there are 66 cross-blocks in the matrix between 11 ML algorithms and 6 recycling domains, and 30 of them are empty. Empty cross-blocks indicate that some recycling domains and some ML algorithms have never crossed paths. For example, condition evaluation has never adopted KNN, RL, K-means, ELM, LR, or NB, and NB has never been used in waste generation prediction, object recognition, condition evaluation, waste sorting, or strategic planning. The absence of certain ML algorithms from a recycling domain does not mean that those algorithms are infeasible in that domain, nor does it necessarily mean that they are not needed there. The absence may instead stem from an inadequate understanding of the connection between the ML algorithms and the domains. In other words, researchers and practitioners in recycling business might find it quite difficult to adapt the features of certain ML algorithms to the specific needs of some recycling domains.

Table 3 11 ML algorithms used in 6 recycling domains

The matrix also shows that the distribution of the ML algorithms across the 6 recycling domains is unbalanced: algorithms such as NN and SVM pile up in some recycling domains (particularly in generation prediction), while other domains, such as condition evaluation and object recognition, involve only a couple of ML algorithms.
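The cross-block counts quoted for Table 3 follow from simple arithmetic, sketched below. The helper `empty_cells` and the small `toy_matrix` are hypothetical illustrations: the zero entries agree with the absences named in the text, while the non-zero study counts are invented placeholders and are not values from Table 3.

```python
# Arithmetic behind the cross-block counts reported for Table 3.
N_ALGORITHMS = 11
N_DOMAINS = 6
EMPTY_BLOCKS = 30  # as reported in the text

total_blocks = N_ALGORITHMS * N_DOMAINS          # all algorithm/domain pairs
filled_blocks = total_blocks - EMPTY_BLOCKS      # pairs with at least one study
empty_share = 100 * EMPTY_BLOCKS / total_blocks  # percentage of untried pairs

print(f"{total_blocks} blocks, {filled_blocks} filled, {empty_share:.1f}% empty")


def empty_cells(matrix):
    """List the (domain, algorithm) pairs with no reported study."""
    return [
        (domain, algorithm)
        for domain, row in matrix.items()
        for algorithm, n_studies in row.items()
        if n_studies == 0
    ]


# Hypothetical excerpt of such a matrix; the non-zero counts are placeholders.
toy_matrix = {
    "condition evaluation": {"ANN": 2, "KNN": 0, "NB": 0},
    "waste sorting": {"ANN": 3, "KNN": 1, "NB": 0},
}

for domain, algorithm in empty_cells(toy_matrix):
    print(f"untried combination: {algorithm} in {domain}")
```

Listing the empty pairs in this way gives a direct inventory of the combinations that remain to be explored.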

7 A discussion of the current obstacles and future directions

Based on the analysis in Sects. 3, 4, 5, and 6, this section conducts a final discussion to address RQ1, RQ2, and RQ3. From this discussion, the obstacles in research on ML application to recycling business are derived, and future directions are recommended as solutions to these obstacles.

7.1 More about the current literature

As seen in Sect. 3, ML has already been applied to recycling business to a fair extent. Some countries, especially large economies with a large population, are actively involved in ML applications in recycling business, and the number of publications per year shows an obvious growth trend in recent years. However, both the research and the application are still quite limited, and so is the attention this field receives from journals. A co-author network has not yet formed: authors from recycling business have confined their studies to entry-level use of ML, while ML experts have only begun to step into recycling business. In total, no more than 12 types of materials have been reported in connection with ML use in their recycling process, and although 37 journals are covered in the review, merely 5 of them published more than one study in this area. Inadequate clarity about the nature of the ML algorithms may be the obstacle that has hindered researchers and practitioners in this area so far.


“More clarity” is needed

The ‘black box’ nature of ML algorithms means that the results they produce are inherently difficult to interpret. Although ML has achieved decent performance in recycling business, more interpretability is required before studies specialized in recycling business can be validated and published. The lack of interpretability may derive from an overly simple analysis of the feature extraction or classification performed by the ML algorithms: a ‘new’ feature or ‘new’ classifier is introduced in a study, but it is usually not well elaborated. Researchers and practitioners alike should rigorously consider the validity of ML algorithms before attempting to apply ML to a new context in recycling business.

7.2 More about the machine learning algorithms

Indeed, ML can ‘learn’ to classify recycling materials and to identify bottles, cans, plastics, and so on. In theory, ML can be taught to identify any material that a human can learn to identify, provided that it is given adequate learning samples of the defining features of the material type. ML can also be used extensively to predict recycling volume. ML algorithms are tolerant of a variety of input data sets, including those with more features than samples, missing data, and mixed-frequency data. As an alternative to traditional regression models, ML can capture nonlinear relationships between the input features. This can provide factory staff and city managers in recycling with important information for advance preparation. Moreover, ML can be developed to understand the mechanisms underlying the chain of the recycling business. In this way, some ML algorithms can use socio-economic data, in addition to recycling data, to predict the recycling processing time.
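The contrast between a traditional linear regression and an ML method on a nonlinear relationship can be sketched with a toy example. The data below are synthetic (a U-shaped curve), and the k-nearest-neighbours regressor is chosen only because KNN is among the algorithms reviewed in this paper; both errors are measured on the training points themselves, so the sketch illustrates fitting capacity rather than predictive performance on unseen data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def knn_predict(xs, ys, x, k=3):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Synthetic U-shaped relationship: y = x^2 on the grid -5.0, -4.5, ..., 5.0.
xs = [i / 2 for i in range(-10, 11)]
ys = [x * x for x in xs]

slope, intercept = fit_line(xs, ys)
line_mse = mse([slope * x + intercept for x in xs], ys)
knn_mse = mse([knn_predict(xs, ys, x) for x in xs], ys)

print(f"straight line: slope={slope:.3f}, training MSE={line_mse:.2f}")
print(f"3-nearest neighbours: training MSE={knn_mse:.2f}")
```

On this symmetric curve the fitted line is flat and misses the pattern entirely, while the neighbour-based model follows it closely; a fair comparison on real recycling data would require held-out test samples.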

However, as mentioned in Discussion Two, most of the research concentrates on SVM and NN, and more attention should be given to the other ML algorithms. As shown in Sect. 4.1, DL has been widely used in recycling business in the last two years of the review period, because DL is a useful tool for image recognition. DL can provide recycling business not only with the “eyes” to check the recycled materials, but also with stronger computing power to classify them. As pointed out in Sect. 4.2, most of the articles reviewed in this paper mentioned waste generation prediction as a benefit, but the benefits of image recognition, labour saving, and overall optimization of the recycling data have been highlighted much less, owing to the lack of better data.


“Better Data” are required

It can be seen in Discussion Two that 54.5% of the articles address waste generation prediction, simply because “volume” is the easiest feature to extract from the databases. However, recycling is a business that is normally sensitive to the material type, such as plastic, cement, or metal. Being unspecific about the recycling condition or logistics of the materials, and estimating the material type only roughly, diminishes the data quality of most of the research. In addition, the data recording systems available so far leave researchers and practitioners “feeling at a loss” when they begin looking for help in recycling business. Invalid training data cannot support verifiable generalization, which may be the major reason for the limited benefit reports mentioned above. It is also noteworthy that a new generalization should be benchmarked on more than one data set, because any flaw in a single data set will reduce its validity.

7.3 The correlation between the recycling domains and the ML algorithms

Section 5 introduces the 6 recycling domains that have applied ML and finds that the application is unevenly distributed across the domains. Although 21 of the 51 studies address the domain of generation prediction, only 6 of them deal with object recognition. Section 6 then provides a matrix that specifies the untried domains and the underdeveloped algorithms in the current research. From these findings, three future research directions are recommended:


More “combinations” are awaiting

As mentioned in Sect. 6, there is no valid justification for why certain algorithms have been adopted in most domains while others have not, which implies a large space for more “combinations” of algorithms and domains in future research. For example, the empty cross-blocks under ELM, LR, NB and RL in Table 3 suggest that these ML algorithms have been largely ignored and are potential candidates for future recycling business and relevant studies. In the real world, moreover, more recycling domains are said to depend heavily on the development of ML. Thus, Table 3 can be expanded to include possible domains such as street-collecting robots, intelligent rubbish bins, and recyclable detectors, and more algorithms such as principal component analysis, AdaBoost, and XGBoost.


“Overall optimization” is promising

Table 3 in Sect. 6 shows many blanks in the domain of strategic planning, implying a possible shortage of valid data collection. In fact, information about the frequency of an individual's dumping, the volume of each dumping, the waste types, and so forth can be collected by sensors and radio frequency identification chips, if guided and aided by ML algorithms. Moreover, ML can integrate the optimization of the 6 domains into instant door-to-door collection, a dynamic rewarding system for certain types of materials, logistic design, and so on, so that an overall optimization system for the whole recycling business can ultimately be achieved.


“Specific picking” is zooming in

Table 3 also shows that ML is applied far too little in the domain of object recognition, but as indicated in Fig. 7, DL has catalysed broader interest in recycling business in the last two years of the review period, particularly in pattern recognition and object detection. However, the application of DL is still in its relative infancy and has its limitations. Besides visual recognition, DL can be used to make predictions in more delicate tasks, for example, picking greasy pizza boxes out of clean boxes or other paper with food stains. Of course, DL achieves better prediction accuracy only if a larger data set is available.

8 Conclusion

In summary, this paper has reviewed 51 articles, published from 2000 to 2019, on ML applications in recycling business from 9 databases, addressed four research questions, and completed a descriptive analysis of the selected studies in terms of publication years, countries, journals, authors, and types of recycling materials. The results suggest that ML has been applied to a few domains within recycling business and has proved to be an irreplaceable tool for interpreting large data sets such as municipal recycling data. However, the findings also indicate that ML applications in recycling business are still a niche area.

Therefore, a large research space remains for further exploration by more researchers and practitioners from more countries. They may draw inspiration from the leading authors and journals listed in this paper, for example, to study more recycling materials than those dealt with in the 51 studies, and to explore potential applications of ML in more domains of recycling business with reference to the statistics on the 11 frequently used ML algorithms discussed in the paper, which should provide them with an explicit guidance map. To present the guidance map in more detail, the 6 recycling domains and three of their top benefits are depicted in Fig. 8, giving a quick view of the domains to which ML can be applied and the possible benefits of these applications. It is believed that this analysis, coupled with researchers' own perspectives and experiences, can explicitly guide future advancements in the recycling business against the background of the recycling ban.

Fig. 8 A quick view of possible ML applications in recycling business

Before concluding, the limitations of this paper need to be addressed. Firstly, although articles from the nine most frequently consulted databases in the Management and Information area were included, further studies should include articles from more sources. Secondly, the dynamic nature of online databases imposes a restriction: some articles that are in press take a long time to be indexed, although this paper covers all the representative studies available in the target fields at the time of writing. Lastly, this paper only collected articles published in English, so it might have overlooked studies published in other languages. Despite these limitations, and in keeping with its goal of providing general guidance on how ML algorithms can be applied appropriately and effectively to recycling business, this paper has addressed several research questions in the recycling business and outlined some of the major challenges in applying ML. It is considered that this paper has achieved reliable comprehensiveness and can help scholars establish new research directions and evaluate their practical applications of ML in recycling business.