Introduction

Population growth, urbanization, and industrialization are increasing the amount of waste produced worldwide. Over 84% of people in developed countries and almost 64% in developing countries will live in urban areas by 2050. According to studies, annual solid waste generation is expected to reach about 3.40 billion tons worldwide by 2050, with a management cost of around $635.5 billion. Waste management can also significantly impact the environment and public health. Thus, waste management is a global problem with varying environmental, economic, and social impacts. It encompasses the waste disposal process, the recycling of precious metals with reduced maintenance costs, and the environmentally friendly management of waste (Kaza et al. 2018).

Waste management is often ineffective because of the many kinds of waste involved, such as electronic waste, municipal solid waste, biomedical waste, industrial waste, building and demolition materials, hazardous waste, and agricultural waste. Human and environmental health are harmed when waste is improperly disposed of. Failure to prioritize recycling has a negative financial impact and wastes valuable natural resources. Recycling facilities presently have to sort waste manually and use many large filters to separate particular objects as part of the recycling process. Consumers may also be confused about how to dispose of the diverse range of packaging materials. We therefore consider an automated trash sorting system necessary. Because staff members do not always sort materials accurately, such a system could improve the efficiency of processing operations and reduce waste, benefiting both the environment and the economy. Thus, trash classification can make recycling and garbage disposal easier and help to better utilize and conserve resources (Hoornweg and Bhada-Tata 2012; Cheng et al. 2023).

Automating the waste classification process can increase recycling efforts. Recycling waste efficiently benefits the environment and the economy: it can aid in recovering rare resources, conserving energy, minimizing greenhouse gas emissions and water pollution, reducing the need for new landfills, etc. In developing countries, scrap collectors and collection agencies perform household separation and sell repurposed materials for a profit. In developed countries, communities are more involved in reuse and recycling initiatives, and various methods for automatic garbage sorting are available, including mechanical and chemical sorting. However, even in developed nations, there is much room for improvement in garbage recycling and classification (Williams 2005; Kollikkathara et al. 2009; Al-Salem et al. 2009; Ni et al. 2023). Waste classification can now be done more quickly and accurately thanks to advances in deep learning and image processing algorithms. Deep learning (DL) is a specialized branch of machine learning (ML) that employs multi-layered neural networks to process and analyze data; ML, in turn, is a subset of artificial intelligence (AI). While all ML is part of AI, not all AI is ML, and the same distinction applies to DL with respect to ML. In various domains, including computer science, data analysis, software engineering, and AI, both ML and DL have played pivotal roles in driving substantial advancements. Deep learning models remain an evolving and promising field, with the potential to yield breakthroughs in critical applications such as network attack detection and classification, the proactive monitoring and prediction of pandemic diseases, and the development of automated face-based university attendance systems, among others (Ajagbe and Adigun 2023; Ojo et al. 2023; Adekunle et al. 2023).

Motivation and contributions

Classifying trash is a systematic way to preserve the environment and maximize resource use. Deep learning algorithms can be trained to identify and categorize various kinds of waste, improving process efficiency and lowering costs. The core motivation for using deep learning models is that they allow highly accurate models to be built that make complex predictions or classifications based on large amounts of data. Furthermore, the ability of deep learning models to learn complex patterns automatically and make precise predictions has become essential to many modern waste classification systems. For this reason, in this work, we present a novel hybrid deep learning model for classifying waste. The main contributions of the suggested model are as follows:

  1. An innovative hybrid deep learning model for smart waste classification is proposed, based on the proposed optimized DenseNet-121 and SVM.

  2. An SVM model is used to classify waste based on features extracted by the novel optimized DenseNet-121.

  3. The TrashNet dataset is used to classify waste, and data augmentation is applied to improve classification accuracy. Metal, glass, paper, cardboard, plastic, and trash are the six categories used for categorizing waste.

  4. The proposed model’s efficacy is evaluated by comparing it with numerous benchmark models using widely used metrics such as accuracy, confusion matrix, F1-Score, precision, and recall.

  5. According to the experimental findings, the proposed model is very effective and outperforms the most advanced related classification techniques on the TrashNet dataset.

Layout of the paper

The remainder of the article is organized as follows: “Related work” reviews the progress of related work on waste classification. “Methodology” discusses the methodology, specifically the description of the waste dataset, the techniques for data pre-processing, and the suggested model. Finally, “Performance and result discussion” presents the results of the suggested deep learning-based classification model, and “Conclusion” wraps up the paper.

Related work

Waste collection and recycling are essential components of today’s world. Recycling is a critical issue because of the depletion of natural resources and the adverse environmental effects of increased waste production. Trash classification can make recycling and garbage disposal easier, which helps to utilize and conserve resources. Manual waste separation, however, endangers workers’ health, consumes time, and raises the cost of recycling. As a result, several researchers have studied garbage categorization based on machine learning techniques (Sudha et al. 2016). These methods typically use images as the primary input for automatically classifying waste. TrashNet is one of the most widely used image datasets for waste classification and has been tested with a variety of traditional machine learning algorithms. Costa et al. (2018) and Yang and Thung (2016) reported accuracies of 63% and 88% using the SVM algorithm. Satvilkar (2018) classified garbage images from the TrashNet dataset using the XGBoost and random forest algorithms. Sousa et al. (2019) classified waste into five types using CNN, random forest, k-nearest neighbor, and SVM models; the CNN model achieved the highest classification accuracy (89.91%). Using SVM models, near-infrared reflectance spectroscopy, and principal component analysis, Zhu et al. (2019) created an identification method that classifies six types of plastic solid waste with a 97.5% classification accuracy. Uganya et al. (2022) used machine learning classification methods such as SVM, logistic regression, decision trees, and the random forest algorithm. The best accuracy among these methods, 92.15%, was obtained with the random forest.

Aral et al. (2018) used a range of transfer learning models (MobileNet, DenseNet121, InceptionResNetV2, DenseNet169, and Xception) with Adadelta and Adam as optimizers to categorize trash from the TrashNet dataset. The experimental findings indicated that Adam provided better test accuracy than Adadelta, and the DenseNet121 model was the most accurate, with 95% accuracy. In light of the small sample size of the TrashNet dataset, the study also applied data augmentation to improve classification accuracy. Ruiz et al. (2019) used various CNN models and achieved a best mean accuracy of 88.66%. Several renowned CNN architectures for image classification, including AlexNet (Krizhevsky et al. 2017), ResNet (He et al. 2016), VGG (Simonyan and Zisserman 2014), ResNeXt (Xie et al. 2017), and DenseNet (Huang et al. 2017), can also serve as base models for trash classification. Of these CNN models, ResNeXt is the most effective for transfer learning in trash classification. Another study (Chu et al. 2018) suggested a deep learning-based hybrid model, CNN (AlexNet)-MLP, and achieved its best result of 98.2% on the TrashNet dataset. To improve prediction time on CPU, Bircanoğlu et al. (2018) altered the connection patterns of the skip connections inside dense blocks and called the proposed model RecycleNet. Even though RecycleNet’s accuracy on the TrashNet dataset was only 81%, it reduced the number of parameters from 7 million to roughly 3 million. Vo et al. (2019) created a DNN-TC framework based on ResNeXt. The ordinary ResNeXt-101 model was altered to reduce redundancy by adding two fully connected layers, and this modification produced the best accuracy on the TrashNet dataset, 94%.

Recently, Alqahtani et al. (2020a) examined a strategy for smart-city waste management based on Cuckoo Search Optimised Long Short Term Neural Networks. The maximum accuracy they achieved was 98.4%, which outperformed other classifiers already in use, such as SVM with a genetic algorithm (95.97%), particle swarm optimization with artificial neural networks (96.76%), and ant colony algorithms with neural network structures (96.13%). Alsubaei et al. (2022) classified waste objects using a neural network approach and claimed an accuracy of 98.61% on the benchmark garbage classification dataset from the Kaggle repository. Al Duhayyim et al. (2022) developed an intelligent Deep Reinforcement Learning (DRL)-based model for smart cities that identifies and categorizes recyclable waste objects. Their classifier is a deep Q-learning network (DQN), with DenseNet as the reference model, and a hyperparameter optimizer based on the dragonfly algorithm (DFA) further improves the performance of the DenseNet model. They reported a waste classification accuracy of 99.3% on the same Kaggle benchmark garbage classification dataset. In a different study (Ali et al. 2022), the authors addressed feature-based garbage sorting using an enhanced variation of the artificial hummingbird algorithm, a recent meta-heuristic. With data augmentation on the TrashNet dataset (10,108 waste images), they claim an accuracy of 98.81%.

According to the literature review, most smart waste classification methodologies use the DenseNet model for higher accuracy; therefore, in this work, we present a novel hybrid model that combines a modified DenseNet-121, referred to as optimized DenseNet-121, with an SVM.

Methodology

This part explains the dataset used, how the data were prepared, and the specifics of the novel hybrid deep learning model suggested for intelligent waste sorting.

Dataset

The proposed research utilized the TrashNet dataset, a publicly available collection of carefully selected images curated for garbage or trash classification tasks. Its primary goal is to aid in developing and assessing machine learning models focused on automating waste classification. The TrashNet collection includes photos showing waste materials frequently found in daily life, including plastic bottles, cans, cardboard, paper, glass, and other household objects. The waste categories associated with each image in the dataset are labelled, allowing machine learning models to identify the various forms of rubbish reliably. TrashNet contains 2527 images of “Paper”, “Glass”, “Metal”, “Plastic”, “Cardboard”, and “Trash”. The photos were captured with a cell phone camera and natural or artificial lighting. The captured items were either positioned against a white background or took up the entire frame (cardboard). Each image is 512x384 pixels in size. Table 1 lists the contents of the TrashNet dataset, and Fig. 1 shows instances of images from the TrashNet dataset.

Table 1 TrashNet dataset (Aral et al. 2018)

Fig. 1 Sample images of TrashNet dataset

Data pre-processing

Because the TrashNet dataset contains only a few images of recyclable trash, we augmented the initial dataset to create a larger one, as in related work (Chu et al. 2018; Mao et al. 2021; Yuan and Liu 2022; Ali et al. 2022; Shi et al. 2021; Lin et al. 2022). The augmentation produced 12,735 waste images by flipping, rotating (35 degrees), zooming, shearing, and shifting. The augmentation parameters are listed in Table 2. Finally, the expanded dataset was divided into training and testing sets, with each class randomly split into 80% for training and 20% for testing.
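
The two simplest steps described above, flipping and the per-class 80/20 split, can be sketched in plain Python. This is only an illustrative sketch, not the authors' pipeline: the helper names (`hflip`, `vflip`, `stratified_split`), the toy 2x3 "image", the random seed, and the toy label list are all assumptions introduced here, and the remaining augmentations (rotation, zoom, shear, shift) are omitted.

```python
import random
from collections import defaultdict

def hflip(image):
    """Horizontal flip: reverse every row of a 2-D image (list of rows)."""
    return [row[::-1] for row in image]

def vflip(image):
    """Vertical flip: reverse the order of the rows."""
    return image[::-1]

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test sets, class by class.

    Each class is shuffled and split separately, so every class keeps
    roughly the same train/test proportion (hypothetical helper).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

# Toy data: a 2x3 "image" and 10 samples for each of the six classes.
toy_image = [[1, 2, 3],
             [4, 5, 6]]
CLASSES = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]
labels = [c for c in CLASSES for _ in range(10)]

train_idx, test_idx = stratified_split(labels)
print(hflip(toy_image))  # [[3, 2, 1], [6, 5, 4]]
print(vflip(toy_image))  # [[4, 5, 6], [1, 2, 3]]
print(len(train_idx), len(test_idx))  # 48 12
```

Splitting each class separately keeps the class proportions of the training and testing sets aligned, which matters for a small and unevenly distributed dataset.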

Table 2 Parameters used in augmentation

Proposed model

Our suggested model is a hybrid deep learning model that combines Optimised DenseNet-121 and SVM. Figure 3 shows the detailed block diagram of the suggested model. We modified the DenseNet-121 model, calling the result optimized DenseNet-121, and use it for feature extraction from the augmented TrashNet image dataset. After extracting the features, we use a support vector machine to produce the final classification. The following subsections elaborate on the workings of the proposed model:

Optimized DenseNet-121

CNNs suffer from the “vanishing gradient” problem as they grow deeper. As the path from the input to the output layers lengthens, some information may “vanish” or become lost along the way, preventing the network from training effectively. Densely Connected Convolutional Networks (DenseNets) address this issue by changing the connectivity pattern between layers: each layer is linked directly to every subsequent layer. A DenseNet with \(N\) layers therefore has \(N(N+1)/2\) direct connections, and the feature maps from all previous layers are concatenated, rather than summed, and used as inputs to every layer. Hence, DenseNets require fewer parameters than a conventional CNN and permit feature reuse, because redundant feature maps need not be relearned. The feature maps of the \(N^{th}\) layer are represented as follows:

$$\begin{aligned} X_N=H_N([X_0,...,X_{N-1}]) \end{aligned}$$
(1)

where \([X_0, \ldots , X_{N-1}]\) denotes the concatenation of the feature maps produced by all preceding layers, \(0\) to \(N-1\). \(H_N(\cdot )\) is the composite nonlinear function of each layer (i.e., three operations performed one after the other: batch normalization, a rectified linear unit (ReLU), and convolution), and \([\cdot ]\) denotes the concatenation that realizes each layer’s dense connections.
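
The connection count and the concatenation in Eq. (1) can be checked with a short sketch. It is purely illustrative: the function names are hypothetical, index 0 is assumed to stand for the block input, and a feature map is modeled as a plain list with one entry per channel.

```python
def dense_connections(num_layers):
    """Enumerate (source, target) pairs in a densely connected block.

    Layer t (1..num_layers) receives the outputs of every earlier
    source 0..t-1, where source 0 stands for the block input.
    """
    return [(s, t) for t in range(1, num_layers + 1) for s in range(t)]

def concat_features(previous_maps):
    """Channel-wise concatenation, modeled as joining per-channel lists."""
    joined = []
    for feature_map in previous_maps:
        joined.extend(feature_map)
    return joined

N = 5
pairs = dense_connections(N)
print(len(pairs), N * (N + 1) // 2)  # 15 15

# Three earlier outputs with 4, 2, and 2 channels give an 8-channel input,
# whereas element-wise addition would require equal channel counts.
x0, x1, x2 = ["c"] * 4, ["c"] * 2, ["c"] * 2
print(len(concat_features([x0, x1, x2])))  # 8
```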

DenseNets downsample feature maps to reduce computation time. To accomplish this, a DenseNet is divided into dense blocks. The layers between two adjacent blocks are called transition layers; they use convolution and pooling operations to downsample (i.e., change the feature map’s size). Within a dense block, by contrast, the feature maps are all the same size, which permits feature concatenation. DenseNet-121 is a variant of the DenseNet family of models. It contains 121 feed-forward layers connected to one another, resulting in a highly connected network. DenseNet-121 has several advantages over earlier CNN models, including improved accuracy, fewer parameters, and better feature exploitation. The architecture of DenseNet-121 is shown in Fig. 2.
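
A small bookkeeping sketch shows how dense blocks and transition layers shape the network. The configuration values below are not stated in this paper; they are assumed from the standard DenseNet-121 configuration of Huang et al. (2017): growth rate 32, dense blocks of 6, 12, 24, and 16 layers, 64 initial channels, and transition layers that halve the channel count.

```python
# Assumed standard DenseNet-121 configuration (not given in this paper).
GROWTH_RATE = 32
BLOCK_LAYERS = (6, 12, 24, 16)
INITIAL_CHANNELS = 64
COMPRESSION = 0.5

def channel_trace():
    """Track the channel count after each dense block and transition layer."""
    channels = INITIAL_CHANNELS
    trace = []
    for i, n_layers in enumerate(BLOCK_LAYERS, start=1):
        # Every layer in a dense block appends GROWTH_RATE new channels.
        channels += n_layers * GROWTH_RATE
        trace.append((f"dense_block_{i}", channels))
        if i < len(BLOCK_LAYERS):
            # A transition layer compresses channels before the next block.
            channels = int(channels * COMPRESSION)
            trace.append((f"transition_{i}", channels))
    return trace

def count_weighted_layers():
    """Stem conv + two convs per dense layer + one conv per transition + classifier."""
    return 1 + 2 * sum(BLOCK_LAYERS) + (len(BLOCK_LAYERS) - 1) + 1

for name, channels in channel_trace():
    print(name, channels)
print(count_weighted_layers())  # 121
```

Under these assumptions the last dense block outputs 1024 channels, which is the feature-vector length a global pooling layer placed after it would produce.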

In our suggested optimized DenseNet-121 model, we run global max pooling and global average pooling operations in parallel after the final dense block. We take the minimum of the two pooled outputs and then apply dropout, dense layers, and finally a SoftMax layer for image classification. Figure 3 displays the details of our proposed hybrid model.
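
The parallel pooling step can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes the minimum is taken element-wise, channel by channel, and the toy feature maps and function names are hypothetical. Dropout, the dense layers, and SoftMax are omitted.

```python
def global_max_pool(feature_maps):
    """Return the largest activation of each channel (channel = 2-D list)."""
    return [max(max(row) for row in channel) for channel in feature_maps]

def global_avg_pool(feature_maps):
    """Return the mean activation of each channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]

def elementwise_min(a, b):
    """Channel-by-channel minimum of two pooled vectors (assumed reading)."""
    return [min(x, y) for x, y in zip(a, b)]

# Two toy 2x2 channels standing in for the final dense block's output.
fmaps = [
    [[1.0, 3.0], [2.0, 6.0]],
    [[0.0, 0.5], [0.5, 1.0]],
]

gmp = global_max_pool(fmaps)
gap = global_avg_pool(fmaps)
merged = elementwise_min(gmp, gap)
print(gmp)     # [6.0, 1.0]
print(gap)     # [3.0, 0.5]
print(merged)  # [3.0, 0.5]
```

Note that under this element-wise reading the merged vector always coincides with the average-pooled one, because a channel's mean can never exceed its maximum; the model in Fig. 3 may combine the two branches differently.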

Fig. 2 Architecture of DenseNet121 (Ji et al. 2019)

Fig. 3 Proposed hybrid model

Support vector machine (SVM)

SVM is a supervised machine learning approach to regression and classification problems that applies to both binary and multi-class settings. The SVM’s primary goal is to identify an optimal hyperplane that divides data points into classes while maximizing the margin between them. This hyperplane serves as the decision boundary and is positioned to offer the greatest possible separation from the closest data points, the support vectors, which yields a robust and accurate classification. To accomplish this, the SVM maps the input data into a higher-dimensional feature space using an appropriate kernel function (linear, polynomial, sigmoid, radial basis function, etc.), which makes it easier to find a hyperplane that successfully divides the data. The number of features determines the hyperplane’s dimension: with two input features the hyperplane is a line, and with three input features it is a two-dimensional plane. Support vectors are the data points closest to the hyperplane, and they determine its position and orientation. They are used to maximize the margin of the classifier, and removing them changes the hyperplane. These principles served as the foundation for the creation of our SVM. The SVM algorithm uses the hinge loss as its loss function, which supports maximizing the margin between the data points and the hyperplane. The hinge loss is defined as follows:

$$\begin{aligned} \text {Hinge loss}(\hat{Y}, Y)=\max (0, 1 - \hat{Y} \cdot Y) \end{aligned}$$
(2)

The concept of SVM for linear separation lines is shown in Fig. 4.
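
A minimal sketch of the hinge loss in Eq. (2) for a linear decision function is given below. It assumes binary labels encoded as +1 and -1; the weight vector, bias, and sample values are hypothetical, and extending this to the six waste classes (for example, one-vs-rest) is not shown.

```python
def linear_score(weights, bias, features):
    """Raw SVM decision value w . x + b for one sample."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def hinge_loss(y_true, score):
    """Hinge loss max(0, 1 - y * score) with y in {-1, +1}."""
    return max(0.0, 1.0 - y_true * score)

def mean_hinge_loss(labels, scores):
    """Average hinge loss over a batch of samples."""
    return sum(hinge_loss(y, s) for y, s in zip(labels, scores)) / len(labels)

# Hypothetical hyperplane parameters and one sample.
w, b = [1.0, -1.0], 0.0
score = linear_score(w, b, [2.0, 0.5])
print(score)                  # 1.5
print(hinge_loss(+1, score))  # 0.0: correct side, outside the margin

# Correct but inside the margin, and wrong-side predictions are penalized.
print(hinge_loss(+1, 0.5))    # 0.5
print(hinge_loss(-1, 0.5))    # 1.5
```

The loss is zero only when a sample lies on the correct side of the hyperplane with a decision value of at least 1, which is what pushes the optimizer toward a wide margin.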

Fig. 4 Concept of SVM classifiers in linear separation (Meyer and Wien 2001)

Fig. 5 Proposed model’s confusion matrix

Fig. 6 Proposed model’s accuracy

Fig. 7 Proposed model precision, recall, and F1-score matrix

Fig. 8 Proposed model’s accuracy, precision, recall, and loss plots

Table 3 Comparison of the proposed model to existing models on TrashNet datasets
Table 4 Comparison of the proposed model to related waste classification works using the TrashNet dataset

Performance and result discussion

This section outlines the performance metrics that were employed, the findings of the experiments that were carried out using the proposed model, and a performance comparison with related works to demonstrate the model’s effectiveness.

Performance metrics

The effectiveness of the garbage categorization model can be assessed using a wide range of efficacy metrics. Accuracy, recall, confusion matrix, precision, AUC score, and F1-score are some of the performance metrics for classification that are frequently utilized.

  a. Confusion matrix: A confusion matrix summarizes how a machine learning model performs on test data. It displays the total number of false positives, true positives, false negatives, and true negatives based on test data. If there are two classes, the matrix has a size of \(2\times 2\); with \(k\) classes, it is \(k\times k\). The matrix entries have the following meanings:

    • False positives (FP): Cases in which the model wrongly predicts the positive class when the true class is negative.

    • True positives (TP): Cases in which the model correctly predicts the positive class.

    • False negatives (FN): Cases in which the model wrongly predicts the negative class when the true class is positive.

    • True negatives (TN): Cases in which the model correctly predicts the negative class.

  b. Accuracy: Accuracy is the proportion of correctly predicted results to all results, i.e.,

    $$\begin{aligned} \text {Accuracy} = \frac{TN+TP}{TN+TP+FN+FP} \end{aligned}$$
    (3)

  c. Precision: Precision, also called the positive predictive value, is the proportion of correctly predicted positive observations to all observations predicted as positive, i.e.,

    $$\begin{aligned} \text {Precision} = \frac{TP}{FP+TP} \end{aligned}$$
    (4)

  d. Recall: Recall, also called sensitivity, is the proportion of correctly predicted positive observations to all observations in the positive class, i.e.,

    $$\begin{aligned} \text {Recall} = \frac{TP}{FN+TP} \end{aligned}$$
    (5)

  e. F1-Score: The F1-score is the harmonic mean of precision and recall, i.e.,

    $$\begin{aligned} \text {F1-Score} = \frac{2 \times \text {Recall} \times \text {Precision}}{\text {Recall} + \text {Precision}} \end{aligned}$$
    (6)
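
The metrics in Eqs. (3) to (6) can be computed directly from a multi-class confusion matrix by treating each class in turn as the positive class. The sketch below is illustrative only: the \(3\times 3\) matrix is made-up toy data rather than a result from this paper, the function names are hypothetical, and rows are assumed to be true classes and columns predicted classes.

```python
def per_class_metrics(cm):
    """Precision, recall, and F1 per class from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class k, predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes, predicted k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"tp": tp, "fp": fp, "fn": fn, "tn": tn,
                        "precision": precision, "recall": recall, "f1": f1})
    return results

def overall_accuracy(cm):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)

# Toy 3-class confusion matrix (hypothetical counts).
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 1, 9]]

for k, m in enumerate(per_class_metrics(cm)):
    print(k, round(m["precision"], 3), round(m["recall"], 3), round(m["f1"], 3))
print(round(overall_accuracy(cm), 3))
```

For two classes, `overall_accuracy` reduces to Eq. (3), since the diagonal then holds exactly the true positives and true negatives.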

Result analysis

  a. Model training: We implemented and trained the suggested waste classification model using the Keras deep learning API on Google Colab.

  b. Confusion matrices: The proposed model is tested to generate confusion matrices both without and with data augmentation. Figure 5 depicts the outcome. According to the confusion matrix results, the proposed model recognizes waste with high overall accuracy when data augmentation is used.

  c. Accuracy: The original DenseNet-121 had an accuracy of 79.45% without data augmentation and 98.64% with data augmentation. To improve on this, we added a few fine-tuned, optimized classification layers to DenseNet-121, together with an SVM with the SoftMax activation function, which results in our novel hybrid model (called Optimised DenseNet-121 + SVM). We first tested our model on the raw TrashNet dataset to validate its accuracy; the model overfitted, and its accuracy was 85.17%. We therefore augmented the data using horizontal flipping, vertical flipping, and random 35-degree rotation, which resulted in a total of 12,735 garbage photos. When evaluated on the augmented TrashNet dataset, our proposed model achieves an accuracy of 99.84%. The accuracy of the proposed model before and after image augmentation is depicted in Fig. 6. It is clear from Fig. 6 that data augmentation improves the accuracy, which is higher than that of related existing work.

  d. Recall, Precision, and F1-score: We also calculated the recall, precision, and F1-score of our proposed model without and with data augmentation. Figure 7 shows the proposed model’s recall, precision, and F1-score values for every class. Furthermore, Fig. 8 visualizes the suggested model’s recall, precision, accuracy, and loss over the training epochs.

  e. Comparative analysis: The proposed model is evaluated against a set of related, existing models using the TrashNet dataset. Table 3 displays the accuracy comparison between the proposed model and relevant existing models. Table 3 shows that the accuracy of the proposed model after image augmentation is 99.84%, which is the best among the listed models.

  f. Comparison with related works: We also compared the proposed work to related waste classification works to further assess its effectiveness. This comparison is given in Table 4. According to Table 4, the accuracy of the proposed model on the TrashNet dataset with image augmentation is 99.84% after 40 epochs of training. In contrast, other existing models on the same dataset have lower accuracy after the same 40 epochs of training.

Conclusion

Integrating densely connected convolutional networks (DenseNet-121) into trash classification systems significantly aids the development of waste management techniques. Automated systems powered by DenseNet-121 can better classify, recycle, and dispose of waste by extracting features from various waste products more efficiently. In this paper, we therefore presented a novel waste classification model, a hybrid deep learning model made up of the proposed Optimised DenseNet-121 and a Support Vector Machine (SVM), that successfully distinguishes between the waste categories in the TrashNet dataset. By automating waste classification, the proposed model reduces the need for human intervention, reduces the risk of contamination, and protects the environment. Furthermore, the highest accuracy of the suggested model with image augmentation is 99.84%, which is higher than that of the other existing models compared on the TrashNet dataset. Our future work will improve the system’s ability to classify a broader range of waste products.