
1 Introduction

The rapid development of Artificial Intelligence (AI) and its supporting domains, including Big Data and high-performance computational infrastructure, has triggered a tectonic shift. The substantial refinement of deep learning-based systems, including foundation models (e.g., Transformer, GPT-4, Bard, DALL-E, RoBERTa), has enabled AI-based systems to penetrate high-stakes applications. These applications span critical domains such as healthcare, finance, law enforcement, and agriculture [35, 39]. This rapid penetration carries sociotechnical, privacy, and safety implications, and these intelligent systems have even been characterized as an existential threat to the human race. A key contributor to this scepticism is the non-transparent nature of these models: the opaqueness of ML algorithms restricts trust and hinders deployment in vulnerable domains.

AI is broadly defined as a set of approaches that mimic human behaviour, whereas Machine Learning (ML) algorithms are, by and large, predictive models that use existing data features to build a mapping to classes during the learning phase. This learning phase is based on data collected from routine user activities (e.g., online shopping, medical history, social interactions, customer profiles). Such data is liable to contain human biases and predispositions, so the resulting decision models can inherit these presumptions and make wrong decisions. As black-box models are extensively developed and tested on huge datasets, numerous stakeholders emphasize system transparency [23, 24]. In general, people are reluctant to use techniques that are not justifiable and transparent, which fuels the demand for ethical AI [38]. The increasing complexity of these opaque systems enhances performance, yet they still lack transparency. Keeping transparency as a design consideration when designing and developing an ML model drives impartial decision-making and helps ensure that relevant variables are used to generate the model's predictions.

As a consequence, Explainable Artificial Intelligence (XAI), an emerging frontier of AI, is pertinent due to its ability to answer the concerns raised above and mitigate the associated risks. XAI offers a suite of ML techniques to generate explainable models and develop trustworthy, human-understandable systems. Various communities use explanations of model decisions, and the objectives and perspectives of the XAI systems they develop vary with the need for explanation. The contribution of this paper is summarized below. The paper begins by providing an overview of the key requirements in the field and subsequently conducts a comprehensive review of XAI approaches for the machine learning (ML) pipeline. The review specifically focuses on the different stages of the ML lifecycle to analyze and evaluate the effectiveness of XAI techniques. This work is an extension of our previous survey on XAI techniques [16].

The paper is organized as follows. In Sect. 2, we discuss the terminology of the domain. Following that, Sect. 3 provides an overview of the approaches in XAI, and the evaluation and discussion of these approaches are presented in Sect. 4. Lastly, the conclusion is presented in Sect. 5.

2 Desiderata for Explainable Artificial Intelligence

As the domain is still emerging and lacks sufficient shared context, some research works [6, 21] use the terms interpretability and explainability interchangeably, preventing the creation of common ground. There is communal agreement on the necessity for a formal definition of the nomenclature [1, 14]. In this section, we outline the definitions that we adopt to comprehend the techniques within the domain. This understanding allows us to assess the capabilities of an explanation system and evaluate its alignment with the responsible AI framework.

Explainability has varying definitions; for this study, we refer to the definition: “Explainability aims to bridge the gap between the algorithm and human interpretation of the decision-making process. It is capable of enhancing trust by answering the how and why of the system” [28].

Interpretability, from a user-centric view, is the human-intelligible explanation of the model's output [7]. Understanding of the system varies across users. A point of caution is that reliance on human evaluations can lead to persuasive systems rather than transparent ones, which limits the ability to define the appropriate scope of interpretation. The selected terminology and definitions are listed in Table 1.

Table 1. This table elucidates essential terminology within the XAI literature, serving as a reference for clarifying the concepts explored in this study.

3 Explainable Artificial Intelligence Methods

Present-day ML systems, and deep learning systems in particular, are complex: they comprise many layers, are trained on huge datasets, and have achieved high accuracy [28]. Despite these successes, the associated risks cause reluctance to adopt these models. The challenge of the hour is to turn these complex, ever-more-performant systems into trustworthy ones by helping users understand the why of a decision and thereby mitigate the associated risks. There are numerous surveys on XAI [1, 5, 7, 13, 14, 21] and on explainable deep learning [25, 41], each covering a large body of work along different dimensions. A standard ML pipeline consists of several phases, as shown in Fig. 1. In this paper, we analyze each phase and discuss the XAI approaches in the literature that address its problems. Different stakeholders are involved in the different stages of the pipeline, and the appropriate explanation must answer the questions associated with each phase.

Fig. 1. A snapshot of a standard machine learning lifecycle.

3.1 Data Collection and Preparation

Data collection and preparation is the foremost and most crucial stage of the ML lifecycle. It refers to the process of gathering and organizing data for use in machine-learning applications. This multi-step phase ensures the relevance, reliability, and accuracy of the data through data cleaning, transformation, and integration techniques. Explanations generated for data sources can address different aspects, which we classify into two categories: i) detection of data bias, and ii) annotation and labeling of data.

Detection of Data Bias. Counterfactual explanations often take the form of statements such as, “You were denied a loan due to an annual income of X. If your income had been X+Y, you would have been approved for the loan” [37]. The goal of a counterfactual statement is to identify the smallest modification in feature values that can yield the desired prediction.
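To make the idea concrete, the following is a minimal, hypothetical sketch of such a counterfactual search. It assumes a scikit-learn-style classifier exposing `predict` and `predict_proba` with integer class labels, and greedily nudges one feature at a time until the desired prediction is reached; it illustrates the principle rather than the specific method of [37].

```python
import numpy as np

def greedy_counterfactual(model, x, target_class, step=0.05, max_iter=200):
    """Greedily nudge one feature at a time until the model predicts target_class.

    A deliberately simple illustration of the counterfactual idea: find a small
    change to the input that flips the prediction. `model` is assumed to expose
    scikit-learn-style `predict` and `predict_proba` on 2-D arrays, with classes
    labelled 0..k-1.
    """
    x_cf = x.astype(float).copy()
    for _ in range(max_iter):
        if model.predict(x_cf.reshape(1, -1))[0] == target_class:
            return x_cf  # desired outcome reached
        best_move, best_prob = None, -np.inf
        # Try a small positive/negative step on each feature and keep the move
        # that most increases the predicted probability of the target class.
        for i in range(len(x_cf)):
            for delta in (step, -step):
                cand = x_cf.copy()
                cand[i] += delta
                prob = model.predict_proba(cand.reshape(1, -1))[0][target_class]
                if prob > best_prob:
                    best_prob, best_move = prob, (i, delta)
        i, delta = best_move
        x_cf[i] += delta
    return None  # no counterfactual found within the search budget
```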

Annotation and Labeling of Data. XAI can play a role in data annotation by providing transparency and interpretability to the annotation process. A forward propagation-based study [10] retrieved feature importance through perturbation and identified the mask accountable for the results. Altering or blurring these salient features directly impacts the original classification outcome. The resulting predictions can then be analyzed and interpreted with various XAI techniques to gain insight into the model's decision-making process and understand the factors influencing its predictions.
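The perturbation idea can be sketched as a simple occlusion test: mask one feature at a time and measure how much the prediction for the original class drops. This is a generic illustration under the assumption of a probability-returning prediction function, not the exact procedure of [10].

```python
import numpy as np

def occlusion_importance(predict_fn, x, baseline=0.0):
    """Perturbation-based importance: occlude one feature at a time and record
    the drop in the probability of the originally predicted class.

    `predict_fn` is assumed to return class probabilities for a batch of inputs.
    """
    base_probs = predict_fn(x.reshape(1, -1))[0]
    cls = int(np.argmax(base_probs))
    importance = np.zeros(len(x))
    for i in range(len(x)):
        x_masked = x.astype(float).copy()
        x_masked[i] = baseline  # "blur" / occlude feature i
        masked_probs = predict_fn(x_masked.reshape(1, -1))[0]
        importance[i] = base_probs[cls] - masked_probs[cls]
    return importance  # large values mark features the prediction relies on
```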

3.2 Feature Engineering and Selection

Feature Interaction Analysis. Saliency-based approaches highlight the importance of regions of the input. These methods employ saliency maps to comprehend the contribution and significance of features in specific decisions. Visualization support facilitates comprehension for a diverse audience, allowing them to discern which features influenced a decision. Prominent approaches proposed to calculate saliency maps include Integrated Gradients [34] and SmoothGrad [32].
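A minimal numerical sketch of Integrated Gradients follows. It assumes a user-supplied `grad_fn` that returns the gradient of the target output with respect to the input (e.g., from an autodiff framework) and approximates the path integral with a Riemann sum.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline=None, steps=50):
    """Approximate Integrated Gradients [34] with a Riemann sum.

    `grad_fn(x)` is assumed to return the gradient of the target output with
    respect to the input `x`.
    """
    if baseline is None:
        baseline = np.zeros_like(x)  # common choice: an all-zero baseline input
    # Interpolate between the baseline and the input and average the gradients.
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.zeros_like(x, dtype=float)
    for a in alphas:
        avg_grad += grad_fn(baseline + a * (x - baseline))
    avg_grad /= steps
    # Attribution: (input - baseline) * average gradient along the path.
    return (x - baseline) * avg_grad
```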

Outlier Detection. Automatic rule extraction via decompositional approaches operates on the neurons to mimic rules from the network architecture. Studies on transforming neural networks into fuzzy rules are also available; the main work concerns extracting approximations from the neurons [43]. Rule extraction techniques are valuable for identifying behavioural patterns, despite not being completely faithful to the models, so further research on explainability is required to address this limitation. Adversarial examples provide interpretable model understanding. Most approaches suggest reducing the gap between the adversarial example and the instance to be controlled while steering the prediction to the desired outcome of the system. This method allows for diagnosing outliers in the data [40].
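The adversarial-perturbation idea can be illustrated by measuring how far an instance must be moved before the model yields a target prediction; a large required perturbation suggests the instance lies far from the decision region and may be an outlier. The sketch below assumes hypothetical `grad_fn` and `predict_fn` callables and is not the specific method of [40].

```python
import numpy as np

def perturbation_to_target(grad_fn, predict_fn, x, target_class,
                           step=0.01, max_iter=500):
    """Move an instance toward a target prediction and report how far it moved.

    `grad_fn(x, c)` is assumed to return the gradient of the probability of
    class `c` with respect to `x`; `predict_fn` returns class probabilities.
    """
    x_adv = x.astype(float).copy()
    for _ in range(max_iter):
        if int(np.argmax(predict_fn(x_adv.reshape(1, -1))[0])) == target_class:
            break
        x_adv += step * grad_fn(x_adv, target_class)  # gradient ascent on p(target)
    distance = np.linalg.norm(x_adv - x)              # size of the needed change
    return x_adv, distance
```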

3.3 Model Training

Propagation-based approaches support the identification of important regions. The output of the model is fed back to the system, and the robustness of the system helps to understand the stability of the decision. Backpropagation-based methods take derivatives of the output with respect to the input and the system gradients. These approaches are intrinsic, as they rely on the model structure to locate the important regions. DeepLift [31], an example approach, assigns positive and negative contributions to the features based on the difference between the actual output and a reference output.
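As a rough, first-order stand-in for this family of methods, the snippet below scores features by the gradient times their deviation from a reference input. It shares DeepLift's intuition of attributing relative to a reference but is not DeepLift itself; `grad_fn` is an assumed gradient callable.

```python
import numpy as np

def gradient_x_delta_input(grad_fn, x, reference=None):
    """Crude backpropagation-style attribution: gradient times the difference
    from a reference input.

    This is *not* DeepLift [31]; it is a first-order illustration of scoring
    features by their deviation from a reference. `grad_fn(x)` is assumed to
    return d(output)/d(input).
    """
    if reference is None:
        reference = np.zeros_like(x)
    # Positive/negative contributions per feature.
    return grad_fn(x) * (x - reference)
```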

3.4 Model Evaluation

One accepted and convenient way to explain networks is to develop proxy models that approximate their behavior; such methods are referred to as 'model-agnostic' approaches. Ribeiro et al. exemplified the proxy model by developing LIME (Local Interpretable Model-agnostic Explanations) [26], whereas the SHAP (SHapley Additive exPlanations) method [18] supports interpretability by assigning feature scores to each attribute. The underlying mechanism of these models is to probe and perturb the input to generate a linear proxy model that predicts the behaviour, and these proxies can be evaluated for their faithfulness to the original models. Decision trees, another accepted type of proxy model, capture the insights of neural networks as equivalent decision trees [4]. The tree equivalence holds for fully connected, convolutional, recurrent, and activation layers, satisfying faithfulness. Although the generated trees are faithful, constructing them is computationally expensive and requires substantial resources and time.
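The probe-and-perturb mechanism can be sketched as fitting a local linear surrogate around one instance, in the spirit of LIME but without its proximity kernel and feature selection. The prediction function and the use of ridge regression are assumptions of this sketch.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_surrogate(predict_proba, x, n_samples=500, sigma=0.5, cls=None):
    """Fit a simple local linear proxy around `x` by probing the black box with
    Gaussian perturbations (LIME-like, but without proximity weighting)."""
    rng = np.random.default_rng(0)
    # Perturb the instance and query the black-box model.
    X_pert = x + sigma * rng.standard_normal((n_samples, len(x)))
    probs = predict_proba(X_pert)
    if cls is None:
        cls = int(np.argmax(predict_proba(x.reshape(1, -1))[0]))
    y = probs[:, cls]
    # The coefficients of the linear surrogate act as local feature influences.
    surrogate = Ridge(alpha=1.0).fit(X_pert, y)
    return surrogate.coef_
```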

Attention Mechanisms are neural networks that learn to assign weights to inputs or internal features, enabling them to focus on relevant information. These approaches have demonstrated incredible success in various complex tasks [36]. While attention units are not explicitly trained to generate human-readable explanations, they inherently provide a map of the information flow within the network, which can be interpreted as a form of explanation.
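The "map of information flow" can be seen directly in standard scaled dot-product attention, whose weight matrix indicates which inputs the model attended to. The following is a generic NumPy sketch, not tied to any particular model in the cited work.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention; the returned weight matrix can be
    read as a rough map of which inputs the model attended to."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the keys (numerically stabilised).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Return both the attended output and the attention map used as explanation.
    return weights @ V, weights
```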

Knowledge-Infused Explanations comprise general-knowledge and knowledge-base methods. General knowledge refers to a body of information or facts acquired through intellectual processes, with diversity as its key trait. In this section, we analyze XAI techniques that employ general knowledge for enrichment. Kim et al. [17] utilize Concept Activation Vectors (CAVs) to analyze the importance of a concept for a task.
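A simplified sketch of the CAV idea, under the assumption that layer activations for concept examples and random examples are already available: a linear classifier separates the two sets, its normal vector defines the concept direction, and the directional derivative of the class score along that direction indicates concept sensitivity. This compresses the method of [17] considerably.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_activation_vector(concept_acts, random_acts):
    """Simplified CAV: train a linear classifier to separate activations of
    concept examples from random examples; its (normalised) normal vector is
    the concept direction in the layer."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_[0]
    return cav / np.linalg.norm(cav)

def concept_sensitivity(layer_grad, cav):
    """Directional derivative of the class score along the CAV; a positive
    value means the concept pushes the prediction toward the class."""
    return float(np.dot(layer_grad, cav))
```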

Knowledge-Base (KB) Methods are deployed to enrich the model with human knowledge using available corpora of explanations suitable for specific situations. KBs are generally represented as Knowledge Graphs (KGs). A knowledge graph can be employed in the model design to enrich the feature entities and system rules, improving model performance and explaining decisions. This is a known strategy for recommender systems [19], where relations are enriched to identify similarity. A knowledge-based system can also be employed after modeling to enhance reasoning, potentially through abductive reasoning, by utilizing its knowledge base to provide richer explanations [12, 29].
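As a toy illustration of the recommender-system use case, a KG of (head, relation, tail) triples can explain a recommendation by the relations two items share. The items, relations, and function below are entirely hypothetical.

```python
# A toy knowledge graph as (head, relation, tail) triples (hypothetical data).
KG = {
    ("MovieA", "directed_by", "DirectorX"),
    ("MovieB", "directed_by", "DirectorX"),
    ("MovieA", "genre", "SciFi"),
    ("MovieB", "genre", "SciFi"),
}

def explain_recommendation(liked_item, recommended_item, kg=KG):
    """Explain a recommendation by the (relation, tail) pairs the two items
    share in the knowledge graph."""
    liked = {(r, t) for (h, r, t) in kg if h == liked_item}
    recd = {(r, t) for (h, r, t) in kg if h == recommended_item}
    shared = liked & recd
    return [f"both items share {r} -> {t}" for (r, t) in sorted(shared)]

print(explain_recommendation("MovieA", "MovieB"))
```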

3.5 Model Optimization

Gradient-based explanations help to understand the vector representations at the system and intermediate-layer levels. These layer-level vector representations support 'transfer learning', allowing other problems to learn from the underlying representation. They are not explanations of the system per se; rather, they are intermediate vectors through which the system understands and resolves similar problems by reusing learned patterns. CAM [42] and Grad-CAM [30] are two such approaches.
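A minimal sketch of the Grad-CAM computation, assuming the feature maps and gradients of the last convolutional layer have already been extracted from the network (how they are obtained depends on the framework):

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """Minimal Grad-CAM [30]: weight each convolutional feature map by the mean
    gradient of the class score over that map, sum the maps, and keep only
    positive evidence.

    `feature_maps` and `grads` are assumed to have shape (channels, H, W).
    """
    weights = grads.mean(axis=(1, 2))                  # global-average-pooled gradients
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0)                           # ReLU: keep positive influence
    if cam.max() > 0:
        cam /= cam.max()                               # normalise to [0, 1] for display
    return cam
```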

Explanation Generation Systems are designed to generate their own explanation units. The primary working principle is to generate the 'because' part alongside the model's decision-making. These explanations are not directly interpretable and need proper evaluation before the model-generated answers can be trusted. TEXTVQA [22], an extension of VQA, generates multimodal explanations, both visual and textual, for an image-captioning system to help users understand why a certain caption was generated for an image.

3.6 Deployment

Explanations that do not rely on the training data can be divided into two categories: human collaboration and policy abstraction.

Human Alliance: The studies [3, 9] propose methods to automatically build an explanation corpus for system agents to guide humans. The network learns to translate actions into natural language. These initial, experiential steps toward exploring machine learning patterns lead to a formal evaluation of explanations that relate events to the experience gained throughout the process.

Policy Abstraction highlights policy information drawn from the player's experience. The generated summary can enrich the context needed to understand the explanation of a specific action in the given circumstances. A few relevant studies are [3, 33]: the former proposes a framework for abstraction, and the latter supports various abstraction levels to be used in generating the subsequent action plan.

3.7 Monitoring and Maintenance

XAI can help detect and understand data drift by supplying explanations for model predictions, identifying changes in feature importance, and monitoring shifts in the underlying patterns, enhancing the model's adaptability. Incremental Permutation Feature Importance (iPFI), proposed by [11], interprets complex features in this streaming setting.
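For orientation, the batch form of permutation feature importance is sketched below: the drop in a performance metric when one feature column is shuffled. The incremental iPFI variant of [11] updates such scores on streaming data; this sketch shows only the batch idea, with a scikit-learn-style model assumed.

```python
import numpy as np

def permutation_importance(model, X, y, metric, n_repeats=5, seed=0):
    """Batch permutation feature importance: the drop in `metric` when one
    feature column is shuffled. Shifts in these scores over time can flag drift."""
    rng = np.random.default_rng(seed)
    base = metric(y, model.predict(X))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break feature-target link
            drops.append(base - metric(y, model.predict(X_perm)))
        scores[j] = np.mean(drops)
    return scores  # larger drop = more important feature
```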

4 Discussion

The utilization of XAI techniques is crucial for algorithm transparency and for interpreting model decisions, leading to improvements across the machine learning lifecycle. These techniques aid in understanding the decision-making process of models at different stages of the pipeline and enable transparency objectives to be met according to system requirements. We summarize the approaches at the various stages of the ML pipeline in Table 2. Applying XAI techniques provides benefits across many aspects of machine learning models. Through XAI, data quality can be evaluated, quantified, and remedied. Explanations aid data selection, identify valuable information for improvement, and enable adjustments that ensure fairness and equity in the model's performance. XAI techniques also support data discretization and feature interaction analysis; integrating XAI with these steps deepens our understanding and improves the reliability of machine learning models. Moreover, XAI techniques contribute to privacy protection by allowing models to be interpreted without compromising individual privacy. In conclusion, applying XAI techniques brings several benefits to machine learning models: it enhances transparency, fairness, and reliability. By leveraging XAI, organizations can make informed decisions, address ethical concerns, and build robust and trustworthy AI systems.

Table 2. An overview of selective XAI approaches across various stages of the ML pipeline.

5 Conclusion and Future Directions

In this paper, a comprehensive and systematic review of XAI approaches for the machine learning pipeline is presented. XAI poses several challenges, from complex infrastructure to computational cost, but strategic choices of explanation techniques with well-defined objectives are beneficial and can mitigate the risks associated with high-stakes applications. Deploying the right approach for explaining model decisions can not only enrich business processes but also help build faith in the system's results. In this era of generative AI and foundation models, systems suffer from inaccessible internals, and XAI can bridge the gap between human understanding and model decision-making in high-stakes settings. An explanation generation and evaluation framework for the machine learning pipeline can strengthen downstream applications, including those derived from foundation models. Selecting appropriate XAI techniques with relevant metrics for computational and cognitive evaluation of the model is a key next step.