Abstract
Recently, several methods for understanding the outputs of machine learning models have been developed; SHAP and LIME are two well-known examples. They provide individual explanations, based on feature importance, for each instance. While remarkable results have been achieved for individual explanations, understanding a model's decisions globally remains a complex task. Methods like LIME have been extended to address this complexity by aggregating individual explanations, expressing the problem as a submodular optimization problem. That algorithm is a bottom-up method aiming at providing a global explanation: it picks a group of individual explanations that illustrate the global behavior of the model while avoiding redundancy. In this paper, we propose CoSP (Co-Selection Pick), a framework that enables global explainability of any black-box model by selecting individual explanations based on a similarity-preserving approach. Unlike submodular optimization, our method treats the problem as a co-selection task, performing a co-selection of instances and features over the explanations provided by any explainer. The proposed framework is more generic in that the co-selection can be performed in supervised or unsupervised scenarios, and over explanations provided by any local explainer. Preliminary experimental results validate our proposal.
1 Introduction
Nowadays, a wide range of real-life applications, such as computer vision [5, 11], speech processing, natural language understanding [6], healthcare [14], and military systems [2, 4], make use of Machine Learning (ML) models for decision making or prediction/classification purposes. However, those models are often implemented as black boxes, which makes their predictions difficult for humans to understand. This limits their adoption and practical applicability in many real-world domains and affects human trust in them. Making ML models more explainable and transparent is currently a trending topic in data science and artificial intelligence that attracts the interest of many researchers.
Explainable AI (XAI) refers to the tools, methods, and techniques that can be used to make the behavior and predictions of ML models understandable to humans [3]. Thus, the higher the interpretability/explainability of an ML model, the easier it is for someone to comprehend why certain decisions or predictions have been made.
Multiple interpretability approaches are based on additive models, where the prediction is a sum of individual marginal effects such as feature contributions [16]: a value (denoting the influence on the output) is assigned to each feature. One of the latest such methods, based on the game-theoretic Shapley values, was introduced by Lundberg and Lee [9] as SHAP (SHapley Additive exPlanations). It combines ideas from cooperative game theory and local explanations [8].
Let us also mention the LIME (Local Interpretable Model-agnostic Explanations) method which is one of the most famous local explainable models [13].
LIME explains individual predictions of any classifier or regressor in a faithful and intelligible way, by approximating them locally with an interpretable model (e.g., linear models, decision trees). However, obtaining a global explanation of the model is challenging, as it is harder to maintain a good fidelity-interpretability trade-off. To this end, the authors in [13] proposed an approach called Submodular Pick, an algorithm that aims to maximize a coverage function of total feature importance over a set of instances. Since maximizing the coverage function is NP-hard, the authors use a greedy algorithm that iteratively adds the instance with the highest marginal coverage to the solution set, offering a constant-factor approximation to the optimum. The selected set contains the most representative, non-redundant individual explanations of the model.
In this paper, our aim is to introduce a new approach to select individual instances (explanations) to be considered for global explanation to ensure that the picked group reflects the global behavior of the black-box model. Unlike submodular optimization proposed in [13], we advocate to consider the problem of picking representative instances as a co-selection task. The idea is to apply a similarity preserving co-selection approach to select a set of instances and features on the explanations provided by any explainer.
The paper is structured as follows. Section 2 provides a necessary background on LIME method. In Sect. 3, we present our approach allowing for a global explanation of black box ML models. Section 4 shows the preliminary experiments done to validate our proposal. In Sect. 5, we conclude the paper and draw some research lines for future work.
2 Background on LIME
Interpretability of ML models reflects the ability to provide meaning in terms understandable to humans. It is crucial for trusting the system and getting insights from its decisions. The quality of an explanation can be improved by making it more interpretable, faithful, and model-agnostic [12]. Faithfulness represents how well the explanation describes the actual behavior of the model. Model-agnostic methods can be used with any type of model. LIME, introduced by Ribeiro et al. [13], is one of the best-known examples of such methods. It is a framework that explains a prediction by approximating it locally with an interpretable model (Algorithm 1).
The basic idea of LIME is to replace a data instance x by its interpretable representation \(x'\) through a mapping function \(\varPhi \). For example, an image is represented as a group of super-pixels, and a text as a binary vector indicating the presence or absence of each word. The interpretable representations are more easily understandable and closer to human intuition. Then, \(x'\) is perturbed to generate a set of new instances. The black-box model makes predictions on the instances generated from \(x'\), which are weighted according to their proximity to \(x'\). Finally, for the explanation, an interpretable model, such as a linear model, is trained on the weighted data to explain the prediction locally.
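As an illustration, the perturb-and-fit procedure above can be sketched as follows. This is a minimal sketch, not the official `lime` package: the binary on/off sampling, the RBF weighting, and the `Ridge` surrogate are simplifying assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(f, x, n_samples=1000, kernel_width=0.75, k=5, rng=None):
    """Explain prediction f(x) for a binary interpretable representation x
    (e.g., word presence/absence) by fitting a weighted linear surrogate
    on perturbations of x. Returns the k most influential feature indices
    and the surrogate's coefficients."""
    rng = np.random.default_rng(rng)
    p = x.shape[0]
    # Perturb x' by randomly switching features off
    mask = rng.integers(0, 2, size=(n_samples, p))
    z = x * mask
    # Weight samples by proximity to x (RBF kernel on normalized distance)
    d = np.sqrt(((z - x) ** 2).sum(axis=1)) / np.sqrt(p)
    weights = np.exp(-(d ** 2) / kernel_width ** 2)
    y = f(z)  # black-box predictions on the perturbed samples
    g = Ridge(alpha=1.0).fit(z, y, sample_weight=weights)
    top = np.argsort(-np.abs(g.coef_))[:k]
    return top, g.coef_
```

In the real LIME algorithm, the surrogate is additionally sparsified (e.g., via K-LASSO) to retain exactly k features.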
2.1 LIME: Fidelity-Interpretability Trade-off
Authors in [13] define an explanation as a model \( g \in G\), where G is a class of potentially interpretable models (e.g., linear models, decision trees). Let \( \varOmega {(g)} \) be a measure of complexity (as opposed to interpretability) of the explanation g; for linear models, for example, \(\varOmega (g)\) may be the number of non-zero weights. The model being explained is denoted by \(f : \mathbb R^d \longrightarrow \mathbb R\). Let \(\pi _x\) define a locality around x and \(\mathcal {L}(f; g; \pi _x) \) be a measure of how unfaithful g is in approximating f in the locality \(\pi _x\). The explanation produced by LIME is then obtained by the following minimization problem [13]:
$$\begin{aligned} \xi (x) = \mathop {\mathrm {arg\,min}}\limits _{g \in G}\; \mathcal {L}(f; g; \pi _x) + \varOmega (g) \end{aligned}$$
2.2 Explaining Global Behavior
LIME explains a single prediction locally. Then, it picks K explanations that "must be representative" to show to the user. Submodular Pick, described in Algorithm 2, is used to choose the instances to be inspected for global understanding. The quality of the selected instances is critical for getting insights from the model in a reasonable time.
Let \(\mathrm {\textbf{X}}\) (with \(|\mathrm {\textbf{X}}|= n\)) be the set of instances to explain. Algorithm 2 computes an explanation matrix \(\mathrm {\textbf{W}}\in \mathbb R^{n \times d'}\) using the individual explanations given by Algorithm 1. Then, it computes a global feature importance \(I_{j}\) for each column j of W, such that the highest importance scores are given to features that explain a large number of different instances. Submodular Pick then aims at finding the set of instances V, \(|V| \le \mathrm {\textbf{B}}\), that achieves the highest coverage, defined as the total importance of the features appearing in at least one selected instance. Finally, a greedy algorithm builds V by iteratively adding the instance with the highest marginal coverage gain.
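The greedy coverage maximization can be sketched as follows, assuming the square-root global importance \(I_j = \sqrt{\sum _i |\mathrm {\textbf{W}}_{ij}|}\) used in [13]:

```python
import numpy as np

def submodular_pick(W, budget):
    """Greedy Submodular Pick sketch: W[i, j] is the importance of feature j
    in the explanation of instance i. Iteratively add the instance with the
    highest marginal coverage gain until `budget` instances are selected."""
    importance = np.sqrt(np.abs(W).sum(axis=0))  # global importance I_j
    covered = np.zeros(W.shape[1], dtype=bool)   # features covered so far
    V = []
    for _ in range(budget):
        # Marginal gain: importance of not-yet-covered features the
        # candidate instance would add; already-picked instances get -1.
        gains = [((~covered) & (np.abs(W[i]) > 0)) @ importance
                 if i not in V else -1.0
                 for i in range(W.shape[0])]
        best = int(np.argmax(gains))
        V.append(best)
        covered |= np.abs(W[best]) > 0
    return V
```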
3 Proposed Approach
The approach we propose in this paper consists of two sequential phases (see Fig. 1). The first uses LIME (without loss of generality, any other explainer could be used) to obtain explanations of the predictions on the test data. The second focuses on global explainability by co-selecting the most important test instances and features, thus providing a global understanding of the black-box model.
3.1 Explanation Space
Let f be a black-box model, \(\mathrm {\textbf{X}}\) a test dataset of n instances, and \(\varPhi (\mathrm {\textbf{X}})=\mathrm {\mathbf {X'}}\) its interpretable representation in \(\mathbb R^p\). First, to obtain an individual explanation of the prediction made by f for each instance \(x_i\), we use LIME by fitting a linear model on a dataset generated around \({x'}_{i}\), the interpretable representation of \(x_i\). Thus, for each instance \(x_i\), we obtain an explanation of length k (\(k < p\)). It is worth noting that the length is a user-set parameter corresponding to the number of features retained.
Once the individual explanations have been obtained, we construct an explanation space represented by \(\mathrm {\textbf{E}}\in \mathbb R^{n\times m}\), where the dimension m of the explanation space corresponds to the union of the k features of each explanation. We illustrate this step with the following example:
Example 1
Let \(\mathrm {\mathbf {X'}}\) be the interpretable representation of 3 instances in \(\mathbb R^{500}\), and \(k=3\) be the length of the explanation desired for these three instances. By performing LIME algorithm on \(\mathrm {\mathbf {X'}}\), we obtain 3 explanations of length 3:
where \(e_1,e_2\), and \(e_3\) are the explanations of \(x'_1\), \(x'_2\) and \(x'_3\) respectively. Thus, the matrix \(\mathrm {\textbf{E}}\in \mathbb R^{3\times 7}\) can be seen as the concatenation of all the explanations and the union of the set of features obtained by each explanation. Note that the dimension m here is equal to 7.
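The construction of \(\mathrm {\textbf{E}}\) can be sketched as follows. The input format (one `{feature: weight}` dict of length k per instance) is an assumption for illustration; any explainer output can be mapped to it.

```python
import numpy as np

def build_explanation_space(explanations):
    """Build the explanation matrix E from per-instance explanations given
    as {feature: weight} dicts. Columns correspond to the union of all
    features selected across the individual explanations."""
    features = sorted(set().union(*[e.keys() for e in explanations]))
    col = {f: j for j, f in enumerate(features)}
    E = np.zeros((len(explanations), len(features)))
    for i, e in enumerate(explanations):
        for f, w in e.items():
            E[i, col[f]] = w
    return E, features
```

With three explanations of length 3 whose features overlap in one word, the union has 7 features and \(\mathrm {\textbf{E}}\in \mathbb R^{3\times 7}\), as in Example 1.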
3.2 Global Explainability by Co-selection
Understanding the model’s decisions globally remains a complex task. In fact, some approaches like LIME were extended to face this complexity by picking only a group of individual explanations. In this paper, we advocate a method allowing global explainability by co-selecting the most important instances and features over the explanations provided by any explainer. The idea is to find a residual matrix \(\mathrm {\textbf{R}}\) and a transformation matrix \(\mathrm {\textbf{W}}\) that maps the high-dimensional explanation data \(\mathrm {\textbf{E}}\) to low-dimensional data \(\mathrm {\textbf{E}}\mathrm {\textbf{W}}\), so as to maximize the global similarity between \(\mathrm {\textbf{E}}\) and \(\mathrm {\textbf{E}}\mathrm {\textbf{W}}\). Once the optimal \(\mathrm {\textbf{W}}\) and \(\mathrm {\textbf{R}}\) have been obtained, the original features and instances are ranked based on the \(\ell _{2,1}\)-norm values of the rows of \(\mathrm {\textbf{W}}\) and the columns of \(\mathrm {\textbf{R}}\), and the top features and instances are selected accordingly.
3.3 Notation
First, we present the notation used in this paper. Let \(\mathrm {\textbf{E}}\) be an explanation matrix of n instances and m features. The \(l_{2,1}\)-norm of \(\mathrm {\textbf{E}}\) is:
$$\begin{aligned} \Vert \mathrm {\textbf{E}}\Vert _{2,1} = \sum _{i=1}^{n}\sqrt{\sum _{j=1}^{m}\mathrm {\textbf{E}}_{ij}^{2}} \end{aligned}$$
and its Frobenius norm (\(l_{2,2}\)) is:
$$\begin{aligned} \Vert \mathrm {\textbf{E}}\Vert _{F} = \sqrt{\sum _{i=1}^{n}\sum _{j=1}^{m}\mathrm {\textbf{E}}_{ij}^{2}} \end{aligned}$$
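Both norms are straightforward to compute; a quick sketch:

```python
import numpy as np

def norm_21(E):
    # l2,1-norm: sum of the row-wise Euclidean norms
    return np.sqrt((E ** 2).sum(axis=1)).sum()

def norm_fro(E):
    # Frobenius (l2,2) norm: Euclidean norm of all entries
    return np.sqrt((E ** 2).sum())
```

Note that the \(l_{2,1}\)-norm induces row-sparsity when used as a regularizer, which is what makes it suitable for selection.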
3.4 Co-Selection Pick (CoSP)
To perform a co-selection of instances and features on the explanation matrix, we minimize the following problem, as pointed out in [1]:
$$\begin{aligned} \min _{\mathrm {\textbf{W}},\mathrm {\textbf{R}}}\; \Vert \mathrm {\textbf{W}}^T\mathrm {\textbf{E}}^T - \mathrm {\textbf{Z}}^T - \mathrm {\textbf{R}}\Vert _{F}^{2} + \lambda \Vert \mathrm {\textbf{W}}\Vert _{2,1} + \beta \Vert \mathrm {\textbf{R}}^T\Vert _{2,1} \end{aligned}$$(5)
where:
- Z is the eigen-decomposition of the pairwise similarity matrix \(\mathrm {\textbf{A}}\), computed over the explanation matrix \(\mathrm {\textbf{E}}\). Note that the similarity matrix \(\mathrm {\textbf{A}}\) can be calculated in a supervised fashion (e.g., adjacency matrix, fully binary matrix) if the labels of the test instances are available, or in unsupervised mode as follows:
$$\begin{aligned} \mathrm {\textbf{A}}_{ij} = e^{-\frac{\Vert e_i-e_j \Vert ^2}{2\delta ^2}} \end{aligned}$$(6)
- \(\mathrm {\textbf{R}}={\textbf {W}}^T\mathrm {\textbf{E}}^T-{\textbf {Z}}^T-\varTheta \) is a residual matrix, where \(\varTheta \) is a random matrix, usually assumed to follow a multi-dimensional normal distribution [15]. Note that the matrix \(\mathrm {\textbf{R}}\) is a good indicator of outliers and of less important and irrelevant instances in a dataset, according to [17, 18].
- \(\lambda \) and \(\beta \) are regularization parameters used to control the sparsity of \(\mathrm {\textbf{W}}\) and \(\mathrm {\textbf{R}}\), respectively; \(\delta \) is the parameter of the RBF kernel used to compute the matrix \(\mathrm {\textbf{A}}\) in the unsupervised mode of Eq. (6).
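The unsupervised construction of \(\mathrm {\textbf{A}}\) in Eq. (6) and the eigen-decomposition yielding Z can be sketched as below; truncating Z to the top-d eigenvectors is our assumption for how the embedding dimension is fixed.

```python
import numpy as np

def similarity_and_embedding(E, delta=1.0, d=2):
    """Unsupervised mode: RBF similarity between explanation rows (Eq. 6),
    then eigen-decompose the symmetric matrix A and keep the eigenvectors
    associated with the d largest eigenvalues as Z."""
    sq = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)  # pairwise ||e_i-e_j||^2
    A = np.exp(-sq / (2 * delta ** 2))
    vals, vecs = np.linalg.eigh(A)                       # A is symmetric
    Z = vecs[:, np.argsort(vals)[::-1][:d]]              # top-d eigenvectors
    return A, Z
```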
The first term of the objective in Eq. (5) exploits the \(\mathrm {\textbf{E}}\) structure by preserving the pairwise explanations similarity while the second and third terms are used to perform feature selection and instance selection, respectively. In order to minimize Eq. (5), we adopt an alternating optimization over \(\mathrm {\textbf{W}}\) and \(\mathrm {\textbf{R}}\) as in [1], by solving two reduced minimization problems:
Problem 1: Minimizing Eq. (5) by fixing \(\mathrm {\textbf{R}}\) to compute \(\mathrm {\textbf{W}}\) (for feature selection). To solve this problem, we consider the Lagrangian function of Eq. (5):
$$\begin{aligned} \mathcal {L}_W = \Vert \mathrm {\textbf{E}}\mathrm {\textbf{W}} - \mathrm {\textbf{Z}} - \mathrm {\textbf{R}}^T\Vert _{F}^{2} + \lambda \, Tr(\mathrm {\textbf{W}}^T\mathcal {D}_W\mathrm {\textbf{W}}) \end{aligned}$$(7)
Then, we calculate the derivative of \(\mathcal {L}_W\) w.r.t. \(\mathrm {\textbf{W}}\):
$$\begin{aligned} \frac{\partial \mathcal {L}_W}{\partial \mathrm {\textbf{W}}} = 2\,\mathrm {\textbf{E}}^T(\mathrm {\textbf{E}}\mathrm {\textbf{W}} - \mathrm {\textbf{Z}} - \mathrm {\textbf{R}}^T) + 2\lambda \,\mathcal {D}_W\mathrm {\textbf{W}} \end{aligned}$$(8)
where \(\mathcal {D}_W\) is a (\(m\times m\)) diagonal matrix with the \(i^{th}\) element equal to \(\frac{1}{2\parallel {\textbf {W}}(i,:)\parallel _2 }\). Subsequently, we set the derivative to zero to update \(\mathrm {\textbf{W}}\):
$$\begin{aligned} \mathrm {\textbf{W}} = (\mathrm {\textbf{E}}^T\mathrm {\textbf{E}} + \lambda \mathcal {D}_W)^{-1}\mathrm {\textbf{E}}^T(\mathrm {\textbf{Z}} + \mathrm {\textbf{R}}^T) \end{aligned}$$(9)
Problem 2: Minimizing Eq. (5) by fixing \(\mathrm {\textbf{W}}\) to compute the solution for \(\mathrm {\textbf{R}}\) (for explanation selection). To solve this problem, we consider the Lagrangian function of Eq. (5):
$$\begin{aligned} \mathcal {L}_\mathrm {\textbf{R}} = \Vert \mathrm {\textbf{W}}^T\mathrm {\textbf{E}}^T - \mathrm {\textbf{Z}}^T - \mathrm {\textbf{R}}\Vert _{F}^{2} + \beta \, Tr(\mathrm {\textbf{R}}\mathcal {D}_\mathrm {\textbf{R}}\mathrm {\textbf{R}}^T) \end{aligned}$$(10)
Then, we calculate the derivative of \(\mathcal {L}_\mathrm {\textbf{R}}\) w.r.t. \(\mathrm {\textbf{R}}\):
$$\begin{aligned} \frac{\partial \mathcal {L}_\mathrm {\textbf{R}}}{\partial \mathrm {\textbf{R}}} = -2(\mathrm {\textbf{W}}^T\mathrm {\textbf{E}}^T - \mathrm {\textbf{Z}}^T - \mathrm {\textbf{R}}) + 2\beta \,\mathrm {\textbf{R}}\mathcal {D}_\mathrm {\textbf{R}} \end{aligned}$$(11)
where \(\mathcal {D}_\mathrm {\textbf{R}}\) is a (\(n\times n\)) diagonal matrix with the \(i^{th}\) element equal to \(\frac{1}{2\parallel \mathrm {\textbf{R}}^T(i,:)\parallel _2 }\). Subsequently, we set the derivative to zero to update \(\mathrm {\textbf{R}}\):
$$\begin{aligned} \mathrm {\textbf{R}} = (\mathrm {\textbf{W}}^T\mathrm {\textbf{E}}^T - \mathrm {\textbf{Z}}^T)(\mathrm {\textbf{I}} + \beta \mathcal {D}_\mathrm {\textbf{R}})^{-1} \end{aligned}$$(12)
where I is a (\(n\times n\)) identity matrix. All of the above developments are summarized in Algorithm 3.
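The alternating optimization can be sketched as follows. The closed-form updates used here, \(\mathrm {\textbf{W}} = (\mathrm {\textbf{E}}^T\mathrm {\textbf{E}} + \lambda \mathcal {D}_W)^{-1}\mathrm {\textbf{E}}^T(\mathrm {\textbf{Z}} + \mathrm {\textbf{R}}^T)\) and \(\mathrm {\textbf{R}} = (\mathrm {\textbf{W}}^T\mathrm {\textbf{E}}^T - \mathrm {\textbf{Z}}^T)(\mathrm {\textbf{I}} + \beta \mathcal {D}_\mathrm {\textbf{R}})^{-1}\), are our reading of Eqs. (9) and (12); the warm start, iteration count, and the epsilon guarding the \(1/\Vert \cdot \Vert \) terms are implementation assumptions.

```python
import numpy as np

def cosp(E, Z, lam=1.0, beta=1.0, n_iter=50, eps=1e-8):
    """Alternating optimization sketch for CoSP.
    E: (n, m) explanation matrix; Z: (n, d) spectral embedding."""
    n, m = E.shape
    I = np.eye(n)
    W = np.linalg.pinv(E) @ Z          # warm start for the transformation
    R = W.T @ E.T - Z.T                # initial residual (d, n)
    for _ in range(n_iter):
        # Feature-side update (Eq. 9): D_W from current row norms of W
        Dw = np.diag(1.0 / (2 * np.linalg.norm(W, axis=1) + eps))
        W = np.linalg.solve(E.T @ E + lam * Dw, E.T @ (Z + R.T))
        # Instance-side update (Eq. 12): D_R from current column norms of R
        Dr = np.diag(1.0 / (2 * np.linalg.norm(R, axis=0) + eps))
        R = (W.T @ E.T - Z.T) @ np.linalg.inv(I + beta * Dr)
    return W, R
```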
3.5 Algorithm Analysis
In Algorithm 3, the final user expects a selection of the \(\mathrm {\textbf{B}}\) instances (i.e., explanations) and \(\mathrm {\textbf{L}}\) features that are most relevant for a global explanation of the model. To achieve this, CoSP requires four hyper-parameters, \(\lambda \), \(\beta \), \(\delta \) and h, used to build the set of chosen instances and features. Firstly, we build the explanation matrix \(\mathrm {\textbf{E}}\) using any explainer; in our case we use LIME. Secondly, we compute the similarity matrix \(\mathrm {\textbf{A}}\) either in supervised mode (as an adjacency or binary matrix) or in an unsupervised way, depending on the availability of the labels of the test instances \(\mathrm {\textbf{X}}\). Then, we eigen-decompose \(\mathrm {\textbf{A}}\) to find Z. From line 9 to line 13, \(\mathrm {\textbf{W}}\) and \(\mathrm {\textbf{R}}\) are updated until convergence according to Eqs. (9) and (12). After the alternating optimization, we rank the instances and the features according to \(\mathrm {\textbf{R}}\) and \(\mathrm {\textbf{W}}\), respectively: the higher the norm \(\parallel {\textbf {R}}(:,j)\parallel _2\), the less representative the \(j^{th}\) explanation, while the higher the norm \(\parallel {\textbf {W}}(i,:)\parallel _2\), the more important the \(i^{th}\) feature.
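The final ranking step can be sketched directly from these two criteria:

```python
import numpy as np

def rank_selection(W, R, B, L):
    """Rank instances and features after convergence. Instances with a
    large column norm ||R(:, j)||_2 behave like outliers (least
    representative), so we keep the B instances with the smallest column
    norms of R; features are kept by largest row norms of W."""
    inst_scores = np.linalg.norm(R, axis=0)   # one score per instance
    feat_scores = np.linalg.norm(W, axis=1)   # one score per feature
    picked_instances = np.argsort(inst_scores)[:B]
    picked_features = np.argsort(-feat_scores)[:L]
    return picked_instances, picked_features
```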
4 Experiments
In this section, we conduct experiments to validate our framework on well-known sentiment datasets.
4.1 Datasets
We use a binary sentiment classification dataset. Sentiment analysis is the task of analyzing people's opinions, reviews, and comments presented as textual data. It gives intuition about different points of view and feedback by detecting relevant words used to express specific sentiments [10]. Today, companies rely on sentiment analysis to improve their strategies: people's opinions are collected from different sources such as Facebook, tweets, and product reviews, and processed to understand customers' needs and improve marketing plans. When sentiment is divided into positive and negative, the task is called binary sentiment analysis, the most common type and the one used in our case, whereas multi-class sentiment analysis classifies text into a larger set of possible labels. We use the Multi-Domain Sentiment Dataset, which contains reviews from multiple domains (books and DVDs) collected from Amazon.com, with hundreds of thousands of reviews per product type. We then follow an experiment introduced in [13] that evaluates whether explanations can help a simulated user to recognize the best model among models having the same accuracy on the validation set. To do this, a new dataset is generated by adding 10 artificial features to the training and validation sets of the original dataset (reviews). In the training examples, each of these features appears in 10% of the instances of one class and in 20% of the other class. In the test examples, an artificial feature appears in 10% of the examples of both classes. This simulates spurious correlations in the data introduced by non-informative features.
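This injection of spurious features can be sketched as below. The helper name and binary-feature format are hypothetical; only the 10%/20% appearance rates follow the setup of [13].

```python
import numpy as np

def add_artificial_features(X, y, n_fake=10, train=True, rng=None):
    """Append 'FAKE' binary features: in training data each fake feature
    fires in 10% of one class and 20% of the other; in test data it fires
    in 10% of both classes, so it carries no real signal."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    fakes = np.zeros((n, n_fake))
    for j in range(n_fake):
        for cls, rate in [(0, 0.1), (1, 0.2 if train else 0.1)]:
            idx = np.where(y == cls)[0]
            on = rng.choice(idx, size=int(rate * len(idx)), replace=False)
            fakes[on, j] = 1.0
    return np.hstack([X, fakes])
```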
4.2 Evaluation and Results
We train pairs of classifiers until their validation accuracies are within 0.1% of each other, while their test accuracies differ by at least 5%, making one classifier clearly better than the other. Then, we explain the global behavior of both classifiers using our proposed approach, CoSP.
To validate our approach, we use the experimental setting introduced in [7]: the top five important features per class are chosen as the most relevant ones for the classification task. The global approach is validated if it selects distinguishing features. The results shown in Figs. 3 and 4 were produced by applying CoSP with hyper-parameters \(\lambda \approx 2.11\), \(\beta \approx 61.79\), \(\delta = 1\) and \(h = 17000\) (the number of features selected by CoSP). First, the selected words are meaningful for judging the type of comment: the features are aligned with human intuition, and words with no representative meaning, such as stop words, were not selected. Second, the noisy features labeled with the prefix "FAKE" added to the dataset were not deemed important.
5 Conclusion
In this paper, we presented CoSP, a generic framework that selects individual instances in order to provide a global explanation of machine learning models. We used similarity-preserving co-selection as the foundation for building a global understanding of the black box's internal logic on top of any local explainer. Furthermore, we conducted experiments showing that CoSP offers representative insights. This study is another step towards understanding machine learning models globally. For future work, we would like to explore this method in the context of time series data, as finding representative illustrations for this type of data is challenging.
References
Benabdeslem, K., Mansouri, D.E.K., Makkhongkaew, R.: sCOs: semi-supervised co-selection by a similarity preserving approach. IEEE Trans. Knowl. Data Eng. 34(6), 2899–2911 (2022). https://doi.org/10.1109/TKDE.2020.3014262
Bistron, M., Piotrowski, Z.: Artificial intelligence applications in military systems and their influence on sense of security of citizens. Electronics 10(7) (2021). https://www.mdpi.com/2079-9292/10/7/871
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM Comput. Surv. 51(5) (2018)
Gunning, D., Aha, D.: DARPA’s explainable artificial intelligence (XAI) program. AI Mag. 40(2), 44–58 (2019)
Holm, E.A., et al.: Overview: computer vision and machine learning for microstructural characterization and analysis. CoRR abs/2005.14260 (2020). https://doi.org/10.1007/s11661-020-06008-4
Kłosowski, P.: Deep learning for natural language processing and language modelling. In: 2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), pp. 223–228 (2018). https://doi.org/10.23919/SPA.2018.8563389
Linden, I.V.D., Haned, H., Kanoulas, E.: Global aggregations of local explanations for black box models. CoRR abs/1907.03039 (2019). https://arxiv.org/abs/1907.03039
Lundberg, S., et al.: Explainable AI for trees: from local explanations to global understanding. ArXiv abs/1905.04610 (2019)
Lundberg, S., Lee, S.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, pp. 4765–4774 (2017)
Minaee, S., Kalchbrenner, N., Cambria, E., Nikzad, N., Chenaghlu, M., Gao, J.: Deep learning-based text classification. ACM Comput. Surv. (CSUR) 54, 1–40 (2021)
Mohaghegh, F., Murthy, J.: Machine learning and computer vision techniques to predict thermal properties of particulate composites. CoRR abs/2010.01968 (2020). https://arxiv.org/abs/2010.01968
Ribeiro, M., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) (2016). https://www.fatml.org/schedule/2016/presentation/why-should-i-trust-you-explaining-predictions
Ribeiro, M., Singh, S., Guestrin, C.: “Why should I trust you?": Explaining the predictions of any classifier. In: Krishnapuram, B., et al. (eds.) Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016, pp. 1135–1144. ACM (2016). https://doi.org/10.1145/2939672.2939778
Shailaja, K., Seetharamulu, B., Jabbar, M.A.: Machine learning in healthcare: a review. In: 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 910–914 (2018). https://doi.org/10.1109/ICECA.2018.8474918
She, Y., Owen, A.B.: Outlier detection using nonconvex penalized regression. CoRR abs/1006.2592 (2010). https://arxiv.org/abs/1006.2592
Štrumbelj, E., Kononenko, I.: Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 41(3), 647–665 (2013). https://doi.org/10.1007/s10115-013-0679-x
Tang, J., Liu, H.: CoSelect: feature selection with instance selection for social media data. In: Proceedings of the 13th SIAM International Conference on Data Mining, 2–4 May 2013. Austin, Texas, USA, pp. 695–703. SIAM (2013)
Tong, H., Lin, C.: Non-negative residual matrix factorization with application to graph anomaly detection. In: Proceedings of the Eleventh SIAM International Conference on Data Mining, SDM 2011, 28–30 April 2011, Mesa, Arizona, USA, pp. 143–153. SIAM/Omnipress (2011)
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Meddahi, K. et al. (2022). Towards a Co-selection Approach for a Global Explainability of Black Box Machine Learning Models. In: Chbeir, R., Huang, H., Silvestri, F., Manolopoulos, Y., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2022. WISE 2022. Lecture Notes in Computer Science, vol 13724. Springer, Cham. https://doi.org/10.1007/978-3-031-20891-1_8