Abstract
Explainable machine learning aims to reveal why black-box models make the decisions they do. Counterfactual explanation is an example-based post-hoc explanation method: a counterfactual explanation seeks the minimum perturbation to an original instance that changes the model's output. This study reviews the existing literature on counterfactual explanations and related topics. We provide formal definitions of counterfactual explanations and counterfactual explainers, together with a summary and formulaic description of the properties desired of generated counterfactual instances. In addition, we investigate the application of counterfactual explanations in two areas: model robustness and the generation of feature importance. The findings demonstrate that the properties required of counterfactual instances cannot be simultaneously satisfied by present methods. Finally, we discuss potential directions for future research.
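The minimum-perturbation idea described in the abstract is commonly formalized along the lines of Wachter et al.'s optimization view of counterfactuals; the following is an illustrative sketch only, with the symbols (black-box model f, original instance x, counterfactual x', desired output y', distance measure d) chosen here for exposition rather than taken from this paper:

```latex
% Illustrative sketch of the standard counterfactual objective.
% f: black-box model, x: original instance, x': candidate counterfactual,
% y': desired (changed) output, d: a distance/cost measure on inputs.
\begin{equation*}
  x^{\ast} \;=\; \operatorname*{arg\,min}_{x'} \; d(x, x')
  \quad \text{subject to} \quad f(x') = y', \qquad y' \neq f(x).
\end{equation*}
```

In practice the hard constraint is often relaxed into a penalized objective, trading off closeness to x against reaching the target prediction y'.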
Acknowledgement
This work is supported by the National Natural Science Foundation of China under Grant No. 62172316 and the Key R&D Program of Hebei under Grant No. 20310102D. This work is also supported by the Key R&D Program of Shaanxi under Grant No. 2019ZDLGY13-03-02, and the Natural Science Foundation of Shaanxi Province under Grant No. 2019JM-368.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Dai, L., Peng, Q., Tang, R., Li, X. (2023). A Survey of Counterfactual Explanations: Definition, Evaluation, Algorithms, and Applications. In: Xiong, N., Li, M., Li, K., Xiao, Z., Liao, L., Wang, L. (eds) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 153. Springer, Cham. https://doi.org/10.1007/978-3-031-20738-9_99
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20737-2
Online ISBN: 978-3-031-20738-9
eBook Packages: Intelligent Technologies and Robotics (R0)