A Bibliometric Analysis of the Explainable Artificial Intelligence Research Field

Alonso, Jose M.; Castiello, Ciro; Mencar, Corrado

doi:10.1007/978-3-319-91473-2_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 853)

Included in the following conference series:

International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems


Abstract

This paper presents the results of a bibliometric study of recent research on eXplainable Artificial Intelligence (XAI) systems. We took a global look at the contributions of scholars in XAI as well as at the subfields of AI that are most involved in the development of XAI systems. Notably, we found that about one third of the contributions in XAI come from the fuzzy logic community. Accordingly, we examined in depth how fuzzy logic contributions connect with AI to promote and improve XAI systems in the broad sense. Finally, we outlined new research directions aimed at strengthening the integration of different fields of AI, including fuzzy logic, toward the common objective of making AI accessible to people.


1 Introduction

In the era of the Internet of Things and Big Data, data scientists are required to extract valuable knowledge from the given data. They first analyze, curate and pre-process the data; then, they apply Artificial Intelligence (AI) techniques to automatically extract knowledge from the data [1]. Bringing AI into widespread real-world usage requires careful thought about many important issues. Among them, we would like to highlight (1) Ethics, (2) Law and (3) Technology.

Recently, ACM issued a Statement on “Algorithmic Transparency and Accountability”, which establishes a set of principles, consistent with the ACM Code of Ethics, to support the benefits of algorithmic decision-making while addressing ethical and legal concerns [2]. Among such principles, Explanation is of relevance for this study. According to ACM: “Systems and institutions that use algorithmic decision-making are encouraged to produce explanations regarding both the procedures followed by the algorithm and the specific decisions that are made. This is particularly important in public policy contexts.”

In addition, the new European General Data Protection Regulation (GDPR; see Note 1) is expected to take effect in 2018 [3]. It addresses the protection of natural persons with regard to the processing and free movement of personal data. Moreover, it emphasizes the “right to explanation” of European citizens: “[...] decision-making based on such processing, including profiling, should be allowed [...] In any case, such processing should be subject to suitable safeguards, which should include [...] the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision.”

Regarding technological issues, explainability in AI is also highlighted in the latest challenge issued by the US Defense Advanced Research Projects Agency (DARPA) [4]: “Even though current AI systems offer many benefits in many applications, their effectiveness is limited by a lack of explanation ability when interacting with humans.”

Accordingly, non-expert users, i.e., users without a strong background in AI, require a new generation of explainable AI (XAI) systems. Such systems are expected to interact naturally with humans by providing comprehensible explanations of decisions that are made automatically. XAI systems can also be considered an important step toward Collaborative Intelligence [5], which promises a fully accepted integration of AI in our society.

In this paper, we report the results of a bibliometric study of recent research on XAI systems. We are interested in assessing the contributions of AI scholars to XAI, as well as in identifying the subfields of AI that are most involved in the development of XAI systems. More specifically, we are interested in the role of the fuzzy logic community in the progress of XAI, exploring how fuzzy logic contributions connect with AI to promote and improve system explainability. Along the way, we hope to outline new research directions aimed at strengthening the integration of different fields of AI, including fuzzy logic, toward the common objective of making AI accessible to people.

The rest of the manuscript is organized as follows. Section 2 introduces material and methods. Section 3 presents our bibliometric analysis focused on XAI. Section 4 provides additional details focusing on Interpretable Fuzzy Systems (IFS) only. Finally, Sect. 5 summarizes the main points of the study and outlines future work.

2 Material and Methods

2.1 Bibliometric Techniques

Scientometrics is informally defined as the discipline that studies the quantitative features and characteristics of science and scientific research, technology and innovation. Within Scientometrics, Bibliometrics deals with the statistical analysis of books, articles, or other kinds of publications [6].

Usually, bibliographical data are processed with statistical and mathematical methods, and results are visualized in the form of tables and graphs. For example, Vargas-Quesada and Moya-Anegón [7] proposed a methodology for creating visual representations of scientific domains. They focused on illustrating interactions among authors and papers through citations and co-citations. Later, other authors generalized the idea and developed alternative methods and tools (e.g., [8, 9]) to create maps of linked items (scientific publications, scientific journals, researchers, research organizations, countries, or keywords).

Different types of links between pairs of items can be considered. As an example, let us briefly introduce the concept of item co-citation. Given a set of items, all potential links among pairs of items can be characterized by the standardized co-citation measure [10] as follows:

$$\begin{aligned} MCN_{ij}=\frac{Cc_{ij}}{\sqrt{c_{i} \cdot c_{j}}} \end{aligned} \tag{1}$$

where $Cc_{ij}$ is the number of co-citations of items i and j, $c_{i}$ and $c_{j}$ are the numbers of citations received by each item, and i and j are two different items.
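As a minimal sketch of Eq. (1), the following Python snippet computes $MCN_{ij}$ from a toy collection of reference lists. The function name `standardized_cocitation`, the toy data, and the reading of $Cc_{ij}$ as the number of documents citing both i and j are illustrative assumptions, not the code used in this study.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def standardized_cocitation(reference_lists):
    """Return {(i, j): MCN_ij} for every pair of items co-cited at least once.

    reference_lists: one collection of cited items per citing document.
    """
    citations = Counter()    # c_i: number of documents citing item i
    cocitations = Counter()  # Cc_ij: number of documents citing both i and j
    for refs in reference_lists:
        unique = sorted(set(refs))  # count each item once per document
        citations.update(unique)
        cocitations.update(combinations(unique, 2))
    return {
        (i, j): cc / sqrt(citations[i] * citations[j])
        for (i, j), cc in cocitations.items()
    }

# Hypothetical toy data: three citing documents and the items they cite.
docs = [["A", "B", "C"], ["A", "B"], ["A", "C"]]
mcn = standardized_cocitation(docs)
```

In this toy case, B and C are each cited twice and co-cited once, so the link value for the pair is 1/√(2·2) = 0.5.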

The link values ($MCN_{ij}$) define the adjacency matrix of a graph which can be analyzed and visualized with social network analysis (SNA) techniques [11]. These techniques have already been applied to multiple fields of research, such as software development (e.g., debugging multi-agent systems [12]), scientometrics (e.g., analyzing large scientific domains [13]), or fuzzy modeling (e.g., analyzing fuzzy rule-bases with fingrams [14]).

There are many metrics designed to assess the importance of a node in a bibliographical graph (e.g., degree centrality, closeness, betweenness or PageRank) [7]. In addition, there are many different methods for graph visualization [15]. Among them, force-directed algorithms are the most widely used in information science [16]. Their purpose is to place the nodes of a graph in a 2D or 3D space so that all the edges are of approximately equal length and there are as few crossing edges as possible, with the aim of obtaining the most aesthetically pleasing view. There are also many clustering techniques aimed at discovering communities (i.e., groups of highly related nodes) in accordance with the importance of each single node and how it is connected to the others [17].
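To make two of the node-importance metrics mentioned above concrete, the sketch below computes degree centrality and a basic power-iteration PageRank on a small weighted graph. The graph, the function names, and the parameter values (damping factor 0.85, 100 iterations) are assumptions chosen for illustration; the tools used in this study may compute these metrics differently.

```python
def degree_centrality(adjacency):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(adjacency)
    return {
        node: sum(1 for w in nbrs.values() if w > 0) / (n - 1)
        for node, nbrs in adjacency.items()
    }

def pagerank(adjacency, damping=0.85, iterations=100):
    """Power-iteration PageRank on a weighted graph without isolated nodes."""
    nodes = list(adjacency)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    # Total link weight leaving each node, used to normalize its outgoing flow.
    strength = {v: sum(adjacency[v].values()) for v in nodes}
    for _ in range(iterations):
        new_rank = {}
        for v in nodes:
            incoming = sum(
                rank[u] * adjacency[u][v] / strength[u]
                for u in nodes
                if v in adjacency[u]
            )
            new_rank[v] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank

# Hypothetical symmetric link weights (e.g., MCN values) among four items.
graph = {
    "A": {"B": 0.8, "C": 0.8, "D": 0.3},
    "B": {"A": 0.8, "C": 0.5},
    "C": {"A": 0.8, "B": 0.5},
    "D": {"A": 0.3},
}
dc = degree_centrality(graph)
pr = pagerank(graph)
```

Node A is linked to every other node, so it obtains the highest score under both metrics in this toy graph.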

2.2 Bibliographic Repositories

Bibliographic data can be retrieved from different sources such as Web of Science (WoS) or Scopus. WoS appears not to be adequate for assessing publications and citations in Informatics (see Note 2). In addition, some other sources may be too large (e.g., Google Scholar) or too specialized (e.g., ACM DL, IEEEXplore, etc.). Therefore, in this work we focus on Scopus, which also offers advanced search functionalities that are useful for selecting meaningful sets of items on which to build our bibliometric analysis. The choice of Scopus as a bibliographical source comes without loss of generality: we performed a preliminary study on data collected from WoS, and the main trends and general conclusions remained unchanged (only minor variations were detected).

Finally, it should be highlighted that the data collected from Scopus were curated in order to remove spurious information that would have hampered the subsequent steps of our analysis.

2.3 Bibliographic Analysis Tools

We used two tools to analyze the results of search queries from Scopus:

Bibliometrix [18] - An R package for performing comprehensive quantitative research in Scientometrics and Bibliometrics. It allows importing bibliographic data from several sources (including Scopus and WoS). In addition, it evaluates co-citation as well as other kinds of measures, such as coupling, scientific collaboration and co-word analyses.
VOSviewer [9] - A software tool for constructing and visualizing bibliometric networks based on citation, co-citation, bibliographic coupling, co-authorship or word co-occurrence relations. Some clustering methods to identify related groups or communities are also provided.

3 A Global Overview on XAI

On October 20th, 2017, we ran the following query through the “Advanced Search” tool provided by Scopus:

Q1 = TITLE (“*interpretab*”) OR TITLE (“*comprehensib*”)
OR TITLE (“*understandab*”) OR TITLE (“*explainab*”)
OR TITLE (“*self-explanat*”) OR KEY (“*interpretab*”)
OR KEY (“*comprehensib*”) OR KEY (“*understandab*”)
OR KEY (“*explainab*”) OR KEY (“*self-explanat*”)

The query returned 5735 documents. It is worth noting that this query is intentionally very general in order to broaden the global picture of the research field under examination. We identified only 5 general terms, with their variants represented by the * symbol. We required at least one of these terms to be present in the title of the retrieved document or in the associated keywords (provided by authors or automatically indexed).
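The structure of Q1 (five wildcard terms, each searched in two fields and joined by OR) can be sketched as follows. The helper `build_query` and the use of straight quotes are assumptions made for illustration; the snippet only assembles the query string and does not contact Scopus.

```python
# The five general terms and the two searched fields described in the text.
TERMS = ["interpretab", "comprehensib", "understandab", "explainab", "self-explanat"]
FIELDS = ["TITLE", "KEY"]

def build_query(terms, fields):
    """Join one wildcard clause per (field, term) pair with OR."""
    clauses = [f'{field} ("*{term}*")' for field in fields for term in terms]
    return " OR ".join(clauses)

q1 = build_query(TERMS, FIELDS)
print(q1)
```

Keeping the terms and fields in separate lists makes it easy to check that every term is searched in every field, which is the property the query relies on.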

Figure 1 depicts the number of XAI publications since 1960 (top picture) and the distribution of publications in the top-10 ranking of subject fields (bottom picture). The number of publications has grown significantly since 2000. Accordingly, we decided to focus our analysis only on the years from 2000 to 2017.

XAI represents a multidisciplinary research field, as witnessed by the variety of subject areas. However, three of them (Computer Sciences, Mathematics, and Engineering) account for most of the publications. Therefore, we restrict our attention to publications in these research areas. In this way, the final number of publications to analyze is 3737. We downloaded from Scopus all the related bibliographical information in the form of csv and bib files.

Table 1 presents the Top-5 rankings of authors (columns 2–4) and countries (columns 5 and 6) with respect to h-index, total number of citations (TC) and publications (NP). Herrera stands as the leading author in terms of h-index, TC and NP. USA is by far the leading country in terms of both TC and NP.
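For readers unfamiliar with the indicators used in Table 1, the sketch below computes the h-index, TC and NP for a single author from a list of per-publication citation counts. The citation counts are hypothetical and the function name `h_index` is an assumption; the snippet relies on the standard definition of the h-index (the largest h such that h publications have at least h citations each).

```python
def h_index(citation_counts):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for one author's publications.
cites = [25, 8, 5, 3, 3, 1]
summary = {"h_index": h_index(cites), "TC": sum(cites), "NP": len(cites)}
```

With these counts, three publications have at least three citations each but only three have at least four, so the h-index is 3, with TC = 45 and NP = 6.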

Table 1. Top-5 ranking of XAI authors and countries in terms of h-index, Total Citations (TC) and Number of Publications (NP).


Table 2. Top-5 ranking of XAI publications in terms of Total Citations (TC) and Average Citations per Year (ACY).


The leading publications are listed in Table 2 in terms of TC and average citations per year (ACY). Guillaume [19] reviewed methods for automatically designing IFS; this is the most cited publication and ranks fifth in terms of ACY. Aleven and Koedinger [20] described how to improve students’ learning with a computer-based approach endowed with self-explanation; this is the second most cited publication and ranks fourth in terms of ACY. Jin [21] authored the third most cited publication, presenting a fuzzy modeling approach designed to improve the interpretability of high-dimensional systems; it falls outside the Top-5 in terms of ACY. García et al. [22] reviewed statistical techniques for achieving a good interpretability-accuracy trade-off in genetics-based machine learning; this is the fourth most cited paper and the second in terms of ACY. The fifth publication in terms of TC (outside the ACY Top-5) comes from Ishibuchi and Nojima [23], who applied a multi-objective genetics-based machine learning approach to build fuzzy systems with a good interpretability-accuracy trade-off. The picture is completed by Martínez and Herrera [24] (first paper in terms of ACY), who proposed a linguistic model for solving decision-making problems, and Gacto et al. [25] (third paper in terms of ACY), who reviewed interpretability indexes for assessing IFS. Notice that Herrera co-authored 3 of the Top-5 publications in terms of ACY, which emphasizes his leading role in the XAI research field (see Table 1).

The leading sources in XAI are depicted in the pie chart on the left of Fig. 2. Most papers are published in conference proceedings. Nevertheless, the Top-5 papers (see Table 2) appear in well-recognized journals.

Figure 3 shows a graph with the most popular author keywords in the publications under study. Each node is associated with a keyword and its size is proportional to the number of documents in which the keyword appears. Interpretability is the main keyword since it is associated with the largest node. Understandability and classification are the second and third main keywords. Links between nodes relate keywords which usually appear together in the same documents.
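The construction of such a keyword map can be sketched in a few lines: node sizes count the documents containing each keyword, and link weights count the documents in which two keywords appear together. The function name `keyword_network`, the lowercase normalization, and the toy keyword lists are assumptions for illustration only; they do not reproduce the data or the processing behind Fig. 3.

```python
from collections import Counter
from itertools import combinations

def keyword_network(keyword_lists):
    """Return (node_size, link_weight) counters for a keyword co-occurrence graph."""
    node_size = Counter()    # documents in which each keyword appears
    link_weight = Counter()  # documents in which a pair of keywords co-occurs
    for keywords in keyword_lists:
        # Normalize case and whitespace, and count each keyword once per document.
        unique = sorted({k.strip().lower() for k in keywords})
        node_size.update(unique)
        link_weight.update(combinations(unique, 2))
    return node_size, link_weight

# Hypothetical author keywords for four documents.
docs = [
    ["Interpretability", "fuzzy modeling"],
    ["interpretability", "classification", "data mining"],
    ["understandability", "software engineering"],
    ["interpretability", "classification"],
]
sizes, links = keyword_network(docs)
```

In this toy example, interpretability is the largest node (three documents) and its strongest link is with classification (two shared documents).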

This graph gives a global overview of the main topics of interest in the XAI research field, with groups of closely related nodes painted in the same color (see Note 3). On the one hand, interpretability is closer to topics usually addressed in the fuzzy logic community (e.g., fuzzy modeling or rule selection). On the other hand, understandability is surrounded by keywords related to software engineering. The gap between the main keywords is filled by other relevant nodes such as comprehensibility or self-explanation. Moreover, interpretability and comprehensibility are related to a group of keywords including popular topics in AI (e.g., classification, data mining or knowledge discovery). One community of keywords is partially disconnected from the rest of the graph (e.g., semantic web, ontology, and so on), and some single nodes lie away from the others (e.g., interpretability logic or image interpretability). This is because they relate to specific research lines. Notice that NIIRS stands for National Imagery Interpretability Rating Scale, which is a subjective scale for rating the quality of images.

When we turn to author co-citation, we look for pairs of authors cited by the same publications. Figure 4 shows the co-citation map obtained with VOSviewer (the minimum number of total citations per author is set to 50). The size of each node is proportional to the number of citations, while link weights come from the co-citation index defined by Eq. (1). Most nodes are concentrated on the left-hand side of the map. Again, Herrera stands out as the main node.

4 Detailed Analysis on Interpretable Fuzzy Systems

We replicated the previous analysis with a modified query:

Q2 = Q1 AND “fuzz*”

By adding “fuzz*” to Q1 we focus our search on publications in the XAI field that are related to fuzzy sets and systems. Hereafter, we refer to this field of research as IFS. In addition, we filtered the collected results by adopting the same constraints imposed in the previous section: (1) years range [2000–2017] and (2) subject areas [Computer Sciences, Mathematics and Engineering]. As a result, we obtained 1054 documents, which amounts to about 28% of the 3737 documents previously analyzed.

The Top-5 rankings of authors and countries concerning IFS are detailed in Table 3. Most authors in Table 3 are also present in Table 1, which confirms the relevance of the fuzzy community in the context of XAI. However, USA (the leading country in Table 1) is now out of the Top-5. Moreover, European countries take up the Top-5 in terms of TC. China and India appear only when looking at NP. These data reflect the relevance of European scholars in the fuzzy community and their outstanding leadership in IFS.

Table 3. Top-5 ranking of IFS authors and countries.


Table 4. Top-5 ranking of IFS publications.


Table 4 lists the leading publications in IFS and reflects once again the relevance of the fuzzy community in the context of XAI. All the papers in the TC ranking already appeared in Table 2: the current Top-3 is included in the TC ranking related to XAI, while [24, 25] appear in the Top-3 of the ACY ranking of Table 2. Actually, only the work authored by Fazzolari et al. [26] (a review of multi-objective evolutionary fuzzy systems devoted, among other things, to finding a good interpretability/accuracy trade-off) is a new entry with respect to Table 2.

Looking carefully at the map of author keywords in Fig. 5, we miss some of the important topics highlighted in Fig. 3. For example, comprehensibility and understandability seem to play a much more prominent role in XAI than in IFS. It could be argued that Fig. 5 may be read as a zoom produced in a specific area of Fig. 3, namely the one related to the interpretability node. This suggests that many important issues in XAI are still to be addressed by IFS scholars.

Finally, Fig. 6 shows the map of author co-citation in IFS. Once again, the current map looks like a zoom of the left-hand side of Fig. 4.

5 Concluding Remarks

The results reported in the previous sections allow a number of considerations. First, a strong community of fuzzy logic scholars is devoting its studies to the theme of XAI, with special emphasis on interpretability. In fact, interpretability studies in fuzzy logic started from pioneering works in 1999, and about one third of the selected papers belong to the fuzzy logic mainstream. As a result, many of the most influential authors and papers in XAI belong to the fuzzy community. However, if we compare Fig. 3 with Fig. 5 we observe that, within the fuzzy community, the main notions of interpretability, comprehensibility, understandability and explainability are not as clearly distinct as in XAI. Rather, interpretability has a major role, while the other keywords are either treated as synonyms or used distinctly only in very specialized studies.

Furthermore, a deeper analysis of the co-citation graph lets us observe a clear separation between authors in the fuzzy community and authors in XAI not related to fuzzy logic. This can be appreciated in Fig. 7, where a zoom of the left-hand side of Fig. 4 is provided: authors related to fuzzy logic appear to be aggregated in two compact clusters (the yellow and the green ones in Fig. 7), which also indicates a tight interconnection of the related research activities. On the other hand, authors in XAI not related to fuzzy logic appear to be loosely distributed, a sign of more scattered collaboration.

This analysis suggests at least two lines of development. Firstly, there is a need to clarify and distinguish the notions of interpretability, comprehensibility, understandability and explainability to provide a common terminological ground inside the varied XAI context. This could also shed light on refined conceptualizations where fuzzy logic could significantly contribute. Moreover, an opportunity emerges to tighten the connections of studies between fuzzy and non-fuzzy worlds of XAI, which now appear unnecessarily separated. We strongly believe that cross-fertilization between these communities is needed to successfully face the challenges posed by XAI.

Notes

1. http://eur-lex.europa.eu/legal-content/en/TXT/?uri=CELEX%3A32016R0679.
2. Informatics Research Evaluation (Draft), An Informatics Europe Report. http://www.informatics-europe.org/working-groups/research-evaluation.html.
3. The graph was generated by VOSviewer employing the suggested default parameters for layout visualization and clustering of nodes. Other clustering approaches may be applied, but choosing the best approach is out of the scope of this paper.

References

1. Philip Chen, C., Zhang, C.: Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf. Sci. 275, 314–347 (2014)
2. ACM US Public Policy Council: Statement on Algorithmic Transparency and Accountability (2017)
3. Goodman, B., Flaxman, S.: European Union regulations on algorithmic decision-making and a “right to explanation”. In: ICML Workshop on Human Interpretability in Machine Learning (WHI), New York, NY, pp. 1–9 (2016)
4. Gunning, D.: Explainable Artificial Intelligence (XAI). Technical report, Defense Advanced Research Projects Agency, Arlington, USA, DARPA-BAA-16-53 (2016)
5. Epstein, S.L.: Wanted: collaborative intelligence. Artif. Intell. 221, 36–45 (2015)
6. De Bellis, N.: Bibliometrics and Citation Analysis: From the Science Citation Index to Cybermetrics. Scarecrow Press, Lanham (2009)
7. Vargas-Quesada, B., Moya-Anegón, F.: Visualizing the Structure of Science. Springer, Heidelberg (2007). https://doi.org/10.1007/3-540-69728-4
8. Cobo, M., López-Herrera, A., Herrera-Viedma, E., Herrera, F.: Science mapping software tools: review, analysis, and cooperative study among tools. J. Assoc. Inf. Sci. Tech. 62, 1382–1402 (2011)
9. Van Eck, N., Waltman, L.: Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84, 523–538 (2010)
10. Salton, G., Bergmark, D.: A citation study of computer science literature. IEEE Trans. Prof. Commun. 22, 146–158 (1979)
11. Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press, Cambridge (1994)
12. Serrano, E., Quirin, A., Botia, J., Cordón, O.: Debugging complex software systems by means of pathfinder networks. Inf. Sci. 180(5), 561–583 (2010)
13. Moya-Anegón, F., Vargas-Quesada, B., Herrero-Solana, V., Chinchilla-Rodríguez, Z., Corera-Álvarez, E., Muñoz-Fernández, F.J.: A new technique for building maps of large scientific domains based on the cocitation of classes and categories. Scientometrics 61(1), 129–145 (2004)
14. Pancho, D., Alonso, J., Cordón, O., Quirin, A., Magdalena, L.: FINGRAMS: visual representations of fuzzy rule-based inference for expert analysis of comprehensibility. IEEE Trans. Fuzzy Syst. 21(6), 1133–1149 (2013)
15. di Battista, G., Eades, P., Tamassia, R., Tollis, I.: Graph Drawing: Algorithms for the Visualization of Graphs. Prentice Hall, Upper Saddle River (1998)
16. Kobourov, S.G.: Force-directed drawing algorithms. In: Tamassia, R. (ed.) Handbook of Graph Drawing and Visualization. CRC Press, Boca Raton (2012)
17. Porter, M., Onnela, J., Mucha, P.: Communities in networks. Not. Am. Math. Soc. 56(9), 1082–1166 (2009)
18. Aria, M., Cuccurullo, C.: bibliometrix: an R-tool for comprehensive science mapping analysis. J. Informetr. 11(4), 959–975 (2017)
19. Guillaume, S.: Designing fuzzy inference systems from data: an interpretability-oriented review. IEEE Trans. Fuzzy Syst. 9(3), 426–443 (2001)
20. Aleven, V., Koedinger, K.: An effective metacognitive strategy: learning by doing and explaining with a computer-based cognitive tutor. Cogn. Sci. 26(2), 147–179 (2002)
21. Jin, Y.: Fuzzy modeling of high-dimensional systems: complexity reduction and interpretability improvement. IEEE Trans. Fuzzy Syst. 8(2), 212–221 (2000)
22. García, S., Fernandez, A., Luengo, J., Herrera, F.: A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft. Comput. 13(10), 959–977 (2009)
23. Ishibuchi, H., Nojima, Y.: Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning. Int. J. Approx. Reason. 44(1), 4–31 (2007)
24. Martínez, L., Herrera, F.: An overview on the 2-tuple linguistic model for computing with words in decision making: extensions, applications and challenges. Inf. Sci. 207, 1–18 (2012)
25. Gacto, M., Alcalá, R., Herrera, F.: Interpretability of linguistic fuzzy rule-based systems: an overview of interpretability measures. Inf. Sci. 181(20), 4340–4360 (2011)
26. Fazzolari, M., Alcalá, R., Nojima, Y., Ishibuchi, H., Herrera, F.: A review of the application of multiobjective evolutionary fuzzy systems: current status and further directions. IEEE Trans. Fuzzy Syst. 21(1), 45–65 (2013)


Acknowledgements

This work was supported by RYC-2016-19802 (Ramón y Cajal contract) and two MINECO projects, TIN2017-84796-C2-1-R (BIGBISC) and TIN2014-56633-C3-3-R (ABS4SOW), all of them funded by the Spanish “Ministerio de Economía y Competitividad”. Financial support from the Xunta de Galicia (Centro singular de investigación de Galicia accreditation 2016–2019) and the European Union (European Regional Development Fund - ERDF) is gratefully acknowledged.

Author information

Authors and Affiliations

Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Jose M. Alonso
Department of Informatics, University of Bari “Aldo Moro”, Bari, Italy
Ciro Castiello & Corrado Mencar


Corresponding author

Correspondence to Jose M. Alonso.

Editor information

Editors and Affiliations

Universidad de Cádiz, Cádiz, Cadiz, Spain
Jesús Medina
Universidad de Málaga, Málaga, Málaga, Spain
Manuel Ojeda-Aciego
Universidad de Granada, Granada, Spain
José Luis Verdegay
Universidad de Granada, Granada, Spain
David A. Pelta
Universidad de Málaga, Málaga, Málaga, Spain
Inma P. Cabrera
LIP6, Université Pierre et Marie Curie, CNRS, Paris, France
Bernadette Bouchon-Meunier
Iona College, New Rochelle, New York, USA
Ronald R. Yager


About this paper

Cite this paper

Alonso, J.M., Castiello, C., Mencar, C. (2018). A Bibliometric Analysis of the Explainable Artificial Intelligence Research Field. In: Medina, J., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations. IPMU 2018. Communications in Computer and Information Science, vol 853. Springer, Cham. https://doi.org/10.1007/978-3-319-91473-2_1


DOI: https://doi.org/10.1007/978-3-319-91473-2_1
Published: 18 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91472-5
Online ISBN: 978-3-319-91473-2
eBook Packages: Computer Science, Computer Science (R0)
