Keywords

1 Introduction

Today, more scientific information is being made than in the past, and keeping up with everything published is getting more complicated. Furthermore, it is increasingly referred to as multidisciplinary science (Linnenluecke et al. 2020; Ware and Mabe 2015; Briner and Denyer 2012). Henceforth, a significant step toward cumulative scientific knowledge advancement is the synthesis of past research findings and bibliometric data (Aria and Cuccurullo 2017). Globalization has led to the disappearance of borders, information spread, and scientific knowledge advancement. Furthermore, bibliometric studies shed light on assessing the scientific literature by quantitative attitude. These investigations offer a readable, organized, and repeatable literature review.

For a decade, bibliometric analysis has been extended to all disciplines. However, the field of Applied Linguistics, which is relatively new, has witnessed a great deal of data, multiple studies, and many scientific publications. It is getting harder and more challenging for applied linguists to deal with large and fragmented volumes of data, and in light of the rapid increase in resources in the field of applied linguistics and its various strands, the concept of science mapping has become increasingly important in order to understand the current state of the art, connections, opportunities, and main players within an established community of practitioners, themes, as well as relationships between authors and institutions. (Guler et al. 2016). It is important to draw a diagram showing the interconnectedness of items like research data, locations, themes, and author-to-author and author-to-institutional ties (Şenel and Demir 2018).

According to Shi et al. (2020), bibliometric analysis is a computer program that analyzes data to determine its metrological and content characteristics. Hence, it is an objective and reliable method based on statistical techniques for analyzing information (Broadus 1987; Diodato and Gellatly 2013). Bibliometric analysis is complex due to the fact that it involves several steps. For this purpose, it is necessary to use a variety of complex and varied analysis and mapping tools, some of which may only be made available under a commercial license.

Bibliometric analysis can be done with numerous software or package programs such as VOSviewer (Eck et al. 2010), CiteSpace (Chen 2006), SciMAT (Cobo et al. 2012), Bibexcel, Biblioshiny (Aria and Cuccurullo 2017). This chapter aims to provide a thorough introduction to a free, open-source R package for conducting a systematic study of scientific literature. Moreover, The Bibliometrix package offers a collection of quantitative research instruments for bibliometrics and scientometrics. To do so, Biblioshiny integrates the features of the bibliometrix package with the simplicity of the Shiny package environment for web applications. Besides, the attributes of the bibliometrix package and the convenience of the Shiny-based web application for users led to Biblioshiny’s popularity (Aria and Cuccurullo 2017).

This chapter aims to provide a general and systematic overview of scientometric studies. To do this, I developed a project to map out the status of two topics published in Language and Linguistics and Education and Educational Research. In addition, comprehensive information about the English language and technology-related publications is described in detail, and PRISMA guidelines have been followed in extracting and filtering data from WoS.

Three sections have been created to carry out bibliometric research using Biblioshiny: (1) data collection, (2) data entering and (3) data visualization (Fig. 1).

Fig. 1
A chart includes data collection for gathering data from the base, data entering for importing raw files, loading bibliometrics files, using a sample collection, and data visualization for different software parts.

Bibliometric research’s steps through Biblioshiny

Data is the dynamic engine of bibliometric analysis. The clearer and more thorough our data, the more precise and accurate our results will be. According to Visser et al. (2021), “Bibliographical databases” are digital collections of references that are used to represent scientific literature in an orderly manner, such as journals, conference proceedings, patents, books, and more. In most cases, they include highly detailed descriptions of the topics covered in the form of keywords, subject categorization phrases, or abstracts. In Biblioshiny, raw files with the following formats can be imported to access major bibliometric databases: BIB, TXT, CIW, CSV, bibliometrix files: rdata, xls, and using sample collections. Bibliometric’s numerous available methods make importing bibliographic information from Scopus, Clarivate Analytics’ Web of Science, Dimensions, The Lens, PubMed, and Cochrane possible. Almost 36 million articles are available through the Web of Science (WoS), making it one of the most popular bibliographic databases. Scopus offers access to approximately 20 million articles. In terms of the quality of the data it contains, the Web of Science is superior to other databases (Aria et al. 2020; Harzing and Alakangas 2016). Despite the acceptable status of WoS metadata, three metadata items are completely missing from Scopus (Keywords Plus, Number of cited References, and Science Categories). Lastly, the data must be cleansed. In order for a calculation to be accurate, it must be based on data that is precise. A variety of pre-processing techniques can be employed, for example, to identify duplicates or misspellings. Although bibliometric data is generally reliable, there may be some discrepancies due to the fact that a single reference may include multiple editions of the same publication or different spellings of the author’s name.

2 Data Gathering

To gather the information we need, combine a list of phrases and items in the search box that may be used to locate articles in the specific field that make use of some methods that connect with the Boolean operators (“AND”, “OR”, and “NOT”). Make sure that research is limited to the target area and related fields. Next, the time period will be defined. It is also essential for the user to determine which type of document will be analyzed (article, book chapter, book review, the editorial). There is a wide range of languages used in academic papers from which to choose depending on the specifics of our investigation.

The inclusion criteria of this bibliometric study adhered to PRISMA’s guiding principles. The term PRISMA stands for preferred reporting items for systematic reviews and meta-analyses. It illustrates the process of screening in a visually appealing manner. Initially, the PRISMA records the number of articles that have been retrieved, and then it makes the selection process transparent by indicating the decisions made at different stages of the systematic review. At each stage, the number of articles is recorded. When excluding articles, researchers should include the reasons for doing so (Moher et al. 2009). The following eight WoS online indexing databases were chosen in accordance with the search strategy: Science Citation Index Expanded (SCI), Social Sciences Citation Index (SSCI), Art and Humanities Citation Index (AHCI), Conference Proceedings Index – Science (CPCI-S), Conference Proceedings Citation Index – Social Science and Humanities (CPCI-SSH), Book Citation Index – Science (BKCI-S), Book Citation Index – Social Sciences and Humanities (BKCI-SSH), and Emerging Sources Citation Indexes (ESCI). Figure 2 depicts the PRISMA flow diagram utilized in the bibliometric study.

Fig. 2
2 tables of 5 columns and 16 rows. The column headers are metadata, description, Scopus, missing counts, missing percentage, and status on the left. Metadata, description, W o S, missing counts, missing percentage, and status on the right.

Scopus and WoS metadata status

In the following query, only articles that meet the following criteria will be considered: (“Computer Assisted Language Learning” OR “CALL” OR “Mobile Assisted Language Learning” OR “MALL” OR “Robot Assisted Language Learning” OR “RALL”) AND (“Applied Linguistics” OR “ESL” OR “EFL” OR “L2 English” OR “English as a Second Language” OR “English as a Foreign Language” OR “English Teaching”) in the title, abstract or in the keyword list. Next, narrow the document topic to include only Language and Linguistics, Education and Educational Research. Ten years was chosen as the time frame for this bibliometric study (2013–2022). Therefore, only articles were selected as a document type. As a final step, English-language articles were included. Furthermore, a plain text data set was downloaded from WoS using the PRISMA flow diagram steps. According to the exclusion criteria, some records were extracted from plain text. A total of 881 articles relating to the English language and technology were analyzed in this study (Fig. 3).

Fig. 3
A flowchart depicts PRISMA. It includes identification, screening, and included for W o S database record identification, since W o S was utilized, no record after duplication was eliminated, records screened to excluded documents, and included documents in the bibliometric analysis.

PRISMA flow diagram

3 Working with R Biblioshiny

To begin, install the most recent version of R (https://cran.r-project.org). Next, install the most recent version of Rstudio (https://rstudio.com). Having completed all these steps, open Rstudio and run the following commands in the top left window: (1) install.packages (“bibliometrix”), (2) library (bibliometrix), (3) bibliometrix::biblioshiny (R Core Team 2014) (Figs. 4 and 5).

Fig. 4
A screenshot of the window screen for the R studio workplace. The menu bar includes file, edit, code, view, plots, session, build, debug, profile, tools, and help. The highlighted rectangular box indicates untitled.

Rstudio workplace

Fig. 5
A screenshot of the window screen depicts code editore. It represents the Biblioshiny dot of install dot packages of bibliometrix, library of bibliometrix, and bibliometrix double colon biblioshiny left and right parenthesis.

Code editore

Once run the program, you enter to graphical user interface (GUI) of the Bibioshiny. On the top right side are four basic icons, each of which is described individually (Fig. 6).

Fig. 6
An illustration includes the first icon on the screen that pertains to software notifications, users may be able to get package tutorials, use this donation button if they wish to contribute, and with the help of this icon, users can access bibliometrix Git Hub as well as K-Synth.

Basic sittings of the Biblioshiny

In addition, users can add the results of specific analyses into the Export section to obtain a CSV file. Meanwhile, it should be noted that in every section of Biblioshiny, a plus icon can be used to include the results in the report. Through the system settings, which is the last icon, the user can alter general modifications of software such as resolutions of export plots as PNG as well as changing height in inches. It is possible to alter general software settings through the system settings, which is the last icon. These adjustments include adjusting the resolution of export plots as PNGs as well as changing the height in inches. It is time to explore the Bibloshiny application. The first step is to select the data format used in the study. A raw data file, a bibliometric file, or a sample collection can be used in this format. Importing and loading files using Biblioshiny’s APIs (Application Programming Interfaces) is also possible. Select the plain text file exported from WoS from the browse button, and then select “Web of Science (WoS/WoK)” as the database. Data from other databases, such as Scopus, PubMed, Lens.org, Dimensions, and Cochrane Library, can be analyzed. The number of documents will appear in the conversion results once the user clicks the start button. Figure 7 shows that 881 documents have been uploaded to the app with different information. An in-depth literature review of the study can be illustrated with Biblioshiny’s user-friendly interface.

Fig. 7
A screenshot depicts a bibliometric data overview with the main information timespan, sources, documents, authors, authors of single-authored docs, international Co-authorship, author's keywords, references, document average age, and average citations per doc.

Bibliometric data overview

3.1 Overview

Figure 7 shows the most important information about the documents in the boxes. A total of 881 articles were included in the study from 217 sources between 2013 and 2022. There were 881 articles related to the English language and technology written by 1538 authors. One author wrote 249 articles. The average number of citations per document was 10.68. Biblioshiny also offered the option to download each plot in multiple formats (CSV, Excel, PDF) and different resolutions. The plots can be exported in a resolution ranging from 75 dpi to 600 dpi.

According to Fig. 8, the highest number of publications was published in 2022 (N = 120). The number of documents produced during this period has fluctuated since 2013. Besides, graphs generated with Plotly are all dynamic, so you can see more information by moving your mouse over them (Table 1).

Fig. 8
A line graph depicts articles versus years from 2013 to 2023 for annual scientific production. The values are (2013, 50), (2016, 120, (2027, 70), (2018, 75), (2020, 72), (2022, 121), and (2023, 50. The values are approximated.

Citations per year for articles

Table 1 Articles published annually

The user can get a graph illustrating the rate of citations over 10 years by clicking on the average citation per year button. Interestingly, the average number of citations per year in 2019 is the highest rate, indicating that one or more articles published in 2019 received the most citations (N = 2.39) (Fig. 9 and Table 2).

Fig. 9
A line graph depicts average citations versus years from 2013 to 2023. The values are (2013, 1), (2014, 2), (2015, 0.5), (2017, 2.5), (2018, 0.6), (2019, 2.3), (2021, 1.7), and (2023, 4.5). The values are approximated.

Citations per year for articles

Table 2 Annual total citation per year

Three-field plots can be selected based on the objective of the researchers by choosing from authors, affiliations, countries, keywords, abstracts, references, and cited sources in accordance with their objectives. Determining the variables and number of items in the middle, left, and right fields is necessary. Upon setting these conditions, clicking the apply button will produce the plot. In order to create it, three meta-data fields must be selected. In this bibliometric study, the author occupied the middle field, the country occupied the left field, and the keywords occupied the right field (Fig. 10).

Fig. 10
A Sankey diagram with a 3-field plot of country, authors, and keywords. The countries such as China, the United Kingdom, Iran, U S A, and Spain. The authors such as Hwang W Y, Lee J S, Ebadi S, and Chen J C. The keywords such as e f l, e f l learners, call, mobile learning, mall, and vocabulary.

Three-field plot

3.2 Sources

The y-axis shows the number of journals and sources that have published at least one of the documents in the bibliometric repertoire, while the x-axis shows the total number of articles published. Computer Assisted Language Learning (N = 226) was the most commonly published journal in this research, International Journal of Computer-assisted Language Learning and Teaching (N = 66), System (N = 30), and Language Learning and Technology (N = 28). Arab World English Journal and RECALL (N = 25). Users may find it useful to click on the options button at the top of the page to find an option that allows them to add more sources to the list (Fig. 11 and Table 3).

Fig. 11
A bar graph depicts sources versus the number of documents. The highest value is Computer-assisted Language Learning 226. Also included are the International Journal of Computer-assisted Language 66, system 30, Language Learning and Technology 28, Arab World English journal, and Recall 25.

Most relevant sources

Table 3 Journal article publication rate

Bradford’s law categorizes articles and journals to identify the relevant number for each topic. Its calculation is based on the sum of articles published by journals in a specific field of study. By doing so, it is possible to identify the core of the journals, which can then be used to analyze the core zone documents. Bradford defined the first zone as the center of specialized journals. In addition, Bradford’s law is also suitable for classifying journals based on their number of citations. It can be concluded that a relatively small number of journals produce a substantial amount of high-quality science based on citations (Fig. 12).

Fig. 12
A schematic concentric circle depicts Bradford's law zone. The arrows indicate core zone, zone 2, and zone 3.

Bradford’s law zone

Computer-Assisted Language Learning Journal and International Journal of Computer Assisted Language Learning and Teaching are two of the core zones among the 217 journals. Consequently, two journals with 292 articles are in the core zone, 26 journals with 301 articles are in the middle zone, and 189 journals with 282 articles are in the minor core (Fig. 13).

Fig. 13
A graph depicts articles versus the source log of rank. The value of the line is decreasing trend. The vertical bar of core sources from computed-assisted language, and international journal of.

Bradford’s law core sources

The H-index, also known as the Hirsch index, is the number of an author’s or journal’s works that have been mentioned in at least one other piece of research. Hence, this index is a more refined version of basic metrics like overall publications or citation counts. The index works best when comparing researchers in the same area due to the large variation in publications across fields. For example, if seven articles from a researcher’s total publication are cited at least seven times in other articles, the H-index of that researcher is seven. Therefore, The H-index is considered the consequence of the equilibrium between articles and citations. In addition to other science measurement indices, such as the number of publications and citations, this index distinguishes influential researchers from those who produce many articles. The m-index is calculated by dividing the scientist’s H-index by the number of years (n) from his or her first publication (journal).

Compared to the H-index, the g-index is superior as a worldwide indicator of an article’s citation impact. If sorted from most to least citations, the g-index is the (unique) greatest number such that the top g articles received (collectively) at least g2 citations. Taking into account source local impact by H index, Computer Assisted Language Learning (H-index = 32), System (H-index = 15), Language Learning and Technology (H-index = 14), and RECALL (H-index = 13) were ranked in that order (Fig. 14 and Table 4).

Fig. 14
A horizontal lollipop graph for sources versus impact measures in height. The highest value is Computer-assisted language learning with 32, and the lowest is Arab world English journal, Asia-Pacific education researcher and Calico journal, with 5 each. It also includes system 15, and recall 13.

The local impact of journals by H-index

Table 4 Information on local impacts of journals

The user will find the publications on the left, while developments for each year can be seen on the right. The number of articles related to English Language and Technology has increased for each journal since 2012. The graph shows that Computer Assisted Language Learning Started with ten articles in 2013 and reached the highest number among other journals in 2022 (N = 171) (Fig. 15 and Table 5).

Fig. 15
A multi-line graph of cumulative occurrences versus years from 2013 to 2023. Lines with increasing trends for sources include Arab World English Journal, Computer-assisted Language Learning, International Journal of Computer-assisted Language Learning, language learning and technology, and systems.

Production of journals over time

Table 5 Annual production overview

3.3 Authors

This section aims to illustrate bibliometric analysis through information about authors, such as the most relevant authors, most locally cited authors, top authors’ production over time, Lotka’s law, and authors’ local impact. In regards to affiliations, the most relevant affiliations will be discussed along with their production over time. Furthermore, a bibliometric analysis of countries will be performed, including the corresponding authors’ countries, the scientific output of countries, the countries’ production over time, and the most cited countries. The following plots (Figs. 16 and 17) and Tables (6 and 7) illustrate the development of an author’s publications and citations over time. In plots 16 and 17, it can be seen that HWANG WY had the highest number of publications (N = 11) and citations (N = 65).

Fig. 16
A horizontal lollipop graph depicts the most relevant authors versus a number of documents. The highest value is Hwang W Y 11, followed by Jiang L J 10, Ebadi S, Rahimi M, and W U W C V 9, Lee J S 8, Shadiev R, Zhang Y 7, Roy D 6, and Chen H H J 5.

Most relevant authors

Fig. 17
A horizontal lollipop graph depicts authors versus local citations. The highest value is Hwang W Y 65, followed by Rahimi M 47, Shadiev R 45, Ebadi S 44, Chen H S L 37, W U W C V 30, Lee J S 28, Marek M W 22, Hsieh J S C, and Huang Y M 21.

Most locally cited authors

Table 6 The number of articles per author
Table 7 Citation information of authors

Figure 18 illustrates the development of authors’ production over time. The circle will likely be darker and larger if you have published more articles than in other years. As an example, LEE JS had 4 publications in 2019 and an average of 22 citations per year. As the number of documents increases, the bubble size increases. Color intensity is directly proportional to the number of citations per year and the number of citations per year. There were 3 articles authored by RAHIMI, M in 2022, and the number of citations he received per year was 22.5. The light pink line indicates the authors’ timeline. Among all the ten-year activities, HWANG WY has the longest timeline (2013–2022).

Fig. 18
A graph of authors versus years from 2013 to 2023. The largest value is Hwang W Y from 2013 to 2022. The smallest value is Jiang L J from 2020 to 2022.

Production of authors over time

Lotka’s law describes the frequency of publication by authors in different fields. There is a fixed ratio between the number of authors publishing a certain number of articles and the number of authors publishing a single article. Lotka’s law can be mathematically represented using a formula showing how it works. Where A represents the number of publications, B is the relative frequency of authors with A publications, n and C are constants that vary depending on the field (n ≈ 2).

$$ {\mathrm{A}}^{\mathrm{n}}\ \mathrm{B}=\mathrm{C} $$

Lotka’s law states that as the number of articles published increases, the number of authors who produce many publications will decrease. A total of 1321 authors (85.1%) have only published one article. This baseline represents authors who have published more than one article, which reveals that 142 authors (0.1%) have 2 publications. Based on this baseline, which represents the number of core authors who have published at least two articles, it can be seen that 142 authors (0.1%) have published at least two articles. On the other hand, the term occasional authors refers to 1321 authors who have published only one publication (86%) (Fig. 19 and Table 8).

Fig. 19
A line graph exhibits the percentage of Lotka's law of author productivity versus documents written. The values of the lines are decreasing trends.

Lotka’s law of author productivity

Table 8 Author productivity based on Lotka’s law

According to Fig. 20, authors are categorized based on the H-index of their publications. Three authors (EBADI S, HWANG WY, LEE JS) in the top tier have recorded an H-index of 7.

Fig. 20
A horizontal lollipop graph depicts the local impact of authors versus impact measure H. The highest values are Ebadi S, Hwang W Y, and Lee J s 7, followed by Jiang L J, Rahimi M, Shadiev R, Tsai S C, and W U W C V 5, Andujar A, and Bahari A 4.

Local impact of authors

3.3.1 Affiliations

As found in this bibliometric analysis, the majority of the authors were from Islamic Azad University in Iran (N = 36), National Taiwan Normal University in China (N = 26), Education University in Hong Kong (N = 20), and Iowa State University in the USA (N = 19) (Fig. 21 and Table 9).

Fig. 21
A horizontal lollipop graph has affiliation versus articles. Values include Islamic Azad U N I V 36, Natl Taiwan normal U N I V 26, Educ U N I V Hong Kong 20, I O W A state, and Natl cent U N I V 19, U N I V M A C A U 15, Razi U N I V 13, and Natl Yunlin U N I V science and technology 12.

Highly affiliated institution

Table 9 Institutional rate of publication

As shown in Fig. 22, the production of the affiliation has increased over time. Iowa State University, among other affiliations, had the highest rate (N = 5) in 2013, whereas Islamic Azad University, which has been growing steadily over the years, hit peak levels in 2022 (N = 36).

Fig. 22
A multi-line graph depicts articles versus years from 2013 to 2023. The affiliations are Educ U N I V Hong Kong, JO W A state, Islamic Azad, Natl central, and Natl Taiwan normal U N I V. The values are increasing trends.

Production over time by institutions

3.3.2 Countries

According to Fig. 23, the authors’ countries of the articles were given, the corresponding authors’ countries indicated multiple country publications (MCPs), and the turquoise bar indicated single country publications (SCP). Accordingly, MCP indicates for each country the number of documents in which at least one coauthor is from another country. MCP measures a country’s international collaboration intensity. There were 227 corresponding authors from China. There is a total of 200 authors from the United States and 99 from Iran. However, most of the authors’ collaborations were limited to contributions from a single country. New Zealand had a high level of international collaboration (N = 6, MCP = 5) (Table 10).

Fig. 23
A stacked bar graph exhibits countries versus the number of documents for M C P and S C P collaboration. The highest value of S C P and M C P is in China, and the lowest value of S C P is in New Zealand, and M C P is in Malaysia and Thailand.

Countries of corresponding authors

Table 10 Information on corresponding authors’ countries

Every author’s nationality contributed to the anthology is considered in this map. There is a direct correlation between the number of articles and the magnitude of the color. As the number of documents increases, the color of countries will be darker. For instance, with 414 documents, China is darker than India, with five documents. Moreover, the USA and Iran, which had a darker color than other countries, were the two countries that ranked second and third after China in terms of the frequency of publications (Fig. 24 and Table 11).

Fig. 24
A world map exhibits the scientific production of the world. The legend at the bottom left indicates the number of documents 414, 18, 5, 3, and 1.

Scientific production of the world

Table 11 Frequency of publication per country

Let’s take a closer look at the results of our study regarding countries’ production since 2013, where China (N = 344) was the leading country, the USA, with 218 documents, was the second country, and Iran (N = 132) was chasing them as the third (Fig. 25).

Fig. 25
A multi-line graph depicts articles versus years from 2013 to 2023. The countries are China, Iran, Japan, the United Kingdom, and the U S A. The values of the curves are increasing trends.

Country production over time

The most cited countries were China (N = 2962), the USA (N = 1547), Iran (N = 719), and Spain (N = 641), respectively. Although the total number of citations in New Zealand, which is in the twelfth place, was approximately one-seventeenth of that of the leading country, China, the average number of article citations in New Zealand had the highest value with 28.67 (Fig. 26 and Table 12).

Fig. 26
A horizontal lollipop graph depicts countries versus the number of citations. The highest value in China is 2962, followed by U S A 1547, Iran at 719, Spain at 641, Turkey at 432, Canada at 417, United Kingdom at 379, Australia at 335, Japan at 319, and Sweden at 267.

Frequently cited countries

Table 12 Information on countries’ citation

3.4 Documents

The next section of the software will provide insight into some of the interesting features of the Biblioshiny by analyzing documents such as documents with the greatest local and global citations, references with the greatest local citations, reference spectrums, the most frequently used words, a word cloud, a tree map, and the frequency of words over time are all displayed (Fig. 27).

Fig. 27
A horizontal lollipop graph exhibits documents versus global citations. The highest value is Cenoz J 2014 Appl Linguist 260, and the lowest value is Sundqvist P 2015 system 84.

Documents with global citation

3.4.1 Cited References

Cenoz J.’s 2014 masterwork in the journal Applied Linguistics garnered the highest number of citations globally (N = 260). Additionally, the article published by HWANG WY in the Computer Assisted Language Learning journal received the most local citations in 2013 (N = 20). Furthermore, the global Citation metric measures the number of citations a document has received from all documents in the database (e.g., WoS or Scopus). Global citations indicate a document’s impact across the entire bibliographic database. For many documents, a significant portion of global citations may originate from a discipline other than the one in which the document is published. In contrast, local citations are a document’s citations from the studied collection.

Similarly, the following figure illustrates the local references that have received the most citations. With 66 citations, VYGOTSKY L.S., 1978, Mind in Society: Development of Higher Psychological Processes, had the highest number of local citations. The second place went to GOLONKA, EM et al. for their computer-assisted language learning work with 43 citations (Fig. 28).

Fig. 28
A horizontal lollipop graph depicts references versus local citations. The highest value is Vygotsky L S 1978 mind soc dev higher at 66, and the lowest value is Ellis R 2003 task-based language, and Horwitz 1986 M O D Lang J at 27.

The most cited local references

A method for determining the historical roots of research themes and topics is known as Reference Publication Year Spectroscopy (RPYS). RPYS highlights years with important findings in a paper’s cited reference profile. By doing so, it is possible to identify the historical roots of a discipline (Marx et al. 2014). A black line represents the number of citations per year, while a red line represents deviations from the median over the past 5 years (Fig. 29).

Fig. 29
A line graph exhibits cited references versus years from 1846 to 2020. The values of the curves are flat from 1846 to 1966, and then fluctuating trends.

Reference publication year spectroscopy

3.4.2 Words

If the user conducts keyword analysis, he or she may encounter some irrelevant words, such as conjunctions, adverbs, and plurals, that are irrelevant to the search. However, it is possible to eliminate words with the help of Bibliometrix for the time being. This issue can be avoided by creating a stop word list and uploading a TXT or CSV file that contains the words the user wishes to remove. Commas or semicolons should be used to separate or arrange items on a table. Additionally, margination analysis is plausible when plausible. This method is similar to the previous one: upload a synonym list in TXT or CSV format. Please note that the first word in the list will be replaced with the following (Fig. 30).

Fig. 30
A horizontal lollipop graph exhibits the author's keywords versus occurrences. The highest value is the Call 85, and the lowest value is e f l learners and mobile learning 24.

Most frequent keywords

CALL was the most frequent word, with 85 occurrences, followed by EFL (N = 69) and Computer-assisted language learning (N = 50) in the top three most frequent words. Various parts of the document may contain keywords that can be extracted and analyzed.

In WordCloud, the top keyword plus, subject categories, author keywords, title, and abstract words are displayed. Additionally, by adjusting parameters in the options, we are able to change some default settings, such as the shape, font type, and font color. Different scale transformations can be selected for a more vivid picture of word distribution, such as frequency, square root, log, and log10 (Fig. 31).

Fig. 31
A word cloud. The keywords such as learners, students, English, education, language, technology, call, classroom, perceptions, feedback, knowledge, motivation, instruction, impact, and others.

WordCloud derived from keywords

TreeMap uses color blocks to display how frequently different words occur visually. In a similar vein, you may analyze the title, abstract, and keywords of a document. Based on the results of TreeMap, we identified concepts such as CALL (N = 85, 10%), EFL (N = 69, 8%), computer-assisted language learning (N = 50, 6%), mobile-assisted language learning (N = 37, 4%), and English as a foreign language (N = 35, 4%) (Fig. 32).

Fig. 32
A tree map of the top 50 occurring keywords. The keywords are call, e f l, computer-assisted language learning, mobile assisted language learning, English as a foreign language, mall, motivation, e f l learners, mobile learning, vocabulary learning, and others.

TreeMap derived from keyword

It is possible to determine the most frequent word, year, and frequency by passing the mouse over the graph. The term “CALL” had 5 times the frequency at the beginning of the study, and by the end of 2022, it had increased to 83 times the frequency. In the beginning, there were similar occurrences of computer-assisted language learning and EFL concepts (N = 5), but EFL reached 64, and computer-assisted language learning was ranked third with 48 occurrences (Fig. 33).

Fig. 33
A line graph of cumulative occurrences versus years 2013 to 2023. The increasing terms include call, computer-assisted language learning, computer-assisted language learning of call, e f l learning, English as a foreign language, mobile assisted language learning, mobile learning, and motivation.

Frequency of words over time

Trend-topic graphs are scatter diagrams in which the x-axis represents time, and the y-axis represents the topic. Each topic’s reference year is determined based on the median of occurrences during the period considered. On the graph, each bubble represents a topic. There is a direct correlation between the size of the bubble and the number of occurrences of the words. Thus, the bubble’s size indicates the term’s frequency and the length of the lines indicates how long it has been studied (Fig. 34).

Fig. 34
A graph depicts terms versus years from 2014 to 2022. The values are increasing trends with the term frequencies 20, 40, 60, and 80.

Trends topic based on author’s keyword

The author’s keyword indicates that English as a foreign language has been a trending topic. Based on the analysis, the other trending topics were English as a foreign language, computer-assisted language learning, and CALL.

3.5 Clustering

The concept of bibliographic coupling can be viewed as the opposite of co-citation. In other words, it refers to a relationship between two or more documents that cite one another. Bibliographic coupling assumes that two documents can be significantly correlated even if they do not directly cite each other. Assuming that the two documents share at least one bibliographic reference, this is a reasonable assumption. Based on their symmetrical alignment and similar references, these documents are similar. The next few figures will illustrate how network clustering, maps, and clusters work by coupling them. It should be noted that some parameters can be altered based on our bibliographic needs, such as the units of analysis (documents, authors, sources), the impact of measures (local citation score, global citation score), and attributes (cluster labeling, coupling measure) (Figs. 35 and 36 and Table 13).

Fig. 35
A bubble chart depicts impact versus centrality for authors' coupling. The terms are mobile-assisted language learning, call, e f l, mall, mobile learning, e f l learners, computer-assisted language learning, motivation, English as a foreign language, and others with different values.

Clustering by authors’ coupling

Fig. 36
A network map on authors. The highlighted keywords are Chang Ke, Lee J h, Zhang Y, Wang Y H, Hwang w y, yang x m, and others.

Network map based on authors

Table 13 Information of clusters based on keywords

3.6 Conceptual Structure

This conceptual structure establishes a map of the scientific field by assessing correspondences, multiple correspondences, and clustering of terms in a 2-dimensional network using a vertices network of terms obtained from the keyword, title, and abstract fields (Hubert 1980). Also, it is composed of three main parts: a co-occurrence network, a thematic map, and a thematic evolution chart. The probable link between two bibliographic items appearing in the same research is evaluated in co-occurrence network analysis. Figures 45 and 46 illustrate the co-occurrence network and degree plot analysis of the author’s keyword. According to the size of the bubbles in the figure, CALL, EFL, and computer-assisted language learning are the most frequent keywords, ranked accordingly, as well as in the degree plot diagram (Figs. 37 and 38).

Fig. 37
A co-occurrence network map based on keywords. The keywords are computer-assisted language learning, e f l, call, mobile-assisted language learning, blended learning, learner autonomy, motivation, feedback, and others.

Co-occurrence network based on keywords

Fig. 38
A line graph exhibits cumulative degree versus node. The decreasing values of the curve with labeled call degree 1, e f l degree 0.787, and computer-assisted language learning degree 0.605.

Degree plot of co-occurrence network based on keywords

According to the thematic map, there were a total of 12 clusters identified. Based on the author’s keywords, the basic (developing) themes were (English, language, student), (call, knowledge, communication), and (competence, second-language acquisition, and negotiation). Modality was an emerging or declining theme. In terms of the motor (developed) theme, no parameters were available. Among the niche topics that were covered were (ESL, world English), (Japanese, words, speech perception), (organization, repair, conversation analysis), (attention, discourse, patterns), and (technology acceptance). A few invariance clusters are located between two of the basic quadrants and the motor quadrants, such as (technology, education, performance) and (accuracy, complexity, and performance). The video, input, and environment clusters were also located among the basic, motor, and emerging themes (Fig. 39).

Fig. 39
A bubble chat depicts the development degree versus the relevance degree of keywords. The keywords such as technology acceptance model, Japanese words speech-perception, organization repair conversation analysis, modality, and English language students.

Thematic map of keywords

The importance of identifying changes in terminology and the evolution of research fields in disciplines such as bibliometrics and scientometrics cannot be overstated. Thematic evolution analysis is a method used to reveal hidden key elements of the study, such as topics, by looking at the theme of the study about a specific topic but from different periods. Figure 47 depicts the evolution of keywords over three distinct periods (2013–2016, 2017–2020, and 2021–2023). It is important to note that the keywords “computer assisted language learning”, “call”, and “mobile assisted language learning” are important keywords as they are present in all three stages; however, covid-19 is only presented in the third stage. Most of the studies focused on using various technologies to improve language learning (Fig. 40).

Fig. 40
A Sankey graph with 3-field plot of keywords. Keywords, 2013 to 2016 are computer-assisted language learning, e f l teachers, grammar, call, and e-learning. 2017 to 2020, call, attitudes, e f l teachers, and activity theory. 2021 to 2023, collaborative writing, language learning, and e f l learning.

Thematic evolution of keywords

3.6.1 Factorial Analysis

After conducting a factorial analysis of the author’s keywords of the articles about the English language and technology, it was determined that the following concepts have a high factor load in the first dimension when examined in this study: computer-mediated communication, collaborative writing, writing, second language acquisition, collaborative learning, second language writing, computer assisted language learning, telecollaboration, identity, call, online learning, pronunciation, language learning, e-learning, listening comprehension, attitudes, vocabulary acquisition, mobile assisted language learning, and English. Likewise, in the second cluster of keywords, you can find the following terms: technology, grammar, feedback, WhatsApp, mall, and CMC, among others (Fig. 41).

Fig. 41
A chart illustrates the conceptual structure map and keyword clusters of D i m 2 versus D i m 1. The keywords such as collaborative learning, telecollaboration, identity, online learning, e-learning, English, grammar, c m c, feedback, and WhatsApp.

Conceptual map and keyword clusters

Dendrograms are diagrams that depict hierarchical relationships between objects, illustrated by their hierarchical arrangement. By contrast, hierarchical clustering illustrates the arrangement of the clusters determined by the corresponding analyses through a diagrammatic representation. As the name suggests, a dendrogram is primarily used to determine the best way to allocate objects to clusters based on their characteristics. Furthermore, the distance between the clusters can be seen on the Y-axis of the graph, which is the distance between the clusters, and on the X-axis, there are the subject concepts for which the data points of the clusters are grouped (Fig. 42).

Fig. 42
A dendrogram depicts the hierarchical relationship between objects. It is divided into many clusters. The highest cluster includes WhatsApp, c m c, technology, language learning, mall, feedback, and grammar.

Topic dendrogram

3.7 Intellectual Structure

An intellectual structure is a methodological technique for identifying what authors, documents, or sources have had a major impact on the academic field (Kessler 1963; Small 1973). Several basic concepts describe a field, such as the major themes, subthemes, and patterns that are associated with that field. Co-citation network data shows that the result is a result of the impact of different sources on each other, which is clear from this graph (Fig. 43).

Fig. 43
A network diagram represents a co-citation network of sources. The sources such as computer-assisted language 1, system, recall, language learning technology, and computer educations.

Co-citation network of sources

In addition, the degree plot diagram shows the cumulative degree of the sources based on the order in which they appear. Using the co-citation network, it is possible to see how the sources influence one another and how they are interconnected. The degree plot diagram represents the sources’ cumulative degree in order of occurrence (Fig. 44).

Fig. 44
A line graph of cumulative degree versus node for a co-citation network of sources. The decreasing value of the curve labeled computer-assisted language learning degree 1, language learning technology degree 0.958, system degree 0.877, modern language journal degree 0.795, and recall degree 0.781.

Degree-plot of co-citation network of sources

Hence, as shown in Fig. 45, the historigraph of the study gives a full view of the document, keyword, and first author’s collaboration, along with the year in which it was completed.

Fig. 45
A network diagram illustrates the first authors. The authors include Wang Y J 2013, Hwang W Y 2013, and 2014, Yen Y C 2015, Chen T 2016, and Shadev R 2017.

Historigraph of first authors

3.8 Social Network

Collaboration networks can be used to describe the social structure at the various levels of scientific cooperation (Glänzel 2002). In the last section, we are able to see how authors, institutions, and countries collaborate, which illustrates an easier understanding of the issue. China was regarded as the number one country, followed by the USA and Iran. It is worth noting that Islamic Azad University in Iran was considered a leading institution in the world, followed by the National Normal University of Taiwan and Education University of Hong Kong. Among the authors with the most collaborations, HWANG WY, RAHIMI M, JIANG LJ, and WU WCV had the highest collaboration rates, respectively (Figs. 46, 47, and 48).

Fig. 46
A network diagram illustrates the collaboration of authors. The authors include Jiang L J, W U W C V, Hwang W Y, Shadiev R, Edadi S, and Rahmi M.

Collaboration network of authors

Fig. 47
A network diagram of collaboration of institutions. The institution such as Islamic Azad University, Education University Hong Kong, Natl Taiwan Normal University, and Nati Central University.

Collaboration network of institutions

Fig. 48
A network diagram illustrates the collaboration of countries. Countries such as China, Iran, the U S A, Canada, Norway, Bangladesh, Sweden, and Spain.

Collaboration network of countries

Additionally, this multi-purpose software allows users to visualize the collaboration between countries on the map. The frequency of the number of publications counted determines the thickness of each curve on the map of international collaborations. Thus, there is a very dense network between the USA and China. In other words, the USA has become CHINA’s first international co-authorship partner with 30 frequencies. The United Kingdom became China’s second international partner with 8 frequencies (Fig. 49).

Fig. 49
The world map represents the countries' collaboration. The thickness of the curve indicates the frequency of the number of publications. The dense network between U S A and China.

Countries’ collaboration across the world

4 Conclusion

It is very necessary and intriguing for academic professionals to be aware of the most popular research topics, such as books, journals, and other works that have made significant contributions to their field. Hence, scholars are encouraged to stay updated with research trends in their profession to make more educated judgments about the topics they will investigate and remain on top of emerging trends (Lei and Liu 2019). Furthermore, the significance of science mapping as a vital activity is becoming increasingly obvious to researchers working in every field of the scientific discipline. There is no doubt that the number of publications is expanding at an accelerating rate and that many of these articles are emerging in a vague manner, which complicates the process of knowledge accumulation (Aria and Cuccurullo 2017). Meanwhile, policy and practice rely upon that, and it helps determine the intellectual framework and research frontiers of scientific subjects. Generally, it is common to use specialized software tools to perform only certain steps in a science mapping analysis. In fact, only a few of them are available to scholars that permit them to follow the complete workflow from start to finish. A comprehensive science mapping analysis of scientific literature can be implemented using the open-source software Bibliometrix. It is possible to gain information about a particular work’s intellectual structure and conceptual framework due to bibliometric analysis, a data-driven approach to analyzing the literature. In this way, the user can gain insight into the progress of research on certain topics (Zupic and Čater 2015, 9).

Science mapping is becoming an essential activity for scholars of all scientific disciplines. However, in bibliometrics, the use of scientific workflows is still in its early stages. Moreover, it has become increasingly difficult to accumulate knowledge as the number of publications continues to grow exponentially (Aria and Cuccurullo 2017; Guler et al. 2016). Software tools that are specialized in science mapping analysis are commonly used only to perform certain steps of the process. In fact, only a few of these software allow scholars to follow the entire workflow in detail. This article has demonstrated how scientific workflow managers such as Biblioshiny, a powerful tool for managing bibliometric analyses that allows users to integrate online databases, statistical analysis, and data visualization, can facilitate scientific workflow management. Based on the above, identifying the intellectual structure and research frontiers of scientific domains has become vital to research, policy-making, and practice as a whole.

In summary, this study was conducted in order to introduce a bibliometric analysis using the Bibliometrix package and to introduce the Biblioshiny interface, which is easily implemented by the R programming language, and to perform a bibliometric analysis using the Biblioshiny. This study focused on English language and technology-related publications published in the fields of Language and Linguistics, Education and Educational Research. It is worth mentioning that between 2013 and 2022, 881 articles were searched in the WoS database.