
1 Introduction

Scientific social networks such as ResearchGate and free-of-charge open access repositories such as the Computing Research Repository (CoRR) have significantly lowered the barrier for sharing research results in the form of individual papers. Open access repositories for complete proceedings of scientific events include the Proceedings of Machine Learning Research (PMLR) and the Electronic Proceedings in Theoretical Computer Science (EPTCS), addressing specific fields of computer science, and the CEUR Workshop Proceedings (CEUR-WS.org), addressing workshops from all over computer science. Each of these employs its own publishing workflow, which proceedings editors and/or authors need to follow strictly to keep the effort low for those who run the service, usually volunteers. For example, PMLR requires editors to provide a BibTeX metadata database following specific rules, EPTCS acts as an overlay to CoRR, i.e. requires papers to be pre-published there, and CEUR-WS.org requires editors to provide an HTML table of contents following a certain structure. Here, we focus on facilitating the latter by adding a web-based graphical user interface to a tool that auto-generates such tables of contents, addressing the usability issues of that tool's previous standalone command-line version.

Section 2 provides a more precise problem statement. Section 3 discusses related work. Section 4 presents the design and implementation of our web-based user interface. Section 5 evaluates the usability of the frontend in comparison to its command-line backend. Section 6 concludes with an outlook on future work.

2 Problem Statement

2.1 The Publishing Workflow

The HTML table of contents of a CEUR-WS.org workshop proceedings volume includes metadata about the workshop (title, date, venue, proceedings editors, etc.) and each of its papers (title, authors). This structure is prescribed; so far it has evolved a bit roughly once a year, e.g., by adding more explicit semantic annotations to facilitate reuse of the metadata. Besides following the latest template and producing syntactically valid HTML, requirements for proceedings editors include following a consistent capitalisation scheme for paper titles, providing full names of authors, and using relative links to the full texts of the individual papers (typically PDF files). The HTML table of contents together with the full texts has to be submitted to CEUR-WS.org as a ZIP archive.

2.2 Automation of the Workflow with Ceur-Make

Traditionally, proceedings editors had to prepare the submission ZIP file manually. With ceur-make, the second author, technical editor of CEUR-WS.org, has provided a tool to automate part of this job, aiming at three objectives:

  • Helping proceedings editors to learn more quickly how to create a table of contents, reducing their effort, and helping recurrent editors to cope with structural changes.

  • Reducing the workload of the volunteers who carry out the subsequent publishing steps at CEUR-WS.org; so far, around one in ten submissions requires further communication with its editors to resolve problems, mainly rooted in the table of contents.

  • Reducing the impact that subsequent improvements to the structure of the table of contents have on both proceedings editors and the CEUR-WS.org team, by reducing their exposure to manual editing.

For ceur-make, the metadata about the workshop and its papers has to be provided in two XML files: one describing the workshop and one describing its papers, i.e. the table of contents. ceur-make can auto-generate the latter XML file from the metadata that the widely used EasyChair submission and review management system exports in LNCS mode (cf. Sect. 3.1). From these two XML files, ceur-make auto-generates an HTML table of contents and finally a ZIP archive conforming with the CEUR-WS.org requirements. In addition, ceur-make can generate a BibTeX database to facilitate the citation of the papers in a proceedings volume, as well as a copyright form by which the authors agree to the publication of their papers with CEUR-WS.org.

2.3 Shortcomings of Ceur-Make

Shortcomings of ceur-make include that it depends on a Unix-style command line environment and a number of software packages that typically only developers have installed: the Make build automation tool, the Saxon XSLT processor and the Perl scripting language. Furthermore, it requires proceedings editors to edit one or two XML files manually, without validating their content with regard to all rules that editors should follow. It also requires them to follow certain conventions for naming and arranging files and directories; most importantly, the sources of ceur-make have to be downloaded to the same directory in which the proceedings volume is being prepared. These reasons may explain why ceur-make has so far been used for fewer than one in ten proceedings volumes.

2.4 Research Objectives

The objectives of our research were 1. to assess the shortcomings of ceur-make in a more systematic way, and 2. to overcome them by providing a user-friendly web frontend to ceur-make.

3 Related Work

3.1 Conference Management Systems

The complex process of managing scientific events (conferences, workshops, etc.) is facilitated by a broad range of systems, of which we briefly review three representatives and their proceedings generation capabilities. Parra et al. have reviewed further systems without providing details on proceedings generation [6]. In computer science, EasyChair enjoys particularly wide usage. EasyChair features a special “proceedings manager” interface, which is initialised by adding all accepted papers and then supports the collection of the final (“camera ready”) versions, including a printable form (usually PDF), editable sources (LaTeX, Word, or anything else, e.g., HTML, in a ZIP archive), and a copyright transfer form. Proceedings chairs can define an order of papers and add or edit additional documents such as a preface. Specific support for exporting all these files and their metadata is provided for events that publish in Springer’s Lecture Notes in Computer Science (LNCS) series. Microsoft’s Conference Management Toolkit (CMT) assists with publishing accepted papers to CoRR. With a professional license, ConfTool supports the export of metadata in multiple formats (Excel, XML and CSV) to facilitate proceedings generation.

3.2 Usability Evaluation of Command Line Vs. GUI

Comparing the usability of command-line interfaces (CLI) vs. graphical user interfaces (GUI) has been a long-standing research topic. Hazari and Reaves have evaluated the performance of students in technical writing tasks using a graphical word processor vs. a command-line tool [3]. Starting from the same level of background knowledge and given the same time for training, a significantly larger share of users felt comfortable using the GUI rather than the command line; also, their task-based performance was slightly higher with the GUI. Gracoli is an operating system shell with a hybrid user interface that combines GUI and CLI [12]. Its design is motivated by common drawbacks of CLIs, which are stated as follows: the user can interact with the application only in a limited way; the output is hard for the user to understand; and the user does not easily get a clue of how to perform a task.

4 Design and Implementation of CEUR Make GUI

4.1 Architecture

The CEUR Make GUI is a graphical layer built on top of ceur-make. Figure 1 shows its three-layer architecture (Interface, Middleware, and Storage).

Fig. 1. System architecture of the CEUR Make graphical user interface

The Interface Layer consists of all the presentation elements. It displays visual elements, handles dependencies on external libraries for user interface elements, styles the web pages, validates forms and manages user interaction with the web pages. It also initiates the communication with the Middleware Layer at the user’s request and displays the results from the Middleware. Technologies used on the Interface Layer include standard web technologies for front-end clients (HTML 5, CSS, JavaScript) and the following libraries: Materialize CSS, a JavaScript and CSS library based on Google’s Material Design principles, is used to incorporate standard design patterns into the GUI; jQuery Steps is used to create wizards for collecting input.
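
To make the role of these libraries concrete, the following is a minimal sketch of how a metadata input wizard might be wired up with jQuery Steps; the element ID, the tag configuration and the finish handler are illustrative assumptions, not code taken from the CEUR Make GUI sources.

```javascript
// Illustrative sketch only (not the actual CEUR Make GUI code): turning a container with
// <h3>/<section> pairs into a multi-step input wizard using the jQuery Steps plugin.
// The "#workshop-wizard" ID and the handler body are hypothetical.
$(function () {
  $("#workshop-wizard").steps({
    headerTag: "h3",               // every <h3> inside the container becomes a step title
    bodyTag: "section",            // every <section> becomes the content of one step
    transitionEffect: "slideLeft", // animation between steps
    onFinished: function (event, currentIndex) {
      // At this point a real implementation would collect the form values
      // and hand them over to the Middleware Layer.
      console.log("Wizard finished at step", currentIndex);
    }
  });
});
```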

The Middleware Layer generates the artifacts required for publishing at CEUR-WS.org: it creates the files requested through the Interface Layer by running ceur-make. It returns links to the artifacts stored at the Storage Layer to the Interface Layer, thus acting as a service provider.
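
The paper does not prescribe the wire protocol between the two layers; purely as an illustration of the request/response pattern described above, a front-end call could look roughly as follows, where the endpoint name generate.php and the response fields are assumptions, not the actual CEUR Make GUI API.

```javascript
// Hypothetical sketch of the Interface-Middleware interaction described above.
// The endpoint name ("generate.php") and the response shape ("links" with toc/zip entries)
// are assumptions for illustration, not the actual CEUR Make GUI API.
async function requestArtifacts(workshopMetadata, tocMetadata) {
  const response = await fetch("generate.php", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ workshop: workshopMetadata, toc: tocMetadata })
  });
  if (!response.ok) {
    throw new Error("Middleware failed to generate artifacts: " + response.status);
  }
  // The middleware runs ceur-make on the server and answers with links to the
  // generated files stored in the Storage Layer (table of contents, ZIP archive, ...).
  const result = await response.json();
  return result.links; // e.g. { toc: "...", zip: "..." }
}
```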

The Storage Layer stores the files that are created temporarily on the server. It separates the files by the user’s identity and, within that, by the workflow the user chooses for creating the publishing artifacts (manual metadata input vs. EasyChair import).

Fig. 2. Navigational menu of the CEUR Make graphical user interface

4.2 Interface

We aim at providing an easy-to-use, task-oriented interface. On the main screen, a navigational menu (cf. Fig. 2) lets users switch between four tasks: viewing announcements, viewing published proceedings, publishing a proceedings volume, and reporting an issue. Further, we separate the site navigation of the two proceedings publishing workflows using a Card design pattern [9], representing each workflow as an independent card (cf. Fig. 3a). We follow the Wizard design pattern [11] to collect workshop metadata input from users (cf. Fig. 3b). For the list of all proceedings volumes, we follow the style of the CEUR-WS.org user interface but make it more accessible by following standard design patterns: we applied the Pagination design pattern [10] to address the problem of the current CEUR-WS.org site that one has to scroll down a lot because all content is displayed at once, and the Autocomplete design pattern [8] to make finding proceedings volumes already published at CEUR-WS.org easier.
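
As an illustration of the Autocomplete pattern, the Materialize CSS autocomplete plugin can be attached to the search field roughly as follows; the selector and the example volume entries are placeholders, not data or code from CEUR-WS.org.

```javascript
// Illustrative sketch only: Materialize CSS autocomplete on the proceedings search field.
// The "input.autocomplete" selector and the example entries are placeholders.
$(function () {
  $("input.autocomplete").autocomplete({
    data: {
      // Materialize expects a map from suggestion text to an optional image URL (or null).
      "Vol-0001: Example Workshop on Topic A": null,
      "Vol-0002: Example Workshop on Topic B": null
    }
  });
});
```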

Source code, documentation and a working installation of the CEUR Make GUI are available at https://github.com/ceurws/ceur-make-ui.

Fig. 3. Workflow screens

5 Evaluation

5.1 Methodology

Participants. Twelve persons participated in the evaluation of the usability of the ceur-make CLI vs. the GUI. We chose nine participants with previous CEUR-WS.org publishing experience and three participants without. The latter were trained to publish at CEUR-WS.org to avoid learning biases in our evaluation results.

Procedure. The participants were divided into two groups based on their availability. Those who were physically available participated in a Thinking Aloud test [7], and the other ones participated in a Question Asking test:

 

Thinking Aloud:

Participants were provided with task definitions as explained below. They were asked to think aloud about their plans and their interaction with the system, particularly including problems they faced or unusual mental models, while the evaluator took notes. The task completion time was recorded for the purpose of comparison.

Question Asking:

In a video conferencing setting with screen sharing (using Skype), the evaluator performed each task according to its definition. The participants were allowed to ask questions during the usability test, and the evaluator also asked questions to test the participants’ understanding. From an audio recording, the evaluator compiled a transcript of pain points afterwards.

 

Following a within-subject design, each participant first had to test the CLI and then the GUI. The participants were given four tasks to be performed in each system, designed to cover all major use cases of the system in a comparable way: 1. Initiate generation of a proceedings volume, 2. Generating workshop metadata, 3. Generating table of contents metadata, and 4. Search a proceedings volume. Each task was subdivided into smaller steps, e.g., as follows for Task 4:

Figure a. Step-by-step definition of Task 4

For the full list of task definitions, please see Appendix A and B of [1].

Usability tests were followed by a post-study questionnaire for each user, which was created and filled in using Google Forms. The questionnaire was divided into the following sections:

  • System Usability Scale (SUS [4]), a ten-item questionnaire to evaluate the general usability of the system on a Likert scale from 1 (strongly agree) to 5 (strongly disagree); the standard scoring procedure is sketched after this list.

  • Questionnaire for User Interaction Satisfaction (QUIS [5]), a 27-item questionnaire to evaluate specific usability aspects of the system, covering overall reaction to the software, screen layout, terminology and system information, as well as learning and system capabilities, on a scale from 0 (lowest) to 9 (highest). The mean score was calculated for every user.
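
For reference, the standard SUS scoring procedure (assuming the conventional item orientation of 1 = strongly disagree to 5 = strongly agree) is sketched below; this is the textbook formula from [4], not code used in the evaluation.

```javascript
// Standard SUS scoring (cf. [4]), assuming the conventional orientation where each of the
// ten items is rated from 1 (strongly disagree) to 5 (strongly agree): odd-numbered items
// contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is
// scaled by 2.5 to yield a score between 0 and 100.
function susScore(ratings) { // ratings: array of 10 integers in [1, 5]
  if (ratings.length !== 10) throw new Error("SUS needs exactly 10 item ratings");
  const sum = ratings.reduce(
    (acc, r, i) => acc + (i % 2 === 0 ? r - 1 : 5 - r), // i = 0 corresponds to item 1 (odd)
    0
  );
  return sum * 2.5;
}

// Example: a fairly positive response pattern yields a score of 80.
console.log(susScore([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]));
```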

Dataset. All users used the same input data for both systems to ensure unbiased comparability of the content created and of completion times across users and systems. A full record of the data is provided in Appendix A and B of [1].

5.2 Results

This section summarizes the evaluation results; for full details see Appendix C and D of [1].

Quantitative Results (Completion Times).

Table 1 shows the completion times per system and task, in detail:

  1. Task 1 (Initiate generation of a proceedings volume): This required entering a terminal command for the CLI and pressing a button in the GUI. On average, this took less time in the CLI, but the difference is too marginal to be significant.

  2. Task 2 (Generating Workshop Metadata): This required entering workshop metadata into the GUI input wizard, and using a text editor and the command line in the CLI. The difference in completion time is significant: completing the task using the GUI took only 60% of the time taken using the CLI, which emphasizes the user-friendliness of the GUI.

  3. Task 3 (Generating Table of Contents Metadata): This required entering the metadata of two papers, similarly to Task 2, with similar results.

  4. Task 4 (Search a proceedings volume): This task took 7.6 times as long on the CEUR-WS.org homepage as in the GUI. This result highlights the importance of using the autocomplete design pattern for searching in the graphical user interface, compared to just the “find in page” search built into browsers.

Overall, users took significantly less time to complete tasks with the GUI, which demonstrates the usability improvement it provides over the CLI.

Table 1. Quantitative usability evaluation results using thinking aloud
Table 2. Qualitative results for ceur-make and the CEUR make GUI

Qualitative Results. Notes recorded while performing the usability test were categorized into ten heuristics: Speed in performing a task, Documentation of the software, Ease in performing a Task, Learnability of the software, clear Navigation structure of the system, Portability of the system, Error correction chances, easy-to-use Interface, Dependency on other systems, and Features to be added. Table 2 shows the number of responses of the twelve participants for each qualitative heuristic, where “bad” means they were not comfortable using it, “good” means they liked the software, and “excited” means that the user is interested but would like to see more features implemented.

The total number of qualitative responses was 36 for the CLI and 34 for the GUI. 15 good responses were recorded for the CLI, regarding the heuristics Speed, Documentation and Task, whereas 21 bad responses were recorded regarding the heuristics Learnability, Portability, Navigation, Error, Interface and Dependency. For the GUI, no responses were recorded against the heuristics Speed, Documentation and Task, which were reported as good for the CLI. This was the case because users did not require documentation to operate the GUI (they never requested it from the evaluator), speed was not an issue while using it (they never complained about it), and it enabled them to perform their respective tasks. 26 good responses were recorded for the GUI against the heuristics Learnability, Portability, Navigation, Error, Interface and Dependency, which were all recorded as bad in the case of the CLI. This highlights the usability improvement provided by the GUI over the CLI. Only two bad responses were recorded for the GUI, against the heuristic Navigation, which means a slight improvement in navigation is required, as one user put it: “ceur-make make things easier but has a complex setup, whereas the GUI is straightforward and requires no prior learning. With little improvement in the flow of screens it could be even better.”

Moreover, for the GUI, 6 responses were recorded as excited, against the heuristic Feature, which means users would like to use the software and would like additional features to be integrated.

Fig. 4. SUS score: ceur-make vs. the CEUR Make graphical user interface

Post Evaluation Questionnaire. The overall usability of the two systems was evaluated using the System Usability Scale. The SUS score for ceur-make was 41.25, which corresponds to grade F, as shown on the x-axis in Fig. 4. This rating demands immediate usability improvements. On the other hand, the SUS score of the GUI was 87.08, which corresponds to grade A. This means that the GUI has good usability and its users would recommend it to others.

Results of the Questionnaire for User Interaction Satisfaction reflect a substantial usability improvement of the GUI over the CLI. For the questions related to the learnability of the system, a visible difference in mean scores was recorded for remembering commands, learning to operate the system, and exploring new features by trial and error. For these three questions, the mean scores of the CLI were 3.75, 3.25, and 4.25 (all below average), and for the GUI they were 8.5, 8.25, and 8.0 (all above average). Likewise, the mean scores for the GUI for the questions related to information representation, covering information organization, positioning of messages, highlighting of information to simplify the task, prompts, and progress, were 7.75, 8.0, 6, 6, and 6.25 (above average), whereas for the CLI they were 4, 4, 2.25, 4, and 3.25, i.e. a notable difference. Another highlight was that the GUI was considered to be designed for all levels of users, backed by a mean score of 7.75, whereas the CLI was not, with a mean score of just 3.

6 Conclusion

We aimed at automating a workflow for publishing scientific results with open access, focused on the CEUR-WS.org workshop proceedings repository. We developed a graphical user interface on top of the ceur-make command line tool and systematically evaluated the usability of both. Quantitative results on task completion time show that the GUI is more efficient for performing common tasks. The qualitative evaluation suggests that on all heuristics where ceur-make performed badly, i.e., learnability, navigation, portability, error, interface and dependency, the GUI yielded good responses. In the post-evaluation questionnaires, a notable difference was recorded in the SUS scores of the two systems: grade F for ceur-make vs. grade A for the GUI. For ceur-make, 11 out of 27 QUIS questions received responses below average and the others were satisfactory, whereas for the GUI all responses were above average. Overall, the results indicate that the usability of the GUI has noticeably improved over the command line. As our evaluation setup covered most typical tasks of proceedings editors, the results suggest that the GUI makes the overall process of publishing with CEUR-WS.org more effective and efficient and thus will attract a broad range of users. Thanks to the input validation of the metadata wizard and to the detailed explicit semantic RDFa annotations of the tables of contents that ceur-make outputs, broad usage of the GUI will improve the quality of CEUR-WS.org metadata, largely eliminating the need to reverse-engineer such metadata by information extraction (cf. [2]).

Future Work. The next immediate step is to officially invite all CEUR-WS.org users to use the GUI for preparing their proceedings volumes. Partly inspired by feedback from the evaluation participants, we are planning to enhance the GUI with functionality addressing the following use cases (all filed as issues at https://github.com/ceurws/ceur-make-ui/issues/): User Profiles would help to automatically suggest information related to the editors while working on that section of the workshop metadata (Issue #1). Even without user profiles, building and accessing a database of previously published workshops’ metadata would facilitate input, e.g., by auto-completing author names and by reusing the metadata of a workshop’s previous edition. The RDF linked open data embedded into ceur-make generated tables of contents, or extracted from old tables of contents (cf. [2]), can serve as such a database once collected in a central place (#5). A Collaborative Space for Editors would support multiple editors working in parallel on a proceedings volume (#2). Saving System State would improve the user experience and give users more control (#4): currently, there is no way to restore the state of the interface where one left off in case the browser is accidentally closed or the user wants to complete the task later. Extraction of Author Names, Titles and Page Numbers from the full texts of the papers would further lower task completion time, as the system would automatically suggest most metadata (#3).