1 Introduction

The literature describes a large number of methods and instruments used to ensure the quality of the usability of a product or service [1,2,3]. There are usability evaluation methods for all design and development phases, from initial definition to final modifications of a product or service [4]. Furthermore, some of these methods are only suitable for a specific stage of the development process [5].

Among usability evaluation methods, questionnaires are particularly important for collecting qualitative self-reported data on the characteristics, thoughts, feelings, perceptions, behaviors or attitudes of users [4]. Questionnaires have the advantage of being low-budget techniques that do not require measurement equipment, and their results reflect the users’ opinions. They also provide useful information about the strengths and weaknesses of a product or service.

Since the 1980s, researchers have felt the need to develop and evaluate products and services in a systematic and methodical way, taking into account the psychometric properties of usability questionnaires [6]. Consequently, several questionnaires have been developed and validated, and have long been used in the usability evaluation of products and services.

When translating a questionnaire, it is necessary to ensure that the resulting version is valid and reliable. Assessing the validity and reliability of a questionnaire makes it possible to conclude that the measurements made by the translated version are congruent with those made by the original version, and that these measurements are reproducible independently of the evaluators and the participants.

The Usefulness, Satisfaction and Ease of use (USE) questionnaire was originally developed by Arnold Lund in 2001 [7]. It is a self-perceived usability questionnaire with 30 items, each rated on a seven-point Likert scale. Users are asked to rate their agreement with the statements, from strongly disagree to strongly agree [7].

The objective of this study was to translate and adapt the USE questionnaire to the Portuguese culture, and to validate the resulting European Portuguese version. The translation, the cultural and linguistic adaptation, and the validation process followed internationally established guidelines [8] in order to ensure the quality of the resulting translation and its semantic equivalence, safeguarding the consistency of the meaning of the original constructs.

In addition to this introductory section, the paper comprises four more sections: Related Work, Methods, Results, and Discussion and Conclusion.

2 Related Work

The USE is a generic questionnaire with 30 items used to assess several dimensions of usability. It is based on the principle that usability consists of usefulness and ease of use, which influence one another, and may determine user satisfaction and frequency of use [7]. It can be used to assess the usability of products and services (e.g. software, hardware, applications or user support materials) and it also allows meaningful comparisons across different domains, even if the evaluations of the products and services were performed at different development stages and perhaps under different circumstances. The USE is not intended to be a diagnostic instrument, but rather an instrument to evaluate the different dimensions of usability as dependent variables [7].

The USE has been used to evaluate usability in very different domains. For example, it has been used to evaluate technologies in the health care domain (e.g. evaluation of rehabilitation equipment for patients with stroke [9], robotics applied to rehabilitation [10], personal health records [11, 12], prevention applications [13,14,15], tools to support diagnosis [16] or informal caregiver empowerment [17]), applications to browse the contents of maps [18] or videos [19], social networking sites [20], e-learning environments [21, 22], tools to support engineering work [23], virtual reality [15], remote lab management [24] or visualization of ontologies [25].

Some of the studies published in the literature are recent (e.g. less than two years old [10, 12,13,14,15,16,17]), and others involve mobile applications [13, 14, 26].

In Portugal, the literature reports the use of the USE from the north [23] to the south of the country [18, 19] although, surprisingly, no validated European Portuguese version of the questionnaire is known.

The fact that the USE is a widely used usability questionnaire, including in Portugal, justifies the importance of having a validated European Portuguese version.

In order to guarantee the quality of the translated version, it is not enough to perform a literal translation of the original version. A linguistic and cultural adaptation is crucial to ensure that the constructs of each item of the translated version have the same meaning as the respective original item.

To verify if the translated version is consistent with the original questionnaire, an observational study must be implemented to evaluate the internal consistency of the resulting version of the questionnaire, as well as its validity (i.e. accuracy of the results) and reliability (i.e. consistency of measurements, namely if they are reproducible).

3 Methods

The cross-cultural adaptation of a questionnaire involves two main phases [27]: (i) questionnaire translation, which assesses conceptual and linguistic equivalence, and (ii) questionnaire validation, which evaluates the psychometric properties of the instrument.

3.1 Phase I - Questionnaire Translation

The translation process of the original version of the USE was performed in accordance with the internationally established guidelines [28] and involved the following steps:

  • Step 1 (Translation): The original version of the USE was translated to European Portuguese by two independent translators whose native language is European Portuguese.

  • Step 2 (Reconciliation version): Three researchers compared the two translations and built a reconciliation version between them and the original version of the USE.

  • Step 3 (Retroversion): The reconciliation version was translated from European Portuguese into English by a translator whose native language is English and who had no knowledge of the original version of the USE. The retroversion allowed the translated version to be compared with the original, and an analysis was performed to verify whether the two were equivalent.

  • Step 4 (Pre-final version): A committee of three researchers developed the pre-final translated version of the USE based on the back-translation and on its original version.

  • Step 5 (Pilot Test): The pre-final version was submitted to a pilot test with 4 individuals from the general population to assess how easy or difficult the questions were to understand, according to the methodology proposed by Foddy [28]. The collected information was used to improve the instrument and build its final version.

  • Step 6: The back-translation and the description of the translation process were sent, as a courtesy, to the author of the USE.

3.2 Phase II – Questionnaire Validation

For the questionnaire validation, an observational study was performed in a non-profit social organization, the Cáritas Diocesana de Coimbra. The reliability and validity of the USE were assessed using data collected in a real setting. This process consisted of a usability assessment of a Virtual Assistive Companion (VAC) [29], a virtual avatar that assists elderly users in their daily activities.

The VAC prototype included components for speech recognition, dialogue management and synthesis of expression, sound and movement. The VAC operates on an all-in-one computer (Lenovo ThinkCentre Edge 93z All-in-One) and supports various interaction modalities.

According to the preferences of the user and depending on the distance between the user and the hardware, interaction can be done either by speech (2–3 m) or via a graphical user interface on the computer’s touchscreen (arm length).

The VAC graphical user interface (Fig. 1) is built on top of the Behavioral Markup Language to describe the physical realization of behaviors, such as speech and gesture and the synchronization constraints between these behaviors. It closely simulates human conversational behavior through the use of synthesized voice and synchronized non-verbal behavior such as head nods, posture shifts, facial expressions and hand gestures.

Fig. 1. Usability evaluation of the implemented VAC prototype.

The usability of the VAC prototype was evaluated with the USE and another usability questionnaire, the Post-Study System Usability Questionnaire (PSSUQ) [30], which had already been validated for European Portuguese [31].

The PSSUQ is a usability evaluation questionnaire developed by IBM. It is composed of 19 items aimed at addressing five usability characteristics of a product or service: (i) rapid completion of the task; (ii) ease of learning; (iii) documentation quality and online information; (iv) functional adequacy; and (v) rapid acquisition of productivity [30, 32].

The participants were selected according to the following inclusion criteria: (i) age over 18 years; and (ii) ability to read, understand and sign the informed consent. Therefore, all adults able to fill in the instruments used in this usability assessment were eligible to participate if they gave written informed consent. The written informed consent was obtained prior to data collection.

The observational study took place between May and August 2016, and comprised two sessions per subject, separated by 2 to 4 weeks. Each session consisted of four parts:

  • Introduction - The evaluator applied a sociodemographic questionnaire and then delivered the session script, orally explaining all the information it contained.

  • Test - The subject performed the tasks described in the session script.

  • Usability Assessment Instruments - The evaluator assisted the subject in filling in the USE and the PSSUQ.

  • Summary - The evaluator thanked the subject for participating and, if necessary, scheduled the next evaluation session.

Although the data collected were not of a sensitive nature, the underlying principles of the Helsinki Declaration were considered [33]. Therefore, in addition to obtaining written informed consent from all participants, which was part of the data collection protocol, the necessary authorizations were requested and all data collected were anonymized.

Statistical analyses were performed with SPSS - Statistical Package for Social Sciences (SPSS Inc, Chicago). To describe and characterize the subjects in the sample, central tendency and dispersion measures were used, including mean, range and standard deviation.

To assess internal consistency, Cronbach’s alpha (α) was calculated. Cronbach’s alpha values typically range between 0 and 1 and are considered “inadmissible” if α < 0.60; “weak” if 0.60 ≤ α < 0.70; “reasonable” if 0.70 ≤ α < 0.80; “good” if 0.80 ≤ α < 0.90; and “very good” if α ≥ 0.90 [33].
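As an illustration of this calculation, the following minimal sketch computes Cronbach’s alpha with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the total score) and maps the result to the qualitative bands listed above. The study itself used SPSS; the function names and the toy ratings here are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of ratings.

    scores: list of respondents, each a list with one rating per item.
    Uses sample variances (n - 1 in the denominator).
    """
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


def classify_alpha(alpha):
    """Map alpha to the qualitative bands used in this paper."""
    if alpha < 0.60:
        return "inadmissible"
    if alpha < 0.70:
        return "weak"
    if alpha < 0.80:
        return "reasonable"
    if alpha < 0.90:
        return "good"
    return "very good"


if __name__ == "__main__":
    # Hypothetical ratings: 4 respondents x 3 items on a 1-7 scale.
    toy = [[6, 7, 6], [4, 4, 5], [2, 3, 2], [5, 5, 6]]
    a = cronbach_alpha(toy)
    print(round(a, 3), classify_alpha(a))
```

In practice each row would hold one participant’s answers to the 30 USE items.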

The inter-rater reliability was assessed using the Intraclass Correlation Coefficient (ICC). The ICC varies between 0 and 1, and is considered “weak” (ICC < 0.40); “satisfactory” (0.40 ≤ ICC < 0.75); or “very good” (ICC ≥ 0.75) [33].
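The text does not state which form of the ICC was computed. Purely as an assumption, the sketch below implements one common variant, the two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)), from the usual ANOVA mean squares, and then applies the bands given above. The function names are hypothetical, and the example table is the six-subject, four-rater illustration commonly reproduced from Shrout and Fleiss, not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of subjects, each a list with one score per measurement
    (e.g. one score per session or per rater).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of measurements per subject
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 measurements")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                # between-subjects mean square
    msc = ss_cols / (k - 1)                # between-measurements mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc):
    """Map an ICC value to the qualitative bands used in this paper."""
    if icc < 0.40:
        return "weak"
    if icc < 0.75:
        return "satisfactory"
    return "very good"


if __name__ == "__main__":
    example = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    value = icc_2_1(example)
    print(round(value, 2), classify_icc(value))
```

Other ICC forms (for example consistency rather than absolute agreement, or average rather than single measures) give different values for the same data, so the variant should be reported alongside the coefficient.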

The construct validity was assessed through the correlation between the USE and the PSSUQ using the Spearman Correlation Coefficient. The level of significance was set at p < 0.05.
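To make the correlation step concrete, here is a minimal sketch of the Spearman coefficient, computed as the Pearson correlation of average ranks so that tied scores are handled. The helper names and the example scores are hypothetical. The example assumes, as in this study, that higher USE scores and lower PSSUQ scores both indicate better usability, so agreement between the instruments appears as a negative coefficient.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation (Pearson correlation of average ranks)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical mean scores for five participants (not study data).
    use_scores = [6.5, 5.8, 4.2, 6.9, 3.1]    # higher = better usability
    pssuq_scores = [1.5, 2.2, 3.9, 1.2, 5.0]  # lower = better usability
    print(spearman_rho(use_scores, pssuq_scores))
```

The sketch returns only the coefficient; the p-value used for the significance test would come from a statistics package.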

4 Results

4.1 Phase I – Questionnaire Translation

After the translation, reconciliation and retroversion processes, the resulting version was compared with the original version of the questionnaire. They were considered equivalent in terms of semantics and meaning of the content.

The pilot study was conducted with 4 participants from the general community, 2 male and 2 female, aged between 28 and 41 years old. The participants understood the semantics and the content of each item. In general, the European Portuguese version of the USE was considered easy to understand.

Therefore, the 30 items of the European Portuguese version of the USE were considered equivalent to the corresponding items of the original version (Table 1).

Table 1. Original item vs. corresponding item in European Portuguese.

4.2 Phase II - Questionnaire Validation

The sample consisted of 24 participants with an average age of 47.9 years (SD = 22.0) with a minimum of 26 and a maximum of 89 years old. The sample was 58% female and 42% male. The characterization of the participants is presented in Table 2.

Table 2. Characterization of the participants.

The Cronbach’s alpha for the European Portuguese version of the USE was 0.96. Regarding the reliability results, the ICC value was 0.69 (95% CI: −0.23; 0.90).

The validity of the questionnaire was assessed by comparing the USE with another usability questionnaire, the PSSUQ. The USE and the PSSUQ present a statistically significant negative correlation (r = −0.84, p < 0.05).

5 Discussion and Conclusion

The translation of a measuring instrument into another language requires different levels of equivalence, whether lexical (language) or cultural. The European Portuguese version of the USE was developed through a translation-retroversion procedure in accordance with the recommendations for the translation of questionnaires.

This study aimed to translate and adapt the original version of the USE questionnaire into the European Portuguese language and culture, and to perform its validation. The result of the questionnaire translation phase indicates that the items were easy to understand and there were no semantic or content problems, and, as a final result, an intelligible version of USE in European Portuguese was obtained. All the translated items were considered equivalent to the original version.

The result of the validation phase indicates that the questionnaire has very good internal consistency (α = 0.96) and satisfactory inter-rater reliability (ICC = 0.69). In terms of validity, the USE and the PSSUQ present a statistically significant negative correlation (r = −0.84), which means that both evaluate the same construct, suggesting that the European Portuguese version of the USE has construct validity. The correlation between the USE and the PSSUQ is negative simply because a higher score in the USE corresponds to a lower score in the PSSUQ, where lower values indicate better usability.

Up to now, the USE questionnaire has been used in Portugal without a validated European Portuguese version, which denotes a general lack of awareness regarding the importance of using validated instruments to evaluate usability.

The fact that the USE is a widely used usability questionnaire, including in Portugal, justifies the importance of having a validated European Portuguese version. Once the quality of the European Portuguese version is guaranteed, conclusions and comparisons can be drawn with confidence because the measurements are valid and reliable. The linguistic and cultural adaptation was crucial to ensure that the constructs measured by the USE questionnaire are equivalent in both the translated and the original versions.

During the observational study conducted at Cáritas Diocesana de Coimbra, there were some issues, especially with older participants, concerning the “ease of use” of the questionnaire itself. The evaluators observed that some participants had difficulty understanding the scoring system and intended to select more than one option for each item. Therefore, as future work, the authors propose either a general alteration of the European Portuguese version of the USE questionnaire or a specific version for older adults that includes instructions such as ‘Por favor, escolha apenas uma resposta para cada pergunta’ (‘Please select only one answer for each question’) in the heading of each item or page, as well as visual information to complement the numbers of the seven-point Likert rating scales associated with the items of the questionnaire.