
1 Introduction

Downloading digital content from online content distribution services has become common practice for Internet users. To enhance user satisfaction and loyalty, service providers often analyze users' download histories to mine their interests and needs, and employ recommender systems to provide personalized recommendations for content that suits each user's interests [2].

Modern recommender systems have been developed along two main directions. The first is content-based filtering, which finds content items similar to what the user has liked before, where content similarity can be defined on the features of content items. The second and more successful one is collaborative filtering (CF) [11], which utilizes not only the past behavior of one user but also that of other users. Content-based filtering and CF have also been combined in existing recommender systems [2].

Recently, recommender systems have been further empowered by taking temporal dynamics into account. Indeed, user interests are often not static but rather time-varying. Some work has been done to detect and adapt to user interest drifts when making recommendations. For example, Koren studied a series of temporal dynamic models within the framework of CF [13]; Cao et al. proposed four patterns to describe user interest drifts, namely the Single Interest Pattern, Multiple Interests Pattern, Interests Drift Pattern, and Casual Noise Pattern [9]; Abel et al. investigated temporal dynamic user modeling from user behavior on the Social Web such as Twitter [1]. All such work reported more accurate recommendations compared to traditional systems that adopt time-invariant user models.

Though automated recommender systems have achieved remarkable success, enabling interactive recommendations is nonetheless important for improving user experience, where visualization is key to engaging users' participation [7, 8, 16]. Visually analyzing user behavior also greatly helps service providers, as it supports the open-ended exploration and flexible questions that analysts may generate [4]. One question of interest to service providers is whether visualizations of users' data can be understood by the users themselves, so that recommendations based on these visualizations become more understandable and acceptable.

However, there are several challenges in visualizing user behavior for analysis as well as for making recommendations. In view of possible user interest drifts, it is crucial to display the relevances between the content items that one user has accessed, but how to quantify and visualize these relevances is difficult. Displaying temporal and drifting data remains a challenge in visualization, especially when interpretability and perceivability are taken into account. Moreover, how to evaluate a visualization design, both objectively and subjectively, has not been well studied.

In our previous work [19], we studied a visualization approach to analyzing user interest drifts from users' music download history. Our main purpose was to depict the underlying relevances among the downloaded music tracks so as to identify user interest drifts. To that end, we considered feature-based relevances and collaborative relevances, in accordance with existing recommender systems. For feature-based relevances, we utilized the metadata of music and selected genre and release year to represent categorical and numerical features, respectively. For collaborative relevances, we were inspired by the CF approach and defined the relevance between two music tracks as their co-occurrence in all users' download histories. Moreover, we designed three new kinds of plots to display the music download history of one user, namely the Bean plot, the Transitional Pie plot, and the Instrument plot.

In this paper, we report the user studies we conducted to evaluate the usability of the visualization design, focusing on the capability of the visualization to assist analyses, as well as its ease of learning and use and the analyst's experience. Such user studies remain largely explorative rather than quantitative in the literature, and how to perform them is also an open problem. Our studies follow a learning-practice-test workflow to observe how ordinary users can leverage the visualization tools to perform data analysis tasks as if they were professional analysts, which may inspire further research.

The remainder of this paper is organized as follows. Section 2 describes some related work. Section 3 summarizes our visualization design. The conducted user studies are discussed in detail in Sect. 4. Concluding remarks are presented in Sect. 5.

2 Related Work

Our work is closely related to visual recommendations, which help users discover interesting items and help the service provider interpret the reasons for recommendations. Visualization of music itself has also attracted great interest recently. How to evaluate a visualization design has been studied mostly in an empirical manner. We briefly review related work on these aspects below.

As mentioned before, recommender systems can be roughly classified into three categories: content-based filtering [17], CF [6], and hybrid approaches [3]. Visual recommendations also fall into one of these three categories. For example, Bogdanov et al. [7] proposed a content-based recommendation method that infers high-level semantic descriptors from the music tracks of one user, and then utilizes the semantic descriptions to perform recommendations or to visualize the user's preferences. PeerChooser [15] is a collaborative recommender system with an interactive graphical explanation interface, which enables the user to select the "similar users" she has in mind. Hybrid visual recommendations have attracted more attention, for example recommendations from multiple social and semantic web resources [8], and a visual user-controllable interface that encourages the user to manually control the recommendation strategies [16]. Regarding the type of content, the work in [5] may be the most similar to ours, as it proposed several visualizations of music download history, also for recommendations. However, none of the above-mentioned work considered the underlying user interest drifts in the user's behavior data.

Music-related visualization has attracted wide attention from researchers. Earlier work visualized music collections such as personal music libraries [12, 18]. A visualization of music download history was presented in [5], which took the temporal dimension into consideration. Based on a user's music library, a humanoid cartoon-like character called the Musical Avatar was generated to visualize the user's interests [7]. In such work, the implicit relations among music tracks are less studied, and the temporal dynamics are not taken into account.

Evaluation of information visualizations has always been an important part of related research. Carpendale discussed different types of evaluations as well as their pros and cons [10]. Lam et al. [14] focused on empirical studies in information visualization and summarized seven scenarios to discuss what might be the most effective evaluation of a given information visualization. Basole et al. [4] designed a three-phase user study consisting of tutorial, practice, and evaluation to assess their visualization design. In this paper, we design user studies as a learning-practice-test workflow, similar to that in [4], but one that enables users to utilize the visualization tools as if they were professional analysts.

3 Visualization Design

The raw data recording the music download history of one user can be described as pairs of a download timestamp and a music track identifier. The raw data are pre-processed in two steps. First, one user's download history is divided into sessions, where each session consists of a series of consecutive downloads separated by short intervals, while the intervals between sessions are usually much longer. Second, from the metadata of each music track, genre and release year are extracted as features. Moreover, we calculate the collaborative relevance between any pair of tracks, evaluated as the number of users who downloaded both tracks within a short period. We then designed three kinds of visualization plots, namely the Bean plot, the Transitional Pie plot, and the Instrument plot, to display the music download history in ways that reveal the dynamics of user interests. Please refer to Figs. 1, 2, and 3 for the plots and user interactions. For more details please refer to our previous paper [19].
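To make the preprocessing concrete, the following is a minimal sketch of the two steps in TypeScript (matching the HTML5/JavaScript setting described in Sect. 4.1). The record shape, the 30-minute session gap, and the 7-day co-download window are illustrative assumptions, not values reported in this paper.

```typescript
interface Download {
  userId: string;
  trackId: string;
  timestamp: number; // Unix time in seconds
}

// Step 1: divide one user's chronologically sorted downloads into
// sessions wherever the gap between consecutive downloads is long.
function splitIntoSessions(
  downloads: Download[],
  maxGapSeconds: number = 30 * 60 // assumed threshold, not from the paper
): Download[][] {
  const sessions: Download[][] = [];
  let current: Download[] = [];
  for (const d of downloads) {
    const last = current[current.length - 1];
    if (last !== undefined && d.timestamp - last.timestamp > maxGapSeconds) {
      sessions.push(current);
      current = [];
    }
    current.push(d);
  }
  if (current.length > 0) sessions.push(current);
  return sessions;
}

// Step 2: collaborative relevance of a pair of tracks = the number of
// users who downloaded both tracks within a short period.
function collaborativeRelevance(
  allDownloads: Download[],
  windowSeconds: number = 7 * 24 * 3600 // assumed co-download window
): Map<string, number> {
  // Group downloads by user and sort each group by time.
  const byUser = new Map<string, Download[]>();
  for (const d of allDownloads) {
    if (!byUser.has(d.userId)) byUser.set(d.userId, []);
    byUser.get(d.userId)!.push(d);
  }
  const relevance = new Map<string, number>();
  for (const downloads of byUser.values()) {
    downloads.sort((a, b) => a.timestamp - b.timestamp);
    const counted = new Set<string>(); // each user counts once per pair
    for (let i = 0; i < downloads.length; i++) {
      for (let j = i + 1; j < downloads.length; j++) {
        // Sorted order lets us stop once the window is exceeded.
        if (downloads[j].timestamp - downloads[i].timestamp > windowSeconds) break;
        const [a, b] = [downloads[i].trackId, downloads[j].trackId].sort();
        if (a === b) continue; // same track downloaded twice
        const key = `${a}|${b}`;
        if (!counted.has(key)) {
          counted.add(key);
          relevance.set(key, (relevance.get(key) ?? 0) + 1);
        }
      }
    }
  }
  return relevance;
}
```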

Fig. 1. (Left) Bean plot showing the download history of three users. Each small, color-filled circle (named a bean) represents a music track, and each larger disc (named a pod) represents a download session. Colors of beans stand for genres. Pods are arranged in chronological order. (Right) Bean plot provides interactive display: when a pod is clicked, it unfolds into multiple smaller pods representing subsessions, each of which has a single genre (Colour figure online).

Fig. 2. Transitional Pie plot showing the download history of one user. Similar to a pie chart, the disc shows the proportions of different genres. Within each genre, tracks are arranged in chronological order so that each downloaded track has a corresponding position on the disc. Bezier curves inside the disc display the collaborative relevances among music tracks. Bezier curves outside the disc show the transitions between genres: two successively downloaded tracks of different genres are connected by an outer-disc curve.

Fig. 3. (Left) Instrument plot showing the download history of one user. The timeline is represented by a disc as the body of the instrument, where the music tracks are arranged in chronological order. The gray bars alongside the tracks indicate release years. Bezier curves inside the disc represent the collaborative relevances among music tracks. The distributions of release year and genre are shown as the headstock and the neck of the instrument, respectively. (Right) Instrument plot provides interactive display: once a track is clicked, all its related tracks are highlighted.

4 User Studies

We have conducted user studies to evaluate the usability of our proposed visualization. In the studies, participants first learn the design of the three plots; then, after practice, they are asked to analyze the plots of new users and to answer both specific and open-ended questions regarding user interest drifts; finally, a questionnaire and a survey are administered to collect the participants' feedback on their experience of using the visualization plots.

4.1 Implementation

We have implemented the proposed visualization design and tested it with a real-world data set provided by an online music service. Data preprocessing is performed offline, including the division of sessions and the calculation of collaborative relevances. The visualization plots of each user are drawn upon request in a webpage view, implemented with HTML5 and JavaScript, which also enables the designed user interactions. Online rendering of the plots is computationally efficient and does not incur noticeable delay in mainstream web browsers.
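As an illustration of the online rendering step, here is a highly simplified sketch of how a Bean plot could be drawn with the HTML5 canvas API. The data types, genre palette, and layout constants are our own assumptions for exposition; the actual plots are richer and support the interactions described above.

```typescript
// A toy Bean-plot renderer: pods (sessions) laid out chronologically,
// beans (tracks) placed on a ring inside each pod and colored by genre.

interface Track {
  trackId: string;
  genre: string;
}
type Session = Track[]; // one pod corresponds to one download session

// Hypothetical genre palette; the real color coding is not specified here.
const GENRE_COLORS: Record<string, string> = {
  pop: "#e74c3c",
  rock: "#3498db",
  jazz: "#2ecc71",
  classical: "#f1c40f",
};

function drawBeanPlot(canvas: HTMLCanvasElement, sessions: Session[]): void {
  const ctx = canvas.getContext("2d");
  if (ctx === null) return;
  const podRadius = 30;
  const beanRadius = 5;
  sessions.forEach((session, i) => {
    // Pods are arranged left to right in chronological order.
    const cx = podRadius + 5 + i * (2 * podRadius + 10);
    const cy = canvas.height / 2;
    ctx.beginPath();
    ctx.arc(cx, cy, podRadius, 0, 2 * Math.PI);
    ctx.strokeStyle = "#888888";
    ctx.stroke();
    // Beans sit on a ring inside the pod, one per downloaded track.
    session.forEach((track, j) => {
      const angle = (2 * Math.PI * j) / session.length;
      const bx = cx + (podRadius - 2 * beanRadius) * Math.cos(angle);
      const by = cy + (podRadius - 2 * beanRadius) * Math.sin(angle);
      ctx.beginPath();
      ctx.arc(bx, by, beanRadius, 0, 2 * Math.PI);
      ctx.fillStyle = GENRE_COLORS[track.genre] ?? "#999999";
      ctx.fill();
    });
  });
}
```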

4.2 Participants

15 undergraduate students (6 females and 9 males) participate in the user studies. The participants have different majors, including science and engineering, with ages ranging from 20 to 23 (mean: 21.6, median: 22). They reported different levels of interest in online music services, but none of them had prior experience with visual analysis.

4.3 Tasks

Each participant undertakes the user study in four sessions: learning, practice, test, and questionnaire. The learning session helps participants understand the proposed visualization design. In it, the instructor briefly introduces the background and the data, explains the three plots in detail, and then shows a screen-captured video of the user interface and interactions. The participant is encouraged to ask any question at any time and is answered immediately. The learning session lasts around 15 min per participant.

The practice session is then conducted to enhance the participant's comprehension of and familiarity with the plots. In it, the participant is asked to analyze some provided plots and then to answer several questions (given in Table 1). For example, the instructor provides the Bean plots of user 7 and user 8, and then asks the participant: "whose download history is more consistent in music genre, user 7 or user 8?" (PQ3 in Table 1) The instructor answers any question from the participant and, if necessary, explains the plots again. The practice session lasts around 19 min per participant.

The test session is the most important part of the user study. In it, the visualization plots of 10 users are provided to the participant for visual analyses. The participant is asked to analyze all the plots and then to answer 3 specific questions and 2 open-ended questions (given in Table 2). In the test session, the instructor offers no assistance: he/she neither tells the participant which plot to look at nor answers any question regarding the plots. The only instruction is to encourage the participant to answer the open-ended questions as comprehensively as possible. With this design, we hope to verify whether the participant has learnt the characteristics of the different plots and can choose among them for visual analyses, and to observe what the participant can find by exploring the plots. The test session lasts around 37 min per participant.

In the last session, the participant completes a five-point Likert-scale questionnaire regarding the experience of using the plots for visual analyses. A quick survey is also conducted to collect the participant's feedback on the visualization design. In total, the user study for one participant lasts 57 to 84 min (mean: 71, median: 70). All participants are instructed by the same investigator.

Table 1. Questions in the practice session
Table 2. Questions in the test session

4.4 Results

Practice Session. In the practice session, we ask the participants to answer specific questions (given in Table 1) regarding the visual analyses of the provided plots. The participants' answers show good consistency: for 5 of the 9 questions (PQ1, PQ4, PQ5, PQ7, PQ9), all 15 participants give the same answer; for another 3 questions (PQ2, PQ3, PQ8), 14, 14, and 13 participants give the same answer, respectively. The only exception is PQ6: 7 out of 15 participants choose user 2 and the other 8 choose user 3. These results show that the three plots are not difficult for the participants to understand, except for the collaborative relevances and transitions that are shown simultaneously in the Transitional Pie plot, which may lead to diverse interpretations.

Test Session. In the test session, we ask the participants to answer both specific and open-ended questions (given in Table 2) after analyzing the visualization plots of 10 users. These questions are believed to be much more difficult and more subjective than those in the practice session. On the one hand, the analyst must compare the plots of 10 users to answer the specific questions. On the other hand, the open-ended questions indeed admit diverse answers.

For the specific questions shown in Table 2, Q1-1, Q1-2, and Q2 all receive quite consistent answers from the participants. For Q1-1, 13 participants select user 6 and the other two select user 24, which can be verified by looking at the Bean plots of these two users. For Q1-2, 11 participants select user 2, two select user 3, and the other two select user 4, which is made visible in the Bean plots and Instrument plots. For Q2, observing the Instrument plots or Transitional Pie plots, 10 participants select user 6 and the other five select user 24. Note that such consistency is not trivial, since the participants need to pick one of the 10 users as the answer to each question, and the interest patterns of several users are very similar. Therefore, we believe that the plots depict the characteristics of user interests well enough to help participants make correct analyses.

Fig. 4. The answers to Q3-1 (left) and Q3-2 (right) in the test session. Please refer to the questions shown in Table 2.

The answers to Q3-1 and Q3-2 (summarized in Fig. 4) show more diversity among participants. These two questions are not specific to time, genre, or collaborative relevance, but require the participants to present their own understanding of "user interest drifts," which is quite subjective. Participants report difficulty in considering genre and collaborative relevance together when looking for the user with the most or the least consistent interest. This is not surprising, since genre and collaborative relevance are displayed separately in our designed plots. This issue shall be addressed in future work.

Table 3. The answers to the open-ended questions in the test session

The open-ended questions Q4 and Q5 (also shown in Table 2) ask the participants to write down as many findings as possible about one user's download history and about the comparison between two users'. The participants' findings are summarized in Table 3. Overall, these findings show good consistency among different participants while also exhibiting subjective diversity. There are only two findings inconsistent with the others (typeset in italic in Table 3), each reported by a single participant, and both relate to collaborative relevances. This reveals that a few participants misunderstand the collaborative relevances, which seem to be a less straightforward concept for them. Moreover, participants make findings on different aspects, including the release year and genre of music, the download sessions, and the collaborative relevances. Interestingly, participants mention some findings that go beyond our previous analyses. For example, "transitions between genres only happen among tracks that have collaborative relevances," mentioned by two participants, was not easy to find at first sight. Also, some findings are more conjecture than analysis; for example, one participant suggested that "the user might have different moods in different sessions," "since the relevances between tracks within each session are high, but cross-session relevances are almost none." Last but not least, the findings of the comparison between two users (answers to Q5) are more consistent among participants than the answers to Q4, which implies that visual comparison may be easier and more obvious than visual analysis of a single user.

Table 4. Questions and answers in the questionnaire and survey session

Questionnaire and Survey Session. After the practice and test sessions, we ask the participants to complete a five-point Likert-scale questionnaire consisting of the 9 questions shown in Table 4, which also lists the average score for each question. We also conduct a survey to collect the participants' feedback.

Questionnaire. According to the questionnaire, participants feel the visualization interface is easy to learn and use. This is also verified by the fact that all participants, despite having no experience with visual analysis, finish the entire user study with good performance in less than 90 min. Regarding the usability of the visualization, participants agree that collaborative relevance is helpful in analyzing user interests, and all of them indeed take collaborative relevance into consideration during the test session. 13 out of 15 participants agree that the three plots make user interests obvious, and the other two are neutral on this question. Most participants think the designed interactions are intuitive: we observe that all participants learn the interactions in the Instrument plot very quickly and use the highlighting feature effectively; the interactions in the Bean plot pose no difficulty either, though three participants need some time to understand the concept of subsessions (unfolded pods); once they understand it, they all use the interactions well in the test session. Furthermore, regarding redundancy among the three plots, the Bean plot is believed to have none, while several participants believe the Instrument plot and the Transitional Pie plot are somewhat redundant. The reason may be that the Instrument plot displays the genre and release year of each track as well as their statistics, and the Transitional Pie plot uses gradient color for the transitions between genres. Finally, participants report willingness to use the visualization plots (9 agree, 5 neutral, and 1 disagree).

Survey. In the survey, we ask the participants for their opinions on the visualization plots. Although the Bean plot does not display collaborative relevance, its distinctive layout is acknowledged by the participants and it is utilized frequently in the test session. Almost all participants believe that the Instrument plot displays the most information as well as the most important information, and they express an overwhelming preference for the Instrument plot in the test session. The Transitional Pie plot is endorsed by some participants but disliked by others. Several participants like the Transitional Pie plot because "the inner- and outer-disc curves can be jointly considered," while others think it is covered by the Instrument plot and is thus unnecessary. Moreover, participants provide comments and suggestions on the plots: for example, the beans could be changed to squares so that the layout of the Bean plot looks more regular, and the color coding of genres could be adapted per user to better distinguish different genres. These issues will be addressed in our future work.

4.5 Discussion

Comparing the three plots, the Instrument plot receives the most preference due to its comprehensiveness. The Bean plot is also approved when analysts are not concerned with collaborative relevance. The Transitional Pie plot is endorsed by several participants but less utilized by others. Collaborative relevance is believed to be helpful in the analyses, but people have difficulty combining feature-based and collaborative relevances at the same time, since the plots display them separately.

With the help of the proposed visualization design, non-expert users can learn and use the plots for visual analyses of user interests without much difficulty. As the visualization makes user interests obvious, there is good consistency among the analyses of different people. At the same time, people also make diverse, sometimes subjective findings from the explorative analyses, which suggests the visualization may inspire analysts to further investigate user interests.

5 Conclusions

In this paper, we have reported the user studies we conducted to evaluate our visualization approach for analyzing user interest drifts from music download history. We examined users' feedback on the three new kinds of plots we designed, i.e. the Bean plot, the Transitional Pie plot, and the Instrument plot, which display the music download history while making user interest drifts visible to analysts. The results demonstrate the feasibility of our visualization design, and the user studies may inspire further research on visual analysis tasks.