1 Introduction

With the transition from physical to digital music distribution, streaming platforms have become the main source of music listening. With online platforms such as Spotify, Pandora, and Apple Music making tens of millions of songs available to hundreds of millions of subscribers, music recommendation systems (MRS) have become a critical component of a satisfactory listening experience. Developing a recommendation system that satisfies a variety of user needs in such a growing and dynamic space is increasingly challenging for industry and increasingly appealing to academia [1]. Although MRS have become considerably more effective over the last decade, the challenges of delivering a truly personalized user experience are not yet fully understood [2].

Current MRS can be broadly classified into three categories: collaborative filtering (CF) based, content based (CB), and hybrid systems that blend both CF and CB techniques [3, 4]. All three methodologies are heavily dependent on information and, importantly, comparisons, in order to make a particular recommendation. CF-based applications typically rely on rich data sets of user interactions in order to draw a recommendation from the comparison of user listening habits, playlists, song ratings, and so on. Similarly, effective content-based recommendations require considerable information on the characteristics of a song, artist, or user in order to develop a recommendation. Hybrid systems, as a combination of CF and CB techniques, also require external user information as part of the recommendation process. As a result, many MRS tend to recommend music from the short tail of the available music catalog but fall short of music discovery [2].

Although the majority of existing systems perform well when creating low-effort “set-and-forget” playlists, the nature of the algorithms used by MRS means that most users are likely to encounter a stream of songs and artists with which they are already familiar. Additionally, most MRS implementations do not adequately address other potentially important listener use cases, primarily in the domain of automated music discovery. On many music streaming platforms, a novel discovery is a somewhat rare occurrence, because artists and albums must accumulate significant streams, likes, follows, or plays before they gain enough traction to be recommended to listeners. The effect is especially pronounced for older or less common areas of a catalog, which may become consigned to the so-called “long tail” of obscurity [5]. Ultimately, discovery still requires significant listener effort to seek out new and novel pieces of music.

In this chapter, we explore the viability of a network graph-based method, with an emphasis on novelty and music discovery. The proposed method takes into consideration the relationships between musical groups, where two groups are related if they share at least one artist. By exploring these relationships from a network perspective, we generate artist recommendations that require little or no preexisting artist or listener information. By avoiding the popularity bias inherent in existing MRS, our artist graph allows broader, but still relevant, artist recommendations with greater diversity and novelty.

2 Related Work

Due to the massive variety and quantity of musical options available to users, much progress has been made in the field of MRS, and the capabilities and shortcomings of traditional MRS are well documented [2]. For example, MRS have been shown to favor popular artists to the detriment of lesser-known artists. In audio content-based (CB) networks, users tend to listen to more music independent of its quality or popularity [5]. Additionally, though good at long-tail recommendations, CB networks can introduce a bias toward particular musical genres. In collaborative filtering networks, however, users tend to listen to more mainstream music.

Because the widely used CF-based MRS suffer from the cold-start problem caused by a lack of historical user data, Van den Oord et al. developed a latent factor model for music recommendation [4] that predicts the latent factors from music audio when they cannot be obtained from usage data. Their results showed that the predicted latent factors produced sensible recommendations.

Network graphs have also been used to improve music recommendation [6,7,8,9]. For instance, Zhao et al. proposed genre-based link prediction in a bipartite graph for music recommendation [7]. They enhanced recommendation by using a complex-number representation and computing user similarity from genre weight relations [7]. With music-genre information, they demonstrated that user similarity and link prediction ability improved and that overall recommendation accuracy increased accordingly. Another study of bipartite graphs showed that recommendation performance can be significantly improved by considering homogeneous node similarity [9].

A common issue encountered by many different recommendation systems is the need to incorporate information about users and subjects that may be implicit, unknown, or otherwise missing. Tiroshi et al. addressed this issue by representing the data as graphs and then systematically extracting graph-based features with which to enrich the original user models [8]. Their results showed that graph-derived features increased recommendation accuracy across tasks and domains.

In this chapter, we explore network graphs that make recommendations with an emphasis on novelty and music discovery. The remainder of the chapter is organized as follows. First, we describe the data used and how we preprocessed it. Then, we describe the generation and analysis of our artist-based networks. Next, we present our results, and lastly we suggest directions for future development.

3 Data Collection and Preprocessing

The data for constructing our artist networks was downloaded from the MusicBrainz Database, a freely available open music encyclopedia [10]. The database contains a wealth of user-maintained meta-data for over 143,111 music artists. In particular, we utilized two data fields for building and enriching our artist-based networks: an artist’s membership in a musical group and user-submitted descriptors called “tags.”

The database defines an artist as an entity that can be an individual performer, a group of musicians, or other contributors such as songwriters or producers. We limited our discussion of artist-artist relationships to musical groups and their current and past group members, as well as other members who may have contributed or played with the group in some meaningful way. With respect to group membership, we did not weigh or otherwise discriminate between member “categories” (current member, past member, touring, recording, etc.); i.e., we consider all associated members to be equal for simplicity.

User-submitted tags allow users to attach various categorical identifiers to artists. Table 1 shows the tags associated with the artist The Who. The primary artist tags relate to musical genres and are typically used to describe the artist’s music. Secondary tags, labeled “Other tags,” categorize an artist more broadly, e.g., by country of origin or period of time, and may refer to musical movements with which the artist is typically associated. For convenience of analysis, we did not weigh or discriminate between tags or their types, meaning we considered all tags to be equally informative.

Table 1 Tags submitted by users for the artist The Who

Using the artist-artist relationship and tag data described above, we created two subsets of data for our experimentation and analysis. The first was created by mapping all artists to the musical groups to which they belong, resulting in a simple dictionary of 143,111 unique musical groups and 495,291 of their related group members. In much the same way, we mapped user-submitted tags to their parent artists, generating a second dictionary of 143,111 musical groups and 352,180 of their associated tags. The Python code for processing the datasets, together with the processed data files, is publicly available on GitHub (https://github.com/jrwaggs/music/blob/master/Data%20Gathering.ipynb).

Using the Spotify API (see our code on GitHub), we also collected Spotify’s music data, which includes user-generated playlists, artist information, album data, musical genre data, and data specific to individual tracks. Although we had planned to select playlists of varying length and genre content, time constraints allowed us to examine only seven playlists in our experimentation. Table 2 below provides a brief description of the seven Spotify playlists. The unique artists extracted from these playlists were used to construct our test data sets.

Table 2 Description of seven Spotify playlists used in our experiment

4 Method

Unlike other methods, which exploit similarity among users, our method creates multigraphs based on the relationships between individual artists and musical groups. This design enables us to create models even in the absence of user attribute or preference data, which are not always easy to obtain. More importantly, as our results show later, our graphs can effectively avoid the low novelty exhibited by existing MRS methods.

Network Graph Concept

Our network graph contains two basic elements: vertices or nodes that represent the objects in the graph, and edges or links that connect vertices. When multiple types of relationships exist between objects in a network, multilayer or multidimensional graphs are required. The multidimensional graph G used in our study is defined as:

$$ \boldsymbol{G}=\left(\boldsymbol{V},\boldsymbol{E},\boldsymbol{D}\right) $$

where

  • V = A finite set of Vertices or nodes

  • E = A finite set of labeled Edges or links connecting vertices

  • D = A set of labels representing each Dimension or layer [11]

In our model, each node represents a unique musical group, each edge represents a single labeled relationship between two groups, and the dimension (layer) of an edge is either artist or tag. This definition allows for creating and layering together two separate networks: (1) networks in which the relationship between two music groups is defined as having a shared artist or band member, and (2) networks in which the relationship is defined by a shared tag. Edges in our graph are thus represented as tuples (u, v, d) or, in our case, (Artist, Artist, Dimension: label), where Dimension is either artist or tag. For instance, below are two such types of edges:

  • (The Yardbirds, Cream, artist: Eric Clapton)

or

  • (The Rolling Stones, The Who, tag: British)
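As an illustration, the two example edges can be held in code as plain labeled tuples and grouped by dimension. This is only a minimal sketch: the `Edge` type and the `edges_by_dimension` helper are hypothetical names introduced here, not taken from the authors' notebook.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One labeled edge (u, v, d) of the multidimensional graph G = (V, E, D)."""
    u: str          # first musical group
    v: str          # second musical group
    dimension: str  # layer the edge belongs to: "artist" or "tag"
    label: str      # the shared artist or tag that creates the edge


# The two example edges from the text.
edges = [
    Edge("The Yardbirds", "Cream", "artist", "Eric Clapton"),
    Edge("The Rolling Stones", "The Who", "tag", "British"),
]


def edges_by_dimension(edge_list):
    """Group edges by layer so each dimension can be inspected separately."""
    layers = defaultdict(list)
    for edge in edge_list:
        layers[edge.dimension].append(edge)
    return dict(layers)


layers = edges_by_dimension(edges)
# The vertex set V is every group that appears at either end of an edge.
vertices = {name for edge in edges for name in (edge.u, edge.v)}
```

Keeping the dimension and the label as separate fields makes it easy to treat the artist layer and the tag layer independently, or together as one multigraph.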

Graph Construction

Our approach to constructing artist and tag-based network graphs follows a breadth-first-search-style procedure: we begin with a root node and systematically traverse an entire level of a graph structure before moving on to the next level [12]. But rather than searching for a single type of nodes in a network, we build alternating layers of artists and musical groups, from which a graph structure is created. The flowchart of our graph construction is depicted in Fig. 1.

Fig. 1 Illustration of graph construction procedure

As shown in Fig. 1, we start with a single seed music group, from which we retrieve a list of all artists that have ever been a member of the group. These artists make up the first artist level in the tree structure. Next, we traverse the first artist level, retrieving all musical groups of which each artist has been a member. By repeating this process for each successive layer, we are able to construct a connected artist-based graph. By traversing each band level and appending each band to a separate list, we create a set of music groups that serves as the node set of the graph and is the foundation of our graph building method.
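The alternating traversal described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: `build_node_set`, `members_of`, `groups_of`, and `max_nodes` are hypothetical names, and the membership data is a small hand-written sample, not the MusicBrainz dictionaries.

```python
def build_node_set(seed_group, members_of, groups_of, max_nodes):
    """Alternate between group levels and artist levels, breadth first.

    members_of: dict mapping a group to the artists who have been members.
    groups_of:  dict mapping an artist to the groups they have belonged to.
    Returns the list of groups (graph nodes) in order of discovery.
    """
    nodes = [seed_group]
    seen_groups = {seed_group}
    seen_artists = set()
    current_level = [seed_group]

    while current_level and len(nodes) < max_nodes:
        # Artist level: everyone who has been a member of the current groups.
        artist_level = []
        for group in current_level:
            for artist in members_of.get(group, []):
                if artist not in seen_artists:
                    seen_artists.add(artist)
                    artist_level.append(artist)

        # Next group level: every group those artists have belonged to.
        next_level = []
        for artist in artist_level:
            for group in groups_of.get(artist, []):
                if group not in seen_groups and len(nodes) < max_nodes:
                    seen_groups.add(group)
                    nodes.append(group)
                    next_level.append(group)
        current_level = next_level

    return nodes


# Hand-written sample memberships (illustrative, incomplete line-ups).
members_of = {
    "The Yardbirds": ["Eric Clapton", "Jeff Beck", "Jimmy Page"],
    "Cream": ["Eric Clapton", "Jack Bruce", "Ginger Baker"],
    "Led Zeppelin": ["Jimmy Page", "Robert Plant"],
    "Blind Faith": ["Eric Clapton", "Ginger Baker", "Steve Winwood"],
}

# Invert the mapping to look up groups by artist.
groups_of = {}
for group, artists in members_of.items():
    for artist in artists:
        groups_of.setdefault(artist, []).append(group)

nodes = build_node_set("The Yardbirds", members_of, groups_of, max_nodes=10)
```

The `max_nodes` cap plays the role of the graph size V used later in the chapter; traversal stops as soon as the node set reaches that size or no new groups can be reached.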

It is important to note that, to obtain a manageable graph with decent depth and breadth, we first use artist-artist relationships to generate our initial artist graphs. Because an artist may be labeled with as many as 20 user tags, the use of tag-based relationships can result in tree structures that are incredibly shallow (only a few layers “deep”) and wide, in comparison with artist-based graphs of the same size. Our experimentation showed that wide, shallow trees were too large and contained far too much variation to be able to produce any meaningful result.

With the nodes in our initial artist graphs, the process of generating the sets of artist and tag edges is straightforward. Similar to artists, tags represent a set of user-submitted descriptors assigned to groups and artists. For example, the tags for the group The Who include “rock,” “classic rock,” “mod,” and “British invasion” [13] (see Table 1). We compare each music group’s artists or tags against every other music group in the node set. If they share a common artist or tag, an undirected, labeled edge is then added to the graph. All of our undirected, multidimensional music graphs are created in this way by combining node, artist-edge, and tag-edge sets.
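The pairwise comparison described above might look like the sketch below. The function name `shared_label_edges` is hypothetical, and apart from the tags listed for The Who in the text, the tag lists are placeholders chosen only to make the example run.

```python
from itertools import combinations


def shared_label_edges(nodes, labels_of, dimension):
    """Add one undirected, labeled edge per label shared by a pair of groups."""
    edges = []
    for u, v in combinations(nodes, 2):
        shared = set(labels_of.get(u, ())) & set(labels_of.get(v, ()))
        for label in sorted(shared):
            edges.append((u, v, dimension, label))
    return edges


nodes = ["The Yardbirds", "Cream", "The Who"]

# Sample memberships (hand-written, incomplete).
members_of = {
    "The Yardbirds": ["Eric Clapton", "Jeff Beck", "Jimmy Page"],
    "Cream": ["Eric Clapton", "Jack Bruce", "Ginger Baker"],
}

# Tags for The Who follow the text; the other two lists are placeholders.
tags_of = {
    "The Who": ["rock", "classic rock", "mod", "british invasion"],
    "Cream": ["rock", "blues rock"],
    "The Yardbirds": ["blues rock", "british invasion"],
}

artist_edges = shared_label_edges(nodes, members_of, "artist")
tag_edges = shared_label_edges(nodes, tags_of, "tag")

# The multigraph is the node set plus both edge sets.
multigraph_edges = artist_edges + tag_edges
```

Because every pair of groups is compared, the cost grows quadratically with the number of nodes, which is one reason a bounded graph size is practical.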

When experimenting with artist-based graphs of varying size and complexity, we found a flaw in the aforementioned procedure: in certain cases, an ideal level of musical variety cannot be reached efficiently. For example, to obtain a desired level of variety for a graph that contains two groups associated with very different genres, we may have to create incredibly large graphs that are inefficient to work with and evaluate. In other cases, a “natural” connection between two bands or artists may not exist at all.

To address this issue, we layer together multiple artist-based multidimensional graphs to form combined multilayer graphs that may contain any number of specified artists. We develop such graphs by first creating separate artist-based graphs and then repeating the process to incorporate tag-based edges into each graph. With a fair number of graphs of varying origin and size, shared user-submitted tags become more likely to connect two artists whose natural relationship either does not exist or would require an unreasonably large graph to find a path between them. For example, creating a network to link the artists The Who and The Cure requires a very large network graph. With the use of the ‘Pop Rock’ user tag, however, the two graphs containing either artist can be easily connected.
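The effect of layering can be sketched with two tiny artist-based graphs that only become connected once a shared tag edge is added. All function names here are hypothetical, the two artist edges are a hand-written sample, and the 'pop rock' tag follows the example in the text.

```python
from itertools import combinations


def combine_graphs(graphs):
    """Union the node and edge sets of several per-seed graphs."""
    nodes, edges = set(), set()
    for graph_nodes, graph_edges in graphs:
        nodes.update(graph_nodes)
        edges.update(graph_edges)
    return nodes, edges


def add_tag_edges(nodes, edges, tags_of):
    """Return the edges plus one tag edge per tag shared by any two nodes."""
    result = set(edges)
    for u, v in combinations(sorted(nodes), 2):
        for tag in set(tags_of.get(u, ())) & set(tags_of.get(v, ())):
            result.add((u, v, "tag", tag))
    return result


def count_components(nodes, edges):
    """Count connected components with a simple depth-first search."""
    neighbours = {node: set() for node in nodes}
    for u, v, _dimension, _label in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    unvisited, components = set(nodes), 0
    while unvisited:
        components += 1
        stack = [unvisited.pop()]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt in unvisited:
                    unvisited.remove(nxt)
                    stack.append(nxt)
    return components


# Two separate artist-based graphs (hand-written sample data).
who_graph = ({"The Who", "Faces"},
             {("Faces", "The Who", "artist", "Kenney Jones")})
cure_graph = ({"The Cure", "Siouxsie and the Banshees"},
              {("Siouxsie and the Banshees", "The Cure", "artist", "Robert Smith")})

nodes, artist_edges = combine_graphs([who_graph, cure_graph])
tags_of = {"The Who": ["pop rock"], "The Cure": ["pop rock"]}
all_edges = add_tag_edges(nodes, artist_edges, tags_of)
```

With artist edges alone the combined graph has two components; the single shared-tag edge joins them into one, which is the behaviour the layering step relies on.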

Measuring Node Importance

To evaluate the importance of nodes in a graph, we calculated a number of graph connectedness measures, such as degree centrality, load centrality, and PageRank score. The PageRank score was calculated in Python using the NetworkX package (https://github.com/jrwaggs/music/blob/master/Network%20Graph.ipynb) [14]. Of these metrics, PageRank consistently produced more relevant results than the others. For example, Cream, Eric Clapton, and The Yardbirds all have many artists and tags in common and are relatively important to that era of music; their PageRank scores are closer to one another than their scores under the other centrality measures. The performance of PageRank can be explained by its design, which considers not only the number of connections a node has in a graph, but also the relative importance of neighboring nodes [15]. It is important to measure the importance of each node accurately, because the top-ranked artists in a graph are used by our method to produce artist recommendations.
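To make the ranking step concrete, the sketch below implements a plain power-iteration PageRank in standard-library Python. It is a stand-in for illustration, not the NetworkX routine used in the chapter: the damping factor of 0.85, the fixed iteration count, and the choice to let parallel edges add to the transition weight are all assumptions, and the edge list is a hand-written sample.

```python
from collections import defaultdict


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected multigraph.

    Each undirected edge is treated as two directed edges, and parallel
    edges between the same pair of groups add to the transition weight.
    """
    nodes = list(nodes)
    weight = defaultdict(lambda: defaultdict(float))
    for u, v, *_ in edges:
        weight[u][v] += 1.0
        weight[v][u] += 1.0

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not weight[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u in nodes:
            total = sum(weight[u].values())
            for v, w in weight[u].items():
                new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


def top_ranked(rank, k):
    """Return the k highest-scoring nodes, best first."""
    return sorted(rank, key=rank.get, reverse=True)[:k]


nodes = ["The Yardbirds", "Cream", "Blind Faith", "Led Zeppelin"]
edges = [
    ("The Yardbirds", "Cream", "artist", "Eric Clapton"),
    ("Cream", "Blind Faith", "artist", "Eric Clapton"),
    ("Cream", "Blind Faith", "artist", "Ginger Baker"),
    ("The Yardbirds", "Blind Faith", "artist", "Eric Clapton"),
    ("The Yardbirds", "Led Zeppelin", "artist", "Jimmy Page"),
]

rank = pagerank(nodes, edges)
recommendations = top_ranked(rank, 2)
```

In this sample three groups have the same number of incident edges, yet their scores differ because PageRank also accounts for how well connected each neighbour is, which is the property the text credits for its more relevant rankings.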

To verify the efficacy of artist-based networks, in conjunction with PageRank, we constructed two multigraphs and examined the top 25 ranked artists produced by each. The results of the two multigraphs are provided in Appendix A. The first graph was constructed using five root artists commonly found together in a classic rock playlist: The Rolling Stones, The Who, The Eagles, Van Halen, and Boston. To demonstrate the ability of our method to link disparate musical groups, the second graph is constructed using five root artists that are more varied in genre and musical era: R.E.M., Abba, Megadeth, The Spice Girls, and The Cardigans. Both graphs were constructed using the same specifications: a graph of size V = 200 was constructed with both artist and user-tag edge sets for each artist and then individual graphs were combined into a single multigraph.

Graph Evaluation

We employed two commonly used performance metrics, accuracy and novelty, to evaluate our multigraph models.

Accuracy

A recommendation system should maintain a level of user comfort by including some artists with whom the user is already familiar. Accuracy measures the similarity of the recommended artists to the artists in the user’s playlist. Let UA_u be the set of artists in the original user playlist and RA_u the set of recommended artists. We define the accuracy of an individual playlist as:

$$ \mathrm{Accuracy}=\frac{\left|{RA}_u\cap {UA}_u\right|}{\left|{RA}_u\right|} $$
(1)

The accuracy of our recommendation model is defined as the average of the accuracies of the individual playlists, each calculated using Eq. (1).

Novelty

Together with a number of other metrics, novelty was used by Chou et al. to evaluate several MRS [16]. A user is better served if recommendations include new and novel songs or artists. Novelty measures music discovery by quantifying the recommended artists that are not in the original playlist. Let UA_u be the set of artists in the original playlist and RA_u the set of recommended artists. The novelty of an individual playlist is defined as:

$$ \mathrm{Novelty}=\frac{\left|{RA}_u\backslash {UA}_u\right|}{\left|{RA}_u\right|} $$
(2)

To measure method novelty, we compute the novelty for each individual playlist and take the average as the novelty measurement of our method [16].

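From Eqs. (1) and (2), we can see that accuracy and novelty are complementary: every recommended artist is either in the original playlist or not, so the two measures sum to one, and one rises exactly as the other falls. Hence, in the section below, we focus our discussion on novelty.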
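Equations (1) and (2) translate directly into set operations. The following is a minimal sketch with hypothetical function names and an invented example playlist; it also checks that the two measures sum to one.

```python
def accuracy(recommended, playlist):
    """Eq. (1): share of recommended artists already in the playlist."""
    recommended, playlist = set(recommended), set(playlist)
    return len(recommended & playlist) / len(recommended)


def novelty(recommended, playlist):
    """Eq. (2): share of recommended artists not in the playlist."""
    recommended, playlist = set(recommended), set(playlist)
    return len(recommended - playlist) / len(recommended)


# Invented example: one of four recommendations is already in the playlist.
UA = {"The Who", "Cream", "The Rolling Stones", "Boston"}
RA = {"The Who", "The Yardbirds", "Blind Faith", "Faces"}

acc = accuracy(RA, UA)
nov = novelty(RA, UA)
```

Since the intersection and the set difference partition RA, `acc + nov` equals 1 for any non-empty recommendation set.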

5 Results

Firstly, we evaluated the node importance metrics used in our model (see Appendix A), by calculating the correlation coefficient between them and Spotify popularity score, which is in the range of 0–100. The result in Table 3 shows PageRank and degree centrality are moderately correlated to Spotify popularity score, with correlation coefficients being 0.39 and 0.51, respectively.

Table 3 The correlation between node importance metrics and Spotify Popularity Score
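The chapter does not state which correlation coefficient was used. Assuming the Pearson coefficient, the comparison between a node importance metric and the Spotify popularity score could be computed as in the sketch below; the function name is hypothetical, and the input values are synthetic and chosen only to show the two extreme cases.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Synthetic values: a perfectly linear and a perfectly inverse relationship.
importance = [0.1, 0.2, 0.3, 0.4]
popularity_up = [20, 40, 60, 80]
popularity_down = [80, 60, 40, 20]

r_up = pearson(importance, popularity_up)
r_down = pearson(importance, popularity_down)
```

A coefficient near 1 or -1 indicates a strong linear relationship, so values such as those reported in Table 3 sit in the moderate range.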

Using the graph for the Rock Classics playlist, Fig. 2 shows a scatterplot of PageRank and Spotify popularity score. It confirms the moderate correlation between artists’ PageRank and Spotify popularity scores.

Fig. 2 Scatterplot of PageRank and Spotify Popularity scores

Next, we evaluated artist recommendation by mimicking a real-world context. For each of the seven Spotify user playlists in Table 2, we generated a number of multigraphs and computed the average novelty of the artists in the graphs against the original playlist. We also evaluated two parameters used in graph construction – number of seed artists and size of individual graphs – and assessed their influence on model accuracy and novelty.

Below is the description of our evaluation process. We applied the same procedure to each playlist and assessment:

  1. Let UA be the set of artists in the original user playlist. We used a random number generator to select N random seed artists from UA.

    • To study the effect of seed artist selection on novelty, we selected 2, 5, 7, 10, and 15 artists, respectively, for our test.

    • To study the impact of graph size on novelty, we selected 10 artists in our experiment.

  2. For each artist, we built graphs of size V and combined them to form a single multigraph.

    • For the study of the effect of seed artist selection on novelty, we built graphs of size V = 100.

    • For the study of the impact of graph size on novelty, we built graphs of size V = 10, 25, 50, 75, 100, 150, 200, and 250, respectively.

  3. We evaluated the graphs and calculated a PageRank score for each node.

  4. Let RA be the set of recommended artists derived from the graphs and let |RA| = |UA|. Then we computed accuracy and novelty against the original playlist.

  5. We repeated this process 10 times for each playlist to compute average accuracy and novelty.
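The five steps above can be sketched as a single evaluation loop. Everything here is a hedged illustration: `evaluate_playlist` is a hypothetical name, and the graph builder and ranking function are passed in as parameters; the `toy_build` and `toy_rank` stand-ins below exist only so the example runs and do not reproduce the chapter's graph construction or PageRank.

```python
import random
from statistics import mean


def evaluate_playlist(playlist_artists, build_multigraph, rank_nodes,
                      n_seeds, graph_size, runs=10, rng=None):
    """Average novelty over repeated runs, following steps 1 to 5."""
    rng = rng or random.Random()
    playlist = set(playlist_artists)
    novelties = []
    for _ in range(runs):
        # Step 1: pick N random seed artists from the playlist.
        seeds = rng.sample(sorted(playlist), min(n_seeds, len(playlist)))
        # Step 2: build one graph per seed and combine them.
        nodes, edges = build_multigraph(seeds, graph_size)
        # Step 3: score every node.
        scores = rank_nodes(nodes, edges)
        # Step 4: recommend the top |UA| nodes and compare with the playlist.
        ranked = sorted(scores, key=scores.get, reverse=True)
        recommended = set(ranked[:len(playlist)])
        novelties.append(len(recommended - playlist) / len(recommended))
    # Step 5: average over the repeated runs.
    return mean(novelties)


def toy_build(seeds, graph_size):
    """Stand-in builder: link every seed to invented neighbour nodes."""
    neighbours = [f"neighbour {i}" for i in range(graph_size)]
    edges = [(s, n, "artist", "placeholder") for s in seeds for n in neighbours]
    return list(seeds) + neighbours, edges


def toy_rank(nodes, edges):
    """Stand-in ranking: score each node by its number of incident edges."""
    degree = dict.fromkeys(nodes, 0)
    for u, v, *_ in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


playlist = ["artist A", "artist B", "artist C", "artist D", "artist E", "artist F"]
average_novelty = evaluate_playlist(playlist, toy_build, toy_rank,
                                    n_seeds=2, graph_size=10,
                                    rng=random.Random(0))
```

Passing the builder and ranking function as arguments keeps the loop independent of how graphs are constructed, so the same harness can sweep the number of seed artists or the graph size.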

Figure 3 shows the average novelty of our recommendations for the seven playlists. The x-axis represents the number of nodes (artists) in the resulting multigraphs. As graph size increases, the novelty of our recommendations decreases significantly for two of the playlists and stays relatively flat for the others. This is expected, as the two playlists exhibiting a drastic decrease in novelty have a higher percentage of multi-member bands and thus more artist-artist relationships from which to build graphs. This figure indicates that our recommendations are affected more by artist-artist relationships than by artist tag data.

Fig. 3 Multigraph novelty by nodes

The impact of the number of seed artists used for building graphs on recommendation novelty is provided in Fig. 4. With the increase of the number of seed artists used, as shown in the figure, the novelty drops uniformly. Therefore, for listeners in favor of novel music, fewer seed artists should be used for graph construction.

Fig. 4 Multigraph novelty by seed artists

Finally, we compared our results to the published work of Chou et al. [16]. In general, our average novelty, which ranges from 64% to 98%, is comparable to their results, which fell between 89% and 97%. Their study, however, used the KKBOX dataset rather than our MusicBrainz dataset. A more comprehensive evaluation is needed to better understand the capabilities and limitations of our method.

6 Discussion

In this chapter, we developed a multigraph method for music recommendation. We built graphs by utilizing artists’ relationships and user-submitted tags, which reinforce artists’ relationships and therefore help identify more relevant recommendations. Compared with current MRS, whose recommendations generally fall among the most popular artists and groups, our recommendations provide improved novelty and diversity and, hence, are more representative of the spectrum of musical groups and artists. Our method is well suited for use as an add-in or user-controlled option for popular streaming services, where the added diversity would provide a more distinctive listener experience. Our method can also be used as a component of an MRS to help overcome the cold-start issue in CF-based streaming services.

Our recommendations could be negatively affected by the ambiguity of artists’ names. Multiple bands can share the same name, and it is difficult to determine which one is referenced, particularly when looking up Spotify popularity scores. This challenge could potentially be resolved by assigning a unique ID to each artist, but we did not do so due to time constraints. In addition, we used the MusicBrainz data as our test data set. If an artist is missing, we have to add it to the dataset manually. This is an arduous process, and the burden could be alleviated by combining MusicBrainz with additional data sources. A more comprehensive dataset could further improve the accuracy of recommendations.

For future development, the graph structure can be optimized to generate more consistent recommendations. In our graph model, the edges that represent artist-artist relationships are clearly more relevant than tag-based edges, so one way to refine the model is to distinguish the two types of edges by assigning them different weights. Another direction is to integrate other relationships into the model, such as songwriter, producer, or other behind-the-scenes contributors; this would be particularly helpful in identifying stronger relationships for single artists or new groups with few or no artist-artist connections. Lastly, to improve our music recommendations, it is necessary to have them evaluated by real-world users, whose level of satisfaction can be used to fine-tune the model.