A multi-label movie genre classification scheme based on the movie’s subtitles

Rajput, Nikhil Kumar; Grover, Bhavya Ahuja

doi:10.1007/s11042-022-12961-6

Published: 13 April 2022

Volume 81, pages 32469–32490, (2022)

Multimedia Tools and Applications

Abstract

Prediction of movie genres is an intriguing problem with several applications, such as designing recommendation systems for audiences, analyzing movie box office performance and understanding the theme of a movie. It is a classic multi-label classification problem. An algorithm for movie genre detection is proposed that is built on the as yet unused movie subtitles, which are a documented account of the movie’s visual content and dialogues. The basic idea is to identify words that have high frequency in a particular genre and use them as features for training machine learning classification models. The performance of the algorithm was tested on English subtitles of 964 movies of six genres: Action, Fantasy, Horror, Romance, Sports and War. Experiments were conducted with varying numbers of features and six machine learning models. The best result was obtained using K-Nearest Neighbour (kNN), with the average precision for all genres being 77.7% with 200 features. Another noteworthy result was an average precision of 75.2% using kNN with merely 50 features. The algorithm performed very well for the genres Sports and War, with above 90% precision in some cases.

1 Introduction

Movies are characterized in several ways, including their type (for instance, cinema, animation, documentary, flash), duration, background score and emotional or suspense quotient. One of the most important traits of a movie is its genre. Several genres are prevalent, such as sports, horror, romance, thriller, crime and fantasy. A movie’s genre conveys its basic theme. A movie may belong to only one genre or, as is generally the case, combine multiple genres across its various parts. Genres are mostly decided manually by the director of the movie or by experts such as critics. It is imperative that a movie’s genre be correctly identified, as a large part of the audience decides whether to watch a movie on the basis of its genre, which reflects the probable content of the movie.

With machine learning, it is relatively easy to identify a movie’s genre using algorithms that are trained on past information and can then make reliable predictions. This automation has made the task faster and simpler; the major concern, of course, is how effectively the classifier performs in making predictions.

If a movie’s genre is known, it can be used in several applications. One of these is predicting a movie’s box office performance. The authors in [16] produced a genre-specific empirical analysis using basic and extended regression models to identify key factors that determine the success of a movie at the box office. They based their study on two genres, computer animation and comic-based films, and used a dataset spanning thirty years. They deduced in their analysis that actors with exceptional popularity, award nominations and production budget play an important role in deciding a film’s success. They specifically highlighted the fact that genre preference affects the movie choices of consumers.

The most relevant movie genre application is in designing movie recommendation systems that extend recommendations to the audience in view of their past movie likings. Several such algorithms have been proposed by academia. In [3], the authors propose a movie recommender system set to solve the well known cold start problem during collaborative filtering. The recommender system is based on category correlations which include movie genres provided by directors and experts. They computed genre correlations in two ways; one on different number of movies and the other over movies across several decades. Their experiment on GroupLens movie database showed that precise recommendations could be made using decade-based genre correlations. A movie recommendation algorithm built on genre correlations was proposed that measures correlation between genres (using ratings given by users) based on a genre probability and weight and further classifies movies based on these correlations [13]. This classification list is then recommended to the users. But the algorithm is posed with the issue of data sparsity. A movie recommender system developed on movie genre preference was proposed in [2] that utilizes neuro-fuzzy decision tree (NFDT). User reviews for multiple genres and their star ratings are used for training.

Another movie recommender system was developed in [17] that uses genre similarity and preferred genres. Genre similarity is obtained using the Pearson correlation coefficient, and clusters are then formed using the k-nearest method. A recommendation system built on user comments and reviews on YouTube was described in [9]. The authors used their Morphological Sentence Pattern (MSP) model to extract relevant aspects and expressions. These were then used to derive genre similarity based on tf-idf vectors and a genre score, which is proposed as a measure of correlation between a movie and the genres. After this, K-Nearest Neighbour and K-means clustering are used to group movies based on their genre similarity. Results for 100 movie reviews and 2,000 YouTube comments each for a total of 100 movies have been given and found quite satisfactory. In the next section, we discuss several such algorithms proposed in this domain.

The major contribution of the paper is as follows:

This paper proposes a multi-label movie classification scheme built on the movie’s subtitles. Subtitles are a complete account of the movie’s content. We would like to clarify here that by subtitles we do not mean the lines appended to the main title. For instance, for the movie “Holiday: A Soldier Is Never Off Duty”, “A Soldier Is Never Off Duty” is a subtitle for the main title “Holiday”, but this is not the subtitle we refer to. By subtitle, we mean the dialogues or sounds displayed with each scene at the bottom of the screen. Subtitle files are readily available and contain the movie’s dialogues, sounds, exclamations, a description of the background etc. Subtitles are hence the actual representation of the complete movie script. The classification is multi-label, which means the algorithm predicts multiple genres for a movie. To our knowledge, the work is novel as no scheme based on subtitles has been proposed yet. The proposed scheme can be extremely useful in designing genre-based recommender systems for audiences and in predicting box office collections for a movie through text analysis of the readily available movie subtitles. The experimental results indicate promising performance of the proposed scheme.

The next section provides a review of some of the important schemes proposed in this area by several researchers. Section 3 describes the proposed classification scheme built on subtitles. Section 4 presents and discusses the results obtained by the classification model built from the proposed scheme. The last section concludes the paper.

2 Literature review

Several genre-based movie classification schemes have been proposed that can also be used to predict the genre of a movie and to apply that knowledge in developing recommender systems. The features used for classification have been taken from sources such as the movie’s plot, images, trailers or visuals. Below, we discuss some well-known techniques built on these features.

Several models have been derived from movie plots. An approach based on Wikipedia movie plots has been given that detects fractions of a genre present in a movie [20]. A text mining technique was used to develop bag-of-words with frequencies 1, 5 and 15. A corpus for 20 genres was created and the results were produced by training on 540 movie plots, with the best results reported on the refined corpus with word frequency > 15. Topological data analysis was used as a tool to build a movie genre classification system [6]. Persistent homology was used to quantify topological features in movie plot summaries. Term frequency matrices were generated for the top words of a genre and barcodes for the same were created. Movie genres were identified by comparing one-dimensional holes/loops in each barcode. A Jaccard score of 54.8% was reported during performance evaluation on 250 movies and 4 genres.

Images can prove immensely useful for assigning labels in varied classification problems. In [25], the authors propose a CNN-based sketch-based image retrieval (SBIR) system to recommend images that are similar to a sketch. Similarly, [8] proposes an image classification scheme for plant species identification. Hence, the use of images is well established in recommendation systems. Some approaches have been developed on movie poster images. A deep neural network based model was proposed in [4] that classifies movies into genres based on movie posters. A convolutional neural network was trained to extract a visual representation and then objects were detected in the posters. The approach was tested on 8191 images with 23 genres and the classifier assigned probabilities for each genre, the thresholds of which were decided by a grid search scheme. In another work, movie posters were used to extract semantic features for movie genre classification. Twelve meaningful features were derived including theme, layout, emotion and dominant color. After computing these values, classification was done using five multi-label algorithms: Multi-Label kNN (MLkNN), Binary Relevance, Classifier Chains (CC), RAndom k-labELsets (RAkELd) and Label Powerset (LP), combined with 3 classifiers: Multinomial Naïve Bayes (MNB), C-Support Vector Classification (SVC) and Random Forest Classifier (RF) [23]. Multinomial Naïve Bayes with Label Powerset gave a Jaccard score of 41.78% while testing with 18 genres and the MovieLens 100k dataset. An approach for movie genre classification utilizing low-level features like color and edges from movie posters was proposed in [15]. The extracted features were used to train the classifiers distance ranking, Naïve Bayes and RAKEL. Results were obtained for 1500 posters and 6 movie genres with an accuracy of 67% for at least one of two correctly detected labels.

Movie trailers that possess both audio and visual components are also significant in predicting movie genres. A multi-label movie genre classification approach built on movie trailers was proposed that used deep convolution neural networks. The method named Convolution-Through-Time for Multi-label Movie genre Classification (CTT-MMC) uses an ultra deep neural network with residual connections and a convolutional layer to redeem temporal information from image based features [26]. A scheme derived from movie previews used audio-visual features of previews to classify movies into genres [19]. First, movies were classified into action and non-action based on visual disturbance and average shot-length. Then, color, audio and cinematic attributes were used to further classify into genres like comedy, horror and drama. Features like light intensity, sudden changes in audio level, motion were used and tested against thresholds for classification.

A meta-heuristic optimization algorithm termed Self-Adaptive Harmony Search (SAHS) was used to extract relevant audio and visual features from movie trailers [12]. The extracted features were then fed to an SVM to assign a genre to the movie. Experiment was conducted on 223 movie trailers and a total of 277 features were determined and 25 of them were used for classification. An accuracy of 91.9% was reported and it was seen that audio features were more relevant than visual features. Scene categorization from movie trailers was used in [28], for movie genre classification. The trailer is fragmented into keyframes using shot boundary analysis. Then, scene extractor and descriptor schemes namely GIST, CENTRIST and W-CENTRIST are used for extracting features which are then used to classify the movie genre using nearest neighbor approach. The scheme was tested on 1239 movie trailers over 4 genres and the best accuracy reported was 74.7%. Probabilistic latent semantic analysis (PLSA) was used in [11] to classify movie genres using movie previews. Audio and visual features were derived from the previews and text is obtained from social tags. Three models have been proposed: Standard PLSA for using only one feature from audio, video and text, double PLSA using two features and triple PLSA using all the three. Experimentation on 140 movie previews and their tags and 4 movie genres show that the triple PLSA scheme performs best and the authors also highlight the significance of the text feature in the same.

A classification model based on movie scenes was proposed in [14] wherein the authors classify the scene into eight emotional categories. The features were derived using an affective audio-visual words (AAVWs) method that was built upon the tf-idf technique. For labeling the features to an emotion, the authors present a model named latent topic driving model (LTDM) as a combination of topic model based on latent Dirichlet allocation and an emotional model estimating sequence of emotions through past scenes. LTDM uses conditional probability for classification. Their results based on SAR highlight the good performance of the model.

A scheme for identification of starring characters in movie scenes was proposed in [10]. The technique termed DeepStar was designed to detect main characters in a scene by extracting clear faces from the scene, face clustering using robust deep features and selecting the starring characters by generating an occurrence matrix. This work is potentially useful for movie analysis for instance, movie summarization and indexing. Another approach significantly useful in movie summarization is proposed in [24]. The technique has been designed to be able to provide user preferred summarization. The tools deployed include an entropy-based shots segmentation, computation of temporal saliency of shots to improve detection of character faces and facial expression recognition using trained deep CNN model to classify into seven emotions. An attempt on similar lines includes the paper [27], wherein the authors propose a framework for emotion detection in a video using emotion recognition system, emotion attribution and emotion based summarization. The authors use an auxiliary emotional image dataset to improve the performance of the emotion recognition system.

A unique movie genre classification scheme that utilized the movie’s music score was proposed in [1]. The authors analyzed instrumental music using timbral and select rhythm features to classify the genre into Action, Drama, Romance and Horror. Support vector machines were used as the classifier. An investigation on music score of 98 movies showed best results for action while drama and romance were least distinguishable. The scheme could further be improved using more features. A song extractor was proposed in [7] that fragments the movie into musical and non-musical segments and produces the song part. Also the genre of the song is predicted based on audio and video sequences in the song. Three genres were identified: tragic, pop and romance. The classifier is built using SVM. On a dataset of 105 movies, an accuracy of 89.5% was reported.

Some schemes have been devised from a combination of different features. An interesting work has been presented in [18] that predicts the genre of a movie by deriving features from the movie’s synopsis and an image description. The approach uses a measure of thematic intensity created using the synopsis text and color and activity features from images. The method has been tested on 107 animated movies for estimating their drama content and reports a precision of 78%.

A genre classification scheme for Flash movies based on Bayesian classifier was presented that classified the movie into six genres using 10 features some of which are movie length, amount of user interactions, number of event sounds and embedded images/videos [5]. The performance of the approach was tested on 2000 Flash movies and an average accuracy of 72.4% was reported. The authors in [22], gave a movie classification scheme based on nine movie type indicators (MTI) like Fun, Serious and Eye-Catching. The authors used principal component analysis to derive the relevant features from the audience reviews and then deployed K-Means clustering to cluster the movies using these MTIs.

Most of the schemes mentioned above utilize the movie’s plot, images, trailers, videos or scenes, or a combination of these, to predict its genre. In this paper, we use the as yet unutilized movie subtitles to build our multi-label genre classification technique, described in the next section.

3 Proposed multi-label movie genre classification scheme

In this section, we propose the multi-label movie genre classification algorithm that is developed on the movie subtitles. The model broadly consists of five modules: Data collection and preprocessing, Feature extraction, Feature selection, Building the data set and Training.

In the following subsections, we provide details of the processes involved in each module.

3.1 Data collection and preprocessing

Movie subtitles can be considered a script-aligned description of the movie in terms of its dialogues and actual sequences. Sometimes, they also reflect the background score, character expressions and utterances. First, a set of movie subtitles is collected. These are .srt files that mainly contain two types of information for each scene:

1. Some text representing the audio and visual content of the scene
2. Time sequences

For instance, Fig. 1 shows the first three subtitle entries for the movie “Clash of The Titans”. Hence the data comprises words and time values. Before the data can be used for prediction, it needs to be preprocessed. The major filtering done in this module involves:

1. Removing the time values as they do not play any role in prediction
2. Converting the text to lower case
3. Removal of stop words like “and”, “the” as they are irrelevant
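The three filtering steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list below is a tiny hypothetical one (the paper does not name the list it used), the tokenizer is a simple regular expression, and the sketch additionally drops the numeric cue counters and inline markup that .srt files usually carry, which the text does not mention explicitly.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the paper does not say which list was used.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}

# Matches .srt timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")

def preprocess_srt(text):
    """Return lower-cased, stop-word-free tokens from the raw text of an .srt file."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, numeric cue counters (assumption) and timing lines.
        if not line or line.isdigit() or TIMING.match(line):
            continue
        line = re.sub(r"<[^>]+>", " ", line)  # drop markup such as <i>...</i> (assumption)
        for word in re.findall(r"[a-z']+", line.lower()):
            word = word.strip("'")
            if word and word not in STOP_WORDS:
                tokens.append(word)
    return tokens

sample = """1
00:00:01,000 --> 00:00:04,000
The team won the game!

2
00:00:05,500 --> 00:00:07,250
<i>And the crowd cheers.</i>
"""
print(preprocess_srt(sample))
```

In practice a fuller stop-word list would replace `STOP_WORDS`; the choice of list directly affects which words survive into the frequency dictionary built in the next module.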

3.2 Feature extraction

After the data is preprocessed, relevant features for training the machine learning algorithm need to be identified. The features being talked about in this case are those words that can help decide the genre of the movie. For instance, for a movie belonging to the “sports” genre, some of the expected words in the subtitle would be “game”, “play”, “score”, “team”, “victory”, “lost” and many more. Our submission here is that the frequency of occurrence of these words would be much higher in a movie belonging to the “sports” genre than a movie belonging to “romance” or “horror” category. Hence, our algorithm tries to extract these words that can serve as our feature set. Below we describe this process in detail. Consider there are M subtitle files. We build a dictionary that stores items of the form < w_i,f_i >, where w_i denotes the i^th word and f_i denotes the combined frequency of the i^th word in all the M subtitle files. Initially the dictionary is empty. It is built by the following steps:

1. Each subtitle file is read and tokenized into words. Say, N words are found in the k^th subtitle.
2. For each word w_i, i ∈ [1, N], its frequency of occurrence, f_i, is computed.
3. Now, w_i is located in the dictionary. If the word w_i is found, say at j^th index in the dictionary, the corresponding frequency f_j is obtained from the dictionary and updated to f_i + f_j. If w_i is not found, then, a new entry < w_i,f_i > is made in the dictionary.
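The dictionary construction above can be sketched as follows. The function name `build_dictionary` and the toy token lists are illustrative assumptions; the update-or-insert logic follows the three steps as described.

```python
from collections import Counter

def build_dictionary(tokenized_subtitles):
    """Merge per-file word frequencies into one corpus-wide dictionary {w_i: f_i}."""
    dictionary = {}
    for tokens in tokenized_subtitles:        # step 1: one token list per subtitle file
        file_freq = Counter(tokens)           # step 2: frequency of each word in this file
        for word, f_i in file_freq.items():   # step 3: update an existing entry or insert a new one
            if word in dictionary:
                dictionary[word] += f_i
            else:
                dictionary[word] = f_i
    return dictionary

# Toy input: three already-preprocessed subtitle files.
files = [["game", "team", "game"], ["love", "game"], ["team", "love", "love"]]
corpus_freq = build_dictionary(files)
print(corpus_freq)
```

A single `Counter` updated with every file would give the same totals; the explicit branch is kept here only to mirror step 3.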

Now, the algorithm extracts the words that are most likely to play a role in deciding the movie’s genre. These words serve as the feature set for the machine learning algorithm. To extract these words, we choose a lower and an upper threshold for the word frequency, termed Thresh_Low and Thresh_Upp respectively. For a total of L words in the dictionary, a word w_i, where 1 <= i <= L, is retained in the feature set if:

$$ Thresh_{Low} \le f_{i} \le Thresh_{Upp} \tag{1} $$

For this, we first rank the words in the dictionary in order of decreasing frequency and then extract the words with corresponding frequencies satisfying (1).
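A minimal sketch of this ranking and thresholding step is given below. The function name, the toy frequencies and the alphabetical tie-break are assumptions made for illustration; the threshold values 500 and 10000 are among those used later in the experiments.

```python
def extract_features(dictionary, thresh_low, thresh_upp):
    """Rank words by decreasing frequency and keep those satisfying condition (1)."""
    # Ties in frequency are broken alphabetically (an assumption, for determinism).
    ranked = sorted(dictionary.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, freq in ranked if thresh_low <= freq <= thresh_upp]

# Toy corpus-wide frequencies.
toy = {"said": 12000, "team": 640, "ghost": 510, "love": 510, "zebra": 3}
features = extract_features(toy, thresh_low=500, thresh_upp=10000)
print(features)
```

Here "said" is rejected for exceeding the upper threshold and "zebra" for falling below the lower one, which illustrates how the two bounds trim both very common and very rare words.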

In the present work, we haven’t derived any formal way to accurately deduce the values of Thresh_Low and Thresh_Upp. We would like to deeply analyze this factor in our future endeavors.

3.3 Feature selection

In this step, a feature selection algorithm is deployed to further extract more relevant features from the domain. Here we aim to choose those features in our data that contribute most to predict the target class. This can help improve the performance of the algorithm especially in case of high dimensional data sets. Feature selection can help reduce overfitting, improve model accuracy and also reduce training time.

In our proposed model, we have used the SelectKBest technique [21] for feature selection that extracts the most relevant features using the univariate statistical chi square test. The technique removes all the features except the K features that score the highest. A chi square test is performed on the sample to retrieve the best features. The function returns the scores obtained and the p-values.

We have tried using different values of K while conducting our performance analysis and some variation in results has been observed. This is an important parameter in our proposed model.
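To make the scoring concrete, the sketch below re-implements a chi-square feature score in plain Python for a single binary genre label and keeps the K highest-scoring columns. It is not the library call itself: scikit-learn provides this as `sklearn.feature_selection.SelectKBest` with the `chi2` score function, and how the scores for the six genre labels were combined is not stated in the paper, so the one-label setting here is an assumption.

```python
def chi2_scores(X, y):
    """Chi-square score of each non-negative feature column against a binary label y."""
    n_samples, n_features = len(X), len(X[0])
    p_pos = sum(y) / n_samples                   # share of samples carrying the label
    scores = []
    for j in range(n_features):
        total = sum(row[j] for row in X)         # total count of feature j over all samples
        obs_pos = sum(row[j] for row, label in zip(X, y) if label == 1)
        obs_neg = total - obs_pos
        score = 0.0
        for observed, prob in ((obs_pos, p_pos), (obs_neg, 1 - p_pos)):
            expected = prob * total              # count expected if feature and label were independent
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

def select_k_best(X, y, k):
    """Return the column indices of the k highest-scoring features, in column order."""
    scores = chi2_scores(X, y)
    order = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
    return sorted(order[:k])

# Toy data: column 0 is concentrated in the positive class, column 1 is uninformative.
X = [[5, 1], [4, 1], [0, 1], [1, 1]]
y = [1, 1, 0, 0]
print(chi2_scores(X, y), select_k_best(X, y, k=1))
```

A word whose counts are spread evenly across classes scores zero and is dropped first, which is why lowering K tends to retain only strongly genre-specific words.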

3.4 Building the data set

In this step, the data set is created, which keeps a record for each movie. The fields in the data set comprise the feature set built from words in the step above and the classes (genres in this case). The record for each movie is represented in the data set as follows:

For every word w_i in the feature set, if w_i is present in that movie’s subtitle, then the corresponding frequency of w_i in the movie, i.e. f_i, is recorded in the data set. If w_i is not present in the movie, the corresponding feature value is set to “0”.
To record the classes the movie belongs to, we do the following:

If the movie belongs to a particular genre, a “1” is recorded in the column corresponding to that genre. For other genres, the value stored is “0”. Hence, a binary vector with 1s for all genres that the movie belongs to and 0s for the others is recorded in the data set.

This process is followed for all the movies whose subtitles are available.
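The record layout can be sketched as below. The function name `build_record` and the three feature words are hypothetical; the six genres are those used in the experiments.

```python
from collections import Counter

GENRES = ["Action", "Fantasy", "Horror", "Romance", "Sports", "War"]

def build_record(tokens, movie_genres, feature_words):
    """Return (feature frequencies, binary genre vector) for one movie."""
    counts = Counter(tokens)
    # Frequency of each feature word in this movie's subtitle, 0 when the word is absent.
    features = [counts.get(word, 0) for word in feature_words]
    # 1 for every genre the movie belongs to, 0 otherwise.
    labels = [1 if genre in movie_genres else 0 for genre in GENRES]
    return features, labels

feature_words = ["team", "ghost", "battle"]  # hypothetical selected features
x, y = build_record(["battle", "team", "battle", "soldier"], {"Action", "War"}, feature_words)
print(x, y)
```

Stacking one such pair per movie gives the feature matrix and the multi-label target matrix used for training.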

3.5 Training

Once the data set is ready, the model is trained for building a classifier. A training set with a representation from all the genres is used to train the machine learning algorithm. The model can now be used for classification of a new movie into a genre through its subtitles.

Algorithm 1 presents the pseudocode for the proposed classification model and Fig. 2 illustrates the processes involved in each module of the algorithm. A miniature application of the algorithm is shown on a small sample from the subtitles of the movie “The Last Rescue” that belongs to the genres: “Action, Drama and War”. The complete subtitles from this movie along with several other movies would together formulate the training data set used to build the classifier.

4 Results and performance analysis

The performance of the proposed algorithm was tested on English subtitles of 964 movies belonging to 6 genres namely: Action, Fantasy, Horror, Romance, Sports and War. Each movie belongs to either one or more genres from the six genres considered. The data source of the subtitles was yifysubtitles.com from which complete .srt files of some real movies from across the world were picked up. The distribution of movies from each genre is as follows:

Action : 223, Fantasy : 223, Horror : 216, Romance : 227, Sports : 185 and War : 219

So, a nearly uniform representation from each genre has been taken. After data preprocessing, we experimented with three different values for Thresh_Low, keeping Thresh_Upp at 10000 to alter the number of words to be taken for final consideration for the feature selection module. The values of Thresh_Low were taken as 100, 500 and 1000. After this, feature selection using the Python SelectKBest technique (uses chi-square to find the most relevant features for the label), was done to select the final set of features to be used for training.

For training, we have used several different machine learning algorithms for obtaining the one that works well for all the genres considered. The algorithms used are Logistic regression, Support vector machine, Naïve bayes classifier, Decision tree, Neural network (multilayer perceptron with three hidden layers each with 20 nodes) and K-Nearest Neighbor (kNN) with K = 10.
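As an illustration of the kNN setting, the sketch below predicts a binary genre vector by majority vote among the K nearest training movies. The paper fixes K = 10 but does not state the distance metric or how the multi-label output is formed, so Euclidean distance and an independent vote per genre are assumptions here, and the toy data uses K = 3.

```python
import math

def knn_predict(train_X, train_Y, query, k):
    """Predict a binary genre vector for `query` from its k nearest training rows."""
    # Sort training rows by Euclidean distance to the query; ties are broken by index.
    order = sorted(range(len(train_X)), key=lambda i: (math.dist(train_X[i], query), i))
    nearest = order[:k]
    n_labels = len(train_Y[0])
    # Each genre is decided independently: 1 if more than half of the neighbours carry it.
    return [
        1 if 2 * sum(train_Y[i][g] for i in nearest) > len(nearest) else 0
        for g in range(n_labels)
    ]

# Toy data: two features and two genre columns.
train_X = [[10, 0], [9, 1], [8, 0], [0, 10], [1, 9]]
train_Y = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]]
print(knn_predict(train_X, train_Y, [9, 0], k=3))
print(knn_predict(train_X, train_Y, [0, 9], k=3))
```

Because raw word frequencies grow with movie length, a distance-based model like this may benefit from normalizing the feature vectors first; the paper does not say whether that was done.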

In order to prevent overfitting and to obtain a near uniform representation of each genre in the training set, the data set was randomly shuffled and k-fold cross validation with k = 10 was used while training each model.
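The shuffling and 10-fold split can be sketched with the standard library alone. The helper name `k_fold_indices` and the fixed seed are assumptions added for reproducibility; the sample count 964 and k = 10 come from the text.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # random shuffle before splitting
    folds = [indices[i::k] for i in range(k)]    # k folds whose sizes differ by at most one
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(964, k=10))
print([len(test) for _, test in splits])
```

Each movie appears in exactly one test fold, so averaging a metric over the ten folds uses every movie once for evaluation.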

Average classification results for the ten folds are presented in Tables 1–5, available in the Appendix section. Table 1 presents the results for Thresh_Low = 100. In this case, 3223 features were extracted and passed on to the feature selection module, from which the 2000 best features were selected. This case used the maximum number of features and can be considered the upper bound for the performance measure. The best results were obtained using the multi-layer neural network, with the average precision for all genres being 77.4% and the recall being 65.2%. The next best performance was achieved by logistic regression, with an average precision for all genres of 75.6%. In all cases, the classifier predicted genres most accurately for movies belonging to Sports and War. Good results were also obtained for “Horror” (except for kNN). The average precision stayed below 70% for the genres Romance (though 76% with kNN) and Action, which brought down the overall performance.
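For reference, per-genre precision and recall on binary label vectors can be computed as sketched below. Taking the "average precision for all genres" to be the unweighted mean of the per-genre precisions is an assumption; the paper does not spell out the averaging.

```python
def per_genre_precision_recall(y_true, y_pred):
    """Return a list of (precision, recall) pairs, one per genre column."""
    results = []
    for g in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[g] == 1 and p[g] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[g] == 0 and p[g] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[g] == 1 and p[g] == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0  # 0.0 when nothing was predicted
        recall = tp / (tp + fn) if tp + fn else 0.0     # 0.0 when the genre never occurs
        results.append((precision, recall))
    return results

# Toy example with four movies and two genre columns.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = per_genre_precision_recall(y_true, y_pred)
avg_precision = sum(p for p, _ in scores) / len(scores)
print(scores, avg_precision)
```

A high precision paired with a lower recall, as reported for the neural network, means the predicted genres are mostly correct but some true genres are missed.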

Tables 2–4 present the results for Thresh_Low = 500. A total of 694 features were extracted in this case. With this threshold, we investigated in depth the behavior of the classifier on varying the final number of features. The number of features was taken as 50, 100, 200, 300 and 500. The best average precision was achieved with kNN with 300 features at 76.9%. This can be attributed to the fact that the best 300 words from the 694 high-frequency ones would have been used for training. Also noteworthy is the 75.2% precision with kNN with merely 50 features.

The precision in all cases reduced from the values reported in case of Thresh_Low = 100 with exception being kNN. On varying the size of the final feature set, no significant changes were seen in the precision values across models. The precision ranged from [0.688 − 0.704] for logistic regression, [0.593 − 0.603] for naive bayes, [0.633 − 0.649] for SVM, [0.527 − 0.533] for decision tree, [0.718 − 0.731] for neural network, and [0.73 − 0.752] for kNN.

It is noteworthy that the neural network and kNN could deliver a precision > 90% for the genres of Sports and War in most cases. kNN also reported good results for Romance. Its performance degraded markedly in the Horror case.

In Table 5, the results for Thresh_Low = 1000 have been shown. The total number of features was 303. The training was done using 150 and 200 features. In this case, kNN gave the best average precision of 77.7% with 200 features (its best and the overall highest). Though kNN reported good results in this case, the performance of the other algorithms showed a minor drop. The neural network gave an average precision of 67.9%, followed by logistic regression at 64.2% with 200 features.

Figure 3 presents the precision values obtained in the different cases for all six genres on applying the six machine learning models. The average precision for all genres using the different models has also been depicted. The figure provides a comparative analysis of the performance of the different models when applied to the six genres. It is noteworthy that kNN performed the best for all genres except Horror and gave the highest average precision for all the genres. It could also give 100% precision in the case of Sports for Thresh_Low = 100 and 2000 features. Also, it can be seen that with only 200 features, kNN could provide a precision of nearly 80%. The results are hence quite promising, and we feel that they can be further improved by deriving a mathematical formulation to obtain the values of Thresh_Low and Thresh_Upp and by working on feature selection.

5 Conclusion and future work

This paper presents an algorithm for movie genre classification based on the movie’s subtitles. The algorithm retrieves the most relevant words that relate to a movie’s genre. The algorithm was run on 964 movies from six genres: Action, Fantasy, Horror, Romance, Sports and War. Experiments were conducted with six models and varying numbers of features selected through the Python SelectKBest module, which selects the K features with the highest chi-square scores computed between labels and features. 10-fold cross validation was used. An average precision in the range of 70–80% was obtained with the neural network, logistic regression and kNN, with the best being 77.7% for 200 features with kNN. Good classification results were obtained for Sports and War. For the other genres too, precision was obtained in the range of 60–70% in most cases. As part of future work, we would like to devise a strategy for choosing the values of Thresh_Upp and Thresh_Low. Also, we would work to improve the algorithm to get better results for the other genres as well. We also envisage extending the analysis to other genres not yet considered.

Code Availability

The code can be shared if required.

References

Austin A, Moore E, Gupta U, Chordia P (2010) Characterization of movie genre based on music score. In: 2010 IEEE international conference on acoustics, speech and signal processing, IEEE, pp 421–424
Bhatt RB (2009) Neuro-fuzzy decision trees for content popularity model and multi-genre movie recommendation system over social network. In: TENCON 2009-2009 IEEE region 10 conference, IEEE, pp 1–6
Choi SM, Ko SK, Han YS (2012) A movie recommendation algorithm based on genre correlations. Expert Syst Appl 39(9):8079–8085
Chu WT, Guo HJ (2017) Movie genre classification based on poster images with deep neural networks. In: Proceedings of the workshop on multimodal understanding of social, Affective and Subjective Attributes, ACM, pp 39–45
Ding D, Yang J, Li Q, Wang L, Wenyin L (2004) Automatic detection of flash movie genre using bayesian approach. In: 2004 IEEE International conference on multimedia and expo (ICME)(IEEE cat. no. 04TH8763), vol 1. IEEE, pp 603–606
Doshi P, Zadrozny W (2018) Movie genre detection using topological data analysis. In: International conference on statistical language and speech processing, Springer, pp 117–128
Doudpota SM, Guha S, Baber J (2013) Mining movies for song sequences with video based music genre identification system. Inform Process Manage 49 (2):529–544
Article Google Scholar
Fan J, Zhou N, Peng J, Gao L (2015) Hierarchical learning of tree classifiers for large-scale plant species identification. IEEE Trans Image Process 24 (11):4172–4184
Article MathSciNet Google Scholar
Han Y, Kim Y (2017) An extracting method of movie genre similarity using aspect-based approach in social media. ACM SIGAPP Applied Computing Review 17(2):36–45
Article Google Scholar
Haq IU, Muhammad K, Ullah A, Baik SW (2019) Deepstar: Detecting starring characters in movies. IEEE Access 7:9265–9272
Article Google Scholar
Hong HZ, Hwang JIG (2015) Multimodal plsa for movie genre classification. In: International workshop on multiple classifier systems, Springer, pp 159–167
Huang YF, Wang SH (2012) Movie genre classification using svm with audio and video features. In: International conference on active media technology, Springer, pp 1–10
Hwang TG, Park CS, Hong JH, Kim SK (2016) An algorithm for movie classification and recommendation using genre correlation. Multimed Tools Appl 75(20):12843–12858
Article Google Scholar
Irie G, Satou T, Kojima A, Yamasaki T, Aizawa K (2010) Affective audio-visual words and latent topic driving model for realizing movie affective scene classification. IEEE Transactions on Multimedia 12(6):523–535
Article Google Scholar
Ivasic-Kos M, Miran P, Luka M (2014) Movie posters classification into genres based on low-level features. In: 2014 37th international convention on information and communication technology, electronics and microelectronics (MIPRO),IEEE, pp 1198–1203
Kaimann D (2013) ’to infinity and beyond!’-a genre-specific film analysis of movie success mechanisms. Center for International Economics Working Paper Series (2011-05)
Kim KR, Moon N (2012) Recommender system design using movie genre similarity and preferred genres in smartphone. Multimed Tools Appl 61(1):87–104
Article Google Scholar
Païs G, Lambert P, Beauchêne D, Deloule F, Ionescu B (2012) Animated movie genre detection using symbolic fusion of text and image descriptors. In: 2012 10th international workshop on content-based multimedia indexing (CBMI), IEEE, pp 1–6
Rasheed Z, Shah M (2002) Movie genre classification by exploiting audio-visual features of previews. In: Object recognition supported by user interaction for service robots, vol 2. IEEE, pp 1086–1089
Saumya S, Kumar J, Singh JP (2018) Genre fraction detection of a movie using text mining. In: Advanced Computing and Systems for Security, Springer, pp 167–177
ScikitLearn (Accessed: 2020) SelectKBest
Shon JH, Kim YG, Yim SJ (2012) Dissecting movie genres from an audience perspective: Mti movie classification method
Sirattanajakarin S, Thusaranon P (2019) Movie genre in multi-label classification using semantic extraction from only movie poster. In: Proceedings of the 2019 7th international conference on computer and communications management, ACM, pp 23–27
Ul Haq I, Ullah A, Muhammad K, Lee MY, Baik SW (2019) Personalized movie summarization using deep cnn-assisted facial expression recognition. Complexity 2019
Wang L, Qian X, Zhang Y, Shen J, Cao X (2019) Enhancing sketch-based image retrieval by cnn semantic re-ranking. IEEE Trans Cybern 50 (7):3330–3342
Article Google Scholar
Wehrmann J, Barros RC (2017) Movie genre classification: A multi-label approach based on convolutions through time. Appl Soft Comput 61:973–982
Article Google Scholar
Xu B, Fu Y, Jiang YG, Li B, Sigal L (2016) Heterogeneous knowledge transfer in video emotion recognition, attribution and summarization. IEEE Trans Affect Comput 9(2):255–270
Article Google Scholar
Zhou H, Hermans T, Karandikar AV, Rehg JM (2010) Movie genre classification via scene categorization. In: Proceedings of the 18th ACM international conference on Multimedia, ACM, pp 747–750

Download references

Author information

Authors and Affiliations

Department of Computer Science, Ramanujan College (University of Delhi), New Delhi, India
Nikhil Kumar Rajput & Bhavya Ahuja Grover

Corresponding author

Correspondence to Bhavya Ahuja Grover.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Availability of data and material

All data has been taken from yifysubtitles.com.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 1 Results for Thresh_Low = 100

Table 2 Results for Thresh_Low = 500

Table 3 Results for Thresh_Low = 500

Table 4 Results for Thresh_Low = 500

Table 5 Results for Thresh_Low = 1000

About this article

Cite this article

Rajput, N.K., Grover, B.A. A multi-label movie genre classification scheme based on the movie’s subtitles. Multimed Tools Appl 81, 32469–32490 (2022). https://doi.org/10.1007/s11042-022-12961-6

Received: 14 November 2020
Revised: 22 March 2021
Accepted: 13 March 2022
Published: 13 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s11042-022-12961-6
