Abstract
Previous studies have employed machine learning tools to classify films by success, aiming to reduce the uncertainty of film production. We revisit the literature to contribute to three relevant issues in classifying films by economic success. First, we explore the differences between results from the shortest and longest samples in time, to study possible changes in consumption patterns due mainly to technological change, and between total and wide-release films. Second, we use profits free of price inflation as the measure of economic success, instead of the usual nominal box office revenues. Third, we employ a smaller set of features, only those available at the time of production, to help producers maneuver contingencies, since little or nothing can be done once a film is in theaters. We follow the literature in choosing the classifiers - Random Forest, Support Vector Machine, and Neural Network - and design sub-datasets to model and compare the performance of our results. Our dataset includes all films with budgets disclosed on the Box Office Mojo website: 3167 movies released in theaters worldwide between 1980 and 2019. The Random Forest results outperform previous similar studies with different samplings in time, including results for a less usual larger sample, with the best data sample reaching about 97% in both accuracy and F1-score.
1 Introduction
The theatrical market has accounted for about 30% of films' total revenue since the 2000s [56] and reflects a film's economic success, due to the significant consumption of related goods during and after the release and to the film being consumed alongside complementary goods (TV, cable, and others). According to the Motion Picture Association of America, in 2019 ticket revenues alone in the US and Canada were around $11.4 billion, while 76% of their populations could be classified as moviegoers [54]. At the same time, the motion picture and television industries support more than 2.5 million jobs in the United States.
According to economic theory, film is both an information and an experience good. As an information good, it has a high fixed cost (actors, directors, editors, and others) and almost zero reproduction (marginal) cost [70]. As an experience good, its quality is not known until the time of consumption, which explains the uncertainty in its production [64]. These characteristics and recent technological changes make it difficult for an entrepreneur to know in advance whether a new film will be successful as an economic venture [3].
The rapid growth of the Internet and digitization, led by technological innovations in information and communication technologies (ICTs), has reduced production and distribution costs, creating a golden age for creative economic endeavors, such as information goods like music, movies, and books [71]. For example, today, a film can be consumed on any device with Internet access, such as mobile phones and tablets.
In addition, there are multiple substitute ways to consume a movie, since it can be watched at home or virtually anywhere, at any time soon after the theatrical release, or even simultaneously in some cases. Specifically, the competition faced by movie theaters has increased due to Internet downloads and online streaming platforms [35, 71]. The same ICT development that reduced film costs also incentivizes other markets to compete. Netflix, for example, uses consumer data and artificial intelligence to target consumption tastes and maximize its returns.
Given the effects of these new technologies and the high risk of film production [35, 48, 65], we employ a decision support system to produce guidelines for film producers and their stakeholders, such as studios, distributors, and their shareholders. A film is a risky endeavor: it is very expensive to produce - including expenses for actors, directors, and marketing, among others - and may not find enough viewers to pay for itself. In this sense, an application that indicates how producers can change decisions such as budget, distributor, and film duration can reduce the risk of being unprofitable. Such a tool can thus prevent heavy losses and improve productivity. To build this tool, we revisit the literature and focus on at least three main issues to contribute to and improve on the performance of previous studies regarding whether a film will have enough consumers to make it profitable.
First, we follow the economic literature and sample our data over short and long periods to deal with a potential change in the "regime" that models a film's profits. The model describing film success can change over time due to ICT evolution, which can be seen as an exogenous shock to the model's parameters [10]. In this sense, innovations in a sector could change the model that yields the best classification/prediction, and simply increasing the number of observations by using information from the distant past, as is usual in the case of films, will not necessarily improve accuracy. Using small samples (near a specific date) could therefore produce a more homogeneous sample, or one free of outliers. We also explore results using only wide-release film samples, since these films are more similar to one another (a wide-release movie differs greatly from a limited-release one in costs, consumers, and so on), thus creating a more homogeneous sample [22].
Second, we measure a film's success based on its profit deflated by the CPI (the US Consumer Price Index). Using profit as a success measure allows us to account for both the revenues and the costs of production, since even a film with a colossal box office may not be profitable if its production costs are also high. The literature mainly uses total revenues as the measure of economic success and does not control for the effects of inflation (at least, not explicitly). Not correcting for inflation may lead to inaccurate classification of success, since more recent films have higher profits and revenues in current values. Following the still scarce literature, we investigate two measures of success based on theatrical profits in two experiments. The first is a binary measure of film profit, where we consider box office revenues and production costs (budget) to account for the film's success; the second is a 6-class classification into profit ranges, which is closer to reality and more directly comparable with the literature.
Third, we evaluate whether economic success in the theatrical film market can be predicted by a small set of readily observable features available after the film's financial plan and green light, that is, before or at the time of film production and release [22]. The literature, on the other hand, tends not to take into account the timing at which features become available, employing features observed indistinctly before and after a film's release – such as critic reviews, consumer reviews, and the time a film is kept on screens – leaving no room to change features for better results before a film's release. Using variables available at the time of production gives producers a higher degree of freedom in timing to control investment decisions [76].
In connection with our first contribution, we apply uniquely configured datasets – shorter and longer samples in time, covering total and wide-release films – to the three most popular machine-learning (ML) algorithms: Random Forest (RF), Support Vector Machine (SVM), and Neural Network (NN). The results allow us to properly compare the performance of the methods and datasets with the existing literature. To our knowledge, the three issues addressed jointly, as we propose, offer an additional contribution to the literature.
Employing a dataset scraped from the Box Office Mojo and IMDB sites and features available before a film release, we get about 96% and 97% accuracy and F1-score, respectively, in binary break-even (BE) classification and about 90% of Average Percent Hit Rate (APHR) for profit ranges (PR). Our results indicate an improvement in accuracy compared with the literature. Moreover, the results are more compelling – considering the use of a stricter measure for film economic success (the profits in a constant/deflated dollar value) and a reduced set of features – and more reliable due to the several tests with different numbers of observations and cuts in time. Finally, the results suggest that our models produce a better performance than those in the literature to date, indicating that our small number of features were appropriately chosen and that RF may be a better tool for predictions of movie profitability.
Given the limited number of movies released per year, increasing the sample size implies widening the time window in years. This, however, risks ignoring shocks that change the conditions of film consumption from year to year.
Therefore, larger samples in time should be used with caution, as in the econometric literature, which may open a new agenda for future studies. The ML literature applied to film success suggests a trend: small datasets (few years), contrary to expectations, perform as well as or even better than larger datasets (more observations over longer periods) in classifying films by economic success, and our results support this conclusion. We attribute these results to a possible change in the "regime" that could drive economic performance in theaters. In this sense, technological innovations, changes in individuals' preferences, and other shocks, such as COVID-19, could cause these regime changes.
Following this introduction, Section 2 summarizes the literature; Section 3 presents our data and methodological strategy; Section 4 comprises our results and discussion, while the last section summarizes our main findings.
2 Literature on movie success
Due to the uncertain returns of films, many scholars have attempted to predict the economic success of a film at theaters aiming to guide producers, studios, distributors, and theater chains. Most of these studies are explanatory, investigating factors and their relations with movie box office performance through regression analysis, and have been published in different fields: Economy [11, 16, 25, 28, 37, 45, 58, 66], Business and Information [41, 52, 55], Marketing [14, 21, 44, 53] and Computer Science [2, 6, 19, 51, 69].
Recent ICT developments have reduced the costs of producing films and increased the number of films produced, resulting in more film data being available. These data and new computational methods have increased the number of studies predicting movies' success [71]. Most of these ML studies use features available along the whole movie lifecycle to predict a film's success. Since the greater part of the data becomes available only after a film's release, most studies rely on post-release data; in that case, however, there is no room to change film production decisions.
In this sense, the literature on predicting movie success usually employs post-release features like critic reviews, ratings, nominations, awards, other forms of word-of-mouth (WOM), and awareness information [18]. For example, studies employ social media microblogging to forecast box office revenues using ML in China [63] and the Korean market [34]. There are also similar studies using other methods of classification. For example, one study uses online user reviews applied to Support Vector Machine Regression (SVR) to predict box office revenues according to the genre [33]. Another involves text mining on Twitter to get insights on customer preferences to predict box office revenues with CART and NN regression [47], both in the US movie market. Some authors transform movie box office predictions into a classification problem [23, 38]; in particular, these authors also employ user opinion mining. For example, a study uses critic ratings and visual elements from movie posters, besides other movie metadata, to classify film success employing deep NN for 6-class box office prediction [78]. Another study uses data extracted from visual elements in trailers and text features from film abstracts, employing a NN to predict box office revenue [73]. Finally, another study explores daily box office patterns through the clustering approach and after-release features [72]. The literature also reports studies using alternative movie success measures, like critic reviews. For instance, some studies implement ML methods and social media to predict movie ratings [1, 5, 17].
Among studies exploring features before the film release, some use the “hype” generated online immediately before the film release through comments, search patterns, and other “buzz” around the movie. Even in this case, however, production and marketing expenditures are already made, leaving no time to reverse decisions. For example, studies utilize social media mentions as proxies for WOM to predict box office returns in the Korean market [39, 40, 43]. Another study mines popularity and purchase intentions from social media in China to predict box office [49]. Yet another uses Gradient Boosting Decision Tree and daily gross revenues to predict daily box office gross [75]. Finally, another study [32] employs ML binary classifiers and Tweet patterns for the US movie gross.
Still considering post-release features, a study that predicts economic success with profit classes instead of gross revenues develops Multilayer Backpropagation NN to predict movie profitability in a binary classification approach [60]. The authors include ratings from users and critics and the volume of reviews by film in their model for 375 movies released in the US and achieve an accuracy of 88.8%. Along the same line, [68] employs SVM and features after release to explore a film’s return on investment (ROI) as a 4-class problem – the data were obtained from 138 movies released in 2015 in the US market, and the result is about 56% accuracy.
A seminal ML study reduced the information set to variables observed before a film's release [65]. The authors employ a Multilayer Perceptron NN to solve a 9-class box office problem. Their set of features is composed of competition degree, genre, MPAA rating, star power, number of screens in the first-week release, and a binary feature for a sequel, for 834 movies released in the US market. The authors achieve a performance of 36.9% in APHR accuracy. A comparison study improves on [65]'s results with backpropagation, showing 68.1% APHR in a 6-class, 241-sized dataset [76]. In the same way, [24] also improves on [65]'s results using a Dynamic NN on a smaller dataset, achieving 74.4% Bingo APHR accuracy for the same 9-class box office gross problem. The authors also perform an additional test on an even smaller dataset (354 movies) and add marketing expenditures to the feature set, which resulted in 94.1% Bingo APHR accuracy with the same Dynamic NN.
More recently, other studies have been updating the methods and features for early box office prediction at earlier stages of the film lifecycle. For example, one applies pruned RF and different comparative ML classifiers to predict 8-class first-week box office using Chinese theater-level data and theaters’ revenues as the economic success measure [27]. A second study focuses on animated movie gross, with a 3-class NN and basic movie metadata [61]. A third work uses CART to predict 7-class box office revenue in the Chinese market [77]. A fourth study analyzes the differences between movie features while using RF regression, having the early box office prediction as the economic success measure [4]. Lastly, [3] develops an ensemble with several ML classifiers to predict box office revenues in nine classes.
Finally, very few studies explore profit as the success measure and features available before the film’s release or during its production simultaneously (Table 1 – bolded). Employing SVM and NN to predict profitability in five range classes, [57] uses budget, the number of screens, release month, MPAA, and star and director power in a 755-observation dataset to get 49.54% of Bingo APHR. The work most similar to ours, however, uses a 2506 sample size to predict who, what, and when a film could be profitable [42]. The authors explore cast relationships, movie abstracts, and release season to classify American movies according to their raw profit and ROI. The authors perform a few experiments, including binary for ROI and profit and 3-class for ROI. Their best result is 90.4% of accuracy for binary profit.
This study differs from the closest previous studies, summarized in Table 1, in three main aspects, addressed simultaneously. First, we account for the effects of ICT advances or other possible shocks in the recent period; we then design sub-datasets, as similar as possible to the datasets in those studies, to account for differences between the short and long run and compare performance. Second, profits were deflated and used as the measure of economic film success. Third, our feature sets are smaller and more intuitive than the ones used by those studies, and available at the time of film production (see the arguments in Section 3.2).
In addition, considering Table 1, it is notable that we employ a decision support system to classify and forecast film profits using RF, SVM, and NN. Using this set of tools differs from the literature and could also be viewed as a marginal contribution (see Section 4).
3 Data and methodological strategy
3.1 Data
Around 22% (3167) of the movie releases between 1980 and 2019 (14,510) available on the Box Office Mojo and IMDB sites – the most common data sources in this literature – have budget information. The scarcity of information on film costs is due to "industry trade secrets" [76]. The collected sample, however, is far larger than the average dataset size used in the literature, 361 observations/movies [39].
All monetary values used in this study were deflated by the 2019 CPI (CPI-2019) to control for inflation over the years; that is, all values are expressed in constant 2019 dollars. This procedure is not usual in this literature. Not correcting for inflation, however, can mislead decision makers and compromise results, since comparing revenues over time requires controlling for price inflation to avoid more recent films being wrongly classified as more profitable or higher-grossing. Figure 1 shows the evolution of revenues and budgets by year, between 1980 and 2019, both corrected for inflation.
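The deflation step described above can be sketched as follows. The CPI values below are illustrative placeholders, not the official BLS series, and the helper name is our own:

```python
# Sketch of the deflation step, assuming a {year: CPI} lookup table.
# The CPI values here are illustrative placeholders, not official figures.
CPI = {1980: 82.4, 2000: 172.2, 2019: 255.7}

def to_2019_dollars(nominal, year, cpi=CPI, base_year=2019):
    """Convert a nominal dollar amount from `year` to constant 2019 dollars."""
    return nominal * cpi[base_year] / cpi[year]

# A $10M budget in 1980 corresponds to roughly $31M in 2019 dollars.
budget_2019 = to_2019_dollars(10_000_000, 1980)
```

Applying the same base year to both budgets and grosses keeps the profit measure (gross minus budget) comparable across the four decades of the sample.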
We collected budgets and worldwide gross revenues to create the profit measures, which means we are considering the box office revenues in all countries where the film premiered. According to Box Office Mojo, all information received from countries is reported. Then, we follow the scarce literature that uses profit measures to create success classes for binary classes [42, 73, 78] and multiclasses [60].
3.2 Methods
Figure 2 presents the general workflow of the methodology described in the following sections.
3.2.1 Variable selection
Unlike most previous studies, we use a reduced set of features easily observable during film production to classify film success. Furthermore, we limit the features to those available before the film's release, particularly at the production stage. Thus, differently from previous studies, we can offer a policy guide for producers and stakeholders that allows changes while a movie is still in production. Additionally, the variables were chosen carefully, based on the literature, to bring the most meaningful features for an optimal classification given the curse of dimensionality. Table 2 summarizes the features and their preprocessing steps based on the literature.
Compared to [42], we employ more straightforward, less costly, and directly observable features or at least the ones that industry agents have to bet on. For instance, during the production process of a film, it is possible to know the planned runtime, the season to be released, the distributor, and the genres. Thus, if the proposed tool’s prediction is faulty, there is time to change characteristics to increase the chances of success. Among the studies, [42] is the most similar to ours regarding using binary profit, but not deflated, as a success measure and variables before a film’s release (see Table 1).
3.2.2 Predicting methods
We choose the three most popular ML classifier algorithms in the film literature –SVM, Multilayer Perceptron Neural Network (MLP-NN), and RF – to conduct our experiments.
SVM is a supervised classifier based on the statistical framework proposed by Vapnik and Chervonenkis (VC Theory). It aims to find the best hyperplane to maximize the separation between data points; it can perform linear and nonlinear classification by applying kernel tricks. For further information and the math behind it, please refer to [7, 12].
The MLP-NN is also a supervised classifier that approximates functions that lead the entered data to the output class by adjusting weights between layers (forwards and backwards). For further information about MLP, see [30].
The RF classifier [8] is an ensemble of decision trees and performs very well across different tasks [26, 36, 59], in particular with heterogeneous data including continuous and discrete variables, such as the binary/dummy features we employ. Besides being versatile in binary and multiclass classification, RF is simple to build, train, and tune, and the method is robust and less sensitive to noise [29]. Additionally, RF can outperform other non-ensemble methods [62]. Finally, since most previous movie prediction studies focused on NN, the RF method remains little explored (Table 1); we can thus consider its use a marginal contribution to the domain. In addition, RF is well suited to predicting movie financial success thanks to its capacity to handle mixed data – dummy/binary and continuous/discrete features (Fig. 4).
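A minimal sketch of the three classifiers is given below, using scikit-learn (an assumption, since the paper does not name its software stack). The hyperparameter values are placeholders; the tuned values found by grid search are reported in the paper's appendix:

```python
# Hedged sketch of the three compared classifiers; hyperparameters here
# are illustrative defaults, not the paper's tuned values.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

models = {
    "RF": RandomForestClassifier(n_estimators=500, random_state=0),
    "SVM": SVC(kernel="rbf", C=1.0),
    "MLP-NN": MLPClassifier(hidden_layer_sizes=(64, 32),
                            max_iter=500, random_state=0),
}
```

Keeping the three models behind a common dict makes it straightforward to run identical cross-validation loops over each classifier and dataset slice.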
In our samples, RF is less sensitive to noise (giant blockbusters or flops) and is explainable, allowing us to assess feature importance in the models and evaluate whether samples from different time ranges matter for predicting success; it thus works as an indirect measure of shock effects. We borrowed this idea from the economic literature on time series [10, 50], which states that the process generating a model – in this case, film profits – can change regime over time due to shocks. To implement and test this, we created datasets with different time windows and a full dataset including year dummies to test the Gini importance of the years in RF (Fig. 7). RF's lower sensitivity to noise also makes it suitable for comparing total and wide-release film sets when predicting success.
3.2.3 Experiments
Following the literature, the prediction problem was transformed into a classification problem, aiming to classify the movie into its profit success or failure based on its worldwide gross revenues and budget. Two different class arrangements were designed: Break-even (BE) and Profit Ranges (PR).
Break-Even (BE): Similar to [42, 60], the output is binary: 1 when the film's profit is zero or positive – the worldwide gross is equal to or greater than its budget – and 0 otherwise. In this sense, a movie only has to collect (in box office gross) the exact amount spent on production (the announced budget).
Profit Ranges (PR): To get results closer to actual profit values and comparable with previous literature, we created a 6-class problem considering the total amount of profits of a given movie following [73, 78].
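The two labeling schemes can be sketched as below. The PR thresholds are hypothetical placeholders, since the actual class boundaries are given in the paper's Table 3:

```python
# Sketch of the BE and PR labeling rules; the PR thresholds below are
# illustrative placeholders (the real boundaries appear in Table 3).
import bisect

def break_even_label(worldwide_gross, budget):
    """BE: 1 if the film at least recovers its (deflated) budget, else 0."""
    return 1 if worldwide_gross >= budget else 0

PR_THRESHOLDS = [0, 10e6, 40e6, 100e6, 200e6]  # 5 cut points -> 6 classes

def profit_range_class(profit, thresholds=PR_THRESHOLDS):
    """PR: map profit (gross minus budget) to one of six ordered classes 0..5."""
    return bisect.bisect_right(thresholds, profit)
```

Note that a profit of exactly zero lands on the success side in both schemes, consistent with the BE definition of "zero or positive" profit.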
3.2.4 Sets
Although the results for the full dataset (1980–2019) were good (see Section 4), we noticed that the literature uses much smaller datasets. Therefore, we also analyze different slices of the dataset to explore possibly heterogeneous results between the smallest and largest samples in time – which could capture changes in consumer behavior due, for example, to technical change – and between and within datasets. We also explore wide-release film subsets, since they are more homogeneous in box office revenues. Thus, we created 12 subsets of data, considering the years of film release and wide versus total releases. Tables 3 and 4 present the thresholds for classification and the rules used to separate the data into these subsets.
Figure 3 shows the class distributions of the full sample (A) over the years for BE (panel a) and PR (panel b), while Fig. 4 presents similar class distributions for the wide-release movies (B) over the years.
It is necessary to address the imbalanced class problem observed in Fig. 3a (862 unsuccessful vs. 2305 successful films for dataset A) when classifying films by BE profitability, to avoid results biased toward the success/positive class. This imbalance in our sample is mainly due to budget information being disclosed, in general, only by big studios. We use SMOTE to oversample the minority (negative) class. SMOTE creates synthetic new observations rather than duplicating existing ones: each synthetic record is generated by selecting a minority-class record and altering its features by a random amount within the difference to a neighboring minority record [13]; note that the synthetic instances are used only in the training folds. We thus balanced all BE experiment datasets, and in our tests SMOTE proved better than class-weight and near-miss methods.
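The interpolation at the core of SMOTE can be illustrated with a toy sketch. Real SMOTE selects among the k nearest minority neighbors; here, as a simplification, the neighbor is just another randomly chosen minority sample:

```python
# Toy sketch of SMOTE's core idea: place a synthetic point on the segment
# between a minority sample and a neighboring minority sample.
# (Real SMOTE uses k-nearest neighbors; here the "neighbor" is simply
# another random minority sample, for illustration only.)
import random

def smote_like(minority, n_new, rng=random.Random(0)):
    """Generate n_new synthetic points by interpolating minority pairs."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)      # pick a sample and a "neighbor"
        gap = rng.random()                  # random position along the segment
        synthetic.append(tuple(ai + gap * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic
```

Because every synthetic point lies between two real minority points, the oversampled class stays inside its original region of feature space instead of merely repeating observations.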
To obtain the best hyperparameter set, we use Grid Search to optimize all experiments, models, and sets. We start with wide ranges and different hyperparameter configurations and refine them to reach the best scenario. The best hyperparameter sets are given in the footnotes following the results.
For both experimental setups (BE and PR) and all datasets (A to L), we use 10-fold cross-validation. This validation method allows a decrease in the train dependency and creates a more fairly comparable method [67]. Therefore, the results are presented based on the average of these 10 executions.
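Assuming a scikit-learn workflow, the tuning and validation setup described above could look like the sketch below. The grid is illustrative and far smaller than the ranges actually searched, and the synthetic data stands in for the movie features:

```python
# Hedged sketch of grid search scored by 10-fold stratified CV.
# The grid and data are illustrative, not the paper's actual setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=200, random_state=0)  # stand-in data
grid = {"n_estimators": [100, 300], "max_depth": [None, 10]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    grid,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
    scoring="f1",
)
search.fit(X, y)  # best_params_ / best_score_ hold the tuned configuration
```

Stratified folds preserve the class proportions in each split, which matters here given the success/failure imbalance noted above.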
Finally, to evaluate, present, and discuss the results properly, we use accuracy (Eq. 1) and F1-score (Eq. 2) metrics for both the binary (BE) and 6-class (PR) experiments. In addition, APHR is used for the multiclass sets (PR), following the most common literature approaches. APHR (Eq. 3) is the ratio of correct classifications to the number of samples, computed per class and averaged over all classes in the classification problem – akin to per-class precision in multiclass problems.
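A sketch of the metrics follows, with APHR implemented as the per-class hit rate averaged over classes – one plausible reading of Eq. 3 (the paper's exact formula may differ):

```python
# Sketch of the evaluation metrics; aphr_bingo is one plausible reading
# of the paper's Eq. 3 (per-class hit rate averaged over classes).
def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def aphr_bingo(y_true, y_pred):
    """Average Percent Hit Rate: for each class, the share of its samples
    classified correctly; the rates are then averaged over all classes."""
    rates = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(y_pred[i] == c for i in idx)
        rates.append(hits / len(idx))
    return sum(rates) / len(rates)
```

Unlike plain accuracy, the per-class averaging in APHR weights rare profit classes equally with frequent ones, which is why the two metrics can diverge on imbalanced multiclass sets.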
4 Results and discussion
Table 5 presents the BE results under all datasets (A to L) for the three ML methods: RF, SVM, and NN.
The results show a good performance of the model in predicting whether a movie will pay its production costs compared with the literature. The best accuracy result for BE is 96.7% in the B dataset (wide release only) and datasets G, H, I, and L with 95%, 93.3%, 92.1%, and 94.2%, respectively – all with RF. For more details of parameters, see Table 10 in the Appendix. The referenced datasets also have an F1-score above 95%. Except for set J, RF performed better than MLP and SVM. Figure 5 presents the performance of the three classifiers along with their confidence intervals for the best result sets; the confidence intervals reinforce the superiority of RF for the cases presented.
Most studies classify film success employing their revenues as the main measure of success; thus, regarding studies that use revenue net of costs, our best binary experiment result, 96.77%, outperforms the literature with significant margins, 88% in [60], and 90.4% in [42].
For the multiclass experiment, PR, the best average accuracy is from dataset I, with roughly 50% accuracy with RF, followed by sets A, C, D, H and L – all with about 46% accuracy. As shown in Table 6, RF has the best performance for all datasets. The APHR results for PR-I are presented in Table 7.
As Table 7 shows, we obtained an 89.8% APHR-Bingo average, better than the 56% APHR from [68] and the 49.5% APHR from [57]. Broadly comparing these results with the literature that classifies using pre-release information – bearing in mind that their measure of success is raw revenues while we use deflated profits – our models also predict better than the 54.4% from [78], 36.9% from [65], and 68.1% from [76]. Considering that these authors use some NN architecture as the predictor in a multiclass problem, we conclude that RF performs better in supporting movie stakeholders' decisions. Table 8 provides a clearer comparison between our results and the literature.
Overall, the four best BE results (B, H, I, and L – Table 5) show excellent scores in predicting theatrical profitability, with metrics exceeding those in the literature. In addition, the results suggest that profits can be a more adequate measure of a film's success because they account for the tradeoff between revenues and costs. Moreover, since the exclusive use of features available before release or during production significantly reduces the number of features available to the classifiers, the results are all the more notable given that we do not use information such as critic reviews, user reviews, and WOM data.
These best datasets – H, I, and L – include only wide-release films and brief periods after 2000, which explains their similarities (see Table 3). They perform better than dataset F, which contains all wide-release movies after 1999. The difference may shed light on how the time window is sliced and, consequently, on the sample size: smaller and more recent datasets performed better. Another way to read these findings is through the homogeneity underlying the data slices: set B covers all wide-release movies, with no time slice, and yielded the best performance, and the same occurs for set G, which has no outliers (removed via Isolation Forest), reinforcing the importance of homogeneity for the predictions.
These results suggest the model generating the data might have changed due to structural breaks [10]. Shocks – like technological innovations, changes in consumer preferences, political and economic interventions, and natural shocks like COVID-19 – cause structural breaks. To evaluate this possibility, we explore the feature importance generated by RF, via the Gini Index, for BE experiments on datasets of different sample sizes, checking whether relative feature importance changes, since such changes indicate a different model. Using datasets distant in time – K (wide releases between 2000 and 2004, 567 observations) and L (wide releases between 2010 and 2014, 536 observations) – we extract the Gini Feature Importance for each case, as Fig. 6 shows.
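Extracting Gini feature importance from a fitted RF is straightforward in scikit-learn (assumed here); the data and feature names below are synthetic stand-ins for the paper's features:

```python
# Sketch of extracting and ranking RF Gini feature importances, as used
# to compare the K and L subsets; data and feature names are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
names = ["budget", "runtime", "n_theaters", "n_markets", "is_sequel"]

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# feature_importances_ are normalized Gini importances summing to 1.
ranked = sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1])
```

Running this separately on two period-specific subsets and comparing the resulting rankings is the mechanism behind the regime-change diagnostic described above.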
Comparing the features in the K and L datasets, it is possible to notice a clear change in the relative importance of budget, runtime, the crime and adventure genres, the number of markets, and other features. This change in theatrical consumption may result from technological innovations, such as the alternative ways of consuming a movie brought by streaming video, or from the availability of other new goods, like games, leading to a change in consumption behavior. To check the robustness of these changes and better comprehend a possible shift in the "regime" governing the data generation, we included another BE experiment that adds year dummy variables (from 1981 to 2019) as features to dataset A and performed RF classification. We included all years because we use worldwide revenues spanning many countries, making it difficult to pin a shock to a specific year. If the year dummies are relevant determinants of film success, this is evidence of a regime change, since time itself should not otherwise affect the classification. The relative importance scores of the top 20 features are shown in Fig. 7.
Additionally, by exploring different data samplings and the importance of features in each dataset, we find that the number of theaters, budget, runtime, and the number of market releases are the main features explaining a movie’s economic success. Note, however, that the number of theaters may bias the results toward wide-opening films, since these movies disclose budget information more often. Conversely, the two least important features are the MPAA ratings NC-17 and G, likely because of their low representation in the data. Apart from these last two, our models classified very well using a few variables that are easily observable or available in a movie’s planned production/pre-production period.
5 Concluding remarks
Uncertainty in new film production is high, with failure rates ranging between 25 and 45% [46]. A large portion of movies are therefore unprofitable, and large budgets and impressive star power do not guarantee profit [15]. We thus evaluate three classifiers to predict the economic success, measured by profits free of price inflation, of films released at theaters, using a few simple pieces of information observable at the production stage. We measure economic success as the movie’s revenue over its costs (profits) in two approaches: binary classification (BE) and 6-class classification (PR). For binary classification, we use SMOTE to address class imbalance.
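The SMOTE step can be illustrated with a minimal NumPy re-implementation of the interpolation idea from Chawla et al. (2002); in practice one would use a library such as imbalanced-learn, and this simplified sketch omits details of the original algorithm (e.g. how synthetic samples are apportioned across minority points):

```python
import numpy as np

def smote_minority(X_min, n_new, k=5, rng=None):
    """Simplified SMOTE: synthesize n_new minority samples by
    interpolating each chosen point toward one of its k nearest
    minority-class neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # distances from X_min[i] to every other minority point
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the point itself
        j = rng.choice(neighbours)
        gap = rng.random()                    # position on the segment
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 3))   # 20 minority-class films, 3 features
X_new = smote_minority(X_min, n_new=30, rng=rng)
print(X_new.shape)  # (30, 3)
```

The synthetic points lie on segments between existing minority samples, so the oversampled class stays inside its original feature range.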
Forecasting film profitability based only on the early stages of film production is a complex task, mainly because it excludes several relevant determinants of film quality and economic performance that become available only at or after release. Nevertheless, our results outperform previous studies, mainly with RF and small datasets (accuracy of 96% and F1-score of 97% for binary classification and about 50% APHR for 6-class).
In addition, the analysis of feature importance suggests that the movie market model changes over time. The theoretical literature in ML and statistics [74] indicates that more data (more instances/information) improves performance. Our findings, in line with the applied literature on film success, show that restricting the data to brief periods preserves similar patterns over time and thus yields better learning. We therefore argue that shocks such as technological innovations, which change supply and demand behavior, and other shocks can alter the model regime for classifying film success.
Our study therefore contributes to both the productive sector and related academic studies. It can guide studios, producers, and other stakeholders toward better investments and decisions while there is still room to change plans. They can count on the low cost of obtaining the prediction inputs (directly observable features), excellent predictive accuracy, and enough time to change movie plans if a poor outcome is predicted.
Regarding the contribution to the literature, we envisage five novelties that can be summarized in three main issues. First, we use deflated profits as the measure of film success instead of the non-deflated revenues used in most of the literature, which lets us balance the trade-off between film revenues and costs. The few studies that do employ profits as a success measure do not deflate them, which can mislead the classification into treating the most recent films as the most profitable. Second, to predict success, the proposed tool uses a small number of simple features that require no pre-processing and are directly observable. These features are available mainly at production time; thus, when bad results are predicted, there is room to change the production course and increase the film’s chances of success. Third, it calls attention to potential changes over time in the regime that describes the model, due to shocks such as technological innovations. Considering all the previous items, the sample cuts designed for comparison with the literature, the use of RF, and the higher scores obtained, we believe we have contributed to the literature.
Regarding the potential “regime” changes, more investigation is needed. Structural breaks should be analyzed through specific statistical tests – future work to be explored – to develop exogenous tests that can safeguard the future predictability of the film market and other time-dependent social domains. Another line of investigation is to exploit the differences between more homogeneous and more heterogeneous samples for predicting film success. For instance, eliminating outliers makes a sample more homogeneous and improves binary predictions; likewise, restricting a film sample to a shorter period makes the films more homogeneous and improves success prediction, as we and other authors have found. Using samples with only wide-released films, a more homogeneous dataset, also yielded better predictions in our results. Finally, future work might improve feature selection, removing minimally informative features and adding others such as sequel status and/or star power, in agreement with the literature, and experiment with different computational models to estimate missing budget data and enlarge the dataset.
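As one concrete candidate for such a structural-break test, a Chow test for a single known break point could be applied to a regression of (deflated) profits on the observable features. The sketch below uses synthetic data with a deliberate slope shift between two "periods"; it illustrates only the test statistic, not an analysis of the paper's data:

```python
import numpy as np

def chow_stat(X1, y1, X2, y2):
    """Chow F statistic for one known break point: compares the pooled
    OLS fit against separate fits on the two sub-periods."""
    def ssr(X, y):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid
    k = X1.shape[1]
    s_pool = ssr(np.vstack([X1, X2]), np.concatenate([y1, y2]))
    s1, s2 = ssr(X1, y1), ssr(X2, y2)
    df2 = len(y1) + len(y2) - 2 * k
    return ((s_pool - s1 - s2) / k) / ((s1 + s2) / df2)

rng = np.random.default_rng(0)
X1 = np.column_stack([np.ones(100), rng.normal(size=100)])
X2 = np.column_stack([np.ones(100), rng.normal(size=100)])
y1 = X1 @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=100)
y2 = X2 @ np.array([1.0, 5.0]) + 0.1 * rng.normal(size=100)  # slope shift

print(chow_stat(X1, y1, X2, y2) > 10)  # a large F flags a break
```

Under the null of no break the statistic follows an F(k, n1+n2-2k) distribution; the large value here reflects the planted shift in the slope coefficient.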
Notes
According to the Motion Picture Association, in 2019 the film and television industry supported 2.5 million jobs, paid out $188 billion in total wages, and comprised over 93,000 businesses in the US alone: https://www.motionpictures.org/what-we-do/driving-economic-growth/ (Last accessed: 09/10/2020)
Netflix does not disclose its rentals or all the techniques behind its recommendation system, but these articles give an overview: https://insidebigdata.com/2018/01/20/netflix-uses-big-data-drive-success/ (Last accessed: 09/08/2021), https://www.theatlantic.com/technology/archive/2014/01/how-netflix-reverse-engineered-hollywood/282679/ (Last accessed: 10/08/2021) and https://netflixtechblog.com/supporting-content-decision-makers-with-machine-learning-995b7b76006f (Last accessed: 16/08/2021)
Over the last two decades, movies with disclosed budgets had better revenue performances in the movie market than movies without disclosed budgets. This may indicate a bias in the total population towards wide-released movies – a characteristic considered in the dataset slices described in Table 3.
According to [83] and the Box Office Mojo website, wide-released movies are those that open on 600 or more screens.
References
Abidi SMR, Xu Y, Ni J, Wang X, Zhang W (2020) Popularity prediction of movies: from statistical modeling to machine learning techniques. Multimed Tools Appl 79:35583–35617. https://doi.org/10.1007/s11042-019-08546-5
Ahmad J, Duraisamy P, Yousef A, Buckles B (2017) Movie success prediction using data mining. In: 8th International Conference on computing, communications and networking technologies, ICCCNT 2017
Ahmed U, Waqas H, Afzal MT (2020) Pre-production box-office success quotient forecasting. Soft Comput 24:6635–6653. https://doi.org/10.1007/s00500-019-04303-w
Antipov EA, Pokryshevskaya EB (2017) Are box office revenues equally unpredictable for all movies? Evidence from a Random forest-based model. J Revenue Pricing Manag 16:295–307. https://doi.org/10.1057/s41272-016-0072-y
Basu S (2019) Movie rating prediction system based on opinion mining and artificial neural networks. In: Advances in Intelligent Systems and Computing
Bhattacharjee B, Sridhar A, Dutta A (2017) Identifying the causal relationship between social media content of a Bollywood movie and its box-office success - a text mining approach. Int J Bus Inf Syst 24:344. https://doi.org/10.1504/IJBIS.2017.082039
Boser BE, Guyon IM, Vapnik VN (1992) Training algorithm for optimal margin classifiers. In: Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory
Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
Brewer SM, Kelley JM, Jozefowicz JJ (2009) A blueprint for success in the US film industry. Appl Econ 41:589–606. https://doi.org/10.1080/00036840601007351
Casini A, Perron P (2019) Structural breaks in time series. In: Oxford Research Encyclopedia of Economics and Finance
Chang BH, Ki EJ (2005) Devising a practical model for predicting theatrical movie success: focusing on the experience good property. J Media Econ 18:247–269. https://doi.org/10.1207/s15327736me1804_2
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:1–27. https://doi.org/10.1145/1961189.1961199
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: Synthetic minority over-sampling technique. J Artif Intell Res 16:321–357. https://doi.org/10.1613/jair.953
Chintagunta PK, Gopinath S, Venkataraman S (2010) The effects of online user reviews on movie box office performance: accounting for sequential rollout and aggregation across local markets. Mark Sci 29:944–957
De Vany A, Walls WD (1999) Uncertainty in the movie industry: Does star power reduce the terror of the box office? J Cult Econ 23:285–318. https://doi.org/10.1023/a:1007608125988
Derrick FW, Williams NA, Scott CE (2014) A two-stage proxy variable approach to estimating movie box office receipts. J Cult Econ 38:173–189. https://doi.org/10.1007/s10824-012-9198-y
Dhir R, Raj A (2018) Movie success prediction using machine learning algorithms and their comparison. In: ICSCCC 2018 - 1st International Conference on Secure Cyber Computing and Communication, pp 385–390. https://doi.org/10.1109/ICSCCC.2018.8703320
Du J, Xu H, Huang X (2014) Box office prediction based on microblog. Expert Syst Appl 41:1680–1689. https://doi.org/10.1016/j.eswa.2013.08.065
Duan J, Ding X, Liu T (2015) A Gaussian copula regression model for movie box-office revenue prediction with social media. In: Communications in Computer and Information Science
Einav L (2007) Seasonality in the U.S. motion picture industry. RAND J Econ. https://doi.org/10.1111/j.1756-2171.2007.tb00048.x
Eliashberg J, Shugan SM (1997) Film critics: influencers or predictors? J Mark 61:68–78. https://doi.org/10.2307/1251831
Eliashberg J, Elberse A, Leenders MA (2006) The motion picture industry: Critical issues in practice, current research, and new research directions. Mark Sci 25:638–661. https://doi.org/10.1287/mksc.1050.0177
Gaikar DD, Marakarkandy B, Dasgupta C (2015) Using twitter data to predict the performance of bollywood movies. Ind Manag Data Syst 115:1604–1621. https://doi.org/10.1108/IMDS-04-2015-0145
Ghiassi M, Lio D, Moon B (2015) Pre-production forecasting of movie revenues with a dynamic artificial neural network. Expert Syst Appl 42:3176–3193. https://doi.org/10.1016/j.eswa.2014.11.022
Ginsburgh V (2003) Awards, success and aesthetic quality in the arts. In: Journal of Economic Perspectives
Gislason PO, Benediktsson JA, Sveinsson JR (2006) Random forests for land cover classification. In: Pattern Recognition Letters
Guo Z, Zhang X, Hou Y (2015) Predicting box office receipts of movies with pruned random forest. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Hadida AL (2010) Commercial success and artistic recognition of motion picture projects. J Cult Econ 34:45–80. https://doi.org/10.1007/s10824-009-9109-z
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning, 2nd edn. Springer, New York
Hecht-Nielsen R (1992) Theory of the backpropagation neural network. In: Neural Networks for Perception
Honthaner EL (2013) The complete film production handbook
Hossein N, Miller DW (2018) Predicting motion picture box office performance using temporal tweet patterns. Int J Intell Comput Cybern 11:64–80. https://doi.org/10.1108/IJICC-04-2017-0033
Hu YH, Shiau WM, Shih SP, Chen CJ (2018) Considering online consumer reviews to predict movie box-office performance between the years 2009 and 2014 in the US. Electron Libr 36:1010–1026. https://doi.org/10.1108/EL-02-2018-0040
Hur M, Kang P, Cho S (2016) Box-office forecasting based on sentiments of movie reviews and independent subspace method. Inf Sci (Ny) 372:608–624. https://doi.org/10.1016/j.ins.2016.08.027
Husak W (2004) Economic and other considerations for digital cinema. Signal Process Image Commun 19:921–936. https://doi.org/10.1016/j.image.2004.06.006
Khalilia M, Chakraborty S, Popescu M (2011) Predicting disease risks from highly imbalanced data using random forest. BMC Med Inform Decis Mak 11. https://doi.org/10.1186/1472-6947-11-51
Kim SH, Park N, Park SH (2013) Exploring the effects of online word of mouth and expert reviews on theatrical movies’ box office success. J Media Econ 26:98–114. https://doi.org/10.1080/08997764.2013.785551
Kim D, Kim D, Hwang E, Choi HG (2013) A user opinion and metadata mining scheme for predicting box office performance of movies in the social network environment. New Rev Hypermedia Multimed 19:259–272. https://doi.org/10.1080/13614568.2013.835450
Kim T, Hong J, Kang P (2015) Box office forecasting using machine learning algorithms based on SNS data. Int J Forecast 31:364–390. https://doi.org/10.1016/j.ijforecast.2014.05.006
Kim T, Hong J, Kang P (2017) Box office forecasting considering competitive environment and word-of-mouth in social networks: a case study of Korean film market. Comput Intell Neurosci 2017:1–16. https://doi.org/10.1155/2017/4315419
Krauss J, Nann S, Simon D, et al (2008) Predicting movie success and academy awards through sentiment and social network analysis. In: 16th European Conference on information systems, ECIS 2008
Lash MT, Zhao K (2016) Early predictions of movie success: the who, what, and when of profitability. J Manag Inf Syst 33:874–903. https://doi.org/10.1080/07421222.2016.1243969
Lee K, Park J, Kim I, Choi Y (2018) Predicting movie success with machine learning techniques: Ways to improve accuracy. Inf Syst Front 20:577–588. https://doi.org/10.1007/s10796-016-9689-z
Legoux R, Larocque D, Laporte S, Belmati S, Boquet T (2016) The effect of critical reviews on exhibitors’ decisions: do reviews affect the survival of a movie on screen? Int J Res Mark 33:357–374. https://doi.org/10.1016/j.ijresmar.2015.07.003
Lehrer S, Xie T (2017) Box office buzz: Does social media data steal the show from model uncertainty when forecasting for Hollywood? Rev Econ Stat 99:749–755. https://doi.org/10.1162/REST_a_00671
Leung TC, Qi S, Yuan J (2020) Movie industry demand and theater availability. Rev Ind Organ. https://doi.org/10.1007/s11151-019-09706-5
Lipizzi C, Iandoli L, Marquez JER (2016) Combining structure, content and meaning in online social networks: the analysis of public’s early reaction in social media to newly launched movies. Technol Forecast Soc Change 109:35–49. https://doi.org/10.1016/j.techfore.2016.05.013
Litman BR (1983) Predicting success of theatrical movies: An empirical study. J Pop Cult 16:159–175. https://doi.org/10.1111/j.0022-3840.1983.1604_159.x
Liu T, Ding X, Chen Y, Chen H, Guo M (2016) Predicting movie box-office revenues by exploiting large-scale social media content. Multimed Tools Appl 75:1509–1528. https://doi.org/10.1007/s11042-014-2270-1
Lucas RE (1976) Econometric policy evaluation: A critique. Carnegie-Rochester Confer Ser Public Policy 1:19–46. https://doi.org/10.1016/S0167-2231(76)80003-6
Mestyán M, Yasseri T, Kertész J (2013) Early prediction of movie box office success based on Wikipedia activity big data. PLoS One 8:e71226. https://doi.org/10.1371/journal.pone.0071226
Mohanty S, Clements N, Gupta V (2018) Investigating the effect of eWOM in movie box office success through an aspect-based approach. Int J Bus Anal 5:1–15. https://doi.org/10.4018/IJBAN.2018010101
Moon S, Bergey PK, Iacobucci D (2010) Dynamic effects among movie ratings, movie revenues and viewer satisfaction. J Mark 74:108–121. https://doi.org/10.1509/jmkg.74.1.108
MPA - Motion Picture Association (2019) 2019 THEME Report - Motion Picture Association
Oh C, Roumani Y, Nwankpa JK, Hu HF (2017) Beyond likes and tweets: Consumer engagement behavior and movie box office in social media. Inf Manag 54:25–37. https://doi.org/10.1016/j.im.2016.03.004
Pokorny M, Sedgwick J (2010) Profitability trends in Hollywood, 1929 to 1999: somebody must know something. Econ Hist Rev 63:56–84. https://doi.org/10.1111/j.1468-0289.2009.00488.x
Quader N, Gani MO, Chaki D, Ali MH (2017) A machine learning approach to predict movie box-office success. In: 2017 20TH INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT)
Reinstein DA, Snyder CM (2005) The influence of expert reviews on consumer demand for experience goods: a case study of movie critics. J Ind Econ 53:27–51. https://doi.org/10.1111/j.0022-1821.2005.00244.x
Ren Y, Zhang L, Suganthan PN (2016) Ensemble classification and regression-recent developments, applications and future directions [review article]. IEEE Comput Intell Mag 11:41–53
Rhee TG, Zulkernine F (2016) Predicting movie box office profitability: a neural network approach. In: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp 665–670. https://doi.org/10.1109/icmla.2016.0117
Riwinoto MT, Zega SA, Irlanda G (2015) Predicting animated film box-office success with neural networks. J Teknol. https://doi.org/10.11113/jt.v77.6693
Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33:1–39. https://doi.org/10.1007/s10462-009-9124-7
Ru Y, Li B, Liu J, Chai J (2018) An effective daily box office prediction model based on deep neural networks. Cogn Syst Res 52:182–191. https://doi.org/10.1016/j.cogsys.2018.06.018
Shapiro C, Varian HR (1999) Information rules: a strategic guide to the network economy. Harvard Business School Press, Boston
Sharda R, Delen D (2006) Predicting box-office success of motion pictures with neural networks. Expert Syst Appl 30:243–254. https://doi.org/10.1016/j.eswa.2005.07.018
Souza TLD, Nishijima M, Fava ACP (2019) Do consumer and expert reviews affect the length of time a film is kept on screens in the USA? J Cult Econ 43:145–171. https://doi.org/10.1007/s10824-018-9332-6
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B 36:111–147. https://doi.org/10.1111/j.2517-6161.1974.tb00994.x
Subramaniyaswamy V, Vaibhav MV, Prasad RV, Logesh R (2018) Predicting movie box office success using multiple regression and SVM. Proc Int Conf Intell Sustain Syst ICISS 2017:182–186. https://doi.org/10.1109/ISS1.2017.8389394
Tadimari A, Kumar N, Guha T, Narayanan SS (2016) Opening big in box office? Trailer content can help. ICASSP, IEEE Int Conf Acoust Speech Signal Process - Proc 2016-May:2777–2781. https://doi.org/10.1109/ICASSP.2016.7472183
Varian HR (2000) Buying, sharing and renting information goods. J Ind Econ 48:473–488. https://doi.org/10.1111/1467-6451.00133
Waldfogel J (2017) How digitization has created a golden age of music, movies, books, and television. J Econ Perspect 31:195–214
Wang Y, Ru Y, Chai J (2019) Time series clustering based on sparse subspace clustering algorithm and its application to daily box-office data analysis. Neural Comput & Applic 31:4809–4818. https://doi.org/10.1007/s00521-018-3731-7
Wang Z, Zhang J, Ji S, Meng C, Li T, Zheng Y (2020) Predicting and ranking box office revenue of movies based on big data. Inf Fusion 60:25–40. https://doi.org/10.1016/j.inffus.2020.02.002
Wooldridge JM (2002) Econometric analysis of cross section and panel data. MIT Press, Cambridge, MA
Wu S, Zheng Y, Lai Z, et al (2019) Movie box office prediction based on ensemble learning. In: ISPCE-CN 2019 - IEEE International Symposium on Product Compliance Engineering, pp 1–4. https://doi.org/10.1109/ISPCE-CN48734.2019.8958631
Zhang L, Luo J, Yang S (2009) Forecasting box office revenue of movies with BP neural network. Expert Syst Appl 36:6580–6587. https://doi.org/10.1016/j.eswa.2008.07.064
Zhang Z, Chai J, Li B, et al (2016) Movie box office interval forecasting based on CART. In: Proceedings - 2015 8th International Symposium on Computational Intelligence and Design, ISCID 2015
Zhou Y, Zhang L, Yi Z (2019) Predicting movie box-office revenues using deep neural networks. Neural Comput & Applic 31:1855–1865. https://doi.org/10.1007/s00521-017-3162-x
Funding
This work was supported by CAPES (Higher Education Improvement Coordination); and FAPESP (São Paulo Research Foundation).
Ethics declarations
Conflict of interest: None.
Supplementary Information
ESM 1 (XLSX 280 kb)
APPENDIX
About this article
Cite this article
e Souza, T.L.D., Nishijima, M. & Pires, R. Revisiting predictions of movie economic success: random Forest applied to profits. Multimed Tools Appl 82, 38397–38420 (2023). https://doi.org/10.1007/s11042-023-15169-4