1 Introduction

Sex crime is a severely under-reported crime category in the United States. Although a majority of victims choose to disclose their suffering to friends or family members, fewer than 20% of all incidents end up being reported to law enforcement (Thoennes and Tjaden 2000). Reporting a sex crime is a high-stakes personal decision because of prejudicial and false beliefs called “rape myths” that stereotype sex crime victims and inhibit reporting. For example, questioning the motives for reporting, blaming victims for their victimization, exonerating perpetrators, and downplaying the consequences of assault for victims can lead to revictimization and psychological trauma for survivors (Payne et al. 1999). Moreover, about 3 in 4 sex crime incidents in the United States are committed by offenders known to the victims, such as intimate partners and casual acquaintances (Planty et al. 2013). Strong feelings of humiliation and fear of reprisal further discourage victims from coming forward.

This study investigates how reports of sex crimes changed across areas in the United States after the MeToo movement. Commencing in October 2017 amidst the exposure of sexual assault allegations against Harvey Weinstein, the MeToo movement has garnered substantial attention, standing out as one of the most influential and impactful social media campaigns to date. The subsequent disclosures by victims of sex crimes, predominantly females, redirected public attention toward pervasive issues of sexual misconduct. A pivotal contribution of our study lies in identifying varying levels of MeToo salience across the United States. This is accomplished by constructing a novel measure of pre-MeToo sexism at the media market level, exploiting partial least squares (PLS) and several predictors associated with local sexism in the pre-MeToo era. Utilizing the constructed sexism index, we find empirical evidence that the salience of MeToo was more pronounced in areas characterized by lower measured sexism in the pre-MeToo era.

We extend our analysis by employing econometric methods to examine the shift in the number of reported sex crimes during the MeToo era in low sexism, high MeToo salience areas compared with high sexism areas. Utilizing data on sex crimes from the National Incident-Based Reporting System, we observe that, post-MeToo, low sexism, high MeToo salience areas exhibit a relative increase in reported sex crimes compared to their high sexism counterparts. Specifically, an interdecile decrease in local sexism corresponds to an approximately 8% rise in the sex crime rate per 100,000 population. To mitigate potential confounding factors, we incorporate a range of control variables that capture demographic and socioeconomic distinctions across areas. Our results withstand these controls, signifying the robustness of the effect of sexism. Furthermore, we conduct various robustness checks, including using political preference as a proxy for ideological orientation and employing yearly-differenced crime rates to account for seasonality effects. We also explore alternative measures of sexism and exclude areas with a small population in jurisdictions reporting to the National Incident-Based Reporting System. Our main findings persist through these robustness checks. Additionally, a heterogeneity analysis reveals that the increased reporting of sex crimes predominantly involves incidents that are committed by offenders known to victims, result in non-severe injuries, and are committed by local residents.

Finally, we explore whether the increase in reported sex crimes in low sexism, high MeToo salience areas stems from actual incidents or from shifts in the reporting behavior of sex crime victims during the MeToo era. Because the true number of sex crimes is unknown, we employ two strategies, one indirect and one direct. In the indirect strategy, we first examine the rate of non-sex crimes against women post-MeToo and find no statistically significant change. This suggests that the observed increase in sex crimes is unlikely to be confounded by concurrent policies or an overall rise in crimes against women in low sexism areas. Second, we focus on two sex crime types less likely to be affected by reporting effects: homicide and aggravated assault of women under sex crime circumstances. Homicide is arguably the least under-reported crime type, and aggravated assault often results in serious physical injury requiring immediate medical treatment. We observe no notable differences in these two sex crime types between low and high sexism areas during the MeToo era. Furthermore, we do not detect a significant change in the arrest rate related to sex crimes, suggesting that the increase in reported sex crimes in low sexism areas after MeToo is unlikely to be caused by behavioral changes in local law enforcement. In the direct strategy, we analyze responses from the National Crime Victimization Survey from 2015 to 2019. Our empirical findings indicate a significant shift in the willingness of sex crime victims in low sexism areas to report their experiences to the police post-MeToo. Specifically, an interdecile decrease in the sexism index raises the likelihood of reporting by approximately 27 percentage points. This increase is notable given that a counterpart analysis of the pre-MeToo era reveals no statistically meaningful difference in reporting likelihood between low and high sexism areas. In essence, the upsurge in reported sex crimes in low sexism, high MeToo salience areas during the MeToo era should be attributed primarily to changes in reporting behavior rather than to changes in actual incidents.

Our study contributes not only to the small but growing literature on how MeToo affects the composition of public companies’ boards and legislative bodies (Heminway 2019), firm values (Lins et al. 2020), the criminal justice system (Conklin 2020), collaboration among Hollywood producers (Luo and Zhang 2022), and sex crime reporting in the OECD countries (Levy and Mattsson 2023), but also to the literature on violence against women (Aizer and Dal Bó 2009; Aizer 2010; Iyengar 2009; Iyer et al. 2012; Miller and Segal 2019). A study closely related to ours is Levy and Mattsson (2023), who thoroughly examine the overall MeToo effect in both the OECD countries and the United States. Our study differs from Levy and Mattsson (2023) in two important ways. First, we construct a novel PLS-based sexism index to differentiate local sexist attitudes that closely correlate with sex crime, and then provide first-stage evidence that MeToo drew differential levels of attention across these areas. PLS is a popular machine learning method used primarily for dimensionality reduction and for dealing with multicollinearity problems in linear regressions. Its primary function is to generate, from a large group of multicollinear predictors, a few linearly uncorrelated new variables that substantially explain the variation in the outcome variable. Here, the predictors of local sexism are the Google search indices of derogatory terms, which include sexist slurs and gender insults based on someone’s appearance, intellect, sexual experience, and mental stability. Although these indices contain a lot of noise unrelated to searchers’ sexist attitudes, the spatial distribution shows that MeToo-related terms are less likely to be searched in areas with higher sexism, while female-referential derogatory terms are more likely to be searched in those areas. This provides some anecdotal evidence on the correlation between local sexism and search behavior. Through PLS, we economically separate out the information in the predictors that largely explains a unidimensional sexism index constructed by Charles et al. (2022), who exploit responses to questions about respondents’ sexist attitudes toward women in the General Social Survey (GSS) between 1977 and 1998.Footnote 1 Second, unlike Levy and Mattsson (2023), who estimate the overall MeToo effect directly, our study concentrates on comparing the changes in the number of reported sex crimes across low and high sexism areas in the United States in the MeToo era.

The remaining part of the paper is structured as follows. Section 2 describes the procedure for constructing the sexism index using PLS. Section 3 demonstrates how MeToo shifted public awareness on sexual misconduct and garnered more attention in low sexism areas. Section 4 discusses the data and the econometric model, and Section 5 provides the results of empirical analyses. We conclude in Section 6. The Supplementary Appendix contains the algorithm of PLS and some additional results of the empirical analyses.

2 Measure of sexism

We begin by outlining a statistical method for estimating local sexism during the pre-MeToo era. As highlighted earlier, the GSS sexism index introduced by Charles et al. (2022) provides a direct measure of the sexist attitudes of residents at the state level. However, this state-level index lacks the granularity needed to capture the heterogeneity among different areas within a state. To better evaluate differential sexist attitudes across the United States, we adopt partial least squares (PLS), a supervised statistical learning method that helps identify useful predictors of an outcome variable and yields accurate out-of-sample predictions. Appendix A1 provides more technical details on the PLS algorithm. Briefly, we first train a PLS model using state-level data to find a structure that best explains state-level sexism. Then, using the structure identified by PLS and new data at a level finer than the state, we predict a new sexism index at that level. Since state-level sexism can be proxied by the GSS sexism index, our primary task is to find potential predictors that explain sexist attitudes. In this analysis, we follow Stephens-Davidowitz (2014) and collect state and media market (MM) level Google Trends (GT) search data on sexism-related terms.Footnote 2 Using the GT measure is intuitive because internet search behavior can proxy for the demand for information, which reflects an individual’s level of attention to a topic. Compared with survey data, GT data are less susceptible to small-sample bias and can elicit Google users’ behaviors, subtle feelings, and socially sensitive attitudes. In economics, GT data have been widely used since the seminal work by Stephens-Davidowitz (2014), who uses GT data to proxy for racial animus. Subsequent studies use GT data as indicators of job search (Baker and Fradkin 2017), penetration rate of ride-hailing (Hall et al. 2018), tourism flow (Siliverstovs and Wochner 2018), awareness of immigration enforcement (Muchow and Amuedo-Dorantes 2020), psychological well-being (Brodeur et al. 2021), and access to mental healthcare (Deza et al. 2022).
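
The train-then-predict procedure described above can be sketched in a few lines. The following is a minimal illustration only, assuming a single PLS component, centered but unscaled predictors, and entirely synthetic numbers; the function names `fit_pls1` and `predict_pls1` are hypothetical, and the algorithm actually used is the one in Appendix A1, which may differ in the number of components and in preprocessing.

```python
# Minimal one-component PLS (PLS1) sketch in pure Python; all numbers are synthetic.

def fit_pls1(X, y):
    """Fit a single-component PLS regression of y on the columns of X."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weights: direction in predictor space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores: projection of each centered observation onto the weights.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Slope from regressing the centered outcome on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return {"x_means": x_means, "y_mean": y_mean, "w": w, "q": q}


def predict_pls1(model, X_new):
    """Apply the fitted structure to new (e.g. finer-level) predictor data."""
    preds = []
    for row in X_new:
        score = sum((x - m) * wj
                    for x, m, wj in zip(row, model["x_means"], model["w"]))
        preds.append(model["y_mean"] + model["q"] * score)
    return preds


# Hypothetical state-level GT indices for three terms and a GSS-style outcome.
state_gt = [[40, 55, 30], [70, 80, 65], [55, 60, 50], [90, 95, 85], [30, 35, 20]]
state_sexism = [-0.8, 0.6, 0.1, 1.2, -1.1]
model = fit_pls1(state_gt, state_sexism)

# Hypothetical media-market-level GT indices for the same three terms.
mm_gt = [[45, 50, 35], [85, 90, 80]]
mm_sexism = predict_pls1(model, mm_gt)
print(mm_sexism)
```

The design point the sketch is meant to convey is that the weights are learned where the outcome is observed (states) and then reused where it is not (media markets).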

The GT index captures the intensity of Google searches containing certain terms and measures the relative popularity of these terms. Given a selected time frame and set of geographic regions, the GT index is calculated as the number of searches for a term divided by the total number of searches, and is then normalized to range between 0 and 100, where 100 marks the highest search share for that term and 0 indicates that a given period or region does not have sufficient search volume for that term. This study exploits the GT indices of sexism-related terms from the following two dictionaries.
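
As a rough illustration of this normalization, the sketch below computes search shares and rescales them so that the largest share equals 100. It follows only the verbal description above: Google does not disclose raw volumes, and its actual procedure also involves sampling and a privacy threshold, neither of which is modeled here. The function name `gt_index` and the counts are hypothetical.

```python
def gt_index(term_searches, total_searches):
    """Scale search shares so that the largest share equals 100."""
    shares = [t / n for t, n in zip(term_searches, total_searches)]
    peak = max(shares)
    return [round(100 * s / peak) for s in shares]


# Hypothetical counts for three regions: searches for the term, all searches.
index = gt_index([50, 30, 0], [1000, 1200, 800])
print(index)
```

Because only relative shares survive the rescaling, the index is comparable across regions within one query but says nothing about absolute search volume.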

Sexism slurs

   This dictionary contains a list of 206 primarily female-referential derogatory terms collected by James (1998).Footnote 3 Since most terms on this list are regional slang with multiple meanings, we examine the GT indices of these terms one by one at the state and MM level between Jan 1, 2015, and Oct 14, 2017. This step helps us narrow the list down to four sexist slurs ([Words] 1–4) displayed in panel (a) of Appendix Table A2, as all other terms either have a zero index across the country or are searched in very few areas.Footnote 4 Because the other terms are so sparse in GT, we concentrate exclusively on these four sexist slurs (and their plurals). To alleviate the concern that a vast majority of these searches are directed at pornographic material, we search “[Word] − pornhub”, which returns searches containing that specific word but excludes searches containing pornhub, one of the most-trafficked adult websites in the world.Footnote 5

Derogatory terms

   Beyond the four sexist slurs, sexist attitudes toward women can be multidimensional and socially subtle. Felmlee et al. (2020) analyze more than 2.9 million tweets that contain gendered insults, and categorize the hostile content into insults about someone’s appearance, intellect, sexual experience, mental stability, and age. These tweets shame women by accusing them of falling short of the standards in the five categories. To extract negative adjectives that are commonly used to refer to women along these dimensions, we build our second dictionary by exploiting Describing Words (https://describingwords.io/), an engine built to retrieve adjectives that commonly describe a noun, based on a corpus of about 10 gigabytes of literature text files, mostly fiction and contemporary works. Its parser crawls through each book, returns the descriptions of nouns, and ranks adjectives by their usage frequency for each noun. We search “woman” and “girl” in this engine and then identify the two most frequently used adjectives in each of four of these categories (all but age): “fat,” “ugly,” “emotional,” “mad,” “stupid,” “dumb,” “dirty,” “easy.”Footnote 6 Then, we document all possible combinations of the eight adjectives and the two nouns (including the plurals) in panel (b) of Appendix Table A2, and collect the GT indices of these terms between Jan 1, 2015, and Oct 14, 2017.
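
   The construction of this second dictionary can be sketched as a simple cross product. The snippet assumes that each search term takes the form "adjective noun" and that the plurals are "women" and "girls"; both are assumptions made here for illustration, and the authoritative list is the one in panel (b) of Appendix Table A2.

```python
from itertools import product

# The eight adjectives quoted in the text.
adjectives = ["fat", "ugly", "emotional", "mad", "stupid", "dumb", "dirty", "easy"]
# Two nouns with assumed plural forms.
nouns = ["woman", "women", "girl", "girls"]

# Every adjective paired with every noun form (assumed "adjective noun" order).
search_terms = [f"{adj} {noun}" for adj, noun in product(adjectives, nouns)]
print(len(search_terms))
```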

Fig. 1 Media market-level sexism index in the United States. Notes. This figure plots the media market-level sexism index using partial least squares (PLS) discussed in Section 2. For ease of comparison, the index is normalized between 0 and 1. Alaska and Hawaii (Fairbanks-AK, Juneau-AK, and Honolulu-HI) are not displayed. The index in Augusta-GA (denoted by “\(\times \)”) is not available due to missing values in the Google Trends index.

We would like to emphasize that although the GT index has several drawbacks such as failing to reveal the actual search volume in different areas and adjusting indices of terms with searching intensity below an unknown, pre-determined threshold to zero, these technical adjustments will not seriously hamper the comparability of these indices across areas, because the state- or MM-level index could still reflect the relative popularity of a specific term. Another concern is that many searchers would use those terms for reasons unrelated to sexism, such as searches for adult services and entertainment featuring derogatory depictions of women. We cannot completely exclude this possibility, but note that states higher in sexism tend to have higher GT indices for these searches, as exhibited by Appendix Fig. A1(d)–(h). Such empirical evidence indicates that the search behaviors, proxied by the GT indices, contain some subtle information on searchers’ sexist attitude. By conducting the PLS algorithm, we can identify and extract the essential parts from the GT indices that best explain the local sexism, while information unrelated to sexist attitude would be isolated.

Figure 1 visualizes the constructed sexism index across 205 MMs (excluding Fairbanks-AK, Juneau-AK, Honolulu-HI, Bangor-ME and Augusta-GA) by PLS.Footnote 7 It exhibits two salient features. First, sexism tends to be lower in areas of the west coast and northeast, Colorado, as well as scattered urban areas such as Chicago, Detroit, Miami-Ft. Lauderdale, and Las Vegas. On the other hand, the highest sexism appears in the southern areas such as Tennessee, Arkansas, Mississippi, and Alabama. Second, the level of sexism can be enormously different across MMs even within the same state. For example, both Texas and Mississippi have low sexism areas (for example, Austin and Meridian) and extremely high ones (for example, Tyler-Longview and Greenwood-Greenville).

Fig. 2 Comparison of sexism measures. Notes. This figure plots three state-level sexism measures: the PLS sexism (horizontal axis), the GSS sexism (vertical axis) by Charles et al. (2022), and the number of protesters per 1000 population in 2017 women’s marches (bubble size). The line denotes the linear regression line.

As a robustness check, we compare the PLS sexism index based on the GSS sexism index with another proxy of local sexism: the number of participants in the 2017 women’s marches. We collect the data on the 2017 women’s marches from Count Love (https://countlove.org/), an online database of protests that have taken place in the United States since 2016. Using reports from local newspaper and television outlets, it provides information on the location of each protest and the number of protesters. The women’s marches on January 21, 2017, were among the largest in recent years, with 1 to 1.6 percent of the population in the United States participating in at least 408 marches on that day (Broomfield 2017). These local women’s marches promoted several human rights causes, women’s rights included, and their sizes could partly reflect local residents’ attention to topics on women’s rights and gender inequalities prior to MeToo. Figure 2 compares the GSS sexism index (vertical axis), the fitted PLS sexism index (horizontal axis), and the number of protesters per 1000 state population (size of bubbles) in 43 states. It shows that the two sexism indices are positively correlated (\(\rho =0.77\)), and that states lower in sexism tend to have more participants in the 2017 women’s marches.

Fig. 3 Google Trends paths of terms related to MeToo and sexual offenses. Notes. This figure plots the quarterly aggregated Google Trends indices for terms “metoo + metoo movement + #metoo” (solid black) and terms “sexual harassment + sexual abuse + sexual violence” (dashed grey) within the United States during 2015–2019. The vertical dashed line denotes 2017Q4, when MeToo was sparked.

3 MeToo and shifted public awareness

The MeToo movement was sparked on October 15, 2017, when American actress Alyssa Milano posted a tweet calling on her followers to share the sexual violence they had experienced with a “#MeToo” hashtag, after the exposure of sexual assault allegations against Harvey Weinstein. The hashtag went viral immediately on social media and generated around 1.6 million posts on Twitter in the first week of the movement (October 14–21), instigating a massive social movement that led to public disclosures of sexual misconduct (Modrek and Chakalov 2019). Within a year, more than 200 high-profile men were toppled from their positions (Carlsen et al. 2018). Its impact quickly spread to other countries and drew international attention (Levy and Mattsson 2023).

To illustrate how Weinstein’s scandal and the ensuing MeToo movement shifted public attention to sexual violence, we again take advantage of the GT data. Specifically, we experiment with two groups of search terms, “metoo + metoo movement + #metoo” and “sexual harassment + sexual abuse + sexual violence,” where “\(+\)” indicates “or” in GT.Footnote 8 Since the two groups of terms are unlikely to carry multiple meanings, their GT indices can partly proxy for public awareness of this social movement and of sexual offenses.Footnote 9 Setting the time period between Jan 1, 2015, and Dec 31, 2019, we collect the GT indices of the two groups of terms in the United States under all query categories submitted to Google. Then, we aggregate the collected indices to the quarterly level and scale them from 0 to 100. Figure 3 sketches the GT index paths of the two groups of terms during 2015Q1–2019Q4. It shows that the search intensity of MeToo-related terms spiked in 2017Q4, when Weinstein’s scandal was publicized, and stayed at a high level in 2018 due to the allegations against Justice Brett Kavanaugh, before gradually dying down in 2019. Searches for terms related to sexual offenses reached their peak in 2017Q4 after the onset of MeToo, and the elevated public awareness of sexual violence was sustained at a level around 50, indicating a far-reaching impact of MeToo.
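
The quarterly aggregation and rescaling step can be sketched as follows. The text does not specify the aggregation rule, so this sketch assumes monthly GT values that are averaged within each quarter and then divided by the peak quarter so that the maximum equals 100; a min-max rescaling would be an equally plausible reading. The function name `quarterly_index` and the input values are hypothetical.

```python
from collections import defaultdict


def quarterly_index(monthly):
    """monthly: dict mapping (year, month) -> GT index value."""
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1
        buckets[(year, quarter)].append(value)
    # Assumed aggregation rule: simple mean of the months in each quarter.
    means = {k: sum(v) / len(v) for k, v in buckets.items()}
    peak = max(means.values())
    # Rescale so that the quarter with the highest mean equals 100.
    return {k: 100 * v / peak for k, v in sorted(means.items())}


# Synthetic monthly values around the onset of the movement.
monthly = {(2017, 7): 2, (2017, 8): 4, (2017, 9): 6,
           (2017, 10): 10, (2017, 11): 100, (2017, 12): 40}
quarterly = quarterly_index(monthly)
print(quarterly)
```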

Fig. 4 Google Trends paths of terms related to MeToo: low sexism media markets vs. high sexism media markets. Notes. This figure plots the quarterly aggregated Google Trends indices for terms “metoo + metoo movement + #metoo” (solid black) in low (below the 50th percentile) and high (above the 50th percentile) sexism media markets during 2015–2019. The vertical dashed line denotes 2017Q4, when MeToo was sparked.

Although MeToo is a sweeping social movement that drew broad public attention, empirical evidence shows that its impact differed across the United States. Appendix Fig. A1(b) and (c) display the spatial distribution of the search intensity of the two groups of terms. The two figures clearly show that the terms are more likely to be searched in states with lower GSS sexism. In states such as Arkansas, Alabama, and Mississippi, which are higher in sexism, the terms received relatively lower search intensity during the sample period. Furthermore, Fig. 4 systematically compares the trajectories of the GT index, relative to the national trend, associated with “metoo + metoo movement + #metoo” in low (below the 50th percentile) and high (above the 50th percentile) sexism MMs throughout the sample period. As depicted, while the MeToo movement attracted significant attention in both categories, it consistently received relatively more attention in low sexism MMs. However, Fig. 2 raises an omitted-variable concern for our identification: sexism clearly correlates with several factors such as urbanity and geographic region. These are typical markers of contemporary socio-political differences in the United States along many dimensions. We will examine these issues by considering a host of controls that account for regional differences.

4 Data and empirical model

4.1 Measures of crime

Our primary source of crime data is the National Incident-Based Reporting System (NIBRS), which is part of the FBI’s Uniform Crime Reporting (UCR) Program. Unlike standard UCR data, the NIBRS links, for each documented incident, information on victims, offenders, and arrestees (such as age, sex, and race), circumstances (such as location, date, and the relationship between victim and offender), and whether the incident caused injury. This allows us to study heterogeneity by crime type. We focus on incidents involving sexual offenses, including forcible (rape, sodomy, sexual assault with an object, and fondling) and non-forcible (incest and statutory rape) types.Footnote 10 Another reason that the NIBRS is ideally suited for this study is that it reports the date of occurrence of an incident, whereas UCR data are aggregated monthly and only roughly reflect when an incident was reported, not necessarily the month in which it occurred.Footnote 11

A notable constraint of the NIBRS is its participation rate, with numerous police agencies in major cities such as New York City and Houston yet to participate, despite a consistent increase since its initiation in 1991. Although it covers jurisdictions representing over 96 million people in the United States as of 2015, equivalent to 36.1% of the population within UCR-reporting jurisdictions,Footnote 12 our focus in this study is on the 3464 constituent jurisdictions that consistently reported to the NIBRS each month from 2015 to 2019. These jurisdictions serve 85 million people in 134 of the 210 MMs. The average coverage rate, indicating the proportion of a MM’s population residing in jurisdictions reporting to the NIBRS, stands at approximately 0.52.Footnote 13 It is crucial to recognize, however, that a significant limitation of our empirical analysis arises from the disparity between NIBRS crime data, predominantly reported in less urban areas and smaller cities, and the PLS sexism measure, driven by GT indices primarily from urban areas and larger cities within the MMs. Therefore, a key assumption for the empirical analysis in this study is that there is a positive correlation between sex crime trends in the urban and rural areas within a MM, and we adopt a practical strategy to address this issue.Footnote 14 Our final dataset aggregates incident numbers at the MM-by-quarter level.
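
The aggregation to the MM-by-quarter level can be illustrated with a short sketch that counts incidents by media market and quarter of occurrence and converts the counts into rates per 100,000 covered population. The function name `mm_quarter_rates`, the record layout, and all figures are hypothetical; the actual data handling, including how MM-quarters with zero incidents are filled in, is not described in this section.

```python
from collections import Counter
from datetime import date


def mm_quarter_rates(incidents, population):
    """incidents: list of (media_market, occurrence_date) records.
    population: media_market -> population of consistently reporting jurisdictions.
    Returns (media_market, year, quarter) -> incidents per 100,000 population.
    MM-quarters with no incidents are simply absent from the result."""
    counts = Counter((mm, d.year, (d.month - 1) // 3 + 1) for mm, d in incidents)
    return {key: 100_000 * n / population[key[0]] for key, n in counts.items()}


# Synthetic incident records and covered populations for two media markets.
incidents = [("X", date(2017, 10, 20)), ("X", date(2017, 12, 1)),
             ("X", date(2018, 1, 5)), ("Y", date(2017, 11, 3))]
population = {"X": 200_000, "Y": 50_000}
rates = mm_quarter_rates(incidents, population)
print(rates)
```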

Table 1 Summary statistics

Fig. 5 Sexism index and quarterly sex crime rate: basic patterns. Notes. This figure plots the paths of the UCR sexual crime rate between 2015 and 2019. Each measure denotes a quarterly sex crime rate per 100,000 population. The solid black (dashed grey) line represents agencies located in states with sexism indices in the fourth quartile (first quartile). The vertical dashed line denotes 2017Q4, when MeToo was sparked.

Panel (a) of Table 1 documents the summary statistics of sex and non-sex crimes before and after MeToo. For sex crimes, besides considering the types of offenders and victims, we additionally consider the number of incidents cleared by arrest and the number of incidents accompanied by homicides or aggravated assaults. For ease of comparison, these summary statistics are displayed separately for low (below the 50th percentile) and high (above the 50th percentile) sexism MMs. This distinction is useful because the sexism index is the key variable driving the heterogeneity in sex crimes among areas. Panel (a) shows that, before MeToo, the sex crime rate per 100,000 population served by the police departments that participate in the NIBRS is significantly lower in low sexism MMs than in their high sexism counterparts, but the gap is halved after MeToo. It is also evident that the non-sex crime rate is significantly lower in the low sexism areas during the whole sample period. In Fig. 5, we further compare the quarterly average sex crime rates in the low (first quartile of sexism) and high (fourth quartile of sexism) sexism areas between 2015 and 2019. Prior to MeToo, sex crime rates in the two groups are comparable, with the rate in the high sexism areas slightly higher than that in the low sexism areas. However, immediately after MeToo, their relative positions switched, as the sex crime rate in the low sexism areas outpaced that in the high sexism areas in 2018 and 2019, although both trended down to the pre-MeToo level at the end of 2019.

Table 2 MeToo salience across media markets

4.2 Controls

Since sexist attitudes toward women could correlate with local characteristics that explain a disproportionate increase in sex crimes in the aftermath of MeToo, we construct a host of controls to capture the demographic, socioeconomic, and policing/crime differences across areas. Specifically, the demographic controls contain population (in log) and the percentages of the population that are female, white, black, Hispanic, and aged between 15 and 44. The socioeconomic controls contain per capita income, the unemployment rate, the labor force participation rate, and the percentage of the population with a college education or more. The policing/crime controls contain the number of police officers per 100,000 population and the numbers of UCR violent crimes (excluding sex crimes) and property crimes per 100,000 population. The data sources are the Census Bureau, the Bureau of Economic Analysis, the Bureau of Labor Statistics, the UCR, and the NIBRS. Panel (b) of Table 1 documents summary statistics of these controls and displays remarkable differences across areas.

In Table 2, we examine MeToo salience across the 134 MMs by regressing the post-MeToo aggregate GT index related to MeToo on the PLS sexism index. The result in column (1) indicates that an interdecile decrease (a value of 3.3) in the sexism index corresponds to a statistically significant 21-unit increase in the GT index related to MeToo. Although the inclusion of additional controls halves the magnitude of the estimate, as shown in columns (2)–(4), it remains significant at the 5% level, implying notable MeToo salience in the low sexism MMs relative to the high sexism MMs.

4.3 Empirical model

Our empirical strategy is designed to evaluate the shift in the reporting of sex crimes in MMs with low sexism relative to those with high sexism during the MeToo era. This strategy can be generalized to a regression framework

$$\begin{aligned} Outcome_{myq} = {}& \beta _0 + \beta _1\, SexismIndex_{m} \times Post_{yq} \\ & + \gamma \, Controls_{my} \times Post_{yq} + \alpha _m + \tau _{yq} + \epsilon _{myq}, \end{aligned} \tag{1}$$

where Outcome\(_{myq}\) is a measure of sex crime in media market m in quarter q of year y. SexismIndex\(_{m}\) denotes the standardized PLS sexism index at the MM level, ranging from \(-\)3.89 to 3.63, unless noted otherwise. Post\(_{yq}\) is an indicator variable that equals 1 from 2017Q4 onward, when MeToo went viral on the internet. \(Controls_{my}\) denotes the host of MM-level control variables discussed earlier. \(\alpha _m\) denotes the MM fixed effects, which account for pre-existing differences among MMs with differential sexism levels that correlate with both the sex crime rate and local sexism. \(\tau _{yq}\) denotes the year-by-quarter fixed effects, which control for temporal changes in sex crimes common to all in-sample MMs. Standard errors are clustered by MM.Footnote 15 Unless otherwise noted, all regressions are weighted by the MM population of the constituent jurisdictions consistently reporting to the NIBRS within the sample period. The coefficient of interest is \(\beta _1\), which corresponds to the interaction between the Post indicator and the sexism index SexismIndex. Intuitively, this coefficient gauges the differential response to MeToo in MMs with varying levels of sexism. However, SexismIndex may be correlated with other local characteristics. For example, high sexism areas may have a large share of the population with fewer years of schooling and a certain political ideology, which could lead to more undocumented sexual assaults. This concern can be partially alleviated by interacting the rich set of observable MM characteristics with the Post dummy.
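
To make the role of \(\beta _1\) concrete, the following sketch estimates a stripped-down version of Eq. (1) on a synthetic balanced panel. It is an illustration under strong simplifying assumptions, not the estimation used in the paper: it omits the interacted controls, the population weights, and the clustered standard errors, and it relies on the fact that, in a balanced panel, MM and year-by-quarter fixed effects can be removed by two-way demeaning. The function names and all numbers are hypothetical.

```python
def two_way_demean(panel):
    """panel: dict (unit, period) -> value for a balanced panel."""
    units = sorted({u for u, _ in panel})
    periods = sorted({t for _, t in panel})
    unit_mean = {u: sum(panel[u, t] for t in periods) / len(periods) for u in units}
    period_mean = {t: sum(panel[u, t] for u in units) / len(units) for t in periods}
    grand = sum(panel.values()) / len(panel)
    return {(u, t): panel[u, t] - unit_mean[u] - period_mean[t] + grand
            for (u, t) in panel}


def did_slope(outcome, sexism, post):
    """Within estimator of the coefficient on SexismIndex x Post."""
    x = {(u, t): sexism[u] * post[t] for (u, t) in outcome}
    x_t, y_t = two_way_demean(x), two_way_demean(outcome)
    return sum(x_t[k] * y_t[k] for k in x_t) / sum(v * v for v in x_t.values())


# Synthetic panel: three media markets, 2015Q1-2019Q4, true coefficient -0.5.
periods = [(y, q) for y in range(2015, 2020) for q in range(1, 5)]
post = {t: 1 if t >= (2017, 4) else 0 for t in periods}
sexism = {"A": -1.0, "B": 0.0, "C": 1.5}
alpha = {"A": 20.0, "B": 22.0, "C": 25.0}   # media market fixed effects
true_beta = -0.5
outcome = {(m, t): alpha[m] + 0.1 * i + true_beta * sexism[m] * post[t]
           for m in sexism for i, t in enumerate(periods)}

beta_hat = did_slope(outcome, sexism, post)
print(beta_hat)
```

Because the synthetic outcome contains no noise, the estimator recovers the assumed coefficient up to floating-point error; with real data the same logic applies, but weights, controls, and clustered inference would be handled by a regression package.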

Table 3 Changes in sex crime rates after MeToo

5 Results

5.1 Results of baseline model

The first two columns of Table 3 present estimates of Eq. (1) under two specifications. In column (1), we consider the baseline specification, which estimates the sex crime rate per 100,000 population served by the police departments participating in the NIBRS with the MM and year-by-quarter fixed effects, as well as a rich set of controls that account for local differences in demographics, socioeconomic factors, the number of police officers, and the number of non-sex crimes. The point estimate suggests that there was a large and highly significant increase in the sex crime rate in MMs lower in sexism after MeToo. The point estimate of \(-\)0.5 implies that an interdecile decrease (a value of 3.3) in the sexism index translates into an increase of 1.65, or about 8%, in the sex crime rate after MeToo. As mentioned before, because the NIBRS crime data are largely reported in less urban areas while the PLS sexism measure is driven by the urban areas of an MM, a key assumption of the analysis is that sex crime trends in the urban and rural areas of MMs are positively correlated. Therefore, in column (2), we report the robustness of the result to the inclusion of state-by-year fixed effects, which uses variation across jurisdictions within the same state. That is, adding the state-by-year fixed effects controls for trends in crime reporting at the state level, so the positive correlation only has to hold within a state. The estimate shrinks in magnitude to \(-\)0.364 but is still significant at the 1% level.
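
The arithmetic behind these magnitudes can be written out as a short worked example. It uses only the two coefficients and the interdecile value of 3.3 quoted above; the column (2) product is computed here purely for illustration and is not a figure reported in the text.

$$\begin{aligned} \Delta Outcome = {}& \beta _1 \times \Delta SexismIndex, \\ \text{column (1):} \quad & (-0.5) \times (-3.3) = 1.65, \\ \text{column (2):} \quad & (-0.364) \times (-3.3) \approx 1.20, \end{aligned}$$

where both products are measured in reported sex crimes per 100,000 population.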

Fig. 6

Coefficients on interactions between sexism index and year-by-quarter dummies (2015Q1–2019Q4). Notes. This figure plots the coefficients on the interaction terms between the year-by-quarter dummies and sexism index in Eq. (2) using observations between 2015Q1 and 2019Q4, after accounting for media market fixed effects, year-by-quarter fixed effects, and all interacted controls. The black vertical bars denote the 95% confidence intervals for each estimate. 2017Q3 is dropped as the comparison group

To further examine how the estimated effect varies over time, we conduct an event study by estimating the following regression equation:

$$\begin{aligned} Outcome_{myq} ={} & {} \beta _0 + \sum _{\begin{array}{c} y=2015,...,2019 \\ q=1,..,4 \end{array}}\beta _{yq} SexismIndex_{m} \times Time_{yq}\nonumber \\{} & {} + \gamma Controls_{my}\times Post_{yq} + \alpha _m + \tau _{yq} + \epsilon _{myq}, \end{aligned}$$
(2)

where \(Time_{yq}\) is a set of 19 indicator variables that take the value of 1 for the combinations of \(y\in \{2015,2016,2017,2018,2019\}\) and \(q\in \{1,2,3,4\}\), with \(Time_{2017,3}\), the period immediately before MeToo, left out as the comparison group. Each coefficient \(\beta _{yq}\) can be interpreted as an estimate of the effect of the sexism index on the sex crime rate in quarter q of year y, relative to 2017Q3. Thus, Eq. (2) generalizes Eq. (1) to estimate quarter-by-quarter contrasts. The 19 point estimates are plotted in Fig. 6 to provide a visual summary, with the black bars denoting the 95% confidence intervals. We do not find evidence of pre-trends, as all estimates before 2017Q4 are close to zero and insignificant at any conventional level. After 2017Q3, all slope coefficients become negative and most are significant at either the 5% or 10% level, indicating more documented sex crimes in the low sexism MMs than in the high sexism ones in 2018 and 2019.
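The interaction terms in Eq. (2) can be sketched as follows. The helper `event_study_design` and the toy three-market panel are hypothetical and serve only to show how 20 year-quarter cells yield 19 interaction columns once 2017Q3 is omitted as the comparison group.

```python
import numpy as np

def event_study_design(index, year, quarter, omit=(2017, 3)):
    """Interact a market-level index with year-quarter dummies.

    Returns (labels, X): labels lists the retained (year, quarter) pairs and
    X has one column per label. The omitted period gets no column, so its
    rows are all zeros and it serves as the comparison group.
    """
    index = np.asarray(index, dtype=float)
    year = np.asarray(year)
    quarter = np.asarray(quarter)
    periods = sorted(set(zip(year.tolist(), quarter.tolist())))
    labels = [p for p in periods if p != omit]
    cols = [index * ((year == y) & (quarter == q)) for (y, q) in labels]
    return labels, np.column_stack(cols)

# Toy panel (assumption): 3 markets observed over 2015Q1..2019Q4.
years = np.repeat(np.arange(2015, 2020), 4)
quarters = np.tile(np.arange(1, 5), 5)
year = np.tile(years, 3)
quarter = np.tile(quarters, 3)
index = np.repeat([-1.0, 0.0, 2.0], 20)

labels, X = event_study_design(index, year, quarter)
print(len(labels), X.shape)
```

The columns of `X` would then enter the regression alongside the fixed effects and interacted controls, and the coefficient on each column plays the role of \(\beta _{yq}\).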

5.2 Robustness checks

Political preference    Although we have controlled for a rich set of socioeconomic covariates, sexism could still correlate with factors such as ideology and culture, which are typical markers of contemporary socio-political differences within the United States. In fact, the spatial distribution of sexism displayed in Fig. 1 largely overlaps with voters’ political preferences across the country.Footnote 16 Such an overlap became particularly salient in 2016, when Donald Trump was elected in the midst of multiple accusations of sexual misconduct, and Trump’s winning margins in these areas, to some extent, reflect the degree of acceptance and normalization of rape myths.Footnote 17

To test the robustness of our result, we use political preference as a proxy for ideological orientation to further investigate whether sex crime could be affected through this channel. In column (3) of Table 3, we re-estimate Eq. (1) by controlling for the interaction between Trump vote share and the post-MeToo period. Specifically, after controlling for Trump Share \(\times \) Post, the coefficient on SexismIndex \(\times \) Post shrinks in magnitude from \(-0.5\) to \(-0.323\) and is significant at the 5% level. This indicates about a 5% increase in the sex crime rate for an interdecile decrease in the sexism index.

Alternative measure of crime rate    To address the concern of seasonality in crime, we estimate the following yearly-differenced model using observations in 2015 as the benchmark:

$$\begin{aligned} \Delta Outcome_{myq} ={} & {} \beta _1 SexismIndex_{m} \times \Delta Post_{yq} \nonumber \\{} & {} + \gamma \Delta (Controls_{my}\times Post_{yq}) + \Delta \tau _{yq} + \Delta \epsilon _{myq}, \end{aligned}$$
(3)

which is obtained by subtracting \(Outcome_{m,2015,q} = \beta _0 + \beta _1 SexismIndex_{m} \times Post_{2015,q} + \gamma Controls_{m,2015}\times Post_{2015,q} + \alpha _m + \tau _{2015,q} + \epsilon _{m,2015,q}\) from Eq. (1). Here, \(\beta _1\) in Eq. (3) still measures the effect of the sexism index on the sex crime rate after MeToo, with the outcome now measured relative to its level in the same quarter of 2015. The estimate in column (4) of Table 3 is still significant at the 5% level and in line with our main findings.
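The differencing step behind Eq. (3) can be sketched as below. The helper `difference_from_base_year` and the one-market toy data are hypothetical; the point is only that subtracting the same-quarter 2015 value removes anything constant within a market and quarter, such as the market fixed effect and a stable seasonal pattern.

```python
import numpy as np

def difference_from_base_year(values, unit, year, quarter, base_year=2015):
    """Subtract each unit's same-quarter base-year value from later observations.

    Returns (keep, diffs): keep is a boolean mask of the non-base-year rows
    and diffs holds the differenced values for those rows, in the same order.
    """
    values = np.asarray(values, dtype=float)
    unit = np.asarray(unit)
    year = np.asarray(year)
    quarter = np.asarray(quarter)
    # Look-up table of base-year values keyed by (unit, quarter).
    base = {}
    for u, y, q, v in zip(unit.tolist(), year.tolist(), quarter.tolist(), values.tolist()):
        if y == base_year:
            base[(u, q)] = v
    keep = year != base_year
    diffs = np.array([
        v - base[(u, q)]
        for u, q, v in zip(unit[keep].tolist(), quarter[keep].tolist(), values[keep].tolist())
    ])
    return keep, diffs

# Toy data (assumption): one market, 2015 and 2016, four quarters each.
unit = np.zeros(8, dtype=int)
year = np.repeat([2015, 2016], 4)
quarter = np.tile([1, 2, 3, 4], 2)
values = [1.0, 2.0, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0]

keep, diffs = difference_from_base_year(values, unit, year, quarter)
print(diffs.tolist())
```

Since \(Post_{2015,q}=0\) for every quarter of 2015, the differenced Post indicator coincides with the original one, which is why \(\beta _1\) keeps its interpretation.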

Alternative measures of sexism and MeToo awareness    Thus far, our discussion has relied exclusively on the PLS sexism index to represent sexist attitudes toward women in different areas. We also use the Weinstein scandal as the exogenous shock that separates the pre- and post-MeToo eras. To test the robustness of our finding based on this identification strategy, we use alternative measures of sexism and MeToo awareness to construct new interaction terms in Eq. (1).

In column (5) of Table 3, we interact the continuous MM-level PLS sexism index with the national level of MeToo awareness (Awareness), which is proxied by the GT index of terms related to sexual harassment. As displayed in Fig. 3, there is a clear uptick in the awareness of sexual violence in 2017Q4. All specifications exhibit negative estimates. The larger magnitudes suggest that the effect of sexism becomes attenuated in areas where sexual violence drew more public attention. Specifically, the estimate in column (5) implies that an interdecile decrease in the sexism index increases the sex crime rate by 1.1 after MeToo when the awareness index is 0.428, the full sample average. Additionally, we directly replace the PLS sexism index with the GSS sexism index in Charles et al. (2022). The result in column (6) further confirms the robustness of the estimate, which is still significant at the 1% level.

Fig. 7

Dropping media markets with the bottom 10%, 20% and 30% coverage rates. Notes. This figure plots the estimates of \(\beta _1\) in Eq. (1) and its 95% confidence intervals after dropping media markets with NIBRS coverage rates below 10%, 20% and 30%

Dropping unrepresentative media markets    As mentioned, one significant limitation of the NIBRS data is that not all police agencies, particularly those in large cities, have joined the program, which affects the overall representativeness of the NIBRS for nationwide studies. To examine the robustness of our result, we sequentially drop MMs with NIBRS coverage rates below 10%, 20% and 30%, respectively. We then re-estimate the coefficients and confidence intervals for each case, and the results are summarized in Fig. 7. For example, if we drop MMs with a population coverage rate lower than 10%, the number of in-sample MMs decreases from 134 to 117, but the OLS estimate, which is now \(-\)0.382, does not notably change. If we further drop MMs with coverage rates lower than 20% and 30%, the sample shrinks from 134 to 106 and 96 MMs, and the new estimates move only slightly, to \(-\)0.346 and \(-\)0.319, which is primarily due to the smaller sample; both remain statistically significant at the 5% level. These robustness checks provide further confidence in the reliability of our results and demonstrate that our findings hold across different subsets of MMs defined by the representativeness of the population served by the participating agencies.

Table 4 Heterogeneity analysis

5.3 Heterogeneous effects

Thus far, we consider sex crimes committed by all types of offenders. Exploiting the rich information in the NIBRS, we conduct several analyses on the heterogeneity by the types of offenders and victims.

First, we compare the effects on sex crimes committed by offenders known and unknown to victims, classified by the recorded relationship between victim and offender. Known offenders are acquaintances, neighbors, employees, employers, friends, family members, or persons otherwise known to the victim, while unknown offenders are strangers or offenders whose relationship to the victim is unknown. Estimates for the two offender types are presented in panel (a) of Table 4. Columns (1)–(3) display the effect on sex crimes committed by known offenders. In our preferred specification in column (2), which contains all controls, the coefficient estimate is \(-\)0.361 and is significant at the 1% level. This indicates that an interdecile decrease in the sexism index raises the sex crime rate by about 1.2 relative to the pre-MeToo era. Notably, 78% of sex crimes in the United States during 2005–2010 involved an offender who was a family member, intimate partner, friend, or acquaintance (Planty et al. 2013). When sexually abused by known offenders, victims might be hesitant to report because concerns about stigmatization and retaliation raise their own costs of reporting, especially when the offenders are intimate partners or supervisors to whom victims remain committed. Our estimates here suggest that low sexism leads to an increase in the reporting of known offenders in the MeToo era. The inclusion of the state-by-year fixed effects does not meaningfully change the result, as documented in column (3). For sex crimes committed by unknown offenders, the results in columns (4) and (5) are similar, but the effect becomes insignificant after including the state-by-year fixed effects.

Second, we compare the effects on incidents with and without injury. Unlike severe sex crimes such as rape, which cause serious physical injuries to victims, sexual misconduct such as verbal abuse and threats usually does not directly cause visible physical injury but rather less noticeable psychological and mental trauma. Therefore, some victims may decide not to report due to the lack of solid evidence. For non-injury incidents, estimates in columns (1)–(3) of panel (b) show that, relative to the pre-MeToo era, an interdecile decrease in the sexism index leads to an increase of approximately 1–1.5 in the sex crime rate. The estimates are significant at the 5% level or better in all three specifications. However, we do not observe any statistically meaningful change in sex crimes with injury, as shown in columns (4)–(6). These results suggest that the overall change in the sex crime rate in low sexism MMs in the MeToo era is primarily driven by incidents without injury.

Next, we group victims by residential status in panel (c). Such a classification is sensible because prevailing local sexist attitudes should have a larger impact on local MeToo salience and thus on residents, who internalize these social norms, than on non-residents, whose norms might be remarkably different.Footnote 18 Comparing results for resident and non-resident victims before and after MeToo, we observe that all coefficients are significant at the conventional levels for resident victims, as shown in columns (1)–(3). The effect on non-resident victims, on the other hand, is insignificant at any conventional level in most specifications. These estimates suggest that the overall change comes primarily from resident victims.

Table 5 Falsification test and channel analysis

5.4 Discussion

5.4.1 Falsification

An alternative way to check our finding that sex crime rates in MMs with low sexism increased relative to those with high sexism post-MeToo is to assess whether other crimes against women, which should not be influenced by the shift in public attention after MeToo, were also affected. In other words, finding a statistically significant effect on crimes that ought to be exogenous to sexism would invalidate our research design. Specifically, we investigate whether non-sexual UCR index crimes (homicide, robbery, aggravated assault, burglary, larceny theft, auto theft, and arson) against women meaningfully changed after MeToo across these areas. While some of these crimes, such as burglary and robbery, may accompany sexual offenses and are thus possibly affected by MeToo, we argue that such an effect would be second-order at best.

Results in panel (a) of Table 5 show that the change in non-sex crimes against women is imprecisely estimated and insignificant at any conventional level across the three specifications. This finding reassures us that the estimated change in the sex crime rate after MeToo is unlikely to be confounded by either concurrent policies or an overall increase in crimes against women in the low sexism areas. In addition, to ensure that we are making a correct inference about the statistical significance of the main finding, we conduct a test in the spirit of Bertrand et al. (2004), in which we permute the sexism indices among the 134 MMs and then re-run Eq. (1) to estimate the coefficient on the interaction term. The resulting distribution of \(\beta _1\) estimates based on 2000 permutations is illustrated in Appendix Fig. A4. It shows that our estimate of \(-\)0.5 in Table 3 ranks at about the 3.6th percentile, as only 72 of the 2000 placebo estimates are lower than the actual estimate.
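The logic of the permutation test can be sketched as follows. This is a simplified stand-in, not the procedure behind Appendix Fig. A4: instead of re-running Eq. (1), the hypothetical `permutation_test` regresses a simulated market-level change in the outcome on the sexism index, reshuffles the index across markets, and reports the share of placebo slopes below the actual one. All data and names here are assumptions.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def permutation_test(index, change, n_perm=2000, seed=0):
    """Share of placebo slopes that fall below the actual slope.

    The index is randomly reassigned across markets n_perm times, which
    breaks any true link between the index and the change in the outcome.
    """
    rng = np.random.default_rng(seed)
    actual = slope(index, change)
    placebo = np.array([
        slope(rng.permutation(index), change) for _ in range(n_perm)
    ])
    return actual, float(np.mean(placebo < actual))

# Simulated data (assumption): 134 markets with a true slope of -0.5.
rng = np.random.default_rng(1)
index = rng.normal(size=134)
change = -0.5 * index + rng.normal(scale=0.3, size=134)

actual, share_lower = permutation_test(index, change)
print(round(actual, 3), share_lower)

# The rank quoted in the text: 72 lower placebo estimates out of 2000.
print(72 / 2000)
```

A small share of placebo estimates below the actual one indicates that an effect of this size is unlikely to arise from a random assignment of sexism levels to markets.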

5.4.2 Increase in actual crimes or reporting?

The increased number of documented sex crimes in MMs lower in sexism could be caused by an increase either in the actual number of incidents or in reporting in these areas. The former cannot be ruled out because of a possible retaliation effect, which could result in more sex crimes against women. For example, if potential offenders, predominantly men, take a resentful attitude toward MeToo, which promotes gender equality and women’s rights, they might target women and commit more crimes against them in response to the movement. It is empirically challenging to differentiate the retaliation and reporting effects, because the real number of sex crimes is unknown and under-reporting is likely still pervasive even after MeToo.

To find some suggestive evidence, we first focus on two types of sex crimes that are less likely to be affected by the reporting effect: homicide and aggravated assault of women under sex crime circumstances. Homicide is arguably the crime type least likely to be under-reported, and aggravated assault usually leads to serious physical injury that needs immediate medical treatment. Panels (b) and (c) in Table 5 present the results for homicide, where panel (b) considers the homicide rate per 10 million population and panel (c) uses the count of homicides. All estimates are insignificant at any conventional level.Footnote 19 Since homicide is rare, panel (d) additionally checks aggravated assault and finds a similar result. In summary, there does not seem to be a notable difference between the low and high sexism areas after MeToo in sex crime types that are insusceptible to the reporting effect.

Table 6 Victims’ willingness to report after MeToo
Table 7 Victims’ willingness to report and comparison of sex crime rates before MeToo

Next, we investigate the reporting effect by directly examining the willingness to report a sex crime. Specifically, we consider the National Crime Victimization Survey (NCVS), which collects data on crime incidents from a nationally representative sample of approximately 49,000 to 77,400 households in the United States twice a year. Survey participants are asked screening questions about possible crimes, and positive responses are followed by additional questions, including whether the incident was reported to police. We extract all incidents related to sexual offenses that occurred between 2015 and 2019 (about 200 incidents per year), and then estimate a linear probability model and a logit model with a binary dependent variable that equals 1 if a victim reported to police.Footnote 20 Since the NCVS does not release the geographic identifiers of participants except for the census regions they live in (Northeast, Midwest, South, and West), we approximate the census region-level sexism index by computing the weighted average of the MM-level sexism indices, with the weights being the aggregated population of the MMs located in each of the four regions. The estimated sexism indices for the four regions are \(-\)0.76, 0.47, 1.18, and \(-\)0.89, respectively, which is roughly in line with the visual evidence displayed in Appendix Fig. A5. Therefore, the slope coefficient in this case captures the effect of sexist attitudes on the willingness to report after MeToo. Columns (1) and (3) in Table 6 document estimates of the two models with region dummies as well as dummies for incident years (2014–2019) and interview waves (2015Q1–2019Q4). Both estimates are significant at the conventional levels, and suggest that an interdecile decrease in the sexism index in a census region would lead to an increase of about 30 percentage points in the probability of reporting after MeToo. 
Estimates in columns (2) and (4) are still significant at the 5% level or better when both incident and victim controls are added, and indicate an increase of about 27 percentage points in the probability of reporting after MeToo if a region’s sexism index is one interdecile lower.Footnote 21 Compared with the pre-MeToo overall reporting rate, which is about 32%, the proportion of reporting has substantially increased after 2017Q4 in regions with a relatively lower sexism index, although we want to emphasize that such a sizable estimated effect should be interpreted as suggestive evidence of the reporting effect because it only reflects incidents related to sexual offenses that would be reported in the NCVS.Footnote 22 In Table 7, we further conduct a comparative analysis of sex crime reporting rates and sex crime rates across regions during the pre-MeToo period. Findings in columns (1)–(4) indicate no significant differences in sex crime reporting, and results in columns (5) and (6) demonstrate that the differences in sex crime rates among these regions are statistically insignificant.Footnote 23 This offers additional evidence that the shift in reporting behavior in low sexism areas occurred during the post-MeToo period.
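The region-level aggregation of the sexism index described above amounts to a population-weighted average within each census region. A minimal sketch follows, in which the helper `regional_index` and the three illustrative markets are hypothetical and do not reproduce the four regional values reported in the text.

```python
import numpy as np

def regional_index(index, population, region):
    """Population-weighted average of a market-level index within each region."""
    index = np.asarray(index, dtype=float)
    population = np.asarray(population, dtype=float)
    region = np.asarray(region)
    result = {}
    for r in np.unique(region).tolist():
        mask = region == r
        result[r] = float(np.average(index[mask], weights=population[mask]))
    return result

# Illustrative markets (assumption): two in the South, one in the West.
index = [1.0, 3.0, -1.0]
population = [1.0, 3.0, 2.0]
region = ["South", "South", "West"]

by_region = regional_index(index, population, region)
print(by_region)
```

Weighting by population means that large markets dominate the regional value, which matches the intent of approximating the attitudes faced by a typical resident of the region.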

Finally, we explore whether behavioral changes in law enforcement contribute to the increased number of sex crimes in the low sexism areas. Intuitively, local police agencies might take a more sympathetic attitude toward sex crime victims and become more responsive to reported incidents after MeToo, leading to more effective arrests. However, estimates in panel (e) of Table 5 suggest that the increase in sex crimes in the low sexism areas is unlikely to be caused by behavioral changes among police in these areas, as none of the estimates is statistically significant.

6 Conclusion

MeToo is a sweeping social movement that has exposed many high-profile cases of sexual misconduct committed by prominent men. Since it was sparked in 2017, this movement has shifted public attention to survivors of sexual violence and induced heated discussions on gender discrimination and women’s rights. Constructing a novel media market-level measure of sexism, this study sheds light on how local sexism affected sex crimes across areas in the United States in the MeToo era. In particular, we find that low sexism areas witnessed a larger increase in the documented sex crime rate than high sexism areas after MeToo. This result is robust even after including a host of controls that account for differences in demographics, socioeconomic factors, and political preference across areas. We further provide suggestive evidence that the relatively larger number of incidents documented in these areas should be attributed to reporting rather than to an increase in actual crimes.

We believe that our findings can contribute to the current debates on MeToo. Sexual violence and its under-reporting have been a social problem in the United States for a long time, and MeToo reveals how prevalent sexual harassment is and how traumatic its consequences can be to survivors, predominantly women. Our study confirms that MeToo has indeed empowered women to come forward, but mainly those living in a less hostile environment. Although it would be difficult to reverse such a hostile social norm in the high sexism areas within a short time, we can empower women through concrete actions such as supplementary legal assistance.

The conclusions in this study, of course, are subject to a few caveats. First, we cannot completely exclude the possibility that most of the additional documented incidents stem from changes in police recording. Besides designating a complaint as unfounded, police can also intentionally misclassify an incident as a less severe offense or, under certain extreme circumstances, ignore a complaint without any written record. If so, the relatively larger number of incidents in low sexism areas should be primarily attributed to changes in the recording behavior of police after MeToo. Second, although sexual harassment in the workplace is the focal point of MeToo, it is still severely under-reported due to the lack of evidence, and it is not clearly defined in the NIBRS data. Finally, police agencies in many large cities on the east and west coasts still do not report to the NIBRS. A comprehensive analysis requires filing information requests with these local police agencies individually to collect data on sexual offenses, and we leave this to future research.