Pricing risk in prostitution: Evidence from online sex ads

DeAngelo, Gregory; Shapiro, Jacob N.; Borowitz, Jeffrey; Cafarella, Michael; Ré, Christopher; Shiffman, Gary

doi:10.1007/s11166-019-09317-1

Published: 05 February 2020

Volume 59, pages 281–305 (2019)

Journal of Risk and Uncertainty

Abstract

The movement of many human interactions to the internet has led to massive volumes of text that contain high-value information about individual choices pertaining to risk and uncertainty. But unlocking these texts’ scientific value is challenging because online texts use slang and obfuscation, particularly so in areas of illicit behavior. Utilizing state-of-the-art techniques, we extract a range of variables from more than 30 million online ads for real-world sex over four years, data significantly larger than that previously developed. We establish prices in a common numeraire and study the correlates of pricing, focusing on risk. We show that there is a 15-19% price premium for services performed at a location of the buyer’s choosing (outcall). Examining how this premium varies across cities and service venues (i.e. incall vs. outcall) we show that most of the variation in prices is likely driven by supply-side decision making. We decompose the price premium into travel costs (75%) and the remainder that is strongly correlated with local violent crime risk. Finally, we show that sex workers demand compensating differentials for the risk that are on par with the very riskiest legal jobs; an hour spent with clients is valued at roughly $151 for incall services compared to an implied travel cost of $36/hour. These results show that offered prices in the online market for real-world sex are driven by the kinds of rational decision-making common to most pricing decisions and demonstrate the value of applying machine reading technologies to complex online text corpora.

1 Introduction

The movement of many human interactions to the internet has led to massive volumes of text that contain high-value information for social scientists. For example, online illicit sex markets have yielded tens of millions of sex provider advertisements and over one million customer reviews of those providers. These online texts describe prices, locations, personal characteristics, preferences about the commercial encounter, and other information that is useful for social science but otherwise difficult to obtain at a large scale.^{Footnote 1} Unfortunately, these texts are intended for individual human, not analytical, consumption: they are casually-written and contain nonstandard usage and slang. To date, preparing such data for statistical analysis has required substantial human annotation, with the concomitant expense and necessary reduction in data size.

Recent advances in computer science enable macroscopic analysis of such data at finer resolution than previously possible by extracting high-quality structured analysis-ready information from text and images with minimal human annotation. In this paper we employ the DeepDive system to create a structured database of facts recovered from human-written source texts in the online illicit sex market. DeepDive uses large-scale probabilistic inference in a user-enabled feedback loop, thereby avoiding problems common to most standard annotation techniques, such as reliance on brittle rules or fixed grammars. In a number of applications, DeepDive obtains accuracy that is similar to that of a human annotator (Callaway 2015; Peters et al. 2014). We are thus able to obtain a very large and high-quality database about subtle concepts, derived from an extremely messy collection of documents (Appendix Table A1 provides precision/recall figures).

Data on this market are relatively easy to access in small quantities because much of the activity happens in public web fora (e.g. www.backpage.com). The postings on these sites are text advertisements written in informal English, akin to classified ads, often accompanied by images. As with classified ads, there is informal broad agreement about the kind of information to provide (prices, locations, etc) but diversity in both mode of expression (slang, colloquial usage, nonstandard usage) and in exactly which data values are provided. Figure 1 displays a full example of an online advertisement as well as two examples of ad text to illustrate the linguistic challenges in this space. Our text collection contains information from almost 30 million text ads for sex services, scraped from 19 distinct websites between early-2011 and January 2016 by IST Research.

Little is known about the market for sex services, but there are many reasons to want to know more. The sex services market is a two-sided market where buyers aim to connect with providers of sex services and providers wish to offer their services. Given the illegal nature of the services, both the service provider and John are concerned with the possibility that they could be matching with a law enforcement agent. Additionally, service providers also face a risk of encountering a violent John.^{Footnote 2} In fact, small scale surveys have found that as many as two in three sex service providers have been assaulted by customers or pimps (Weitzer 1999).

Sex services were traditionally solicited in outdoor spaces, which resulted in the creation of red-light districts (Hubbard and Sanders 2003). In the presence of mounting social pressure and threat of arrest, sex service workers were largely relegated to specific locations in urban areas where illegal activities were more tolerated. However, the introduction of the internet fundamentally changed the nature of the initial interaction between clients (the demand side of the market) and service providers (the supply side). Instead of face-to-face interactions, the initial interactions between potential clients and service providers began when a client responded to an advertisement, which offered sex services. The movement to arranging services online versus through face-to-face interactions is thought to result in more safety for service providers (Bass 2015), as service providers can screen potentially violent clients.^{Footnote 3} Additionally, the movement to online advertisements enabled service providers to coordinate appointments and have more control over the location where services are to be performed, rather than waiting at specific outdoor locations or propositioning potential clients in public locations (e.g. bars, casinos). So, service providers can now determine whether they are willing to travel to a potential client’s location (outcall), whether the client must come to the location of the service provider’s choosing (incall) or if the service provider can accommodate either situation. Despite these differences in the arrangement of services, payment for services has largely remained unchanged. Cash is still the currency of such transactions, which is most often paid upon completion of services. Violence against the service provider is thought to be most likely to occur at the point of payment, creating a market for additional security.^{Footnote 4}

While product differentiation existed in this market prior to online platforms, it was more difficult for such differentiation to occur. Service providers could place advertisements in alternative weeklies (e.g. The Village Voice) or they could work with an escort agency, which would coordinate service providers and potential clients for a fee. The movement to online advertisements enabled sex service providers to become more entrepreneurial and independent, enabling them to keep a greater share of the proceeds from the services that they offered. The movement online also enabled further horizontal and vertical differentiation of services. Vertical product differentiation was enabled, as idiosyncratic preferences for services (e.g. massage, erotic massage, escort, BDSM, “girlfriend experience”, etc.) could be catered to and advertised across providers who are willing to perform such acts. Horizontal product differentiation was also further enabled, as the costs of advertising and searching for one’s ideal variety were both reduced through online platforms. Moreover, advertisements could link to review web sites that enabled a client to see testimonials of the service provider’s quality. Thus, a service provider could generate a reputation, which enabled the service provider to command higher rates for services performed (Cunningham and Kendall 2016).

The movement of sex service advertisements online also generated substantial new knowledge about the market for sex services, as measuring the supply of these services was nearly impossible before. Although the measure is crude, we can now count the number of sex service advertisements by city on a given day. Table 10.4 of Cunningham and Kendall (2011c), for example, reports that the average daily number of sex services advertisements on one platform per MSA population across 31 of the largest municipalities in the US ranges from 0.36 (Cleveland) to 18.34 (San Francisco). Unfortunately, comparably robust measures of the demand for sex services do not exist. In one of the only studies that attempts to estimate demand, Roe-Sepowitz et al. (2016) examine 15 large municipalities in the US. On average, the study finds that 1 out of every 20 males over the age of 18 in these jurisdictions was soliciting online sex ads.

From an academic perspective, the online market for sex is representative of the broader class of markets in which regulation and contract enforcement are decentralized because the underlying activity is illegal.^{Footnote 5} From a policy perspective, an increasingly large share of commercial sex transactions are coordinated through online markets (Cunningham and Kendall 2011b). The emergence of robust online markets for sex has been associated with a range of social ills including child prostitution (Hughes 2002; Mitchell et al. 2010), human trafficking (Latonero et al. 2011), and a drop in the average age of prostitutes (Cunningham and Kendall 2011b). At the same time, these markets may reduce transaction costs in the market for sex and enable better use of reputational mechanisms, both of which can be welfare enhancing for buyers and sellers (Cunningham and Kendall 2011b). Cunningham et al. (2018) also note that the introduction of online sex service clearinghouses (namely, the erotic services section of www.craigslist.com) significantly reduced female homicides. Using the data from these markets to better understand the commercial sex trade therefore has great potential.

Our analysis makes a concrete methodological contribution as well. Because only some online fora are well-structured, and text ads have nonstandard content that is difficult for traditional natural language processing (NLP) methods, past sex market researchers have used relatively small amounts of data.^{Footnote 6}

As a result of these relatively small data sizes, relevant statistics must be aggregated into coarse geographic or temporal regions in order to be statistically useful. For example, a traditional small nationwide sample might yield only a few advertisements from a given city, forcing the analyst to aggregate advertisements at a state level in order to retain a minimal number of counts in each aggregated group. In contrast, our extracted database is significantly larger than even the largest previous effort. We extract price/location tuples for 4.5M ads, of which 2.1M occurred in locations for which we have the full set of covariates.^{Footnote 7} Elements in this large and high-accuracy dataset do not have to be aggregated into very coarse groupings in order to retain statistical validity: the data is higher-resolution than past efforts. For example, there may no longer be any need to aggregate advertisements to the state level; many cities will retain sufficient counts. This high resolution data enables us to control for local-level variation in contextual factors (such as local wages or commute times) that would have been impossible with data aggregated at coarser levels.

Using these unique data we find that pricing in the market is broadly rational from an economic perspective. This is not surprising; previous survey-based research has shown that prostitutes charge a premium for risky behaviors and that the size of the premium is greater for more attractive sex workers (Gertler et al. 2005). Exploiting within-period/within-city variation in the pricing structure across different service venues shows that services performed at a location of the buyer’s choosing (so-called ‘outcall’) earn an estimated 18% price premium, approximately $23 more for an hour-long session, controlling for a wide range of factors.

There could be several reasons for this premium. On the supply side, allowing the buyer to choose the location entails both additional physical risk and additional travel time. On the demand side, customers may be willing to pay a premium to reduce their risk and travel time. To assess the magnitude of these sources of variation in pricing we compare how prices vary as the physical size of the MSA for which services are offered expands and as the rate of violent crime in an area changes. Critically, the difference in those correlations across incall services, i.e. service at the provider’s chosen location, and outcall services, i.e. services at the customer’s chosen location, provides a way to sort out supply from demand elasticities. We find that prices for incall services are uncorrelated with MSA size and violent crime rates once some basic controls are added. Prices for outcall services, however, are strongly positively correlated with MSA size, though they are not correlated with violent crime rates once MSA fixed effects are accounted for.

These results are consistent with the incall and outcall markets being relatively segmented. If incall and outcall were one market, then we should see prices moving in opposite directions regardless of whether supply is elastic, demand is elastic, or both. That is, women living in larger areas who do not want to travel should compensate men to come to them by offering lower prices, and they should charge more for travel. That we primarily observe movement in the outcall market across city size suggests that both supply and demand are fairly inelastic with respect to distance in the incall market but not in the outcall market.

The magnitude of the increase for outcall as city size increases indicates a price for providers’ travel time of $36 per extra hour of average commute time in a city, much smaller than the $151/hour mean price for incall time with a client, but much larger than the average female wage of $14/hour in our sample.^{Footnote 8} That difference is consistent with workers in this market demanding substantial compensation to make up for the distastefulness of time spent with clients, a compensating differential very large compared to the differentials that are easily measured.^{Footnote 9}

These results have a number of policy implications. Most importantly, improved labor market opportunities for women appear to change the composition of suppliers in the market but do not necessarily reduce the volume of activity, at least as proxied by ad postings. Secondly, the large difference between compensation required for travel and that for time spent with clients implies that many workers in the market would happily shift to other activities given the opportunity. Finally, with regard to some risks associated with prostitution (i.e. violence), it appears that sex workers advertising online may have sufficient market power to demand compensation for those risks in the outcall market, implying that the supply of workers in that market is inelastic.

The remainder of this paper proceeds as follows. Section 2 provides background on the online market for sex services. Section 3 outlines the technological innovations that enable this research. Section 4 briefly introduces the data. Section 5 analyzes the relationship between pricing and social conditions, including economic opportunities for potential providers. Section 6 concludes.

2 Background

This section provides basic background information on the online market for real-world sex.

2.1 Online ads for sex

Online advertisements for sex contain a wide variety of information, presumably whatever the seller deems necessary to drive demand for their services. The style of ads varies by website: some contain explicit language (e.g., “fetish friendly”, specific sex acts, or clearly stated prices per hour), while others use more veiled language that almost implies dating (e.g., “Gf services always offered :)”). Ad postings include varying levels of information including age, ethnicity, height, weight, build, hair and eye color, and measurements. Some advertisements include links to the provider’s reviews by their past clients. Almost all advertisements include images, most of which are sexually suggestive or explicit, and phone numbers that allow clients to contact the provider. Sometimes, ads include guidelines for conduct on the phone; examples include ‘no texting’ and ‘no foul language.’ The market is clearly segregated by provider gender and we focus on ads for services by women.

2.2 Ad sources

Online ad content in this market comes from three distinct sources. First, there are individual providers who post ads representing themselves and pay on a per-ad basis. Second, there are content aggregators which repost content from backpage.com or other sites (see, e.g., Escortphonelist.com) and make money by selling ad space on their websites. Third, there are spammers who post ads in many locations seeking to drive traffic to other websites on which they sell space to advertisers or goods to those who click through (e.g. ‘click here to see my pics’ or redirects to other websites requiring payment for services).^{Footnote 10} In our analysis we focus on the first type by filtering out duplicate ads and by restricting the sample to ads whose contact information is not reused too frequently.
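The filtering described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the field names `text` and `phone`, the text normalization, and the cutoff `max_ads_per_contact` are all assumptions made for illustration, and the toy ads are invented.

```python
from collections import Counter


def filter_provider_ads(ads, max_ads_per_contact=3):
    """Keep de-duplicated ads whose contact info is not reused too often.

    `ads` is a list of dicts with hypothetical keys "text" and "phone".
    `max_ads_per_contact` is an assumed cutoff, not a value from the paper.
    """
    # Step 1: drop duplicates (same whitespace/case-normalized text and phone).
    seen = set()
    unique = []
    for ad in ads:
        key = (" ".join(ad["text"].lower().split()), ad["phone"])
        if key not in seen:
            seen.add(key)
            unique.append(ad)

    # Step 2: count how many distinct ads share each contact number.
    per_contact = Counter(ad["phone"] for ad in unique)

    # Step 3: keep ads whose contact appears at most the allowed number of times.
    return [ad for ad in unique if per_contact[ad["phone"]] <= max_ads_per_contact]


# Invented toy ads: one provider posting twice, one heavily reused number.
ads = [
    {"text": "Sweet and discreet, 150/hr", "phone": "555-0100"},
    {"text": "sweet and  discreet, 150/HR", "phone": "555-0100"},  # duplicate
    {"text": "click here to see my pics", "phone": "555-0199"},
    {"text": "click here for more pics", "phone": "555-0199"},
    {"text": "see my pics now", "phone": "555-0199"},
]
kept = filter_provider_ads(ads, max_ads_per_contact=2)
```

With the assumed cutoff of 2, only the first ad survives: its duplicate is dropped, and the number reused across three distinct ads is treated as likely spam or aggregator content.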

2.3 Differences between sites and users

Content on the 19 sites scraped for this analysis is as varied as in any other online market along two dimensions. First, there is site-level variation. Some sites are very formal and employ a standardized format (e.g. theeroticreview.com), while others allow ad hoc posts similar to craigslist.com. The largest website in our sample, backpage.com, had almost 16.8M ads posted between 2013 and early-2016, while the smallest, myredbook.com, had only 35,400 during the same period. The share of ads with prices varied between 2.1% on several sites up to 80.0% of the 35,400 ads scraped from myredbook.com. Most websites covered a wide geographic area, more than 200 unique MSAs per site, but the share of ads with location information varied widely. On average 68% of the ads had extractable location information, but the rate by website varied from a high of 87% for cityvibe.com to a low of only 20% for myproviderguide.com.

Second, there is variation in ad content due to the decisions of providers about how much to disclose. Prior research shows that providers who offer more information command higher prices (Logan and Shah 2013). Including more information also entails greater risks from law enforcement. The more potentially identifying information a provider offers the easier it is for law enforcement to track them down. While this may be a minor concern for independent voluntary providers, those who are underage or operating in jurisdictions where police pursue prostitution face a clear tradeoff.

3 Technology

High quality extraction from free-form text is challenging because of the massive amount of linguistic variation possible. Standard text processing approaches such as regular expressions are quite brittle and dependent on small changes in the source text. NLP methods are often effective at discovering linguistic information about the text (e.g., parse trees), but do not alone solve the extraction challenge.

DeepDive (Zhang 2015) is a system for extracting relational databases from unstructured text. It is distinctive when compared to previous information extraction systems in its ability to obtain very high quality databases for a reasonable engineering cost.^{Footnote 11} The system ingests raw documents and emits a structured database.

The most important component of DeepDive is a novel developer framework that allows an engineer to reliably and rapidly improve extracted data quality, until the output database is as good as the downstream application requires. Internally, DeepDive includes a high-performance engine for statistical inference and learning, allowing it to handle data that is noisy and imprecise. It is enabled by a number of recent innovations in scalable machine learning and data management (Shin et al. 2015; Zhang and Re 2014; Recht et al. 2011). Figure 2 outlines the overall DeepDive processing pipeline.

As described in detail in Zhang (2015), the system entails a three-phase process. For each phase, the engineer writes a short piece of program code, usually in Python.

The candidate generator is an engineer-written function that is applied to each input document and yields candidate extractions. The goal of the candidate generator is to eliminate “obviously” wrong outputs (e.g., non-numeric prices). Its output should be high-recall, low-precision.
The extraction features are user-defined functions that are applied to each candidate emitted in the previous step. An extraction feature is intended to encode a user-understandable piece of evidence about each candidate, useful when deciding whether the candidate is a correct extraction or not. For example, does a candidate for price have a $ symbol to its left? Obtaining high accuracy often means using many high-quality extraction features. Unlike some statistical frameworks in which features are largely or entirely synthetic, all DeepDive features are designed to be comprehensible by humans to permit manual debugging.
The distant supervision rules provide a positive or negative label to some of the feature-enriched candidates. For example, there might be extremely unambiguous prices that can be safely labeled as correct extractions. Alternatively, 9-digit zip codes that begin with a non-zero are unambiguously not prices for most applications, and can be safely labeled as incorrect price extractions. Despite the inevitable labeling errors that such rules introduce, we have found this approach to be preferable to the time-consuming process of manually providing labels.
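To make the three phases concrete, the following Python sketch mimics them for price extraction. It does not use the DeepDive API; the function names, the feature set, and the positive labeling rule are hypothetical, and only the dollar-sign feature and the 9-digit negative rule are taken from the examples in the text.

```python
import re


def generate_candidates(text):
    """Candidate generator: every run of digits is a possible price.

    High recall, low precision: it only rules out non-numeric tokens.
    Returns (token, start_offset) pairs.
    """
    return [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]


def extract_features(text, candidate):
    """Extraction features: human-readable evidence about one candidate."""
    token, start = candidate
    # Look at a short window of text right after the number (assumed width).
    window = text[start + len(token): start + len(token) + 12].lower()
    return {
        "dollar_sign_left": start > 0 and text[start - 1] == "$",
        "hour_word_nearby": "hr" in window or "hour" in window,
        "num_digits": len(token),
    }


def distant_label(token, features):
    """Distant supervision: True (price), False (not a price), None (unlabeled)."""
    # Negative rule from the text: 9-digit numbers starting with a non-zero.
    if features["num_digits"] == 9 and not token.startswith("0"):
        return False
    # Positive rule (an assumption): "$" on the left and an hour word nearby.
    if features["dollar_sign_left"] and features["hour_word_nearby"]:
        return True
    return None


text = "I'm 25, $150/hr incall only, ref 941051234"  # invented ad text
candidates = generate_candidates(text)
labels = [distant_label(tok, extract_features(text, (tok, pos)))
          for tok, pos in candidates]
```

Here the age stays unlabeled, the dollar amount receives a positive label, and the 9-digit number receives a negative label. In a system like the one described, the unlabeled candidates are the ones whose status the statistical model must infer from their features.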

After applying the above three steps, DeepDive constructs a large factor graph model that creates a random boolean variable for each candidate. The system infers a probability for each candidate, then applies a threshold to each inferred probability to determine whether the extraction will be placed in the output database (e.g., a given candidate is determined to be a one-hour price if the inferred probability exceeds 0.75). In this paper, DeepDive ingests a corpus of unstructured sex ads, then produces a high-quality output database which is used for all of our social science analysis.
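The final thresholding step can be sketched in a few lines of Python. The probabilities below are made-up stand-ins for the output of the factor graph inference, which is not reproduced here; only the 0.75 cutoff comes from the text.

```python
def select_extractions(scored_candidates, threshold=0.75):
    """Keep candidates whose inferred probability exceeds the threshold."""
    return [cand for cand, prob in scored_candidates if prob > threshold]


# Invented (candidate, inferred probability) pairs for illustration only.
inferred = [("150", 0.97), ("25", 0.10), ("200", 0.75), ("941051234", 0.01)]
accepted = select_extractions(inferred)
```

Because the rule is "exceeds", a candidate sitting exactly at 0.75 is not accepted in this sketch; raising the threshold trades recall for precision in the output database.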

4 Data

We analyze two unique data sets in our analysis. The first data set is derived from content in nearly 30 million online ads for sex across 19 different websites. The scrape was conducted as part of the Defense Advanced Research Projects Agency’s (DARPA) Memex project by IST Research between early 2011 and January 2016. Of these data, 51.2% come from www.backpage.com, another 10.0% were scraped from sites which repost ads from www.backpage.com and other sources (including www.escortphonelist.com, www.escortsincollege.com, www.escortads.com, www.massagetroll.com, and www.escortsintheus.com), and the remainder come from various sections of www.craigslist.com which still hosted escort ads at the time of data collection (4.1%) as well as smaller focused sites such as www.cityxguide.com (2.9%), www.myproviderguide.com (2.3%), and a number of smaller sites.

The second data set contains information from approximately 1.1 million online reviews of sex services recorded at the web site www.theeroticreview.com, which was, until recently, the largest website hosting reviews of sex services. These data were scraped in March 2016. Since the website archives old content the reviews are from as early as 1998 through March 2016.

We describe each of these data sets in detail below.

4.1 Advertisement data

Each ad consists of a free-form text field, and — depending on the site — additional structured fields such as post date or location. When available, these fields are scraped using the HTML structure of the site and merged with other extractions. Most content, however, is only available in the free-form text. Scraped ads which had a specific service location were mapped to a Census Bureau Metropolitan Statistical Area (MSA), the smallest geographic unit for which reliable labor force data are available in time-series across the United States.

While the initial scrape contains 29.9 million online ads, our final data set is the subset of the initial data for which we could extract information on the full range of relevant covariates. Whether an ad is for incall, outcall or both services is extracted from all ads. Posting date information could not be accurately extracted from approximately 5 million ads. Location was unclear in another 13 million ads (i.e. the scraped text did not specify the locations for which services were offered). Prices charged by providers were missing or unclearly stated in another 8 million ads. Finally, 1.8 million ads with location information were for small towns that could not fit clearly within one of the Census MSA locations. Our final estimation data contain all 2.1M ads for which we have information on the full set of covariates identified below, that do not represent ads for providers working in massage parlors, and that are not posted at unrealistically high rates.

Table 1 provides summary statistics of the ad-level data for all ads that did not appear to be spam (Panel A), for the 2.1M ads that could be linked to an MSA for which we have the full set of covariates (Panel B), as well as the MSA/month-level (Panel C) and MSA-level (Panel D) covariates.

Table 1 Summary statistics - Ad data and MSAs

4.2 Review data

Review data from www.theeroticreview.com (TER) are broken into two components. First, each service provider has a top page where specific structured text about the provider can be obtained. For example, the provider’s age range, hair color, email, phone number, preferences for performing services at a location of their choice or the John’s choice, average performance rating, and appearance statistics can be found on this page. Second, on separate subpages the specific reviews from each John that reviewed the provider can be accessed. Each review contains a series of additional information about the provider, including appearance and performance ratings that are specific to that review, whether the service was an incall or outcall, type of intercourse, and precautions taken by the provider such as the use of a screening agency.

Table 2 provides summary statistics for the 1.1 million reviews from TER. Figure 3 shows the average incall price per hour for each of the 82 most populous MSAs in the United States. Circles are sized by ads per capita. Darker shading indicates higher median ad prices within the MSA.

Table 2 Summary statistics - reviews

4.3 Control variables

To analyze the relationship between prices in the ads and other factors we include a broad range of MSA-level covariates from the following sources:

Opportunity costs. We obtain measures of unemployment from the American Community Survey (ACS) data files for each MSA at an annual level. To assess travel time providers of outcall services can expect, we generate average commute (which is asked as a categorical variable) by weighting the midpoint of response times for each bucket of respondents by the proportion of respondents for that bucket.
Wages. To control for economic opportunity we generate a Bartik-style wage instrument that isolates variation in the MSA-level wage time-series due to national level industry trends (Bartik 1991). To generate our Bartik wages we compute average wages by gender, industry, and MSA using the 2000 census data. We then computed the share of employment by industry and MSA for each month of the sample period using the Bureau of Labor Statistics Quarterly Census of Employment and Wages (QCEW). We calculate instrumented wages by assuming that each job pays the estimated wage for the given industry and MSA from 2000. Because QCEW is a census of wages based on unemployment insurance reporting, we get a wage estimate that varies at the MSA month level. As a robustness check we also include data on monthly median rental prices.^{Footnote 12}
Law enforcement risk. We utilize data from the Law Enforcement Management and Administrative Statistics (LEMAS) databases to determine the number of full time law enforcement employees in each MSA.
Abuse risk. We use data from the Uniform Crime Reports (UCR) to determine the total annual violent crimes per year in a given MSA as a proxy for abuse risk. We provide a placebo test that the coefficient is not just proxying for overall crime by using the rate of property crime per capita. These data are both cross-validated with information in the National Incident Based Reporting System (NIBRS).
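Two of the constructions above, the average commute from categorical responses and the share-weighted wage measure, reduce to weighted averages. The Python sketch below follows the verbal description in the text rather than the authors' code; the bucket boundaries, industries, wages, and employment counts are invented, and the gender dimension of the wage data is omitted for brevity.

```python
def average_commute(buckets):
    """Weighted mean of bucket midpoints.

    `buckets` maps (low, high) commute-time ranges in minutes to the number
    of respondents in that range.
    """
    total = sum(buckets.values())
    return sum((low + high) / 2 * n for (low, high), n in buckets.items()) / total


def share_weighted_wage(base_wages, employment):
    """Average wage if every job paid its industry's fixed base-year wage.

    base_wages: industry -> base-year wage for this MSA (held fixed over time)
    employment: industry -> employment in this MSA for one month
    """
    total = sum(employment.values())
    return sum(base_wages[ind] * n / total for ind, n in employment.items())


# Invented survey buckets: 20 people at 0-10 min, 50 at 10-20, 30 at 20-40.
buckets = {(0, 10): 20, (10, 20): 50, (20, 40): 30}
commute = average_commute(buckets)  # (5*20 + 15*50 + 30*30) / 100 = 17.5

# Invented base-year wages and two months of industry employment.
base_wages = {"retail": 12.0, "health": 20.0}
wage_m1 = share_weighted_wage(base_wages, {"retail": 60, "health": 40})
wage_m2 = share_weighted_wage(base_wages, {"retail": 40, "health": 60})
```

Because the base-year wages are held fixed, the measure moves from month to month only through shifts in the industry mix of employment (from about 15.2 to about 16.8 in this toy example), which is the source of variation the wage instrument is meant to isolate.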

5 Results

Our core results focus on understanding how risk and labor market opportunities interact. We use straightforward multivariate regression to identify the conditional correlations between various risk factors and pricing.

5.1 Crime, travel time, and pricing

Different service locations expose sex workers to different risks (Spice 2007; Maticka-Tyndale et al. 1999; Taylor 2003). Workers in massage parlors are generally considered the safest from abusive customers. Workers who set the service location, i.e. “incall” are the next safest as they can and often do make sure the service takes place in a location with security. Workers offering services in a location of the customer’s choosing, i.e. “outcall,” are the most exposed to risk of abuse by customers or law enforcement stings.

Consistent with workers requiring compensation for risk, there is clear variation in prices by service location. Figure 4 clearly highlights this fact, plotting the distribution of prices for all 2.1M ads with prices which specify a service venue, do not appear to be spam, and are linked to an MSA for which we have the full set of covariates.^{Footnote 13} Mean and median prices are drastically lower for massage parlor ads than for other service locations. Within other venues both mean and median prices are statistically significantly lower among incall providers than outcall providers, but there is substantial overlap in the price distributions.

Of course, different service locations also require providers to engage in different levels of travel. In particular, providers offering outcall may have to spend more time traveling than those who provide service at a venue of their choosing (incall), and buyers face the opposite costs. We therefore decompose the price premium into two components, physical risk and travel time, by estimating how the relationship between pricing and service venue varies across a proxy for the physical size of the service area: average commute time in the MSA as measured by the American Community Survey.

Specifically, we estimate the following regression at the ad level, dropping all ads by providers who average more than 6 ads per day to remove spam from the sample:

$$ \begin{array}{@{}rcl@{}} P_{i,j,m} &=& \alpha + \beta_{1} \text{outcall}_{i,j} + \beta_{2} \text{both}_{i,j} + \beta_{3} \text{unclear}_{i,j} + \beta_{4} C_{j} + \beta_{5} (\text{outcall}_{i,j} \times C_{j}) \\ && + \beta_{6} (\text{both}_{i,j} \times C_{j}) + \beta_{7} (\text{unclear}_{i,j} \times C_{j}) + \tau_{m} + \mathbf{\Delta} X_{j,m} + w_{j} + \epsilon_{i,j,m}. \end{array} $$

(1)

Here ads are indexed by i and each occurs in an MSA j and month m. P_{i,j,m} is the price for an ad, and the first three variables in the regression are indicators for whether the ad lists outcall as a location, offers both incall and outcall, or is unclear as to the service venue, with incall as the reference category. C_j is the average commute time in MSA j. β₁ captures the price premium for outcall over incall in an MSA with zero commuting time, β₂ does so for ads offering both outcall and incall, and β₃ estimates the price premium for ads which are unclear as to the service venue. β₄ reports how much the price for incall varies as the average commute time in the MSA increases. β₅ through β₇ report how those costs change for other service venues. We include two fixed effects in all regressions: τ_m is a month fixed effect to capture any nationwide secular trends in pricing for sex services in online ads, and w_j is a website fixed effect to account for consistent differences in pricing across sites. X_{j, m} is a vector of MSA/month-level traits such as unemployment rate, number of ads posted, and number of unique providers posting ads, as well as MSA-level variables such as population and racial composition, all of which might be correlated with pricing. We cluster standard errors at the MSA level in all regressions to allow for within-locality correlations in the errors when assessing statistical significance.
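To make the structure of the estimating equation concrete, the following is a minimal sketch of the spam filter and of the venue and interaction regressors for a single ad. It is an illustration only, not the authors' code: the field names (`provider`, `venue`), the helper names `drop_spam` and `design_row`, and the choice to average ads over a fixed number of sample days `n_days` are all assumptions made for the example.

```python
from collections import Counter

# Venue labels are hypothetical; incall is treated as the reference category.
VENUES = ("incall", "outcall", "both", "unclear")


def drop_spam(ads, n_days, max_ads_per_day=6.0):
    """Keep ads whose provider averages at most max_ads_per_day over n_days.

    Averaging over a fixed window of n_days is an assumption of this sketch.
    """
    counts = Counter(ad["provider"] for ad in ads)
    return [ad for ad in ads if counts[ad["provider"]] / n_days <= max_ads_per_day]


def design_row(venue, commute_minutes):
    """Build the venue indicators, C_j, and venue-by-C_j interactions for one ad."""
    if venue not in VENUES:
        raise ValueError(f"unknown venue: {venue!r}")
    row = {"commute": commute_minutes}
    for v in VENUES[1:]:  # skip the incall reference category
        indicator = 1.0 if venue == v else 0.0
        row[v] = indicator
        row[f"{v}_x_commute"] = indicator * commute_minutes
    return row


# Toy data: one provider posting 7 ads in a single day, another posting 2.
toy_ads = [{"provider": "spam", "venue": "outcall"}] * 7 + [
    {"provider": "p1", "venue": "outcall"},
    {"provider": "p1", "venue": "incall"},
]
kept = drop_spam(toy_ads, n_days=1)
rows = [design_row(ad["venue"], 28.0) for ad in kept]
```

For an incall ad every indicator and interaction is zero, so its fitted price depends on commute time only through β₄; for an outcall ad the fitted price adds β₁ plus β₅ times the commute time. Fixed effects, covariates, and clustered standard errors are left out of this sketch.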

By looking at how much of the price premium for outcall comes from travel costs at different levels of the commute time variable we can assess the relative importance of risk vs. average commute time in pricing. The variation in pricing reflects potential differences in risks and differences in travel costs for both buyers (men) and sellers (women). To be clear, commute time reported in surveys is a proxy for travel time for the average outcall or incall event, but since there is no reliable way to measure the average travel time for an outcall service event we use it as a proxy. The relationship between the estimated coefficient in Table 3 and the value of time to providers (the actual quantity of interest) depends on the ratio of commute time to outcall travel time. If that number is substantially greater than 1, then we understate the value of time and therefore overstate the risk premium providers demand for outcall appointments.^{Footnote 14}

Table 3 Commute time and prices in the online market for sex

As Table 3 shows quite clearly, the majority of the outcall premium comes from travel time. Column 1 shows the simple differences in prices across service venues, and the price premium is quite clear, with outcall services commanding a $24 per act premium, roughly a 17% premium on the median incall price of $140. Column 2 adds a control for travel time and various MSA/month and MSA level covariates, showing that commute time is positively correlated with pricing, and that the MSA-level covariates do not affect the price premiums. Column 3 adds in the full set of interaction terms to see how incall prices and the outcall premium vary with MSA size. Incall prices are positively but statistically insignificantly correlated with average commute time.^{Footnote 15} However, each additional minute of average commute time predicts a statistically significant increase of $0.53 in the price of outcall services. At the 50th percentile of commute time, roughly 30 minutes, the outcall premium is $22.50. The estimates on the interaction terms are quite robust. They change little when we add an MSA fixed-effect (column 4) to account for all time-invariant characteristics of the MSAs or when we include Bartik (1991) instruments for male and female wages in the licit market as controls along with their interaction with commute time (column 5) to account for any correlations driven by secular trends in local labor markets that are not captured by controlling for unemployment rates.^{Footnote 16}

So what do these prices reveal about the components of pricing? In column 3, our preferred specification because it allows us to directly estimate the role of commute time, we examine the nonlinear effects of commute time on prices. At the 1st percentile of commute time, roughly 20 minutes, the outcall price premium is $18.3, of which roughly $11 comes from travel time. At the 50th percentile of commute time, roughly 28 minutes, the outcall premium is $22.5. And at the 99th percentile of commute time, roughly 38 minutes, the outcall premium is roughly $28. If we treat the distance-invariant outcall premium in column (3) as the risk component, then the additional risk from performing these services at the buyer’s chosen location accounts for roughly 34% of the outcall premium in the median-sized MSA.^{Footnote 17} Thus, travel time appears to be the main driver of outcall pricing but there is another component.
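The decomposition above can be reproduced approximately from two numbers reported in the text: the time-invariant outcall premium of $7.58 (Footnote 17) and the $0.53 per minute of average commute time. The sketch below is an illustration under the assumption that the premium is linear in commute time, as in column 3; the function name `outcall_premium` is ours, and because it uses rounded coefficients its totals differ from the reported figures by a few tens of cents.

```python
BASE_OUTCALL_PREMIUM = 7.58  # time-invariant outcall premium in dollars (Footnote 17)
PER_MINUTE = 0.53            # extra outcall price per minute of average commute time


def outcall_premium(commute_minutes):
    """Split the implied outcall premium into travel and risk components."""
    travel = PER_MINUTE * commute_minutes
    total = BASE_OUTCALL_PREMIUM + travel
    return {
        "travel": travel,
        "risk": BASE_OUTCALL_PREMIUM,
        "total": total,
        "risk_share": BASE_OUTCALL_PREMIUM / total,
    }


# Commute times at the 1st, 50th, and 99th percentiles, as reported in the text.
for label, minutes in (("1st", 20), ("50th", 28), ("99th", 38)):
    parts = outcall_premium(minutes)
    print(
        f"{label} percentile ({minutes} min): total ${parts['total']:.2f}, "
        f"travel ${parts['travel']:.2f}, risk share {parts['risk_share']:.0%}"
    )
```

At the median commute time the risk share comes out at about one third, in line with the roughly 34% reported in the text.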

To further assess whether risk is indeed driving price we estimate a series of regressions adding in interactions of service venue with measures of violent crime rates, which we believe proxy for risk to providers and buyers, as well as property crime rates, which are arguably less correlated with risk to providers. As before we show the regression with controls and with MSA fixed-effects and cluster standard errors at the MSA level. If risk is driving a large share of the outcall premium and if incall providers can do more to shield themselves from abuse risk, then we should see that outcall prices are positively correlated with violent crime rates, but incall prices are not. We present the regressions with the interactions of the different types of crime rates with service venues separately (columns 1 and 2 without MSA fixed-effects and columns 4 and 5 with MSA fixed-effects) and jointly (columns 3 and 6) so that we are estimating the correlations with violent crime rates conditional on property crime rates and vice versa.

Table 4 shows that prices for outcall services and for services offered in either venue are positively correlated with violent crime rates when controlling for property crime rates (column 3), but the correlation is not robust to controlling for MSA fixed-effects (columns 4 and 6). Conditional on violent crime rates, property crime rates appear to be negatively correlated with pricing for outcall services (columns 3 and 6). The magnitude of the relationship is modest. A one standard deviation increase in the number of violent crimes (roughly 310 per 100,000 people per year) predicts a $4.23 (95% CI of $0.67 to $7.8) increase in the outcall premium (using the estimates from column 3). This represents a 0.064 standardized treatment effect and a 2.8% increase relative to the mean incall price of $151/hour. There is no similar positive relationship with property crime rates.
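As a quick check on the magnitudes, the dollar effect and its confidence bounds can be expressed as a share of the mean incall price. This is a small arithmetic sketch using only the figures quoted above; the helper name `pct_of_mean` is ours.

```python
SD_EFFECT = 4.23              # dollar change in outcall premium per 1 SD of violent crime
CI_LOW, CI_HIGH = 0.67, 7.80  # 95% confidence bounds in dollars
MEAN_INCALL_PRICE = 151.0     # mean incall price in dollars per hour


def pct_of_mean(dollars, mean_price=MEAN_INCALL_PRICE):
    """Express a dollar amount as a percentage of the mean incall price."""
    return 100.0 * dollars / mean_price


point = pct_of_mean(SD_EFFECT)
low, high = pct_of_mean(CI_LOW), pct_of_mean(CI_HIGH)
print(f"{point:.1f}% of the mean incall price (95% CI {low:.1f}% to {high:.1f}%)")
```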

Table 4 Commute time and prices in the online market for sex

These results are consistent with the incall and outcall markets being segmented. If incall/outcall were one market then we should see prices moving in opposite directions across the markets, that is as conditions favor incall services outcall pricing should drop and vice versa. With respect to commute time, for example, women living in larger areas who do not want to travel (or who live in high-crime areas and fear crime) should pay men to come to them by offering lower prices and they should charge more for travel. That we primarily observe movement in the outcall market across crime rates and city size suggests supply and demand are fairly inelastic with respect to distance in the incall market but not in the outcall market.

Under the assumption that the markets are segmented, these regressions enable one more decomposition of interest. Sex work is generally considered a distasteful job for which workers require compensation well above the market wages they could earn in other occupations (Rao et al. 2003). The premiums that workers in so-called ‘dirty’ jobs receive above what similarly skilled people earn in less unpleasant occupations are known as compensating differentials and can take the form of job amenities (e.g. more time off, flexible hours, etc.) or higher wages (Viscusi 1993; Lavetti 2015).^{Footnote 18} Using our data we can break the cost of a session into the travel time component, for which the worker should require no special compensation as it is equivalent to the individual’s ‘normal’ work options, and the service time component, for which they would expect special compensation. Assuming that average travel time for an average outcall is similar to the average commute time, then for a median-sized city a provider will charge roughly $18 for 30 minutes spent in the car, implying an hourly wage of $36/hour, which is small compared to the $151 mean price for an hour spent with the client in incall services. This is a massive compensating differential, implying either that sex workers are in great demand or that the occupation is quite distasteful.^{Footnote 19}
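The arithmetic behind this comparison is short enough to write out. The sketch below uses only the figures quoted above ($18 for 30 minutes of travel and a $151 mean hourly incall price); the helper name `hourly_rate` is ours, and treating commute time as the travel time for an outcall is the same assumption made in the text.

```python
def hourly_rate(dollars, minutes):
    """Convert a charge for a given number of minutes into dollars per hour."""
    return dollars * 60.0 / minutes


travel_hourly = hourly_rate(18.0, 30.0)        # implied value of travel time
service_hourly = 151.0                         # mean price per hour of incall service
differential = service_hourly - travel_hourly  # compensating differential per hour
ratio = service_hourly / travel_hourly         # service time relative to travel time

print(f"travel: ${travel_hourly:.0f}/hour, differential: ${differential:.0f}/hour, "
      f"ratio: {ratio:.1f}x")
```

The implied differential is $115 per hour, and service time is valued at a little over four times travel time.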

As an additional robustness check we include a number of other time-varying controls measured at the MSA-month level in Appendix Table A2.^{Footnote 20} Column 1 presents the core estimating equation from Table 3 Column 4 to enable easy comparison. Column 2 adds in controls for the number of ads per capita in that MSA-month to control for the possibility that changes in competition in the ad space differentially affect prices in the outcall vs. incall market. Column 3 adds in linear, quadratic, and cubic terms in the number of providers advertising in the MSA-month for the same reason. Column 4 accounts for the concentration of advertising in the MSA by including a term for the Hirschman-Herfindahl Index (HHI) of the number of providers in the MSA-month. Column 5 includes fixed-effects at the MSA-year level and clusters standard errors at the same level. Column 6 adds controls for the median rental price in the MSA-month to better control for local economic conditions for the subset of 170 MSAs where that variable is available. Column 7 includes all controls for the same subset. Column 8 restricts our core estimating approach to (a) ads that are priced between the 5th and 95th percentile of pricing in all ads and (b) MSA-months that are between the 5th and 95th percentiles in terms of ads per capita. In all cases the core estimate of the interaction term between outcall prices and commute time remains statistically strong and substantively large. In addition, we estimate models controlling for rates of prostitution arrests and sex offenses measured at the MSA-month level using UCR data. Although none of the results change with these controls, we do not include them in the table because of the significant reporting biases that may be present in using reported crime to measure sexual assault risk and arrest risk.^{Footnote 21}
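For readers unfamiliar with the concentration measure in Column 4, the HHI is the sum of squared market shares. The sketch below assumes that a provider's share is its fraction of all ads posted in the MSA-month; the text does not spell out how shares are defined, so this definition and the function name `hhi` are assumptions made for illustration.

```python
from collections import Counter


def hhi(ad_providers):
    """Sum of squared ad shares across providers in one MSA-month.

    ad_providers holds one provider identifier per ad. The result ranges from
    1/n for n equally sized providers up to 1.0 for a single provider.
    """
    counts = Counter(ad_providers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one ad to compute concentration")
    return sum((n / total) ** 2 for n in counts.values())


# One provider with half the ads and two with a quarter each.
example = hhi(["a", "a", "b", "c"])
```

Higher values indicate that a few providers account for most of the advertising in that MSA-month.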

To account for potentially differential trends across MSAs, Appendix Table A3 compares our baseline specification (Column 1) to a model that includes MSA-specific linear time trends (Column 2) and one with quadratic time trends (Column 3). In both cases the core results remain substantively unchanged. Overall the results are quite stable once standard two-way fixed effects are included.

5.2 Controlling for provider characteristics

Our core data lack detailed information on provider traits such as appearance or specific sex acts offered. It is possible, though unlikely, that such traits correlate with MSA size and service location in a way that generates spurious results. As an additional robustness check we re-run the core analysis focusing on prices for providers listed in The Erotic Review, which provides detailed data on individual providers as described in Section 4.2. These data represent the subset of providers advertising online who have chosen to pay for a page on the review site, presumably because doing so enables them to establish a stronger reputation with customers, and their average pricing is significantly higher than in the overall sample. If we see a similar outcall premium in this subset, where we can control for traits that are unobservable in the main data, that should provide greater confidence in the core estimates reported in Tables 3 and 4.

Specifically, we estimate the premium associated with offering either outcall or both outcall and incall on the average price charged by a specific service provider:

$$ \begin{array}{@{}rcl@{}} P_{i,j,m} &=& \alpha + \beta_{1} \text{outcall}_{i} + \beta_{2} \text{both}_{i,j} + \beta_{3} \text{App}_{i} + \mathbf{R}\text{Race}_{i} + \mathbf{D}\text{Desc}_{i} + \mathbf{A}\text{Act}_{i} + \beta_{4} \text{Perf}_{i} \\ && + \tau_{m} + \gamma_{j} + \epsilon_{i,j,m}. \end{array} $$

(2)

Here providers are indexed by i and each occurs in an MSA j and month m. P_{i,j,m} is the price for a review and the first two variables in the regression are indicators for whether the provider offers outcall or both incall and outcall services. β₁ captures the price premium for outcall over incall services, while β₂ does so for providers offering both outcall and incall services. A series of additional controls are included that account for specific features of the provider. β₃ captures the effect of a provider’s appearance, which is measured on a 1-10 scale. The vector R identifies the impact of the race of the provider, which is measured with a vector of indicator variables for black, Asian, Hispanic, or other non-white categories. D captures the price premia associated with a vector of variables describing the provider’s appearance and build, which includes indicators for the provider’s build (e.g. “average” or “thin”), tattoos (e.g. “a few” or “many”), breast appearance (e.g. “average” or “perky”), and breast implants (“yes”, “no”, or “don’t know”). A identifies the price premium associated with specific acts performed by the provider, which include indicators for oral sex, oral sex without a condom, anal sex, or multiple orgasms per session. β₄ captures the price premium associated with the provider’s performance rating, which is determined on a 1-10 scale. We also include city (γ_j) and month (τ_m) fixed effects, in addition to a provider-specific idiosyncratic error term. As before we cluster standard errors at the MSA level.

As Table 5 indicates, there is a considerable outcall premium. This premium is identifiable regardless of whether the provider offers outcall services exclusively or is willing to provide services at either the client’s location or the provider’s location. Interpreting column (1), we note that relative to providers that exclusively provide incall services, those that offer outcall or both incall and outcall services receive a price premium of approximately $57 and $50, respectively, above the average incall price of $242.50. In columns (2) - (7) we gradually increase the number of controls that are included in the estimation. In our most saturated model, column (7), the price premium is considerably smaller than in our most naive specification (approximately half the size), but a sizable and statistically significant effect remains. In effect, after controlling for both observable and a wide range of normally unobservable features of providers, service providers that offer higher-risk services by traveling to the client’s location receive approximately $20-$25 more per session, which translates into an approximate 10% price premium over providers that exclusively offer incall services.

Table 5 Provider traits and prices in the online market for sex

6 Conclusion

Women selling sex services online appear to engage in rational pricing behavior. The risk associated with traveling to the client is rewarded with a significant price premium that goes beyond the costs associated with commuting. The costs associated with abuse risk borne by direct involvement with the client at a location of the client’s choosing appear to be compensated. Most importantly, the magnitude of the compensating differential workers in this market demand for service time compared to travel time, roughly $115, is a significant share of the $151/hour mean price for incall time with a client.

These results imply that many workers in the market would happily shift to other activities given the opportunity and that sex workers advertising online for outcall services have sufficient market power to demand compensation for risks, suggesting the labor supply for this market is fairly inelastic.

More broadly our analysis demonstrates there is great potential for learning about behavior in online markets by combining cutting-edge machine reading technology with established econometric approaches. Importantly, though, if pricing is broadly rational then pricing anomalies should be detectable. Organizations involved in trafficking women for sex work are unlikely to internalize labor market conditions in the same way that voluntary providers do, and so are likely to shift prices in different ways. Such price-based anomaly detection is an important avenue for future research.

Notes

1. Other arenas of human interaction producing massive corpora of online text include weapons sales, labor markets, and small-cap stock fraud, among others.
2. As noted in Potterat et al. (2004), the estimated female homicide rate is 204 per 100,000, which is 51 times higher than the second most dangerous occupation for females (liquor store employee).
3. A digital footprint of the arrangements of the services to be provided, which can include date, time, location, services to be performed and cost of services, is also generated when arrangements are made by email, which is also thought to deter violent clients.
4. Dank et al. (2014) provide a very thorough review of market structure and size estimates for eight major American cities.
5. Such markets include gray markets devoted to goods that are not illegal, but rather are sold secondhand not through the original manufacturer, as well as black markets in illegal goods.
6. Cunningham and Kendall (2011a) use data only from www.craigslist.com for select cities, while Cunningham and Shah (2014) use information from www.theeroticreview.com as well as local advertisements in newspapers in Rhode Island. Edlund et al. (2009) analyze 40,000 posts from one high-end escort site. Delap (2014) uses data on 190,000 profiles of escorts from a review site where escorts pay to post profiles and are asked to fill in specific fields when setting up their site, making the scraping and analysis relatively straightforward. Moffatt and Peters (2004) use data on 998 complete reports filed by users on one website between January 1999 and July 2000.
7. The full corpus is just under 30M ads, but many lack content on price or define a geographic area that cannot be reliably matched to a region for which other measures are available (i.e. crime rates or employment statistics).
8. The 95% confidence interval on the estimated premium ranges from $11 to $73.
9. Lavetti (2015) estimates slightly smaller compensating differentials in fisheries that have exceptionally high annual mortality risks.
10. Content aggregators account for roughly 38% of the ads in our corpus. Sometimes these sites include old ads not still present on other sites and so we include them in our collection.
11. The output database has high precision (i.e., it contains facts that very accurately reflect the source documents) and high recall (i.e., most of the facts in the source documents appear in the output database).
12. We thank an anonymous reviewer for suggesting this control.
13. Although ad content is available through 2016, we are constrained by the availability of MSA-level crime information from the FBI’s Uniform Crime Reports and other covariates from the American Community Survey.
14. We thank one of our anonymous reviewers for highlighting this important point.
15. The confidence interval for the relationship between incall pricing and commute time ranges from -9.4 to 62 at the 99th percentile of commute time.
16. We additionally ensure the results do not mask a non-linear relationship between pricing and travel time by estimating a model in which we interact indicator variables for each decile of commute time with indicators for the different service venues (incall, outcall, both, unclear). This approach allows for the possibility that the outcall premium is flat until very high levels of commute time and other non-linearities. We find that both the outcall premium and the premium for those offering both incall and outcall are generally increasing in commute time. Results available on request.
17. The time-invariant outcall premium in Column 3 is $7.58, which is 34% of the $22.50 outcall premium at the median travel distance.
18. There is also within-job variation in wages in response to risk. Prostitutes in Mexico, for example, earn a substantial compensating differential for unprotected sex (Rao et al. 2003).
19. As a point of comparison, Lavetti (2015) estimates that fishermen in Alaska (a job carrying an annual risk of death around 1%) earn wages around 4 times higher than in other jobs for which individuals in his sample are qualified, which is comparable to our estimate of about 4 times higher wages here.
20. We thank our anonymous reviewers for these suggestions.
21. Results available from authors.

References

Bartik, TJ. (1991). Who benefits from state and local economic development policies? Books from Upjohn Press, W.E. Upjohn Institute for Employment Research. https://ideas.repec.org/b/upj/ubooks/wbsle.html.
Bass, A. (2015). Getting screwed: Sex workers and the law. University Press of New England.
Callaway, E. (2015). Computers read the fossil record. Nature, 523(7558), 115–116.
Cunningham, S, DeAngelo, G, Tripp, J. (2018). Craigslist’s effect on violence against women. Working Paper.
Cunningham, S, & Shah, M. (2014). Decriminalizing Indoor Prostitution: Implications for Sexual Violence and Public Health. Working Paper 20281, National Bureau of Economic Research. http://www.nber.org/papers/w20281.
Cunningham, S, & Kendall, TD. (2011a). Men in transit and prostitution: Using political conventions as a natural experiment. The B.E. Journal of Economic Analysis & Policy 11(1).
Cunningham, S, & Kendall, TD. (2011b). Prostitution 2.0: the changing face of sex work. Journal of Urban Economics, 69(3), 273–287.
Cunningham, S, & Kendall, TD. (2011c). Prostitution, technology and the law: New data and directions. Edward Elgar Publishing Limited chapter Research Handbook in the Law and Economics of the Family.
Cunningham, S, & Kendall, TD. (2016). Examining the role of client reviews and reputation within online prostitution. In Cunningham, S, & Shah, M (Eds.) The Oxford handbook of the economics of prostitution. Oxford University Press, Chapter 2.
Dank, M, Khan, B, Downey, PM, Kotonias, C, Mayer, D, Owens, C, Pacifici, L, Yu, L. (2014). Estimating the size and structure of the underground commercial sex economy in eight major US cities. Urban Institute.
Delap, J. (2014). More bang for your buck. The Economist, August 9.
Edlund, L, Engelberg, J, Parsons, CA. (2009). The wages of sin. Economics Discussion Paper 0809-16 Columbia University.
Gertler, P, Shah, M, Bertozzi, SM. (2005). Risky business: The market for unprotected commercial sex. Journal of Political Economy, 113(3), 518–550.
Hubbard, P, & Sanders, T. (2003). Making space for sex work: Female street prostitution and the production of urban space. International Journal of Urban and Regional Research.
Hughes, DM. (2002). The use of new communications and information technologies for sexual exploitation of women and children. Hastings Women’s Law Journal, 13, 129–148.
Latonero, M, Berhane, G, Hernandez, A, Mohebi, T, Movius, L. (2011). Human trafficking online: The role of social networking sites and online classifieds. Report USC Annenberg Center on Communication Leadership & Policy.
Lavetti, K. (2015). Estimating preferences in hedonic wage models: Lessons from the Deadliest Catch. Working paper, Ohio State University.
Logan, TD, & Shah, M. (2013). Face value: Information and signaling in an illegal market. Southern Economic Journal, 79(3), 529–564.
Maticka-Tyndale, E, Lewis, J, Clark, JP, Zubick, J, Young, S. (1999). Social and cultural vulnerability to sexually transmitted infection: The work of exotic dancers. Canadian Journal of Public Health, 90(1), 19.
Mitchell, KJ, Finkelhor, D, Jones, LM, Wolak, J. (2010). Use of social networking sites in online sex crime against minors: An examination of national incidence and mean of utilization. Journal of Adolescent Health, 47(2), 183–90.
Moffatt, PG, & Peters, SA. (2004). Pricing personal services: An empirical study of earnings in the UK prostitution industry. Scottish Journal of Political Economy, 51(5), 675–690.
Peters, SE, Zhang, C, Livny, M, Ré, C. (2014). A machine reading system for assembling synthetic paleontological databases. PLoS ONE.
Potterat, JJ, Brewer, DD, Muth, SQ, Rothenberg, RB, Woodhouse, DE, Muth, JB, Stites, HK, Brody, S. (2004). Mortality in a long-term open cohort of prostitute women. American Journal of Epidemiology, 159(8), 778–785.
Rao, V, Gupta, I, Lokshin, M, Jana, S. (2003). Sex workers and the cost of safe sex: The compensating differential for condom use among Calcutta prostitutes. Journal of Development Economics, 71(2), 585–603.
Recht, B, Re, C, Wright, SJ, Niu, F. (2011). Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, Granada, Spain (pp. 693–701).
Roe-Sepowitz, D, Hickle, K, Gallagher, J, Smith, J. (2016). Invisible offenders: A study estimating online sex customers. Journal of Human Trafficking: 272–280.
Shin, J, Wu, S, Wang, F, De Sa, C, Zhang, C, Ré, C. (2015). Incremental knowledge base construction using DeepDive. PVLDB, 8(11), 1310–1321.
Spice, W. (2007). Management of sex workers and other high-risk groups. Occupational Medicine, 57(5), 322–328.
Taylor, D. (2003). Sex for sale: New challenges and new dangers for women working on and off the streets. Mainliners.
Viscusi, WK. (1993). The value of risks to life and health. Journal of Economic Literature, 31, 1912–1946.
Weitzer, R. (1999). Prostitution control in america: Rethinking public policy. Crime, Law and Social Change 32(1).
Zhang, C. (2015). DeepDive: A data management system for automatic knowledge base construction. PhD thesis, University of Wisconsin-Madison.
Zhang, C, & Re, C. (2014). Dimmwitted: A study of main-memory statistical analytics. PVLDB, 7(12), 1283–1294.

Author information

Authors and Affiliations

Claremont Graduate University, Claremont, CA, 91711, USA
Gregory DeAngelo
Princeton University, Princeton, NJ, 08544, USA
Jacob N. Shapiro
Sam Nunn School of International Affairs, Georgia Institute of Technology, 781 Marietta, NW, 30332, USA
Jeffrey Borowitz
University of Michigan, Ann Arbor, MI, 48109, USA
Michael Cafarella
Stanford University, Stanford, CA, 94305, USA
Christopher Ré
Giant Oak, Arlington, VA, 22201, USA
Gary Shiffman

Corresponding author

Correspondence to Jacob N. Shapiro.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We thank our anonymous reviewers and participants in seminars at Harvard University and Princeton University for helpful comments and feedback. All errors are our own. This research was supported in part by DARPA. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA or Giant Oak.

About this article

Cite this article

DeAngelo, G., Shapiro, J.N., Borowitz, J. et al. Pricing risk in prostitution: Evidence from online sex ads. J Risk Uncertain 59, 281–305 (2019). https://doi.org/10.1007/s11166-019-09317-1

Published: 05 February 2020
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11166-019-09317-1
