1 Introduction

This paper reports on a series of technical activities undertaken with the long-term goal of securing workers’ rights and building worker power in digital labor platforms. Centered on the development of a rating system for digital labor platforms, and of a website on which to display those ratings, the work was led by two practitioners (the authors) who have backgrounds in academic HCI and CSCW research but who were positioned, at the time of the work, as an independent contractor and a union employee.

The relations between technology, work organization, worker power, workers’ rights, and workers’ experience of work have long been central concerns of CSCW. European CSCW research, especially, has a tradition of close collaboration with workers and trade unionists in which researchers aim to develop technologies and work processes that increase workplace democracy (see Part 2.2 for references). This paper contributes a practitioner perspective on this theme in a new context: the (sometimes global) labor markets enabled by digital labor platforms.

Neither a traditional empirical research study oriented around research questions, nor a systems building-and-evaluation study, this paper instead offers a detailed case study of CSCW praxis in this contemporary context. This work fits into a broad landscape of efforts among computing researchers, policy makers, social scientists, legal scholars, and trade unionists to understand the scope and nature of digital labor platforms, and to safeguard worker rights and build worker power in these new digitally constituted workplaces.

The paper offers two main contributions for the CSCW community. First, it shows, through a detailed case, that a human-centered computing perspective can be a resource for practitioner activities and contemporary policy debates about “the future of work,” and can have a positive impact on policy, practice, and public discourse. Second, it offers a set of artifacts and a description of the rationales behind their development: a survey used to collect information about platform workers’ experiences; a method used to calculate quantitative ratings based on workers’ answers to the survey questions; and a set of seventeen additional criteria for evaluating work processes on digital labor platforms from a workers’ rights perspective. These artifacts are not “the last word” on their respective topics but rather first or second efforts – prototypes. They have limitations and deficits that reflect the practical constraints in which they were developed. We present them here as an open invitation for others to further develop, critique, revise, adapt, and reuse.

The remainder of the paper proceeds as follows. Part 2 describes research within and outside of HCI, CSCW, and human computation on workers’ experiences and working conditions in digital labor platforms. Part 3 describes the questions, goals, and approach taken in the applied CSCW work described in the paper. Part 4 describes the redesign of faircrowdwork.org, with a focus on the development of a quantitative rating system for working conditions, including a reflection on the challenges of bringing a CSCW approach to an institution with its own ideas about what good “UX” and good “design” should look like. Drawing on prior HCI and CSCW scholarship, this section is of direct relevance to members of the CSCW community who are interested in how scholarship might gain traction in practical settings. This section of the paper concludes with a description of a supplementary scheme, still under development, for rating platform Terms of Service from a workers’ rights perspective. In part 5, we reflect on some key takeaways for a CSCW audience. We discuss the possibilities for “generative friction” arising from a committed CSCW praxis in an institutional setting. We reflect on the role of technical artifacts in facilitating public discussion, linking our practical work to existing academic conversations around the broadening of concepts such as use and interaction. We then highlight the need for CSCW expertise in policy conversations and public understandings about platform regulation and ‘automated management systems’ more broadly. Part 6 concludes with some further reflections on the possibilities and limits of technical systems development in the ongoing efforts to safeguard worker rights and build worker power. Drawing on our experiences both in this project and in our careers more generally, we reflect on what future work seems most pressing for those engaged in digital labor research and design.

2 Related work

2.1 Digital labor platforms and the future of worker rights

Digital labor platforms have become relatively widespread in society in a relatively short time. Uber was founded in 2009 but already in some countries “calling an Uber” is just as common as “googling”—and more common than “calling a taxi.” A wide variety of other platforms match customers and workers for a growing number of types of work; indeed today, nearly any kind of work that can be assigned and delivered online can be outsourced via a platform. “Microtask” platforms such as CrowdFlower, Spare5, and Scale API specialize in distributing small, ostensibly “low skill” work such as data entry and data processing. “Freelance” platforms such as Upwork and Freelancer.com distribute larger tasks or projects, including tasks in “high skill,” typically more highly-paid fields such as design, project management, and engineering. And a variety of platforms distribute different types of in-person work. TaskRabbit and Helpling, for example, are platforms for domestic work; Uber is a platform for transportation; and Ohlala is a platform for sex work.

As early as 2015, the World Bank was promoting freelance and microtask platforms as opportunities for developing countries to connect their citizens to global labor markets and fuel economic development (Kuek et al. 2015). Some governments have heeded this call and developed extensive programs to encourage their citizens to join freelance and microtask platforms (see e.g., Graham et al. 2017). Microtask platforms have become an integral part of global supply chains for scientific research (e.g., Bohannon 2016; Chandler et al. 2014; Gleibs 2017; Haug 2017; Marder and Fritz 2015) and artificial intelligence, including autonomous driving applications (e.g., Bradshaw 2017; Stewart 2017). Generally speaking, digital labor platforms appear to be a permanent fixture of a “new world of work” that is flexible, digital, and globally networked.

Yet this future of work poses challenges for worker rights; as a result, interest in safeguarding worker rights and building worker power in digital labor platforms has grown in recent years in Europe and North America. Uber, in particular, has become a legal battleground and a constant presence in discussions of “the future of work.” Responses to the company’s aggressive expansion have included protests, strikes, boycotts, lawsuits, and judicial reviews in Europe and North America (e.g., Chrisafis 2016; Court of Justice of the European Union 2017; Garden City Group n.d.; Hickey 2016; Mr Y Aslam, Mr J Farrar and Others -V- Uber 2016; Osborne 2016; Siddiqui 2017).

Trade unions and employer associations are heavily involved in European “future of work” debates, which address general trends such as “digitalization,” automation, the introduction of industrial robots, and networked manufacturing (“Industry 4.0” in Europe); specific technologies and applications such as autonomous driving and algorithmic personnel management; and new modes of work organization such as crowdsourcing and digital labor platforms. The EU-funded European Foundation for the Improvement of Living and Working Conditions (Eurofound), for example, published a report on “new forms of employment” with a chapter on “crowd employment” (Mandl et al. 2015) and a study of legal issues around Uber in Europe (Adam et al. 2016), and in 2018–19 will maintain an online resource to collect policy-relevant research on labor platforms. In 2016 the Comparative Labor Law and Policy Journal published a special issue on the “gig economy” and the law (De Stefano 2016a). The International Labour Organization (ILO) has made the organization and governance of work through digital labor platforms central topics in its Future of Work Centenary Initiative (e.g., International Labour Organization 2017)—and the ILO’s Global Commission on the Future of Work, which includes several heads of state, executives from global civil society organizations, labor ministers, and social scientists (but no computing researchers), is expected to issue an independent report which will discuss, among other things, measures to regulate digital labor platforms. The European Trade Union Institute’s journal, Transfer, featured a two-part special issue in 2017 on “the digital economy and its implications for labour,” with the first part focusing on “the platform economy” (Drahokoupil and Jepsen 2017). Policy makers and trade unionists have commissioned a series of—mostly country- or region-specific—studies (e.g., Dølvik and Jesnes 2017; Herr 2017; Huws et al. 2016; Schmidt 2016). In 2016, the European Commission released non-binding guidance on applying existing EU law to the “collaborative economy” (European Commission 2016), including digital labor platforms; in June 2017, the European Parliament adopted a resolution on the same topic (European Parliament 2017). In December 2017, a comprehensive report on social protection of workers in the platform economy (Forde et al. 2017), commissioned by the European Parliament’s Employment and Social Affairs Committee, was published. Generally speaking, these studies and discussions revolve around the need to ensure that workers in digital labor platforms are not denied access to basic worker rights (e.g., freedom from discrimination, or the right to free association and collective bargaining) or to social protection systems (e.g., health insurance and pension systems). Such exclusion can result from lax enforcement of existing regulations, or from regulations and social protection systems that were designed with assumptions about the employee-employer/worker-customer relationship that do not hold for these workers.

In 2016, former US National Labor Relations Board Chair Wilma Liebman noted that digital labor platforms have “triggered a conversation in the US that we really should have had for the past few decades” about the growth in “alternative” work arrangements (Liebman 2016). These workers, she noted, often work a series of short “gigs” (or multiple “gigs” simultaneously), may earn more precarious incomes than “traditional” workers (Berg 2016) and are often required to accept that they will be classified as self-employed persons rather than employees (See, e.g., Amazon Mechanical Turk 2014, especially Sec. 3a; Upwork 2016, especially Secs. 3.1, 3.2, 6.7, 8.1). As a result, these workers are excluded, sometimes unfairly, from many rights and benefits afforded to employees, including minimum wage; paid sick leave, vacation, and parental leave; overtime pay; protection from unfair dismissal; compensation in event of work-related illness or injury; employer contributions to health insurance and retirement; and the right to organize and collectively negotiate with employers or platform operators for improved rights and working conditions (De Stefano 2016b; Liebman 2016).

If digital labor platforms continue to grow, many rights won by the labor movement through centuries of struggle may be lost (e.g., Austrian Chamber of Labour et al. 2016; Benner 2015). Because declining worker power is linked to increasing economic inequality (see, e.g., Gordon and Eisenbrey 2012; Jaumotte and Buitron 2015)—and because inequality may threaten democracy (see, e.g., Houle 2009)—questions of workers’ rights and worker power in digital labor platforms raise concerns about the future of democratic institutions in an age of increasingly technologically mediated labor relations (see also, Ostrom 2000).

While trade unionists and policy makers understand these stakes, many discussions about digital labor platforms in policy settings and worker organizations are rather general, with differences between different platforms and sometimes even different types of platforms largely glossed over. Additionally, questions of measurement, enforcement, and interface and work process design are thus far underdiscussed. It is exactly here where empirical research grounded in deep understanding of both digitally enabled work processes and workers’ needs—the “sociotechnical expertise” of HCI and CSCW researchers—could richly complement the prevailing economic and regulatory focus of the trade unionists, economists, policy analysts, and policy makers whose voices currently dominate “future of work” discussions. A deeper interlinkage between, on one hand, CSCW research on digital labor platforms, and, on the other, policy research and civil society and policy discourse, holds the potential for significant mutual benefit.

2.2 CSCW’s interest in labor and technology

The implications of technology design for workers’ experience and workers’ rights have been a central concern of CSCW since its beginnings in the 1980s. European CSCW research especially has a long tradition of both theory-building and practical work aimed at improving working conditions and “increasing workplace democracy,” often in collaboration with both unions and employers (Bjerknes 1987; Ehn 1990; Greenbaum 1988; Greenbaum and Kyng 1991). In the mid-1990s, some CSCW researchers expressed concern that some aspects of this frame had been lost. Greenbaum (1996), for example, called for renewed attention in CSCW to “labor issues such as wages, working conditions and division of labor” as well as the economic context of work such as ongoing “downsizing, outsourcing and insecurity about jobs.” Bjerknes and Bratteteig (1995) put the matter bluntly: “The challenge for future research [in system development] is to contribute to democracy in a changing working life and workplaces.” Since the mid-1990s the world of work has changed dramatically, with major economic and organizational trends such as globalization, flexibilization, and privatization (Stone 2006)—enabled in significant part by information technology—bringing both benefits and challenges for workers, employers, and policy makers. This project takes up these topics of long-standing interest in CSCW in a new context: the—sometimes global—labor markets created by the technologically-enabled digital labor platforms which have emerged in the last decade.

Despite CSCW’s history of involvement in workers’ rights and concerns, much early research on ‘new’ digital labor platforms within HCI, CSCW, and human computation mainly took a customer’s or manager’s perspective, focusing primarily on reducing cost; increasing speed, reliability, or quality of work output; and expanding the kinds of tasks that could be outsourced to platforms (e.g., Dow et al. 2012; Downs et al. 2010; P. G. Ipeirotis et al. 2010; Jung and Lease 2012; Kochhar et al. 2010; Mao et al. 2013; Rao et al. 2013; Snow et al. 2008). These goals continue to be active topics of research in HCI and CSCW (e.g., Retelny et al. 2014; Valentine et al. 2017).

Since at least as early as 2010, however, researchers in these fields have also investigated workers’ experiences in digital labor platforms. Summarizing earlier results (including P. G. Ipeirotis 2010), Silberman and colleagues (Silberman et al. 2010a, 2010b), called attention to the reliance of a significant fraction of Amazon Mechanical Turk workers on the income they earned over the platform, as well as to frequently occurring problems for workers, including: uncertainty about payment, unaccountable and seemingly arbitrary rejections (i.e., non-payment), fraudulent tasks, prohibitively short time limits, long pay delays, uncommunicative requesters (i.e., customers) and administrators, costs of requester and administrator errors borne by workers, and low pay. Bederson and Quinn (2011) noted that underlying the problems facing both workers and customers on Mechanical Turk are the anonymity of the participants and the relative lack of consequences for cheating behavior. Drawing on earlier work, they added to the existing list of concerns the question of “the nature and purpose of the work”: “Some tasks,” they noted, “extract small pieces of larger jobs and present them to workers out of context so that workers do not understand the purpose of the work they are doing. This is [ethically problematic] because the work might be objectionable if the worker understood its context.” In addition to discussing difficulties and complexities around non-payment and wages, Bederson and Quinn offer nine guidelines for requesters, and argue that platform operators should limit anonymity and provide a community-run grievance process for workers to request review of work they believe was unfairly rejected.

HCI and CSCW researchers have also developed third-party software for workers on digital platforms—in particular on Mechanical Turk. Irani and Silberman, for example, documented Turkopticon, a requester reputation system that collects reviews from workers, in a series of publications (Irani and Silberman 2013, 2014, 2016; Silberman and Irani 2016). In a series of independently conducted experiments, Benson et al. (2015) found that Turkopticon ratings both predict requester non-payment and to some extent “discipline” requesters (specifically, incentivize them to pay), a finding that confirms hypotheses from informal conversations with workers and requesters. Callison-Burch (2014) documented the Crowd-Workers extension, later used by Hara et al. (2018) to collect the first task-level data on worker wages on Mechanical Turk. Importantly for the discussion of workers’ experiences—in particular the question of wages—workers in Hara et al.’s sample earned an average wage of approximately USD 2 per hour, and only 4% earned more than the US minimum wage of USD 7.25 per hour, confirming and clarifying the findings of previous studies based on self-reported data. And Hanrahan et al. (2015) documented TurkBench, a tool that provides workers with a personalized visualization of the tasks available on Mechanical Turk. The goal of the visualization is to support workers in quickly finding well-paying tasks they are interested in and in reducing unpaid time (e.g., search time, time on tasks that are ultimately not completed). Yet Hanrahan et al. (this volume) find that the proliferation of third-party tools leads to practical and technical problems: “serious” Mechanical Turk workers sometimes rely on dozens of browser extensions simultaneously to navigate the market effectively, and these extensions sometimes interfere with each other in terms of their interface elements, or by repeatedly “scraping” task information from the site. Users running many scripts that repeatedly scrape the site may be automatically throttled by the server, preventing them from completing further work—yet scripts are crucial tools that help “serious” workers find the well-paying tasks they need to earn decent wages. Hanrahan et al. note that solving these problems may require more “holistic”—i.e., centralized—solutions rather than further third-party software development.

In addition to the work in which computing researchers have developed software to investigate and intervene in existing markets, described above, the last few years have seen a remarkable growth in a diversity of computing research studying workers’ experiences on digital labor platforms (e.g., Gadiraju et al. 2016; Gupta et al. 2014; Lease et al. 2013; Lee et al. 2015; Martin et al. 2014, 2016; Raval and Dourish 2016), enumerating criteria for desirable platform work (e.g., LaPlante and Silberman 2015; O’Neill and Martin 2013; Sarasua and Thimm 2014), and exploring worker ownership and governance of forums (e.g., LaPlante and Silberman 2016) and platforms (Gaikwad et al. 2015, 2016, 2017; Sriraman 2016; Whiting et al. 2017a, b). Yet the challenges identified by Silberman et al. (2010a, b), Bederson and Quinn (2011), and others (e.g., P. Ipeirotis 2012; Kittur et al. 2013) largely remain—and indeed can now be identified on new platforms other than the intensively-studied Mechanical Turk (and, more recently, Uber). And while the technical interventions developed by HCI and CSCW researchers have produced real material benefits for workers, this mode of engagement has its limits—as Hanrahan et al.’s findings (this volume) make grimly clear. Third-party developers simply have much less scope for changing platform working conditions than platform operators themselves. But platform operators’ first priority is often to stay in business, and, in some cases, to deliver a return to investors—and this may mean prioritizing the needs of paying customers ahead of workers’ needs. If working conditions in digital labor platforms are to be meaningfully and sustainably improved, other actors, such as trade unions and regulators, may need to find ways to enable and encourage platform operators to make design and business process changes that improve workers’ experiences.

3 Practical questions and project goals

In reading a paper published in an academic venue, one often asks, “What are the research questions motivating this study?” We should be clear that because of the institutional context in which the work described in Part 4 was initiated, the work did not begin with research questions, nor was it organized as a single study or a series of studies.

Rather, co-author Silberman arrived in the “Crowdsourcing Project” at the trade union after the project was already underway and had already produced some internationally publicized outputs—namely, the edited volume Crowdwork—zurück in die Zukunft? Perspektiven digitaler Arbeit [Crowd Work: Back to the Future? Perspectives on Digital Labor] (Benner 2015), which touched off the public debate about digital labor platforms in Germany, and the first version of faircrowdwork.org. Inquiries from a variety of internal and external stakeholders had already begun to flow in: inquiries from journalists and policy makers about the union’s activities and positions on the topic; inquiries from within the organization and from other trade unions and worker organizations for support and training; requests for collaboration from researchers; and, with less frequency, requests for information or support from workers. The project’s main stated goals, however, were to develop fair standards for working conditions in digital labor platforms, to organize German “platform workers,” and to make steps toward implementing the standards at a national level—and preferably quickly.

This strategy was motivated by union leadership’s overall approach to platform work in Germany, which could be summarized roughly as, “We welcome it, but it must be socially responsible. We will tell you what we mean by ‘socially responsible.’” (This was in contrast to many workers’ and platform operators’ expectations of what the union’s attitude would be; many expected the union to be extremely critical and seek to prevent platform-based work from growing in Germany at all.) While the project team had some general ideas at the beginning about “fair standards” for platform work, we lacked detailed knowledge about specific platforms and about German platform workers’ situations, motivations, and experiences. This situation quickly led to the emergence of a collection of questions (see Table 1). These questions motivate the work described in Part 4. Some of them can be understood as research questions in the classical scientific sense—i.e., as questions to be resolved through careful investigation with well-understood methods—while others are questions to be answered in practice and in particular rather than in the abstract.

Table 1 Union questions and concerns.

The global nature of the internet complicated matters even further. We knew from the beginning that action limited to the national level would have limited effect in the long term. Even in a scenario of “wild success” within Germany, the question remains of what is happening outside of Germany. Most labor regulation, even in Europe, is national; but in a globally connected economy, no nation is an island. Cost is a major factor driving outsourcing generally; if clients, including German clients, can get work done more cheaply outside of Germany, it could be quite challenging to persuade them to outsource to “socially responsible” German platforms. Yet we knew that we needed to “act locally” first.

Through the work described in Part 4 we have arrived at rough working answers to the questions above—for workers and platforms operating within Germany. Yet there is much room for further research, even within Germany—and certainly within Europe and globally.

4 Rating working conditions

In 2015, Vanessa Barth, Christiane Benner and colleagues at the German Metalworkers’ Union (IG Metall) launched faircrowdwork.org, a website intended to let workers on digital labor platforms review various aspects of the working conditions on the platforms on which they worked. The general goal behind the website’s design was to create a place for workers to post and read reviews of digital labor platforms. The target users were current or prospective platform workers who would like to make better-informed decisions about which platforms to work on. The platform reviews on this original site had two main sections: worker reviews and a “terms of service check.” The worker reviews section contained workers’ ratings of various labor platforms along five dimensions (pay, tasks, communication, evaluation, and technology) using a five-star system. The site also allowed users to leave free-text comments, but these were displayed separately and were not associated with the ratings. The terms of service check consisted of a legal review by union lawyers of platforms’ official Terms of Service with regard to whether they met a set of five criteria about workers’ rights.

Although the original 2015 platform launch attracted a great deal of interest from media, policy makers, and other trade unionists, the platform faced three major challenges. First, the interest the launch attracted came from media, policy makers, and other trade unionists—that is, not from workers. Second, it was impossible—at scale—to verify that every user submitting a review for a particular labor platform had actually worked on that platform. Third—like the initial design of Turkopticon (see, e.g., Irani and Silberman 2013), by which it was inspired—faircrowdwork.org asked workers to rate platforms by assigning “star ratings” to different aspects of their experiences on the platform (e.g., pay, communication, tasks). However, as Silberman and Irani (2016) wrote seven years after Turkopticon’s launch, the meaning of a particular numerical rating is often unclear. A worker relying on platform income to pay rent often has a very different idea of what counts as “five stars” for pay than, for example, a casual worker who only uses platform income to buy “nice extras.” Disagreements over the meanings of numerical ratings risk turning reviews into sites of goodwill- and solidarity-destroying arguments between workers who otherwise share common interests.

Drawing together these and other “lessons from Turkopticon” (Silberman and Irani 2016) with a stakeholder engagement process that included input from labor unions, international worker organizations, journalists, and workers themselves, we redesigned both the rating scheme and the larger Fair Crowd Work website. The new site includes a Crowd Work FAQ, “know your rights” pages for German and Austrian workers, and a news section, in addition to the redesigned platform reviews. The star ratings are one component of the platform reviews, which now have three main sections: “Platform Details,” “Worker Reviews,” and “Terms of Service Check.”

This broader redesign speaks to the first ‘problem’ – a site ostensibly for workers was, in fact, attracting the most attention (and use) from non-workers. At the outset of the redesign, the union, focused on serving workers’ interests, was still intent on making a technology whose users were understood to be workers themselves. Because of the team’s acquaintance with the Turkopticon system, there was a general desire that the new website function in the same way, but at a larger scale, and directed towards platform operators rather than requesters. A CSCW-inflected stakeholder engagement process, and a series of goal-elicitation activities, goal-stakeholder maps, and planning documents, helped to expand and redefine the potential ‘user’ base to include a broader set of stakeholders, and to then actually carry out the design and implementation work to better serve the diverse audiences of the site (see, e.g., Figure 1). The process also allowed the union to more explicitly situate and link the project within its own broader program of activities in support of improving online labor conditions (Silberman et al. 2017). No longer a small project ‘just’ for a few workers, the website could now be understood as directly connected to ongoing outreach to platform operators and political lobbying.

Figure 1. Strategic goal mapping document, November 2016.

Ultimately, this resulted in a reshaping of the website from a ‘site for workers’ to a ‘portal about working conditions on platforms’—something more useful to the union than what was originally envisioned. At a practical level, the site’s homepage foregrounds a series of launching points for different audiences: “Advice” and “Know Your Rights” sections for workers, and a “Learn about Crowdwork” section for journalists and policymakers. Responding to the union’s practical goals of increasing individual worker membership – and of raising awareness that it is even possible to join a union as an independent worker – a large “Did you know there are unions for crowdworkers?” header and a “Join a Union” button take center stage on the landing page.

Taking a step back to actually make system goals explicit – perhaps a ‘standard practice’ in a user-centered-design textbook, but often skipped in real projects – was crucial for the site’s transformation. While a large amount of “User Experience” work in industry centers on graphic design more than anything else, a commitment to broad stakeholder engagement was valuable in this case. We reflect on this further in the discussion, but, in short, the site is now able to serve a crucial role as something rather different from a tool for workers’ immediate use. Instead it often plays the role of conversation starter. As an artifact that can circulate and spur discussion, it becomes something to which one might point in making a policy or other argument—and indeed we have seen it referred to in policy debates about “the future of work” in Berlin, Brussels, Stockholm, Vienna, Geneva, and so on. In this way – and in contrast to a tool designed for individual workers to use directly – the site seems likely to contribute more indirectly, but with broader impact, to the cause of improving working conditions.

Despite the broadening and shifting of site goals, the platform reviews still serve as a central feature of the site. Major changes to the platform review process respond to the two remaining problems that emerged with the first version of the Fair Crowd Work website. Instead of asking visitors to our site to assign numerical “star” ratings to different aspects of a given platform (between one and five stars each for pay, evaluation, communication, tasks, and technology), the information presented in the “Worker Reviews” section for a given platform is now based on data collected through a detailed 95-question survey that we distributed to workers in 2016 and 2017. First, to prevent people who had never worked on a platform from reviewing that platform (and to prevent individual workers from reviewing one platform multiple times), we distributed the surveys through the platforms being reviewed. Review data for any given platform is collected from workers recruited through that platform, rather than through our own site. Absent active deception from platform operators, we can therefore expect that people providing review data are workers, and that each worker is only providing one review. Second, to avoid asking workers to assign potentially ambiguous numerical ratings to the platforms, we ask workers concrete questions about their personal experiences working on the platform. We then developed a fixed process to transform survey responses into numerical ratings.

In the next three sections, we describe in more detail our survey method, rating computation scheme, and broader re-design of the platform review page. This descriptive case includes two reflections on the practical and ethical challenges of conducting research via online platforms, and an account of our own strategies and recommended practices for surveying platform-based workers.

4.1 Designing and conducting the survey

In order to develop ratings about working conditions based on concrete worker experiences rather than ambiguous star ratings, we designed a 95-question survey comprising seven sections: one for each of the five dimensions, and two sets of questions about more general experiences and opinions:

  1. General Experiences as a Worker: length of time working on the platform, reasons for working on the platform, hours worked per week, worker sense of autonomy and control over work process and scheduling.

  2. Platform Tasks: quality and character of tasks on the platform, rated on dimensions including meaningful, physically dangerous, interesting, psychologically harmful, satisfying, ethically questionable, and fun.

  3. Pay: current week’s and typical week’s pay, including tips and bonuses.

  4. Communication: assessment of communication with management, clients, and other workers — is it possible? If so, is communication prompt, useful, and respectful?

  5. Ratings & Evaluations: perceived fairness of ratings or evaluations of workers; workers’ experiences with contesting unfair evaluation and rejection of work.

  6. Technology on the Platform: usability and reliability of platform technology (website or mobile app).

  7. General Likes, Dislikes, and Comments: a free-response section for workers to highlight their own primary concerns and likes.

In conducting a survey using online platforms, two challenges – each with practical and ethical dimensions – are important to discuss: worker attention and buy-in, and recruiting and paying workers.

4.1.1 Worker attention and buy-in

Workers’ lack of attention to survey details on online platforms is a significant issue for anyone using these platforms to conduct research. While attention checks are one standard way of attempting to handle this (see, e.g., Hauser and Schwarz 2016), experienced “Turkers” (as many Mechanical Turk workers call themselves) occasionally point out that they are wise to many of these tactics, and know how to identify attention checks quickly (see, e.g., Brian (@xyderias) et al. 2017). Methodological research suggests that attention checks may have a net effect of degrading rather than improving data quality (Vannette 2017). Some workers say that seeing multiple attention checks in a survey leads them to feel that the researcher does not trust them, leading to reduced morale and engagement with the survey. Perhaps most importantly, however, workers completing surveys on online labor platforms are engaged in piecework. Working more quickly (and less carefully) is economically rational and efficient. While platforms may appear to researchers primarily as a convenient way to collect large data sets quickly with apparently diverse populations (by typical demographic measures), they are workplaces for the people completing their tasks. In contrast to some rhetoric that workers on online platforms are only there to make a little ‘extra spending money,’ research has consistently shown that many workers are, in fact, dependent on platform-based work as a key source of income to meet their basic needs. A 2016 Pew Research study, for example, found that “More than half of platform workers describe the income they earn as being either essential to meeting their basic needs (29%) or as an important component of their overall budget (27%)” (Smith 2016). Research by the International Labour Organization found that crowdwork was the primary source of income for 37% of Mechanical Turk and CrowdFlower workers (Berg 2016). In a larger and more geographically diverse 2017 ILO study of five platforms, this fraction was almost half (Rani 2017).

This situation thus fundamentally reconfigures the participant-researcher relationship, in contrast to both psychology student pools and randomized phone calls or mailers. It is not just a relationship grounded in generosity and a desire or sense of obligation to contribute to science; instead, the research instrument is embedded in what is fundamentally a labor relation for many participants. It is crucial to take this relationship into account when using these platforms for research, as several issues arise directly from it. We believe that taking this labor relation seriously is one way to help mitigate attention issues and the validity concerns that arise from them.

We therefore decided to pay €10 for a survey with an average completion time of 25 minutes. Our goal was to pay no less than €15/h even for those who spent the most time answering thoughtfully (at €10, a worker taking 40 minutes still earns €15/h). We received several free text comments from workers thanking us for respecting their time and paying a good amount—especially on some of the lower paying platforms. We surmise that this respect works both ways and believe that it positively influenced the quality of answers we received in our survey. By first demonstrating our respect for workers’ time by paying appropriately for the task, we were in a better position for workers to respect us in return by answering questions carefully and fully. By paying appropriately for workers’ labor time, we also aimed to mitigate some of their need to speed through the survey and get on to the next money-earning task as quickly as possible.

Moreover, there is an ethical component to the question of how much to pay workers for their participation in a platform-based survey. In traditional research settings, it might seem acceptable to pay nothing at all for a short survey, or to give a $5 gift card as a token of gratitude for participation in a 60-minute interview. However, this is less acceptable once one recognizes that the online platform is fundamentally a place of work. Paying anything under a local minimum wage is wholly inappropriate in a labor context (Silberman et al. 2018); and for researchers who otherwise advocate for higher minimum wages (e.g., who support the “fight for fifteen” campaigns in the United States), it would follow that surveys posted on online platforms should also pay a comparable amount.

With regard to the survey we discuss here, it is also notable that we were conducting a survey about platform work itself. Many workers indicated in their responses that they were happy and excited to voice their opinions and share their experiences as workers on online platforms. We think this personal concern for the topic also increased workers’ willingness to read questions more carefully and answer them more thoughtfully.

We also offered workers the option to subscribe to a mailing list for information about the survey results, and interest in this was fairly high. Given the positive responses to this offer, we would recommend communicating results as another way to increase worker buy-in for a survey on any topic.

Despite our efforts, we still received some obviously bad responses. For example, some workers reported working zero hours in one question but getting paid a non-zero amount in another question. Other workers reported a non-zero number of hours worked, while reporting that they were paid zero dollars, but also had never experienced ‘non-payment.’ These logically inconsistent responses were easy to weed out after the fact, and, as with any survey, some data cleaning was necessary. However, in the case of online platform-based surveys, it is especially important to keep the labor relation in mind as a force that fundamentally shapes participants’ actions. Many workers on online platforms are clicking through surveys quickly—and who can blame them? They are trying to string together a living wage from what is often very low-paid piece work. Faster completion is the only way to better pay.
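As a minimal illustration of this kind of post-hoc cleaning, the sketch below flags the two inconsistencies described above using pandas. The file and column names are hypothetical; the actual survey data and cleaning scripts differ.

    import pandas as pd

    # Hypothetical file and column names, for illustration only.
    responses = pd.read_csv("survey_responses.csv")

    # Zero hours worked but non-zero pay reported.
    zero_hours_but_paid = (responses["hours_last_week"] == 0) & (responses["pay_last_week"] > 0)

    # Hours worked, zero pay, but "never experienced non-payment".
    hours_but_never_paid = (
        (responses["hours_last_week"] > 0)
        & (responses["pay_last_week"] == 0)
        & (responses["ever_experienced_nonpayment"] == "No")
    )

    cleaned = responses[~(zero_hours_but_paid | hours_but_never_paid)]
    print(f"Dropped {len(responses) - len(cleaned)} logically inconsistent responses")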

4.1.2 Creative strategies for recruitment and pay

Some platforms are easy to use for running surveys. However, on three platforms – Upwork, Mechanical Turk, and CrowdFlower – we developed more creative strategies to either recruit or pay workers:

Upwork

Upwork is a freelance platform, not a microtask platform. Typical jobs available through Upwork are project-based individual tasks like software development, web design, and writing. From the client’s perspective, the entire Upwork platform is designed around finding, selecting, hiring, communicating with, and paying individual freelancers. This contrasts sharply with microtask platforms, where clients can programmatically recruit hundreds or even thousands of mostly anonymous workers to a task. On microtask platforms, recruitment, worker screening, review, and payment are handled “en masse”: clients do not recruit workers individually, but rather specify criteria which workers must meet to be eligible to perform a particular task. Any worker meeting the criteria can in principle perform the task; the question of which workers actually end up doing the task is decided simply on a “first come, first served” basis. Microtask platform clients specify the number of respondents to be recruited to a survey simply by typing the desired number into a text field.

On Upwork, however, the client must recruit each respondent individually, explaining the nature, background, and motivation of the task to each potential freelancer before they accept it. (And freelancers are of course free to decline to participate.) After a client posts a job, it appears in the job listings and job search results and freelancers can apply to it. Additionally, the client may browse and search for individual freelancers and invite them to apply. The client may hire any freelancer who applies for the job, and may make use of any selection process or criteria they like. Upwork provides a built-in messaging function with a chat-like interface. Clients may use this function to communicate with freelancers who have applied to a job and can therefore “interview” freelancers via chat before hiring them. Clients may also use the messaging function to ask freelancers to communicate outside the platform, for example via Skype. After a freelancer is hired, clients can use the messaging function to support or direct them while they do the job. After the job is done, the freelancer may make use of a ‘request payment’ function built into the platform, but they may additionally notify the client through the messaging function. Clients receive in-platform notifications as well as emails when they receive a message through the platform’s messaging function.

We leveraged the flexibility in this task posting and client-freelancer negotiation process to adapt the site’s features to our need to recruit multiple workers for a single survey task. Our recruitment of Upwork workers for our survey task was greatly aided by documentation provided by Vili Lehdonvirta and colleagues, who have recruited hundreds of Upwork workers for interviews (Lehdonvirta, pers. comm.; see also Wood et al. 2016). Briefly, we created an invite-only job on Upwork with a fixed payment of $10 and the following description:

We invite you to participate in an online survey about your experiences working on Upwork. The survey will take 20 to 30 minutes. We will pay you US $10 for your participation. Your personal information will be kept confidential. This task is being carried out as part of a research project into online work funded by several European worker organizations.

We searched for workers by region, focusing in our small sample of 23 Upwork workers on Germany, Austria, Sweden, and eastern Europe. We attempted to ensure that these four geographical “subsamples” were approximately balanced by gender. Most workers who replied to our invitations asked for more information; we replied with additional information about the project and a link to the survey. All but two of the workers who ended up completing the survey had earned at least $1000 on the platform, and all but one had “job success” ratings of 92% or more. We paid each worker after they completed the survey, and gave bonuses to several who offered useful feedback.

Amazon Mechanical Turk

In contrast to Upwork, it is very easy to run large-scale surveys on Amazon Mechanical Turk (AMT). In fact, it is almost too easy. We posted the survey from an account that has a good history of paying workers fairly and promptly for surveys that we have conducted previously. Unsurprisingly, then, all HITs (“Human Intelligence Tasks” – the name for a task posted on Mechanical Turk) for our relatively high-paying task were picked up very quickly, within a few minutes of being posted on the platform. We knew that this meant our responses were likely to be biased towards some of the most professional Turkers – people using browser add-ons and scripts to locate and claim the best-paying tasks from trustworthy requesters.

In order to capture a broader set of experiences of working conditions on the platform, we decided to conduct a second deployment to target less professional Turkers. Creating a less-appealing task was fairly easy – we just cut the posted pay down to $2, and then gave each worker an $8 bonus after completion to bring their pay up to the same amount as the original participants. In the spirit of transparency to workers, we explained this bonus scheme – and the reasoning behind it – in the description of the HIT. This HIT took several hours to get fully picked up, suggesting that the strategy did work to filter out those Turkers using automated scripts to quickly identify high-paying tasks.

However, screening out prior participants from the new lower-paying HIT was less straightforward. This is a common problem for researchers using AMT, and we have heard of several strategies for overcoming it, including the most straightforward – blocking workers who have completed a task for you previously. However, blocking workers is not a good strategy to use from a workers’ rights perspective. Blocking is described by Amazon as a way to prevent workers who “aren’t performing to your standards” from completing your tasks. While it is a common belief that blocking has no ill effect on workers – e.g., does not affect their ability to complete tasks by other requesters – the message Amazon sends to blocked workers indicates otherwise. Instead of blocking, the most worker-friendly way we know of to filter workers from a specific task is to make use of Amazon’s qualification feature. As a requester, one can grant a custom qualification to all Turkers who complete a particular task – in our case, our first round of the survey. The setup for this can be done after the first task is completed and does not need to be planned in advance. For the benefit of workers, the qualification should be named and clearly described to workers so they understand what it represents. We were able to grant qualifications to all workers in a batch by using Amazon’s CSV export/import feature with the first set of survey respondents. When listing the survey a second time, we simply used the standard requester interface to screen out all workers with the qualification we had just created.
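For requesters who prefer to script this, the same qualification-based screening can in principle be done through the MTurk API. The sketch below uses boto3 and is only an outline of the approach described above (we used the CSV export/import and the web requester interface); the worker IDs, survey URL, and HIT parameters are placeholders.

    import boto3

    # The MTurk API is served from us-east-1.
    mturk = boto3.client("mturk", region_name="us-east-1")

    # 1. Create a clearly named and described qualification.
    qual = mturk.create_qualification_type(
        Name="Completed working-conditions survey (round 1)",
        Description="Granted to workers who already completed round 1 of our "
                    "working-conditions survey; used only to avoid duplicate invitations.",
        QualificationTypeStatus="Active",
    )
    qual_id = qual["QualificationType"]["QualificationTypeId"]

    # 2. Grant it to everyone who completed the first round.
    for worker_id in ["A1EXAMPLE", "A2EXAMPLE"]:  # placeholder IDs from the round-1 results
        mturk.associate_qualification_with_worker(
            QualificationTypeId=qual_id,
            WorkerId=worker_id,
            IntegerValue=1,
            SendNotification=False,
        )

    # 3. Post the second round, visible only to workers who do NOT hold it.
    external_question = (
        '<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/'
        'AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">'
        "<ExternalURL>https://example.org/survey</ExternalURL>"
        "<FrameHeight>600</FrameHeight></ExternalQuestion>"
    )
    mturk.create_hit(
        Title="Survey about your experiences working on this platform",
        Description="20-30 minute survey; $2 base pay plus $8 bonus after completion.",
        Reward="2.00",
        MaxAssignments=100,
        AssignmentDurationInSeconds=3600,
        LifetimeInSeconds=7 * 24 * 3600,
        Question=external_question,
        QualificationRequirements=[{
            "QualificationTypeId": qual_id,
            "Comparator": "DoesNotExist",
            "ActionsGuarded": "DiscoverPreviewAndAccept",
        }],
    )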

CrowdFlower

CrowdFlower presents the somewhat peculiar problem of limiting the total task pay to only $5. In order to pay all survey respondents the same $10 amount, we used the manual bonus feature to grant each survey taker a $5 bonus after the fact. Notably, however, CrowdFlower does not pay all workers in dollars, instead sometimes rewarding workers with points or other forms of non-currency-based payment. Depending on what “channel” a worker uses to access CrowdFlower, it is possible that they are not even paid the full amount of the task. To our knowledge there is no way around these non-standard payment mechanisms, and CrowdFlower recommends not mentioning how much you are paying for a task in the description, because not all workers may receive that amount. Nonetheless, in an attempt to respect the employment relationship, we stated our intentions and our bonus scheme clearly in the description of the task on CrowdFlower.

4.2 Computing the ratings

Once the survey was complete, we could use the data to compute the star ratings for the platform reviews. The calculation scheme we describe here codifies a set of values about what might constitute ‘good’ working conditions on an online labor platform. It was developed in line with existing labor union policies on good working conditions, as well as more specific input from crowdworkers in prior research.

The remainder of this section goes into extensive detail as to how these ratings are calculated. Though we realize this detail may only be of interest to a small set of readers, we describe the process thoroughly for several reasons. First, we hope that anyone attempting to reproduce or revise this method might be able to do so from reading this paper. Second, a political value of transparency is central to our own CSCW praxis. We provide this same documentation on the website itself as an explicit choice that stands in marked contrast to the ways that many familiar rating or recommender systems work. They do not give the details about the ‘algorithm’ or process by which a rating or recommendation is computed, instead aiming to derive value in some way through the secrecy of the process (or perhaps through an allusion to magic). However, we have no trade secrets to protect here; we would rather like it if others took up – and improved upon – this process. What variables are we not taking into account that we should? What data points are we over-emphasizing? We also believe it is of paramount importance that the ratings are tied to the specific and concrete data points which they attempt to reflect. A rating of working conditions is not an arbitrary value judgement; it is an attempt at a meaningful abstract representation of a specific set of conditions under which people work. These conditions are important, and carefully chosen, and changing them is the target of our work. If a platform operator, for example, wanted to ‘game’ this system by starting to guarantee a higher minimum wage to its workers, then we are completely okay with that.

The current design produces six ratings on a 1- to 5-star scale (with “half-star” ratings possible): one for each “dimension” (pay, communication, evaluation of work, tasks, and technology) and an overall rating computed from a weighted average of the “dimension” ratings.

The overall rating is a weighted average of the five aspect ratings. Specifically, the pay rating is weighted twice as heavily as the other aspect ratings. The overall rating R is calculated as follows:

$$ R = (2P + C + E + K + T) / 6 $$

where P is the pay rating, C is the communication rating, E is the evaluation rating, K is the tasks rating, and T is the technology rating.
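As a minimal illustration, the overall calculation can be written in a few lines of Python. This is a sketch rather than the production code behind the site; rounding to half stars for display is omitted. The None branch corresponds to the case, described under “Evaluation rating” below, in which no evaluation rating can be computed.

    def overall_rating(pay, communication, evaluation, tasks, technology):
        """Weighted average of the dimension ratings, with pay counted twice.

        Pass evaluation=None when the evaluation rating is "N/A" (see the
        "Evaluation rating" rules below); it is then dropped from the average.
        """
        if evaluation is None:
            return (2 * pay + communication + tasks + technology) / 5
        return (2 * pay + communication + evaluation + tasks + technology) / 6

    # Example: a platform rated 3 stars for pay and 4 stars on everything else.
    print(overall_rating(3, 4, 4, 4, 4))  # -> 3.666..., before any rounding for display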

Each “dimension” rating is computed based on responses to the survey of workers’ experiences, with multiple questions that correspond to each dimension (i.e., there is a set of questions about pay, another about tasks, another about communication, etc.). The calculations for the “dimension” ratings are described in detail below.

Pay rating

The pay rating is based on the median hourly wage reported by survey respondents. Table 2 describes the pay rating for a range of median hourly wages.

Table 2 Pay ratings corresponding to median hourly wages.
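Because the values of Table 2 are not reproduced in the text above, the following sketch treats the wage-to-stars mapping as data rather than hard-coding it. The band edges shown are purely illustrative placeholders, not the values actually used on the site.

    from statistics import median

    def pay_rating(hourly_wages, bands):
        """Map the median reported hourly wage to a star rating.

        `bands` encodes Table 2 as (minimum_median_wage, stars) pairs,
        sorted from the highest minimum to the lowest.
        """
        median_wage = median(hourly_wages)
        for minimum, stars in bands:
            if median_wage >= minimum:
                return stars
        return 1.0

    # Illustrative placeholder thresholds only -- the real ones are in Table 2.
    EXAMPLE_BANDS = [(25.0, 5.0), (20.0, 4.0), (15.0, 3.0), (10.0, 2.0)]
    print(pay_rating([8.0, 12.0, 16.0, 21.0, 18.0], EXAMPLE_BANDS))  # median 16.0 -> 3.0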

Communication rating

The communication rating is a simple average of three “sub-ratings”: communication with management, communication with clients, and communication with other workers.

If half or more of workers say that communication with management is “not possible,” or that management “never” responds to their communications, the “communication with management” sub-rating is 1 star. Otherwise, it is the simple average of the responses from workers to two questions:

  • “How often does management answer your questions?” and

  • the three-part array question “When management responds to your questions, how often are their answers...?”

In all cases, “Never” answers are coded as 1 star, “Always” as 5 stars, and the answers in between as the numbers in between. Specifically, all answers are added together and divided by 4N, where N is the number of workers responding to the survey on this platform.

The “communication with clients” sub-rating is calculated with roughly the same scheme as the “communication with management” sub-rating. Specifically, if half or more of workers say that communication with clients is “not possible,” or that clients “never” respond to their communications, the “communication with clients” sub-rating is 1 star. Otherwise, it is the simple average of responses to “How often do clients respond?” and the following three-part array question.

Similar to the previous two sub-ratings, if half or more of workers say that communication with other workers through official channels is not possible, this sub-rating is 1 star. Otherwise, it is the average of the answers to the three-part array question about worker communication in official channels.
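A sketch of the “communication with management” sub-rating follows; the client and worker-to-worker sub-ratings use the same pattern. The field names, and all scale labels other than “Never,” “Less than half the time,” and “Always,” are assumptions made for illustration, and the sketch assumes every respondent answered all four questions.

    # Frequency coding: "Never" = 1 star, "Always" = 5 stars; the
    # intermediate labels here are assumed for illustration.
    SCALE = {
        "Never": 1,
        "Less than half the time": 2,
        "About half the time": 3,
        "More than half the time": 4,
        "Always": 5,
    }

    def management_communication_subrating(workers):
        """`workers` is a list of dicts with assumed field names, e.g.
        {"possible": True, "how_often": "Always",
         "array": ["Always", "More than half the time", "Always"]}  # 3-part array
        """
        n = len(workers)
        # 1 star if half or more say communication is not possible or never answered.
        if sum(1 for w in workers if not w["possible"] or w["how_often"] == "Never") >= n / 2:
            return 1.0
        # Otherwise: add up all coded answers (4 per worker) and divide by 4N.
        total = sum(SCALE[w["how_often"]] + sum(SCALE[a] for a in w["array"])
                    for w in workers)
        return total / (4 * n)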

Evaluation rating

If half or more workers say that clients cannot evaluate workers’ work, or half or more of the answers from workers to the four-part array question “How often would you say that clients’ reviews, ratings, or evaluations are...?” are “I don’t know,” then there is no evaluation rating. In this case the evaluation rating is not displayed as 0 stars but as the text “N/A” or similar. In this case the overall rating is calculated as:

$$ R = (2P + C + K + T) / 5 $$

Otherwise, the evaluation rating is calculated in five steps. First, a “base” evaluation rating is calculated. It is the simple average of the answers from workers to the four-part array question “How often would you say that clients’ reviews, ratings, or evaluations are...?” “Never” is coded as 1 star and “Always” as 5 stars. Then:

  • If workers cannot give feedback about clients to management (i.e., if more than half of workers reply “no” to the question “Can you give feedback about clients to management?”), one star is deducted.

  • If clients do not have to give good reasons for leaving negative ratings (i.e., if more than half of workers reply “no” to the question “Do clients have to give good reasons for leaving negative ratings?”), one star is deducted.

  • If more than half of workers answer the question “If you contest a wrong evaluation through official channels, how often are you taken seriously?” with “Never” or “Less than half the time,” one star is deducted.

  • If the rating is less than zero, it is set to zero.
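A sketch of these evaluation-rating rules, repeating the frequency coding so the snippet stands alone. Field names are assumptions, and the sketch assumes that “I don’t know” answers are excluded from the base average, which the rules above do not specify.

    from statistics import mean

    SCALE = {"Never": 1, "Less than half the time": 2, "About half the time": 3,
             "More than half the time": 4, "Always": 5}  # intermediate labels assumed

    def evaluation_rating(workers):
        """Returns None when the rating is "N/A". Field names are assumed, e.g.
        {"clients_can_evaluate": True,
         "fairness": ["Always", "I don't know", "Always", "More than half the time"],
         "feedback_about_clients": "yes", "reasons_required": "no",
         "contest_taken_seriously": "Less than half the time"}
        """
        n = len(workers)
        if sum(1 for w in workers if not w["clients_can_evaluate"]) >= n / 2:
            return None  # displayed as "N/A"
        answers = [a for w in workers for a in w["fairness"]]
        if sum(1 for a in answers if a == "I don't know") >= len(answers) / 2:
            return None
        # Base rating: average of the coded answers (excluding "I don't know").
        rating = mean(SCALE[a] for a in answers if a != "I don't know")
        # One-star deductions, each triggered by a majority of negative answers.
        if sum(1 for w in workers if w["feedback_about_clients"] == "no") > n / 2:
            rating -= 1
        if sum(1 for w in workers if w["reasons_required"] == "no") > n / 2:
            rating -= 1
        if sum(1 for w in workers if w["contest_taken_seriously"]
               in ("Never", "Less than half the time")) > n / 2:
            rating -= 1
        return max(rating, 0.0)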

Tasks rating

The tasks rating is the simple average of the responses from workers to the eight-part array question “How often is the work you do...?” “Positive” sub-questions (“meaningful,” “interesting,” “satisfying,” and “fun”) are coded with “Never” as 1 star and “Always” as 5 stars. The other, “negative” sub-questions use the reverse coding (“Never” is 5 stars).
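A sketch of the tasks rating, with reverse coding for the “negative” sub-questions. The frequency labels between “Never” and “Always,” and the input format, are assumed for illustration.

    from statistics import mean

    SCALE = {"Never": 1, "Less than half the time": 2, "About half the time": 3,
             "More than half the time": 4, "Always": 5}  # intermediate labels assumed
    POSITIVE = {"meaningful", "interesting", "satisfying", "fun"}

    def tasks_rating(answers):
        """`answers`: (sub_question, frequency) pairs pooled across all respondents,
        e.g. [("meaningful", "Always"), ("physically dangerous", "Never"), ...]."""
        # Positive sub-questions: Never = 1 ... Always = 5; negative ones are reversed.
        coded = [SCALE[a] if q in POSITIVE else 6 - SCALE[a] for q, a in answers]
        return mean(coded)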

Technology rating

The technology rating is the simple average of the responses from workers to the three technology questions (“Would you describe the technology as...?”). “Yes” is coded as 5 stars, “It’s complicated” as 3 stars, and “No” as 1 star.
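The tasks and technology ratings are simple averages with fixed codings. The sketch below shows both; as before, the field names are our assumptions, and the reverse coding for the “negative” task sub-questions is implemented as 6 minus the star value (so that “Never” becomes 5 stars).

```python
FREQUENCY_STARS = {
    "Never": 1, "Less than half the time": 2, "About half the time": 3,
    "More than half the time": 4, "Always": 5,
}

# The four "positive" sub-questions named in the text; the remaining four
# sub-questions of the eight-part array are treated as "negative".
POSITIVE_TASK_ITEMS = {"meaningful", "interesting", "satisfying", "fun"}

def tasks_rating(responses):
    # Average over the eight-part array "How often is the work you do ...?",
    # reverse-coding the negative sub-questions.
    stars = []
    for r in responses:
        for item, answer in r.get("task_array", {}).items():
            value = FREQUENCY_STARS[answer]
            stars.append(value if item in POSITIVE_TASK_ITEMS else 6 - value)
    return sum(stars) / len(stars)

TECHNOLOGY_STARS = {"Yes": 5, "It's complicated": 3, "No": 1}

def technology_rating(responses):
    # Average over the three questions "Would you describe the technology
    # as ...?", coded Yes = 5, It's complicated = 3, No = 1.
    stars = [
        TECHNOLOGY_STARS[a]
        for r in responses
        for a in r.get("technology_answers", [])
        if a in TECHNOLOGY_STARS
    ]
    return sum(stars) / len(stars)
```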

4.3 Designing the review format: putting the ratings in context

In the new Fair Crowd Work website, the star ratings are embedded in a larger ‘Worker Review’ section, which contains two other sections: a Terms of Service Check, a carryover from the previous site (see below for discussion); and a Platform Details section, which is new. In contrast to the recent redesign of Turkopticon (see Footnote 12) – which abolished star ratings – we chose to continue using them in the new version of Fair Crowd Work, in order to leverage their political potential for making a clear and iconic statement about online working conditions, interpretable by multiple audiences at a glance. However, by also situating these ratings in the larger context of the review and platform profile, we hope to mitigate some of the issues with star ratings as a sole communication tool.

In the case of Turkopticon 2.0, abolishing star ratings made sense because the Turkers using the site are experts. They know what they are looking for and have their own ideas about what is and is not a good task. They come to the website to get just the ‘facts’ about tasks. However, in the space of rating platforms more generally, ideas about ‘what is good’ and ‘what is not good’ are not yet widely agreed upon. Even the salient criteria are not yet agreed upon, much less the ‘thresholds’ or ‘values’ for ‘good’ and ‘bad’ on each criterion.

In our case, by mapping a complex set of questions to something simple like a five-star rating system, we are making a political statement. We are defining what is good and what is not good. The design of the rating scheme is intentionally a political act, an attempt to leverage star ratings as a kind of ‘charismatic’ metric (Pine and Liboiron 2015). We do not claim that our rating scheme is a perfect definition of good or bad platform processes and conditions; rather, it is a reflection and an articulation of our current understanding and a provocation for improvement and further discussion.

In order to contextualize these ratings, and to add transparency and detail to the reviews, the star ratings themselves are presented within a ‘worker review’ section of the more general platform review. They thus serve both to give a quick, at-a-glance overview of working conditions and to index five subsections of the worker review, corresponding to the five rating dimensions (see Figure 2). Within each subsection, we draw attention to the specific reasons for a particular rating (see e.g. Figure 3) and highlight quotes from surveyed workers that give context to the rating and draw attention to the real stories of workers’ experiences that motivate our own political work. The worker review section also contains two summary subsections titled “Things Workers Like” and “Worker Concerns,” in which we expand on key benefits and drawbacks of each platform. These two subsections allow the review to highlight key points raised across the survey responses that may not fit into the five “dimensions” (pay, evaluation, communication, tasks, and technology).

Figure 2. Displaying the platform working condition ratings in the context of the platform profile.

Figure 3. Additional information about the pay rating, based on responses to the worker survey, as displayed in the platform profile.

Two other sections round out the review: a Platform Details section and a Terms of Service check. The Platform Details section was added to provide background information about each platform (e.g., year of founding, history, work process, business model) in a structured manner. This also provides space within the platform review to include concerns from a worker rights and labor perspective that may not have been surfaced in the worker surveys. For example, we include citations to academic research and journalistic work examining working conditions on labor platforms. The Terms of Service check allows us to further assess the working conditions of an online platform from a labor and union perspective: it provides a way to analyze the worker-friendliness of the official specifications of the work process and the fairness of platform rules for workers. Currently, the Terms of Service check has been carried over directly from the previous site; revising this portion of the platform reviews is part of our planned future work.

4.3.1 Ongoing work: evaluating the terms of service

We have begun to refine and expand the criteria for evaluating platform terms of service. The existing criteria, carried over from the 2015 version of faircrowdwork.org, consider whether nonpayment is allowed; whether platforms may unilaterally change their terms of service; whether a worker has the right to try to improve rejected work; and whether contact between workers, or between workers and clients, is prohibited. Just as the new version of the rating system seeks to engage more fully with the diversity of work and work processes on various labor platforms, we are working to expand the terms of service criteria to capture the impressive variety of legal strategies platform operators use to influence workers’ and clients’ behavior. From a trade union perspective, these strategies range from clauses that appear well-designed to protect workers’ rights to clauses that seem frankly exploitative.

Like the new working conditions rating scheme, the new terms of service rating scheme is more complex than the old one but is designed to allow non-expert users to quickly “get a sense” for how “worker friendly” a platform’s terms and conditions are. As with the star ratings, the definition of “worker friendly” in use is our definition; the scheme itself is an intentionally political act, a public articulation of our current understanding of what is good and what is not good. Below we describe the rating scheme “backwards,” beginning with the end results.

At the end of the rating process, each “set” of terms receives a numerical score between 0 and 100 and a “human-comprehensible” rating of “Exemplary,” “Good,” “Acceptable,” or “Bad.” These word ratings correspond to numerical ranges within the overall range (0 to 100). For example, “Bad” may correspond to the range 0 to 40, “Acceptable” to 41 to 65, “Good” to 66 to 95, and “Exemplary” to 96 to 100. Non-expert users who accept our definition of “worker friendly” (or who trust us) can “get a sense” of how worker friendly a platform’s terms are from the “word rating” alone; the score adds a small amount of additional information for the numerically inclined.
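As a concrete illustration, the mapping from score to word rating might look like the following sketch; the thresholds are the example ranges given above and remain provisional.

```python
def word_rating(score):
    # Example thresholds only (still under discussion): Bad 0-40,
    # Acceptable 41-65, Good 66-95, Exemplary 96-100.
    if score <= 40:
        return "Bad"
    if score <= 65:
        return "Acceptable"
    if score <= 95:
        return "Good"
    return "Exemplary"
```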

The score is calculated by checking the set of terms in question against a rubric. The rubric contains over a dozen criteria, such as:

  • conditions under which the platform may close or delete a worker’s account

  • conditions under which a client may choose not to pay for submitted work

  • time limits for review and payment of submitted work

  • conditions under which a worker may withdraw funds from their account

  • processes for changing the terms of service

  • processes for potentially psychologically harmful tasks (e.g., review of violent or sexually explicit content)

  • confidentiality requirements

  • prohibitions or limitations on subcontracting

  • processes and conditions for the transfer of intellectual property

Each set of terms is assigned a rating of “Exemplary,” “Good,” “Acceptable,” or “Bad” for each criterion. The rubric explains the conditions under which a set of terms receives a given rating for a given criterion. For example, the following is the draft rubric for the criterion evaluating rejection or nonpayment of submitted work:

  • The terms receive a rating of “Bad” for this criterion if any of the following are true:

      • the platform or client is not required to give a reason for rejection

      • there is no process for redoing or resubmitting rejected work

      • there is a process for redoing rejected work, but the time limit is less than one day

  • The terms receive a rating of “Acceptable” for this criterion if all of the following are true:

      • The platform agrees to ensure that the worker receives a clear and reasonable explanation for rejected work

      • In general, workers may attempt to resubmit rejected work at least once

      • In general, workers have at least one day to resubmit rejected work

      • Exceptions to the above two points (for example, due to tight project timelines) are clearly labeled in the task instructions

  • The terms receive a rating of “Good” for this criterion if all of the following are true:

      • All requirements for “Acceptable” are met

      • Workers may contest rejection; contested rejections are reviewed by a human platform employee or qualified worker

  • The terms receive a rating of “Exemplary” for this criterion if all of the following are true:

      • All requirements for “Good” are met

      • If a worker contests the platform’s rejection decision, the case is reviewed (a second time) by a neutral third party, who makes a binding decision

      • The platform agrees not to punish the worker in any way if the third party decides in favor of the worker

Some criteria are more important than others; out of the total possible score of 100, more points are allotted to more important criteria, and fewer to less important ones. The most points are earned for an “Exemplary” rating; slightly fewer for “Good,” and so on. A set of terms which earns an “Exemplary” rating on all criteria would earn a perfect score of 100, and thus earn an overall rating of “Exemplary.”
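To make the scoring mechanics concrete, the following sketch shows one way the per-criterion word ratings could be combined into the 0 to 100 score. The criterion names, weights, and the fractions of full points awarded for ratings below “Exemplary” are placeholders of our own; the actual allocations are still being worked out as part of the rubric.

```python
# Sketch of the terms-of-service score. Criterion weights are assumed to
# sum to 100 across the full rubric; the fractions of full points awarded
# for each word rating are illustrative placeholders.
RATING_FRACTION = {"Exemplary": 1.0, "Good": 0.75, "Acceptable": 0.5, "Bad": 0.0}

# Hypothetical weights for a few of the criteria listed above.
EXAMPLE_WEIGHTS = {
    "rejection_and_nonpayment": 15,
    "account_closure": 15,
    "changes_to_terms": 10,
    # ... remaining criteria, with weights summing to 100 ...
}

def terms_score(criterion_ratings, weights):
    """criterion_ratings: criterion -> word rating for that criterion.
    weights: criterion -> points, summing to 100 across the rubric."""
    return sum(
        weights[criterion] * RATING_FRACTION[criterion_ratings[criterion]]
        for criterion in weights
    )
```

Under this construction, a set of terms rated “Exemplary” on every criterion scores exactly 100, matching the description above.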

When completed, the rubric and scoring system will be published on the Fair Crowd Work site, along with evaluations of several platform terms.

5 Discussion

Beyond the practical outcomes of this work – the artifacts and processes developed – this project highlights several important themes at the intersection of CSCW research and technical practice. First, we discuss the ‘generative friction’ that results from attempts to enact a CSCW praxis in the context of a large established labor organization – a formidable institution! Second, we reflect on the value of technical artifacts for precipitating discussion. Finally, we call for CSCW scholars to bring their expertise to policy conversations and public debate about the regulation and understanding of labor platforms – and automated management systems more generally.

5.1 The generative friction of CSCW praxis in organizations

Neither the human-centered computing perspective nor the “accepted” processes of human-centered design were taken for granted within the trade union. For example, time pressure within the organization led to the previous version of faircrowdwork.org being designed and developed without systematically consulting potential users (i.e., workers). Moreover, the extreme time (or personnel) shortages and organizational pressures militated against a human-centered approach. The previous version was designed in dialogue with an external consultant who is an expert in communication strategy and has worked with the trade union for many years. Technical development was contracted further away from the organization, to a developer in the communication consultant’s network. The developer was not in regular direct contact with the union project managers or with potential users of the site (e.g., platform workers). Project managers inside the organization had too many other responsibilities to have time, for example, to create processes by which the site designer and developer could receive feedback from users while iterating on the site design. Rather, a requirements specification was agreed upon between the project managers inside the organization and the designer; the specification was then passed to the developer.

Time pressures persisted during the redesign, but we argued for more time to talk to workers – and more time to explicitly articulate and consider the views of a much broader set of stakeholders: journalists, policymakers, and other unionists. During the redesign, we had not only to create an alternative process but first to understand what had occurred during the development of the previous version, to articulate possible alternatives, and then to make the case for them. While other simultaneous activities within the “Crowdsourcing Project” meant time was still short and it was not possible to carry out a “fully” human-centered design process with crowd workers, our “human-centered sensibilities” inclined us to take steps in that direction.

The ongoing interaction between organizational goals and time pressures, on one hand, and our inclination, stemming from our training in academic HCI and CSCW, to (to oversimplify the matter a bit) ‘talk with everyone and figure out what they needed,’ on the other hand, created a generative friction. That friction led to an outcome that we as practitioners, union leadership, and external stakeholders recognized as a significant improvement over the previous design, and an important contribution to the union’s ongoing efforts.

5.2 Precipitating discussion

One of the biggest challenges at the start of the redesign process was to articulate and build consensus around a shared idea of the technical artifact: what, exactly, could and should a website do? For whom could it be useful? How would it have impact? As discussed in section 4, the original version of the site was envisioned as a tool for workers; yet workers were not, ultimately, its primary users. Should we redesign the site to better serve workers directly? Or could the sociotechnical artifact be put to use in ways that improve working conditions more indirectly?

Like Turkopticon, faircrowdwork.org is—among other things—a review site. Like Turkopticon, it has influenced the behavior of the “reviewed parties”—here platform operators rather than platform users (i.e., customers or, in MTurk jargon, “requesters”; see Footnote 13). Unlike Turkopticon, faircrowdwork.org is not relied on daily by workers. But like Turkopticon, it is a sociotechnical artifact—understood by most nontechnical stakeholders as a technical artifact—that “stands for” the topics of workers’ experiences and rights and working conditions in digital labor platforms in a variety of discourses—especially European trade union and policy maker discourses on “the future of work.” (On Turkopticon’s role in public discourse, see Irani and Silberman 2013, Irani and Silberman 2016.)

As CSCW luminaries like Suchman and Bødker inspire the broader field of HCI to think about what comes next, “after interaction” (Taylor 2015), our work in this project underscores the importance of understanding technical systems as artifacts that precipitate dialogue. Beyond the obvious ‘use case’ – for reading reviews – the Fair Crowd Work website serves an arguably more important role as a visible and real artifact around which people and agendas can coalesce. It precipitates a conversation and a new set of relations among individuals and institutions that might together form the necessary foundation for social change.

Today, journalists, policy makers, other trade unionists, and researchers contact us and ask, “What is this Fair Crowd Work site?” This question creates space for a dialogue about the diversity of digital labor platforms and platform work; the diversity of platform workers; the diversity of workers’ motivations, circumstances, and experiences; what we understand to be the benefits and challenges of this way of working for workers; our activities on the topic generally; our proposals for clients, platform operators, and regulators; and open questions. As with Turkopticon (see especially Irani and Silberman 2016), the story about faircrowdwork.org that first piqued journalists’ and other trade unionists’ interest was the technological story—the story about design—even if that story was neither quite factually correct nor the story we were most interested in telling. In the case of faircrowdwork.org, however, rather than “innovative researchers build innovative technology” (unsurprising but nevertheless quite sellable) the story was “union builds innovative technology” (surprising!). Journalists’ thirst for the technological story about “our platform” is often followed shortly by a thirst for the “union criticizes new technologically enabled ways of working” story. As the research shows, there is plenty to criticize about working conditions in digital labor platforms. But the first message that the trade union wants to convey is that it is not “generally against” digital labor platforms—the question is simply, as always, whether the working conditions are good, by the trade union’s quite specific interpretation of “good.” If they are not good, how can they be made good?

5.3 Regulation and design

Regulation may be coming to digital labor platforms, at least in Europe (see, e.g., European Parliament 2017; Forde et al. 2017). Yet many questions remain unanswered—and some remain largely unasked. Is regulation needed? Is regulation sufficient to solve the problem(s) reported by workers? Are the problems reported really problems with digital labor platforms, or are they bigger problems? If regulation is needed, what kind of regulation? National regulation of platform processes and interface details? European regulation? An international labor convention? Some other, as yet to be determined, kind of global “regulation”?

Many policy makers and participants in the policy discussion say regulation is necessary but that it is also necessary to make sure regulation does not stifle innovation. In practice, however, this argument is sometimes interpreted simply to mean that one should advance a policy proposal halfway between the strongest proposals and no regulation at all. But there is little discussion of exactly what value “innovation,” especially technological innovation, brings, and to whom. Policy discussions tend instead simply to assume that (a) technological innovation brings economic growth, (b) economic growth is generally good, and (c) regulation can protect things that economic activity in a market whose policies are oriented toward growth alone may disregard or threaten, but that (d) there is a “zero sum” tradeoff between regulation and social protection, on one hand, and innovation, growth, and dynamism, on the other.

It is possible, however, instead of speaking in generalities such as “innovation” and “regulation,” to consider who exactly benefits from novel technology development, how they benefit, and how those benefits might be preserved even under a policy regime that protects worker rights. Making these considerations concrete in the policy discussion around digital labor platforms in particular requires much more nuanced knowledge of really existing working conditions (knowledge of work processes, workers’ and clients’ circumstances, and so on; see e.g. Table 1 and the working conditions rating scheme), as well as of the broader socioeconomic consequences of platform-based work, than is currently being brought to bear.

To answer the question, “Do we need regulation to protect worker rights?” we can consider why we have existing labor regulation such as minimum wages, protection from unjust dismissal, or the right to collectively bargain: in the employment relation, employers usually have more power than workers, and labor regulation exists to protect the less powerful party. In democratic societies, it is broadly accepted that workers are not employers’ property, and that employers therefore do not have unlimited rights, and should not have unlimited power, over their workers. Employers should not, for example, have the right in theory or the ability in practice to require that workers work more than eight hours a day. In a future of work in which digital labor platforms play a growing role, it is unlikely that a single employer will exercise the kind of totalitarian power over workers that employers in industrial production in Europe and North America exercised in the 19th and 20th centuries, and in some places still do. Yet the loss of worker bargaining power catalyzed by the technological enablement of global labor markets has already created economic dynamics in which workers must work much more than 40 hours per week in order to make a living (see especially Graham et al. 2017; Hara et al. 2018). From a trade union perspective, this is undesirable; policy makers and voters may agree.

Even if policy makers choose to intervene to regulate digital labor platforms, the appropriate form for the regulation is unclear. The proposal advanced by the Swedish trade unions (Söderqvist 2016) calls only for European regulation to establish a framework allowing trade unions to organize workers (i.e., for the situation with respect to competition law to be clarified), and for the detailed regulation of platform work processes and interface design to be left to the unions to negotiate in collective agreements and enforce in practice. In this proposal, national governments and certainly European policy makers should stay out of the details; detailed state regulation would reduce the incentive for platform workers to join the unions.

For workers with strong unions, the advantage of this approach over national or international regulation is that unions can have a more nuanced understanding of workers’ needs and work processes on particular platforms, and can, for example, negotiate exceptions to broadly accepted best practices that may be beneficial in some cases. The potential disadvantage is that it may be impracticable: not all unions have the kind of power and expertise necessary to bring even nationally-based digital labor platforms to the negotiating table. And the biggest platforms act globally and are typically ‘located,’ legally speaking, in the United States—but most trade unions have national organizing mandates. Perhaps trade unions will coordinate across national borders to organize workers on global platforms and negotiate international agreements (for example, international framework agreements) with platform operating companies. Or, especially given that organizing many types of platform workers—especially workers doing online work—may lie outside the “core competences” of many trade unions, maybe trade unions will choose to centralize this function in a “competence center” with an international mandate.

In any case—whether trade unions work in a centralized or decentralized way to organize platform workers, and whether national or international governments regulate platform standards directly or leave this work to the trade unions—continued investigation of work processes, working conditions, and workers’ experiences is needed, and CSCW researchers are better equipped than anyone else to carry it out. The trade unions, generally speaking, do not yet have this capability; nor do the labor ministries that will be tasked with enforcing standards in the event that regulation is promulgated. CSCW researchers will be called on to perform these investigations and to disseminate knowledge among trade unions, labor ministries, researchers in other fields, and journalists—and, perhaps, among workers.

5.4 Automated management systems: beyond digital labor platforms

Digital labor platforms are relatively easily recognized as a “new form of work organization” because work is organized mostly or exclusively through digital means and workers are typically required to agree that they will be legally classified as self-employed rather than employees, even if the facts of the work relationship would indicate otherwise. But key to digital labor platforms’ functioning, as Irani (2015) first identified, is the principle of automated or algorithmic management. While the major focus in the discussion of Uber, for example, is the potential for the replacement of millions of taxi drivers with autonomous vehicles, the business model at present—of Uber and other digital labor platforms—is to a large extent not the automation or even deskilling of service provision but rather the automation of management and the concomitant centralization of business administration decisions, market power, and (potential) profit. Uber replaces thousands or tens of thousands of taxi dispatchers working all over the world, mostly for small companies, with a few hundred or a few thousand programmers, working mostly for Uber, working mostly in San Francisco, tasked with automating management. The principle and practice of automated management is not limited to digital labor platforms or freelance arrangements; it can be, has been, and is being, implemented and experimented with in various degrees in even the largest global concerns—for managing both blue-collar workers and high-skilled and thus far relatively well-paid workers. The expertise of CSCW researchers will be needed by worker organizations and policy makers to make sound policy as this principle expands into more and more workplaces.

6 Conclusion

As digital labor platforms become increasingly prominent actors in labor markets and commodity value chains, many questions remain for researchers and practitioners working at the intersection of technical systems development and economic and social life. What will be the consequences of the deployment of this new technology and this new principle of management? CSCW researchers will certainly be called on to participate in answering this question through empirical and theoretical research. But we also, if we so choose, have the opportunity to answer the question by co-creating the answer; and, we can do this not only alone but also in collaboration with other stakeholders—workers, labor organizations, and policy makers.

There is much ‘collective sensemaking’ to do, as a society, as we try to figure out the economic and social meaning of these new computing technologies. This paper documents one effort at the interface of HCC knowledge making and technology craft practices, on one hand, and civil society and policy discourses, on the other, to contribute to this process of collective sensemaking. Our primary argument is that “more CSCW is needed here” and our primary contribution is to give CSCW researchers a sense of how they might be able to contribute, not only “to CSCW,” but also, simultaneously, to this bigger social process. Our question in concluding is not “what are we contributing to CSCW?” nor “what does this mean for [already accepted theoretical and methodological questions in] CSCW?” but rather “what does, or could, CSCW research mean in the bigger world? What can CSCW contribute to everyone else?”

In this project, we have built on the prior work of those politically-invested CSCW researchers who have offered critiques of online platforms and developed tools in service of the immediate functional needs of workers. This project demonstrates one way to build bridges across the disciplinary zones of computing research and labor organizing and activism through the production of a different kind of artifact – a website designed to communicate both the results of a research project, and a particular (if provisional and evolving) political point of view about online work.

The details of platform work processes revealed through the worker surveys and our ongoing detailed readings of platform terms of service serve as a foundation for ongoing discussions with European (for now, mostly, but not only, German) platform-based workers and platform operators. The worker surveys and star rating system, along with the terms of service scoring scheme, embody and make legible and public our own current beliefs about what is relevant, important, desirable, and undesirable in terms of worker rights and worker power in digital labor platforms. The website makes these beliefs public so that they can be contested, clarified, corrected, or affirmed by workers, clients, and platform operators with whom we are in direct discussion—and perhaps understood and taken up by other actors as well, including policy makers, researchers, journalists, designers and operators of other platforms, and other trade unionists, both in Germany and elsewhere. As we—workers, unionists, clients, researchers, and platform operators—continue to collectively refine and extend our understanding of what is relevant, important, desirable, and undesirable in digital labor platforms, these understandings will influence design and operational practice, and—in the not too distant future—become codified in “platform-appropriate” versions of the classical tool for safeguarding worker rights: collective agreements.