Introduction

Body-worn cameras (BWCs) have been suggested as a technological intervention that could address police-community relation problems by enhancing police legitimacy through improved transparency, accountability, civility, and lawfulness of police-citizen encounters (Stanley, 2015; President’s Task Force on 21st Century Policing, 2015). Surveys generally document that citizens support the adoption of BWCs by police departments (see, e.g., Culhane et al., 2016; Crow et al., 2017; White et al., 2018a, b). However, studies suggest that BWCs have not produced consistent impacts on how citizens view the police (Lum et al., 2019). There is some suggestive evidence that the presence of BWCs may improve perceptions of procedural justice and legitimacy among citizens who experience a police encounter (McCluskey et al., 2019; Demir et al., 2020). However, it remains unclear whether BWC adoption produces meaningful changes in broader community perceptions of the police.

The New York City Police Department (NYPD) adopted BWCs as part of a series of reforms emanating from litigation over their stop, question, and frisk practices. During the late 2000s and early 2010s, the City of New York vigorously defended the use of stop, question, and frisk practices as an effective crime prevention strategy (Bloomberg, 2013; Kelly, 2015). While there is some evidence that increased stops produced modest crime control impacts (MacDonald et al., 2016; Weisburd et al., 2016), NYPD stop, question, and frisk efforts were criticized for producing unconstitutional racial disparities in stops, frisks, and searches of citizens (Fagan & Davies, 2000; Gelman et al., 2007), and federal class action lawsuits were filed against the City of New York. In 2013, the City of New York reached a court-approved settlement with the plaintiffs after a federal judge ruled that the NYPD was liable for a pattern and practice of unconstitutional stops of citizens that violated the Fourth and Fourteenth Amendments. The litigation settlement mandated a series of reforms including changes to NYPD stop policies, training, and auditing; modifications to the processing of civilian complaints and officer discipline procedures; the adoption of new measures to evaluate organizational performance; and the establishment and evaluation of a pilot BWC program. The settlement also required the appointment of an independent monitor to evaluate NYPD’s progress in making the agreed-upon reforms.

In the remedial order, then US District Judge Shira Scheindlin ordered the pilot BWC program to determine whether the technology created objective records of police stops, encouraged lawful and respectful police-citizen interactions, offered a way to help determine the validity of accusations of police misconduct, and reduced distrust between the NYPD and the public.Footnote 1 While the NYPD and the court-appointed monitor team were planning the BWC pilot program, the New York University (NYU) School of Law Policing Project (2016) surveyed New York City residents and found that most respondents were overwhelmingly in favor of BWCs, and 82% expressed the hope that the technology would improve fractured police-community relations. At the April 2017 press conference announcing the BWC program launch, New York City government leaders suggested the placement of BWCs on NYPD officers would increase police accountability and transparency, with Mayor Bill de Blasio further noting that “body worn cameras will bring police and communities closer together” and then New York State Senator Marisol Alcantara observing “body worn cameras will be a fabulous tool to increase trust between the community and the police.”Footnote 2

This article presents the results of the court-ordered evaluation of the impact of BWCs on community perceptions of the NYPD based on telephone surveys of representative samples of residents in 20 treatment and 20 control precincts. The article begins by briefly reviewing the existing literature on public perceptions of the police and studies evaluating BWC impacts on community perceptions of the police and procedural justice during police-citizen encounters. We then summarize the NYPD BWC pilot program implementation and describe the cluster randomized controlled trial design, survey data collection methods, and the empirical models that estimate the effect of the BWC program on perceptions of the police. We find that the deployment of BWCs on officers did not produce any meaningful changes in overall community perceptions of the NYPD or officer behavior among those who reported recent encounters with the NYPD. The policy implications are considered in the concluding section.

Literature review

Public perceptions of the police

Most Americans have positive views of their local police departments. Gallup poll data has found that 86% of whites, 80% of Hispanics, and 73% of blacks were either “very satisfied” or “somewhat satisfied” with the police department in the respondents’ city (Weitzer & Tuch, 2006: 41). At the same time, there exists a racial hierarchy in the degree of satisfaction with the police, with nearly half of all whites reporting they were “very satisfied” as compared to 36% of Hispanics and only 22% of blacks responding to the survey. Differences between blacks and whites were shaped by varying perceptions of their treatment during prior encounters with the police, feelings of safety in their own neighborhood, crime control efficacy and police use of community policing strategies, and exposure to police misconduct. Unfortunately, when compared to white subjects, black respondents were more inclined to believe that the police abuse citizens, treat minorities more harshly relative to whites, and were not held accountable for misconduct (Weitzer & Tuch, 2006). Numerous studies have consistently found that blacks express lower levels of trust in and satisfaction with the police (e.g., Taylor et al., 2001; Hurst & Frank, 2000; Leiber et al., 1998; MacDonald & Stokes, 2006; MacDonald et al., 2007).

The May 25, 2020, death of George Floyd during an arrest by Minneapolis Police Department officers sparked nationwide protests and renewed a national conversation on race and policing, stemming from persistent concerns over the deaths of unarmed black men during encounters with the police and racial disparities generated by police enforcement efforts. These events have had negative impacts on public perceptions of the police. Gallup public opinion poll data shows that confidence in the police has fallen in recent years, and for the first time in three decades, less than 50% of Americans expressed “a great deal” or “quite a lot” of confidence in the police in 2020.Footnote 3 The decline in confidence follows the same hierarchy, with the black-white gap in confidence in the police being the largest recorded since 1993.Footnote 4 The City of New York does not conduct ongoing public opinion polls of the NYPD. However, the ad hoc studies that do exist suggest that community residents in general hold favorable opinions of the NYPD, non-white residents have less positive perceptions of the NYPD when compared to white residents, and young non-white males that experience high levels of police contacts have very negative perceptions of the NYPD.

The Vera Institute of Justice surveyed nearly 2,000 residents of five NYPD precincts on their opinions of and experiences with the NYPD (Miller et al., 2004). From the survey data, the Vera researchers used factor analyses to develop underlying latent variables measuring respondent perceptions of police effectiveness and misconduct. Overall, respondents ranked the NYPD high on their perceptions of effectiveness and low on their perceptions of misconduct (Miller et al., 2004). However, these assessments showed a racial hierarchy with white residents (.77 effectiveness, .31 misconduct) reporting more positive assessments than those of Hispanic (.70 effectiveness, .40 misconduct) and black (.65 effectiveness, .56 misconduct) residents. The Vera study noted that negative prior personal and vicarious encounters (i.e., the experiences of a friend or family member) with NYPD were highly predictive of lower effectiveness and higher misconduct ratings (Miller et al., 2004). Respondents with negative direct and vicarious encounters with the NYPD tended to be minority residents living in economically disadvantaged police precincts. Consistent with other studies noting the asymmetric impact of negative encounters with the police on public perceptions (e.g., Skogan, 2006), positive and neutral interactions with officers did not enhance respondents’ assessments of the NYPD.

During the late 1990s and continuing through the 2000s, the NYPD engaged in crime control strategies that dramatically increased the number of misdemeanor arrests and stop-question-and-frisks of citizens (Kelling & Coles, 1996; Fagan et al., 1998). Studies generally found that these strategies generated concerning racial disparities in who was being stopped, frisked, and arrested by the NYPD (Fagan & Davies, 2000; Gelman et al., 2007; Harcourt & Ludwig, 2006; but see Ridgeway, 2007). Fagan et al. (2010) estimated that 80% of black residents between the ages of 16 and 17 had been stopped one or more times by the NYPD in 2006. This intensive law enforcement scrutiny of minority residents in NYC appears to have increased negative perceptions of the NYPD. A recent Vera Institute of Justice survey of 500 NYC residents between the ages of 18 and 25 who had been stopped at least once in high-crime neighborhoods found that 88% of these individuals believed that neighborhood residents did not trust the police (Fratello et al., 2013). Additionally, a representative sample of interviews with 1,261 NYC residents between 18 and 26 found that nearly half had been stopped in a car or on the street and, the more experiences respondents had with the police, the less likely they were to indicate that they were treated fairly by the police and that the encounters were lawful (Tyler et al., 2014).

BWCs and citizen perceptions of the police

BWCs have been rapidly adopted by US police departments, in part, as a response to ongoing police-community relation problems and persistent concerns over police shootings of unarmed black citizens (Lum et al., 2015; White, 2014). The President’s Task Force on 21st Century Policing (2015) suggested BWCs might be an important technology to enhance police legitimacy and community trust in the police. Under the Obama Administration, the US Department of Justice (2015) launched a $20 million funding initiative to support local police departments in outfitting their officers with BWCs. The BWC grant program continued under the Trump Administration. In 2016, the US Bureau of Justice Statistics reported that about 80% of large police departments and sheriff’s offices with 500 or more full-time sworn officers had acquired the BWC technology and 75% of these large law enforcement agencies reported improved community perceptions as a key reason for adopting BWCs (Hyland, 2018). Other reasons for adopting BWC technology included enhanced officer safety, increased evidence quality, reduced civilian complaints, and decreased agency liability.

The available research suggests that citizens support the adoption of BWCs by police departments and hold high expectations for the technology in improving accountability and enhancing citizen confidence in the police (for a summary, see Lum et al., 2019). Using data collected from interviews and focus groups in two cities, Todak et al. (2018) found that judges, prosecutors, mental health workers, city leaders, civilian oversight members, victim advocates, and other key external stakeholders were highly supportive of BWC implementation by their police departments. Detained suspects of crime also favor the deployment of BWCs on police officers (Taylor et al., 2017). However, consistent with the broader literature on public perceptions of the police, minority citizens generally view the potential benefits of BWCs with less enthusiasm and are more skeptical than whites of their efficacy in holding officers accountable for misconduct (see, e.g., Crow et al., 2017; Sousa et al., 2018; Kerrison et al., 2018).

It is worth noting here that BWC advocates do not articulate a well-developed “theory of change” that identifies the specific mechanisms through which outfitting officers with the technology will improve community perceptions of the police. Governmental officials, civil liberty organizations, and community activists tend to describe more proximate outcomes of BWC adoptions, such as increased transparency, enhanced accountability, and more civil police-citizen interactions, that will influence public perceptions and, in turn, result in improved police-community relations. Very few studies have attempted to measure the impacts of BWCs on community-wide citizen perceptions of the police before and after the technology was implemented. And only a handful of evaluations have examined the effects of the technology on the perceptions of citizens who have had interactions with BWC officers.

Ellis et al. (2015) conducted a mail-in survey of Isle of Wight (UK) residents before and after BWCs were placed on police officers. Public approval of the police was very high before and after BWC adoption, and public confidence in the police changed very little after the technology was implemented. Ariel (2016a) conducted a quasi-experimental evaluation of the influence of BWCs on citizen crime reporting behaviors in Denver (CO). Based on an analysis of calls for service data in treatment and control areas, Ariel (2016a) found that BWCs increased the number of calls to the police in low-crime residential street segments but did not influence the number of calls in high-crime street segments. He concluded that the greater willingness of citizens from treatment low-crime residential places to report crimes to the police was a likely result of improved police-community relations stimulated by the placement of BWCs on officers.

Other research has examined the influence of BWCs on citizen perceptions after specific encounters with the police, tending to focus on police legitimacy and procedural justice issues (Lum et al., 2019). Deterrence theory suggests that the presence of BWCs during police-citizen encounters increases the risk that inappropriate behavior will be detected; public self-awareness theories suggest BWCs will cause police officers and citizens alike to be more aware of their behaviors during interactions and stimulate more respectful and courteous behavior (Ariel et al., 2015; but, more generally, see Duval & Wicklund, 1972; Zimring & Hawkins, 1973). Ariel (2016b) suggests that the purported behavioral changes associated with the deployment of BWCs on police officers are consistent with Tyler’s (2003) process-based model of police legitimacy: allowing voice, making neutral decisions, showing trustworthiness, and treating citizens with respect and dignity.Footnote 5 Studies provide some indirect evidence of behavioral changes that supports the link between BWC presence and enhanced procedural justice. Randomized experiments generally find that BWCs improve the civility of police-citizen encounters as evidenced by reduced citizen complaints against treatment officers relative to control officers (e.g., Ariel et al., 2015; Braga et al., 2018; Jennings et al., 2015). The impact of BWC on police use of force remains unclear (Lum et al., 2019; White et al., 2018a, b).

The available evidence on the impacts of BWCs on citizen perceptions of the police following encounters with officers wearing the cameras is mixed. In Spokane (WA), telephone interviews with 249 citizens who recently had an encounter with the police found that their perceptions of procedural justice during the encounter improved when they were aware that the officer was wearing a BWC (only 28% of the subjects reported being aware of the BWC; see White et al., 2017). However, a randomized controlled trial of BWCs in Arlington (TX) found no differences in perceptions of legitimacy, satisfaction, and police professionalism by citizens who recently had an encounter with officers who were equipped with cameras compared to control officers who did not have cameras (Goodison & Wilson, 2017). In Anaheim (CA), a randomized controlled trial that surveyed respondents after encounters with the police reported that the presence of a BWC combined with the use of a procedural justice script to guide officer behaviors during encounters generated larger impacts on citizen satisfaction relative to the presence of the BWC alone (McClure et al., 2017).

Two studies suggest that outfitting officers with BWCs may stimulate procedurally just behaviors during encounters with citizens. In Los Angeles (CA), McCluskey et al. (2019) used systematic social observations of police-citizen encounters to conduct a pre-post analysis of the effects of BWCs on officer behaviors. Officers equipped with BWCs were observed engaging in significantly more displays of procedurally just behaviors during their interactions with citizens compared to officers without BWCs. In the Eskisehir province of Turkey, a quasi-experimental evaluation concluded that drivers stopped by police wearing BWCs reported improved perceptions of procedural justice during the stop and enhanced perceptions of the legitimacy of the traffic officers and the police in general relative to drivers stopped by non-BWC comparison officers (Demir et al., 2020).

This study addresses an important gap in the existing research literature: whether the deployment of BWCs improves broader community perceptions of the police. New York City is an important research site to test the relationship between BWCs and police-community relations given the well-known concerns expressed by minority community members over the NYPD’s excessive use of stop, question, and frisk encounters, the federal court-ordered pilot testing of BWCs to improve the constitutionality of police-citizen interactions, and the hope of politicians that BWCs would help “bring communities and police together.”

Methods

A cluster randomized controlled trial (RCT) was used to evaluate the effects of BWCs on a series of police work activity, civility, lawfulness, and community perception outcomes (Braga et al., 2021). Forty precincts with the highest numbers of Civilian Complaint Review Board (CCRB) complaints against NYPD officers were identified and then matched into 20 pairs based on demographics, socioeconomic characteristics, crime, and police activity (Appendix 1, Table 5). Within each pair, a computer algorithm was used to randomly select one precinct to have BWCs deployed on its officers (the treatment precinct) and the other to be without BWCs (the control precinct). Over the course of the 1-year intervention period, 3,889 NYPD uniformed officers working the third platoon (3 PM to midnight shift) and plainclothes officers working anti-crime unit assignments were equipped with BWCs. Subsequent analyses confirmed that officers in the treatment and control groups were on average similar in terms of demographics, length of service, rank, work activities, and number of citizen complaints (Braga et al., 2021; see Appendix 2, Table 6).
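
The within-pair random selection described above can be illustrated with a minimal sketch. The precinct labels, the function name `assign_within_pairs`, and the seed are hypothetical; the algorithm actually used in the trial is not described in this article, so this shows only the general logic of matched-pair cluster randomization.

```python
import random


def assign_within_pairs(pairs, seed=None):
    """Randomly pick one precinct per matched pair for treatment.

    `pairs` is a list of 2-tuples of precinct labels (hypothetical here).
    Returns a dict mapping each precinct to "treatment" or "control".
    """
    rng = random.Random(seed)  # seeded generator so the draw is reproducible
    assignment = {}
    for first, second in pairs:
        treated = rng.choice((first, second))
        control = second if treated == first else first
        assignment[treated] = "treatment"
        assignment[control] = "control"
    return assignment


# Hypothetical labels standing in for three of the 20 matched pairs.
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]
assignment = assign_within_pairs(pairs, seed=2017)
for first, second in pairs:
    print(first, assignment[first], "|", second, assignment[second])
```

Because the draw is made separately inside each pair, every pair contributes exactly one treatment and one control precinct, which keeps the two arms balanced on the matching variables.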

All five boroughs had at least one precinct included in the treatment group.Footnote 6 The resident populations of precincts are akin to small to midsize US cities with the smallest treatment precinct serving some 47,418 residents (25th Precinct in Manhattan, mostly located in East Harlem) and the largest treatment precinct serving some 188,666 residents (105th Precinct in Queens in the easternmost part of the borough). Within specifically matched pairs, there were treatment and control precincts that both had larger ambient populations of work commuters and tourists generated by business districts and tourist attractions. While the BWC intervention could plausibly benefit residential and ambient populations alike, the key community perception data collection effort was limited to residents of the treatment and control precincts included in the experiment.

The complexities of adopting BWCs in a very large police department that provides services to citizens in a diverse set of communities across a sprawling metropolitan area required the NYPD to develop and execute a careful deployment plan. Core implementation activities included ensuring the BWC treatment precincts had the appropriate information technology resources to handle uploading videos to cloud data storage as well as training participating line-level officers and supervisors in BWC policies and technological operations.Footnote 7 To manage these requirements appropriately, BWC implementation in the 20 treatment precincts was staggered over 7 months beginning in April 2017 and ending in November 2017 (see Fig. 1). Despite varying start dates, all treatment precincts used the BWCs for one full year with the experimental intervention period ending in November 2018.

Fig. 1 Implementation of BWC treatment in 20 matched pairs included in cluster randomized controlled trial

It is important to note that this experiment was executed during a department-wide effort to outfit all NYPD uniformed patrol officers and selected specialized unit officers with BWCs between December 2017 and August 2019.Footnote 8 To avoid threats to the integrity of the experimental design, the NYPD first deployed BWCs on uniformed officers in the 37 precincts that were not included in the evaluation. As each pair of the 40 experimental precincts with the most CCRB complaints completed the 1-year intervention period, control precinct officers then became eligible to wear BWCs as part of the citywide implementation after the experiment ended. As a result, this RCT tests the effects of BWC deployment on citizen perceptions in precincts with the most contentious relationships with the NYPD. BWC deployment on specialized unit officers started in March 2019, well after the last matched pair of precincts completed the 1-year treatment period (November 2018).

The cluster RCT found that the placement of BWCs on NYPD officers generated two statistically significant impacts on the included outcome measures (Braga et al., 2021). CCRB citizen complaints against BWC treatment officers were reduced by 21% when compared to non-BWC control officers. However, stop reports completed by BWC treatment officers increased by almost 39% relative to the number of stop reports completed by non-BWC control officers. Supplemental analyses suggested that the presence of BWCs during stop encounters made NYPD officers more likely to comply with departmental policies mandating the completion of stop reports relative to control officers who did not have their stop encounters recorded on video.

Telephone survey data collection methods

The evaluation of the impact of the body-worn camera program on NYC resident perceptions of the NYPD primarily involved a telephone survey. Hart Research Associates, an independent professional survey research firm, was selected by the NYPD Monitor to administer telephone surveys prior to the introduction of the BWCs (pre-intervention) and after cameras had been in use for a 1-year intervention period (post-intervention). The survey questionnaire measured the demographics of respondents and a series of 26 close-ended items, including questions on specific experiences of being stopped by police or other interactions with police occurring in the prior 12 months. The survey included outcome questions borrowed from prior surveys on individual experiences during police encounters and attitudes towards the police more generally (e.g., see Tyler & Huo, 2002; Reisig et al., 2007; Braga et al., 2014; Sahin et al., 2017). In addition to using pre-validated questions, the survey was piloted to ensure that the questions adequately captured the information needed.Footnote 9

The pre-intervention and post-intervention samples were designed to be representative of the adult populations in the control and treatment precincts, including young minority males who are harder to reach in telephone surveys. A dual-frame random selection approach, involving both landlines and cellphones, was used (see American Association for Public Opinion Research [AAPOR], 2010). TargetSmart, a political data surveying firm with 6.2 million subjects in its New York City database, was provided shapefiles of the 40 experimental precincts to develop a sampling frame of residents and their phone numbers. The US Census Bureau’s 2010 American Community Survey (ACS) was used in conjunction with the shapefile data to set allocations by age, gender, and race for the control and treatment samples. Additional telephone numbers, including contract and prepaid mobile phone numbers, were supplied by a second vendor, Link2Tek. Prospective interview subjects were randomly selected from the lists of phone numbers of residents in the 40 experimental precincts provided by TargetSmart and Link2Tek.Footnote 10 No incentives were offered to prospective interview subjects.

The interviewing firm American Directions fielded the pre-intervention survey in English and Spanish (by respondent choice) during March 2017 and April 2017 (Fig. 1). Pre-intervention surveying was completed in matched pair 5 before the treatment precinct officers were outfitted with BWCs in April 2017. The post-intervention surveys were conducted as the 1-year treatment period expired in each of the matched pairs of treatment and control precincts, beginning in May 2018 and continuing through December 2018. As such, the time between pre-intervention and post-intervention surveys varied across pairs (ranging from 12 to 19 months) but did not vary for the treatment and control precincts included in each matched pair. Since the time between surveys was the same in each matched pair, our experimental analyses were not biased by the varying time periods between pre-intervention and post-intervention surveys across the matched pairs. During pre-intervention and post-intervention survey data collection periods, eligible numbers were repeatedly called at varying times of the day until prospective subjects answered the phone and either agreed or declined to participate; this continued until survey sample quotas were reached.

Eighteen to 34-year-old men in both the treatment and control precincts were oversampled to obtain a sufficient number of respondents who had interactions with the NYPD. However, the treatment and control samples were weighted separately by age and gender to reflect the population level demographics of these precinctsFootnote 11 and minimize any nonresponse bias that may have been present in the survey data collection. The demographic distributions in the weighted survey samples closely matched those reported for adults in the 40 precincts by the 2010 ACS (see Appendix 3, Table 7 for pre-intervention comparisons).

The pre-intervention survey involved live telephone interviews with a total of 3,000 residents in the 20 treatment precincts (25.8% response rateFootnote 12) and 3,000 residents in the 20 control precincts (26.6% response rate). The post-intervention survey involved live telephone interviews with 3,037 residents in the 20 treatment precincts (28.9% response rate) and 3,020 residents in the 20 control precincts (29.3% response rate). The response rates did not significantly differ between the treatment and control groups during the pre-intervention (two-sample difference of proportions |z| = 0.7046, p = 0.4810) and post-intervention periods (two-sample difference of proportions |z| = 0.344, p = 0.733), suggesting that any existing nonresponse bias would be similar in the treatment and control groups and not impact the ability to detect differences in resident perceptions over time. These response rates are also consistent with other recent telephone surveys of citizens who have interacted with police (34.4% in Rosenbaum et al., 2015; 25.0% in Malm et al., 2016; 27.8% in White et al., 2017). The proportion of respondents who were surveyed via cellphones relative to landlines also did not differ significantly across experimental groups in the pre-intervention (treatment = 71% cellphone, 29% landline; control = 72% cellphone, 28% landline; two-sample difference of proportions |z| = 0.858, p = 0.391) and post-intervention periods (treatment = 85% cellphone, 15% landline; control = 84% cellphone, 16% landline; two-sample difference of proportions |z| = 1.070, p = 0.284).
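
For readers unfamiliar with the two-sample difference of proportions test, a minimal sketch follows. It assumes the pooled-variance form of the z-test and a two-sided p-value. The counts are back-calculated from the rounded pre-intervention cellphone shares reported above (71% and 72% of 3,000 respondents each), so they approximate the underlying survey counts rather than reproduce them.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sample z-test for a difference in proportions.

    Returns the z statistic and the two-sided p-value.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Approximate counts: 71% of 3,000 treatment and 72% of 3,000 control
# respondents reached by cellphone in the pre-intervention survey.
z, p_value = two_proportion_z_test(2130, 3000, 2160, 3000)
print(f"|z| = {abs(z):.3f}, p = {p_value:.3f}")
```

Under these assumptions the sketch yields values that round to the |z| = 0.858 and p = 0.391 reported for the pre-intervention cellphone comparison.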

Limiting treatment contamination of control group respondents

Research suggests that officers with BWCs influence the behavior of officers without cameras if they work simultaneously in the same area and interact with the same people (Ariel et al., 2019; Braga et al., 2020). Similarly, BWC exposure through a subset of officers in an area could influence how civilians in that area interact with the police more broadly. Such contamination undermines the ability to detect intervention effects because both treatment and control officers (and civilians) could be modifying their behaviors due to the presence of BWCs. This is known more formally as a violation of the stable unit treatment value assumption (SUTVA) that assumes that the effect of a treatment on a given observational unit is not related to the intervention assignments of other observational units (Rubin, 1980). SUTVA violations undermine the internal validity of randomized experimental designs (Shadish et al., 2002). In a recent systematic review of BWC evaluations, Lum et al. (2020) noted that many BWC studies had some form of contamination between treatment and control groups and this phenomenon was particularly difficult to avoid and manage in BWC studies.

In this study, we attempted to address the potential contamination of the control group by design. Cluster randomization is often used to prevent contamination between treatment and control groups in public health and medical trials (see, e.g., Torgerson, 2001; Magill et al., 2019). Randomly allocating clusters of officers who work in distinct precincts to the BWC treatment or control condition, rather than randomizing individual officers, limits the extent of the contamination problem. There was very little contamination between treatment and control officer assignments over the course of the intervention period.Footnote 13 Nevertheless, it remained possible that BWC officers working in treatment precincts may have responded to relatively rare emergency dispatches involving officer safety concerns that mandated additional resources in adjacent control precincts. Further, it is also possible that residents of control precincts may have worked in or visited treatment precincts and encountered BWC officers in public spaces. We limited all telephone survey outcome questions to respondent perceptions of and experiences with NYPD officers in their immediate neighborhood to diminish the risk of contamination of control respondents due to BWC exposure in treatment precincts they may have frequented.

Measurement outcomes and sample characteristics

Survey questions were intended to capture latent measures (Long, 1983) of perceptions of the NYPD in general, distrust of neighborhood police officers, procedural justice, and negative assessments of interactions with the police that took place in the past 12 months as a result of a car stop, a pedestrian stop, or a request for assistance from the NYPD. Respondents were asked to indicate their general feelings or level of agreement with statements about the NYPD on four- and five-point Likert scale items.

Table 1 shows the covariance of the responses collected from these sets of questions that were analyzed to develop seven outcome measures that capture perceptions of the NYPD. Confirmatory factor analysis was used to test whether the telephone survey data collected for this study fit the hypothesized measurement models applied in previous research (Tyler & Huo, 2002; Reisig et al., 2007; Braga et al., 2014; Sahin et al., 2017). All outcome measures had Cronbach’s alphas (Cronbach, 1951) that exceeded .70, suggesting overall good internal consistency. Confirmatory factor analysis showed that survey items representing outcome measures had strong inter-item correlations (Kim & Mueller, 1978).Footnote 14 The confirmatory factor analyses of the unweighted data further suggested a very good fit between the hypothesized models and the true covariance matrices underlying the data.Footnote 15 All comparative fit index (CFI) values were larger than .90, and all standardized root mean square residual (SRMR) values were below .05 (Brown, 2015). The weakest goodness-of-fit statistics were produced by the unweighted perceptions of neighborhood officers latent variable with CFI = 0.956 and SRMR = 0.035.
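
As an illustration of the internal consistency statistic referenced above, the following minimal sketch computes Cronbach's alpha from item-level responses using the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The response data are hypothetical and are not drawn from the survey.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of per-item response lists; position i in every
    list must belong to the same respondent.
    """
    k = len(items)
    # Each respondent's total score across all items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical five-point Likert responses: 3 items, 5 respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

Values above .70, the threshold used in this study, are conventionally read as acceptable internal consistency; alpha rises as the items covary more strongly relative to their individual variances.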

Table 1 Pre-Intervention Latent Variable Outcomes: Item Correlations, Cronbach's Alpha, and Factor Loadings

Table 2 shows the similarity between respondents in the BWC treatment and control group precincts on key demographic variables and pre-intervention survey items. Standardized mean differences |d| between groups on measures did not generally exceed .20 (a small effect size) (Cohen, 1988).Footnote 16 Treatment and control respondents in the telephone survey had very similar perceptions of the NYPD and neighborhood safety during the pre-intervention period. During the pre-intervention period, very few telephone survey respondents reported being subjected to a car stop while driving in their neighborhood (11.9% of treatment subjects, 12.9% of control subjects) or being subjected to a pedestrian stop while in their building or other public places in their neighborhood (5.9% of treatment subjects, 4.7% of control subjects) in the prior 12 months. Some 19.3% of treatment telephone survey subjects and 20.1% of control telephone survey subjects reported contacting the police for emergency assistance during the preceding year. Cohen’s |d| results also show that treatment and control telephone survey respondents reported very similar perceptions of police behaviors and procedural justice during the pre-intervention period. The only exception (|d| = .29) was that a higher share of telephone survey respondents in the control precincts who were subjected to a pedestrian stop reported being “patted down on the outside of their clothing” (51%) relative to respondents in future BWC treatment precincts who were subjected to a pedestrian stop (33%).

Table 2 Pre Intervention Characteristics and Respondent Outcome Measures: Treatment v. Control

The similarity between respondents in background and perceptions of the NYPD across treatment and control precincts suggests that the cluster randomization process was successful in creating comparable precincts prior to the BWC intervention. Survey data analyses were also weighted to ensure that the comparisons and inferences were generalizable to resident populations in the 40 precincts included in the cluster randomized controlled trial.Footnote 17

Analytical approach

The impact of BWCs on resident perceptions in the treatment precincts relative to the control precincts was estimated through a difference-in-differences (DID) estimator (Card & Krueger, 1994). The DID estimator evaluated the change in resident perceptions in the treatment precincts from the pre-intervention period to the post-intervention period, relative to the same change for residents in the control precincts.Footnote 18 The general equation for our regression models was:

$$ Y_{it} = \beta_0 + \beta_1 \mathrm{BWC}_i + \beta_2 \mathrm{Period}_t + \beta_3 \left( \mathrm{BWC}_i \times \mathrm{Period}_t \right) + u_i $$

In this model, Y_it represents our outcome measure for residents (i) during a specific observation period (t). The regressor BWC is a dummy variable identifying whether residents (i) were in a treatment precinct receiving body cameras (1) or not (0). The reference group comprises control residents in the experiment. The regressor Period is a dummy variable for whether the resident perception outcome was measured during the post-intervention period (1) or during the pre-intervention period (0). Our primary interest is in coefficient β3 on the interaction between the BWC group indicator and the post-intervention period indicator, which represents the DID estimate. Standard errors were clustered by matched pairs to guard against unmeasured dependence within precincts biasing the estimates of BWC impact on public perceptions of the NYPD.Footnote 19
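To make the role of each coefficient concrete, the sketch below computes the four terms of the regression equation from cell means in the simplest case: a linear model with no covariates, where the interaction coefficient equals the difference of the two pre-to-post changes. This is an illustration only. The models estimated in this study were probit, ordered probit, and structural equation models with covariates, and the function name `did_estimate` and the data are hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """rows: iterable of (bwc, period, y) tuples with bwc and period coded 0/1.

    Returns (beta0, beta1, beta2, beta3) for the saturated linear model
    y = beta0 + beta1*bwc + beta2*period + beta3*(bwc*period).
    """
    cells = {(b, p): [] for b in (0, 1) for p in (0, 1)}
    for bwc, period, y in rows:
        cells[(bwc, period)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    beta0 = m[(0, 0)]              # control group, pre-intervention mean
    beta1 = m[(1, 0)] - m[(0, 0)]  # baseline treatment-control gap
    beta2 = m[(0, 1)] - m[(0, 0)]  # pre-to-post change among controls
    # DID: treatment change minus control change.
    beta3 = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])
    return beta0, beta1, beta2, beta3


# Hypothetical outcome scores: (bwc, period, y).
rows = [
    (0, 0, 3.0), (0, 0, 3.2), (0, 1, 3.3), (0, 1, 3.5),
    (1, 0, 2.9), (1, 0, 3.1), (1, 1, 3.4), (1, 1, 3.6),
]
beta0, beta1, beta2, beta3 = did_estimate(rows)
print(round(beta3, 3))
```

Here the treatment group improves by 0.5 and the control group by 0.3, so the DID estimate is 0.2: only the change in excess of the control group's change is attributed to the intervention.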

Given the mild imbalances in treatment and control group subjects noted above, the DID models of the impact of BWCs on public perceptions of the NYPD were estimated with subject demographic covariates to adjust for these differences and improve precision in our estimates. Probit regression models were used to estimate the DID when outcome variables involved binary conditions (e.g., do you know any of the police officers that work in your neighborhood by name? 0 = No, 1 = Yes). Ordered probit regression models were estimated when outcome variables involved Likert scales (e.g., capturing resident perceptions ranging from “very satisfied” to “very dissatisfied”). Resident perception outcome variables measured using Likert scales were reverse-coded to facilitate interpretation of the DID estimator.Footnote 20 Finally, DID estimates were based on structural equation models when outcomes involved latent variables.Footnote 21 We excluded cases with missing values on outcome variables in each regression model.Footnote 22

The statistical power of an experimental design indicates the likelihood that the null hypothesis will be rejected by a statistical test when a particular alternative hypothesis is in fact true (Lipsey, 1990). The probability of making a type II error, that is, failing to reject a false null hypothesis of “no difference” between treatment and control groups, decreases as statistical power increases (estimates range from 0 to 1). In general, a statistical power of .80 is recognized as a desirable level to detect a small effect size (Cohen’s |d| = .20). Statistical power in cluster randomized controlled trials is determined by the number of clusters, sample size of clusters, and the intraclass correlation coefficient measuring the degree to which outcomes are correlated within clusters (see Hemming et al., 2017). In this study, the telephone survey had 12,057 observations clustered in 20 precinct pairs. Our power analyses suggested sufficient statistical power to detect small differences between the treatment and control groups in pre-intervention and post-intervention outcomes in the full sample and somewhat diminished power in the smaller subsamples that experienced police contact. This design had statistical power of .80 (alpha = .05) to detect small effect sizes in the telephone survey outcomes for the full sample (|d| = .035 to .139)Footnote 23 and small to medium effect sizes in the subsamples of respondents who contacted the NYPD for assistance (N = 1,968; |d| = .08 to .39) or experienced a car stop (N = 1,727; |d| = .07 to .26) or pedestrian stop (N = 992; |d| = .10 to .32).
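The way clustering erodes power can be illustrated with the standard design effect, 1 + (m - 1) × ICC, where m is the average cluster size. The sketch below is a simplified illustration, not the power analysis conducted for this study: it treats the 40 precincts as equal-sized clusters, ignores the matched-pair structure and the two survey waves, and uses a purely hypothetical intraclass correlation of .01.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_total, n_clusters, icc):
    """Number of independent observations the clustered sample is 'worth'."""
    mean_cluster_size = n_total / n_clusters
    return n_total / design_effect(mean_cluster_size, icc)


# 12,057 observations and 40 precincts come from the text;
# the ICC of 0.01 is an assumed value chosen only for illustration.
ess = effective_sample_size(n_total=12057, n_clusters=40, icc=0.01)
print(round(ess))
```

Even a small within-precinct correlation shrinks the effective sample considerably when clusters are large, which is why the number of clusters matters as much as the total number of respondents.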

Statistical analyses that involve multiple comparisons run the risk of reporting “false discoveries” as multiple simultaneous statistical tests are conducted (Miller, 1981). As the number of comparisons increases, it becomes increasingly likely that the two groups being compared will differ on some particular outcome. When compared to analyses that involve only a single outcome as a comparison, confidence in analyses that involve multiple comparison outcomes is generally weaker. Using a single comparison and a conventional two-tailed p <.05 statistical significance level, there is only a 5% chance of incorrectly rejecting the null hypothesis when it in fact is true (also known as a “false positive” or type I error). In this study, however, there are 26 simultaneous comparisons made between treatment and control respondents. At the p <.05 level, we would expect roughly one false-positive test result (26 * .05 = 1.3) by chance alone. This relatively large number of comparisons was required to test the multitude of ways that BWCs could impact general citizen perceptions of the NYPD and citizen perceptions of specific NYPD officer behaviors during encounters that included car stops, pedestrian stops, and requests for officer assistance in the past 12 months.
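The arithmetic behind this concern can be checked directly. The short sketch below reproduces the expected number of false positives for 26 tests and, under the added simplifying assumption that the tests are independent (which the survey outcomes are unlikely to satisfy exactly), the probability of at least one false positive.

```python
n_tests = 26   # simultaneous comparisons reported in the text
alpha = 0.05   # conventional per-test significance level

# Expected number of false positives if every null hypothesis is true.
expected_false_positives = n_tests * alpha

# Probability of at least one false positive, assuming independent tests.
prob_at_least_one = 1 - (1 - alpha) ** n_tests

print(round(expected_false_positives, 2))
print(round(prob_at_least_one, 3))
```

Under the independence assumption, the chance of at least one spurious "significant" result across 26 tests is roughly three in four, which motivates the correction described next.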

There are many techniques that can be used to correct multiple comparison problems by recalculating probabilities obtained from a statistical test which was repeated multiple times. The traditional Bonferroni method and other family-wise error rate approaches to correct for multiple comparisons have been suggested to be too conservative (Benjamini, 2010). These methods risk missing many true findings by imposing stringent safeguards which control the probability of making at least one type I error. In this analysis, we used the false discovery rate (FDR) approach (Benjamini & Hochberg, 1995). FDR procedures control the expected proportion of false discoveries (incorrectly rejected null hypotheses). The FDR method generates an adjusted p value known as the q value that assesses false positive rates and allows for an interpretation of risk levels when rejecting null hypotheses (Newsom, 2010).Footnote 24 Like p values, q values range from 0 to 1, with q <.05 suggesting a bona fide statistically significant difference between treatment and control groups. A q value = 1 suggests the result is not statistically significant under any circumstances. For all outcome measures, the FDR procedure was used to determine whether any significant results calculated through traditional p values generated by the DID estimators in our models were actually “false discoveries.”
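For readers unfamiliar with the procedure, the following is a minimal sketch of the Benjamini and Hochberg (1995) step-up adjustment. The source does not state which software or FDR variant produced the reported q values, so the function `benjamini_hochberg` and the p values below are illustrative assumptions only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p values from four tests.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
print([round(q, 4) for q in q_values])
```

In this made-up example, the tests with p = .03 and p = .04 both fall below the conventional .05 threshold, yet their adjusted values (about .053) do not, showing how a nominally significant result can fail to survive the correction.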

Results

Table 3 presents the results of the regressions comparing differences in survey subject responses over the course of the pre-intervention and post-intervention time periods in the BWC treatment precincts relative to the control precincts. The q values associated with the DID estimates show that the presence of BWCs in the treatment precincts did not generate any statistically significant changes in telephone survey subject responses between the pre-intervention and post-intervention periods when compared to the control subject responses. Relative to control subjects, subjects in the BWC treatment areas did not report any differences over time in their perceptions of neighborhood safety, how complaints would be handled by the NYPD, their knowledge of NYPD officer names in their neighborhood, whether they had contacted the NYPD for assistance, and whether they personally had been stopped by the NYPD between the pre-intervention and post-intervention periods. For those subjects who had experienced a pedestrian and/or car stop, no significant differences were reported by subjects in BWC areas relative to control areas over time in terms of officer behaviors during the stop. Relative to control subjects, treatment subjects did report that members of their households (other than the respondent) were less likely to be stopped by the NYPD after the adoption of the BWCs in their precincts, but this difference had a q value of 1, indicating that it was not statistically meaningful and was a false discovery.

Table 3 Probit, ordered probit, and generalized structural equation models of outcomes on differences-in-differences and control variables

Table 3 also presents the results of the models estimating the differences in survey subject responses for the items comprising the outcome variables assessed as latent measures of perceptions of the NYPD. The generalized structural equation models suggested there were no bona fide statistically significant changes in the perceptions held by telephone survey subjects in the BWC treatment precincts relative to subjects in the control precincts. While the p value associated with the DID estimate suggested that treatment subjects' negative perceptions of police officer behaviors decreased with the introduction of BWCs (i.e., became more positive), the FDR analysis suggests this finding is a false discovery (q = 1).

Table 4 shows the results from an exploratory analysis of responses by the race/ethnicity of the respondents (white, black, and Hispanic). As the q values in Table 4 reveal, there were no genuine statistically significant differences in citizen perceptions of the NYPD generated by the deployment of BWCs on NYPD officers in the treatment relative to control precincts among varying racial groups of respondents.

Table 4 Probits, ordered probits, and generalized structural equation models by respondent race

Conclusion

This controlled experimental evaluation tested the impacts of outfitting NYPD officers with BWCs on broader community perceptions of the police and on residents' experiences during their most recent encounter with NYPD officers. Telephone surveys were administered to representative samples of adult residents in the 40 NYPD precincts (20 matched pairs) with the highest number of civilian complaints. The results indicate that the deployment of BWCs on NYPD officers working in treatment precincts did not produce any statistically significant differences in resident perceptions of the NYPD and their experiences with NYPD officers over the course of the trial relative to the perceptions held by residents in the control precincts.

Existing studies have generally not tested whether BWC deployments change broader community perceptions of the police (Lum et al., 2019). As discussed earlier, media coverage and survey research suggest community members support the adoption of body-worn cameras by police departments and have high expectations for the technology in increasing police accountability and improving citizen confidence in the police (Crow et al., 2017; Taylor et al., 2017; Todak et al., 2018). However, the one existing study on changes in public perceptions of the police associated with BWCs employed a nonexperimental evaluation of a UK police department that had high levels of citizen support and satisfaction for the police before BWCs were adopted (Ellis et al., 2015). In NYC, residents expressed similar enthusiasm for the adoption of BWCs by the NYPD (NYU Policing Project, 2016). During the pre-intervention period, our research found that surveyed residents in the experimental precincts held mixed opinions of the NYPD: less than 60% of telephone survey respondents reported somewhat favorable or very favorable feelings towards NYPD officers in their neighborhood. Similar to the UK study (Ellis et al., 2015), our experimental evaluation found that the BWC deployment did little to change these pre-existing resident perceptions of the NYPD and their encounters with NYPD officers.

This research is limited by the relatively low response rate by prospective telephone survey subjects. While our analyses suggest that nonresponse bias is not a problem in comparing public perceptions between respondents in BWC treatment and control precincts, we do not know whether nonrespondents differed between the treatment and control precincts in some unmeasured systematic way that could lead to different conclusions about changing perceptions of the NYPD. The relatively small number of treatment and control respondents who reported being stopped by the NYPD in the past 12 months also may not reflect the experiences of the broader population of people stopped by the NYPD during the pre-intervention and post-intervention periods. Due to New York State Criminal Procedure Law 140.50(4) prohibiting the entry of these data elements, the NYPD does not maintain computerized records of the names and dates of birth of individuals who were stopped by its officers. As such, it was not possible to design a data collection strategy that conducted follow-up surveys with the subjects of NYPD stop reports. The NYPD monitor team complemented the telephone survey with in-person community surveys of a much smaller number of respondents in five matched pairs of treatment and control precincts (one pair in each borough). These findings did not differ from the telephone survey results (and are available at http://nypdmonitor.org/wp-content/uploads/2020/12/12th-Report.pdf).

It is also important to note that our results may not be generalizable beyond the unique context of the 40 precincts in NYC. Our findings diverge from existing research suggesting that the presence of BWCs enhances citizen perceptions of procedural justice during their encounters with the police. We think these differences may, in part, be due to varying methodological approaches. Since NYPD policy requires officers to notify citizens that encounters are recorded, we did not explicitly ask whether treatment respondents noticed BWCs on officers during their encounters. As a result, we are not able to do a subgroup analysis of respondents who did notice BWC presence during their encounter.Footnote 25 Demir et al. (2020) conducted in-person interviews with stopped motorists immediately following their encounter with traffic police officers. Consistent with the US Bureau of Justice Statistics (2018) Police-Public Contact Survey methodology, our research asked respondents if they experienced a pedestrian stop, an automobile stop, and/or a contact for assistance within the past 12 months, and, if they affirmed, then we collected data on their perceptions of police behaviors during the encounter. While we did not ask how long ago their contacts with police occurred, it is possible that any positive perceptions of procedurally just behaviors by officers decayed in the time between the encounter and subsequent interview and did not exert much of an influence on their recollection of NYPD officer behaviors and general assessments of the NYPD. However, other studies suggest that citizens often have detailed memories of their interactions with the police, especially when those interactions generate negative perceptions of officer behaviors (Brunson, 2007; Rios, 2011).

Research underscores the importance of examining direct and vicarious associations between police contacts and appraisals of the police (Rosenbaum et al., 2005; Weitzer & Tuch, 2005; Brunson, 2007). Like the asymmetrical effects noted in other cities (Skogan, 2006), the Vera Institute of Justice survey found that positive experiences with the police were not associated with substantially higher levels of confidence in the NYPD while negative experiences were associated with low confidence levels (Miller et al., 2004). Further, across nine monthly surveys of NYC residents, the Vera study reported that citizen perceptions of the police, whether positive or negative, were quite durable over time (Miller et al., 2004). The placement of BWCs on NYPD officers seemed to improve the civility of police-citizen encounters as evidenced by a 21% reduction in citizen complaints against BWC treatment officers relative to control officers in the larger evaluation that complements this study (Braga et al., 2021). However, citizen complaints of poor police behavior during encounters are fortunately rare events (Ariel et al., 2015; Braga et al., 2018). NYPD officers in the treatment and control groups generated, on average, only one citizen complaint for every 4 years of service. While noteworthy, the reduction in a very low base rate event may not be a powerful enough change over a long enough time period to generate a meaningful shift in durable resident perceptions of the NYPD.

As a policy intervention, the results of this experimental evaluation suggest that police department adoptions of BWCs are not a panacea to police-community relation problems. Indeed, our exploratory analysis of the data did not detect any divergences in the perceptions of the police held by white, black, and Hispanic subjects in the treatment precincts relative to control precincts. The NYPD and other police departments may be better served by doubling down on other programs that have solid scientific evidence of enhancing community attitudes towards the police. For instance, evaluations generally show that citizen perceptions of police performance, satisfaction, and legitimacy are improved by community policing programs (Gill et al., 2014; National Academies of Sciences, 2018). While the growing evidence base is not yet strong enough to support causal assertions (Nagin & Telep, 2017), studies show that citizen perceptions of procedural justice during their encounters with the police are associated with increased perceptions of police legitimacy and cooperation with the police (Tyler, 2006; National Academies of Sciences, 2018). BWCs seem to be helpful in inspiring officers to be procedurally just during encounters with citizens (McCluskey et al., 2019; Demir et al., 2020); however, police departments should be formally training their officers to embrace procedural justice principles during all interactions with the public and not assume the BWC technology will generate positive interactions.

This research suggests that BWCs are unlikely to lead to short-term changes in public perceptions of the police. However, it remains possible that the BWC technology could produce longer-term benefits. When controversial events happen, the public expects to see video of the police-citizen encounter so they can judge whether officers acted lawfully and behaved appropriately. The absence of BWC video capturing a highly controversial police-citizen encounter could result in diminished police legitimacy, generate expensive overtime costs when managing associated citizen protests, and be particularly detrimental should protests become destructive riots. At the very least, the presence of BWCs on officers suggests to community members that mechanisms exist to ensure transparency and hold officers accountable when they misbehave. And, as a component of a broader set of evidence-based strategies to improve community perceptions, the placement of BWCs on officers could help to enhance the legitimacy of the police to the public they serve.