Abstract
Samples drawn from commercial online panel data (OPD) are becoming more prevalent in applied psychology research, but they remain controversial due to concerns with data quality. In order to examine the validity of OPD, we conduct meta-analyses of online panel samples and compare internal reliability estimates for scales and effect size estimates for IV–DV relations commonly found in the field with those based on conventionally sourced data. Results based on 90 independent samples and 32,121 participants show OPD has similar psychometric properties and produces criterion validities that generally fall within the credibility intervals of existing meta-analytic results from conventionally sourced data. We suggest that, with appropriate caution, OPD are suitable for many exploratory research questions in the field of applied psychology.
An Examination of the Convergence of Online Panel Data and Conventionally Sourced Data
“I have recommended reject on every paper I’ve reviewed using this technique. I hope that it is a passing fad, because it is already hurting the integrity of our journals and quality of our science.” –Review Board Member
“This is a great survey tool! I look forward to seeing more papers using such a survey technique.” –Review Board Member
We live in turbulent times for survey research methods. Social scientists in general, and survey researchers in the areas of applied psychology in particular, are finding it more difficult to access high-quality survey data. In response, applied psychology researchers have increasingly turned to commercial firms that recruit pools of potential respondents to participate in survey and opinion research, usually for compensation. Because recruitment and access to subjects are largely conducted through the internet, data provided by services such as MTurk, StudyResponse, and Qualtrics have come to be known as online panel data (OPD). OPD services typically recruit a large pool of respondents who agree in advance to participate in survey studies on a variety of different topics. Essentially, anyone with internet access can volunteer to become a panel member or “opt in” and can choose whether or not to participate in a given task. Many online panels provide payment for participation in the form of cash incentives, gift cards, or charitable contributions, sometimes as little as $0.25 for a short survey. However, questions exist about the suitability of OPD for applied psychology research.
Researchers have used OPD in a range of fields since the 1990s (Postoaca, 2006). Goodman and Paolacci (2017) note that 43% of the behavioral studies published in the Journal of Consumer Research from June 2015–April 2016 were conducted on MTurk. As such, much of the research regarding the reliability of OPD comes from the consumer research field (e.g., Goodman & Paolacci, 2017; Sharpe Wessling, Huber, & Netzer, 2017). The adoption of OPD in applied psychology, although less pervasive, has grown considerably in the last 5 years. To demonstrate this point, we manually reviewed the last 10 years of six highly cited applied psychology journals (i.e., Academy of Management Journal, Journal of Applied Psychology, Journal of Management, Journal of Organizational Behavior, Organizational Behavior and Human Decision Processes, and Personnel Psychology). We found only 31 samples that used OPD in the 5 years from 2006 through 2010, but 307 samples in the 5 years from 2011 through 2015, an almost tenfold increase. Although we can glean some insight from the consumer research studies, it is important to consider the suitability of OPD for empirical studies explicitly in applied psychology.
Two main concerns with OPD revolve around the measurement properties of OPD and the characteristics of OPD samples (Landers & Behrend, 2015; Paolacci, Chandler, & Ipeirotis, 2010). Regarding measurement properties, the key question is the extent to which OPD respondents provide data that are reliable and meaningful. Regarding characteristics of OPD samples, the key question is how different OPD respondents are from “typical” respondents. A number of studies have examined demographic and employment characteristics of OPD samples relative to other, more traditional sampling techniques, such as student or organizational samples (Behrend, Sharek, Meade, & Wiebe, 2011; Gosling, Vazire, Srivastava, & John, 2004; Paolacci et al., 2010; Sprouse, 2011). However, this approach has both empirical and conceptual limitations. Demographic comparisons do not address the extent to which relations among constructs in OPD samples differ from those in conventional applied psychology samples (Shadish, Cook, & Campbell, 2002). We attempt to address this question of generalizability by comparing relations among constructs based on OPD with established population estimates of these same construct relationships.
The Current Study
The purpose of our study is to examine evidence regarding the extent to which online panel samples produce psychometrically sound and criterion-valid research results in the field of applied psychology. The strategy we adopt is to identify a set of frequently examined relations in studies using OPD, including such independent variables as leadership, personality, and affect and their relationships with outcome variables including job satisfaction, organizational commitment, organizational citizenship, and counterproductive work behavior. We then conduct a set of meta-analyses on published and unpublished studies in the field of applied psychology that have used OPD and compare the scale reliabilities and the effect size estimates from these studies with meta-analytic estimates already established in the existing literature. If the reliability and effect size estimates based on OPD studies fall within the credibility intervals provided by established meta-analyses (based on conventionally sourced data), we infer that OPD is not substantively biased relative to conventional samples currently in use. As described previously, others have used primary data to examine the demographic characteristics of OPD as a means of assessing external validity. This paper is the first to focus directly on the extent to which observed results using OPD are consistent with population estimates in the field. Our strategy, based on meta-analytic estimates, complements previous approaches that are based on primary data alone.
Theoretical Concerns with Online Panel Data
Landers and Behrend (2015) suggest reviewers often dismiss OPD as a sample source due to a variety of assumptions that remain largely untested and perhaps even unstated. Fortunately, several scholars have expressed their concerns with OPD explicitly and systematically in published form (Harms & DeSimone, 2015; McGonagle, 2015; Feitosa, Joseph, & Newman, 2015). Below, we review issues of external validity and internal consistency as they relate to OPD and develop the research questions of the study.
External Validity and Online Panel Data
Some scholars question the external validity of OPD because the variety of recruitment methods used results in a nonprobability respondent population (e.g., Harms & DeSimone, 2015). This means that the total pool of potential online respondents is not a representative sample of the US or world working population, the population to which most applied psychology researchers at least implicitly wish to generalize (Landers & Behrend, 2015). Indeed, evidence suggests that OPD samples are more diverse, younger, more educated, but more poorly paid than the general US population (Gosling et al., 2004; Paolacci et al., 2010; Sprouse, 2011) and, at the same time, more diverse, older, and more work experienced than a typical undergraduate research sample (Behrend et al., 2011). However, representative sampling or stratified random sampling is rarely used in applied behavioral science research, including applied psychology research (Fisher & Sandell, 2015; Shadish et al., 2002). Rather, samples of convenience are used, most often employees drawn from a single work organization. Such samples are unlikely to be representative of the entire US working population, much less the worldwide working population (Highhouse & Gillespie, 2009; Landers & Behrend, 2015). For example, Bergman and Jean (2016) showed that, in the aggregate, samples in top I–O journals over-represent salaried, managerial, professional, and executive employees and under-represent wage earners, low- and medium-skilled employees, first-line personnel, and contract workers, relative to the US and international labor pool. Does the lack of representative sampling techniques and the resulting non-representative samples mean that the vast majority of the survey research in the field of applied psychology lacks external validity? Not necessarily.
Methodologists have long argued that the importance of representative sampling depends on the purpose for which the research sample is drawn (Fisher, 1955; Gillespie, Gillespie, Brodke, & Balzer, 2016; Highhouse & Gillespie, 2009). For example, public opinion pollsters as well as consumer behavior researchers typically seek to generalize a sample statistic (e.g., the sample mean) to the larger population in order to predict the voting or buying behavior of that population. They typically rely on representative sampling because a non-representative sample will lead to an inaccurate point estimate of a given attitude or behavior in the general population. Applied psychologists, on the other hand, are typically interested in theoretical generalizability. Theoretical generalizability concerns the extent to which presumed causal relationships among constructs can be expected to hold across other times, settings, or people (Cook & Campbell, 1976; Sackett & Larson Jr, 1990; Shadish et al., 2002). Sackett and Larson Jr (1990) argue that reasonable sacrifices of representative sampling are justifiable if the primary question is whether the presumed causal relationship under investigation can occur and if the purpose of the study is to falsify a theory through null hypothesis significance testing, circumstances that are typical of the applied psychology field. According to Sackett and Larson Jr (1990), under these circumstances, the sole criterion for selecting a setting and sample is that the sample be a relevant sub-group of the general population to which one wishes to generalize.
The logic of theoretical generalizability thus justifies the use of convenience samples for specific scientific purposes even when they do not strictly represent the population to which one wishes to generalize, so long as they may reasonably be seen as a sub-population of the larger population (Sackett & Larson Jr, 1990; Shadish et al., 2002). Several scholars have in fact argued that OPD are more generalizable than typical organizational samples precisely because they are more diverse and because demographic and other characteristics can be screened for in advance to compose samples with the desired characteristics (Bergman & Jean, 2016; Landers & Behrend, 2015). However, some scholars suggest that OPD samples are so different they essentially do not form a sub-group of the population to which the researcher wishes to generalize. Demographic and other characteristics are self-reported and respondents may have financial or other reasons to provide inaccurate information regarding, for example, their nationality or employment status (Feitosa et al., 2015; McGonagle, 2015). Although the typical organizational sample may not be representative of the working population or even of the entire organization from which it is drawn (Bergman & Jean, 2016; Landers & Behrend, 2015), at least the researcher has some confidence respondents are indeed employed workers at the organization (McGonagle, 2015).
These authors suggest that OPD samples differ from traditional samples of convenience on key demographic and employment characteristics and, further, that we can never know for certain how much they differ due to the potential for false reporting. However, as our review of external generalizability suggests, the critical question is not whether samples of convenience differ from the general population. Rather, the question is whether these differences are substantial enough to have a systematic influence on the theoretical relationships of interest to the researcher (Gillespie et al., 2016; Highhouse & Gillespie, 2009; Sackett & Larson Jr, 1990). Fortunately, we can compare the effect size estimates produced by OPD samples with those produced by conventional data without knowing anything about the underlying characteristics of the samples. A failure to find substantive effect size differences would therefore suggest, indirectly, either that sample characteristics do not differ substantially across these two types of data sources or that they differ on characteristics that do not have a significant influence on the effect size estimates.
The strategy we use in this paper is based upon comparisons of cumulative results using meta-analysis rather than a single primary sample. We conduct an omnibus test for differences between OPD and conventionally sourced data, assessing overall differences in effect size resulting from all factors that might differ between the two types of data. If OPD samples do differ from traditional samples used in applied psychology to such an extent that they do not derive from the same general population, we should expect to find the effect size estimates based on studies using OPD to differ significantly from those using traditional organizational samples. If we find substantial differences in effect size estimates, generalization from OPD samples to the general working population will be unjustified without serious consideration of the way these characteristics moderate or limit OPD results. If, on the other hand, we fail to find substantive differences, the field can be more confident that, although OPD samples may be different in a variety of ways, they make up a sub-population of the full population to which we wish to generalize. We might then treat them as we would any other sample of convenience, as the source of tentative theoretical generalizations to the broad working population but with observed effects open to further exploration for moderation in different or less range restricted samples. This logic leads us to our first research question.
Research question 1: Do relationships among independent and dependent variables derived from online panel data differ from the same relationships found in conventionally sourced data?
Measurement Error and Online Panel Data
The second concern with OPD relates to measurement error. Measurement error occurs when individuals’ answers are not accurate or “true” (Dillman, Smyth, & Christian, 2014). One of the primary reasons measurement error may occur is that respondents pay little attention to survey items in anonymous or low-stakes responding situations. Huang, Curran, Keeney, Poposki, and DeShon (2012) have defined insufficient effort responding (IER) as a response set in which participants answer survey questions with little motivation to comply with survey instructions, correctly interpret item content, or provide accurate responses. The effects of such careless responding have generally been assumed to be the introduction of more random measurement error and thus weaker observed relationships with criterion variables (Nunnally, 1978; Schmidt & Hunter, 2014). However, patterned responding (e.g., picking 4 for all questions) may inflate internal reliability if scale items are grouped together and no reverse-scored items are used (Huang et al., 2012) or may inflate observed correlations when the IER response set biases means in the same direction across multiple variables (Huang, Liu, & Bowling, 2015). Researchers have suggested a number of techniques for detecting IER, such as response time, extreme infrequency or bogus items, and psychometric antonyms (Huang et al., 2012; Meade & Craig, 2012).
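The screening logic behind two of these detection techniques can be sketched in a few lines. The following is an illustrative Python sketch, not code from any of the cited studies; the response values, the bogus-item position, and the run-length cutoff of 8 are all hypothetical choices for the example.

```python
def longstring(responses):
    """Length of the longest run of identical consecutive answers."""
    longest = run = 1
    for prev, cur in zip(responses, responses[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def flag_ier(responses, bogus_idx, bogus_expected, max_run=8):
    """Flag a respondent who misses a bogus item (e.g., 'Select 1 for this
    question') or produces a suspiciously long run of identical answers."""
    missed_bogus = any(responses[i] != bogus_expected[i] for i in bogus_idx)
    patterned = longstring(responses) >= max_run
    return missed_bogus or patterned

careless = [4] * 12                               # picks 4 for every item
attentive = [4, 2, 5, 3, 2, 4, 1, 5, 3, 2, 4, 3]  # varied responding
# the item at index 6 is a bogus item whose instructed answer is 1
print(flag_ier(careless, bogus_idx=[6], bogus_expected={6: 1}))   # True
print(flag_ier(attentive, bogus_idx=[6], bogus_expected={6: 1}))  # False
```

In practice these screens are combined with response-time cutoffs and psychometric antonyms, and flagged cases are inspected rather than dropped automatically.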
A number of scholars have suggested OPD may be more prone to IER because respondents have a primarily monetary motivation for responding (McGonagle, 2015). Further, “professional” panel members, that is, members who participate in many surveys or belong to more than one panel, might maximize their income by speeding through surveys with little attention to the accuracy of their responses (Baker et al., 2010; Smith & Hofma Brown, 2006; Sparrow, 2007). Some research has examined the motivation of OPD responders and found that compensation is indeed a primary motivation of survey participation, but interest in the topic, self-insight, and altruism are also important motivators (Behrend et al., 2011; Brüggen, Wetzels, de Ruyter, & Schillewaert, 2011; Paolacci et al., 2010). Evidence linking frequent participation in surveys to IER is also weak. For example, Hillygus, Jackson, and Young (2014) showed that experienced survey takers complete surveys more quickly, but there was no relationship between participation frequency and poor responding. In fact, Hillygus et al. (2014) found less bias in the frequent responders than in the infrequent survey responders in the YouGov panel sample they examined relative to population benchmarks.
Other scholars have used detection techniques to directly examine IER in OPD sources. While evidence for IER is present, it is not clear that IER is more prevalent in OPD than in other types of samples. For example, Harms and DeSimone (2015) report that 9.5% of their sample responded incorrectly to bogus items inserted in their survey and that as much as 35% of their MTurk sample provided extreme outlier response patterns. However, Ran, Liu, Marchiondo, and Huang (2015) reported that infrequent item response rates, ranging from 2.5 to 11.2% in four datasets based on MTurk data, were similar to rates found in four of their student samples. Ran et al. (2015) concluded that OPD and student samples were equally prone to IER. Likewise, Fleischer, Mead, and Huang (2015) found that 15–20% of OPD respondents were identified as inattentive, rates only somewhat higher than those in student samples (Meade & Craig, 2012). Fleischer et al. (2015) suggested that features of some online panel sources, such as MTurk’s respondent quality ratings function, may render OPD less prone to IER than traditional samples if used properly.
Finally, researchers have directly examined the quality of OPD based on psychometric properties. These scholars typically conclude OPD is at least as high-quality as student and field samples. For example, Buhrmester, Kwang, and Gosling (2011) found Cronbach’s alpha and 3-week test–retest reliability of OPD to be good to excellent. Likewise, Behrend et al. (2011) found slightly higher internal consistency estimates in the OPD than in the student sample they examined. Behrend et al. (2011) also used item response theory analyses (Meade, 2010) and found minimal difference in the response characteristics of the OPD and student samples. Feitosa et al. (2015) assessed measurement equivalence (Vandenberg & Lance, 2000) of a measure of Big Five personality on an OPD (MTurk) sample, a student sample, and an organizational sample. They used the default settings for MTurk survey data collection, which includes workers with a 95% approval rate but no specified geographic origin. They found a lack of measurement equivalence with the student and organizational samples when using the whole MTurk sample. However, they found both configural invariance (i.e., the same pattern of factor loadings across samples) and metric invariance (i.e., factor loadings constrained to be equal across samples) when IP addresses were used to eliminate probable non-native English-speaking subjects from the MTurk sample. They conclude that OPD demonstrates measurement equivalence when data is collected from countries where English is the native language.
Thus, while a number of questions have been raised about OPD, previous empirical research suggests that the psychometric properties of OPD are not significantly worse than those of other sample sources. Each of the studies reviewed above is based on the analysis of primary data. Although meta-analytic data cannot be used to conduct item-level data quality analyses, they can be used to assess scale-level indicators of the psychometric quality of OPD, such as reliability. Use of meta-analytic techniques complements the work done with primary data because it allows us to draw more general conclusions about OPD. We therefore compare meta-analytically derived reliabilities based on OPD and traditional data sources in the literature. If the psychometric properties differ, we can conclude that OPD has more measurement error than traditional samples and researchers should give serious consideration to the use of IER detection techniques with such data. If, however, differences do not emerge, we may conclude that OPD and traditional samples have similar internal reliabilities.
Research question 2: Do the internal reliability estimates of samples using online panel sources differ from those of conventionally sourced data?
Methods
Identification of Studies
Our meta-analysis included 90 independent samples based on online panel data for 32,121 online panel participants. Of the 90 samples, 54 were published in academic journals and 36 were from dissertations or samples that were unpublished. To increase the likelihood of gathering available studies based on online samples, we first searched electronic databases (i.e., PsycINFO, Google Scholar, ABI Inform, and ProQuest Dissertations) for the following keywords and various combinations thereof: online panel, Study Response, StudyResponse, MTurk, Mechanical Turk, Qualtrics Panel, Survey Monkey, Zoomerang, online respondent, online study, internet sample, internet panel, and online sample. Combined there were over 25,000 studies that cited one or more of the search terms as of December 31, 2015. We also conducted a manual search of six top applied psychology journals that have published OPD (i.e., Academy of Management Journal, Journal of Applied Psychology, Journal of Management, Journal of Organizational Behavior, Organizational Behavior and Human Decision Processes, and Personnel Psychology) for the years 2006–2015. Finally, we posted calls for additional in-press or unpublished articles on two OB/HR listservs, HRDIV_NET and RMNET; we gathered six additional studies in this way.
Inclusion Criteria
Our initial search included over 25,000 total citations with one or more of the search terms. We were interested in finding empirical data from an online respondent pool (e.g., StudyResponse, MTurk, Qualtrics) which had included a common OB/HR relationship with existing meta-analytic data that could be used for comparison. Of the total citations that included one or more of the online panel search terms, 5463 also included mention of at least one key variable of interest (i.e., either an independent (IV) or dependent variable (DV) of interest). As our search included information from several databases, we then searched for any duplicate citations, which reduced the remaining number to 3158 citations. We then determined which of these studies included quantitative, statistical data resulting in 838 potential studies remaining. Of these 838 quantitative studies, only 107 contained a relationship (i.e., IV–DV relationship) of interest (e.g., conscientiousness to OCB). Many studies using online panels were experimental in nature and testing a new manipulation or intervention on a DV of interest, and not necessarily an IV–DV relationship of interest.
Of the 107 studies considered for inclusion, 23 studies provided data that were not useable for our purposes (see Appendix 3 for a full list of these studies). The following study types were excluded: studies which used an online webhosting service (e.g., Qualtrics) but collected data from a conventional sample (e.g., employees at a specific company, k = 10), studies which mixed conventional and OPD samples together (k = 9), studies which used online panel data drawn from a specific, non-generalizable population (e.g., a sample drawn from Craigslist in a given area, k = 3), and studies which used online panel participants and examined relationships of interest but did not report an effect size (k = 1). Furthermore, if a paper contained multiple studies, only data from studies using exclusively an OPD sample were included. The available OPD needed to consist of relationships that were comparable to existing conventionally sourced meta-analyses; only those relationships for which enough OPD studies were available (i.e., k ≥ 3) were analyzed and compared. We followed Wood’s (2008) detection heuristic to ensure that we did not include any duplicate study effects.
Following guidelines outlined by Schmidt and Hunter (2014), we averaged correlations obtained from samples using multiple measures of the same construct (e.g., OCB) so that each effect size reflected a unique sample. We corrected the variance of the averaged effect size using equations provided by Borenstein, Hedges, Higgins, and Rothstein (2009). Finally, there were no criteria regarding the publication date or sample nationality. The nationality of sample participants was not clearly reported for most of the samples (k = 50). Of the 40 samples whose participants’ nationality was reported, most were exclusively from the USA (k = 30). There was one exclusively Dutch sample. The remaining samples (k = 9) were of mixed nationalities with participants from the USA and other countries. Of those nine samples, seven samples included a majority of US participants and two samples included a majority of participants from India. Two members of the authorship team coded the studies. These individuals independently coded a random subset of the studies and the interrater reliability was high at 99.3% (868 cells/874 cells; Cohen’s kappa = .986). The discrepancies were resolved through discussion.
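The averaging step described above can be illustrated with a short sketch. Assuming a sample reports m correlations with the same construct (e.g., two OCB measures), the composite effect is their mean, and its variance accounts for the correlation between the outcome measures (Borenstein et al., 2009). This is an illustrative Python sketch, not the analysis code actually used, and all numbers below, including the assumed correlation between measures, are invented.

```python
from math import sqrt

def composite_effect(rs, variances, r_between):
    """Mean of m correlated effect sizes from one sample and its variance,
    following the composite-outcome logic in Borenstein et al. (2009)."""
    m = len(rs)
    mean_r = sum(rs) / m
    var = sum(variances)
    # add the covariance terms between correlated outcome measures
    for i in range(m):
        for j in range(m):
            if i != j:
                var += r_between * sqrt(variances[i] * variances[j])
    return mean_r, var / m ** 2

# two OCB measures in the same sample; r_between is the assumed correlation
# between the two outcome measures (all values hypothetical)
mean_r, var = composite_effect([0.30, 0.24], [0.004, 0.004], r_between=0.5)
```

With these inputs the composite is r = .27, and its variance (.003) is larger than the naive variance of an average of independent effects (.002), reflecting the dependence between the two measures.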
We coded the OPD studies for the type of data pre-screening and quality checks used by the original authors. Unfortunately, 34% of the samples provided no information about pre-screening of participants and 53% provided no information about data quality checks. Since non-reporting does not necessarily mean no checks were employed, we deemed this coding too “noisy” to analyze. Nevertheless, it may be instructive to know that 30% of the samples reported requiring participants to have a specific work status (e.g., full time or a minimum number of hours per week), 27% required other specific work characteristics (e.g., have a direct supervisor), and 24% required a specific geographic setting (however, only 16% reported using screening questions to ascertain these participant attributes). Further, some type of insufficient effort responding checks (e.g., bogus items or pattern responding) was used in almost 35% of the samples. Elimination of subjects for missing data was reported in 27% of the samples.
Selection of Comparison Conventional Meta-analyses
To determine whether the OPD population estimate falls within the 80% credibility interval of existing, conventionally sourced meta-analyses, we created a protocol to identify the existing meta-analytic data to use. The decision rules, agreed upon by the research team before one of the researchers searched for and identified meta-analyses examining the common OB/HR relationships of interest, are as follows. First, the researcher found all existing meta-analyses which had data for a given relationship. Then, if multiple meta-analyses were identified for a single relationship, the study with the highest k around which CVs could be constructed was chosen. It was important to use the point estimate and corresponding CVs with the highest k to provide the most accurate and reliable population estimate of conventionally sourced data. Furthermore, since we are comparing overall effects between OPD and conventional meta-data, the overall effect sizes were used when possible (i.e., data from “main effects” tables) instead of effect sizes drawn from moderator analyses. Thus, whenever possible, we compare main effects and corresponding CVs of conventional meta-data with main effects of OPD. When applicable, we used weighted averages to calculate an overall effect size for constructs. We noted instances of this at the bottom of Table 4 in Appendix 1. Finally, we ensured that the corrected scores for all meta-analytic results were as comparable as possible. All but one of the meta-analyses corrected for reliability in the independent and dependent variables and made no other corrections. One conventional meta-analysis (Chiaburu, Oh, Berry, Li, & Gardner, 2011) also corrected for range restriction in the predictor (personality) values using the estimated range restriction ratio (ux) from Schmidt, Shaffer, and Oh (2008).
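The comparison rule at the heart of this protocol reduces to a simple interval check. In the sketch below (Python, illustrative only), an 80% credibility interval is constructed from a conventional meta-analytic estimate ρ and its SDρ using z ≈ 1.282, and an OPD estimate is judged convergent if it falls inside; all numeric values are hypothetical.

```python
def credibility_interval(rho, sd_rho, z=1.282):
    """80% credibility interval around a meta-analytic population estimate;
    z = 1.282 leaves 10% in each tail."""
    return rho - z * sd_rho, rho + z * sd_rho

def within_credibility_interval(rho_opd, cv_lower, cv_upper):
    """Convergence rule: does the OPD estimate fall inside the CV?"""
    return cv_lower <= rho_opd <= cv_upper

# hypothetical conventional estimate rho = .20 with SD-rho = .10
lower, upper = credibility_interval(0.20, 0.10)           # (.0718, .3282)
print(within_credibility_interval(0.21, lower, upper))    # True
print(within_credibility_interval(0.50, lower, upper))    # False
```

Note that a credibility interval describes the distribution of true effects across studies, not sampling error in the mean estimate, which is why the paper reports the confidence-interval checks separately.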
Meta-analytic Techniques
We used Schmidt and Hunter’s (2014) psychometric meta-analysis approach to analyze the effect sizes of the OPD correlational relations. We performed the calculations using metafor in R (Viechtbauer, 2010). To ensure that the OPD true score calculations were as comparable as possible, we corrected for reliability in the independent and dependent variables for all of our analyses. For those data missing reliability information, we used artifact distributions (Schmidt & Hunter, 2014). Additionally, we used the ux values from Schmidt et al. (2008) to correct for direct range restriction in the personality values when calculating the true score values between the Big Five personality traits and OCB (to be comparable with Chiaburu et al., 2011). The ux values used were as follows: conscientiousness .92, agreeableness .91, neuroticism .91, extraversion .92, and openness to experience .91.
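The individual corrections named here can be written out explicitly. The sketch below shows the standard disattenuation formula and the Thorndike Case II correction for direct range restriction; it illustrates the component formulas only, not the full Schmidt–Hunter artifact-distribution procedure, and the observed correlation and reliabilities are hypothetical (only the ux = .92 value for conscientiousness comes from the text).

```python
from math import sqrt

def correct_attenuation(r, rxx, ryy):
    """Disattenuate an observed correlation for unreliability in X and Y."""
    return r / sqrt(rxx * ryy)

def correct_direct_range_restriction(r, ux):
    """Thorndike Case II correction for direct range restriction;
    ux = restricted SD / unrestricted SD of the predictor."""
    return (r / ux) / sqrt(1 + r ** 2 * (1 / ux ** 2 - 1))

r_obs = 0.20                                                # hypothetical
rho = correct_attenuation(r_obs, rxx=0.80, ryy=0.85)        # ~0.243
rho_rr = correct_direct_range_restriction(r_obs, ux=0.92)   # ~0.217
```

Because ux is close to 1 here, the range-restriction correction changes the estimate only slightly, while the reliability correction has a larger effect.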
To compare scale reliabilities, we used reliability generalization, a framework developed by Vacha-Haase (1998) based on the concept of validity generalization, as a means to amalgamate the variability in reliability estimates that occurs across measurements. The goal of reliability generalization is similar to that of a traditional meta-analysis: to obtain a weighted mean alpha and estimate the degree of variability in alpha across different measurements and samples. Consistent with best practices (Botella, Suero, & Gambara, 2010), we performed all calculations on non-transformed estimates of alpha. We weighted the alphas by their inverse variance. We calculated the variance using derivations of the SE of alpha as explained by Duhachek, Coughlan, and Iacobucci (2005).
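The weighting scheme can be summarized in a few lines. In this illustrative Python sketch, each sample's alpha is weighted by the inverse of its sampling variance; the alphas and variances below are invented, whereas in the analysis itself the variances derive from the Duhachek et al. (2005) standard error of alpha.

```python
def weighted_mean_alpha(alphas, variances):
    """Inverse-variance weighted mean of coefficient alpha across samples:
    more precisely estimated alphas get proportionally more weight."""
    weights = [1 / v for v in variances]
    return sum(w * a for w, a in zip(weights, alphas)) / sum(weights)

# hypothetical alphas and sampling variances for three samples
alphas = [0.88, 0.82, 0.91]
variances = [0.0004, 0.0009, 0.0002]
mean_alpha = weighted_mean_alpha(alphas, variances)   # ~0.890
```

The weighted mean (about .890) sits closer to the precisely estimated .91 than a simple average (.87) would, which is the point of inverse-variance weighting.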
Moderator Analysis
Although the primary purpose of this research study was to compare the effects of OPD to those from conventional data sources, we performed some supplementary analyses to examine potential moderators that may influence the OPD effect sizes. We examined three potential moderators: publication status, OPD source, and publication date. Regarding publication status, it is likely that reviewers have more closely scrutinized data from published studies and therefore these data have undergone more data cleaning and integrity checks than data in unpublished studies. These additional integrity checks may moderate the examined relationships. Regarding OPD source, subjects from MTurk often have lower compensation rates than other paid OPD sources, such as StudyResponse or Qualtrics. Therefore, MTurk respondents may differ systematically from respondents in other OPD sources due to the lower compensation (e.g., they may speed through the survey randomly selecting choices, which may attenuate relationships). Finally, it is possible that the nature of OPD respondents has changed over time as OPD has become more popular. Therefore, the date when OPD was collected may moderate relationships. We used the metafor package in R (Viechtbauer, 2010) with restricted maximum-likelihood estimation to examine whether or not these three moderators influenced the OPD relationships. For publication status and OPD source, we examined relationships where we had at least three studies in each group. For publication date, we performed the moderator analysis when there was at least one study published in three different years.
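The logic of a subgroup moderator test can be illustrated with a simplified sketch. This is not the mixed-effects REML model fit with metafor; it is a basic fixed-effect subgroup comparison in Python, computing a Q-between statistic that would be referred to a chi-square distribution with (number of groups − 1) degrees of freedom. All effect sizes and variances below are invented.

```python
def fe_mean(es, var):
    """Fixed-effect inverse-variance weighted mean and its variance."""
    w = [1 / v for v in var]
    m = sum(wi * e for wi, e in zip(w, es)) / sum(w)
    return m, 1 / sum(w)

def q_between(groups):
    """Q-between statistic comparing subgroup means.
    groups: list of (effect_sizes, variances) per subgroup."""
    means = [fe_mean(es, v) for es, v in groups]
    W = [1 / vm for _, vm in means]          # weight = inverse subgroup variance
    grand = sum(wi * m for wi, (m, _) in zip(W, means)) / sum(W)
    return sum(wi * (m - grand) ** 2 for wi, (m, _) in zip(W, means))

# hypothetical published vs. unpublished OPD studies for one relationship
published = ([0.25, 0.30, 0.28], [0.004, 0.005, 0.004])
unpublished = ([0.20, 0.22, 0.18], [0.006, 0.005, 0.007])
q = q_between([published, unpublished])      # compare to chi-square, df = 1
```

With these made-up values Q ≈ 1.56, well below the .05 critical value of 3.84 for one degree of freedom, so this hypothetical moderator would not be supported.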
Results
Research Question 1: External Validity
Our first research question was whether relationships among variables derived from online panels differ from conventionally sourced data. We present the meta-analytic estimates from OPD samples in Table 1 and graphically in Fig. 1. We compare the results from the OPD meta-analysis to the meta-analytic estimates that we gathered from the existing literature, which we present in Table 4. Recall that our research question asks if ρ-OPD, the population estimate of the size of a given relationship based upon studies using online panel data, falls within the 80% credibility interval of the population estimate based on the conventionally sourced data. We found that 86% (37/43) of the IV-DV relationships fell within the 80% credibility intervals of conventionally sourced data.Footnote 2
Each of the relationships that fall outside the credibility interval tends to be stronger for the OPD sources than for the conventional sources, whether more positive or more negative. Three of the five relationships that were outside the credibility interval involved turnover intentions. The relationship between positive leadership and turnover intentions was more negative for OPD (ρ = − .50) than in conventional samples (80% CV − .40, − .06). The relationship between conscientiousness and turnover intentions was also more strongly negative for OPD (ρ = − .29) than in conventional samples (80% CV − .24, − .08). Finally, the relationship between openness to experience and turnover intentions was consistently negative for OPD (ρ = − .17; 80% CV − .28, − .07), whereas the relationship in the conventional samples was less consistent (80% CV − .15, .17).
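The credibility-interval check used throughout this section can be sketched as follows, assuming normally distributed true effects. The numbers are hypothetical, chosen only to roughly reproduce the positive leadership-turnover intentions comparison described above.

```python
# Sketch of the 80% credibility interval check: does the OPD estimate
# fall inside rho_bar +/- z * SD_rho from a conventional meta-analysis?
Z_80 = 1.2816  # normal quantile for a two-sided 80% interval

def credibility_interval(rho_bar, sd_rho, z=Z_80):
    """80% credibility interval around the conventional mean rho."""
    return (rho_bar - z * sd_rho, rho_bar + z * sd_rho)

def opd_within_cv(rho_opd, rho_bar, sd_rho):
    """True if the OPD population estimate lies inside the interval."""
    low, high = credibility_interval(rho_bar, sd_rho)
    return low <= rho_opd <= high

# Hypothetical conventional estimate (rho_bar = -.23, SD_rho = .13)
# yields roughly the (-.40, -.06) interval reported above.
print(opd_within_cv(-0.50, -0.23, 0.13))  # -> False (outside the interval)
```

Repeating this check across all 43 IV–DV relations gives the proportion-within-interval summary reported in the text.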
We also examined the confidence intervals to note any pattern of significant differences in the OPD versus conventional superpopulation effect sizes. Confidence intervals were reported in the conventional meta-analyses for 29 of the effect sizes (not all conventional meta-analyses reported confidence intervals). We found that ρ-OPD was within the 95% confidence interval of the conventional meta-analytic effect size in 10 of the cases, was outside the upper bound in nine of the cases, and was outside the lower bound in 10 of the cases. Of the 19 effect sizes that fell outside the confidence interval (either the upper or lower bound), 11 of the OPD effect sizes were stronger than the conventional effect sizes and eight were weaker. These results suggest that there is no systematic difference between the OPD and conventional effect sizes. This is not to say that there are no differences; rather, the differences do not seem to follow any interpretable pattern. As a final check, we examined whether the 95% confidence interval from the OPD meta-analysis overlapped with the 95% confidence interval from the conventional meta-analyses. There were three cases where the confidence intervals did not overlap: conscientiousness-turnover intentions, openness to experience-turnover intentions, and negative affect-CWB.
Moderator Results
We examined three potential moderators that may influence the OPD relationships of interest: publication status, OPD source, and publication date. Although a few differences emerged, they were generally small and showed no systematic pattern. Publication status (published versus non-published) moderated only three of the 18 relationships that we examined (neuroticism-job satisfaction, neuroticism-CWB, and negative affect-CWB). Two of the three relationships were attenuated by publication status (negative affect-CWB was strengthened). Source (MTurk versus other) moderated two of the 19 relationships that we examined (conscientiousness-job satisfaction and negative affect-CWB). One of the two relationships was attenuated by source (negative affect-CWB was strengthened). Finally, publication date moderated four of the 39 relationships examined (extraversion-turnover intentions, extraversion-CWB, openness-job satisfaction, and negative affect-turnover intentions). One of the four relationships was attenuated by date (the relationship between openness and job satisfaction weakened as the publication date increased). Because of these largely null findings, the results of these analyses are not included in the manuscript but are available from the first author upon request.
Research Question 2: Reliability Generalization
Our second research question asked whether the internal reliability estimates from online panel sources differ from those found in conventionally sourced data. The results of the reliability generalization are presented in Table 2 and, graphically, in Fig. 2. Here, we compare the results of the reliability generalization analysis using OPD sources to a comprehensive reliability generalization study conducted by Greco, O’Boyle, Cockburn, and Yuan (2015). We were able to compare the reliability point estimates of 12 constructs from the Greco et al. (2015) analysis to those from the reliability generalization using the OPD sources. All 12 point estimates from the OPD analysis fell within the 80% credibility intervals from the larger reliability generalization study. These results suggest that the internal consistency of scales administered to OPD samples is similar to that of conventional sample sources.
Discussion
Online panel sources are increasingly being used to compose research samples in the field of applied psychology. The purpose of our research was to examine the external validity and measurement properties of OPD. We used meta-analytic techniques to aggregate the published and unpublished online survey data and compare the psychometric properties and criterion validity of these data to those found in conventional data sources. Our reliability generalization analyses showed that 100% (12 of 12) of the reliability generalization estimates from OPD samples were within the 80% credibility values of the reliability estimates based on conventional samples (Greco et al., 2015). Based on both the primary data analyses reported in previous work and our analyses using aggregate data reported here, it appears that OPD does not systematically affect internal consistency in applied psychology research.
Little previous research has examined the criterion validity of OPD in the field of applied psychology. To test external validity, we calculated meta-analytic effect size estimates for 43 IV–DV relations frequently found in OPD and compared them to these same relations based on conventional data. The OPD population estimate fell within the 80% credibility interval established in previous meta-analyses based on conventional data 86% of the time, suggesting that differences between OPD and conventional data do not exceed chance. Thus, OPD appears to provide effect size estimates that do not differ from those based on conventional data in the field. Together, our examination of the internal and external validity of data provided by online panel sources suggests such data are as appropriate as other samples of convenience used in the field of applied psychology. As with all convenience samples, it is important to be able to justify that the sample source is appropriate for addressing the hypotheses/research questions. For example, it would be difficult to justify MTurk as a sample source for a study of CEOs.
Theoretical Implications
It is important to understand the purposes for which OPD is or is not appropriate. OPD, like the vast majority of samples used in applied psychology, provides a convenience sample in the sense that it is not necessarily a representative sample of the US or world working population. It is not appropriate to generalize sample statistics, such as a mean, to a population when using a non-representative sample. However, point estimates are rarely the focus of research in the applied psychology field, which tends to focus much more on causal relations among constructs and to rely on the concept of theoretical generalizability. According to generalizability theory (Sackett & Larson Jr, 1990), samples of convenience are appropriate when one wishes to generalize presumed causal relationships among constructs to a broader population and if the convenience sample is reasonably similar to the population to which one wishes to generalize. For such purposes, a completely random or stratified random sampling of the population is not necessary. Rather, one can make a strong case for generalizability if the convenience sample is reasonably similar to the larger population, for example, if the convenience sample is a subsample of the population. Some authors (Harms & DeSimone, 2015) have suggested that OPD respondents may not be truthful about their demographic or employment characteristics or may be so different as to preclude generalization to the broad working population. If this is so, our approach cannot tell us exactly what demographic and work experience characteristics OPD respondents possess, but our results do show that OPD demonstrate psychometric properties and criterion validities that are not meaningfully different from those of conventional field data. Thus, even if OPD samples differ from organizational samples on a number of attributes, these differences do not seem to have a systematic influence on the theoretical relationships we examined.
This strongly suggests that the OPD samples are reasonably similar to other samples typically used in the field and thus make up an appropriate convenience sample.
Practical Implications
Our results and review of the literature on OPD yield a number of practical implications for scholars seeking to use OPD in their research beyond the theoretical considerations discussed above. Although we coded OPD studies for the types of respondent screening and data cleaning procedures used, reporting was inconsistent and incomplete, so we could not determine exactly which procedures were used or what effect each data handling technique might have on the quality of the data. It is important to note that some data screening procedures were used in the majority of the studies that make up our OPD meta-analyses. Therefore, until we can gather more accurate information regarding exactly which screening techniques are used, the conservative approach is to recommend a relatively comprehensive list of the screening procedures we found in the OPD-based studies. Table 3 provides a summary of best practices for data handling derived from the literature and the techniques already used with OPD in the field (see also DeSimone, Harms, & DeSimone, 2015). Overall, we recommend researchers carefully consider the purposes of their study, the population sampling frame, the incentives they use to select and motivate respondents, and the data screening procedures they use to eliminate poor responders. Further, we strongly suggest expressly detailing these procedures in the methods section of the article. Future research should determine which of these procedures are effective.
OPD may not be appropriate if a researcher is theorizing about specific contextual processes (e.g., information processing) or is concerned with a specific group of people (e.g., CEOs), since the convenience sample may not experience the same contextual influences and may not make up a subsample of the desired population. Bergman and Jean (2016) go further to suggest that unrepresentative samples may lead scholars to overlook important workplace phenomena that exist only in specific subgroups, such as food insufficiency or economic tenuousness. However, others have suggested that OPD sources can be of great utility precisely because they are more diverse and provide access to under-represented populations (Smith, Sabat, Martinez, Weaver, & Xu, 2015). Researchers should always be able to justify the appropriateness of the sample (source) for addressing their specific hypotheses.
Limitations and Future Research
This study, based as it is on meta-analytic techniques, has limitations common to meta-analysis. First, because the use of online panels is relatively recent in the field, the number of relationships examined and the number of studies in each meta-analysis is limited. Although we include personality, work attitudes, and leader behavior as independent variables and attitudes, behavioral intentions, and employee behavior as dependent variables, future research might extend our results to a broader range of IV–DV relations. However, the consistent nature of our results leads us to expect similar outcomes with other constructs. Second, the small number of studies for each effect size estimate restricts our ability to conduct moderation analyses by OPD source. Examining our data by OPD service source revealed no substantive differences, but future research based on a greater number of studies could explore this potential moderation with more statistical confidence.
Incomplete reporting in the primary studies regarding the way data were collected limited our ability to explore the extent to which data screening and cleaning might improve data quality. Our results suggest that the data handling procedures currently used in the field are adequate, since the OPD and conventional data do converge, but a more systematic understanding of these factors might make data collection smoother and more cost effective. Further research might also focus on the techniques and practices that the online panel firms themselves use to develop and maintain high-quality survey respondents, including the forms of compensation, identification protocols, and quality feedback from end users (Callegaro, Villar, Yeager, & Krosnick, 2014). Online panel participants and online panel service practices may change at any time, so continued attention to OPD quality issues is warranted.
A third limitation is that some of the more recent meta-analyses that we used to establish the 80% CV for conventional data themselves include a small number of OPD samples. We examined each of the conventional meta-analyses for studies that used OPD samples and found slight overlap. The Choi, Oh, and Colbert (2015) and Chiaburu et al. (2011) meta-analyses each contained one study that used OPD. The Mackey, Frieder, Brees, and Martinko (2015) meta-analysis contained five studies that used OPD. We chose to use the existing meta-analyses to represent the established true score estimates in the field because the small number of OPD samples is unlikely to have much influence and because the number of judgment calls necessary to update all of these meta-analyses would inevitably raise questions of their own.
A final limitation is that the majority of the OPD sources used in this study were from USA-based companies (MTurk, StudyResponse, Qualtrics). Due to differences in labor markets, social welfare, the culture of employee-employer relations, and other cultural differences, these results may not generalize to OPD from other countries.
As these future research ideas suggest, there is much more we might want to know about the nature of online panel samples and services. However, our results support a growing body of evidence that online panels can provide data that are appropriate for testing some hypotheses about the general population within the field of applied psychology.
Notes
These quotes are from an open-ended question (“Is there anything else you wish to say about online panel samples that haven’t been covered in this survey?”) from an anonymous survey sent to a randomized selection of 500 review board members from Academy of Management Journal, Journal of Applied Psychology, Journal of Management, Organizational Behavior and Human Decision Processes, and Personnel Psychology in March 2014.
Although the primary purpose of this research was to examine online panel data as a whole, there may be interest in examining differences between MTurk and other online panel sources (such as StudyResponse and Qualtrics). Therefore, we performed supplemental analyses for relationships where there were a minimum of three MTurk samples and three samples from other online panel sources. The results are not substantially different, as 80% of the MTurk relationships and 88% of the Qualtrics/StudyResponse/Zoomerang relationships were within the 80% credibility interval of the conventional meta-analyses. Results are presented in Table 5.
References
The articles marked with an asterisk are included in the meta-analysis.
*Alarcon, G. M. (2009). The development of the Wright work engagement scale (Doctoral dissertation). Available from ProQuest Dissertations & Theses database. (UMI No. 3393395).
*Badger, J. M. (2014). The formative nature of perceived person-environment fit (Doctoral dissertation). Available from ProQuest Dissertations & Theses database. (UMI No. 3615149).
Baker, R., Blumberg, S. J., Brick, J. M., Couper, M. P., Courtright, M., Dennis, J. M., et al. (2010). Research synthesis AAPOR report on online panels. Public Opinion Quarterly, 74(4), 711–781.
*Ballinger, G. A., Lehman, D. W., & Schoorman, F. D. (2010). Leader–member exchange and turnover before and after succession events. Organizational Behavior and Human Decision Processes, 113(1), 25–36.
*Baratta, P. (2014). The “Noonday Demon”, weariness, inattention, or all of the above? Refining the definition and measurement of state boredom (Unpublished doctoral dissertation). The University of Guelph.
*Basford, T. E., Offermann, L. R., & Behrend, T. S. (2014). Please accept my sincerest apologies: Examining follower reactions to leader apology. Journal of Business Ethics, 119(1), 99–117.
*Bauer, J. A. (2013). An investigation of OCB demands and workplace behaviors. (Doctoral dissertation). Graduate theses and dissertations. https://scholarcommons.usf.edu/etd/4634
Behrend, T. S., Sharek, D. J., Meade, A. W., & Wiebe, E. N. (2011). The viability of crowdsourcing for survey research. Behavior Research Methods, 43(3), 800–813.
Bergman, M. E., & Jean, V. A. (2016). Where have all the “workers” gone? A critical analysis of the unrepresentativeness of our samples relative to the labor market in the industrial–organizational psychology literature. Industrial and Organizational Psychology, 9(01), 84–113.
Berry, C. M., Ones, D. S., & Sackett, P. R. (2007). Interpersonal deviance, organizational deviance, and their common correlates: A review and meta-analysis. Journal of Applied Psychology, 92(2), 410–424.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Chichester: John Wiley & Sons, Ltd.
Botella, J., Suero, M., & Gambara, H. (2010). Psychometric inferences from a meta-analysis of reliability and internal consistency coefficients. Psychological Methods, 15, 386–397.
*Bowling, N. A., & Burns, G. N. (2010). A comparison of work-specific and general personality measures as predictors of work and non-work criteria. Personality and Individual Differences, 49(2), 95–101.
*Bowling, N. A., Burns, G. N., & Beehr, T. A. (2010). Productive and counterproductive attendance behavior: An examination of early and late arrival to and departure from work. Human Performance, 23(4), 305–322.
*Bowling, N. A., Burns, G. N., Stewart, S. M., & Gruys, M. L. (2011). Conscientiousness and agreeableness as moderators of the relationship between neuroticism and counterproductive work behaviors: A constructive replication. International Journal of Selection and Assessment, 19(3), 320–330.
*Bowling, N. A., & Eschleman, K. J. (2010). Employee personality as a moderator of the relationships between work stressors and counterproductive work behavior. Journal of Occupational Health Psychology, 15(1), 91–103.
*Bowling, N. A., & Michel, J. S. (2011). Why do you treat me badly? The role of attributions regarding the cause of abuse in subordinates’ responses to abusive supervision. Work & Stress, 25(4), 309–320.
Brüggen, E., Wetzels, M., de Ruyter, K., & Schillewaert, N. (2011). Individual differences in motivation to participate in online panels: The effect on response rate and response quality perceptions. International Journal of Market Research, 53(3), 369–390.
Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon’s Mechanical Turk a new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3–5.
*Bunk, J. A. (2006). The role of appraisals, emotions, and coping in understanding experiences of workplace incivility.
*Burton, J. P. (2014). The role of job embeddedness in the relationship between bullying and aggression. European Journal of Work and Organizational Psychology, 24(4), 518–529.
Callegaro, M., Villar, A., Yeager, D., & Krosnick, J. A. (2014). A critical review of studies investigating the quality of data obtained with online panels based on probability and nonprobability samples. Online panel research: A data quality perspective, pp. 23–53.
*Carlsen, J. J. (2015). An investigation of work engagement as a moderator of the relationship between personality and work outcomes (Doctoral dissertation, San Diego State University).
*Carsten, M. K., & Uhl-Bien, M. (2012). Follower beliefs in the co-production of leadership. Zeitschrift für Psychologie, 220(4), 210–220.
*Castille, C. M. (2015). Bright or dark, or virtues and vices? A reexamination of the big five and job performance. Louisiana Tech University.
Chiaburu, D. S., Oh, I. S., Berry, C. M., Li, N., & Gardner, R. G. (2011). The five-factor model of personality traits and organizational citizenship behaviors: A meta-analysis. Journal of Applied Psychology, 96(6), 1140.
Choi, D., Oh, I. S., & Colbert, A. (2015). Understanding organizational commitment: A meta-analytic examination of the roles of the five-factor model of personality and culture. Journal of Applied Psychology, 100(5), 1542–1567.
*Chung-Yan, G. A. (2010). The nonlinear effects of job complexity and autonomy on job satisfaction, turnover, and psychological well-being. Journal of Occupational Health Psychology, 15(3), 237.
Cochran, M. N. (2014). Counterproductive work behaviors, justice, and affect: A meta-analysis (Unpublished doctoral dissertation). University of Central Florida, Orlando, FL.
*Cochrum-Nguyen, F. L. (2013). Predicting job performance and job satisfaction: An examination of the five-factor model of personality, polychronicity and role overload (Doctoral dissertation, San Diego State University).
*Cohen, T. R., Panter, A. T., & Turan, N. (2013). Predicting counterproductive work behavior from guilt proneness. Journal of Business Ethics, 114(1), 45–53.
*Cohen, T. R., Panter, A. T., Turan, N., Morse, L., & Kim, Y. (2013). Agreement and similarity in self-other perceptions of moral character. Journal of Research in Personality, 47(6), 816–830.
*Colbert, A. E., Bono, J. E., & Purvanova, R. (2008). Development of a relationship functions inventory: Assessing the functions of high-quality work relationships. Paper presented at the annual meeting of the Academy of Management, Anaheim, CA.
Colquitt, J. A., Conlon, D. E., Wesson, M. J., Porter, C. O., & Ng, K. Y. (2001). Justice at the millennium: A meta-analytic review of 25 years of organizational justice research. Journal of Applied Psychology, 86(3), 425.
Cook, T. D., & Campbell, D. T. (1976). Four kinds of validity. Handbook of industrial and organizational psychology (pp. 224–246).
*Costa, J. B. (2015). Coping through counterproductive work behaviors: An examination of how employees deal with emotional labor (Master’s thesis, Roosevelt University).
*Credé, M., Harms, P., Niehorster, S., & Gaye-Valentine, A. (2012). An evaluation of the consequences of using short measures of the Big Five personality traits. Journal of Personality and Social Psychology, 102(4), 874.
*Dahling, J. J., & Thompson, M. N. (2013). Detrimental relations of maximization with academic and career attitudes. Journal of Career Assessment, 21(2), 278–294.
Dalal, R. S. (2005). A meta-analysis of the relationship between organizational citizenship behavior and counterproductive work behavior. Journal of Applied Psychology, 90(6), 1241–1255.
*Decker, C., & Van Quaquebeke, N. (2015). Getting respect from a boss you respect: How different types of respect interact to explain subordinates’ job satisfaction as mediated by self-determination. Journal of Business Ethics, 131(3), 543–556.
DeGroot, T., Kiker, D. S., & Cross, T. C. (2000). A meta-analysis to review organizational outcomes related to charismatic leadership. Canadian Journal of Administrative Sciences/Revue Canadienne des Sciences de l’Administration, 17(4), 356–372.
DeSimone, J. A., Harms, P. D., & DeSimone, A. J. (2015). Best practice recommendations for data screening. Journal of Organizational Behavior, 36(2), 171–181.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method (4th ed.). Hoboken, NJ: John Wiley & Sons, Inc.
Duhachek, A., Coughlan, A. T., & Iacobucci, D. (2005). Results on the standard error of the coefficient alpha index of reliability. Marketing Science, 24, 294–301.
*Duniewicz, K. (2015). Don't get mad, get even: how employees abused by their supervisor retaliate against the organization and undermine their spouses. FIU Electronic Theses and Dissertations. Paper 1848. https://digitalcommons.fiu.edu/etd/1848.
*Eschleman, K. J., Bowling, N. A., & Judge, T. A. (2015). The dispositional basis of attitudes: A replication and extension of Hepler and Albarracín (2013). Journal of Personality and Social Psychology, 108(5), e1–e15.
Feitosa, J., Joseph, D. L., & Newman, D. A. (2015). Crowdsourcing and personality measurement equivalence: A warning about countries whose primary language is not English. Personality and Individual Differences, 75, 47–52.
*Ferris, D. L., Johnson, R. E., Rosen, C. C., Djurdjevic, E., Chang, C. H. D., & Tan, J. A. (2013). When is success not satisfying? Integrating regulatory focus and approach/avoidance motivation theories to explain the relation between core self-evaluation and job satisfaction. Journal of Applied Psychology, 98(2), 342–353.
Fisher, R. (1955). Statistical methods and scientific induction. Journal of the Royal Statistical Society. Series B (Methodological), 69–78.
Fisher, G. G., & Sandell, K. (2015). Sampling in industrial–organizational psychology research: Now what? Industrial and Organizational Psychology, 8(02), 232–237.
Fleischer, A., Mead, A. D., & Huang, J. (2015). Inattentive responding in MTurk and other online samples. Industrial and Organizational Psychology, 8(02), 196–202.
*Gabler, C. B., Nagy, K. R., & Hill, R. P. (2014). Causes and consequences of abusive supervision in sales management: A tale of two perspectives. Psychology & Marketing, 31(4), 278–293.
*Gangadharan, A. (2014). Can I smile with spirit? Towards a process model associating workplace spirituality and emotional labor (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3642187).
*Giacopelli, N. M., Simpson, K. M., Dalal, R. S., Randolph, K. L., & Holland, S. J. (2013). Maximizing as a predictor of job satisfaction and performance: A tale of three scales. Judgment and Decision making, 8(4), 448–469.
Gillespie, M. A., Gillespie, J. Z., Brodke, M. H., & Balzer, W. K. (2016). The importance of sample composition depends on the research question. Industrial and Organizational Psychology, 9(01), 207–211.
*Goo, W. (2015). Employee needs and job-related opportunities: From the person-environment fit framework (Doctoral dissertation). Available from ProQuest Dissertations & Theses database. (UMI No. 3681240).
Goodman, J. K., & Paolacci, G. (2017). Crowdsourcing consumer research. Journal of Consumer Research, 44(1), 196–210.
Gosling, S. D., Vazire, S., Srivastava, S., & John, O. P. (2004). Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires. American Psychologist, 59(2), 93.
Greco, L., O’Boyle, E., Cockburn, B., & Yuan, Z. (2015). Raising the .70 bar: A meta-analytic study of coefficient alpha. Working Paper.
Griffeth, R. W., Hom, P. W., & Gaertner, S. (2000). A meta-analysis of antecedents and correlates of employee turnover: Update, moderator tests, and research implications for the next millennium. Journal of Management, 26(3), 463–488.
*Hannah, S. T., Jennings, P. L., Bluhm, D., Peng, A. C., & Schaubroeck, J. M. (2014). Duty orientation: Theoretical development and preliminary construct testing. Organizational Behavior and Human Decision Processes, 123(2), 220–238.
Harms, P. D., & DeSimone, J. A. (2015). Caution! MTurk workers ahead—fines doubled. Industrial and Organizational Psychology, 8(02), 183–190.
*Hausknecht, J. P., Sturman, M. C., & Roberson, Q. M. (2011). Justice as a dynamic construct: Effects of individual trajectories on distal work outcomes. Journal of Applied Psychology, 96(4), 872–880.
Hershcovis, M. S., Turner, N., Barling, J., Arnold, K. A., Dupré, K. E., Inness, M., LeBlanc, M. M., & Sivanathan, N. (2007). Predicting workplace aggression: A meta-analysis. Journal of Applied Psychology, 92(1), 228–238.
Highhouse, S., & Gillespie, J. Z. (2009). Do samples really matter that much. Statistical and methodological myths and urban legends: Doctrine, verity and fable in the organizational and social sciences (pp. 247–265).
Hillygus, D. S., Jackson, N., & Young, M. (2014). Professional respondents in non-probability online panels. Online panel research: A data quality perspective, pp. 219–237.
*Holtz, B. C., & Harold, C. M. (2013a). Effects of leadership consideration and structure on employee perceptions of justice and counterproductive work behavior. Journal of Organizational Behavior, 34(4), 492–519.
*Holtz, B. C., & Harold, C. M. (2013b). Interpersonal justice and deviance the moderating effects of interpersonal justice values and justice orientation. Journal of Management, 39(2), 339–365.
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27(1), 99–114.
Huang, J. L., Liu, M., & Bowling, N. A. (2015). Insufficient effort responding: Examining an insidious confound in survey data. Journal of Applied Psychology, 100(3), 828–845.
Jackson, T. A., Meyer, J. P., & Wang, X. H. F. (2013). Leadership, commitment, and culture: A meta-analysis. Journal of Leadership & Organizational Studies, 20(1), 84–106.
*Jenkins, J. S., Heneghan, C. J., Bailey, S. F., & Barber, L. K. (2014). The work–family interface as a mediator between job demands and employee behaviour. Stress and Health.
*Jeon, G. (2011). Equity sensitivity versus equity preference: Validating a new viewpoint on equity sensitivity (Unpublished doctoral dissertation). University of Illinois at Urbana-Champaign.
*Johnson, V. A., Beehr, T. A., & O’Brien, K. E. (2015). Determining the relationship between employee psychopathy and strain: Does the type of psychopathy matter? International Journal of Stress Management, 22(2), 111–136.
*Johnston-Fisher, J. (2014). Testing a multi-level mediation model of workgroup incivility: The role of civility climate and group norms for civility (Unpublished master’s thesis). Western Kentucky University.
*Joseph, D. L. (2011). Emotional intelligence, leader-member exchange, and behavioral engagement: Considering mediation and reciprocity effects (Doctoral dissertation, University of Illinois at Urbana-Champaign).
Judge, T. A., Heller, D., & Mount, M. K. (2002). Five-factor model of personality and job satisfaction: A meta-analysis. Journal of Applied Psychology, 87(3), 530–541.
*Kiffin-Petersen, S. A., Jordan, C. L., & Soutar, G. N. (2011). The Big Five, emotional exhaustion and citizenship behaviors in service settings: The mediating role of emotional labor. Personality and Individual Differences, 50(1), 43–48.
*Krischer, M. M., Penney, L. M., & Hunter, E. M. (2010). Can counterproductive work behaviors be productive? CWB as emotion-focused coping. Journal of Occupational Health Psychology, 15(2), 154.
*Lambert, L. S., Tepper, B. J., Carr, J. C., Holt, D. T., & Barelka, A. J. (2012). Forgotten but not gone: An examination of fit between leader consideration and initiating structure needed and received. Journal of Applied Psychology, 97(5), 913.
Landers, R. N., & Behrend, T. S. (2015). An inconvenient truth: Arbitrary distinctions between organizational, Mechanical Turk, and other convenience samples. Industrial and Organizational Psychology, 1–23.
*Lee, J. (2012). The effects of leadership behavior on workplace harassment, employee outcomes, and organizational effectiveness in small businesses (Doctoral dissertation). Available from ABI/INFORM Global; ProQuest Dissertations & Theses database (UMI No. 3489453).
*Long, C. P., Bendersky, C., & Morrill, C. (2011). Fairness monitoring: Linking managerial controls and fairness judgments in organizations. Academy of Management Journal, 54(5), 1045–1068.
*Long, E. C., & Christian, M. S. (2015). Mindfulness buffers retaliatory responses to injustice: A regulatory approach. Journal of Applied Psychology, 100(5), 1409–1422.
*Lusin, J. M. (2014). Employee perceptions of authentic leadership and outcomes of planned organizational change (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3615782).
Mackey, J. D., Frieder, R. E., Brees, J. R., & Martinko, M. J. (2015). Abusive supervision: A meta-analysis and empirical review. Journal of Management, 0149206315573997.
Mason, W., & Suri, S. (2012). Conducting behavioral research on Amazon’s Mechanical Turk. Behavior Research Methods, 44(1), 1–23.
*Mayer, D. M., Thau, S., Workman, K. M., Van Dijke, M., & De Cremer, D. (2012). Leader mistreatment, employee hostility, and deviant behaviors: Integrating self-uncertainty and thwarted needs perspectives on deviance. Organizational Behavior and Human Decision Processes, 117(1), 24–40.
McGonagle, A. K. (2015). Participant motivation: A critical consideration. Industrial and Organizational Psychology, 8(02), 208–214.
Meade, A. W. (2010). A taxonomy of effect size measures for the differential functioning of items and scales. Journal of Applied Psychology, 95(4), 728.
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17(3), 437.
*Meyer, R. D., Dalal, R. S., José, I. J., Hermida, R., Chen, T. R., Vega, R. P., et al. (2014). Measuring job-related situational strength and assessing its interactive effects with personality on voluntary work behavior. Journal of Management, 40(4), 1010–1041.
*Michel, J. S., & Clark, M. A. (2009). Has it been affect all along? A test of work-to-family and family-to-work models of conflict, enrichment, and satisfaction. Personality and Individual Differences, 47(3), 163–168.
*Michel, J. S., Newness, K., & Duniewicz, K. (2016). How abusive supervision affects workplace deviance: A moderated-mediation examination of aggressiveness and work-related negative affect. Journal of Business and Psychology, 31(1), 1–22.
*Mullins, A. K. (2015). The dimensionality of destructive leadership: Toward an integration of the bright and dark sides. North Carolina State University.
*Murphy, S. L. (2015). Individual adaptability as a predictor of job performance. Louisiana Tech University.
*Nichols, A. L., & Cottrell, C. A. (2014). What do people desire in their leaders? The role of leadership level on trait desirability. The Leadership Quarterly, 25(4), 711–729.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
*O’Brien, K. E. (2008). A stressor-strain model of organizational citizenship behavior and counterproductive work behavior (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3347361).
*O’Boyle, E. H. (2010). A test of the general CWB-OCB emotion model (Doctoral dissertation). Available from ABI/INFORM Global; ProQuest Dissertations & Theses database (UMI No. 3411997).
Paolacci, G., & Chandler, J. (2014). Inside the Turk: Understanding Mechanical Turk as a participant pool. Current Directions in Psychological Science, 23(3), 184–188.
Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision making, 5(5), 411–419.
*Penney, L. M., Hunter, E. M., & Perry, S. J. (2011). Personality and counterproductive work behaviour: Using conservation of resources theory to narrow the profile of deviant employees. Journal of Occupational and Organizational Psychology, 84(1), 58–77.
*Petersen, N. L. (2015). Retaliatory behavior as a response to executive compensation. Bowling Green State University.
*Porter, C., Woo, S. E., & Tak, J. (2015). Developing and validating short form protean and boundaryless career attitudes scales. Journal of Career Assessment, 1069072714565775.
Postoaca, A. (2006). Launching the bottle: The rhetoric of the online researcher. In The anonymous elect: Market research through online access panels (pp. 67–107).
*Powell, N. C. (2013). Responding to abusive supervision: Opposing arguments for the role of social class in predicting workplace deviance (Master’s thesis). Available from https://uwaterloo.ca.
*Ramirez, S. A. (2015). Impulsive and premeditated counterproductive work behaviors and the moderating effects of self-monitoring and core self-evaluation. North Carolina State University.
Ran, S., Liu, M., Marchiondo, L. A., & Huang, J. L. (2015). Difference in response effort across sample types: Perception or reality? Industrial and Organizational Psychology, 8(02), 202–208.
*Richards, D. A., & Schat, A. C. (2011). Attachment at (not to) work: Applying attachment theory to explain individual behavior in organizations. Journal of Applied Psychology, 96(1), 169–182.
*Rosen, C. C., Slater, D. J., & Johnson, R. E. (2013). Let’s make a deal: Development and validation of the ex post i-deals scale. Journal of Management, 39(3), 709–742.
Sackett, P. R., & Larson Jr, J. R. (1990). Research strategies and tactics in industrial and organizational psychology.
*Salvaggio, T. (2014). Towards a more complete understanding of the effects of the virtual environment on the relationship between leadership style and outcomes of LMX relationships (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3690538).
Schmidt, F. L., & Hunter, J. E. (2014). Methods of meta-analysis: Correcting error and bias in research findings (3rd ed.). Thousand Oaks, CA: Sage.
Schmidt, F. L., Shaffer, J. A., & Oh, I.-S. (2008). Increased accuracy for range restriction corrections: Implications for the role of personality and general mental ability in job and training performance. Personnel Psychology, 61, 827–868.
*Schultz, L. A. (2009). Exploring the relationship between the positive and negative sides of the work-family interface: The role of enrichment in buffering the effects of time-, strain-, and behavior-based conflict (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3378859).
*Scott, K. A., & Zweig, D. (2008, May). Dispositional predictors of organizational cynicism. In ASAC (Vol. 29, No. 5).
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.
*Shao, P. (2010). Ethics-based leadership and employee ethical behavior: Examining the mediating role of ethical regulatory focus (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3430591).
*Shao, P., Resick, C. J., & Hargis, M. B. (2011). Helping and harming others in the workplace: The roles of personal values and abusive supervision. Human Relations, 64(8), 1051–1078.
*Sharif, M. M., & Scandura, T. A. (2014). Do perceptions of ethical conduct matter during organizational change? Ethical leadership and employee involvement. Journal of Business Ethics, 124(2), 185–196.
Sharpe Wessling, K., Huber, J., & Netzer, O. (2017). MTurk character misrepresentation: Assessment and solutions. Journal of Consumer Research, 44(1), 211–230.
Smith, R., & Hofma Brown, H. (2006). Panel and data quality: Comparing metrics and assessing claims. In Proceedings of the ESOMAR Panel Research Conference. Barcelona: ESOMAR.
Smith, N. A., Sabat, I. E., Martinez, L. R., Weaver, K., & Xu, S. (2015). A convenient solution: Using MTurk to sample from hard-to-reach populations. Industrial and Organizational Psychology, 8(02), 220–228.
Sparrow, N. (2007). Quality issues in online research. Journal of Advertising Research, 47(2), 179.
Sprouse, J. (2011). A validation of Amazon Mechanical Turk for the collection of acceptability judgments in linguistic theory. Behavior Research Methods, 43(1), 155–167.
*Sprung, J. M., & Jex, S. M. (2012). Work locus of control as a moderator of the relationship between work stressors and counterproductive work behavior. International Journal of Stress Management, 19(4), 272–291.
*Tepper, B. J., Carr, J. C., Breaux, D. M., Geider, S., Hu, C., & Hua, W. (2009). Abusive supervision, intentions to quit, and employees’ workplace deviance: A power/dependence analysis. Organizational Behavior and Human Decision Processes, 109(2), 156–167.
*Tepper, B. J., Mitchell, M. S., Haggard, D. L., Kwan, H. K., & Park, H. M. (2015). On the exchange of hostility with supervisors: An examination of self-enhancing and self-defeating perspectives. Personnel Psychology, 68(4), 723–758.
*Thau, S., Bennett, R. J., Mitchell, M. S., & Marrs, M. B. (2009). How management style moderates the relationship between abusive supervision and workplace deviance: An uncertainty management theory perspective. Organizational Behavior and Human Decision Processes, 108(1), 79–92.
*Thau, S., & Mitchell, M. S. (2010). Self-gain or self-regulation impairment? Tests of competing explanations of the supervisor abuse and employee deviance relationship through perceptions of distributive justice. Journal of Applied Psychology, 95(6), 1009.
*Thompson, C. N. (2008). Personal characteristics and the impact of transformational leadership behaviors on follower outcomes (Doctoral dissertation). Available from ABI/INFORM Global; ProQuest Dissertations & Theses database (UMI No. 3375475).
Thoresen, C. J., Kaplan, S. A., Barsky, A. P., Warren, C. R., & de Chermont, K. (2003). The affective underpinnings of job perceptions and attitudes: A meta-analytic review and integration. Psychological Bulletin, 129(6), 914–945.
*Toaddy, S. (2012). Validation of a measure of external organizational justice. North Carolina State University.
Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58, 6–20.
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4–70.
*van Prooijen, J. W., & de Vries, R. E. (2016). Organizational conspiracy beliefs: Implications for leadership styles and employee outcomes. Journal of Business and Psychology, 31(4), 479–491.
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48.
*Vogel, R. M., & Mitchell, M. S. (2015). The motivational effects of diminished self-esteem for employees who experience abusive supervision. Journal of Management.
*Wall, A. (2014). Common method variance: An experimental manipulation (Doctoral dissertation). Available from ProQuest Dissertations & Theses database (UMI No. 3662469).
*Wilson, L. M. (2015). An examination of the moderating effects of work centrality on the relationships between person-group fit and work-related outcomes (Doctoral dissertation, San Diego State University).
*Wiltshire, J., Bourdage, J. S., & Lee, K. (2014). Honesty-humility and perceptions of organizational politics in predicting workplace outcomes. Journal of Business and Psychology, 29(2), 235–251.
Wood, J. A. (2008). Methodology for dealing with duplicate study effects in a meta-analysis. Organizational Research Methods, 11(1), 79–95.
*Wynne, K. T. (2012). Profiling leaders: Using a profiling approach to examine the effects of multifactor leadership on follower deviance (Master’s thesis). Available from ProQuest Dissertations & Theses database (UMI No. 1518818).
Zimmerman, R. D. (2008). Understanding the impact of personality traits on individuals' turnover decisions: A meta-analytic path model. Personnel Psychology, 61(2), 309–348.
Appendices
Appendix 1
Appendix 2
Appendix 3
Studies considered but excluded from the current meta-analyses (k = 23)
Excluded due to mixed samples (i.e., combined conventional and OPD samples) (k = 9):

- Dennis, R., & Winston, B. E. (2003). A factor analysis of Page and Wong’s servant leadership instrument. Leadership & Organization Development Journal, 24(8), 455–459.
- Irak, D. U. (2010). The role of affectivity in an expanded model of person-environment fit (NR70552 Ph.D.), Carleton University (Canada), Ann Arbor. Retrieved from http://search.proquest.com/docview/851889665.
- McAllister, C. P., Harris, J. N., Hochwarter, W. A., Perrewé, P. L., & Ferris, G. R. Got resources? A multi-sample constructive replication of perceived resource availability’s role in work passion–job outcomes relationships. Journal of Business and Psychology, 1–18.
- Raver, J. L., & Nishii, L. H. (2010). Once, twice, or three times as harmful? Ethnic harassment, gender harassment, and generalized workplace harassment. Journal of Applied Psychology, 95(2), 236.
- Sandell, K. (2007). Transformational leadership, engagement, and performance: A new perspective (Doctoral dissertation, Colorado State University).
- Smith, C. L. (2007). The relational context of employee engagement: An intrinsic perspective (Doctoral dissertation, Colorado State University).
- Staples, D. S., & Webster, J. (2007). Exploring traditional and virtual team members’ “best practices”: A social cognitive theory perspective. Small Group Research, 38(1), 60–97.
- Thoroughgood, C. N., Tate, B. W., Sawyer, K. B., & Jacobs, R. (2012). Bad to the bone: Empirically defining and measuring destructive leader behavior. Journal of Leadership & Organizational Studies, 19(2), 230–255.
- Tolentino, A. L. (2009). Are all good soldiers created equal? Examining the “why” that underlies organizational citizenship behavior: The development of an OCB motives scale (Doctoral dissertation, University of South Florida).
Excluded due to using an online panel company’s survey webhosting but not its panel data (e.g., Survey Monkey) (k = 10):

- Anderson, L. E. (2015). Relationship between leadership, organizational commitment, and intent to stay among junior executives (Doctoral dissertation, Walden University).
- Ayers, J. P. (2010). Job satisfaction, job involvement, and perceived organizational support as predictors of organizational commitment (Doctoral dissertation, Walden University).
- Barbuto Jr., J. E., & Millard, M. L. Developing wisdom and reducing emotional labor in the workplace: Testing the impact of servant leadership.
- De Lacy, J. C. (2009). Employee engagement: The development of a three-dimensional model of engagement, and an exploration of its relationship with affective leader behaviours.
- Emu, K. E., & Umeh, O. J. (2014). How leadership practices impact job satisfaction of customer relationship officers: An empirical study. Journal of Management, 2(3), 19–56.
- Mutsvunguma, P. S. (2012). Ethical climate fit, leader-member exchange and employee job outcomes (Doctoral dissertation).
- Rader, M. M. (2015). Effects of authentic leadership on job satisfaction and younger worker turnover intentions (Doctoral dissertation, The Chicago School of Professional Psychology).
- Spector, P. E., & Che, X. X. (2014). Re-examining citizenship: How the control of measurement artifacts affects observed relationships of organizational citizenship behavior and organizational variables. Human Performance, 27(2), 165–182.
- Yates, L. (2011). Exploring the relationship of ethical leadership with job satisfaction, organizational commitment, and organizational citizenship behavior.
- Yukl, G., O’Donnell, M., & Taber, T. (2009). Influence of leader behaviors on the leader-member exchange relationship. Journal of Managerial Psychology, 24(4), 289–299.
Excluded due to a niche or otherwise unique online panel (total k = 3):

Online panel of Dutch public sector employees (k = 1):

- Ashikali, T., & Groeneveld, S. (2015). Diversity management in public organizations and its effect on employees’ affective commitment: The role of transformational leadership and the inclusiveness of the organizational culture. Review of Public Personnel Administration, 35(2), 146–168.

Craigslist in the Southeastern USA (k = 1):

- Colquitt, J. A., Long, D. M., Rodell, J. B., & Halvorsen-Ganepola, M. D. (2015). Adding the “in” to justice: A qualitative and quantitative investigation of the differential effects of justice rule adherence and violation. Journal of Applied Psychology, 100(2), 278.

Social workers belonging to a social work online community magazine (k = 1):

- Sullivan, E. M. (2012). A correlational study of perceived transformational leadership styles and job satisfaction among social workers (Doctoral dissertation, University of Phoenix).

Excluded due to not reporting an effect size for the relationship of interest (k = 1):

- Swee, H. Y. (2009). A cognitive perspective of self-other agreement: A look at outcomes and predictors of shared implicit performance theories (Doctoral dissertation, University of Akron).
Cite this article
Walter, S. L., Seibert, S. E., Goering, D., et al. A tale of two sample sources: Do results from online panel data and conventional data converge? Journal of Business and Psychology, 34, 425–452 (2019). https://doi.org/10.1007/s10869-018-9552-y