Introduction

Personality measures have weathered their share of criticism from researchers (e.g., Guion and Gottier 1965; Mischel 1968) for being weak predictors of job performance to become touted as useful complements to traditional measures of ability, skill, and job-relevant knowledge (e.g., Barrick and Mount 1991; Hurtz and Donovan 2000; Tett et al. 1991). Personality measures predict unique portions of the criterion domain (e.g., contextual performance and workplace deviance) that are typically less related to traditional predictors such as cognitive ability (Berry et al. 2007; Borman and Motowidlo 1997). In addition, because of their self-report nature, personality assessments are generally easy to administer and less costly than selection tools that require more guided administration.

While the aforementioned factors contribute to the appeal of personality assessments, the use of self-report measures also introduces the potential for applicants to provide inflated or false descriptions of their personality traits. This phenomenon, which is the primary focus of the present study, has typically been referred to as faking. Although the current state of the literature would suggest that applicants are capable of elevating or faking their responses (Stark et al. 2001; Viswesvaran and Ones 1999), and that a substantial portion of applicants do so (Arthur et al. 2009; Donovan et al. 2003; Griffith et al. 2007), numerous questions remain unanswered regarding the impact of faking on personnel selection outcomes.

These unanswered questions may be partially due to the methods used to examine the phenomenon. Most investigations into the effects of faking focus solely on personality assessments, while organizations typically do not and should not make hiring decisions on the basis of these measures alone (Hogan et al. 1996). Although using multiple predictors is undoubtedly good selection practice, this assertion has also been made by researchers under the assumption that combining personality measures with other predictors will curtail the possible negative effects associated with faking behavior (Hough et al. 1990; Mueller-Hanson et al. 2003; Peterson and Griffith 2006). For example, Hough et al. (1990) suggest that applicants who provide invalid personality profiles should be evaluated on additional predictors. In addition, Mueller-Hanson et al. (2003) note that the effects of faking may be reduced if personality measures are used as screening tools prior to assessing applicants on other characteristics.

Although these assumptions make sense conceptually, little research has actually examined the role of faking in selection scenarios involving more than one predictor. While some reduction in the effects of faking may seem a given when additional predictors are used to make hiring decisions, this situation bears comparison to recent work on adverse impact. Until thorough investigations were conducted into the degree to which adding noncognitive predictors to a selection battery could reduce adverse impact (e.g., Potosky et al. 2005; Ryan et al. 1998), researchers simply assumed that the strategy would be effective. As those studies note, however, the degree of reduction can vary greatly across situations and predictor combinations (Sackett and Ellingson 1997). In our view, the extent to which concerns over faking are diminished by adding fake-resistant predictors (e.g., cognitive ability) to a selection battery requires similarly detailed consideration.

The goal of the present study, therefore, was to examine the impact of faking on hiring decisions when a personality measure (specifically, a measure of conscientiousness) was used alone versus in combination with a measure of general cognitive ability. Although adding any uncorrelated predictor to a battery containing a personality assessment could be expected to curtail the negative effects of faking, we chose to focus on cognitive ability because it has received a great deal of attention as a useful predictor across jobs and is a common component of personnel selection systems (Schmidt and Hunter 1998).

The degree to which the inclusion of a cognitive ability measure in a selection battery can mitigate any negative effects of faking is dependent upon two basic assumptions. First, cognitive ability scores should bear little relationship to conscientiousness scores. Meta-analytic findings by Bobko et al. (1999) and Cortina et al. (2000) would indicate that this is likely to be the case. Second, cognitive ability should also be unrelated to applicant faking behavior. While some research has indicated that cognitive ability may be related to the ability to fake (e.g., Alliger et al. 1996; Mersman and Schultz 1998; Wrensen and Biderman 2005) and faking on forced-choice measures (e.g., Vasilopoulos et al. 2006), recent findings suggest that cognitive ability is unlikely to influence faking propensity (i.e., the extent to which actual applicants engage in faking) on traditional Likert-format measures (Griffith et al. 2006; Vasilopoulos et al. 2006). Based on the findings outlined above, we proceeded under the assumption that cognitive ability would be unrelated to faking behavior and conscientiousness scores.

The Role of Faking in Hiring Decisions Involving Personality and Cognitive Ability

Although the inclusion of a cognitive ability measure in a selection battery may mitigate the effects of faking on hiring decisions, the boundaries and conditions in which this is likely to occur have not been established. To our knowledge, only two studies to date have tested the assumption that adding a cognitive ability measure to a personality measure could curtail the negative effects of faking. Using a Monte Carlo simulation, Komar et al. (2005) found that increases in the magnitude, proportion, and variability of faking all led to attenuation of criterion-related validity coefficients. However, the decrement to validity was typically less pronounced for the composite predictor. In another simulation study, Converse et al. (2009) examined the effects of combining conscientiousness with two other predictors across a variety of selection scenarios and outcome variables (criterion-related validity, mean performance of the hired sample, and selection decision consistency). The authors noted that while multiple-predictor selection tended to reduce the negative effects of faking (e.g., lower criterion-related validity), the extent of this reduction was widely variable across the range of conditions examined.

We believe the current study extends these simulation studies in two ways. First, Komar et al. (2005) examined a single combination of conscientiousness and cognitive ability (i.e., a unit-weighted composite), whereas the current study examines several potential combination methods. Second, as Ryan et al. (1998) noted, simulation studies may lead to different conclusions than studies based on more traditional research samples. Therefore, while the focus of the present study overlaps somewhat with the Converse et al. (2009) simulation, our primary goal was to extend this examination of conscientiousness–cognitive ability combinations to an experimentally manipulated applicant setting.

Outcomes of Interest: Percentage of Fakers Hired and Hiring Discrepancies

We investigated faking from two perspectives in the present study. First, following the methodology of Griffith et al. (2007), we operationalized faking as within-subject score change from the applicant to the honest response condition in order to examine the percentage of hired individuals who were “fakers.” This operationalization categorizes as “fakers” those individuals whose scores in an applicant context exceed the upper bound of a confidence interval established around their honest-context score. Second, we examined cases in which individuals faked enough to be hired when they would not have been hired based on their honest scores (which we termed a hiring discrepancy). Substantial changes in hiring decisions may have a negative impact on both the effectiveness and fairness of a selection procedure.

We were interested in examining both the percentage of fakers hired and hiring discrepancies for several reasons. First, previous research has indicated a link between faking and integrity (Griffith et al. 2006; McFarland and Ryan 2000), which may suggest that hiring “fakers” is undesirable in and of itself. Each of these studies linked faking behavior to low integrity, which is generally related to counterproductive work behaviors (Berry et al. 2007). Second, we believe there is illustrative value in considering both outcome variables in conjunction. Specifically, the interpretation of potential reductions in hiring discrepancies is more meaningful when considered against the portion of fakers in the hired sample (e.g., a 50% reduction in hiring discrepancies when a high percentage of those hired were fakers is more meaningful than a similar reduction when a small percentage of the hired sample were fakers). Finally, aside from the fact that increases in the representation of fakers in a sample may threaten validity (e.g., Komar et al. 2008), some researchers have noted potential ethical concerns regarding the displacement of honest applicants by individuals who fake their responses. Both Hough (1998) and Morgeson et al. (2007) have noted that those who use personality assessments to make hiring decisions should be concerned with the fairness of the testing process when some individuals fake their responses, making the examination of hiring discrepancies important. Therefore, regardless of the influence of faking on criterion-related validity, the potential for faking to alter hiring decisions by putting honest respondents at a disadvantage should not be ignored.

Combining Predictor Variables

When multiple-predictor batteries are used for personnel selection, the scores from each predictor can be combined in several ways. Generally, predictor combination methods can be divided into those where scores on one predictor are allowed to compensate for scores on another predictor (compensatory models), or those where a minimal level of proficiency or standing on a trait is required for each predictor (noncompensatory models; Cascio 1991; Gatewood and Field 2004).

Compensatory Models

Unit Weighting. A scenario in which all predictors in a selection model are given equal weight is referred to as unit weighting (Cascio 1991). This strategy is typically enacted by summing an individual’s standardized scores (z-scores) across the predictors involved in the selection process. In the current study, the unit weighting strategy was expected to function in much the same way as the multiple-predictor composite variables in the studies by Komar et al. (2005) and Converse et al. (2009). A unit-weighted composite of cognitive ability and conscientiousness should result in the hiring of fewer individuals likely to have faked than making decisions based solely on the conscientiousness measure. Under such conditions, individuals that do not fake substantially, and therefore may not be among the top scorers on the conscientiousness measure, would still be capable of being selected if they were to obtain high cognitive ability scores. Based on this rationale, we tested the following hypotheses:

Hypothesis 1a: A unit-weighted combination of cognitive ability and conscientiousness scores will result in the hiring of a significantly smaller percentage of fakers than the use of a conscientiousness measure alone.

Hypothesis 1b: A unit-weighted combination of cognitive ability and conscientiousness scores will result in a significantly smaller percentage of hiring discrepancies than the use of a conscientiousness measure alone.
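As an illustration of the unit-weighting strategy, the procedure amounts to summing standardized scores and then selecting top-down at a given selection ratio. The following sketch (the function names, score lists, and `select_top` helper are ours for illustration, not taken from the study materials) shows the logic:

```python
import statistics

def unit_weighted_composite(consc_scores, ability_scores):
    # Unit weighting: standardize each predictor, then sum the z-scores.
    def zscores(xs):
        m, s = statistics.mean(xs), statistics.stdev(xs)
        return [(x - m) / s for x in xs]
    return [c + a for c, a in
            zip(zscores(consc_scores), zscores(ability_scores))]

def select_top(composite, selection_ratio):
    # Top-down selection: return the indices of the hired applicants.
    n_hired = max(1, round(len(composite) * selection_ratio))
    ranked = sorted(range(len(composite)),
                    key=lambda i: composite[i], reverse=True)
    return set(ranked[:n_hired])
```

Because both predictors are standardized before summing, a high cognitive ability score can offset a middling conscientiousness score, which is the compensatory property the hypotheses rely on.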

Multiple Regression. The multiple-regression approach to the combination of predictor variables involves differentially weighting variables based on regression weights obtained from a criterion-related validation study. Like unit weighting, this method also allows for scores on one predictor to compensate for scores on other predictors (Cascio 1991; Gatewood and Field 2004; Potosky et al. 2005). However, unlike unit weighting, scores on one predictor variable may compensate more or less for scores on other variables depending on the weights assigned to each predictor.

In the present study, estimated regression weights based on Cortina et al.’s (2000) meta-analysis were used to weight the two predictor variables. Specifically, regression weights of β = .432 for cognitive ability and β = .235 for conscientiousness (Cortina et al. 2000, p. 340) served as the predictor weightings in the current study. The implications of this model for the hiring of fakers should therefore be similar to those associated with the unit-weighting strategy. In this case, the addition of a cognitive ability variable was expected to be more effective due to the increased weight associated with scores on the ability measure.

Hypothesis 2a: A regression-weighted combination of cognitive ability and conscientiousness scores will result in the hiring of a significantly smaller percentage of fakers than the use of a conscientiousness measure alone.

Hypothesis 2b: A regression-weighted combination of cognitive ability and conscientiousness scores will result in a significantly smaller percentage of hiring discrepancies than the use of a conscientiousness measure alone.
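The regression-weighted composite differs from the unit-weighted one only in the multipliers applied to each standardized predictor. A minimal sketch, using the meta-analytic weights reported by Cortina et al. (2000) and hypothetical score lists of our own:

```python
import statistics

# Regression weights from Cortina et al. (2000, p. 340).
BETA_ABILITY, BETA_CONSC = 0.432, 0.235

def regression_weighted_composite(consc_scores, ability_scores):
    # Weight each standardized predictor by its regression weight.
    def zscores(xs):
        m, s = statistics.mean(xs), statistics.stdev(xs)
        return [(x - m) / s for x in xs]
    return [BETA_CONSC * c + BETA_ABILITY * a
            for c, a in zip(zscores(consc_scores), zscores(ability_scores))]
```

Since cognitive ability carries nearly twice the weight of conscientiousness here, an inflated conscientiousness score moves the composite less than under unit weighting.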

Noncompensatory Models: Multiple-Hurdle

A multiple-hurdle selection strategy is noncompensatory in that an applicant must meet a passing score on one predictor variable in order to continue in the selection process (Cascio 1991; Gatewood and Field 2004). For the purposes of the current study, two separate multiple-hurdle models were tested. First, following the suggestions of Mueller-Hanson et al. (2003), we tested a model in which applicants were screened out based on their conscientiousness scores. Applicants not meeting a minimum cut-score on the conscientiousness scale were omitted from further consideration. Those who met the minimum requirement were then selected, in top-down fashion, based on their cognitive ability scores.

In the current study, we expected this multiple-hurdle approach to result in the hiring of a similar percentage of fakers as the conscientiousness-alone approach, for the following reasons. The conscientiousness-alone approach involves rank-ordering individuals on this variable and selecting the top scorers (e.g., top 10%). Assuming fakers rise to the top of the distribution, this likely represents a worst-case scenario for hiring fakers because no variable other than conscientiousness scores (which are subject to faking) influences selection decisions. The multiple-hurdle approach, in contrast, involves first rank-ordering individuals on conscientiousness, retaining a relatively larger percentage of top scorers (e.g., top 50%), and then selecting the top cognitive ability scorers from that group. Although the retained group from which cognitive ability selection is made may contain a larger percentage of fakers than the overall sample, this should be no worse than conscientiousness-alone selection. In fact, because final decisions are based on cognitive ability (which should bear little relationship to faking or conscientiousness), some individuals who faked their way to the top of the conscientiousness distribution may not be hired due to low cognitive ability scores, potentially yielding slightly fewer fakers hired than under the conscientiousness-alone approach. This difference is likely to be small, however, so both strategies should result in the hiring of a similar percentage of fakers. Therefore, the following hypotheses were tested:

Hypothesis 3a: When conscientiousness scores are used as the first step in a multiple-hurdle selection model, there will be no significant difference between the percentage of fakers hired by the combination of conscientiousness and cognitive ability and the percentage of fakers hired by the conscientiousness measure alone.

Hypothesis 3b: When conscientiousness scores are used as the first step in a multiple-hurdle selection model, there will be no significant difference between the percentage of hiring discrepancies occurring for the combination of conscientiousness and cognitive ability and the percentage occurring for the conscientiousness measure alone.

The second multiple-hurdle model involved a selection scenario in which applicants were screened out based on their cognitive ability scores and then selected (in top-down fashion) based on their conscientiousness scores. Given the findings of Haaland and Christiansen (1998), which suggest that cognitive ability levels remain relatively constant across a range of personality scores, the proportion of fakers screened out in the first step should equal the proportion that continues in the selection process. In this case, top-down selection based on conscientiousness scores in the second step will still be subject to fakers rising to the top of the distribution. Therefore, the following hypotheses were tested:

Hypothesis 4a: When cognitive ability scores are used as the first step in a multiple-hurdle selection model, there will be no significant difference between the percentage of fakers hired by the combination of conscientiousness and cognitive ability and the percentage of fakers hired by the conscientiousness measure alone.

Hypothesis 4b: When cognitive ability scores are used as the first step in a multiple-hurdle selection model, there will be no significant difference between the percentage of hiring discrepancies occurring for the combination of conscientiousness and cognitive ability and the percentage occurring for the conscientiousness measure alone.
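Both hurdle models above share the same two-stage logic and differ only in which predictor screens and which predictor ranks. A minimal sketch (the function and the example scores are our own illustration, not the study's materials):

```python
def multiple_hurdle(screen_scores, rank_scores, cut_score, selection_ratio):
    # Stage 1 (noncompensatory): drop applicants below the cut-score
    # on the screening predictor.
    n = len(screen_scores)
    survivors = [i for i in range(n) if screen_scores[i] >= cut_score]
    # Stage 2: top-down selection on the second predictor among survivors.
    n_hired = max(1, round(n * selection_ratio))
    ranked = sorted(survivors, key=lambda i: rank_scores[i], reverse=True)
    return set(ranked[:n_hired])
```

Passing conscientiousness as `screen_scores` and cognitive ability as `rank_scores` gives the first model (Hypotheses 3a,b); reversing the arguments gives the second (Hypotheses 4a,b).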

While it is not typical practice to hypothesize the lack of a significant difference (i.e., the null hypothesis) between two conditions (as was the case for Hypotheses 3a,b and 4a,b), three considerations led us to these hypotheses. First, these noncompensatory models are common in practice, and examining them thus provides a useful investigation into a variety of practical selection scenarios. Second, our conceptual analysis of these scenarios led to the conclusion that these methods will function substantially differently from the compensatory models and therefore offer only a trivial change in the two outcome variables of interest over the conscientiousness measure alone. Third, although it is generally difficult to draw clear conclusions from a nonsignificant result (particularly when samples are not large), the actual percentages obtained in these scenarios are nonetheless interpretable and provide useful information, particularly when compared against the percentages from the compensatory approaches.

Our decision to use the “conscientiousness measure alone” selection scenario as a point of comparison was based on two factors. First, nearly all investigations into criterion-related validity and hiring decisions that have appeared in the faking literature have focused on only a single assessment. Second, this point of comparison presents a “worst-case” scenario against which the effects of using multiple predictors can be evaluated.

Finally, the findings regarding rank-order changes discussed in the preceding paragraphs suggest that applicant faking has the potential to alter hiring decisions considerably. This effect appears to be more pronounced at the upper end of the applicant distribution due to the likelihood of individuals who faked their responses rising to the top (Haaland and Christiansen 1998; Zickar et al. 1996). The findings of Christiansen et al. (1994), Griffith et al. (2007), and Rosse et al. (1998) also speak to this assertion in that the effects of faking on rank-order changes are more pronounced at smaller selection ratios. Given these findings, we expected the percentage of fakers hired in each selection scenario to increase as the selection ratio decreased. Although this did not serve as the basis of any further formal hypotheses, each of the selection scenarios was tested at three selection ratios (.10, .20, and .30). Previous literature investigating both faking (e.g., Griffith et al. 2007) and adverse impact (Sackett and Roth 1996) has used similar selection ratios in hiring decision comparisons.

Method

Participants

Participants were 370 students from a southeastern university (mean age = 23.52 years, 33.00% male, 56.20% female, 10.80% did not report gender) who participated in exchange for extra course credit in their introductory psychology courses.

Procedure

In order to simulate the realism of an applicant setting, deception was incorporated into the design of this study. Although the use of deception can introduce ethical concerns, numerous steps were taken to ensure that participants experienced no adverse effects from their participation. The experimenters attended a regularly scheduled class period in the students’ psychology courses and introduced themselves as associate consultants for a university-based consulting firm. The participants were informed that prior to participating in a research study they would have the opportunity to apply for a job as a customer service representative with the consulting firm. After being given a brief overview of the expected duties, schedule, and compensation associated with the position, the participants, under the belief that they were applying for a job, completed application packets that contained an application form, job description, conscientiousness measure, and cognitive ability assessment.

After the students completed the application packet, a thorough debriefing process was conducted. During the debriefing process, the participants were informed that they were taking part in a research study, and that no jobs were available at the consulting firm. The experimenters informed the participants that they would be asked to complete an additional packet in order to complete the research study and that their responses would be used for research purposes only. The honest condition packet contained special instructions asking participants to respond as honestly as possible, as well as the conscientiousness and manipulation check scales.

Given the potential methodological confounds associated with asking participants to respond honestly after the applicant manipulation (e.g., practice effects), a counterbalanced condition was included in which a pre-manipulation honest conscientiousness assessment was gathered from a subsample of participants. In this counterbalanced condition, participants completed the conscientiousness items (which were embedded in a larger personality assessment) 1 month prior to the manipulation taking place. This measure was administered by participants’ instructors as part of a classroom exercise. Analyses comparing these two approaches are presented later.

Measures

Conscientiousness. The NEO Five-Factor Inventory (NEO-FFI; Costa and McCrae 1992) was used to assess conscientiousness. The NEO-FFI comprises 60 items that assess the Big Five personality constructs. Respondents indicated their level of agreement with each statement on a 5-point Likert scale (1 = Strongly Disagree, 5 = Strongly Agree). Only the conscientiousness scale was used in the current study; it served as the assessment of conscientiousness in both the applicant and honest conditions. The decision to use only the conscientiousness scale was based on research supporting its prediction of job performance in customer service settings (Hurtz and Donovan 2000) and on the conscientiousness-related job duties noted in the job description provided to participants (e.g., maintaining and organizing client information).

One additional point to note is that for one of the multiple-hurdle models to be tested, a cut-score was set for the conscientiousness measure. Therefore, in the multiple-hurdle model in which the conscientiousness variable was used as a screening tool (Hypothesis 3), applicants whose conscientiousness scores were below the mean (for the applicant sample) were removed from further consideration.

Cognitive Ability. The Wonderlic Personnel Test (WPT; Wonderlic 2002) is a 50-item speeded test of general cognitive ability that correlates at r = 0.92 with the Wechsler Adult Intelligence Scale (WAIS). Given that Hypothesis 4 required a cut-score for the first hurdle in the process, a score of 23 was used. According to the test manual (Wonderlic 2002), 25 is the suggested minimum score for customer service representative positions, while 21 is the suggested cut-score for receptionists/clerical workers. Because the proposed job was presented as a customer service position but had many elements of a receptionist position, a cut-score halfway between the recommended scores for the two jobs was chosen.

Application Form. Participants were presented with a job application form in order to add to the realism of the study’s manipulation. The application form did have an additional purpose in that it was used to gather data regarding participants’ interest in obtaining the customer service position being presented. Participants provided ratings on a Likert-scale, with response options ranging from 1 (Not at all Interested) to 5 (Very Interested).

Manipulation Check. After completing all of the study scales, participants were given a manipulation check scale. This scale asked participants to indicate their age, in addition to providing a rating of the believability of the study’s primary manipulation. Participants rated the study’s believability on a Likert-scale, with response options ranging from 1 (Not at all Believable) to 5 (Very Believable).

Analysis

Applicant Faking. Following the methodology outlined by Griffith et al. (2007), applicant faking was assessed via a change score from the applicant to honest condition for each participant. More specifically, each participant’s honest score was subtracted from their score in the applicant condition for the conscientiousness measure. While this procedure served as a means of creating a continuous variable indicating the amount of faking by each participant, for the purposes of the analyses regarding hiring decisions, participants were also categorized as “fakers” or “non-fakers” based on the magnitude of the difference between their applicant and honest conscientiousness scores.

In order to carry out this categorization, we used the methodology suggested by Griffith et al. (2007), in which a 95% confidence interval surrounding each participant’s honest score was computed by multiplying the standard error of the difference (SED = 4.20 in the current sample) by 1.96. The purpose of this confidence interval was to identify individuals who were likely to have faked the conscientiousness assessment in the applicant condition while taking the reliability of the assessment into account. While Griffith et al. used both the standard error of measurement (SEM) and the SED to classify fakers, we chose the SED given that neither score (i.e., applicant or honest) can be considered a fixed or “true” score.Footnote 1 If an individual’s score in the applicant condition exceeded the upper bound of this confidence interval, he or she was categorized as a “faker.” Although the terms “faker” and “non-faker” are oversimplifications, we retain them for parsimony when examining hiring decisions within the current study’s methodology. This procedure resulted in a requisite change score of 8.23 for an individual to be identified as a faker; in total, 19.50% of the current sample met this criterion.
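The classification rule can be expressed directly. The sketch below assumes the SED of 4.20 reported for the current sample; the function name is our own:

```python
SED = 4.20  # standard error of the difference for the current sample

def is_faker(applicant_score, honest_score, sed=SED, z_crit=1.96):
    # Flag a respondent whose applicant-condition score exceeds the
    # upper bound of a 95% CI built around the honest-condition score
    # (the Griffith et al. 2007 approach, using the SED).
    upper_bound = honest_score + z_crit * sed  # honest + 8.232
    return applicant_score > upper_bound
```

Note that 1.96 × 4.20 = 8.232, which rounds to the requisite change score of 8.23 reported in the text.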

In addition to assessing faking from the perspective of Griffith et al. (2007), we also examined hiring discrepancies. In keeping with our discussion pertaining to the potential for individuals who faked the personality assessment to displace individuals who respond honestly, we examined the percentage of the fakers hired that would not have been hired had they responded honestly to the conscientiousness measure. In other words, we were interested in determining how many of the fakers had elevated their scores enough to change their hiring decision. While hiring fakers may not be desirable, selecting individuals who would not have been hired based on their honest personality scores represents the most significant impact of faking on hiring decisions.

All hypotheses were tested using chi-square analysis. For each chi-square test, the percentage of fakers hired and hiring discrepancies for the conscientiousness measure alone were compared to the combination that pertained to the hypothesis being tested. The chi-square analyses were carried out six separate times for each hypothesis, in order to provide tests at three different selection ratios while examining both faking and hiring discrepancies.
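Each of these comparisons reduces to a Pearson chi-square on a 2x2 frequency table (e.g., fakers vs. non-fakers hired, under two selection methods). A minimal version without a continuity correction, using the closed-form expression for 2x2 tables (function and cell labels are ours for illustration):

```python
def chi_square_2x2(a, b, c, d):
    # Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]],
    # e.g., rows = selection method, columns = faker vs. non-faker hired.
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator
```

The resulting statistic is compared against the chi-square distribution with one degree of freedom; identical proportions across the two methods yield a statistic of zero.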

Results

Manipulation Check

Given that this study’s methodology assumed that the experimental applicant manipulation would provide the requisite motivation for participants to fake, mean-level differences between applicant and honest conscientiousness scores should be observed. The results of a paired-samples t-test indicated that mean conscientiousness in the applicant condition (M = 46.54) was significantly higher than mean conscientiousness in the honest condition (M = 43.37; t(369) = 6.72, p < .001; d = 0.38). Descriptive statistics are presented in Table 1.

Table 1 Descriptive statistics and reliabilities for study scales for full sample

Additional manipulation checks included assessing the believability of the simulated applicant setting, as well as participants’ level of interest in the position that was being offered. Although these data were only available for a subsample of participants, they provide a sufficient illustration of the effectiveness of the simulated applicant setting. Overall believability ratings were obtained from 174 participants. Only 2.30% of participants gave a rating of 1 (not believable). These results provide strong support for the effectiveness of the simulated applicant setting.

Data pertaining to participants’ interest in obtaining the customer service position were obtained from 315 participants (55 participants did not complete the interest item on the application form). Of the 315 individuals that provided an answer to the interest item, 73% indicated that they were at least somewhat interested in the position. We conducted an independent samples t-test in order to determine whether there were mean-level differences in applicant conscientiousness scores between those individuals who were at least somewhat interested in the position and those who were not at all interested. The results of the t-test indicated that there were in fact no significant differences in mean applicant conscientiousness scores between interested (M = 46.82) and uninterested (M = 45.13) participants (t(313) = −1.55, ns; d = −0.19). Given that there was not a significant difference between these groups of participants, we conducted all subsequent analyses on the full sample of 370 participants.

Counterbalanced Condition

In the counterbalanced condition, honest scores were collected 1 month prior to the applicant manipulation from a subsample of 132 participants as part of a classroom exercise. To determine whether collecting honest responses after the applicant manipulation introduced a potential confound, we conducted an independent-samples t-test comparing mean honest responses gathered before the applicant manipulation to those from participants whose honest scores were obtained only after the manipulation. There was no significant difference between the two honest conditions, with an honest mean of 44.04 (n = 132) in the counterbalanced condition and 43.47 (n = 238) in the post-manipulation-only honest condition (t(368) = .66, ns; d = 0.07).

In addition to examining mean honest scores, we also investigated the percentage of fakers identified using both honest assessments. For the participants whose honest scores were gathered only after the manipulation, 20.60% of the sample would be considered fakers. When honest scores were gathered prior to the manipulation, 17.30% of the participants were identified as fakers. The results of these analyses suggest that the time at which the honest assessment of conscientiousness was obtained was unlikely to have a substantial bearing on the study’s results. As such, all subsequent analyses were conducted using the full sample of 370 participants.
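The percentage-of-fakers comparisons above rest on a difference-score classification, which can be illustrated as follows. The threshold here is a hypothetical cutoff (e.g., one derived from the standard error of the difference between administrations); the study's actual criterion is not reproduced.

```python
def percent_fakers(applicant, honest, threshold):
    """Classify as a 'faker' anyone whose applicant-condition score
    exceeds their honest-condition score by more than `threshold`
    (a hypothetical cutoff; the study's actual criterion is not
    reproduced here), and return the percentage so flagged."""
    flags = [a - h > threshold for a, h in zip(applicant, honest)]
    return 100.0 * sum(flags) / len(flags)

# Hypothetical scores: only the first participant exceeds the cutoff
pct = percent_fakers([50, 45, 44], [40, 44, 44], threshold=5)
```

Because the percentage flagged depends directly on the threshold, the two honest administrations can be compared on an identical footing simply by holding `threshold` constant, as was done here.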

Descriptive Statistics, Scale Intercorrelations, and Reliabilities

Descriptive statistics for all study scales and estimates of internal consistency for the applicant and honest administrations of the conscientiousness scale are displayed in Table 1. Both the applicant (α = 0.91) and the honest (α = 0.87) assessments of conscientiousness demonstrated acceptable levels of internal consistency, as did the difference score measure of faking behavior.

Scale intercorrelations for the three measures used in this study, along with the amount of faking, are displayed in Table 2. The most notable findings from the correlational analyses pertain to the relationships among conscientiousness, cognitive ability, and faking. As mentioned, cognitive ability was expected to be unrelated to honest conscientiousness scores, applicant conscientiousness scores, and the amount of faking observed in an applicant setting. As Table 2 indicates, this was indeed the case, with cognitive ability demonstrating nonsignificant relationships with honest conscientiousness (r = −0.00, ns), applicant conscientiousness (r = 0.02, ns), and the amount of faking (r = 0.02, ns). Finally, the moderate correlation between applicant and honest condition conscientiousness scores (r = 0.42, p < 0.05) suggests that there were considerable differences in the rank-ordering of participants between the two conditions.

Table 2 Correlations among primary study variables

Hypothesis Tests

Each hypothesis was tested at three different selection ratios (0.10, 0.20, and 0.30). The detailed results of the hypothesis tests for analyses examining the percentage of fakers hired are displayed in Table 3, while analyses examining the percentage of hiring discrepancies appear in Table 4. The percentage of fakers hired by the conscientiousness measure alone varied across selection ratios, with 40.50, 28.40, and 29.70% hired at the 0.10, 0.20, and 0.30 selection ratios, respectively. This pattern was expected based on the results of previous studies (e.g., Haaland and Christiansen 1998; Zickar et al. 1996), which suggested that the effects of faking are more pronounced at smaller selection ratios. In contrast, the percentage of hiring discrepancies occurring for the conscientiousness measure alone remained relatively constant across the selection ratios, with 100.00% hiring discrepancies at the 0.10 and 0.20 selection ratios, and 97.00% at the 0.30 selection ratio.
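The dependence of these outcomes on the selection ratio can be made concrete with a small top-down selection sketch. The scores and faker flags below are hypothetical; ties in score are broken arbitrarily.

```python
def pct_fakers_hired(scores, is_faker, selection_ratio):
    """Hire top-down on `scores` at the given selection ratio and
    return the percentage of the hired group flagged as fakers."""
    n_hire = max(1, round(len(scores) * selection_ratio))
    # Rank applicants by score, highest first (ties broken arbitrarily)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hired = ranked[:n_hire]
    return 100.0 * sum(is_faker[i] for i in hired) / n_hire

# Hypothetical pool: the two highest scorers are fakers
scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
flags = [True, True] + [False] * 8
```

With fakers concentrated at the top of the applicant distribution, tightening the selection ratio (hiring fewer people) raises the proportion of fakers among those hired, which is the pattern the prior studies cited above would predict.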

Table 3 Chi-square analyses for Hypotheses 1a–4a (fakers hired) across all selection ratios
Table 4 Chi-square analyses for Hypotheses 1b–4b (hiring discrepancies) among fakers hired across all selection ratios

Hypothesis 1. Hypothesis 1a suggested that significantly fewer fakers would be hired by a unit-weighted combination of conscientiousness and cognitive ability than by the conscientiousness measure alone, while Hypothesis 1b proposed the same for hiring discrepancies. Hypothesis 1a was not supported at any of the three selection ratios. Although the unit-weighted combination resulted in hiring 13.50% fewer fakers at the 0.10 selection ratio, this difference was not significant (χ² = 2.80, ns). The differences between the unit-weighted combination and conscientiousness alone were less pronounced and also nonsignificant at the 0.20 and 0.30 selection ratios.
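A unit-weighted composite of the kind tested here standardizes each predictor and sums them with equal weights, so that neither measure dominates merely because of its scale. A minimal sketch, with hypothetical scores:

```python
from statistics import mean, stdev

def unit_weighted(consc, ability):
    """Unit-weighted composite: z-score each predictor, then sum
    with equal (unit) weights."""
    def z(xs):
        m, s = mean(xs), stdev(xs)
        return [(x - m) / s for x in xs]
    return [c + a for c, a in zip(z(consc), z(ability))]

# Hypothetical predictors that are perfectly negatively related:
# their contributions cancel in the composite
composite = unit_weighted([1, 2, 3], [3, 2, 1])
```

Because the two predictors contribute equally, an applicant who inflates conscientiousness alone moves up only half as far in the composite ranking as under the personality measure by itself, which is the mechanism behind the predicted reduction in fakers hired.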

In contrast, the examination of Hypothesis 1b yielded stronger differences across the three selection ratios. At all three selection ratios, the percentage of hiring discrepancies occurring for the unit-weighted combination was significantly lower than the percentage occurring for conscientiousness alone. Table 4 presents the results for the hiring discrepancy hypotheses. The values in the table indicate the percentage of discrepancies (D) and non-discrepancies (ND) among the fakers hired (noted as “n fakers”) for each selection method. As indicated in Table 4, significant reductions of 20.00% (χ² = 56.44, p < 0.001), 17.60% (χ² = 74.02, p < 0.001), and 14.20% (χ² = 19.93, p < 0.001) occurred at the 0.10, 0.20, and 0.30 selection ratios, respectively. These findings provide consistent support for Hypothesis 1b.
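The chi-square statistics reported here contrast discrepancy counts between two selection methods. For a 2 × 2 table, Pearson's statistic (without continuity correction) reduces to a closed form; the counts below are hypothetical, not taken from the study.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]],
    without Yates' continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: discrepancies vs. non-discrepancies
# under two selection methods
stat = chi_square_2x2(20, 10, 10, 20)
```

Against one degree of freedom, values above 3.84 correspond to p < 0.05, so the large statistics in Table 4 reflect sizable differences in discrepancy proportions between methods.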

Hypothesis 2. Hypothesis 2a stated that the use of a regression-weighted combination of conscientiousness and cognitive ability would result in the hiring of a significantly smaller percentage of fakers than the use of a conscientiousness measure alone, while Hypothesis 2b proposed that the same would be true for hiring discrepancies. As with Hypothesis 1a, although the regression-weighted combination resulted in the hiring of 10.80% fewer fakers than the conscientiousness measure alone at the 0.10 selection ratio and 8.10% fewer at the 0.20 selection ratio, these differences were not statistically significant (0.10 selection ratio: χ² = 1.79, ns; 0.20 selection ratio: χ² = 2.39, ns). However, at the 0.30 selection ratio, the regression-weighted combination resulted in a significant 9.00% reduction in the percentage of fakers hired (χ² = 4.31, p < 0.05). This finding provided partial support for Hypothesis 2a.

Once again, the examination of hiring discrepancies (Hypothesis 2b) resulted in more noticeable effects. At the 0.10 selection ratio, the use of the regression-weighted combination resulted in a 36.40% reduction in the percentage of hiring discrepancies (χ² = 211.67, p < 0.001). At the 0.20 selection ratio, the regression-weighted combination reduced the percentage of hiring discrepancies by 40.00% (χ² = 350.44, p < 0.001). Finally, at the 0.30 selection ratio, this predictor combination resulted in a 36.10% reduction in the percentage of hiring discrepancies (χ² = 207.07, p < 0.001). These findings lend consistent support to Hypothesis 2b.

Hypothesis 3. The third hypothesis stated that the percentage of fakers hired (Hypothesis 3a) and hiring discrepancies (Hypothesis 3b) occurring for a multiple-hurdle combination, in which conscientiousness is used as the first step and participants are then selected based on their cognitive ability scores, would be similar to those for the conscientiousness measure alone. While this predictor combination reduced the percentage of fakers hired across all three selection ratios, by as much as 13.50% (at the 0.10 selection ratio), none of the differences were statistically significant. While this finding supports Hypothesis 3a, the reduction in the percentage of fakers for this strategy was similar to that of the unit-weighted and regression-weighted combinations. As was the case with Hypotheses 1a and 2a, the lack of significance also needs to be interpreted in light of sample size concerns.
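The multiple-hurdle strategy tested here can be sketched as follows. The scores are hypothetical; following the study, the first-hurdle cut is the sample mean of conscientiousness, and the second hurdle selects the survivors top-down on ability.

```python
from statistics import mean

def multiple_hurdle(consc, ability, n_hire):
    """Hurdle 1: retain applicants at or above the conscientiousness
    sample mean (the cut used in the study); hurdle 2: select the
    survivors top-down on cognitive ability."""
    cut = mean(consc)
    survivors = [i for i in range(len(consc)) if consc[i] >= cut]
    survivors.sort(key=lambda i: ability[i], reverse=True)
    return survivors[:n_hire]

# Hypothetical pool: applicant 1 has the highest ability but is
# screened out at the conscientiousness hurdle
hired = multiple_hurdle([50, 40, 60, 55], [30, 99, 20, 25], n_hire=2)
```

Note that conscientiousness only gates entry past the first hurdle; final rank order is driven entirely by ability, which is why faking on the personality measure matters less under this arrangement.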

Contrary to our expectations, the analysis of hiring discrepancies for Hypothesis 3b yielded significant reductions across two of the three selection ratios for the multiple-hurdle model. While the effects were not as strong as those evidenced by the regression-weighted combination, this particular multiple-hurdle model resulted in hiring discrepancy reductions of 20.00% (0.10 selection ratio; χ² = 56.44, p < 0.001) and 11.10% (0.20 selection ratio; χ² = 29.65, p < 0.001) for the smaller selection ratios.

Hypothesis 4. Hypothesis 4 suggested that the percentage of fakers hired (Hypothesis 4a) and hiring discrepancies (Hypothesis 4b) occurring for a multiple-hurdle combination, in which cognitive ability was used at the first step and participants were subsequently selected based on their conscientiousness scores, would not be significantly smaller than for the conscientiousness measure alone. Because there were no significant differences in the percentage of fakers hired between this multiple-hurdle combination and the conscientiousness measure alone, Hypothesis 4a was technically supported at all three selection ratios. However, the reduction in the percentage of fakers hired was only slightly smaller than that of the other predictor combinations, which suggests that this finding should be interpreted with caution. Nonetheless, the multiple-hurdle strategy used in Hypothesis 4a did result in the smallest reductions in the percentage of fakers hired at each selection ratio.

Once again, our findings differed when hiring discrepancies were examined (Hypothesis 4b). While the reduction in the percentage of hiring discrepancies was nonsignificant at the 0.10 and 0.30 selection ratios, a significant reduction of 4.80% (χ² = 5.32, p < 0.05) occurred at the 0.20 selection ratio.

Discussion

The current study examined the impact of faking across a variety of possible combinations of conscientiousness and cognitive ability scores at three selection ratios. Results indicated that hiring decisions based on a combination of cognitive ability and conscientiousness measures generally did not result in the selection of a significantly smaller percentage of fakers than those based on a conscientiousness measure alone. However, the combinations did tend to result in significant reductions in hiring discrepancies across selection ratios. Consistent with the findings of Converse et al. (2009), the effectiveness of the multiple-predictor combinations on reducing the effects of faking varied across the conditions tested in the present study.

Findings

The findings pertaining to the percentage of fakers hired (Hypotheses 1a–4a) provided mixed support for the study’s hypotheses. Hypotheses 1a and 2a, focusing on compensatory predictor combinations, received little support, with the exception of a significant reduction in the percentage of fakers hired by the regression-weighted combination at the 0.30 selection ratio. The results pertaining to Hypotheses 3a and 4a, while technically lending support to our assertions, should be interpreted with caution for several reasons. First, the noncompensatory model tested for Hypothesis 3a resulted in reductions in the percentage of fakers in the hired sample that were comparable to those of the compensatory models. While these reductions were not statistically significant (therefore technically supporting the hypothesis), this was likely due to sample size limitations. The noncompensatory model tested in Hypothesis 4a, while generally offering the smallest reductions in the percentage of fakers in the hired sample, should also be interpreted in light of the study’s sample size. Finally, the reduction in the percentage of fakers hired by the conscientiousness–ability combinations tended to decrease at the 0.20 and 0.30 selection ratios. Therefore, although the sample size increased for the analyses at these selection ratios, the differences in the frequency of hiring a faker also tended to become less pronounced.

While most of the reductions in the percentage of fakers in the hired sample were not significant, some ability–conscientiousness composites resulted in as much as a 13.50% reduction, which suggests that our findings may have practical implications. In large-scale hiring scenarios, reductions of this magnitude could offer considerable improvements to selection practices. Interestingly enough, under no predictor combination scenario or selection ratio did the percentage of fakers hired drop below 20.30%. In contrast, the highest percentage of fakers hired at any selection ratio by the conscientiousness measure alone was 40.50%. Therefore, while it appears that the addition of a cognitive ability measure may reduce the percentage of fakers hired to some degree, it does not appear to lower the percentage to a level that should totally eliminate concern over the presence of faking.

Contrary to the results for Hypotheses 1a–4a, our analyses examining the percentage of hiring discrepancies (Hypotheses 1b–4b) yielded stronger effects. The percentage of hiring discrepancies (operationalized as fakers hired based on their applicant conscientiousness scores who would not have been hired based on their honest conscientiousness scores) could be greatly reduced by using the composite predictors. While the use of the conscientiousness measure alone resulted in 97.00–100.00% hiring discrepancies, conscientiousness–ability composites led to reductions in hiring discrepancies as high as 40.00% at the 0.20 selection ratio. These findings provided strong support for Hypotheses 1b and 2b and suggest that using multiple predictors can substantially reduce the percentage of individuals who fake a conscientiousness measure enough to substantially affect hiring decisions (i.e., fake enough to get hired). Once again, however, the results for Hypotheses 3b and 4b were somewhat inconsistent with our expectations. While typically offering smaller reductions in hiring discrepancies than the compensatory models, the two multiple-hurdle approaches did result in significant reductions when compared to the use of personality alone.

The contrasting findings for Hypotheses 3a and 3b (and to some extent 4a and 4b) warrant a brief discussion. Our findings suggest that even though numerous fakers remained in the sample after the screening process, selecting on a variable that was unrelated to faking still resulted in the hiring of fewer fakers (or fewer hiring discrepancies) than the conscientiousness measure alone. This finding supports the assertion of Mueller-Hanson et al. (2003) that conscientiousness measures are most effectively used as screening tools. An additional point worth noting is that the effectiveness of this scenario may also be a function of the cut-score used in the first hurdle. Specifically, conscientiousness receives less weight in this scenario because applicants only need to fall in the top 50% (approximately) of conscientiousness scores, but are selected in top-down fashion on their ability scores.

Implications

Situations in which individuals are able to fake enough to alter their hiring decision are likely to strain our assumptions of a fair and standardized testing process. The results of the present study suggest that using multiple predictors in the selection process may reduce the negative impact of faking, particularly through a reduction in the number of individuals who are able to fake enough to alter their hiring decision. Faking that substantially affects hiring decisions is a concern not only for the effective prediction of performance, but also for the fairness of selection practices involving personality assessments (e.g., Hough 1998; Morgeson et al. 2007). While the use of conscientiousness–cognitive ability combinations significantly reduced the number of hiring discrepancies across the various selection scenarios examined, it did not appear to completely remove the influence of faking. In the best case scenario, over 20% of the hired sample faked the conscientiousness scale, and 60% of those individuals would not have been hired based on their honest personality scores; these numbers are far from trivial.

Going beyond the fairness concerns noted earlier, additional research has looked at the effect of hiring decisions involving fakers and non-fakers on subsequent mean performance differences. Mueller-Hanson et al. (2003) reported lower mean performance scores for individuals selected from an incentive condition compared to those selected from an honest condition. Additionally, the participants from the incentive condition were more likely to be hired, particularly at small selection ratios. However, a recently published simulation study by Schmitt and Oswald (2006) examined the impact of removing suspected fakers from the selection pool on mean performance levels. For the most part, the results of this study indicated that the removal of fakers from applicant samples had little positive impact on the mean performance levels of the sample that was selected, and in some cases, could actually result in lower mean-level performance (if faking was positively correlated with performance).

Finally, the goal of the present study was not to argue for the addition of measures of general cognitive ability to personnel selection batteries as a means of reducing the impact of faking, but rather to examine faking within the context of a realistic personnel selection scenario. The benefits and limitations of using measures of general cognitive ability have been well documented in the personnel selection literature. Cognitive ability has demonstrated strong predictive validity across a variety of jobs (Schmidt and Hunter 1998), which has contributed to its broad appeal in personnel selection. However, the potential for adverse impact associated with the use of cognitive ability tests (e.g., Potosky et al. 2005) represents a persistent concern. As such, considerations regarding the composition of selection composites must take into account trade-offs between a variety of selection outcomes (e.g., criterion-related validity, adverse impact; De Corte et al. 2007).

Limitations and Future Research

There are several limitations associated with the current study that might be addressed in future research. First, the small samples involved in the hypothesis tests likely contributed to the lack of statistically significant findings regarding the percentage of hired individuals who were fakers. Although a situation in which 370 individuals apply for 37 available positions (i.e., a 0.10 selection ratio) may be representative of a real world hiring process, a considerably larger sample may be required to conduct formal analyses comparing the percentage of fakers hired by various predictor combination strategies. Thus, additional research involving larger samples may allow for firmer conclusions regarding the percentage of fakers hired under various predictor combinations. Furthermore, while the current study investigated a limited range of selection ratios and first-hurdle cut-scores (i.e., mean conscientiousness scores, mean occupation-level WPT scores from the assessment manual), future research should consider a broader range of selection situations. For example, if fakers tend to rise to the top of the personality score distribution, setting a more stringent first-hurdle conscientiousness cut-score may result in an increased percentage of fakers in the sample that passes this hurdle.

Although the present study’s methodology attempted to simulate a true applicant setting, there are limitations associated with this design. Participants did not actively seek out the job opportunity, but were instead presented with a chance to apply for the job. While this was necessary to ensure that minimal harm resulted from the study’s use of deception, it does create a somewhat artificial applicant situation. Nonetheless, as the manipulation check data suggest, a substantial proportion of the participants believed the manipulation and were at least somewhat interested in the position. If lack of believability or lack of interest in the position had been a factor in the study, one would expect it to have reduced participants’ motivation to fake, leading to a conservative estimate of the amount and prevalence of faking in an applicant sample.

An additional limitation was the use of an artificially dichotomized variable to identify fakers. While we were able to obtain a continuous estimate of the amount of faking each participant engaged in, categorizing individuals as “fakers” or “non-fakers” was necessary for the purpose of the study’s analyses. However, this distinction is likely an oversimplification of a complex behavior. In addition, any attempt to identify fakers will inevitably result in errors in which honest individuals are labeled as fakers and fakers are labeled as honest (i.e., Type I and Type II errors). Different approaches could be used to reduce one of these errors at the expense of the other, based on which error is of most concern given one’s situation and purposes (e.g., using a smaller confidence interval to ensure most fakers are identified as such, which would also result in more honest individuals being labeled as fakers). Additional research is likely necessary to determine whether Type I or Type II errors are more palatable in a given situation (e.g., if faking were linked to counterproductive work behaviors, a slightly higher Type I error rate may be acceptable). The dichotomization used in the present study also necessitated the use of only one personality dimension. While conscientiousness was the most relevant personality characteristic for the job used in the current study, this is unlikely to be the case across a variety of jobs. Therefore, the goal of examining faking in more realistic selection scenarios should also be pursued by studying more complex situations in which multiple personality traits and other characteristics may contribute to overall hiring decisions.
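One way to formalize the confidence-interval approach described above is a reliable-change-style cutoff on the difference score. The figures below are hypothetical; the key point is that widening the interval (a larger z) reduces Type I errors at the cost of more Type II errors when flagging fakers.

```python
def faking_cutoff(sd, reliability, z=1.645):
    """Reliable-change-style cutoff for the applicant-minus-honest
    difference score. se_diff is the standard error of the difference
    between two administrations of the same scale; a larger z widens
    the interval, yielding fewer Type I errors (honest respondents
    flagged as fakers) but more Type II errors (fakers missed)."""
    se_diff = sd * (2 * (1 - reliability)) ** 0.5
    return z * se_diff

# Hypothetical scale: SD = 10, internal consistency = .90,
# one-sided 95% cutoff
cutoff = faking_cutoff(10, 0.9)
```

Under these assumed values, only difference scores exceeding the cutoff would be labeled faking; shrinking z would flag more respondents, illustrating the trade-off discussed above.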

Finally, because some of the subjects’ honest responses were collected after the applicant manipulation had taken place, the degree to which those honest responses reflect participants’ “true” conscientiousness levels may be called into question. We sought to address this issue by examining data from a counterbalanced condition in which the honest conscientiousness responses were obtained approximately 1 month prior to the applicant manipulation. Our analyses suggested that collecting honest responses before versus after the manipulation resulted in similar mean-level conscientiousness scores and a similar percentage of fakers in the sample. Further research is necessary to determine whether assessing conscientiousness prior to or following this type of manipulation provides a score that is more representative of an individual’s “true” trait level.

Overall, the results of the present study suggest that the effects of faking may be partially mitigated when personality measures are used in conjunction with other predictors. However, considerable room for investigations into faking under such realistic selection scenarios remains.