Introduction

It is in the news all too often. A car manufacturer cheats on engine performance tests. Corporate executives fraudulently conceal large losses. A financial advisor sells stock in nonexistent companies. Such instances of unethical behavior have widespread consequences. Bernard Madoff’s $64.8 billion Ponzi scheme, the largest in American history, created a chain reaction forcing organizations to close and causing investors to lose billions of dollars. The Financial Crisis Inquiry Commission report asserted that a chief cause of the 2008 economic crisis was a systematic breakdown in ethics. Similarly, the most recent National Business Ethics Survey (NBES, Ethics Resource Center 2014) found that 41 % of workers report having witnessed unethical or illegal conduct in their workplace. Although this number has been declining over the years, it still represents a major area of concern for organizations as ethical breaches can be accompanied by hefty financial consequences.

The role of behavioral ethics in the workplace is a topic of growing interest and relevance among researchers. Comprehensive literature reviews (e.g., Mayer 2014; O’Fallon and Butterfield 2005; Tenbrunsel and Smith-Crowe 2008; Treviño et al. 2014; Treviño et al. 2006) and meta-analyses (e.g., Kish-Gephart et al. 2010; Martin and Cullen 2006; Mesmer-Magnus and Viswesvaran 2005; Pan and Sparks 2012) published on the topic of behavioral ethics bear testament to its prominence and popularity.

Despite an abundance of ethics-related publications, journals, classes, and debates, there is a relative dearth of literature that attempts to specify the behavioral dimensions of ethical performance in the workplace. In general, the work ethics literature has been dominated by the independent variable; that is, the influence of ethical codes, training programs, climate and culture, individual differences in interpersonal and cognitive processes, and failures in self-regulation (Treviño et al. 2014). The dependent variable has received far less attention. Researchers have largely focused on dependent variables such as opinions, perceptions, and values held by organization members instead of individual ethical job performance itself (Gatewood and Carroll 1991). In recent years, ethical behaviors have been defined for (a) a few occupational groups such as medical students (Schubert et al. 2008), senior managers (Foldes 2006), or leaders (Brown et al. 2005); or for (b) a few specific domains of work, such as scientific work (Helton-Fauth et al. 2003). Kaptein (2008) took a somewhat broader view and specified unethical behaviors for five different categories of stakeholders. Specifications for a general underlying taxonomy (i.e., latent structure) of ethical behaviors in the workplace are needed to provide a foundation for systematic study of ethical performance at work and development of assessment tools.

Objectives

The objectives of the current effort were twofold. First, we sought to develop a “model” of ethical behavior at work in the United States that places it within the larger context of individual job performance, as it is modeled in Industrial and Organizational (I–O) Psychology. This entails defining behavioral dimensions of ethical performance that are common across occupations. Our second goal was to develop a set of rating scales that could be used to assess individuals on the ethical performance dimensions.

Working Definition of Ethical Performance in the Workplace

Traditionally speaking, business ethics have been viewed as actions that are taken or not taken, in the work context, and which are judged as meeting, or not meeting, an ethical standard (Tenbrunsel and Smith-Crowe 2008; Treviño et al. 2014). Such definitions imply that there are value systems (plural) that specify which behaviors are ethical and which are not, and that it may not be an easy designation to make in a given situation. Moreover, there are many contextual features that could influence what is perceived as ethical, or not ethical, at any given time and place.

For many years, researchers assumed that the individual’s intent was an important consideration in determining whether behavior was unethical. Unintentional behaviors could be excused. Recent thinking distinguishes more clearly between ethical behavior and ethical intent. Well-intended behaviors may still be unethical; failing to recognize the ethical implications of a problem may result in ethical intentions but unethical actions (DeCremer 2011; Tenbrunsel and Smith-Crowe 2008; Treviño et al. 2014). For the purpose of this project, we tentatively define unethical behavior as follows:

Unethical behavior at work is behavior that violates a prescribed norm that is based on a code of behavior at work that is (a) ascribed to by the relevant organization or professional group, (b) prescribed by relevant regulatory bodies or by statute, or (c) widely endorsed in the society. An ethical violation has at least the potential for doing harm to one or more of the organization’s stakeholders. Among the relevant stakeholders are stockholders, the management, coworkers, customers, clients, and the public good. Unethical behaviors are distinct from ethical intentions. Unethical behaviors can result from (a) lack of awareness that one is facing an ethical problem, (b) ill-considered good intentions, or (c) unethical intentions. The standards by which a prescribed norm is judged to be violated most likely have multiple determinants, such as the national culture, public (i.e., government) policy, and the prevailing value systems of the important stakeholders.

We view ethical behavior as a component of job performance (i.e., a performance requirement for any work role). To this end, we situate ethical performance within the Campbell (2012) model of performance. The Campbell model is intended to synthesize all past and current attempts to model the substantive dimensionality of performance in a work role (see also Campbell et al. 1993; Campbell and Wiernik 2015). This synthesized model puts forth the following principles. First, performance is what people actually do at work for the purpose of helping the organization (even an organization of one) accomplish its goals. Consequently, two questions can be asked about what people do at work: (a) is a particular behavior or action relevant for the organization’s goals? and (b) if so, to what degree do the specific actions of an individual contribute to the organization’s goals (i.e., how do we measure the individual’s contribution to the organization)?

Second, the Campbell (2012) model contends that individual performance is multi-dimensional. Although a covariance matrix for multiple performance measures will typically yield a general factor, this is a separate issue and does not speak to the substantive differences in performance domains. After more than 30 years of research and experience, there exists substantial support for the eight factors that comprise the model. These factors are shown in Fig. 1. Extant literature (see Campbell and Wiernik 2015) also provides a reasonable synthesis of the subfactors of both leadership and management. These are shown in Figs. 2 and 3, respectively. Additionally, the Campbell model easily accommodates Borman and Motowidlo’s (1993) popular notion of contextual performance as well as the components of Organizational Citizenship Behavior (OCB).

Fig. 1

The eight performance factors in the Campbell (2012) Model adapted from “Behavior, performance, and effectiveness in the twenty-first century,” by J. P. Campbell, 2012, in S. Kozlowski (Ed.), The Oxford Handbook of Organizational Psychology: Volume 1. New York, NY: Oxford University Press

Fig. 2

Components of leadership performance adapted from “The modeling and assessment of performance at work,” by Campbell and Wiernik 2015, Annual Review of Organizational Psychology and Organizational Behavior, 2, 47–74

Fig. 3

Components of Management Performance adapted from “The modeling and assessment of performance at work,” by Campbell and Wiernik 2015, Annual Review of Organizational Psychology and Organizational Behavior, 2, 47–74

The third principle of the Campbell (2012) model specifies that individual differences in performance are a function of two sets of determinants: direct and indirect. The direct determinants are (1) current specific job knowledge, (2) current job specific skill, and (3) three volitional choices (euphemistically referred to as “motivation”): (a) the choice to expend effort on a particular activity, (b) the choice of the level of effort, and (c) the choice of how long to persist. They are the determinants that are present and operate in real time “on the job,” so to speak. Variance is also accounted for by their interactions. For example, being highly knowledgeable about a particular job requirement could increase the probability of choosing to do it.

In contrast, indirect determinants are all the things that can produce individual differences in the direct determinants (e.g., cognitive ability, personality, training, goal setting, reward preference, self-efficacy, etc.). They can influence performance only by influencing the direct determinants. That is, the direct determinants totally mediate the effects of the indirect determinants.

It is important to identify how ethical performance fits into this existing model of job performance. It is evident that ethical performance does indeed fit the characterization of “actions taken at work.” In fact, the business ethics literature often talks about ethical decision-making as a dependent variable.

Further, whether a particular ethical action is relevant for the organization’s goal (or not) is most likely a function of more than one value system (e.g., an organization espouses a code of Corporate Social Responsibility and also functions as a profit maximizer). What happens if these value systems disagree? Also, and perhaps to a greater extent for ethical performance than for other dimensions of performance, the value system of the organization and the value system of the individual may be in conflict, which makes assessing the level of ethical performance even more difficult. That is, from whose perspective did the individual perform ethically? Or unethically?

Another issue that warrants attention is whether [as in the Campbell (2012) model] it is more useful to view ethical performance as a subfactor of the counterproductive work behavior (CWB) factor; as a subfactor of the overall management factor; or as a distinct factor in its own right. The most useful specification, we argue, is one that is informed by research beginning with a systematic attempt to specify the content of ethical performance. As stated above, this is one of the primary goals of the current research.

The business ethics literature provides many examples of both the direct and indirect determinants of individual differences in ethical performance. Much of the attention is focused on the indirect determinants of the choice to act ethically, such as philosophical orientation, university training (e.g., courses in business ethics in management schools), personality, ethical efficacy, gender, and analysis of decision consequences. However, the Kohlberg (1969) and Rest (1986) models of moral reasoning also include knowledge (KN) and skill (SK) as direct determinants of ethical decision-making.

The Campbell (2012) model distinguishes determinants of individual differences in performance from influences on the mean of performance for a specific sample of individuals. For example, one of the most important influences on the performance mean these days, at least in the opinion of many people, is technology. Technology only becomes a determinant of individual differences in performance if we assess the individual differences in how well people have learned to use the technology.

If we think of influences on the mean as contextual factors, then the contextual factors most often talked about in the business ethics literature are (a) the code of ethics formally adopted by the organization; (b) the organization’s ethical climate (which could be viewed as the translation of the espoused ethical code to the code “in use”); (c) ethical leadership (which could be viewed as a major component of the “climate”); (d) prevailing norms and culture (some people distinguish these from climate); and (e) the “moral intensity” of the ethical decision-making situation, which might be loosely defined as the prevailing situational pressure to act, either ethically or unethically (see Treviño et al. 2014).

As is true of other performance factors, individual differences are also a function of the interactions among determinants (e.g., the greater one’s KN and SK, the higher, or lower, is the probability of making a particular choice), or between individual differences and features of the context; for example, the interaction of the individual’s value system with the organizational ethical climate, or the interaction between personality and moral intensity.

Individual ethical performance must also be distinguished from its consequences, or outcomes. That is, what is the effect of high or low ethical performance on important outcomes such as sales, the organization’s reputation, or the morale of the work team? It is axiomatic that such outcomes have other determinants as well, in addition to individual ethical performance.

Study 1: Identification of Performance Dimensions and Development of Initial EPRS

Purpose

The purposes of Study 1 were to (a) identify a set of performance dimensions capturing (un)ethical behavior at work and (b) develop ethical performance rating scales (EPRS) to accompany those dimensions.

Method

Study 1 involved qualitative analysis of four different types of information to establish a dimension structure that could be evaluated in Study 2. Qualitative methods are commonly used in the early stages of instrument development (e.g., Mallard and Lance 1998) or taxonomy definition (e.g., Flanagan 1954). We content-analyzed four distinct types of information to develop performance dimensions and scales: (a) the published literature on ethical behavior, (b) professional codes of ethics from a sample of occupations, which was the data source for Kaptein (2008), (c) critical incidents of ethical performance from a large government organization, and (d) behavioral items from ethics surveys.

Literature Review

Reviewing literature on ethical performance is a monumental task. Initially, the review spanned a variety of disciplines (e.g., philosophy, sociology, anthropology, etc.). However, given the focus, context, and intended purpose of the present research project, the decision was ultimately made to narrow the search to applied psychology and business journals. The authors searched the abstracts and titles of published articles in peer reviewed journals using the Web of Science and PsycINFO databases for keywords such as ethics, ethical, ethical performance, ethical behavior, and ethical decision-making. The keyword searches identified a variety of publications from journals including the Journal of Applied Psychology, Academy of Management Review, and Journal of Business Ethics, among others. The Journal of Business Ethics, in particular, was a very useful resource. We also examined programs for two years of the annual conference of the Society for Industrial and Organizational Psychology (SIOP) and obtained relevant conference papers.

After gathering the relevant citations, the authors began to read and review the full text manuscripts and compile this information into summaries. Specifically, the initial investigative efforts were directed toward achieving three goals: (1) identifying the seminal models, conceptual treatments, and empirical works related to ethical behavior, ethical decision-making, and/or ethical performance; (2) compiling previously researched dimensions/variables relevant to the study of ethics and ethical behaviors; and (3) constructing a theoretically and practically meaningful definition of ethical performance. The final results of the literature review process thus served to provide a relatively comprehensive foundation from which to base subsequent taxonomic and model development activities.

Several behavioral dimensions were evident in the literature. The most common types of behaviors emerging from the literature were truthfulness (vs. lying), showing respect (or disrespect) for others, and obeying the law (Broome et al. 2005; Gaumnitz and Lere 2002; Kaptein 2010; Stevens 2001; Vitell et al. 2000). These are also components of counterproductive work behavior (Spector et al. 2006). The organizational justice literature (Colquitt 2001) also suggests a “fair treatment,” or procedural justice, dimension, having to do with fair treatment of coworkers and subordinates. A dimension having to do with avoiding being coercive was supported by the ethical leadership (Brown and Treviño 2006; Treviño et al. 2014) and toxic leadership (Lipman-Blumen 2004; Kellerman 2004) literatures.

Review of Professional Codes of Ethics

The Center for the Study of Ethics in the Professions at the Illinois Institute of Technology (http://ethics.iit.edu/) maintains an extensive online collection of codes of ethics for professional societies, corporations, government, and academic institutions. At the time of our search, the Center had links to approximately 720 ethical codes for 26 different professional categories. We sampled approximately 10 % of the codes for each professional category. So, for example, the first professional category was “Agriculture.” There were nine links under this category; one was reviewed. For the two categories with the largest number of links, “Health Care” and “Other,” we reviewed 10 and 11 links, respectively (see Table 1). In choosing a single ethics code from a list of many, we tried to select the code with the broadest base (e.g., we chose the American Veterinary Medical Association’s ethics code instead of the Cavalier King Charles Spaniel Club of Canada’s code). Also, where relevant, we used English language sites and tended toward American- or International-based ethics codes (vs. codes from a specific country). Where a specific organization had more than one ethics code because the code was updated yearly, we chose the most recent update.
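The per-category sampling arithmetic can be sketched as follows. This is only an illustration: the text says "approximately 10 %" without giving a rounding rule, so the round-to-nearest rule with a floor of one code per category is an assumption, as are the function name `codes_to_review` and every category size other than Agriculture.

```python
def codes_to_review(n_links: int, fraction: float = 0.10) -> int:
    """Number of codes to review in a category: about `fraction` of its links, at least one.

    The rounding rule and the one-code minimum are assumptions, not the authors' stated procedure.
    """
    return max(1, round(n_links * fraction))


# Only the Agriculture count (9 links, 1 reviewed) comes from the text;
# the other category names and sizes are made up for illustration.
category_links = {
    "Agriculture": 9,
    "Hypothetical category X": 42,
    "Hypothetical category Y": 98,
}
review_plan = {name: codes_to_review(n) for name, n in category_links.items()}
print(review_plan)
```

Under this assumed rule, a category with nine links yields one reviewed code, which matches the Agriculture example in the text.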

Table 1 Number of ethical codes reviewed by professional category

As we reviewed ethics codes, we analyzed the content and extracted behavioral statements from codes to identify universal concepts appearing across professions, disciplines, and organizations. Initially, we categorized behavioral statements into 26 concepts (e.g., honesty, impartiality, transparency, and openness). We discussed the concepts and grouped similar ones together drawing on our own knowledge of ethics literature. This process resulted in the following nine preliminary dimensions:

1. Does not knowingly mislead clients, coworkers, supervisors, management, or customers when offering advice or consultation.
2. Accurately reports product/service quality data, use of financial resources, effort levels, or performance outcomes.
3. Overtly acknowledges potential conflicts of interest that involve personal gain versus achieving organizational, professional, or public goals.
4. Gives credit to the work of others and does not maliciously harm the reputation, work, or performance of others.
5. Maintains appropriate confidentiality regarding client, customer, coworker, and organizational information.
6. Acts in accordance with the goals, values, and ethics of own occupation/profession and of the organization.
7. Does not violate federal, state, or local laws.
8. Reports unlawful behavior, maliciousness, and harmful malfeasance to the appropriate authority.
9. Does not obtain unfair advantage via nepotism, insider information, or violating the intellectual and/or property rights of others.

Critical Incident Sort

To evaluate the preliminary dimensions, we conducted two iterations of critical incident sorting, using ethics-related critical incidents collected from a large government organization. In the first iteration, 60 critical incidents were randomly selected. Four PhD-level research staff (“sorters”) participated in a sorting task. The incidents were write-in comments from a survey and had not been prescreened to ensure they had ethics-related content. Sorters were asked to make a yes/no decision as to whether the behavior in the incident was related to ethics according to a draft version of our definition of (un)ethical behavior provided earlier in this article. Forty-nine of the 60 incidents were deemed ethics-related by at least three of the four sorters. For critical incidents including ethics-related behaviors, sorters were asked to identify the most relevant of the nine dimensions by assigning a “1.” If other dimensions were also thought to be relevant, sorters were told to assign a “2” or a “3” according to the degree of relevance. Sorters were also asked to provide written comments about the categorizations. Thirty-nine of the 49 incidents with ethical content were categorized consistently across sorters (i.e., received either a “1” or a “2” from at least 3 sorters). The sorters discussed the ratings to reach consensus on the status of incidents that were not categorized consistently or tended to fall into two categories. As a result of this process, dimension definitions were revised. The first two dimensions were not well-differentiated and were merged into a broader truthfulness dimension; coercion, which was loosely associated with dimension #3, was separated out to form a dimension of its own.

In the second iteration, another 60 critical incidents were sorted by the same four research staff. Fifty of the 60 incidents were rated as having ethical content, and 38 of the 50 incidents were consistently categorized into one dimension (using the criteria described for the first sort). We discussed the content of the incidents that were inconsistently and consistently sorted. Based on the discussion, we decided to merge dimensions #6 and #7 above, both of which have to do with abiding by organizational or societal rules. We split #4 into three categories: (a) giving credit to others for their work, (b) maliciously harming others, and (c) harassing others. Ten performance dimensions resulted from this effort: Truthfulness, Full Disclosure, Intellectual Property, Confidentiality, Unfair Treatment, Respect for Others, Harassment, Whistle-Blowing, Abuse of Power, and Lawfulness.
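The consistency criterion used in both sorts can be sketched in a few lines. The sketch assumes that "received either a 1 or a 2 from at least 3 sorters" refers to the same dimension for a given incident; that reading, the data layout, and the function name `consistent_dimensions` are our assumptions for illustration, and the ratings shown are invented.

```python
from collections import Counter


def consistent_dimensions(sorter_ratings, min_sorters=3, top_ranks=(1, 2)):
    """Return the dimensions that at least `min_sorters` sorters ranked 1 or 2.

    `sorter_ratings` holds one dict per sorter, mapping a dimension label to the
    relevance rank (1, 2, or 3) that sorter assigned for a single incident.
    """
    counts = Counter()
    for ratings in sorter_ratings:
        for dimension, rank in ratings.items():
            if rank in top_ranks:
                counts[dimension] += 1
    return sorted(d for d, n in counts.items() if n >= min_sorters)


# Invented ratings from four sorters for one incident.
incident = [
    {"Truthfulness": 1, "Lawfulness": 2},
    {"Truthfulness": 1},
    {"Truthfulness": 2, "Confidentiality": 1},
    {"Lawfulness": 1, "Truthfulness": 3},
]
print(consistent_dimensions(incident))  # ['Truthfulness']
```

An incident for which this function returns an empty list, or more than one dimension, would correspond to the cases the sorters resolved through discussion.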

Dimension Review

As a check on the dimension structure, we identified several published surveys (Broome et al. 2005; Gaumnitz and Lere 2002; Kaptein 2010; Stevens 2001; Vitell et al. 2000) and one doctoral dissertation (Foldes 2006) containing behavioral statements about ethics. Two of the team members sorted those statements into the 10 existing dimensions accompanied by definitions. No additional changes were made to the dimension definitions. Definitions of the 10 dimensions at the end of Study 1 appear in Fig. 4.

Fig. 4

Definitions of ethical performance dimensions at the end of Study 1

Development of Behaviorally Anchored Rating Scales (BARS) for the EPRS Dimensions

We drafted anchors for behaviorally anchored rating scales by extracting content from the critical incidents and behavioral survey items sorted into each of the 10 dimensions. Within each dimension, we wrote anchors to reflect behaviors at different levels of ethicality. We chose a four-point rating format for the scales (1 = Clearly Unethical, 2 = Unethical, 3 = Ethical, and 4 = Clearly Ethical). Scaling research has recently been critical of scales that have a mid-point, that is, a neither ethical nor unethical point in the middle (Stark et al. 2006). The mid-point in behavioral scales is often defined by double-barreled statements (e.g., “Usually arrives at work on time but is occasionally late”). This would be particularly problematic with ethical performance rating scales. We omitted a mid-point on this scale to force raters to choose whether the ratee’s behavior tends to be ethical or unethical.

Summary

A review of the research literature, examination of a sample of existing ethical codes, and critical incident sorting exercises culminated in 10 dimensions of ethical performance in the workplace. These dimensions provide a foundation for future ethics research, and a working taxonomy that can be used to describe ethical behavior. Some or all of the rating scales can be used to evaluate employee or supervisor performance.

Study 2: EPRS Retranslation Study

Purpose

The purpose of Study 2 was to conduct a retranslation study (Smith and Kendall 1963) of a new set of ethical performance episodes, with the goal of further validating the dimension structure established in the development of the EPRS in Study 1.

Method

To accomplish this objective, we asked graduate students who had not participated in the development of the EPRS to sort behaviors from ethical vignettes into the 10 dimensions and use the EPRS to rate the level of ethicality of the performance behaviors of characters in the vignettes. This allowed us to evaluate the EPRS without having to obtain supervisor ratings in a work setting, where the incidence of unethical behavior might be low. In Study 1, we relied on a review and content analysis of professional codes of ethics and critical incidents to develop the initial version of the EPRS. The incorporation of a new and third type of stimulus in Study 2 (i.e., vignettes describing ethical or unethical behavior) provides an additional, and informative, approach to scale development. Using vignettes also allowed us to ensure that different dimensions and levels of ethicality would be represented, and facilitated higher quality data from respondents than are possible from simple questions (Alexander and Becker 1978).

Identifying and Editing Vignettes

To build the retranslation survey, we first identified journal articles featuring ethical vignettes or situation-based stimuli using keyword searches. During development of the EPRS, we learned that the Journal of Business Ethics frequently published ethical vignettes, and research assistants reviewed every article published in this journal for the last 10 years to identify and extract vignettes.

We then placed all of the extracted vignettes into a single centralized document so that the content and structure of the vignettes could be reviewed and compared. We removed duplicate vignettes, or those that exhibited redundancies (i.e., oftentimes articles featured vignettes from a prior article, editing or revising the scenarios, characters, or context based on idiosyncratic research goals). We edited the wording of the vignettes to make them all gender neutral (i.e., changed proper nouns to “Employee X,” “Supervisor,” “Coworker,” etc.) as well as to adjust the language structure to make each vignette more concise and of similar length (i.e., around 2–4 sentences). This process resulted in retaining approximately 125 vignettes.

Adding Items

We wrote one to four items that followed each vignette. Items asked respondents to rate the behavior of one or more characters in the vignette (e.g., “Rate the Supervisor’s behavior,” “Rate Employee X’s behavior”). Two example vignettes with their items appear in Fig. 5.

Fig. 5

Two vignettes with items

Assigning Items to Dimensions

We wanted to ensure that at least some items representing each of the 10 dimensions were included in the retranslation survey, and that the items represented a range of ethicality. With that goal in mind, two of the authors independently (a) sorted a sample of 62 items accompanying a set of 25 vignettes into the 10 ethical performance dimensions and (b) rated the ethicality of the performance behavior represented by these items. We compared the dimensions to which we each assigned an item, as well as our ethicality ratings, and then discussed discrepancies to reach consensus. One author sorted and rated the remaining items based on decision rules established during this consensus discussion.

The goal of this activity was to generate a rough preliminary categorization to ensure that all of the dimensions were represented and that no single dimension was grossly over or underrepresented. Some of the ethical performance dimensions are more or less narrow in scope than others, and thus, the number of items per dimension varied. For example, Harassment was a very narrow dimension. Multiple items about harassment would have been redundant, and we did not want to make our respondents read variants on the same vignette repeatedly; therefore, we included only a few vignettes about Harassment. In contrast, Truthfulness was a very broad dimension with many different vignettes. It needed to be more heavily represented in the survey. As needed, we authored new vignettes with accompanying items for dimensions that exhibited lower content coverage.

Assigning Vignettes to Survey Forms

The prior steps resulted in retaining 73 vignettes accompanied by their respective items (146 items in total). To make the retranslation rating task more manageable (and not exhaust participants), we divided the vignettes into two retranslation survey forms. We randomly assigned the vignettes across the forms and then compared the number of items for each dimension on the two forms. We shifted some items around to ensure that all 10 dimensions received similar coverage across the two forms. We also identified highly similar items (i.e., items featuring similar content) and either split them across the two forms or moved them further apart from one another on the same form. In total, Form A had 37 vignettes (73 items) and Form B had 36 vignettes (73 items).
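The random assignment and the per-dimension coverage check described above might look like the following sketch. The vignette records, the two-items-per-vignette layout, the dimension letters, and the function names are all hypothetical; the manual shifting of items between forms that the authors describe is not automated here.

```python
import random
from collections import Counter


def split_into_forms(vignettes, seed=0):
    """Randomly split vignettes into two forms of (nearly) equal size."""
    shuffled = list(vignettes)
    random.Random(seed).shuffle(shuffled)
    half = (len(shuffled) + 1) // 2
    return shuffled[:half], shuffled[half:]


def items_per_dimension(form):
    """Count how many items on a form fall under each dimension."""
    counts = Counter()
    for vignette in form:
        counts.update(vignette["item_dimensions"])
    return counts


# Toy data: 73 vignettes with two items each (146 items), dimensions labeled A-J.
dimensions = "ABCDEFGHIJ"
vignettes = [
    {"id": i, "item_dimensions": [dimensions[i % 10], dimensions[(i + 3) % 10]]}
    for i in range(73)
]

form_a, form_b = split_into_forms(vignettes)
coverage_a = items_per_dimension(form_a)
coverage_b = items_per_dimension(form_b)

# Dimensions with a large gap between forms are candidates for manual rebalancing.
for dim in dimensions:
    print(dim, coverage_a[dim], coverage_b[dim])
```

Comparing the two coverage counters side by side is the step that would prompt moving items between forms until each dimension is similarly represented.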

Retranslation Survey Instructions

Retranslation participants were asked to carefully read each vignette and the accompanying items and then make two judgments. First, they were asked to identify which of the 10 dimensions of ethical performance was most relevant to each item accompanying the vignettes. If they felt numerous dimensions applied, they were instructed to select the single dimension that was most relevant. If they felt none of the dimensions applied to a given item, they could indicate this as well by sorting the item into the “K. None” category. Second, participants were asked to indicate the ethicality of the character’s behavior described in the item using the four-point EPRS scale for the identified dimension. An example of one EPRS appears in Fig. 6.

Fig. 6

Sample EPRS for one dimension

Sample

Retranslation study participants were 21 (Form A N = 11, Form B N = 10) students from two industrial–organizational psychology doctoral programs in the U.S., one at a large southeastern university and one at a large midwestern university. Approximately 81 % of participants held a bachelor’s degree while approximately 19 % held a master’s degree.

Results

The primary objective of the retranslation study was to examine the extent to which the dimension structure of ethical performance held up based on a review and sorting of a set of items associated with various vignettes by an independent sample of raters. In general, we found strong support for the 10-dimension structure. We also asked raters to assess the ethicality of the behavior reflected in each item. We found that raters were able to rate the level of ethicality reliably.

Dimension Sorting

Overall, results of the sorting task indicated that most raters agreed upon the classification of most of the items (see Table 2). In total, 83 % (k = 121) of the items were sorted into the same dimension by at least 50 % of the raters, and 56 % (k = 80) of the items were sorted into the same dimension by 66 % or more of the raters. Seventeen percent (k = 25) of the items exhibited no majority dimension classification.
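The agreement summary above rests on a simple computation: for each item, find the dimension chosen most often and the share of raters who chose it, then count the items that clear each threshold. The sketch below illustrates that computation on invented sorting data; the function names, the data layout, and the treatment of the “K. None” category as just another label are our assumptions.

```python
from collections import Counter


def modal_agreement(assignments):
    """Return the most frequently chosen dimension and the share of raters choosing it.

    If two dimensions tie, the one encountered first is returned; the share is unaffected.
    """
    dimension, n = Counter(assignments).most_common(1)[0]
    return dimension, n / len(assignments)


def items_meeting_thresholds(items, thresholds=(0.50, 0.66)):
    """Count the items whose modal dimension reaches each agreement threshold."""
    shares = [modal_agreement(a)[1] for a in items.values()]
    return {t: sum(share >= t for share in shares) for t in thresholds}


# Invented sorts from 11 raters for three items ("K" stands for "None").
items = {
    "item_1": list("AAAAAAAABBK"),  # 8 of 11 raters chose A
    "item_2": list("AAAAAABBBCC"),  # 6 of 11 raters chose A
    "item_3": list("AABBCCDDEEK"),  # no dimension chosen by more than 2 raters
}
print(items_meeting_thresholds(items))  # {0.5: 2, 0.66: 1}
```

In this toy example, item_3 would be counted among the items with no majority dimension classification.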

Table 2 Number of items categorized into each dimension

Taking a closer look at the dimensions that demonstrated less clean sorting across raters, it was evident that raters had the most difficulty with the following dimensions: F. Defamation of Others, J. Rule-Abiding, and C. Intellectual Property. Dimension F was originally named “Respect for Others.” Some of the items intended for it were categorized into other dimensions also having to do with treatment of other people, such as dimension E. Unfair Treatment. Thus, we determined that the dimension title was too broad and relabeled it “Defamation of Others” to more accurately describe the content of the dimension. The original title of dimension J. Rule-Abiding was “Lawfulness,” which seemed to be too narrow to capture the content domain, and raters mentioned having difficulty classifying vignettes to this dimension without knowing the actual relevant legal precedents. Thus, we broadened this dimension’s title to “Rule-Abiding” to also include infractions of policies or contractual arrangements (that may or may not be legally binding).

Dimensions C. Intellectual Property and D. Confidentiality also appeared to cause some confusion, with C items sometimes categorized into D and vice versa. We attempted to remedy this by revising the dimension definitions to clarify that the Intellectual Property dimension refers to stealing ideas, plans, etc., while Confidentiality refers to divulging confidential information. As noted, G. Workplace Bullying (formerly Harassment) is a relatively narrow dimension. The old title “Harassment” had a legal connotation that may have led respondents to use the dimension only if the behavior was illegal. We changed the title to Workplace Bullying to cover a wider range of situations that might occur in the workplace. The final dimension titles and definitions for each of the ethical performance dimensions appear in Fig. 7.

Fig. 7 Definitions of ethical performance dimensions at the end of Study 2

Ethicality Ratings

In general, the ethicality ratings suggested that raters were able to make judgments regarding the ethicality of each item and that they used the full range of the ethicality rating scale to do so. The mean ethicality ratings for the items ranged from 1 to 4. The grand mean ethicality rating across all items was 2.21. Standard deviations ranged from 0 to 1.63, and larger standard deviations naturally tended to be associated with ethicality ratings in the middle of the scale. Table 3 presents a more detailed summary of the distribution of mean ethicality ratings across items. The ethicality judgments were highly reliable, as shown in Table 4. The mean correlations between respondents’ ethicality ratings were .787 and .786 for Forms A and B, respectively. The Intraclass Correlation Coefficients (ICCs) were .975 (Form A) and .972 (Form B).
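As an illustration of the mean interrater correlation, the following sketch averages the Pearson correlations over all pairs of raters. It assumes a complete items-by-raters matrix and assumes Pearson correlations, neither of which is stated in the text; the data and the function names `pearson` and `mean_interrater_r` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_interrater_r(ratings):
    """Average correlation over all rater pairs; ratings[i][j] = item i, rater j."""
    raters = list(zip(*ratings))  # transpose to one vector per rater
    rs = [pearson(a, b) for a, b in combinations(raters, 2)]
    return sum(rs) / len(rs)


# Hypothetical ethicality ratings (1-4 scale): five items rated by three raters.
ratings = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
    [2, 3, 3],
]

print(round(mean_interrater_r(ratings), 3))
```

A rater who gives every item the same rating has zero variance, so the correlation is undefined for that rater; a real analysis would need to handle that case and any missing ratings.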

Table 3 Distribution of mean ethicality ratings
Table 4 Interrater agreement on ethicality ratings

Study 3: EPRS Dimension Review

Purpose

The purposes of Study 3 were to (a) evaluate the effect of changes in the dimension titles and definitions and (b) capture our own hypotheses about relationships among performance dimensions based on their underlying performance determinants (i.e., knowledge, skill, ability, and other characteristics; KSAOs).

Method

Dimension Changes

To evaluate the effect of changes in the dimension titles and definitions, we asked another 14 graduate students from 6 graduate programs in Industrial-Organizational Psychology to sort the vignette items into the revised dimensions. None of the participants had participated in Study 2.

Performance Determinants

We assembled a KSAO list comprising (a) three knowledges (State and Federal Laws, Professional Standards, and Organizational Policies), (b) general mental ability (GMA), and (c) the 30 facets of the International Personality Item Pool’s (IPIP) version of the NEO-PI-R (http://ipip.ori.org/newNEO_FacetsTable.htm; Johnson 2014). The IPIP NEO-PI-R has six facets for each Big Five construct. Four research staff rated the extent to which each knowledge, GMA, or personality facet was likely to predict performance in each EPRS dimension using a 3-point rating scale (0 = not at all likely, 1 = somewhat likely, and 2 = very likely).

Results

Dimension Changes

As shown in Table 5, Study 3 results reinforced Study 2 results. Seventy-eight of the 114 items categorized in a dimension by at least 50 % of Study 2 participants were categorized in that same dimension by at least 50 % of Study 3 participants (i.e., 68 % of the items). Another 21 items (18 %) received 41–50 % agreement in Study 3. An additional 19 items, which were not reliably retranslated in Study 2, were reliably retranslated in Study 3.

Table 5 Comparison of the numbers of reliably sorted items in Study 2 and Study 3

Nine items that had been reliably retranslated in Study 2 received 50 % or more agreement for a different dimension in Study 3. Seven of those nine items moved to categories for which they had high endorsement in Study 2, but less than 50 % endorsement. For example, the four Rule-Abiding items that moved to Truthfulness had received 30–36 % endorsement for Truthfulness in Study 2. Two items moved from Conflict of Interest to Rule-Abiding. Those items involve employees accepting gifts from salespersons, which could be considered both a Conflict of Interest and a failure to follow the rules. The Rule-Abiding dimension is broader than most dimensions and can be interpreted, at its broadest, as subsuming many other dimensions.

In all, these results suggest that the changes we made to the dimension titles and definitions had very little effect on participants’ judgments, making dimension distinctions neither better nor worse. It is likely that (a) multidimensionality in the vignette items and (b) natural correlations among the dimensions played a stronger role than our edits in determining dimension categorization.

Performance Determinants

We computed the mean rating for each dimension × KSAO combination and the mean ratings across facets at the construct level. The mean ratings were reasonably stable. Considering there were just four raters, the ICCs (ICC[3,k]; Shrout and Fleiss 1979) were acceptable, ranging from .60 for Unfair Treatment to .78 for Whistle-Blowing and Confidentiality, with a median of .72.
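For readers unfamiliar with this index, the following sketch computes Shrout and Fleiss's ICC(3,k) from a two-way layout as (BMS − EMS) / BMS, where BMS is the between-targets mean square and EMS is the residual mean square. It assumes the KSAOs serve as the rated targets and the four staff members as the raters for a given dimension; the ratings and the function name `icc_3k` are hypothetical.

```python
def icc_3k(ratings):
    """ICC(3,k): consistency of the mean of k fixed raters.

    ratings[i][j] is the rating given to target i by rater j.
    """
    n = len(ratings)      # number of targets (here, KSAOs)
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_targets - ss_raters

    bms = ss_targets / (n - 1)              # between-targets mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / bms


# Hypothetical ratings (0 = not at all likely, 1 = somewhat likely, 2 = very likely)
# of five KSAOs by four raters for a single EPRS dimension.
ratings = [
    [2, 2, 1, 2],
    [0, 0, 0, 1],
    [1, 2, 1, 1],
    [2, 1, 2, 2],
    [0, 1, 0, 0],
]

print(round(icc_3k(ratings), 2))
```

Because ICC(3,k) treats the raters as fixed and removes rater main effects, systematic leniency or severity differences among the raters do not lower the coefficient.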

Seven KSAOs demonstrated the strongest linkages to the EPRS dimensions. Four of the six Agreeableness facets (Trust, Morality, Altruism, and Cooperation), one Conscientiousness facet (Dutifulness), and two knowledges (Professional Standards and Organizational Policies) received a mean rating of 1.0 (somewhat likely) across the 10 dimensions.

As shown by the grand mean ratings (across raters and facets) in Table 6, our research team tends to view the dimensions as either being primarily determined by knowledge or primarily determined by personality facets. The three knowledges were rated as being somewhat to very likely to predict performance in the following dimensions: Rule-Abiding, Confidentiality, Conflict of Interest, Whistle-Blowing, and Intellectual Property. To perform ethically in those dimensions, workers need to know relevant laws and policies. In contrast, facets of Agreeableness, particularly Morality, Altruism, Cooperation, Modesty, Sympathy, and Trust, were thought to be stronger determinants of Unfair Treatment, Defamation of Others, Workplace Bullying, and Abuse of Power. The unethical behaviors in those dimensions are personal; they are directed toward individuals and are likely influenced by the dark side of personality. Two dimensions did not fit well with the others. Whistle-Blowing was linked to Neuroticism facets (Anxiety, Self-consciousness, and Vulnerability) as well as Knowledge and Agreeableness facets. Individuals who are easily intimidated or lack the emotional strength to stand up to transgressors are probably less likely to blow the whistle. The Truthfulness dimension was thought to be at least somewhat predicted by both Knowledge and Agreeableness facets. There can be many reasons for lying. Some might involve not understanding that bending the truth is against organizational policy, while others might be more a function of the individual’s moral code.

Table 6 KSAO performance determinants for EPRS dimensions

General Discussion

These studies break new ground by developing a taxonomy of ethical performance at work that generalizes well to a diverse array of occupations and industries and that, moving forward, can serve as a foundation upon which to develop theoretically grounded assessments. Moreover, our use of comprehensive qualitative reviews coupled with quantitative evaluation represents a thorough approach to taxonomy development.

Strengths and Limitations

A strength of this study, and what makes it different from many qualitative reviews, is that we content-analyzed four distinct types of information to develop performance dimensions and scales in Study 1: (a) the published literature on ethical behavior, (b) professional codes of ethics from a sample of occupations, which was the data source for Kaptein (2008), (c) critical incidents of ethical performance from a large government organization, and (d) behavioral items from ethics surveys. Each literature source has merits and deficiencies. Published research has been peer reviewed but can suffer from publication bias. Also, much of the published literature has tended to focus on three professions: accounting, marketing, and finance (Collins 2000). Drawing on professional codes of ethics from a sample of industries enhanced the breadth of coverage of our dimension structure. Even so, professional codes of ethics tend to focus on egregious unethical behaviors and may not speak to less consequential and/or positive ethical behaviors. Critical incidents are ideal because they illuminate positive and negative behaviors, but they are (a) laborious to collect and (b) rarely reported in their entirety. Analyzing and synthesizing different types of information in Study 1 enabled us to ensure broad coverage of (un)ethical behaviors and provided a strong foundation for the dimensions and rating scales.

Another strength of our method is that we evaluated the qualitatively derived 10-dimension set and rating scales from Study 1 using a systematic retranslation process in two studies. Studies 2 and 3 demonstrated the soundness of the dimension structure. The use of numerous independent sources as well as qualitative and quantitative methods in tandem lends strong support for the 10-dimension structure of ethical performance.

One limitation is that the EPRS were developed using U.S. and Western-based critical incidents and vignettes, and the resulting performance model is a U.S.-based model. Culture could make two kinds of differences. First, the dimensional structure could vary because the “population” of possible critical incidents from which critical incident writers sampled their examples was not the same. That is, different cultures would generate different examples of (un)ethical performance. Second, specific behaviors may vary across cultures in terms of their degree of ethicality.

It is likely that different cultures define ethical behavior differently, and this could certainly present challenges for organizations. Examining the underlying dimensionality (e.g., via Confirmatory Factor Analysis-based measurement equivalence/invariance procedures) as well as perceptions of ethicality of different behaviors and practices across cultures would contribute to a more comprehensive and global perspective of ethical performance in the workplace.

Another limitation has to do with the nature of the sample for the retranslation studies (Study 2 and Study 3). The initial scale development involved critical incidents from organizations and vignettes from the published literature which were provided by people working in organizations in a variety of roles, and the initial generation of the dimensions was provided by experienced researchers. Graduate students made the retranslations. It would be interesting to repeat the exercise with a sample of business managers. While we think that the dimensions themselves would likely be stable across business settings, prior research does suggest that the point at which a behavior is judged ethical versus unethical can be influenced by the organization’s culture, leaders, and ethics-related training (Collins 2000). Therefore, we would expect some variation across organizations in judgments of the ethicality of behavior in the vignettes.

Finally, the current study provided initial evidence for the content validity of the scales. That is what the retranslation step was designed to do. It needs replication within, and across, cultures/countries. The consistency with which judges nominated the critical determinants of ethical performance on each dimension and the similarity of these results to research on the determinants of counterproductive work behaviors begin to address the construct validity of the dimensions. Obviously, much more needs to be done.

Beyond the cross-cultural replications mentioned above, further research is needed. We need to estimate the relationships of actual assessments of ethicality, using these scales, to other performance dimensions, to hypothesized determinants of ethical performance, and to specific outcome variables.

Theoretical Implications

This research helps clarify the overlap and differences between CWBs and ethical behavior. Four of our 10 ethical performance dimensions (Unfair Treatment, Defamation of Others, Workplace Bullying, and Abuse of Power) comprise malicious behaviors that are directed toward individuals. They evoke emotions and are characterized by mean-spiritedness. They overlap substantially with CWBs in the Campbell model of performance (Fig. 1), which is also consistent with the findings of Spector and Fox (2010). Rule-Abiding, Confidentiality, Conflict of Interest, Intellectual Property, Truthfulness, and Whistle-Blowing all rely heavily on knowing right from wrong. Such behaviors might be more a function of self-interest or failure to understand the implications of one’s behavior. These behaviors are not reflected in the Campbell model. Consideration should, and will, be given to adding a ninth factor to the Campbell model, Ethical Behavior, defined, tentatively, as “knowing and doing the right thing, following rules, avoiding conflict of interest, maintaining confidentiality, respecting intellectual property, reporting truthfully, and blowing the whistle when unethical situations arise.”

Research and Practical Implications

The 10 dimensions of ethical behavior identified in this project provide a sound foundation for a wide array of uses including: (a) performance management, (b) training/education, (c) job analysis, (d) predictor development and/or validation research, (e) analysis of ethical lapses, and (f) additional research.

Performance Management

The ethical culture of an organization sets the stage for ethical behavior of its workers (Peterson 2002; Vardi 2001), and the ethicality of leaders and peers influences worker behavior (Keith et al. 2003; Khuntia and Suar 2004). One way to communicate the importance of ethical behavior in the organization is to make ethical dimensions part of the performance management system. We see the EPRS as a tool that can be used to educate employees and communicate ethical standards. Dimension definitions and anchors could be reviewed by subject matter experts (SMEs) in the organization and the most relevant dimensions could be folded into existing performance rating tools. This would enable organizations to more systematically evaluate and influence ethical performance in the workplace.

Training/Curriculum Development

The 10 EPRS provide a potential set of learning objectives and learning materials to guide the development of training experiences to promote more effective ethical performance. There are several ways this could be done. Organizational members could rate the relevance of each of the dimensions to jobs and discuss the results in focus groups. The dimensions, particularly dimensions that require some knowledge of organizational policies, could be incorporated into any formal training programs offered by the employer. Another idea would be to develop a self-assessment tool much like a cultural assimilator (Fiedler et al. 1972). Cultural assimilators use vignettes to teach cultural awareness; they provide rich explanations of why behaviors are or are not acceptable in a particular culture. An ethical culture assimilator would contain vignettes for each of the dimensions that are high priority for the organization. It would look much like a situational judgment test but would explain the rationale for the (un)ethicality of behaviors with the intent of educating organizational members.

Job Analysis

The first step in a job analysis involves gathering job relevant materials and creating/reviewing tasks and knowledges, skills, abilities, and other characteristics. It would be ideal to consider the EPRS dimensions at this initial stage and discuss any possible tasks and KSAOs that are likely to be related to ethical performance dimensions for the job. The dimensions themselves may or may not be part of a job analysis survey; regardless, they could be used to stimulate discussion and ensure consideration of ethical concepts.

Predictor Development and/or Validation Research

Future research gathering additional validation evidence for the 10 EPRS dimensions would provide the basis for the development of individual assessment tools. The 10 EPRS could be used as a starting point for developing (a) interview questions and interview rating scales relevant to ethics or (b) situational judgment tests of ethical performance. After SME review, the EPRS could be used as criteria for the validation of predictors of ethical performance.

Analysis of Ethical Lapses

One of the reviewers prompted us to think about how the dimensions might be useful for analyzing ethical lapses in organizations. For example, can the dimensions help us better understand Volkswagen’s cheating-on-the-engine-test incident? The incident can be deconstructed according to the roles individuals played. Someone made the conscious decision to cheat (i.e., the Truthfulness and Rule-Abiding dimensions). It is likely there were employees who knew what was going on but did not report it (i.e., Whistle-Blowing). Did a supervisor make an overt or implied threat to ensure employees would follow through with the scam (i.e., Abuse of Power)? Breaking an incident down into individuals’ roles and relevant dimensions could help identify organizational interventions that reduce the likelihood of future ethical lapses.

Additional Research

The 10 dimensions of ethical behavior identified in this project provide a taxonomic structure for future research on ethical performance in the workplace. The scales could be used to estimate the relationships among ethical performance, counterproductive work behavior, and management and leadership performance. The present studies take an important first step in advancing our understanding of this phenomenon and serve as a foundation upon which future cross-cultural studies and assessment-development efforts can be built.