Introduction

The German sociologist Luhmann (2008) described morality as a symbolic generalization, reducing the complexity of relationships between interacting subjects to an expression of respect. By applying a moral judgment, one communicates conditions for respecting or disobeying others. Thus, individual moral values are not inherent or given ab initio (Haidt 2013), but negotiated in a framework of social interaction.

Deontology versus Consequentialism

From a philosophical perspective, morality describes a discourse between deontological and consequential positions. Deontology is a normative theory regarding which choices are morally required, forbidden, or permitted (Kant 1995/1787). Consequential approaches consider an action to be advisable if it yields the best possible outcome—with classic utilitarianism as its most prominent example (Bentham 2008/1823). These two ethical stances are of central interest in descriptive ethics and modern psychological accounts of moral cognition.

The Complex Nature of Moral Cognition

Since the cognitive turn in psychology has shown that conditioning paradigms insufficiently explain moral development, empathy is seen as a prerequisite, even in the constructivist and cognitive theories of Piaget (1965) and Kohlberg and Kramer (1969). Initially, the family is the social framework in which the infant learns the basic dichotomous model of good and bad behavior. The child develops through the principles of conditioning, observational learning, and imitation. Family members who represent moral authority assist the child in developing a moral framework through behavior responses to the violation of moral rules (Shweder et al. 1987). In this context, a lack of empathy leads to the misconstruction of emotional signals and a lack of interest in other individuals (Baron-Cohen 2005). Without empathy, the ability to reflect is highly deficient (Frith and Happé 1999) and thus alters moral decision making (Batson et al. 1983; Krahn and Fenton 2009; Myyrya et al. 2010).

Consequently, moral cognition rests upon a large number of complex processes: the moral content of a social situation must be perceived and categorized; information content needs to be adjusted on the basis of internalized social norms, one’s own experience and autobiographical memories; various courses of action must be balanced against the situation and compared with imagined and anticipated consequences. Despite the complexity of moral cognition in problem-solving and reasoning, moral development is not correlated with general intelligence within the limits of the typical range (Beißert and Hasselhorn 2016; Derryberry et al. 2005; Tebartz van Elst 2015). With increasing age, the spectrum of social context expands into settings beyond the family. Peer groups and the media become influential, and moral values develop through imitation and identification. Even Kohlberg and Kramer (1969) assume that adolescents and adults predominantly process moral values without explicit analysis. Thus, the majority of moral impartations are conveyed by emotional expressions that the recipient must be able to “read” (Schaafsma et al. 2015).

Social Intuitionist Model and Greene’s Dual-Process Theory

The Social Intuitionist Model (SIM) by Jonathan Haidt (2001) emphasizes the role of emotion with recourse to Hume (2000/1739), who claimed that “reason is the slave of passion”. Consequently, moral cognition is based on an implicit, automatic process (e.g. passion, emotion, or intuition). Furthermore, the degree of emotionality depends on the type of moral dilemmas presented to the participants (Greene et al. 2001; Shenhav and Greene 2014).

Dual-process theories (Greene et al. 2001; Greene 2007) differentiate between fast, automatic inferences and slower, conscious decisions guided by normative principles such as “thou shalt not kill” (Kahneman 2011). In other words, Greene suggests that rationality is the basis of utilitarian decisions and emotional involvement leads to deontological moral decision-making.

Schema Theory and Moral Reasoning

The psychological constructionist account of Cameron et al. (2015) states that the interaction of a “core affect” and conceptual knowledge in a specific situation leads to discrete emotions and an ensuing judgment. A core affect is a physiological state and a non-reflective feeling (Russell 2003), perceptible as valence (positive vs. negative) and arousal (high vs. low), which leads to a moral judgment when reconciled with conceptual knowledge. This constructionist model shows an evident overlap with schema theory. A schema is a “complex of values, attitudes, cognitions, and affective responses that are elicited by and interact with new relevant information. That interaction determines the resultant attributions, decisions, and behaviors” (Dienstbier et al. 1980). Schemas represent knowledge—from abstract ideologies to culture, from behavior in social situations to the meaning of single words (Hoernig et al. 1993; Komatsu 1992). The acquisition and restructuring of schema knowledge is affected by frequently repeated assimilative and accommodative processes. Kintsch and van Dijk (1978) describe schemas as a framework that integrates information and highlights key relationships in order to provide a model for certain social situations (e.g. moral dilemmas). Thereby, a person who lacks adequate social schemas will be unable to identify, structure and organize the relevant causal cues of a situation, understand the pragmatic context of a social situation, or predict the behaviors of involved individuals. This shows that mindreading is a necessary prerequisite to draw on goal-driven perception and for the categorization of facial expressions, gestures and posture as well as pragmatic faculties to understand irony, ambiguity or, in general, the intended meaning of verbal expressions (Schaller and Rauh 2017).

Infants develop particular schemas for frequently experienced social situations that reflect acceptance, preferences, conventions, responsibilities and a variety of social relationships. Thus, schemas serve to categorize social perceptions.

Autism Spectrum Disorder and Moral Reasoning

Beyond restricted interests and rigid, repetitive behavior, ASD is defined by deficits in reciprocal social interaction and social communication (ICD-10, DSM-5). These social deficits include an inability to understand and respond to social information or socio-affective cues (Dziobek et al. 2006; Golan et al. 2006; Klin et al. 2002). Theory of Mind (ToM) is based on the ability to form second-order representations (Dennett 1978) and the capacity to build a theory about the feelings, thoughts and beliefs of another individual from that individual's observed behavior (Baron-Cohen et al. 1985). Thus, the ability to develop ToM depends largely on implicit, basic, and procedural processes concerning the perception and categorization of facial, gestural, prosodic, and postural expressions of emotion (Newen 2015).

It is well established that individuals with ASD, from infancy to adulthood, have social cognition deficits, especially in face recognition, emotion categorization, and ToM (Baron-Cohen et al. 1985; Fridenson-Hayo et al. 2016). Individuals with ASD should consequently exhibit difficulties in moral reasoning. In the few studies investigating moral reasoning, there is evidence that the development of an intent-based moral judgment in children with ASD is impaired (Margoni and Surian 2016). People with ASD do not seem to use emotional information and may rely more on explicit rules to judge moral acceptability (Brewer et al. 2015). According to the study of Fadda et al. (2016) investigating judgments of the consequences of moral and non-moral actions, children with ASD did not take psychological information or the subjective state of an agent into account. In a study by Baez et al. (2012), participants with ASD were less able to implicitly encode and integrate contextual information in order to gain access to social meaning. Comparing moral judgments of adults with ASD and neurotypical controls, only subtle differences have been found regarding intentional action in moral situations (Buon et al. 2013). When self-rating their empathetic concern in socio-moral situations, individuals with ASD saw themselves as empathetic, but unable to apply those feelings in moral reasoning conditions (Senland and Higgins-D’Alessandro 2013).

Rationale of Our Study

The need to investigate moral reasoning abilities with regard to emotional affect (valence and arousal) and conceptual knowledge (morality and permissibility) motivated us to develop the Intuitive Moral Reasoning Test (IMRT). The computer-based test includes first and second order false belief tasks and moral dilemmas based upon stories addressing social schemas with and without close social relations. In addition, the IMRT distinguishes between extreme situations and everyday situations, a factor that influences moral cognition (Nunner-Winkler 2007). The goals of the IMRT are to investigate significant differences between ASD and NTD in response behavior, including:

  1. The variables decision type, emotional valence, emotional arousal, moral acceptability and permissibility for different moral dilemmas.

  2. A lack of difference in the decisions made by individuals with ASD in a dilemma with high emotional involvement (Footbridge Dilemma) versus those of impersonal structure (Trolley Dilemma), as opposed to the NTD group.

  3. A differential effect when switching from the perspective of the actor to the perspective of the victim of the action.

  4. The variables emotional valence, emotional arousal, moral acceptability and permissibility using extraordinary, rare and extreme situations (Pharmacist Dilemma, Trolley Dilemma, Footbridge Dilemma) versus everyday dilemmas without any extreme or exceptional circumstances.

  5. Whether there is an association between ToM abilities and moral reasoning.

  6. Whether there is an association between severity of autistic symptoms and response behavior.

  7. The differential influence of social schemas incorporating close social relationships.

Method

Participants

Eighty adolescents and adults ranging from 14 to 61 years of age and with normal intellectual ability (IQ ≥ 70) took part in this study. For the clinical group, adolescents were recruited at the outpatient clinic of the Department of Child and Adolescent Psychiatry, Psychotherapy, and Psychosomatics; adults were recruited at the outpatient clinic of the Department of Psychiatry and Psychotherapy, Medical Center - University of Freiburg. An ASD diagnosis according to ICD-10/DSM-5, as determined by experienced clinicians of our departments, was required of all patients. Adolescent patients met lifetime criteria for ASD based on the gold standard instruments ADOS (Rühl et al. 2004) and ADI-R (Bölte et al. 2005). The clinical diagnosis of ASD for adult patients was established as a consensus diagnosis by a multi-professional team following a structured diagnostic procedure, including a history obtained from caregivers (parents, partners, siblings and so on) and behavioral observations (ADI-R and ADOS).

Initially, 87 individuals were accepted into the study. However, four participants of the ASD group and three participants of the NTD group were excluded from the final analysis due to missing data caused by technical difficulties or lack of cooperation. Therefore, the final analysis was conducted with 80 participants. Of these, 36 participants with high-functioning ASD and 44 NTD controls were assigned to an adolescent (14–17 years) or an adult age group (18–61 years). The adolescent ASD group consisted of 15 participants; the adolescent NTD group of 22 participants. The adult ASD group included 21 participants; the adult NTD group had 22 participants.
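The participant flow described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python; the variable and function names are hypothetical, and the numbers are taken directly from the text.

```python
# Participant flow as stated in the text; names are illustrative only.
enrolled = 87
excluded = {"ASD": 4, "NTD": 3}

# Final cell sizes by diagnostic group and age group.
final_groups = {
    ("ASD", "adolescent"): 15,
    ("NTD", "adolescent"): 22,
    ("ASD", "adult"): 21,
    ("NTD", "adult"): 22,
}


def group_total(groups, diagnosis):
    """Sum the cell sizes that belong to one diagnostic group."""
    return sum(n for (dx, _age), n in groups.items() if dx == diagnosis)


analysed = enrolled - sum(excluded.values())
asd_total = group_total(final_groups, "ASD")
ntd_total = group_total(final_groups, "NTD")
```

The four cell sizes add up to the 36 ASD and 44 NTD participants reported, and together they match the 80 participants remaining after exclusions.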

Materials

Accompanying Questionnaires

The SRS-2 questionnaire contains 65 items and quantifies the severity of social impairments associated with ASD in a total score and five subscale scores: (1) Social Awareness, (2) Social Cognition, (3) Social Communication, (4) Social Motivation, and (5) Restricted Interests and Repetitive Behavior. In the adolescent group (ASD and NTD), parents completed the SRS-2 School Age version (Bölte et al. 2008). Participants of the adult group filled out the SRS-2 Adult Self-Report.

The Autism Spectrum Quotient (Baron-Cohen et al. 2001) is a self-administered paper-and-pencil questionnaire measuring the number of autistic traits on five subscales: (1) Social Skills, (2) Communication, (3) Imagination, (4) Attention to Detail, and (5) Attention-Switching. The AQ was administered for group comparisons and correlation between the degree of autistic trait manifestations and response on items of the moral dilemmas presented in the IMRT.

IQ was measured using the revised German version of the Culture Fair Intelligence Test (CFT 20-R; Weiß 2008), a non-verbal test that, according to Cattell (1963), primarily assesses fluid intelligence.

To capture further characteristics of the autism spectrum, we also applied the Empathizing Quotient (EQ) and the Systemizing Quotient (SQ, Baron-Cohen 2009). Using the EQ, we wanted to investigate the relationship between empathizing abilities and moral cognition in individuals with ASD and their neurotypical peers. Since individuals with ASD have strong tendencies “to analyze, understand, predict, control and construct rule-based systems” (Wheelwright et al. 2006), we used the SQ to measure individual differences in systemizing. The SQ has been found to correlate positively with the AQ (Baron-Cohen et al. 2003).

Using the Toronto Alexithymia Scale (TAS-26), a self-administered questionnaire, we aimed to measure the ability to identify and describe one’s own emotions and those of others in order to test associations with the outcome of the IMRT. Its three subscales are (1) difficulty identifying feelings (DIF), (2) difficulty describing feelings to others (DDF), and (3) externally oriented thinking (EOT, Kupfer et al. 2000).

Main Instrument: The Intuitive Moral Reasoning Test

The Intuitive Moral Reasoning Test (IMRT) is a computer-based forced multiple-choice test, presenting short stories containing different moral dilemmas and classical false belief stories in written and spoken form. After the auditory recording of the story has been presented (simultaneously readable on the monitor), the participants have to answer questions within a timeframe of 10 s. The objective was to determine the relationship between ToM abilities, emotion, moral reasoning, and permissibility. In order to provide a spectrum of social situations, the moral dilemmas capture conflicts of loyalty, egoism-altruism, economics and other transgressions. In order to investigate moral reasoning in various contexts, we used two dilemmas displaying extreme life situations in the manner of Foot (1967) and Kohlberg (1981) and two dilemmas with true-to-life situations in the manner of Nunner-Winkler (2007). In addition to the distinction between true-to-life examples and extreme life situations, one section of the dilemmas differentiates between close personal relationships (love relationship, best friend relationship) and impersonal relationships and/or encounters (unknown person). Figure 1 illustrates the structure of the IMRT-Dilemmas. The participants are requested to decide between two alternatives (action or inaction). With respect to their decision, they then have to rate their emotional valence (negative–positive) and their degree of emotional arousal (high–low). Furthermore, they have to assess the degree of moral acceptability (morally right–morally reprehensible) and the degree of permissibility (legal–illegal) for their decision. In order to capture changes in the estimation of emotional valence, arousal, moral acceptability, and permissibility, the IMRT also introduces the perspective of the victim of an action and requires participants to rate from the victim’s viewpoint.

In addition to the moral dilemmas, the IMRT included three classical false belief stories of first and second order. We focused here on capturing the capacity of perspective taking as a partial aspect of ToM. Furthermore, we surveyed social desirability by means of the “openness” subscale items of the Freiburger Persönlichkeitsinventar (FPI-R, Fahrenberg et al. 2001). The primary outcomes of the IMRT are type of decision and the four scales of emotional valence, emotional arousal, moral acceptability, and permissibility. For a better understanding, the five dilemmas are presented in the "Appendix".

Fig. 1 Structure of IMRT-dilemmas

Measures

The IMRT was implemented in PsychoPy (Peirce 2006), a platform-independent experimental control system and open source software tool using Python. Stimuli were presented on a 17-inch monitor connected to a computer with Microsoft Windows XP operating system.

Primary outcome measures for the IMRT task were as follows. (1) The dichotomous categorical dependent variable “type of decision” (“yes, take the action” vs. “no, do not act”). (2) In order to assess core affect in terms of arousal and valence, we utilized the “Self-Assessment Manikin Scale” (Lang et al. 1997), a visual-analogue scale using scores between 1 and 9 for emotional valence (1 = very unhappy, 9 = very happy) and for emotional arousal (1 = very calm, 9 = very aroused). (3) To measure moral acceptability and permissibility, participants rated their decision on a visual-analogue scale between 0 = morally not acceptable and 5 = morally fully acceptable (moral acceptability) and 0 = absolutely not allowed and 5 = absolutely allowed (permissibility). Furthermore, we assessed: (4) the perspective of the decision maker and his/her chosen type of decision (“yes, take the action” vs. “no, do not act”) and the perspective of the victim of an action (action that harms the victim) on the scales of emotional valence, emotional arousal, moral acceptability and permissibility; (5) performance in false belief tasks based on relative frequencies of correct answers; (6) social desirability by means of items from the Freiburger Persönlichkeitsinventar (Fahrenberg et al. 2001).
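To make the scale ranges concrete, one rating could be represented roughly as follows. This is a hypothetical sketch, not the data format of the actual IMRT implementation: the class name, field names, and the choice to leave the decision empty for victim-perspective ratings are all assumptions made for illustration. Only the scale bounds (1 to 9 and 0 to 5) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

VALID_PERSPECTIVES = ("actor", "victim")

# Scale bounds as described in the text: SAM scales 1-9, rating scales 0-5.
SCALE_BOUNDS = {
    "valence": (1, 9),
    "arousal": (1, 9),
    "moral_acceptability": (0, 5),
    "permissibility": (0, 5),
}


@dataclass(frozen=True)
class DilemmaRating:
    """One set of ratings for one dilemma (hypothetical record layout)."""

    dilemma: str
    perspective: str               # "actor" or "victim"
    takes_action: Optional[bool]   # None where no decision is recorded
    valence: float
    arousal: float
    moral_acceptability: float
    permissibility: float

    def __post_init__(self):
        if self.perspective not in VALID_PERSPECTIVES:
            raise ValueError(f"unknown perspective: {self.perspective!r}")
        for name, (low, high) in SCALE_BOUNDS.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")


example = DilemmaRating("Trolley", "actor", True, 4, 6, 3.5, 2.0)
```

Rejecting out-of-range values at construction time keeps scoring errors from propagating into later group comparisons.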

Procedure

Individuals fulfilling inclusion criteria and not violating exclusion criteria gave informed consent and filled out the questionnaires AQ, EQ, SQ, SRS-2 and TAS-26. Adolescents and adults were assessed in group sessions of at most 5 persons. All participants completed the computer-based IMRT individually within 45 min in a quiet room and in separate cabins wearing acoustic noise-canceling headphones (Bose QuietComfort 15, Bose Corporation, Framingham, MA).

Statistical Analyses

For dichotomous outcome variables, such as type of decision (“yes” vs. “no”), group differences between NTD and ASD were analyzed by means of binary logistic regression analyses. Group differences concerning continuous variables, e.g. scale scores of questionnaires, were tested by ANOVAs. Where the distributional assumptions of parametric tests did not apply, non-parametric alternatives, like Wilcoxon–Mann–Whitney tests, were applied. Assumptions of normality were checked using the Kolmogorov–Smirnov test. Effect sizes for group differences are reported in terms of (1) odds ratios (OR) in the case of dichotomous outcome variables, and (2) standardized mean differences (SMD) in the case of continuous dependent variables. For the latter, unbiased Hedges’s g, rather than Cohen’s d, is used as point estimator of effect sizes (Borenstein et al. 2009), because the former enables the computation of the 95% confidence interval (CI). For a systematic review of results that emphasizes group differences between ASD versus NTD, forest plots based on ORs and SMDs together with their 95% CI were computed. All statistical analyses were performed with SAS software, Version 9.4 (SAS Institute Inc., Cary, NC, USA). For hypothesis testing, a significance level of α = 0.05 was adopted.
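As an illustration of the standardized mean difference described above, the following Python sketch computes Hedges's g with an approximate 95% CI, using the small-sample correction and variance formulas commonly given for this estimator (as in Borenstein et al. 2009). It is not the SAS code used in the study. The example call uses the EQ means and standard deviations reported in the Results; the group sizes of 35 and 43 are an assumption, chosen only to be consistent with the reported degrees of freedom.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2, z_crit=1.96):
    """Return (g, ci_low, ci_high) for group 1 minus group 2."""
    df = n1 + n2 - 2
    # Pooled within-group standard deviation.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd
    # Small-sample correction factor J turns Cohen's d into Hedges's g.
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Approximate sampling variance of d; scaled by J for g.
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    se_g = j * math.sqrt(var_d)
    return g, g - z_crit * se_g, g + z_crit * se_g


# EQ example: ASD (group 1) vs. NTD (group 2); group sizes are assumed.
g_eq, ci_low, ci_high = hedges_g(0.60, 0.27, 35, 1.11, 0.25, 43)
```

Under these assumed group sizes, the point estimate rounds to the value of g reported for the EQ in the Results; the interval itself is illustrative, since the exact group sizes per questionnaire are not stated.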

Results

Sample’s Characteristics

Statistical analyses were conducted on the basis of a total sample size of n = 80 participants. The group of adolescents comprised 37 participants (8 female/29 male). Of these, 15 (4 f/11 m) boys and girls belonged to the ASD group while 22 (4 f/18 m) teenagers were part of the NTD control group. In the adult group, 43 participants (20 f/23 m) were included for statistical analysis. Of these, 21 (10 f/11 m) participants were diagnosed with ASD and 22 (10 f/12 m) individuals belonged to the NTD group. The sample’s characteristics are summarized in Table 1 and Fig. 2. Table 1 displays basic characteristics, including gender distribution, chronological age, and IQ. There were no significant group differences for gender, age, or IQ within the sample or within age groups. Figure 2 summarizes questionnaire scores as a forest plot. In the top half of the figure, results of the questionnaires applied to the entire sample are shown (AQ, EQ, and TAS-26), whereas the lower half displays scores for questionnaires applied to only one of the age groups (SRS-2-SchoolAge and SQ for adolescents only; SRS-2-Adult Self-Report for adults only).

Table 1 Sample’s characteristics of gender distribution, chronological age, and IQ
Fig. 2 Forest plot of standardized mean differences (Hedges’s g) for questionnaires applied for the total sample and within age groups for adolescents (14–18 year) and for adults (≥ 18 year). p values are from one-way ANOVA with diagnostic group (NTD vs. ASD) as independent factor. NTD neurotypical development, ASD Autism Spectrum Disorder, AQ Autism Spectrum Quotient, SocSki Social Skill, AttSwi attention switching, Com communication, Ima imagination, AttDet attention to detail, EQ empathy quotient, TAS-26 Toronto Alexithymia Scale, DIF difficulty identifying feeling, DDF difficulty describing feelings, EOT externally-oriented thinking, SQ systemizing quotient, SRS2-Adult-SR Social Responsiveness Scale 2 Adult SelfReport, Awr Social Awareness, Cog social cognition, Com social communication, Mot social motivation, RRB restricted interests and repetitive behavior

Concerning autistic symptomatology, the ASD group showed significantly higher scores for the AQ total score and for all subscales, with large effect sizes for the total sample (all g’s ≥ 0.93) as well as within age groups (all g’s > 0.83). With regard to empathy, the ASD group scored significantly lower on the EQ in the total sample (NTD: M = 1.11, SD = 0.25, ASD: M = 0.60, SD = 0.27; F(1, 76) = 74.47, p < .0001, g = − 1.95) as well as within age groups (see Fig. 2 for details). In addition, alexithymia scores as assessed by the TAS-26 were significantly higher for the ASD group for all scales with the exception of TAS-26-EOT (Externally-Oriented Thinking; F(1, 75) < 1). The same pattern of significant differences with respect to alexithymia was also observed within age groups (see Fig. 2 for details).

It is worth noting that the adult individuals with ASD reported higher symptomatologic impairments on all questionnaires/scales than the adolescents with ASD. For the following (sub-)scales, these differences were also significant (significant interaction effects between diagnostic group and age group in 2 × 2 ANOVAs): AQ-Total (F(1, 76) = 9.33, p = .003), AQ-SocialSkills (F(1, 76) = 7.32, p = .008), AQ-Communication (F(1, 76) = 6.94, p = .010), AQ-Imagination (F(1, 76) = 7.11, p = .009), EQ (F(1, 74) = 4.17, p = .045), and TAS-26-DIF (F(1, 73) = 7.27, p = .009).

For the age group-specific questionnaires, there was no significant difference in SQ between ASD and NTD (F(1, 34) < 1). For both SRS-2 questionnaires, however, large differences were obtained. In comparison to parents of NTD adolescents, parents of adolescents with ASD reported much higher ratings of impaired social responsiveness resulting in higher T-scores in the SRS-2-SchoolAge for the total score as well as for all five subscales, with very large effect sizes (all g’s ≥ 2.88). The same holds true for the adults in their self-reports: ASD adults scored much higher on the total raw score as well as on all 5 subscales (raw values) of the SRS-2-Adult SelfReport, with very large effect sizes (all g’s ≥ 2.78).

Group Differences Concerning Type of Decisions for the 5 IMRT Dilemmas

To test for group differences between the ASD and the NTD samples concerning the type of decisions made in the 5 dilemmas, logistic regression analyses were performed with type of decision (“yes, take the action” vs. “no, do not act”) as the dichotomous dependent variable and the diagnostic group (NTD vs. ASD) as predictor. As can be seen in the forest plot of odds ratios in Fig. 3, there are two dilemmas in which the ASD group showed an altered pattern of decision type proportion. First, a significant group difference was observed for the Graffiti Dilemma (χ²(1) = 3.93, p = .047; OR = 2.50): only 17 of the 44 (= 38.64%) NTD individuals would betray their friend to the school principal, whereas 22 out of 36 (= 61.11%) ASD individuals would choose to do so. This difference is descriptively larger in the adolescent group, but failed to reach significance (χ²(1) = 3.57, p = .059; OR = 4.00). The second dilemma with a different decision pattern is the Pharmacist Dilemma. Most NTD individuals (34 of 44 = 77.27%) would burglarize the pharmacy vs. only 21 of 36 (= 58.33%) ASD individuals. In the total sample, this difference just failed to reach significance (χ²(1) = 3.23, p = .072; OR = 0.41), but in the age group of adolescents, the difference is more pronounced and significant (χ²(1) = 6.78, p = .009; OR = 0.14). In the adult group, however, there is almost no difference (χ²(1) = 0.01, p = .912; OR = 0.93). For the other three dilemmas, no significant differences emerged for the total sample or within age groups (see Fig. 3 for details). The special dilemmas of the Footbridge and Trolley will be presented in the next section. In summary, group differences concerning decision type were found in those dilemmas where intimate personal relationships were addressed.

Fig. 3 Forest plot of odds ratios for each of the five dilemmas presented in the IMRT. In addition to the results of the total sample, subgroup analyses are displayed for adolescents (“14–18 year”) and adults (“≥ 18 year”)
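The total-sample odds ratios for the Graffiti and Pharmacist Dilemmas can be reproduced from the counts given in the text. With a single binary predictor, the odds ratio estimated by logistic regression coincides with the one computed from the 2 × 2 table, so a short Python sketch suffices. The function name is hypothetical, and the orientation (odds of a "yes" decision in the ASD group relative to the NTD group) is inferred from the reported values rather than stated explicitly.

```python
def odds_ratio(yes_asd, n_asd, yes_ntd, n_ntd):
    """Odds of a 'yes' decision in the ASD group relative to the NTD group."""
    odds_asd = yes_asd / (n_asd - yes_asd)
    odds_ntd = yes_ntd / (n_ntd - yes_ntd)
    return odds_asd / odds_ntd


# Counts of "yes" decisions taken from the text (total sample).
or_graffiti = odds_ratio(22, 36, 17, 44)    # would report the friend
or_pharmacist = odds_ratio(21, 36, 34, 44)  # would burglarize the pharmacy
```

An odds ratio above 1 means "yes" decisions were more common in the ASD group; below 1, more common in the NTD group. A cell count of zero would make the ratio undefined, which this sketch does not handle.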

The Special Case: Type of Decisions in ASD from the Perspective of Dual-Process Theory

The predictions of dual-process theory for ASD were assessed with the two related dilemmas “Footbridge” and “Trolley”. In the Trolley Dilemma, 18 out of 43 (= 41.86%) NTD participants decided to pull the lever in order to rescue five persons and sacrifice another (58.14% decided not to pull the lever). In the ASD group, 41.67% agreed to switch the lever, while 58.33% decided not to switch. As the descriptive values already indicate, there is no significant difference between the ASD and the NTD groups with respect to proportions of “yes”-decisions (χ²(1) = 0.0003, p = .986). A very similar pattern of results was found within both age groups (adolescents: χ²(1) = 0.03, p = .864, OR = 1.13; adults: χ²(1) = 0.01, p = .924, OR = 1.07).

In the Footbridge Dilemma, within the NTD group, only 2 out of 44 (= 4.55%) decided to push a person from a bridge, while 95.45% rejected this option. In the ASD group, 5 out of 36 (= 13.89%) participants agreed to push a person from a bridge and sacrifice him/her in order to rescue five other persons, whereas 86.11% decided against this option. Descriptively, a higher percentage of ASD participants would push the person from the bridge; however, the difference was not significant either for the total sample (χ²(1) = 2.19, p = .139) or within age groups (adolescents: χ²(1) = 0.89, p = .345, OR = 2.50; adults: χ²(1) = 2.97, p = .085, OR = 5.77).

Ratings of Valence, Arousal, Moral Acceptability, and Permissibility

Concerning the ratings of valence, arousal, moral acceptability, and permissibility, almost all variables deviated significantly from the normal distribution within diagnostic groups. Therefore, median (Mdn) and interquartile range (IQR) as descriptive statistics are reported, and the Wilcoxon–Mann–Whitney test for differences between diagnostic groups was applied. Additional sub-group analyses were run for “yes”- and “no”-deciders within each diagnostic group (see Table 2).
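For readers unfamiliar with the Wilcoxon–Mann–Whitney procedure, the following Python sketch shows how a rank sum and its normal-approximation z value can be obtained from two sets of ratings. It is not the SAS procedure used in the study. The tie correction and the continuity correction of 0.5 are standard choices included here as assumptions, since the text does not state how the reported z values were computed, and the example data are invented.

```python
import math
from collections import Counter


def midranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(sample_a, sample_b):
    """Return (W, z): rank sum of sample_a and its approximate z value."""
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    pooled = list(sample_a) + list(sample_b)
    w = sum(midranks(pooled)[:n_a])
    expected = n_a * (n + 1) / 2
    # Variance of W under the null hypothesis, corrected for ties.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    diff = w - expected
    if diff == 0 or variance == 0:
        return w, 0.0
    correction = 0.5 if diff > 0 else -0.5  # continuity correction
    return w, (diff - correction) / math.sqrt(variance)


# Invented ratings on a 1-9 scale, for illustration only.
w_demo, z_demo = rank_sum_test([1, 2, 3], [4, 5, 6])
```

A positive z indicates that the first sample tends to have higher ratings than the second. With heavily tied ratings, as is typical for short rating scales, the tie correction can change z noticeably.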

Table 2 Group differences: ratings of valence, arousal, moral acceptability, and permissibility for each dilemma

For the Footbridge Dilemma, no significant group differences for the total sample or within the type-of-decision groups were found with respect to valence, arousal, moral acceptability, and permissibility. Similarly, for the Bicycle Dilemma, no significant group differences were found—except for valence: the ASD group reported more positive emotional valence than the NTD group, irrespective of their decision (WS = 1700.00, z = 2.383, p = .017). For the Graffiti Dilemma, the singular significant group difference was that the ASD group reported higher moral acceptability ratings than the NTD group (WS = 1672.00, z = 2.070, p = .038). This difference is driven by the participants who refused to betray the friend. ASD participants tend to find this course of action more morally acceptable than their NTD counterparts (WS = 364.00, z = 1.923, p = .055).

For the Trolley Dilemma, many group differences were obtained. The ASD group reported more positive emotional valence than the NTD group, whatever their decision (WS = 1729.00, z = 3.000, p = .003). They also reported lower emotional arousal than the NTD group (WS = 1177.00, z = − 2.745, p = .006), though this difference is driven by the participants who refused to pull the lever: among these, ASD participants were significantly less aroused than their NTD counterparts (WS = 371.50, z = − 2.809, p = .005). Concerning permissibility, the ASD group reported higher ratings than the NTD group (WS = 1642.00, z = 2.001, p = .045). Again, this difference is driven by the participants who would pull the lever: ASD participants tend to find this course of action more permissible than their NTD counterparts (WS = 301.50, z = 1.670, p = .095).

For the Pharmacist Dilemma, significant group differences were obtained only within type-of-decision groups. Those participants with ASD that would burglarize the pharmacy report less positive emotional valence, higher emotional arousal, and judge that course of action as less morally acceptable than their NTD counterparts (WS = 453.00, z = − 2.355, p = .019, WS = 174.50, z = 2.512, p = .012, and WS = 440.50, z = − 2.566, p = .010, respectively). Within the group of non-burglars, the ASD participants report significantly lower emotional arousal than the NTD participants (WS = 174.50, z = 2.512, p = .012).

Ratings of Valence, Arousal, Moral Acceptability, and Permissibility from the Adopted Perspective of the Victim

Table 3 summarizes the analyses of ratings of valence, arousal, moral acceptability, and permissibility when the participant was asked to take on the perspective of the victim. Again, nearly all variables deviated significantly from the normal distribution within diagnostic groups. As above, median (Mdn) and interquartile range (IQR) are reported as descriptive statistics, and the Wilcoxon–Mann–Whitney test for differences between diagnostic groups was applied.

Table 3 Group differences: ratings of valence, arousal, moral acceptability, and permissibility for each Dilemma, as assessed from the adopted perspective of the victim

No significant differences were obtained for the Bicycle Dilemma. Both groups, when instructed to adopt the perspective of the first prospective buyer, assessed the bicycle owner’s decision in a similar fashion. For the Footbridge Dilemma, the only significant group difference was found with respect to valence. Individuals with ASD reported significantly higher emotional valence ratings than the NTD individuals (WS = 1720.50, z = 2.675, p = .007). The same pattern was obtained for the Trolley Dilemma (WS = 1740.00, z = 3.055, p = .002). In the two dilemmas that addressed close personal relationships, more group differences emerged. In the Graffiti Dilemma, ASD individuals reported significantly lower valence ratings than the NTD individuals (WS = 1276.50, z = − 2.060, p = .039). In the Pharmacist Dilemma, ratings for valence, moral acceptability and permissibility differed significantly. From the perspective of the robbed pharmacist, ASD individuals reported more negative valence and rated the burglary less morally acceptable and less permissible than their NTD peers (WS = 1192.00, z = − 2.622, p = .009; WS = 1157.00, z = − 2.925, p = .003; WS = 1250.50, z = − 2.142, p = .032, respectively).

Group Comparisons for IMRT Subscales False-Beliefs and Social Desirability

With respect to the accuracy of identifying false beliefs in the three corresponding stories, no significant group difference was obtained (WS = 1405.50, p = .601). The mean individual proportion of correct answers was M = 0.833 (SD = 0.114) in the NTD group and M = 0.809 (SD = 0.145) in the ASD group.

Concerning the tendency to answer questions according to social desirability as assessed by the FPI-R subscale “openness”, there was also no significant group difference (NTD: M = 0.651, SD = 0.181; ASD: M = 0.625, SD = 0.190; WS = 1387.00, p = .492).

Associations of Type of Decision with Ability to Identify False Beliefs, Autistic Symptomatology, Empathy, and Alexithymia

As for associations of other variables with decision type, logistic regression analyses were computed where possible. The ability to identify false beliefs did not contribute significantly to predicting type of decision in any of the five dilemmas (Footbridge: Wald χ² = 3.73, p = .053; Trolley: Wald χ² = 0.22, p = .640; Pharmacist: Wald χ² = 1.10, p = .295; Graffiti: Wald χ² = 0.08, p = .785; Bicycle: Wald χ² = 0.03, p = .876).
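To make the reported Wald statistics concrete, the following Python sketch fits a logistic regression with a single predictor by Newton-Raphson and derives the Wald chi-square for the slope as the squared ratio of the coefficient to its standard error. It is a minimal illustration under stated assumptions, not the authors' procedure: the function names and the data are hypothetical, and the sketch covers only the case of one predictor without covariates.

```python
import math


def _score_and_information(b0, b1, x, y):
    """Gradient and information matrix of the log-likelihood at (b0, b1)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))).

    Returns (b0, b1, se_b1, wald_chi2, p_value) for the slope b1.
    Assumes the data are not perfectly separable.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = _score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error from the inverse information matrix at the estimate.
    _, _, h00, h01, h11 = _score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    wald = (b1 / se_b1) ** 2
    p_value = math.erfc(math.sqrt(wald / 2.0))  # chi-square with 1 df
    return b0, b1, se_b1, wald, p_value


# Hypothetical data: x = predictor score, y = 1 if the action was chosen.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 0, 0, 0, 1, 1, 1, 0]
b0, b1, se_b1, wald, p_value = fit_logistic(x, y)
```

With a binary predictor, as in this toy example, the fitted slope equals the log odds ratio of the 2 × 2 table, which offers a simple check of the fit.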

Self-reported autistic symptomatology (AQ), empathy (EQ), and alexithymia (TAS-26) differ with respect to diagnostic status (ASD vs. NTD) and chronological age (adolescents vs. adults). Logistic regression analyses with these scales were therefore performed (1) without controlling for any other variables and (2) controlling for diagnostic group, age group, and their interaction, so that the contribution of each scale to type of decision could be assessed without confounding (see Table 4).

Table 4 Associations between type of decision with AQ total, EQ total and TAS-26 total

The associations of autistic symptomatology, as assessed by the AQ total score, with type of decision were not significant, except in the Trolley Dilemma. In the logistic regression analysis, higher autistic symptomatology tended to predict a lower probability of pulling the lever (β = −1.82, Wald χ² = 3.06, p = .080). After controlling for the other covariates, the association was stronger and reached significance (β = −4.50, Wald χ² = 5.01, p = .025). Further correlational analyses revealed a significant negative point-biserial correlation in the ASD group (rpb = −.558, p = .0004), as opposed to a non-significant point-biserial correlation in the NTD group (rpb = −.107, p = .495). This indicates that, for individuals with ASD, the higher the autistic symptomatology, the lower the probability of pulling the lever. This association was neither expected nor predicted by any theoretical account.
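The point-biserial correlation reported above relates a dichotomous variable (type of decision) to a continuous score (AQ total). The following Python sketch shows one standard way to compute it, together with the t statistic on n − 2 degrees of freedom. The coding of the decision as 0/1, the function name, and the example scores are assumptions made for illustration and are not taken from the study.

```python
import math


def point_biserial(binary, scores):
    """Return (r_pb, t) for a 0/1 variable and a continuous score.

    r_pb equals the Pearson correlation between the two variables;
    t has n - 2 degrees of freedom. Assumes both categories occur,
    the scores are not all identical, and |r_pb| < 1.
    """
    ones = [s for b, s in zip(binary, scores) if b == 1]
    zeros = [s for b, s in zip(binary, scores) if b == 0]
    n1, n0 = len(ones), len(zeros)
    n = n1 + n0
    mean1 = sum(ones) / n1
    mean0 = sum(zeros) / n0
    mean_all = sum(scores) / n
    # Population standard deviation of all scores (divisor n).
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    r_pb = (mean1 - mean0) / sd_all * math.sqrt(n1 * n0 / n ** 2)
    t = r_pb * math.sqrt((n - 2) / (1 - r_pb ** 2))
    return r_pb, t


# Hypothetical data: 1 = would pull the lever, 0 = would not.
decision = [1, 1, 1, 0, 1, 0, 0, 0]
aq_total = [18, 22, 25, 27, 30, 33, 36, 40]
r_pb, t = point_biserial(decision, aq_total)
```

Under this coding, a negative r_pb means that participants who chose the action had lower scores on average, which matches the direction of the association described for the ASD group.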

The associations of degree of empathy (EQ total score) and degree of alexithymia (TAS-26 total score) with type of decision were not significant, except in the Pharmacist Dilemma. The higher the self-reported degree of empathy, the higher the probability of burglarizing the pharmacy. This association (β = 2.05, Wald χ² = 6.91, p = .009) remained significant after controlling for the other covariates (β = 3.07, Wald χ² = 5.50, p = .019). For the degree of alexithymia, however, the “uncontrolled” association (“the higher the alexithymia score, the lower the probability of burglarizing the pharmacy”) was no longer significant after controlling for the other covariates (β = −0.06, Wald χ² = 3.50, p = .061).

Discussion

Commonalities and Differences

In this study, we investigated differences in moral cognition between individuals with ASD and NTD controls by using a new computer-based test composed of distinctive dilemmas and additional classical false belief tasks. Using different social schemas, the IMRT allows for differential testing of the influences of social relationships, everyday situations, and extreme situations, as well as of stories describing false belief situations. Our results show many commonalities in moral reasoning between individuals with ASD and NTD individuals, but also some systematic differences in dilemmas focusing on social schemas dealing with close social relationships. Particularly when such a schema was combined with an extreme situation, individuals with ASD tended to show significant differences in type of decision and on scales assessing emotional valence, arousal, and moral acceptability of the chosen alternatives. No significant differences were found between the two groups concerning the ability to solve classical false belief tasks (FB), and there was no significant association between FB and the scales concerning moral dilemmas.

With respect to the dual-process theory of emotional engagement in moral reasoning, the results indicated that the response behavior of participants with ASD did not differ significantly from that of the control group. The observed differences in moral decision making are predominantly attributable to the adolescent ASD group, while the response behavior of the adult ASD group seemed to converge with that of the NTD participants. Social desirability seems to have no differential effect, as both groups displayed the same level on the “openness” scale.

The Impact of Social Schemas on Moral Decisions

Of the five dilemmas presented, only the Graffiti Dilemma and the Pharmacist Dilemma yielded significant differences in moral decision making. These two dilemmas are the only ones in the IMRT that focus on social schemas with close social relationships. The Graffiti Dilemma deals with the social schema of friendship. Although individuals with ASD are reported to be capable of making friends (Bauminger and Kasari 2000; Petrina et al. 2014), they report substantially different perceptions of friendship, and their friendships are reported to be fewer and of lower quality (Mendelson et al. 2016; Petrina et al. 2016). The Pharmacist Dilemma targets the social schema of love/sexual relationship, particularly romantic love. Since the level of interest in and desire for romantic relationships is comparable to that of neurotypical peers, the difference between ASD and NTD lies in the degree of acquired experience in such relationships (Hancock et al. 2017; Mehzabin and Stokes 2011). The significant difference between ASD and NTD concerning decision type in the Pharmacist Dilemma can be ascribed to the adolescent subsample with ASD. The decision to rescue the beloved person at the cost of violating a law can be seen as a developmental effect of accumulated experience that leads to a considerable degree of common ground between the adult ASD group and the NTD group.

Moral Decisions and Dual-Process Theory

In the NTD cohort, there was clear evidence for response behavior in line with the predictions put forth by Greene et al. (2001). The relative percentages for the impersonal action of pulling a lever to kill one person and save five lives vs. directly pushing a person off the bridge in order to save five people demonstrate that the response behavior of the ASD group does not deviate widely from that of the NTD group. The overall analysis of the dual-process scenarios reveals that ASD as well as NTD participants prefer utilitarian judgments as long as the dilemma describes a distant scenario of low emotional salience; otherwise, both tend to decide in a deontological manner.

Emotional Valence, Arousal, Moral Acceptability, and Permissibility from the Perspective of the Decision Maker

In summary, participants with ASD tended to report a more positive emotional valence than the control group, except in the Pharmacist Dilemma. The ASD group displayed lower arousal overall, but seemed to be markedly more aroused when confronted with the Pharmacist Dilemma. In the other dilemmas, ASD participants rated moral acceptability similarly to or higher than NTD participants; the Pharmacist Dilemma was again the exception. Participants with ASD who decided in favor of action rated the moral acceptability of this behavior significantly lower than the NTD group. However, their assessment of the permissibility of the action did not differ from the ratings made by the NTD group. On the one hand, the Pharmacist Dilemma can be assigned to the category of extreme life situations; on the other hand, the underlying concept addresses the schema of love relationship. Adolescents and adults with ASD, although having desires for romantic and sexual partnerships, are reported to show diminished enjoyment in social situations and to display inappropriate courting behaviors (Chevallier et al. 2012; Stokes et al. 2007). Considering the decisions of the ASD adolescent group, especially in the Pharmacist Dilemma, the schema of love relationship may not be as elaborated as in the group of their typically developed peers. An understanding of facial expression, gesture, posture, and prosody, as well as a subtle sense for ambiguity and irony, are essential and implicit sub-structures needed to cope with the demands of courting behavior. Since social schemas develop through the observation of prototypical situations and, to a large extent, through role models conveyed by the media, it is likely that individuals with ASD develop deviant schemas that hamper decoding, understanding, and predicting the development of a social situation (Christensen and Michael 2016).

Emotional Valence, Arousal, Moral Acceptability, and Permissibility from the Perspective of the Victim

The results for the victim perspective suggest that participants with ASD are usually able to put themselves in the place of others once they can bring the described situation in line with a schema that is familiar to them. A schema concerning a situation associated with social aspects like friendship, loyalty, love, and partnership can thus be considered less familiar to those with ASD than schemas involving norms and rules.

Social Schemas of Relationship

In comparison to the other dilemmas, the Pharmacist Dilemma has the largest share of significant differences between ASD and NTD. Given that this dilemma describes an extreme incident and is based on a social schema with explicit reference to interpersonal relations and special codes, individuals with ASD are disadvantaged. They show deficits in facial emotion categorization (Uljarevic and Hamilton 2012), in the use of social and figurative language (Dennis et al. 2013; Happé 1993; Kalandadze et al. 2016), and in cultural knowledge displayed in event schemas (Loth et al. 2008). A closer look at the results reveals that it is the adolescent subsample of the ASD group that shows deviant response behavior. This can be explained by the fact that adolescents with ASD, compared to their control peers, lack sufficiently developed cultural knowledge (Loth 2007), especially in terms of love and intimate relationships (Holmes and Himle 2014). Developing and expanding such schema-relevant content requires drawing on complex social cognition. A steady expansion of experience could also explain why the adult participants with ASD show response behavior like that of the control group.

In contrast to the Pharmacist Dilemma, the Graffiti Dilemma combines the social schema of friendship with an everyday life situation. Participants with ASD also had difficulties with this combination, since the social codes and typical actions depend on fast and common processing to allow for implicit and automatic recall in a particular situation. This also explains why most adolescents with ASD have little reluctance to betray the friend. For them, schemas concerning norms and rules can be recalled faster and outweigh inconsistent and rarely used schemas of friendship. Considering the approximation of the ASD group to the NTD group in adulthood, it can be assumed that social schemas expand and become more refined with age.

The results of the IMRT suggest that intuitive decisions are influenced by social schemas. Statistically significant differences in the Pharmacist and the Graffiti Dilemma can be explained by deficient (that is to say, not yet fully developed) social schemas. This is reflected in the contrast to dilemmas without close and complex relationships (Bicycle Dilemma, Trolley Dilemma, Footbridge Dilemma). In the Pharmacist Dilemma, participants with ASD reported significantly more negative valence and rated the decision to rob the pharmacy as morally less acceptable. According to the psychological constructionist account of Cameron et al. (2015), the emotional content depends on the core affect, perceived as valence and arousal, combined with general knowledge about love relationships and autobiographical experience. For NTD individuals, the schema of love relationship dominates the process of moral reasoning, even if they are instructed to empathize with the pharmacist. In participants with ASD, the altered structure of a social schema forces them to rely on different concepts. Being instructed to put themselves in the shoes of the pharmacist might trigger a combination of core affect and conceptual knowledge that is familiar to them, namely the physiological state of negative valence and high arousal in combination with autobiographical memories of victimization (Chamberlain et al. 2007; van Roekel et al. 2010) and tendencies toward legal-rule-oriented behavior and perseveration (Perner and Lang 2000). Consequently, they show a preferential understanding of the position of the pharmacist.

Everyday Life Situations versus Extreme Life Situations

Extreme life situations in combination with a social schema (Pharmacist Dilemma) yielded the greatest differences. Nonetheless, extreme life situations without social schemas also showed significant differences in several items. The dilemma involving only an everyday life situation (Bicycle Dilemma) revealed only one statistically significant effect. The combination of an everyday life situation with the social schema of friendship also yielded only one significant difference in the assessment of the decision. It can therefore be concluded that individuals with ASD are able to come to similar decisions and assessments in moral reasoning, provided the case is based on an everyday life situation without social schemas of close relationship.

Classical False Belief Tasks and Moral Reasoning

The first attempts to describe the human nature of moral reasoning emphasized the importance of the ability to understand and reflect upon the intentions and beliefs of others (Kohlberg and Kramer 1969; Piaget 1965). More recent publications highlight the effect of ToM abilities on moral reasoning (Fu et al. 2014; Sodian et al. 2016). In a series of investigations using false-belief tasks as a standard instrument for measuring ToM, it has been shown that individuals with ASD, in particular children, display deficits in false belief tasks (Baron-Cohen et al. 1985; Happé 1995). On the other hand, a number of studies show that adolescents and adults with high-functioning ASD in particular are able to manage false-belief tasks of first and second order (Happé 1995; Senju et al. 2009). In our study, the results of the classical false belief task yielded no significant differences, and the accuracy rates in all three stories were almost the same for ASD and NTD. This is consistent with findings in earlier studies using classical false-belief tasks (Blair 1996; Schaller and Rauh 2017) and those with a focus on adolescent or adult participants with Asperger syndrome or high-functioning autism (Peterson et al. 2007; Roeyers and Demurie 2010). Furthermore, the association between the accuracy rates of the classical false belief task and the response behavior in the IMRT yielded no significant results. This could be because the protagonists of these classical false belief stories act as providers of epistemic states, only providing information about knowledge and non-knowledge, without any statement about their emotional state or complex (emotional) relationships between the actors. Thus, we suggest that the ability to solve classical false belief tasks is a necessary but not sufficient condition for moral reasoning.

Symptomatology and Moral Reasoning

Symptom severity and alexithymia did not significantly explain additional variance in the moral dilemmas. Taking the EQ score as a dimensional measure of empathy, the Pharmacist Dilemma yielded a significant association with type of decision. This could be interpreted as evidence that a greater degree of empathy may lead to a higher tendency to take action and rob the pharmacy. This result is consistent with our initial assumption that empathy is a prerequisite for moral reasoning (Blair 2008; Kohlberg and Kramer 1969; Piaget 1965). What is the decisive factor that encourages individuals with ASD to assimilate their response behavior to that of typically developed individuals? Considering the decreasing ASD–NTD difference from the adolescent to the adult age group, it can be assumed that adult participants with ASD have an experience-based advantage. Over time, they accumulate knowledge about social situations that allows them to compensate socially and to augment their rudimentary social schemas with alternative heuristics. We suggest that the ability to reflect on social situations and one’s own role within them is delayed in individuals with ASD. Adult individuals with ASD seem to have more comprehensive insight into their communicative and social competence.

Limitations and Implications for Future Research

The assessment of moral cognition on the basis of a psychological constructionist account and with a particular focus on social schemas led to the development of the IMRT. This instrument asks participants to decide between two actions and to rate valence, arousal, moral acceptability, and permissibility of this decision. Since, to our knowledge, no investigation based on these criteria has been conducted before, this first attempt is associated with some limitations.

The results of this study are based on a relatively small sample, with sufficient power to detect only medium or large effects. Since the ASD group shows an average IQ of 111, the results are only attributable to those individuals defined as high-functioning on the autistic spectrum and cannot be generalized to low-functioning ASD. Although the ASD sample included a rather high number of female participants, it was still too small to allow for a robust statistical analysis of gender-specific differences. Future studies should focus on possible differences between genders with regard to moral reasoning and social schemas (Capraro and Sippel 2017).

It is apparent that such extensive testing of each dilemma limits the total number of dilemmas that can be administered in a single test. In future studies, however, it would be desirable to investigate additional dilemma variations. In particular, the demonstrated influence of social schemas on moral reasoning suggests varying “social” parameters within the schemas, such as the level of intimacy of social relationships. In the age range between 14 and 61 years, we were able to show that the false belief task and some of the moral reasoning dilemmas yielded no significant difference between participants with ASD and typically developed participants. Accordingly, we recommend extending the age range to children younger than 14 in future studies. Considering the age-related expansion of experience resulting in schemas of higher complexity, it is likely that a comparison of younger age groups will yield more pronounced differences. Regardless of the research questions examined here, it would be informative to examine the structure of social schemas in ASD more precisely.

Finally, we want to stress that this study by no means claims to exhaustively investigate all aspects of social schemas in moral reasoning. Still, the IMRT is an approach to investigate moral reasoning as a particular aspect of social cognition from the viewpoint of social schemas. Further research in this field may be helpful to come to a better and more comprehensive understanding of the difficulties individuals with ASD are exposed to in a social world.