Detailing the understanding of moral judgements in autism. A study with Spanish-speaking children

Abstract This study compared the theory of mind features of moral judgements in 60 children with and without Autism Spectrum Disorder (ASD), using a novel verbal moral judgement task. The task focused on three different scenarios which were morally unacceptable. Five measures were assessed with forced-choice responses, whilst the other two were categorised according to the quality of the response (categories: (i) related to mental states; (ii) descriptions; (iii) literal reiterations; (iv) inappropriate). No significant differences regarding the forced-choice answers were found between the groups. Justifications were classified mostly in category (ii) description by the ASD group, whereas the justifications of children with typical development referred to the use of mental-state words and descriptions for explanations. Implications of ToM in moral reasoning for children with ASD are discussed.


Introduction
Human beings tend to read others' actions and intentions in order to understand social interactions and communication (Astington, Harris, & Olson, 1988). Importantly, based on what we predict from a specific behaviour, we can form a judgement and decide to act accordingly. For example, if we recognise that a child hit someone by accident, the moral evaluation we might make of that child will be different from that formed if we recognise that the action was carried out on purpose. In this recognition, the capacity that allows us to attribute mental states (such as desires and beliefs) to oneself and to others plays a central role and is known as Theory of Mind (ToM; for a review see Frith & Frith, 2005; Hughes & Leekam, 2004; or Perner, 1999). Therefore, ToM ability is critical to evaluate others' intentions and, as in the example given, it is important to make a distinction between intentional and non-intentional actions of an agent. In consequence, ToM enables us to form a moral judgement about the agent involved in a specific situation (Buon et al., 2013; Turiel, 2006; Wellman & Liu, 2004). Indeed, there is a link between ToM reasoning and moral judgement that runs from the intention of an action to the moral evaluation (Leslie, Knobe, & Cohen, 2006). Therefore, the elemental features involved in an intentional action should be analysed, according to Cushman (2015), in order to better understand them: (i) mental states - beliefs and desires: the agent has a mind and (i.e. talking in class). Participants with ASD (age range = 8-17 years old) were divided into Non-ToM and ToM groups, depending on whether they passed the first-order false belief (FB) task successfully or not. Their findings showed that the group of children with ASD who lacked the ability to mentalise were sensitive to the distress of others. Children with ASD made the distinction between moral and conventional transgression.
Moreover, their results in the false belief tasks were not associated with the tendency towards this distinction. Later, following Blair's study, Leslie, Mallon and DiCorcia (2006) also compared children with ASD (7 to 16 years old; mean verbal mental age of 5;11) to children with TD (3 to 5 years old) on basic moral judgements (a basic distinction between good and bad, without testing the comprehension of intention) and the punishment or reward that the agents deserved. The authors found that only one participant with ASD successfully passed the two standard first-order FB tasks. However, the ASD group understood how bad or good the acts were (item example: 'Was it good, bad, or just okay that Patty hit Sarah?' [good, OK, bad]). Hence, both studies found that children with ASD were as good as children with TD at distinguishing morally acceptable acts from morally unacceptable acts. Thus, the ASD groups in these two studies would be capable of reaching a basic understanding of right and wrong moral aspects.
It could therefore be argued that moral judgements are relatively independent of ToM, given that children with ASD who failed standard FB tasks may yet retain a basic sense of morality. However, in both papers the authors argued that a minimal level of ToM would be necessary to understand some transgressions, such as the mental states involved in the act of lying. Telling a lie requires that the liar deliberately and successfully creates a FB in the mind of another person. Thus, ToM is an important tool to enable the agent to do so but is also a valuable aid enabling the observer to fully understand the transgression (Talwar & Lee, 2008; Talwar et al., 2012). Notably, the two studies cited above were based on a basic distinction of good or bad in stories with both positive and negative valences (unambiguous), and they did not ask children about the agents' intentions, nor were the participants required to deploy ToM to make moral distinctions. Even so, these two classical studies, on which most of the literature is based, shed some light on the fact that the moral judgement capacity of individuals with ASD has been greatly underestimated, and that when answers are forced-choice (and no verbal justification is asked for), people with ASD can make adequate moral judgements. Importantly, a working hypothesis formulated in Margoni and Surian's work (2016) paves the way to understanding that when tasks are simple with unambiguous moral cases (i.e. negative outcome produced by an intentional action), children with ASD can respond as well as children with TD. By contrast, when the moral judgement tasks involve ToM competence, differences can appear and reveal difficulties in reasoning about others' mental states, even when people with ASD are able to pass first- or second-order classical FB tests.
Some studies centred on advanced ToM tests (complex scenarios where an advanced level of ToM is required) have recently shown that individuals with ASD have difficulties in distinguishing the intentionality of the actions (accidental / deliberate) and these misunderstandings could be affecting their moral judgements. In their study, Moran et al. (2011) reported a mismatch between the comprehension of the negative outcome of the agent's action and his/her actual intention (good / accidental), and it was mediated by the ToM impairments of the ASD group. Specifically, as a consequence of the ToM deficit, participants with ASD blamed the agent who caused the accidental harm more severely than the TD group did. However, no differences between groups were found in the moral judgements of neutral acts, attempted harm or intentional harm.
In another study based on an accidental action task (Zalla et al., 2009), the so-called 'faux pas' test (Baron-Cohen, O'Riordan, Jones, Stone, & Plaisted, 1999), the interpretations of blunders or 'faux pas' were analysed. A 'faux pas' occurs due to a FB of the real situation, when the speaker says something that can hurt the listener emotionally, although this negative emotional impact is not intentional. Zalla and colleagues (2009) observed that adults with ASD could correctly detect the 'faux pas', but they failed to interpret the FB of the speaker. They also provided explanations in terms of malevolence, judging the speaker's intention to humiliate or offend the listener as deliberate (intentional).
Consequently, the studies cited above may indicate that individuals with ASD do have difficulties in understanding intentions in terms of ToM reasoning when the situations are socially complex or ambiguous, as pointed out by Margoni and Surian (2016). This probably occurs because they tend to judge the culpability of the agent on the basis of the explicit observation of the outcome of his/her action (hurting a person) rather than his/her implicit intention; that is, they judge the agent as "bad" because the outcome was bad or harmful, regardless of the agent's intention.
Another study which asked whether children with ASD were able to judge transgressions in terms of outcomes or intentions was that of Grant, Boucher, Riggs, and Grayson (2005). Grant et al. (2005) presented pairs of vignettes adapted from Elkind and Dabek (1977), in which the actions were either deliberate or accidental and caused a bad outcome (harm to a person or to an object). Results showed that children with ASD used information about intention as the basis of culpability judgements, considering the agent's intention to be more relevant than the consequences for the moral evaluation. The members of the ASD group were also capable of judging the negative consequences as worse (i.e., as more blameworthy) when a person, rather than an object, was involved.
Nevertheless, results differed when the participants were asked to verbally justify their responses. Grant and colleagues classified the verbal responses of this study in categories, giving importance to the pain (i.e., greater culpability in the case of injury to persons than damage to property or objects), reversibility (i.e., property can be replaced, but damage to another person cannot) and intent/motive (i.e., whether an act was deliberate or accidental). Although children with ASD could offer some appropriate verbal justifications about moral judgement of the agent, their explanations were poorer, as they usually reiterated the story rather than elaborating on their explanations regarding the culpability, with fewer references to the agent's intention. The majority of the justifications in Grant's study were classified as 'not scorable'. These findings are related to the study conducted by Shulman et al. (2012), in which the authors found that children with TD used significantly more abstract rules as rationales than participants with ASD and, in addition, children with ASD provided more non-specific condemnations of the behaviours and poorer explanations, i.e. "that's bad" or "you can't do that". Authors argued that high scores (of appropriate answers) were linked to the use of mental state words in the children's explanations (e.g. Bishop & Norbury, 2002). delineating autism across the world, but also for the potential it brings in helping to refine the understanding of autism to researchers and clinicians in their own communities (Norbury & Sparks, 2013).
Specifically, in the current study we investigate whether or not children and preadolescents with ASD are able to use relevant implicit information about intention to form their moral reasoning in unambiguous stories created for the purpose of the study, following Cushman's model (2015): bad (or selfish) desires - deliberate intention - wrong action - bad outcome. Indeed, although participants with ASD could have problems in moral tasks which require more complex mentalising abilities, the studies analysed showed no differences between TD and ASD groups when responses were forced-choice (i.e. good/bad; see Leslie et al., 2006) and when tasks were not ambiguous. Among existing studies about moral reasoning in children with ASD, our approach is innovative in three main ways. First, the present study introduces a novel moral task based on understanding transgressions which involve both ToM competence (i.e., false belief, desires, lies or trust understanding) and moral judgement evaluations with both types of questions (as argued in the study by Killen and colleagues, 2011): open-ended and dichotomous answers. Second, the age of participants in our sample is crucial given the little research that has examined moral reasoning in children between 7 and 12 years old.
Specifically, this may be a difficult age for children and preadolescents with level 1 ASD in terms of relating to peers. Finally, our dichotomous questions were different from those used in the other studies mentioned (such as Blair, 1996; Leslie et al., 2006). We intended to ask a general question (e.g. "Was what [the agent] did right or wrong?") rather than being specific and repeating the act or the transgression (e.g. "Was it bad for Catherine to pull Sally's hair?"). As in the study by Nobes, Panagiotaki, and Bartholomew (2016), this specific information was excluded here because it provided the answer to the wrongness question, and thus could increase children's tendency to make outcome-based judgments in their reasoning. Other questions were also added (i.e. emotions triggered in the victim - such as the resulting harm). In addition, as noted in Leslie et al. (2006), different actions were contemplated in terms of the mental states involved in each transgression (i.e. it is more necessary to appeal to mental states in the act of 'lying' than in the act of 'breaking' an object). Lastly, open-ended questions were based on morality and intention in order to examine the importance of the mental states involved in moral judgement stories (as in Grant et al., 2005). Therefore, the goal of the current study was to compare children close to adolescence with ASD to children with TD in terms of their comprehension of moral stories that include complex ToM events. To this end, the following research questions were posed.
Related to the forced-choice answers: 1. Are children with ASD as competent as children with TD in the moral stories regarding different aspects: (1) detection of the transgression, (2) identification of the perpetrator who caused the bad action, (3) emotion triggered in the victim, (4) perpetrator's morality, and (5) wrongness of the act? 2. Are children with ASD as competent as children with TD depending on the mental states involved in the transgression (for example: when the bad action is 'lying' to someone, the story could be more difficult to understand than when the action is more basic, such as 'breaking' an object)?
Related to verbal justifications: 3. Are children with ASD as competent as children with TD in making action-morality and intention justifications? 4. Which justifications are more reiterative in the two groups?

Predictions
Related to forced-choice answers: The first hypothesis of the present study was that children with ASD and TD would not differ in the forced-choice answers of the moral judgement task (as per Blair, 1999, and Leslie et al., 2006), since the scenarios contemplated in the study are based on unambiguous moral transgressions (bad intention - bad outcome; following Margoni and Surian's hypothesis).
The second hypothesis was that children with ASD should find it more difficult (and differences are expected when we compare stories between groups) to understand when the agent's action involved mental states than when the action is more basic (as commented in Leslie et al., 2006).

Related to verbal justifications:
The third hypothesis was that the ASD group would show more difficulties and poorer explanations in their verbal justifications for the intention and action-morality questions.
Finally, the fourth hypothesis was that participants with ASD would include more reiterations or nonsensical justifications in their verbal answers than participants with TD (as in Grant et al., 2005, and Shulman et al., 2012).

Method

Participants
A priori sample size analyses indicated that 29 participants per group would be adequate to detect large effect sizes (d = .80; α = .10). This sample was comparable to those used in previous research in this field (i.e., Shulman et al., 2012). A total of 63 children took part in the present study, three of whom did not pass Task
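The a priori calculation can be approximated by hand. The sketch below uses the standard normal approximation for a two-group comparison of means. The target power of .90 and the two-tailed test are assumptions (the text reports only d and α), and the helper name `n_per_group` is hypothetical. Because this approximation ignores small-sample and non-parametric corrections, it should be read as a rough lower bound rather than a reproduction of the 29 per group reported.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha, power, two_tailed=True):
    """Normal-approximation sample size per group for comparing two means.

    d     : standardised effect size (Cohen's d)
    alpha : significance level
    power : target power (1 - beta); an assumption, not stated in the text
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


# Values from the text (d = .80, alpha = .10) with an assumed power of .90.
print(n_per_group(d=0.80, alpha=0.10, power=0.90))
```

Dedicated power-analysis software applies exact distributions and, for rank-based tests, an efficiency correction, so it will typically return a somewhat larger n than this sketch.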

Materials
All tasks were administered and performed in Spanish, since Spanish was the native language of all participants.

Intelligence Quotient (IQ):
Sattler's short adaptation (1992) of the WISC-III (Wechsler Intelligence Scale for Children; Wechsler, 1991) was administered first (in the TD group). The WISC-III full scale IQ was highly correlated with the short form (Vocabulary and Block Design) as found in classic studies (Ryan, 1981; Sattler, 1992). Comparison of scores was possible as all the participants with autism had diagnostic reports made by a qualified psychologist or neurologist at their specialist neuropsychology centre within the previous two years.
Given this high correlation between the short form and the full scale of the WISC-III, the short form WISC-III was administered to the TD group as a reliable estimate of the group's intelligence quotient, with the main objective of ensuring comparable IQ levels in both groups.

Theory of Mind and Moral Judgement tasks:
Task 1: First-order false belief task (ToM). All participants were screened by means of a first-order FB task adapted from the study by De Villiers and De Villiers (2012). This task was designed as a sequence of pictures, where the first character (a girl) places an unusual object (a little chick) into a box originally containing shoes. The children were then asked which object the second character (a boy) thinks will be inside the box (by presenting the option of a chick or shoes). Specifically, they were asked: "What goes in here?", "What is he thinking?" and "What is actually inside the box?". The maximum score (3 points, 1 point for each question) was required to pass.
Task 2: Moral Judgement. The Moral Judgement task was created by the authors of this study to measure the reasoning of Spanish children with level 1 ASD in inappropriate daily-life scenarios. The moral scenarios (lying, stealing, physical and emotional harm to someone else) were based on universal moral norms, regardless of the authority or social context, which involve a victim and are general in scope (see Sousa, 2009).
Although the stories used in the present work are based on universal moral norms, most existing articles on moral reasoning in autism focus on English-speaking children, and it is important to expand knowledge in this field to children of a different culture and language, in this case Spanish-speaking children living in Spain. This matters not only for building a more universal and detailed picture of the moral reasoning of children with ASD, but also for the Spanish professionals who work with children with ASD in Spain, who will gain a more accurate understanding of these children's moral reasoning (Norbury & Sparks, 2013).
Following Cushman's model (2015), the "structure" of the stories was unambiguous: bad (or selfish) desires - deliberate intention - wrong action - bad outcome. Based on this structure, three different basic moral stories were created for the purpose of the study:
1. Dog story. The transgression involves a lie and blaming an innocent party: the child's desire was not to be punished and she deliberately blamed the dog, which cannot defend itself.
2. Car story. The transgression involves breaking a valuable object which belongs to someone else: the boy's desire was to reach his destination as quickly as possible and he deliberately broke the window of his friend's car to get the keys.
3. Sandwich story. The transgression involves stealing or doing something without the permission of the owner: the boy's desire was to satisfy his hunger and he deliberately ate his friend's sandwich without permission.
Information was collected through dichotomous choice answers and verbal justifications and was subsequently categorised (see Questions section).
For better comprehension, the three stories are placed in the Appendix.

Questions
After each story, five questions were asked that required forced-choice responses (correct answers are in bold): (1)  all participants chose the correct response ('it was wrong'), and it was not included in the analysis; however, this result is commented on in the discussion. Therefore, the score range per story was 0-4, excluding the point for the fifth question, and the score range of each variable (e.g. total score of (1) detection of the transgression) was 0-3.
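The score ranges described above can be made concrete with a small sketch. It assumes each forced-choice answer is recorded as correct or incorrect; the measure labels and the helper `score_child` are hypothetical names introduced for illustration, and the fifth question is left out as in the analysis.

```python
STORIES = ("Dog", "Car", "Sandwich")
# Question (5), wrongness of the act, is excluded as described in the text.
MEASURES = ("detection", "perpetrator", "emotion", "perpetrator_morality")


def score_child(answers):
    """answers[story][measure] is True when the child chose the correct option."""
    # Per-story score: four questions per story, so the range is 0-4.
    per_story = {s: sum(answers[s][m] for m in MEASURES) for s in STORIES}
    # Per-variable score: one question per story, so the range is 0-3.
    per_measure = {m: sum(answers[s][m] for s in STORIES) for m in MEASURES}
    return per_story, per_measure


# Hypothetical child with a single error in the Dog story.
child = {s: {m: True for m in MEASURES} for s in STORIES}
child["Dog"]["perpetrator_morality"] = False

per_story, per_measure = score_child(child)
print(per_story)
print(per_measure)
```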
In addition, two other open-ended questions were asked to obtain more verbal information: (6) action-morality: 'Why was what [the perpetrator] did wrong?' and (7) intention: 'Why did the character do this? What was his/her intention?'. The answers to both questions were classified into four categories according to: (i) appeal to mental states - especially desires - that are involved in the transgression; (ii) superficial description - just the description of the action; (iii) literal reiterations - the child copies the same sentence from a character in the story; and (iv) inappropriate or nonsensical - the child provides an inappropriate justification which does not fit into any of the categories.
For analysis, the mean value was understood as useful in interpreting participant comprehension in the descriptive analysis and its subsequent comparative analysis of the categories (for example in Table 4). Thus, the scores were distributed in relation to the adequacy of the answers on an inverse scale: score of 4 (max) = category (i) justification, down to a score of 1 (min) = category (iv) justification. A score of 0 was given for a nonresponse, since one child in the ASD group did not reply to the justification questions.
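As an illustration of this scoring scheme, the sketch below maps each coded justification to its score and summarises a group by mean and mode. The category labels and the sample data are hypothetical; only the 4-to-1 scale and the score of 0 for a nonresponse come from the text.

```python
from statistics import mean, mode

# Scores follow the scale described above: (i) = 4 down to (iv) = 1, nonresponse = 0.
CATEGORY_SCORE = {
    "mental_states": 4,   # category (i)
    "description": 3,     # category (ii)
    "reiteration": 2,     # category (iii)
    "inappropriate": 1,   # category (iv)
    None: 0,              # no response given
}


def summarise(labels):
    """Return (mean score, modal score) for a list of coded justifications."""
    scores = [CATEGORY_SCORE[label] for label in labels]
    return mean(scores), mode(scores)


# Hypothetical codes for five children on one open-ended question.
labels = ["mental_states", "description", "description", None, "inappropriate"]
print(summarise(labels))
```

Because the categories are ordinal, the mean is best read as a descriptive aid alongside the mode, which is how the text presents it.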

Procedure
This study was approved by the research ethics committee of [content hidden] and the school authorities. Prior to taking part in the study, the parents of each child gave informed consent for their children to participate. As a result of these meetings, six mainstream centres (all from the [content hidden]) participated in the study. Over several months, the different tasks were administered to the children, including the short Sattler adaptation (1992) of the WISC-III, the ToM task adapted from De Villiers and De Villiers (2012) and the verbal moral task.
Each task was performed in a quiet room free of any distractors with a table and two chairs. The duration of the session was approximately 50 minutes per participant and only children who passed the ToM first-order FB task (Task 1) went on to the additional moral judgement task (Task 2). Three children did not pass Task 1 and were excluded. Children listened to the audio recording via E-prime, which randomised the order in which the stories were presented for each participant. Children were informed that they would listen to a story and be asked questions at the end. They were told to listen carefully and do their best. The five questions with dichotomous answers were asked first, followed by the two for explaining the justifications regarding morality and intention. Participants' responses were recorded and transcribed verbatim. For the categorisation of responses, two raters (the first author and a colleague blinded to the children's diagnoses and the hypotheses of the study) independently coded all 360 justifications (Cohen's kappa = 0.75; good strength of agreement). Disagreements between coders were resolved through discussion.
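For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how Cohen's kappa is computed from two raters' category codes. The function name and the example codes are hypothetical and are not the study's data. The 360 justifications mentioned above are consistent with 60 children, three stories and two open-ended questions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1:
        raise ValueError("Kappa is undefined when chance agreement is 1.")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six justifications (categories i-iv).
first_rater = ["i", "i", "ii", "ii", "iii", "iv"]
second_rater = ["i", "ii", "ii", "ii", "iii", "iv"]
print(round(cohens_kappa(first_rater, second_rater), 2))
```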

Data Analysis
Data analysis was conducted using the statistical package SPSS (v. 24). Non-parametric statistics were used, as the variables of interest did not follow the normal distribution in the two groups (using the Kolmogorov-Smirnov test for normality). Therefore, the data for dichotomous answers were analysed using non-parametric tests (Mann-Whitney U and Wilcoxon Signed-Rank test) and the justifications were analysed using Chi-squared (χ2). The significance level for all analyses was α = .05. However, in order to be consistent across our analyses, as the comparison in all analyses was split by the three stories (Dog, Car and Sandwich), to test the first hypothesis (differences between ASD and TD group in the five forced-choice answers) a threshold of significance of .05 / 3 = .017 was adopted.
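The between-group test and its effect size can be sketched in plain Python. This is not the SPSS procedure itself: it assumes the asymptotic (normal-approximation) Mann-Whitney test with a tie correction and no continuity correction, and it assumes the effect size is r = |Z| / √N, which appears consistent with the values reported in the results. The function names and sample scores are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U, z, two-tailed p, r) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("All observations are tied; the test is undefined.")
    z = (u - n1 * n2 / 2) / sqrt(variance)
    p = 2 * NormalDist().cdf(-abs(z))
    r = abs(z) / sqrt(n)  # assumed effect-size convention
    return u, z, p, r


# Hypothetical scores (0-3) for two small groups.
u, z, p, r = mann_whitney([1, 2, 2], [2, 3, 3])
print(u, round(p, 3), round(r, 2))
```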

Are children with ASD as competent as children with TD in forced-choice answers regarding moral scenarios?
The ASD group scored lower (trend towards statistical significance) than the children with TD on (1) detection of the transgression (U = 375.00, p = .02, r = .30, two-tailed) and (2) identification of the perpetrator (U = 320.00, p = .02, r = .31, two-tailed). There were no significant differences between groups on (3) emotions triggered in the victim (U = 372.00, p = .06, r = .24, two-tailed) and (4) perpetrator morality (U = 352.50, p = .13, r = .20, two-tailed) (see Table 1). Both groups obtained the maximum score on (5) wrongness of the act. At the conventional .05 level, between-group differences would therefore appear on questions (1) and (2). However, as there were three stories, a corrected significance threshold of .05 / 3 = .017 was adopted. Therefore, no significant results were found when forced-choice answers were compared between groups.
[Table 1. Mann-Whitney U scores by question and group in the moral task.]
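The decision rule applied to these comparisons can be written out explicitly. The sketch below takes the four probability values reported above and compares each with the corrected threshold of .05 / 3; the variable names are illustrative only.

```python
ALPHA = 0.05
N_STORIES = 3
threshold = ALPHA / N_STORIES  # corrected threshold, about .017

# Probability values as reported in the text for measures (1) to (4).
reported = {
    "(1) detection of the transgression": 0.02,
    "(2) identification of the perpetrator": 0.02,
    "(3) emotion triggered in the victim": 0.06,
    "(4) perpetrator morality": 0.13,
}

significant = {measure: value < threshold for measure, value in reported.items()}
for measure, flag in significant.items():
    print(f"{measure}: {'significant' if flag else 'not significant'}")
```

Measures (1) and (2) fall below .05 but not below the corrected threshold, which is why no between-group difference is retained.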

Are children with ASD as competent as children with TD in forced-choice answers depending on the mental states involved in the transgression (lying/blaming, breaking, stealing)?
On dividing the scores among the three stories in the forced-choice answers, significant between-group differences existed only in the Dog story (lying/blaming) (ASD group mean = 3.43, Mdn = 4; TD group mean = 3.83, Mdn = 4; U = 315, p = .011, r = .33, two-tailed).
There was no significant difference in the Sandwich story (χ2 (3) = 6.83, p = .077, φ = .34). The mean value is understood as useful in interpreting participant comprehension in the descriptive analysis and its subsequent comparative analysis (see Table 2 for means).
[Table 2. Mean and Mode (the level of the category) in the ASD and TD groups.]
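The Chi-squared comparison of justification categories and its effect size can be illustrated as follows. The contingency counts are hypothetical, since the study's frequency tables are not reproduced here. The effect size is assumed to be φ = √(χ2 / N) with N = 60, which reproduces the reported value for the Sandwich story.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-squared for a contingency table given as a list of rows.

    Returns (statistic, degrees of freedom, total N). Every row and column
    total must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, n


def phi(statistic, n):
    """Effect size for a 2 x k table (assumed convention)."""
    return sqrt(statistic / n)


# Hypothetical counts per category (i)-(iv); rows are ASD and TD, 30 children each.
counts = [[5, 15, 4, 6],
          [12, 14, 2, 2]]
statistic, dof, n = chi_square(counts)
print(round(statistic, 2), dof, n)

# Check against the values reported for the Sandwich story.
print(round(phi(6.83, 60), 2))
```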

What justifications are more reiterative between groups?
As shown in Figures 1 and 2, while the children with TD performed well in both the action-morality and the intention questions, giving responses that fell into categories (i) and (ii) (max scores) across all three stories, the children with ASD displayed more varied responses (across all categories), especially regarding action-morality. Thus, the TD group was more focused on the mental states of the characters, and also on the description of the action (e.g. breaking, lying), when it came to reasoning about the action-morality (child with TD, real example: "because he ate the food" -classified as category ii: descriptive, in the Sandwich story. See Table 3 for more examples). For the intention questions, the TD group focused on the actions, except in the Dog story (Figure 2a), for which it is important to note that the justifications were based on mental states (child with TD, real example: "because if she had told the truth, she would have been punished by her mother" -classified as category (i): mental states. See Table 4, category (i) for more examples).
In the ASD group, the justifications were more random in action-morality, with more inappropriate answers and superficial descriptions of the act in comparison to the TD group (child with ASD, real example: "because he hit it" -classified as category (ii): descriptive). Nevertheless, justifications for intention were most often classified in category (ii): the superficial description of the act (child with ASD, real example: "because she likes chocolate cakes". See Table 4, category (ii) for more examples).

Discussion
The main aim of this study was to investigate whether preadolescents with level 1 ASD would be as competent as preadolescents with TD on a verbal moral judgement task that includes ToM aspects, using both forced-choice and verbal open answers. It is important to investigate the type of moral reasoning that children and preadolescents with ASD make, because during this developmental period social relationships with peers are very important, and understanding moral transgressions can be crucial not only to better understand how friendships and human relationships work, but also to prevent bullying and mate-crime cases (Perren et al., 2012).
Our first hypothesis was that children with ASD and TD would not differ in their forced-choice answers in the moral judgements task because the scenarios included in the study are based on unambiguous moral transgressions. Between-group comparisons confirmed that there were no significant differences in the dichotomous choice answers between preadolescents with ASD and TD. These results are consistent with Leslie et al. (2006), who found basic moral judgements were substantially intact in children with ASD. Importantly, in the choice answers, children with ASD showed some understanding of the aspects considered to be 'more difficult', namely, the recognition of basic (3) emotions and the (4) perpetrator-morality. Related to (3) emotion, our results showed that children with ASD could distinguish when an action may make other people feel "bad" or "sad". These findings highlight the importance of others' feelings and emotions and agree with research by Grant et al. (2005), who demonstrated that children with ASD, TD (and MLD) were able to complete their memory and comprehension questions satisfactorily, most of which were related to emotions (i.e. "Did this make John happy or sad?"), as are the questions in the present study. Although in our stories the actions of the characters were always bad, this may be controversial for (3) emotions triggered in the victim in the Car story, as some children with ASD commented in their verbal justifications, "he did it for her, maybe she could be happy because she retrieved her keys", without considering in their explanations the bad consequences or the conflicting emotions caused to the girl (i.e. "however, she is sad or upset because the window is broken"). Regarding forced-choice answers about (4) perpetrator's morality, Leslie et al. (2006) stated that the distinction between 'good' and 'bad' is necessary to be able to make basic moral judgements.
This fundamental ability underlies the capacity to judge the appropriateness of someone's social behaviour. However, studies usually focus on how bad or good the acts are, and questions related to character-morality are usually asked as culpability judgements or explanations (such as Grant et al., 2005: 'Why do you think X was naughtier?'; see Killen et al., 2011). Our study considered both perpetrator's morality and wrongness (act-morality) in the forced-choice questions. As regards the judgement of the character as good or bad, no significant differences between groups were observed because both groups scored the lowest mean value (compared to the other variables) and both were equally unable to correctly judge the perpetrator's morality, as in some of the stories they responded that the character who had caused the bad action was good. This finding is inconsistent with the results for wrongness (as explained in the Method and Results sections), since all the participants in both groups judged the action as bad, as in Blair's pioneering study. As expected, the capacity to understand (5) wrongness is intact in children with ASD when they are asked to distinguish between right/wrong actions. This discrepancy between questions (4) and (5) may be explained by the fact that judging a character can be more difficult than judging an action. Following the model of Cushman (2015), distinguishing intention is critical for evaluating the morality or culpability of an agent, and the evaluation of the wrongness of the act depends upon the agent's mental states but also on the agent's action. In this study in particular, the intention and the outcome were not ambiguous, and children could find it easier to understand the bad action because they linked it to the bad consequences or the victim's emotion, as noted by Margoni and Surian (2016), but they probably did not link it to the agent's mental states.
These results highlight one of the reasons why more research is required in the moral field in autism, particularly, on the difficulties in comprehending ecological situations and judging the intentions and morality of others and their acts.
Emerging evidence in the criminal justice field shows that adolescents with ASD (even children and adults) are more vulnerable to becoming involved in "mate crime" -where someone deliberately befriends a person with ASD in order to take advantage of him/her -as a consequence of the fact that they have these difficulties in interpreting others' intentions and judging whether an act or person is good or bad (Grundy, 2011;Thomas, 2011).
Our second hypothesis stated that, compared to the TD group, children with ASD would find it more difficult to understand the agent's action when it involved mental states than when it was more basic. Between-group comparisons corroborated this hypothesis, because the expected difficulties in understanding actions that involved mental states became especially visible in the resolution of one of the three stories (the Dog story), in which the selfish action resulted in harm to another (the dog). This action affected the dog directly, and understanding the reason for it required reasoning about more mental states. Accordingly, the most significant differences between groups were found in the Dog story and, as expected, it was classified as the most difficult story for children with ASD. The main act, lying, is of such a nature that an advanced level of ToM is required for its complete understanding. In fact, the Dog story has a double morality that emerges from lying and blaming another (two bad acts). In this sense, as detailed in previous research (Spence et al., 2004; Talwar & Lee, 2008; Talwar et al., 2012), a lie exists if the speaker causes a false belief in the mind of the listener with the intention of deceiving. In our Dog story, the child (speaker) knows something that her mother (listener) does not know, allowing the child to have a certain degree of control (deliberate action). Also, the observer may take into account that, as a dog cannot speak, the moral strategy of the child is doubly bad. As a consequence of the lie, the girl will seem innocent, the mother will put the blame on the dog (false belief), and the dog will be punished.
The third hypothesis stated that the ASD group would show more difficulties and poorer explanations in their verbal justifications (open questions) about intention and action-morality. Results showed that children with ASD had greater difficulty than children with TD in recognising and explaining intentionality and action-morality in basic moral stories. Significant differences were found between the groups in five out of the six explanations, showing that participants with ASD failed to apply the ToM ability required to understand and judge the intention and the morality of the act described in the task. As other studies have shown (Grant et al., 2005; Shulman et al., 2012), children with ASD were able to provide some appropriate justifications, although most of them were of poorer quality than those of the TD group, relying on descriptions of the narrative and reiterations rather than on mental-state words. In our study, judgements made by the ASD group differed from those made by the TD group when the explanations were related to the morality of the act, more so than in the case of intention. This interesting result is inconsistent with the findings in Blair (1996) and Leslie et al. (2006), which were based on bad/good/OK choice answers, and it suggests that ToM could be influencing moral understanding and, especially, the explanations given when mental states are involved. In particular, the ASD group's justifications for the Dog story included a greater frequency of verbal reiterations of the story and inappropriate answers, which fall in the lower-scoring categories: (iii) literal reiterations and (iv) inappropriate or nonsensical. Again, this story was also considered the most difficult of the task when justifications were requested in the open-ended questions (Talwar & Lee, 2008; Talwar et al., 2012).
Finally, hypothesis 4 stated that participants with ASD would include more reiterations or nonsensical justifications in their verbal answers than participants with TD. In this sense, results showed that the TD group was more homogeneous (with answers predominantly classified in categories (i) and (ii)), whereas the ASD group was more heterogeneous, giving justifications classified in all categories; in particular, justifications fell in the lower categories in the explanations about act-morality (as in Grant et al., 2005). Taking a closer look at the stories, some information can be extracted from the verbal justifications. For example, the Sandwich story was set in a school, and the scenario was believed to be more relatable to children than, for example, the Car story, which is more related to adolescent/adult contexts. Children seem to perform best on tasks that are closest to their own experiences and that include dialogues they could encounter in their daily life. In Grant et al.'s (2005) study, the good scores in the ASD group were explained by the fact that they had learned from their own experience or because they had been explicitly taught. Beyond the scenarios and the value of the objects (i.e. a car and a sandwich), the trust and emotions involved in the stories were more salient for children in the more relatable scenario. It is important to highlight that most children appreciated the trust and friendship in the Sandwich story more, as they described similar experiences in their justifications. In fact, this scenario would involve more mental states and emotions for children and preadolescents than a common story based on a bank robbery. However, in the Car story, both groups simply explained what was happening and did not consider the feelings or the culpability of the character arising from breaking an expensive object.
Possibly, the most striking difference between groups was in the Dog story, which was the easiest for the TD group to explain by attributing mental states to the child's intention. In contrast, the ASD group tended to repeat elements of the story in their verbal justifications, which were descriptive rather than expressed in their own words. As found in the studies by Grant et al. (2005) and Shulman et al. (2012), these kinds of differences arise from linking an appropriate answer to the use of mental-state terms in the responses (Bishop and Norbury, 2002). When a moral transgression occurs, it is important to use information about the perpetrator's mental state to make sense of the whole act and to understand, for example, why (in this specific situation) he/she did not follow a social norm or why he/she acted in a certain way. In this sense, we must be flexible and correctly attribute mental states to individuals to understand that sometimes people break a rule for various reasons (e.g. she acted wrongly, moved by a desire). Possibly, individuals with ASD use information about a particular mental state less frequently because of their cognitive inflexibility in following moral norms or social rules. In contrast, participants with TD tend to justify open answers in a more creative way (e.g. giving complex arguments or empathising with the characters).
In general, the findings of the present study confirmed that children with ASD were as able as children with TD to answer the forced-choice questions, since the transgressions were unambiguous (bad intention / bad outcome), as Margoni and Surian (2016) argued.
This finding reveals interesting knowledge about the moral reasoning capacities of children with ASD: when forced-choice answers are offered, participants with ASD can perform as well as participants with TD, even though the scenarios could be classified as complex. This complexity increased because important information was excluded from the question, as it could have given away the correct answer (as in Nobes et al., 2016). Nevertheless, when the participants answered verbally, findings indicated that moral reasoning was more difficult for children with ASD, especially when mental states were involved in the transgressions. For example, participants with ASD had difficulty recognising and explaining intentions and why perpetrators did 'bad' things (morality of the act). However, further information is necessary in order to know what variables could affect people with ASD in their reasoning about moral situations, and whether verbal justifications are a good way to determine differences between participants with ASD and TD.

Limitations
There are some limitations of the present study that should be acknowledged. First, a limited number of stories were used. Second, the research team created the scenarios used, and the stories were novel and had not been administered before the current study. Finally, the task was administered only verbally. Therefore, future studies are necessary, because there is a need for more novel ecological tasks related to children's and adolescents' everyday interactions. These studies could include a greater range and variety of moral scenarios, such as telling white lies, deceit, cheating or hitting, and also test larger samples. In addition, future research could compare these novel stories to other advanced ToM tasks, such as Strange Stories (Happé, 1994) or the Faux Pas test (Baron-Cohen et al., 1999). Other executive function tasks could be administered as well; for example, working memory would be relevant for handling and interpreting the information, given all the elements of verbal content one needs to remember and comprehend. The effect of working memory must be controlled for in future research, although of course working memory is also required in peer-to-peer dialogues and other real-world social situations. For this reason, some working memory control measures should be analysed in relation to moral judgements. Future studies should further explore the differences between children with ASD and comparison participants using moral situations in visual and verbal formats, considering differences in working memory (especially for verbal information) and, as noted earlier, different scenarios focusing on the comprehension of emotions, morality, intentions and beliefs.
Overall, difficulties associated with ToM, along with intrinsic understanding of morality and possible links to poor judgement, provide a rationale for further research on the different aspects of the interrelatedness of ToM and moral reasoning, as is evident from recent studies (Killen et al., 2011;Shulman et al., 2012). Another explanation for the difficulties found in the justifications lies in the broad area of complex information processing (Minshew and Goldstein, 1998), the weak central coherence theory (Frith, 1989) and executive function (Ozonoff, 1997;Russell, 1997), in the sense of ignoring distracting information, planning, processing, memorising and working with complex information to decode the story and formulate appropriate answers, based on their own beliefs.
In summary, ToM is a multidimensional process that requires the integration of multiple components to understand and predict both our own and others' mental states (Amodio & Frith, 2006). The similarities and differences found between the groups can shed some light on how individuals with ASD process information. Our results should not be understood in terms of 'pass' or 'fail'; rather, the focus should be on the need to strengthen those aspects that people with ASD can comprehend easily (a task designed with forced-choice answers) and to clarify how to compensate for difficulties (verbal reasoning).
Future research is encouraged to build more scenarios that involve ToM elements and morality into moral judgement tasks, with the purpose of revealing further findings regarding the moral reasoning capacities of individuals with ASD. In the social context, improving this understanding in individuals with ASD who are close to adolescence can help to prevent some situations of bullying or mate crime. Materials of this kind, which address both morality and ToM, can bring individuals with ASD closer to real situations that they may face in their interpersonal relationships. This could be a good way to present scenarios that are more closely related to real life than the prototypical stories. Moreover, it is during adolescence that many moral transgression scenarios occur every day (Brugman et al., 2003). As noted by Norbury and Sparks (2013), our conclusions not only extend the knowledge that professionals and clinicians have about autism in a cross-cultural sense, but the need for culturally appropriate assessment practices can also be important for our own culture. Indeed, in a broader sense, if we can improve tasks and tools for people with ASD through better design, this can be beneficial for everyone.