Incivility in online news and Twitter: Effects on attitudes toward scientific topics when reading in a second language

On the Internet, polite communication can stand side by side with uncivil comments. Research on online incivility has been conducted with users reading in their mother tongue (L1), while the potential effects of incivility in a second language (L2) have remained largely underexplored. This paper analyzes the effects of uncivil comments written in an L2 on attitudes toward emerging technologies. Study 1 replicates a previous experiment (Anderson et al., 2014), adding a bilingual condition. Study 2 analyzes the effect of incivility in four fictitious Twitter debates on different scientific issues. Results show that participants usually endorse claims written in a civil rather than an uncivil manner, but only in their L1.

In one well-cited study, Anderson et al. (2014) gave a representative sample of the US population an online newspaper article on the pros and cons of a largely unfamiliar topic, nanotechnology, followed by a civil or uncivil comment. The authors measured polarization taking 'risk perception' as the dependent variable, as controversies about modern technologies frequently focus on risk (Sjöberg, 2004). They found that incivility drove a polarization of the risk perception of nanotechnology, especially among individuals who did not support nanotechnology or who were highly religious. According to the authors, this means that when faced with incensed comments online, people rely more on previous beliefs or heuristics (Anderson et al., 2014). In the present paper, we explore whether this effect also holds for people reading in an L2. As will be shown below, psycholinguistic research on bilingualism has found that affective processing in an L2 is less automatic and triggers weaker affective reactions (see Pavlenko, 2012, for a comprehensive review), which has been linked to a reduction of framing biases and more rational decision-making (Costa, Foucart, Arnon, et al., 2014; Hadjichristidis et al., 2015; Keysar et al., 2012).

Emotional Distance in an L2
Research on bilingualism has acknowledged the complexity of the relationship between linguistic experience and emotional processing (Freeman, Shook, & Marian, 2016; Pavlenko, 2008, 2012). Studies conducted with different methods, such as interviews, questionnaires, memories, experimental tasks, psychophysiological approaches, and neuroimaging, demonstrate that bilinguals show stronger emotional responses in their native language (Caldwell-Harris, 2014; Pavlenko, 2005, 2012). The reduced emotionality in an L2 is especially evident in unbalanced bilinguals, i.e. people who display greater ease in one of their languages, such as L2 learners (Opitz & Degner, 2012). For example, saying 'I love you' is perceived differently when proffered in a foreign language (Dewaele, 2010), because of the stronger emotional experience in the L1 (Pavlenko, 2005). This is also true for negative emotions. Bilingual speakers tend to switch from the L2 to the L1 in conversation after showing a negative facial emotion (Williams, Srinivasan, Liu, Lee, & Zhou, 2019) or when recalling harmful memories (Ladegaard, 2018). Moreover, bilingual speakers usually shout in their L1 when angry (Pavlenko, 2005), and perceive swear words more strongly in the L1 than in the L2 (Dewaele, 2004; Pavlenko, 2008). In the same vein, it has been shown that Chinese-English bilinguals spoke longer about embarrassing topics in the L2 (Bond & Lai, 1986), given the increased emotional distance of a later-acquired language (Bond & Lai, 1986; Caldwell-Harris, 2014; Pavlenko, 2008).
[Paper currently under review; please do not cite without authors' permission]
Experimental studies have shown that the reduced emotional response in an L2 allows people to be less affected by decision biases than when using their native language, as they rely on more systematic processes in decision-making (Keysar et al., 2012). Keysar et al. (2012) replicated the Asian disease problem with bilinguals, specifically observing the influence of the L2 on the framing effect, i.e. the fact that the way a given problem is presented, for instance negatively or positively, influences participants' reasoning about gains and losses. (The Asian disease problem is a famous experiment designed by Nobel laureate Daniel Kahneman and his colleague Amos Tversky (Tversky & Kahneman, 1981), in which participants were asked to choose among four programs to combat the outbreak of an unusual Asian disease expected to kill 600 people. The options were equivalent, but simply framed differently. Results robustly showed that subjects are risk-averse for gains and risk-seeking for losses.) Their results replicated the framing effect in a sample of monolingual participants, but the effect disappeared when bilinguals read the problem in an L2. These findings have proven consistent across different decision-making situations, such as moral dilemmas, financial decisions, and risk-taking behaviors (Costa, Foucart, Arnon, et al., 2014; Costa, Foucart, Hayakawa, et al., 2014; Geipel, Hadjichristidis, & Surian, 2015a; Hayakawa, Costa, Foucart, & Keysar, 2016; Keysar et al., 2012). The overall interpretation is that the framing effect is diminished when subjects take decision-making tests with an emotional basis in an L2 (Geipel et al., 2015a; Keysar et al., 2012).

This paper explores the effects of a specific type of affective information in a second language, verbal incivility, by replicating Anderson et al.'s (2014) study on the effects of nasty comments on the risk perception of emerging technologies, adding an L2 condition (Study 1), and further testing the findings in a different scenario, a debate on Twitter (Study 2). Based on the findings that civil comments are usually considered more rational and persuasive in an L1 (Chen & Ng, 2016; Popan et al., 2019), and that emotional terms have a weaker effect in an L2 (Pavlenko, 2008, 2012), we expect the civility of online comments to influence the risk perception of scientific topics differently depending on the language in which they are written. Finally, given that we used only monolingual discussions (either in participants' L1 or in the L2), in which participants discussed using the same language, we did not expect the status of the language (i.e. the prestige of English) to be used as a heuristic cue to evaluate the quality of the arguments (Rösner et al., 2014; Tan et al., 2008). Specifically, the following hypotheses will be tested:

Hypothesis 1. While reading in an L1, bilingual participants will align their risk perception with the view supported by civil comments to a greater extent than with the view supported by uncivil ones.
Hypothesis 2. While reading in an L2, bilingual participants will not change their risk perception, regardless of the civility of online comments.

Scores between 60% and 80% correspond to a CEFR B2 level, and participants scored 70.5 on average (SD = 6.2). In the main analyses, we used data from the 70 participants who completed all the tasks. Participants provided informed consent prior to participation, and all data were anonymized to ensure privacy.

Materials.
Texts. Participants read two online newspaper articles about the risks and benefits of nanotechnology and biofuels, adapted from the materials used by Knobloch-Westerwick, Johnson, Silver, and Westerwick (2015). These topics were selected because they were rather unfamiliar to students, thus avoiding biases caused by strong prior beliefs or personal involvement (Anderson et al., 2014; Tan et al., 2008). The articles presented the conflicts in a balanced way, providing one argument about the risks and another about the benefits of nanotechnology/biofuels (Table 1).
The articles simulated the layout of an online newspaper, with the BBC logotype on the top of the page and a frame that imitated a comment section at the bottom of the page.
As the original texts were written in English, materials were translated from English into Spanish by one of the authors and reviewed by native speakers of both languages for accuracy and authenticity. The English version of the text on nanotechnology was 411 words long (457 in Spanish), and the English version of the text on biofuels was 334 words long (409 in Spanish).
Online comments. Texts were followed by an anonymous comment between 56 and 72 words in length (M = 65.38, SD = 6.91), written in a civil or uncivil manner, which added a claim about either the risks or the benefits of the issue discussed in the text. The additional argument was the same for each topic, but it was preceded by a polite or impolite header. The polite headers were: 'In my opinion, these arguments are quite incorrect. I read myself a report that didn't present such good data' for biofuels, and: 'Honestly, I don't agree with the author of the article, as they seem to overlook some clear advantages of nanotechnology' for the other topic. The impolite headers were: 'This is a completely stupid article!! The author is an ignorant person, as I have read myself a report that didn't present such good data' for the biofuels news report, and: 'The author is an idiot. Please, investigate a bit on the advantages of nanotechnology before writing this shit!' for the nanotechnology news report.

Summary task. Participants wrote a short summary of each article in Spanish (their native language), even if they had read one of the texts in English. The number of textual arguments in the summaries (the sum of arguments drawn from the text and from the comment) was coded. For example, the following summary included a negative argument from the text (food prices rise due to the production of biofuels) and an additional idea from the comment (ethanol produces more pollution per unit of energy): I enjoyed reading about biofuels because I didn't know much about it. I think it's a big problem that farmers are more interested in the production of biofuels than food and that this will lead to a rise in the price of the latter. Regarding the benefits or disadvantages for the environment, I thought that biofuel was less polluting than gasoline, as it comes from petrol, but ethanol produces more pollution per unit of energy burned.
To ensure the reliability of the coding system, 10% of the summaries were rated by the two authors, resulting in high agreement (86.7%). Inconsistencies were resolved through discussion, and the first author coded the remaining summaries.

Design.
The experiment followed an incomplete 2 (civility: civil vs. uncivil) x 2 (language: L1 vs. L2) mixed design. All students read two texts, one in each language. For half of the students, the text read in the L1 included a civil comment and the L2 text an uncivil one; for the other half, the L1 text included an uncivil comment and the L2 text a civil one. Text topics were counterbalanced across languages and civility conditions.
The main dependent variable was attitude change toward the view reported in the comment. To calculate this index, we subtracted the centered initial support scores from the risk perception scores (inverted and centered), and adjusted the sign to the view expressed in the comment: a change toward a more positive attitude was kept as positive if the comment discussed a benefit, but its sign was flipped to negative if the comment discussed a risk. From these scores, three groups were created: a score higher than 0 indicated that participants' risk perception changed toward the view expressed in the comment, a score lower than 0 indicated that it changed in the opposite direction, and a score of 0 indicated that participants did not change their attitude toward the topic.
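For clarity, the computation of this index can be sketched in a few lines of Python. This is an illustrative reconstruction of the description above, not the authors' actual analysis script; the function names and the assumed scale midpoint (`scale_mid`) are hypothetical.

```python
def attitude_change(pre_support, post_risk, comment_view, scale_mid=5.0):
    """Attitude-change index toward the view expressed in the comment.

    pre_support  : initial support score (higher = more favorable)
    post_risk    : post-test risk perception (higher = riskier)
    comment_view : 'benefit' or 'risk', the view argued in the comment
    scale_mid    : assumed scale midpoint used for centering (hypothetical)
    """
    # Invert risk so that higher = more favorable, then center both scores
    post_attitude = -(post_risk - scale_mid)
    pre_attitude = pre_support - scale_mid
    change = post_attitude - pre_attitude
    # Adjust the sign to the comment's view: a positive change counts
    # toward the comment only if the comment argued a benefit
    if comment_view == 'risk':
        change = -change
    return change

def change_group(change):
    """Classify the index into the three groups used in the analyses."""
    if change > 0:
        return 'toward comment'
    if change < 0:
        return 'away from comment'
    return 'no change'
```

For instance, a participant whose risk perception rose after reading a risk-oriented comment would be classified as having moved toward the comment's view.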

Procedure.
The study took place in computer labs, in groups of approximately 15 participants, in 60-minute sessions. Participants were randomly assigned to one of the two sequences of the study: half of them read the text with a civil comment in the L1 and the text with an uncivil comment in the L2, while the other half read the text with an uncivil comment in the L1 and the text with a civil comment in the L2. Upon arrival, participants completed the pre-test measures. Then, they read the texts at their own pace. Afterwards, they completed the post-test measures and wrote a summary of each text. They were allowed to re-read the texts before writing the summaries. Finally, participants' English level was assessed by means of a brief standardized online test (Lemhöfer & Broersma, 2012).

Results
First, we checked participants' familiarity with and support for the topics used. Overall, participants indicated that they were not very familiar with the topics discussed (M nanotechnology = 3.2, SD = 1.7; M biofuels = 4.3, SD = 1.9), and they tended to hold neutral or slightly positive attitudes toward them (M nanotechnology = 5.8, SD = 1.8; M biofuels = 7.2, SD = 1.8). Next, to ensure that the manipulation was effective, we analyzed students' perception of the civility of the comments for each language. As expected, for the L1 texts participants rated the civil comment (M = 3.6, SD = 1.2) as more polite than the uncivil one (M = 1.6, SD = 1.0), t(68) = 7.0, p < .001, d = 1.7. More critically, for the L2 texts participants also rated the civil comment (M = 3.7, SD = 1.0) as more polite than the uncivil one (M = 1.8, SD = 1.2), t(68) = 7.3, p < .001, d = 1.9. Next, to ensure that participants understood the texts in both languages, we analyzed the quality of the summaries by language. Participants included a similar number of correct ideas in the summaries of the L2 texts (M = 2.1, SD = 0.4) and the L1 texts (M = 2.0, SD = 0.5), t(69) = 1.2, p = .18, d = 0.1.
In sum, the results indicated that participants identified uncivil online comments as impolite and understood the texts to a similar degree in the L1 and the L2. Next, we tested the hypothesis that comment civility would affect participants' attitudes toward the topic, contingent on the language of the text.

Comment civility and attitude change.
We used chi-square tests to analyze the effect of text language (L1 vs. L2) and comment civility (civil vs. uncivil) on participants' attitude change toward the view reported in the comment. Across subgroups, descriptive data revealed that the majority of participants did not change their attitude from pre- to post-test (percentages ranged from 39.5 to 60.5; see Table 2). Comparing groups across languages, when participants read L1 texts, analyses indicated that comment civility significantly affected participants' attitude change, χ2(2) = 10.21, p < .01, Cramér's V = .39. According to Cohen's (1988) effect size benchmarks, this is a large effect. But when participants read L2 texts, comment civility did not affect participants' attitude change, χ2(2) = 2.12, p = .34, Cramér's V = .18.
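A chi-square test of this kind, together with Cramér's V, can be sketched as follows. The contingency table below is invented purely for illustration (it is not the study's data); the analysis itself follows the standard formula V = sqrt(χ2 / (n(min(r, c) − 1))).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2 (civility) x 3 (attitude-change group) counts,
# ordered toward / no change / away -- illustration only
table = np.array([[12, 18, 5],    # civil comment
                  [4,  21, 10]])  # uncivil comment

# scipy applies the Yates correction only to 2x2 tables, so this
# 2x3 table is tested without correction
chi2, p, dof, expected = chi2_contingency(table)

# Cramér's V for an r x c table
n = table.sum()
v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}, Cramér's V = {v:.2f}")
```

The same call covers both reported tests; only the observed counts differ between the L1 and L2 conditions.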

[TABLE 2]
To interpret the effect, we conducted follow-up Wilcoxon rank tests comparing the number of participants who changed their attitude toward the view expressed in the comments against the others (i.e. no change or change against the comment).
Supporting Hypotheses 1 and 2, results showed that participants reading in the L1 changed their attitude more often toward the view expressed by civil, rather than uncivil, comments, Z = 2.8, p < .01, r = .33 (Hypothesis 1), while those reading in the L2 did not modify their attitude contingent on the civility of the comments, Z = 1.3, p = .17, r = .15 (Hypothesis 2).
Descriptive data suggested that the effect of comment civility in the L1 is largely due to participants endorsing the uncivil comment in the L1 to a lesser extent (approximately 41% less than in the L2), rather than endorsing the civil comment to a greater extent (approximately 5% more than in the L2). Of note, the change in attitude toward the view expressed in civil comments in the L1 is a medium effect, according to Cohen's (1988) benchmarks.

Discussion
Overall, the results of Study 1 partially replicate Anderson et al.'s (2014) findings.

Each discussion included 25 tweets of three different types: civil (10), uncivil (10), or neutral (5). Specifically, each tweet presented a brief argument in favor of or against the topic under debate, preceded by a short header, which was either polite or impolite. Impolite headers presented insults or rude words and were directed at third parties (e.g. 'What a bunch of idiots!'). Polite headers were written in the first person plural and in an educated way (e.g. 'We should learn more about that').
Neutral tweets were added as distractors. These tweets did not present arguments related to the topic, but only jokes, displays of interest, or claims about the need to deal with these or other topics (e.g. 'We can fight for or against nanotechnology, but on one thing we will all agree: it shows that size does matter').
We created two language versions of the tweets by translating the arguments into Spanish from the original English materials (Knobloch-Westerwick et al., 2015). Tweets were reviewed and slightly modified by native speakers of both languages to ensure that they could be perceived as regular tweets in each language. Tweets written in Spanish ranged from 128 to 206 characters (M = 171.33, SD = 18.49), while those written in English ranged from 111 to 199 characters (M = 156.19, SD = 21.97).
Materials were presented in an interface that simulated Twitter: alongside each tweet there was a square picture of the face of its fictitious author (half male and half female), and on top, the author's first and last name. Names corresponded to common Spanish first and last names; we did not mix Spanish and English names, to avoid potential effects of the perceived origin of the author or in-group identification (but see Walther et al., 2018). The interface showed the logo of the platform, but engagement metrics (likes, comments, retweets) and hashtags were not included.
We created two types of discussions by manipulating the percentage of civil comments favoring a particular view (i.e. for or against the topic; see Table 3). All discussions included the same number of tweets in favor of the topic (pro-topic) and against it (against-topic), as well as neutral tweets. On the one hand, in civil pro-topic discussions, 80% of the tweets including an argument in favor of the topic were civil and only 20% uncivil; conversely, 20% of the against-topic tweets were civil and 80% uncivil. On the other hand, in civil against-topic discussions, 80% of the tweets including an argument against the topic were civil and only 20% uncivil; conversely, 20% of the tweets in favor of the topic were civil and 80% uncivil. Topics were counterbalanced across languages and discussion civility.
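The composition just described can be sketched programmatically. This is an illustrative reconstruction under the stated 80/20 proportions; all names here (`build_discussion`, the placeholder `CIVIL`/`UNCIVIL` header labels) are hypothetical, and the actual materials were presented via an ad-hoc Visual Basic program.

```python
import random

def build_discussion(pro_args, against_args, neutral_tweets,
                     civil_side='pro', civil_share=0.8, seed=None):
    """Assemble one 25-tweet discussion: 10 pro, 10 against, 5 neutral.

    civil_side selects which camp receives the civil_share (80%) of
    civil headers; the opposing camp receives the complement (20%).
    """
    rng = random.Random(seed)

    def with_headers(args, share_civil):
        n_civil = round(len(args) * share_civil)
        headers = ['CIVIL'] * n_civil + ['UNCIVIL'] * (len(args) - n_civil)
        rng.shuffle(headers)  # which argument gets which header is random
        return list(zip(headers, args))

    pro_share = civil_share if civil_side == 'pro' else 1 - civil_share
    tweets = (with_headers(pro_args, pro_share)
              + with_headers(against_args, 1 - pro_share)
              + [('NEUTRAL', t) for t in neutral_tweets])
    rng.shuffle(tweets)  # randomized order of appearance
    return tweets
```

In a civil pro-topic discussion this yields 8 civil and 2 uncivil pro-topic tweets, 2 civil and 8 uncivil against-topic tweets, and 5 neutral distractors, in random order.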

[TABLE 3]
Tweets and discussions were carefully created to control for potential confounding of argument, civility, and language. Across participants, a tweet with an identical argument was preceded 50% of the time by a civil header and 50% of the time by an uncivil one. Similarly, the same tweet appeared 50% of the time in the L1 and 50% in the L2. The order of appearance of the tweets was randomized across participants. Topic assignment to condition (discussion type and language) was also randomized.
All participants followed two civil pro-topic discussions (one in the L1 and one in the L2; see the first two rows in Table 3) and two civil against-topic discussions (one in the L1 and one in the L2; see the last two rows in Table 3). Each discussion was presented separately, using an ad-hoc program written in Visual Basic for this project. Specifically, participants read one tweet at a time and pressed the space bar to indicate that they had finished reading it. To ensure that they processed the information from the tweets, after reading each tweet participants indicated its perceived relevance for the debate on a 1-to-9 scale. After the response, the program automatically presented a new tweet.
Questionnaires. We used the same questionnaires as in Study 1 to assess familiarity, initial support, and post-test risk perception, adapted for each of the four topics used. Attitude change was computed in the same way as in Study 1, by subtracting the pre-test support scores (centered) from the risk perception scores (inverted and centered).
Procedure. Participants worked individually in a quiet room at the university. The whole session lasted approximately 30 minutes. First, participants completed the pre-test measures. Afterwards, they were instructed to follow the debates on Twitter. To familiarize participants with the rating task and the computer program, the four discussions were preceded by an additional training debate written in the L1, composed of 10 tweets for or against the use of school uniforms. After we ensured they had understood the task, they read the four debates about nanotechnology, biofuels, fracking, and GMOs, rating each comment for relevance. A 1-minute pause was enforced after each discussion, to ensure that participants could elaborate on the topic they had just read. Finally, they completed the post-test measures. After completing all the tasks, they took a short English test (Lemhöfer & Broersma, 2012).

Results
As was the case in Study 1, familiarity with the topics tended to be low (M biofuels = 4.64,

Civility of discussion and attitude change
We used chi-square tests to analyze the effect of text language (L1 vs. L2) and discussion view (civil pro-topic vs. uncivil pro-topic) on participants' attitudes. When participants read Twitter discussions in the L2, the civility of the pro-topic supporters did not affect participants' attitude change, χ2(2) = 0.59, p = .75, Cramér's V = .17. But when participants read the discussions in their L1, the data revealed that the civility of the pro-topic supporters significantly affected participants' attitude change, χ2(2) = 8.0, p = .01, Cramér's V = .63 (i.e. a large effect; see Cohen, 1988). We conducted follow-up Wilcoxon rank tests to interpret the pattern of effects. Specifically, we compared the number of participants who changed their attitude toward the pro-topic view against the others (i.e. no change or change against the pro-topic view). Supporting our hypotheses, results showed that when participants read in the L1, they changed their attitude more often toward the pro-topic view when the majority of the pro-topic Twitter discussants expressed their view in a civil rather than an uncivil manner, Z = 2.7, p < .01, r = .58 (i.e. a large effect; see Cohen, 1988). When reading in the L2, participants did not change their attitude contingent on the civility of the posts, Z = 0.3, p = .76, r = .06. As shown in Table 4, descriptive data suggested that the effect of discussion civility in the L1 is due both to participants endorsing the pro-topic view to a greater extent when its supporters expressed themselves mostly in a civil manner (approximately 14% more than in the L2), and to participants moving away from the pro-topic view when its supporters discussed in an uncivil manner (approximately 9% less than in the L2). Of note, the effect of discussion civility in the L1 is a large effect, which speaks to its relevance.

Discussion
Unlike the scenario described in Study 1 (a balanced news report followed by an additional anonymous online comment), in these Twitter debates there was a potential discussion between educated and rude users, and readers did not have a balanced text to refer to when judging the relevance of the comments. The findings of Study 2 were consistent with those of Study 1 and with Hypothesis 2, showing that when reading a debate on emerging technologies in an L2, the civility or incivility of the tweets did not affect risk perception. On the contrary, as in Study 1, in debates written in participants' mother tongue, the civility of the discussion significantly affected post-test risk perception measures (Anderson et al., 2014). This means that, independently of the argument, civil tweets in the L1 have a greater effect on attitude change, while uncivil tweets written in the L1 are less influential (Chen & Ng, 2016; Popan et al., 2019). Thus, the results of this study are also consistent with Hypothesis 1.

General Discussion
The present study tested whether uncivil online comments about emerging technologies written in a foreign language also have a polarizing effect on the attitudes of bilingual people, as shown in a study conducted in English with a representative sample of the US population (Anderson et al., 2014). Drawing on the literature on emotional processing in an L2 (Costa, Foucart, Arnon, et al., 2014; Keysar et al., 2012), we expected that uncivil comments written in the L2 would play a less relevant role in polarizing attitudes about these socio-scientific issues. Overall, the findings support, with medium and large effect sizes, the hypothesis that the civility of comments written in the L1 significantly affects risk perception in populations without extreme prior attitudes toward emerging technologies (Anderson et al., 2014), while no effect is observed when users read comments in the L2. Specifically, in their mother tongue, participants gave more support to views conveyed by educated messages and less to opinions supported by rude messages. This is likely because users of profanity generate less favorable impressions: individuals using uncivil expressions may be considered less trustworthy (DeFrank & Kahlbaugh, 2019) or less rational (Popan et al., 2019). But in a foreign language, participants did not change their support for a particular view, regardless of the civility of the online comments.
As the scenarios used in our studies included a similar number of relevant arguments for and against the topics, a rational response was not to change one's original perception of the controversy. We hypothesized that in the L2 participants experienced weaker emotional reactions to civil and uncivil online comments (see Pavlenko, 2008, 2012), which allowed for a more rational approach to interpreting the socio-scientific controversy.
Originally, the 'nasty effect' was found in a study using online news commentaries with English-speaking participants (Anderson et al., 2014). Of note, we not only replicated the effect in a language other than English with bilingual participants (i.e. Spanish was our participants' mother tongue), but also in a totally different online scenario (Twitter). Specifically, we replicated the effects of online civility and language (L1 or L2) in two different online scenarios: online news comment sections and a social media platform (Twitter). In the online news deliberation scenario, uncivil comments were directed against a formal account of the controversy written by a journalist. This situation could have increased participants' perception of the communication exchange as unfair, since the news reporter could not counterargue the commenter. One may argue that such an unfair situation, and not the civility of the comments, could partially explain participants' negative reaction to uncivil online comments. In the second study, using Twitter as the scenario, we ruled out this possibility, as both proponents and opponents of particular views on socio-scientific controversies could voice their positions in a more or less polite manner. The fact that both studies converge on similar results suggests that comment civility, and not the interactivity of the online scenario, is responsible for participants' attitude change when reading in their L1.
Our studies used single-language discussions, and therefore limited participants' use of the prestige of English as a heuristic to evaluate the comments (Rösner et al., 2014; Tan et al., 2008). In multilingual online discussions, the civility of comments, the prestige of a language, and the mother tongue could interact in complex ways. For example, the positive influence of a civil comment in the L1 found in our studies could be reduced if that comment addressed a text in an L2 with a higher perceived status. Future studies should address this issue.
This study also comes with limitations. In both experiments, participants self-assessed their knowledge of and support for emerging technologies, but as in Anderson et al.'s (2014) study, their actual knowledge was not tested. High background knowledge of a controversy could allow participants to move beyond the civility of the comments and to focus on the conceptual discussion and the reliability of the sources (Bråten, Strømsø, & Salmerón, 2011). In addition, our participants did not hold strong attitudes toward the topics discussed in the studies, which could have limited the emotional reactions elicited by the text arguments (List & Alexander, 2017). In the future, more controversial topics could be chosen, such as national politics, feminism, or politicized socio-scientific issues (e.g. vaccines, abortion, the climate crisis), where participants' previous attitudes and emotions may be stronger (Kleinke, 2008; Sampietro & Valera-Ordaz, 2015), as exemplified by Chen and Ng's (2016) work on the issue of abortion. Future studies could test whether these findings also hold for comments written in an L2, or whether the argument is more influential in a second language, as could be expected from our results.
Another limitation concerns our sample of bilingual participants, as most of them acquired the second language (i.e. English) through formal education during adolescence and adulthood. Although English as a second language is quite popular in Spain, participants' mother tongue (i.e. Spanish) remains predominant in society. The results may not generalize to people with different linguistic trajectories, such as those who live in bilingual countries or regions, or immigrants, because early age of acquisition, high proficiency, language learning via immersion, and high usage frequency may increase the emotional resonance of a second language (Caldwell-Harris, 2014). Future research should replicate our experiments with people with different L2 levels or with different languages, in order to test differences in attitude change depending on proficiency in the second language.