Psychometric properties and validation of the Spanish versions of the Overall Anxiety and Depression Severity and Impairment Scales

Background: Anxiety and depressive disorders are the most frequent disorders for which patients seek care in public health settings in Spain. This study aimed at validating the Overall Anxiety Severity and Impairment Scale (OASIS) and the Overall Depression Severity and Impairment Scale (ODSIS), which are brief screening scales for anxiety and depression consisting of only five items each. Methods: The study was conducted in a Spanish clinical sample receiving outpatient mental health treatment (N = 339). A subsample of participants (n = 219) was assessed before and after receiving a course of cognitive-behavioral treatment. Results: The results revealed excellent internal consistency estimates (Cronbach's alpha for the OASIS and the ODSIS was 0.87 and 0.94, respectively), along with promising convergent and discriminant validity and test-criterion relationships (i.e., moderate correlations with other measures of depression and anxiety, as well as with neuroticism, quality of life, adjustment, and negative affect). A one-dimensional structure was obtained for the OASIS and the ODSIS. The ROC analyses indicated an area under the curve of 0.83 for the OASIS and the ODSIS when predicting moderate-to-severe anxiety and depression, respectively. Good sensitivity to therapeutic change was also evident, and the analysis of sensitivity as a function of 1-specificity suggested a cutoff value of


Introduction
Anxiety and depressive disorders, also known as emotional disorders (EDs; Barlow, 1991), are common conditions that negatively impact functioning (Campbell-Sills et al., 2009). According to the World Health Organization (WHO), in Spain alone, anxiety and depressive disorders affect nearly 2 million and 2.5 million people, respectively, which amounts to up to 5.2% of the population in the country (WHO, 2017). Not surprisingly, EDs are highly prevalent in primary care and specialized mental health settings in Spain (Gutiérrez-Fraile et al., 2011; Muñoz-Navarro et al., 2017; Ruiz-Rodríguez et al., 2017). The present study aims at adapting two short screening tools for anxiety and depressive symptoms to be used in Spanish public care.
Several authors have indicated that diagnosis and monitoring of patients with EDs in Spanish public health settings is problematic. For example, a study by Castro-Rodríguez and colleagues (2015) suggested that half of the patients with an ED are not correctly diagnosed in primary care. Consequently, a large number of individuals with ED are not properly referred to specialized care and do not receive the most optimal treatment (Martín Jurado et al., 2012). While there might be several factors explaining this, some authors have attributed the previous to healthcare overload, time

ACCEPTED MANUSCRIPT
constraints during consultations, and family physicians' insufficient familiarity with the assessment and treatment of psychiatric disorders (Muñoz-Navarro et al., 2017). It is at this first level of care where correct identification of those who need specialized interventions is of paramount importance, so quick and straightforward assessment tools for the early detection of EDs are fundamental. A similar problem occurs in specialized mental health settings in Spain, where limited resources (e.g., reduced number of professionals, especially psychologists) have been shown to negatively impact the quality of services provided (Gabilondo et al., 2011). An example of this is the long waiting lists that currently exist in Spanish public mental health settings, that is, from 45 to up to 75 days between appointments (Chueca et al., 2003; González et al., 2008). A quick and feasible assessment tool for the evaluation of anxiety and depressive symptoms to be used in routine care would benefit psychologists and psychiatrists by reducing the time they require for the assessment of the patients' progress through treatment.
Many self-report screening measures have been designed to assess specific diagnostic categories (e.g., Panic Disorder Severity Scale [PDSS]; Shear et al., 2001). Although these scales are useful to ascertain whether an individual is likely to meet criteria for a specific disorder, the administration of a large number of screening measures (e.g., one for each ED diagnosis) is not feasible in public health settings. Thus, transdiagnostic measures that capture the spectrum of ED presentations may be particularly useful in such settings (Campbell-Sills et al., 2009). Furthermore, traditional self-report measures typically assume that the frequency of symptoms reflects the severity of the disorder, while ignoring the key role of functional impairment or distress for diagnosis (Ito et al., 2014).
The Overall Anxiety Severity and Impairment Scale (OASIS; Norman et al., 2006) and the Overall Depression Severity and Impairment Scale (ODSIS; Bentley et al., 2014) were developed in response to the limitations of existing screening tools. Both instruments were designed to assess transdiagnostic symptoms and functioning associated with anxiety and depressive disorders, respectively. The OASIS and the ODSIS have five items each and can be administered in only 2-3 minutes. Both measures have been shown to be useful in the assessment of patients' progress through treatment (Barlow et al., 2017; Norman et al., 2013; Osma et al., 2015). The OASIS has been validated in English (Norman et al., 2006), Japanese (Ito et al., 2014), and Dutch (Hermans et al., 2014). An online version of this measure has also recently been validated in Spanish in a sample of individuals with milder depression and anxiety (González-Robles et al., 2018). A single-factor model including all items has been replicated across investigations (Bragdon et al., 2016; Campbell-Sills et al., 2009; González-Robles et al., 2018; Hermans et al., 2014; Ito et al., 2014; Moore et al., 2015; Norman et al., 2013; Norman et al., 2011; Norman et al., 2006) and good internal consistency (i.e., between .80 and .94) has been reported (Hermans et al., 2014; Ito et al., 2014; Norman et al., 2006).
Similar to the OASIS, the ODSIS has been validated in English (Bentley et al., 2014) and Japanese (Ito et al., 2015). A single-factor solution has been proposed for the ODSIS and internal consistency findings have been excellent (Bentley et al., 2014; Ito et al., 2015). The cutoff points for the ODSIS have ranged between 7 and 11 across clinical and non-clinical samples (Bentley et al., 2014; Ito et al., 2015).

The goal of the present study was to replicate and extend prior research on the psychometric properties of the OASIS and the ODSIS in a Spanish outpatient sample of mental health patients with EDs. Specifically, we aimed at replicating the findings of the validations conducted in the USA, the Netherlands, Spain, and Japan, thus providing further evidence for the transcultural validity of these scales. The study also had a novel goal, that is, to determine the sensitivity of both measures to therapeutic change after a transdiagnostic, cognitive-behavioral treatment for EDs, namely, the Unified Protocol (Barlow et al., 2017; Blind note).
The following hypotheses were proposed: 1) for both scales, a one-dimensional structure will be replicated; 2) an acceptable internal consistency will be obtained; 3) convergent and discriminant validity will be established through significant positive correlations between the OASIS and the ODSIS and measures assessing similar constructs, as well as weaker correlations with measures of different constructs; 4) the ROC curve will reveal an optimal cut-off point of approximately 10 for both measures; and finally, 5) adequate sensitivity to therapeutic change will be indicated by significant and positive correlations between the change detected by the reference measurements (the Beck Anxiety Inventory and Beck Depression Inventory-II) and the change detected by the OASIS and the ODSIS.

Participants
Participants were drawn from a large effectiveness trial of the Unified Protocol (UP) in the Spanish public mental health system (Blind note). The sample for the present study was composed of 339 participants, of whom 272 were women (80.2%). Mean age in the sample was 42.63 years (SD = 12.79, range = 18-77). Table 1 shows additional socio-demographic characteristics. The selected participants were adults seeking psychological assistance in the national health system. The sample was obtained in 8 public mental health centers in Spain after the approval of their respective ethics and research committees (for more information, Blind note): Blind note. All participants had a primary ED diagnosis. The inclusion criteria to participate in the study were: 1) Principal diagnosis (most interfering and severe) of anxiety disorder, mood disorder, or adjustment disorder; 2) The patient is over 18 years of age; 3) The patient is fluent in the language in which therapy is performed (Spanish in the present study); 4) The patient is able to attend the evaluation and treatment sessions and signs the informed consent form; 5) Patients taking pharmacological treatment for their emotional disorder are asked to maintain the same dosages and medications for at least 3 months prior to enrolling in the study and during the whole treatment. The exclusion criteria were: 1) The patient presents a severe condition that would need to be prioritized for treatment, so that an interaction between both interventions cannot be ruled out. These include a severe mental disorder (bipolar disorder, schizophrenia, or an organic mental disorder), suicide risk at the time of assessment, or substance use in the last three months (excluding cannabis, coffee, and/or nicotine); 2) The patient has previously received 8 or more sessions of psychological treatment with clear and identifiable CBT principles within the past 5 years.
From the total sample (N = 339), we extracted the subsample of 219 participants who completed two assessments (one before psychological treatment onset and the other at the end of treatment) to assess the sensitivity to therapeutic change of the OASIS and the ODSIS. Mean age in this subsample was 43.25 years (SD = 12.21, range = 18-72) and 79.5% of them were women. When comparing those who completed both evaluations with participants who completed only one assessment, we found no significant differences in the OASIS (t = .184, p < .366), the ODSIS (t = .210, p < .834), the BAI (t = .306, p < .481), or the BDI-II (t = .901, p < .246).

Procedure
When seeking care at a Spanish public mental health center, potential participants were evaluated by professional specialists (clinical psychologists, psychiatrists, or resident psychologists) with the lifetime version of the Anxiety Disorders Interview Schedule for DSM-IV (ADIS-IV-L; Brown et al., 1994) or the Structured Clinical Interview for DSM-IV Axis I Disorders, Clinician Version (SCID-I-CV; First et al., 1997). This was done to confirm the existence of a primary diagnosis of ED. We did not use the ADIS-5 because it is not yet available in Spanish. Participants meeting the requirements for the primary study (i.e., Blind note) were given information about data confidentiality and were asked to provide written informed consent. The same day, they completed the evaluation tools in person, in paper-and-pencil format.
The proposed treatment is a cognitive-behavioral intervention called the Unified Protocol for the Transdiagnostic Treatment of Emotional Disorders (UP; Barlow et al., 2018). The UP focuses on the identification of dysfunctional emotion regulation strategies that are common to all emotional disorders and helps patients to tolerate and cope with emotions in a more adaptive way (Barlow et al., 2018). The UP consists of 8 core intervention modules that can easily be delivered in group format (Bullis et al., 2015; Osma et al., 2015). For the present study, a 12-session group format was conducted in different mental health settings in Spain. The treatment had a duration of approximately three months and patients received a 2-hour session weekly. Both the patient and the therapist manuals have been previously translated into Spanish (Barlow et al., 2015).

Instruments
Anxiety Disorders Interview Schedule for DSM-IV, Lifetime version (ADIS-IV-L; Brown et al., 1994), translated into Spanish by Botella and Ballester (1997). This semi-structured interview evaluates anxiety and depressive disorders, as well as somatoform and substance use disorders, according to the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; APA, 1994). Test-retest reliability of this interview ranges from .67 to .86 (Brown et al., 2001).
Structured Clinical Interview for DSM-IV Axis I Disorders, Clinician Version (SCID-I-CV; First et al., 1997), adapted and validated into Spanish by First et al. (1999). This semi-structured interview is used to establish DSM-IV Axis I diagnoses (APA, 1994). The response scale ranges from 1 (Symptom absent or false) to 3 (True or clinically present). Test-retest reliability varies from .37 to 1, depending on the study (Rogers, 2001).
Beck Depression Inventory-II (BDI-II; Beck et al., 1996), validated into Spanish by . The BDI-II assesses depressive symptoms. It consists of 21 items, each with four different statements that reflect an increase in the degree of depression. A score of 0 indicates the absence of depressive symptoms and 3 reflects the most severe levels of depression. The scale has a 0-to-63 range. In the Spanish validation with a clinical sample, a Cronbach's alpha of .89 was obtained (Sanz et al., 2005). Cronbach's alpha for the BDI-II in the present sample is .917. The cut-off point used is 20, as suggested in the literature .
Beck Anxiety Inventory (BAI; Beck & Steer, 1993), validated into Spanish by Sanz and Navarro (2003). The BAI is composed of 21 items that evaluate anxiety symptoms. Responses use a 4-point Likert scale ranging from 0 (Not at all) to 3 (Severely). The total score can range from 0 to 63. The Spanish validation study in a clinical sample revealed a Cronbach alpha of .91 (Sanz et al., 2012). Cronbach's alpha for the BAI in the present sample is .926. The cut-off point used is 16, as recommended in previous research (Sanz & Navarro, 2003).
Positive and Negative Affect Scale (PANAS; Watson et al., 1988), validated into Spanish by Sandín et al. (1999). The PANAS consists of 20 items that measure both positive and negative affect, with 10 items for each dimension. Respondents indicate the extent to which they experience different emotions representing both dimensions. Items use a 5-point Likert response scale ranging from 1 (Not at all or very slightly) to 5 (Extremely). In its Spanish validation, Cronbach's alphas between .87 and .91 were obtained for both men and women in positive and negative affect (Sandín et al., 1999). In the present study, Cronbach's alphas for the PANAS are .889 for Positive Affect and .893 for Negative Affect.
Quality of Life Index (QLI; Ferrans & Powers, 1985), validated in Spanish by Mezzich et al. (2000). This measure is comprised of 10 items, each of which refers to a different dimension of quality of life. Each item is evaluated on a 10-point scale ranging from 1 (Bad) to 10 (Excellent). An overall index is calculated by summing all items (total range is 10 to 100). In the Spanish validation study (Mezzich et al., 2000), the QLI obtained a Cronbach alpha of .89. The Cronbach's alpha in the present sample is .869.
Scale of Maladjustment (SM; Echeburúa et al., 2000). This scale reflects the extent to which people's current problems affect different areas of their daily lives, including work or studies, social life, leisure time, couple relationship, and day-to-day maladjustment. All items are evaluated on a 6-point response scale ranging from 0 (Not at all) to 5 (Very severe). The total score ranges from 0 to 30, and is obtained by summing all items. Higher scores indicate poorer adjustment. The scale had excellent test-retest reliability (r = .86) and an internal consistency of .94 (Echeburúa et al., 2000). Cronbach's alpha in the present sample is .830.
NEO Five-Factor Inventory (NEO-FFI; Costa & McCrae, 1999). The NEO-FFI consists of 60 items that provide a quick and general measurement of the Big Five personality dimensions. These are neuroticism, extraversion, openness to experience, agreeableness, and conscientiousness. The responses are rated on a 5-point Likert scale ranging from 0 (Strongly disagree) to 4 (Strongly agree). Each personality dimension is composed of 12 items (scores for each dimension range from 0 to 48). The internal consistency of the NEO-FFI dimensions ranges between .82 and .90. Cronbach's alpha values in the present sample are .760 for Neuroticism, .809 for Extraversion, .769 for Openness to Experience, .709 for Agreeableness, and .708 for Conscientiousness.

Overall Anxiety Severity and Impairment Scale (OASIS; Norman et al., 2006). The scale comprises five items. These items relate to the frequency of symptoms (During the last week, did you often feel anxious?), their intensity (During the last week, when you felt anxious, to what extent was your anxiety intense or severe?), their interference with the person's work or school life (During the last week, to what extent did anxiety interfere with your ability to do the things you had to do regarding work, school, or your home?), and their interference with social life (During the last week, to what extent has anxiety interfered with your social life and your relationships?). All items are rated on a 5-point Likert scale ranging from 0 (I didn't feel anxious) to 4 (Constant anxiety). The total scale score, which ranges from 0 to 20, is obtained by summing all items. Higher scores are indicative of greater severity and functional impairment as a result of anxious symptoms (Norman et al., 2006). For this study, we used the Spanish translation of the original scales (Appendix 1), which was carried out by Osma and García-Palacios by means of a back-translation process to ensure conceptual equivalence. Forward translation into Spanish was carried out by an independent native-speaking translator who was proficient in English. The Spanish version was then back-translated into English by an independent native English-speaking translator. No significant changes were required to obtain the final version. Cronbach's alpha for the OASIS in the present sample is presented in the results section.
Overall Depression Severity and Impairment Scale (ODSIS; Bentley et al., 2014). Similar to the OASIS, the ODSIS includes five items, which relate to the frequency of symptoms (During the last week, did you often feel depressed?), their intensity (During the last week, when you felt depressed, to what extent was your depression intense or severe?), their interference with the person's work or school life (During the last week, to what extent did depression interfere with your ability to do the things you had to do regarding work, school, or your home?), and their interference with social life (During the last week, to what extent has depression interfered with your social life and your relationships?). The third item of the OASIS and the ODSIS differs, however. In the OASIS, this item evaluates the presence of avoidant behavior (During the last week, how often have you avoided situations, places, objects, or activities because of your anxiety or fear?), whereas in the ODSIS it measures loss of interest or difficulty in participating in activities (During the last week, how often have you had difficulty carrying out or feeling interest in activities that you normally enjoy because of your depression?). Items in the ODSIS use a 5-point Likert response scale ranging from 0 (I didn't feel depressed) to 4 (Constant depression). Again, total scores range from 0 to 20 after summing all items, with higher values indicating greater severity and functional impairment as a result of depressive symptoms (Bentley et al., 2014). We followed the same translation process as for the OASIS to obtain the Spanish version of the ODSIS, which can be found in Appendix 1. As with the OASIS, Cronbach's alpha for the ODSIS in the present sample is presented in the results section.
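The scoring rule described for both scales (five items, each rated from 0 to 4 and summed into a 0-20 total) can be sketched as follows. This is only an illustration and not part of the original scoring materials; the helper name `total_score` and the example ratings are hypothetical.

```python
from typing import Sequence

N_ITEMS = 5                      # the OASIS and the ODSIS have five items each
MIN_RATING, MAX_RATING = 0, 4    # each item is rated from 0 to 4


def total_score(item_ratings: Sequence[int]) -> int:
    """Sum five item ratings into a 0-20 total (hypothetical helper)."""
    if len(item_ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(item_ratings)}")
    for rating in item_ratings:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating {rating} is outside {MIN_RATING}-{MAX_RATING}")
    return sum(item_ratings)


# Hypothetical respondent, invented for illustration only
example_ratings = [3, 2, 1, 2, 2]
example_total = total_score(example_ratings)
```

Validating the number and range of ratings before summing guards against incomplete or mis-keyed questionnaires, which would otherwise silently yield a total outside the 0-20 range.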

Demographic and sex differences in study variables
Analyses were carried out using the statistical package IBM SPSS Statistics version 22.0 for Windows (IBM Corp., 2013) and Mplus version 6 (Muthen & Muthen, 1998). First, sociodemographic characteristics of the total sample were analyzed (N = 339), as well as the means and standard deviations for all measures. Next, sex differences in the OASIS and the ODSIS were investigated. Because scores in men and women were not normally distributed (Kolmogorov-Smirnov normality test), a non-parametric Mann-Whitney U test was selected for this analysis.

Model fit, internal consistency, and convergent and discriminant validity
Next, we investigated the internal structure and the convergent and discriminant validity of the OASIS and the ODSIS. The internal structure of the OASIS and the ODSIS was examined with a Confirmatory Factor Analysis (CFA). Because we had no missing data, we were able to use a maximum likelihood estimator with standard errors and a mean-adjusted chi-square test statistic that are robust to non-normality (MLM). The one-factor model fit was evaluated with the Satorra-Bentler robust chi-square test, the root mean square error of approximation (RMSEA), the comparative fit index (CFI), the Tucker-Lewis index (TLI), and the standardized root mean square residual (SRMR). RMSEA and SRMR values below .05 indicate good model fit and values below .08 reflect acceptable fit. CFI and TLI values above .95 reveal an excellent fit (Hu & Bentler, 1999). The convergent and discriminant validity of the OASIS and the ODSIS was explored by calculating their Pearson correlations with well-established measures of anxiety (BAI) and depression (BDI-II), respectively. To test the hypothesis that the OASIS would be more strongly associated with the BAI and the ODSIS would correlate more strongly with the BDI-II, the strength of these bivariate associations was compared using the cocor R package (Diedenhofen & Musch, 2015). Pearson correlations with several constructs closely related to anxiety and depressive symptoms (i.e., personality, affect, quality of life, and adjustment) were also computed to investigate the convergent and discriminant validity of the OASIS and the ODSIS.
Specifically, the correlations between the aforementioned related constructs and the OASIS and the ODSIS were expected to be significant (i.e., a positive association with neuroticism, negative affect, and maladjustment, and a negative correlation with extraversion, positive affect, and quality of life or similar constructs), but smaller than the correlations of the OASIS and the ODSIS with the BAI and the BDI-II (e.g., Campbell-Sills et al., 2009; Ito et al., 2015).
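The fit benchmarks quoted above (RMSEA and SRMR below .05 for good fit and below .08 for acceptable fit; CFI and TLI above .95 for excellent fit) can be written as a small lookup. This is a sketch for illustration only: the function names, the wording of the labels for values that miss the benchmarks, and the example fit values are all hypothetical and do not come from the study.

```python
def rate_error_index(value: float) -> str:
    """Rate an RMSEA or SRMR value against the .05 and .08 benchmarks."""
    if value < 0.05:
        return "good"
    if value < 0.08:
        return "acceptable"
    return "not below the .08 benchmark"  # label chosen for this sketch


def rate_incremental_index(value: float) -> str:
    """Rate a CFI or TLI value against the .95 benchmark."""
    if value > 0.95:
        return "excellent"
    return "not above the .95 benchmark"  # label chosen for this sketch


# Hypothetical fit values, invented for illustration only
hypothetical_fit = {"RMSEA": 0.06, "SRMR": 0.03, "CFI": 0.97, "TLI": 0.94}
ratings = {
    name: (rate_error_index(value) if name in ("RMSEA", "SRMR")
           else rate_incremental_index(value))
    for name, value in hypothetical_fit.items()
}
```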

Optimal cut-offs and sensitivity to change
As a final step, we calculated the optimal cut-off points for the two aforementioned scales using a ROC curve analysis, as well as their sensitivity to capture therapeutic change. In relation to the former goal, sensitivity and specificity were investigated for several cut-offs, following the guidelines of Zweig and Campbell (1993). Positive and negative predictive values and likelihood ratios were also investigated. The sensitivity and specificity, as well as the positive and negative predictive values and likelihood ratios of the OASIS and the ODSIS, were calculated in relation to the BAI and BDI-II, respectively. The BAI and BDI-II cut-offs used to classify individuals as severely anxious and depressed were 16 and 20, respectively (Sanz, 2014;. Regarding sensitivity to therapeutic change, we first calculated change scores for the OASIS, the ODSIS, the BAI, and the BDI-II by subtracting the total score after treatment from the total score prior to treatment. Next, Pearson correlations were computed between the OASIS and BAI change scores and between the ODSIS and BDI-II change scores. We also calculated Cohen's d effect sizes to help interpret the findings. This analysis was carried out with the subsample of participants (n = 219) who were assessed at both time points (pre-treatment and post-treatment).

Results
As shown in Table 1, the most frequent primary diagnoses in the sample (N = 339) were adjustment disorders (n = 77, 22.7%), followed by major depressive disorders (n = 62, 18.3%), and generalized anxiety disorders (n = 43, 12.7%). Ninety-nine participants (26.8%) also had a secondary diagnosis, of which major depressive disorder (n = 19, 19.2%) and non-specific anxiety disorder (n = 14, 14.1%) were the most frequent. Table 2 shows the mean scores and standard deviations for all instruments. The average score for the OASIS and the ODSIS was 10.45 (SD = 4.49, range 0-20) and 9.87 (SD = 5.14, range 0-20), respectively.
The analysis of internal consistency indicated good reliability of the OASIS (α = .867). Table 2 also reports the results of the convergent and discriminant validity of the OASIS, as reflected by Pearson correlations. The OASIS correlated moderately with both the BAI (r = .57, p < .001) and the BDI-II (r = .60, p < .001). According to the results obtained with the cocor R package, both correlations were comparable in strength (the 95% CI for the difference includes zero: -.0964, .0356). The OASIS and the ODSIS clearly correlated with each other (r = .69, p < .001). Regarding the criterion validity of the OASIS, the scale correlated positively with measures of neuroticism (r = .43), maladjustment (r = .57), and negative affect (r = .49), and was negatively associated with measures of extraversion (r = -.30), quality of life (r = -.57), and positive affect (r = -.34).
All correlations were significant at an alpha level of .001.
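For readers unfamiliar with the internal consistency coefficient reported here, the standard Cronbach's alpha formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score), can be sketched in a few lines. The analyses in this study were run in SPSS and Mplus; the code below is only an illustration of the formula, and the function name and the response table are hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per respondent, one column per item."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the summed total score
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings (0-4) from six respondents on a five-item scale
hypothetical_responses = [
    [3, 3, 2, 3, 2],
    [1, 1, 0, 1, 1],
    [4, 3, 4, 4, 3],
    [2, 2, 1, 2, 2],
    [0, 1, 0, 0, 1],
    [3, 4, 3, 3, 3],
]
alpha = cronbach_alpha(hypothetical_responses)
```

Alpha approaches 1 when respondents who score high on one item also score high on the others, so that most of the variance in the total score is shared across items.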
The final analyses for the OASIS included an exploration of its optimal cut-off points, followed by an analysis of its sensitivity to change. In relation to the former, the ROC analysis indicated an area under the curve of .83 when predicting moderate-to-severe anxiety as measured with the BAI (95% CI = .78, .88). We considered 10 to be the best cut-off point to discriminate between people who had moderate-to-severe anxiety symptoms and those with milder or no symptoms because of the good balance between sensitivity (.75) and specificity (.73). With this cut-off point, 74.9% of participants were correctly classified (Youden Index of .49). Positive and negative predictive values were 91.4% and 38.8%, respectively. Positive and negative likelihood ratios were 2.8 and 0.3, respectively. The analysis of sensitivity to change (n = 219) revealed a correlation of .40 (p < .001) between the BAI and the OASIS. Cohen's d effect size was .50, showing moderate differences between assessments.
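The quantities used to choose a cut-off can be made concrete with a short sketch. It uses the standard definitions: Youden index = sensitivity + specificity - 1, positive likelihood ratio = sensitivity / (1 - specificity), and negative likelihood ratio = (1 - sensitivity) / specificity. The function names and the small data set are hypothetical, and treating a score at or above the cut-off as a positive screen is an assumption of this sketch, since the text does not state the convention used.

```python
from typing import Iterable, Sequence, Tuple


def sensitivity_specificity(scores: Sequence[int], is_case: Sequence[bool],
                            cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity, counting score >= cutoff as a positive screen."""
    tp = fn = tn = fp = 0
    for score, case in zip(scores, is_case):
        positive = score >= cutoff
        if case and positive:
            tp += 1
        elif case:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity: float, specificity: float) -> float:
    return sensitivity + specificity - 1


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Positive and negative likelihood ratios."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


def best_cutoff(scores: Sequence[int], is_case: Sequence[bool],
                candidates: Iterable[int] = range(1, 21)) -> int:
    """Cut-off with the highest Youden index (first one wins on ties)."""
    return max(candidates,
               key=lambda c: youden_index(*sensitivity_specificity(scores, is_case, c)))


# Hypothetical screening totals and reference classifications (illustration only)
toy_scores = [2, 4, 6, 8, 9, 7, 10, 12, 14, 16]
toy_is_case = [False, False, False, False, False, True, True, True, True, True]
toy_cutoff = best_cutoff(toy_scores, toy_is_case)

# Recomputing from the sensitivity (.75) and specificity (.73) reported for the OASIS
oasis_youden = youden_index(0.75, 0.73)
oasis_lr_pos, oasis_lr_neg = likelihood_ratios(0.75, 0.73)
```

With the reported OASIS values, this gives likelihood ratios of about 2.8 and 0.3, in line with the figures above, and a Youden index of about .48; the small gap from the reported .49 may simply reflect rounding of the reported sensitivity and specificity.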
Similar to the OASIS, the analysis of internal consistency indicated excellent reliability of the ODSIS (α = .936). The results of the convergent and discriminant validity of the ODSIS are reported in Table 2. The ODSIS correlated moderately with the BDI-II (r = .68, p < .001) and weakly with the BAI (r = .42, p < .001). Unlike the results obtained with the OASIS, the analyses indicated differences in strength between the aforementioned correlations (the 95% CI for the difference did not include zero: .1919, .3335). In the assessment of criterion validity, the ODSIS was found to correlate positively with measures of neuroticism (r = .43), maladjustment (r = .58), and negative affect (r = .42), and was negatively associated with measures of extraversion (r = -.33), quality of life (r = -.64), and positive affect (r = -.48). All correlations were significant at an alpha level of .001.
We next explored the optimal cut-off value in the ODSIS to discriminate individuals with moderate-to-severe depressive symptoms in the BDI-II from those with milder or no symptoms. The ROC analysis indicated an area under the curve of .83 (95% CI = .78, .87). Again, a cut-off of 10 was considered to provide an adequate balance between sensitivity (.74) and specificity (.79). Using this cut-off score, 75.2% of individuals were correctly classified (Youden Index of .53). Positive and negative predictive values were 91.7% and 48.4%, respectively. Positive and negative likelihood ratios were 3.5 and 0.3, respectively. Finally, the analysis of sensitivity to change (n = 219) indicated a correlation of .56 (p < .001) between the BDI-II and the ODSIS. Cohen's d effect size was .73, showing moderate differences between both evaluations.
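The sensitivity-to-change analysis (change score = pre-treatment total minus post-treatment total, followed by a Pearson correlation between the change scores of the screening scale and the reference measure) can be sketched as below. All data are hypothetical and the function names are invented for illustration. The text does not say which variant of Cohen's d was computed, so the sketch assumes the pre-post mean difference divided by the pooled standard deviation of the two assessments.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence


def change_scores(pre: Sequence[float], post: Sequence[float]) -> List[float]:
    """Pre-treatment total minus post-treatment total for each participant."""
    return [before - after for before, after in zip(pre, post)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohens_d(pre: Sequence[float], post: Sequence[float]) -> float:
    """Assumed variant: mean difference over the pooled SD of pre and post."""
    pooled_sd = sqrt((variance(pre) + variance(post)) / 2)
    return (mean(pre) - mean(post)) / pooled_sd


# Hypothetical totals for five participants (illustration only)
scale_pre, scale_post = [12, 10, 14, 9, 11], [8, 9, 9, 7, 10]
reference_pre, reference_post = [25, 18, 30, 16, 20], [15, 16, 18, 12, 19]

scale_change = change_scores(scale_pre, scale_post)
reference_change = change_scores(reference_pre, reference_post)
r_change = pearson_r(scale_change, reference_change)
d_scale = cohens_d(scale_pre, scale_post)
```

A positive correlation between the two sets of change scores indicates that participants who improved more on the reference measure also tended to improve more on the brief scale, which is the sense of sensitivity to change used in the hypotheses.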

Discussion
This is the first study to explore the psychometric properties of the OASIS and ODSIS in a Spanish sample of patients diagnosed with ED in a specialized mental health setting, and, consequently, the contributions made are innovative and necessary.
The results regarding the internal structure of the OASIS and the ODSIS, as obtained with the CFA, are consistent with other studies and indicate that a one-factor structure for the five items has an optimal fit (Bentley et al., 2014; Bragdon et al., 2016; Campbell-Sills et al., 2008; Hermans et al., 2014; Ito et al., 2014; Ito et al., 2015; Moore et al., 2015; Norman et al., 2013; Norman et al., 2011; Norman et al., 2006). The present study revealed that the covariance between Items 1 and 2 has to be taken into account when exploring the internal structure of the OASIS and the ODSIS, as noted in previous similar studies (Campbell-Sills et al., 2009). As revealed in the present investigation, the reason for this is that the absence of anxious or depressive symptoms (score of 0 on Item 1) unequivocally leads to a score of 0 on Item 2 (intensity), so the items are dependent on each other.
Data from the present study suggest that a cut-off point of 10 for both instruments results in the best rate of correct clinical/non-clinical classifications (72.5% of correct classifications in the OASIS and 75.2% in the ODSIS), which differs from the value proposed by some previous investigations (e.g., Hermans et al., 2014; Norman et al., 2013). Both validation studies proposed cut-offs of 5 in clinical samples, with 82.5% and 91% of correct classifications, respectively. As noted by these authors, the different cut-off points proposed tend to be attributable to the particular characteristics of their samples (% of women and men, variety of EDs, and % of comorbid disorders in the sample, among others) and the setting where the studies were conducted (mental health settings or community services). This might also explain why our recommended cut-off values are higher than those indicated in a recent Spanish validation of the online version of the OASIS in a subclinical sample with ED, where a cut-off of 7.5 was proposed (González-Robles et al., 2018). These differences could be, at least partly, attributable to the differences in the samples and the evaluation methods used in the investigations (i.e., individuals who attended a university clinic and completed an online version of the questionnaire in the study by González-Robles and colleagues, as opposed to patients who attended public mental health settings and completed a paper-and-pencil version of the questionnaires in our study). In support of this conclusion, we observed that anxiety and depressive symptoms were more severe in the present study than in the investigation by González-Robles and colleagues (2018). Specifically, mean scores in our sample were significantly higher in the OASIS (t = 5.78, p < .001), the ODSIS (t = 6.21, p < .001), the BAI (t = 7.52, p < .001), and the BDI-II (t = 5.16, p < .001) when compared to the aforementioned investigation in Spanish settings.
Because sample characteristics are important factors in the determination of cut-off scores, we will now discuss our findings in relation to investigations that, like the present study, included a sample of outpatients presenting a primary diagnosis of ED. Our proposed cut-off point for the OASIS is the same as the one suggested by Bragdon and colleagues (2016), who obtained 76% correct classifications with a cut-off of 10, and is also close to the recommendation of Ito and colleagues (2015), who proposed a cut-off score of 9 with 68% correct classifications. Somewhat lower cut-off values have also been proposed, including in the work of Campbell-Sills and colleagues (2009), Moore and colleagues (2015), and Norman and colleagues (2011), who recommended a cut-off point of 8 for the OASIS, with 87%, 67%, and 78% correct classifications, respectively. Regarding the ODSIS, our results are similar to those obtained by Bentley and colleagues (2014) and Ito and colleagues (2015), who recommended a cut-off point of 8 (82% correct classifications) and 11 (82.3% correct classifications), respectively. What these different cut-off scores indicate is that validation and replication studies, such as the present one, are important to increase the reliability of findings across settings and countries.
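
As an illustration of how a cut-off can be chosen by the rate of correct classifications, the following minimal Python sketch scans every possible total score (0 to 20 for a five-item scale scored 0 to 4 per item) and keeps the value that classifies the most cases correctly against a binary reference criterion. It is a simplified stand-in for a ROC-based procedure, not the analysis code used in this study; the function names, the toy data, and the tie-breaking rule (the lowest cut-off wins) are assumptions made for the example.

```python
from typing import Sequence, Tuple


def classification_accuracy(
    scores: Sequence[int], is_clinical: Sequence[int], cutoff: int
) -> float:
    """Proportion of cases classified correctly when score >= cutoff means 'clinical'."""
    correct = sum((s >= cutoff) == bool(c) for s, c in zip(scores, is_clinical))
    return correct / len(scores)


def best_cutoff(
    scores: Sequence[int],
    is_clinical: Sequence[int],
    candidates: Sequence[int] = range(0, 21),
) -> Tuple[int, float]:
    """Return the candidate cut-off with the highest accuracy.

    Assumption for this sketch: ties are resolved in favor of the lowest cut-off.
    """
    best = max(
        candidates,
        key=lambda c: (classification_accuracy(scores, is_clinical, c), -c),
    )
    return best, classification_accuracy(scores, is_clinical, best)


# Hypothetical toy data (not from the study): total scores and a binary
# reference criterion (1 = moderate-to-severe on the reference measure).
toy_scores = [2, 4, 6, 7, 11, 9, 10, 12, 15, 18]
toy_clinical = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

cutoff, accuracy = best_cutoff(toy_scores, toy_clinical)
print(f"cut-off = {cutoff}, correct classifications = {accuracy:.1%}")
```

Because the optimum depends entirely on the score distributions of the clinical and non-clinical groups, running the same procedure on samples of different severity or composition will generally return different cut-offs, which is consistent with the variability across studies described above.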

Results on the convergent and discriminant validity of the OASIS and the ODSIS were also satisfactory. The correlations between both measures and the remaining instruments were significant and in the expected directions. At this stage, it is important to note that, contrary to our expectations, the OASIS did not correlate more strongly with the measure of anxiety (BAI) than with the measure of depression (BDI-II). This result may be due to the high rates of comorbidity between these two constructs (Brown & Barlow, 2009). However, the correlations of the ODSIS with the BDI-II and the BAI were consistent with our expectations (i.e., stronger for the BDI-II), so another possibility is that people with a primary diagnosis of an anxiety disorder did not meet criteria for a diagnosis of depression but nevertheless presented high scores on depressive symptoms. In fact, comorbidity of symptoms in our sample was evident. For instance, 209 participants (61.7% of the sample) who presented moderate or severe scores on the BAI also presented moderate or severe scores on the BDI-II.
In the evaluation of sensitivity to therapeutic change, changes in the OASIS and the ODSIS correlated moderately with changes in the BAI and the BDI-II, respectively. These findings suggest that both instruments can capture changes in anxious and depressive symptoms after cognitive-behavioral treatment. This analysis was also performed for the OASIS in the study by Norman and colleagues (2013), with a sample of women who had been victims of partner violence. Their results also showed positive correlations with the reference measures, namely the Anxiety subscale of the Brief Symptom Inventory (BSI-18; Derogatis & Melisaratos, 1983) and the Clinician Administered Post-Traumatic Stress Disorder Scale for DSM-IV (CAPS; Blake et al., 1995). They found that change scores in the OASIS correlated highly with change scores in both the CAPS (r = .64, p < .01) and the BSI-Anxiety (r = .92, p < .001). The authors noted that future studies should examine the sensitivity to change of the measures in a sample that also included men and greater diagnostic variability, both of which we have addressed in the present study.
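
The logic of this sensitivity-to-change analysis can be sketched in a few lines of Python: a change score (pre-treatment minus post-treatment) is computed for each participant on the brief scale and on the reference measure, and the two sets of change scores are then correlated. This is only an illustrative sketch; the helper names and the five-participant toy data are hypothetical and are not taken from the study.

```python
from math import sqrt
from typing import List, Sequence


def change_scores(pre: Sequence[float], post: Sequence[float]) -> List[float]:
    """Pre-treatment minus post-treatment score for each participant."""
    return [a - b for a, b in zip(pre, post)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data (not from the study): five participants assessed
# before and after treatment on the OASIS and on the BAI.
oasis_pre, oasis_post = [12, 10, 14, 9, 11], [6, 8, 7, 8, 5]
bai_pre, bai_post = [30, 22, 35, 18, 27], [15, 18, 20, 16, 14]

oasis_change = change_scores(oasis_pre, oasis_post)
bai_change = change_scores(bai_pre, bai_post)
r = pearson_r(oasis_change, bai_change)
print(f"r between OASIS and BAI change scores = {r:.2f}")
```

A positive correlation indicates that participants who improve more on the reference measure also tend to improve more on the brief scale, which is the sense in which the scale is described as sensitive to therapeutic change.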
The present study is not without limitations. First, only a subsample of participants provided data for all assessment points, so the sensitivity to therapeutic change could only be calculated in a subset of participants. It would be interesting to replicate these analyses using a larger sample. Another limitation is that the sex distribution was not homogeneous, as women were more frequently represented in the sample. While this might have influenced the results of the present investigation, it should be noted that the prevalence of ED is significantly higher among women than among men (WHO, 2017), so the sample reflects the distribution of EDs in the general population and the findings might therefore be useful for general practice. Another shortcoming was that the inter-rater reliability of diagnoses could not be investigated. As a consequence, the sensitivity to change could not be examined against diagnoses based on structured clinical interviews, because the reliability of those diagnoses could not be tested. In addition, we were not able to analyze test-retest reliability because all participants were enrolled in a clinical trial on the efficacy and feasibility of the UP for the transdiagnostic treatment of ED, which would affect the stability of scores (Blinded). It is also important to note that the cut-off points for the OASIS and the ODSIS were calculated according to the established cut-offs for the BAI and the BDI-II, rather than against a sample of healthy controls or a clinical criterion. While the BAI and the BDI-II are well-established screening tools, the possibility that these measures introduce some bias in diagnosis should not be ignored, and the combination of self-reports and clinical criteria should be addressed in future research.
Finally, the cut-offs proposed in this study are based on a sample of people seeking treatment for ED in specialized mental health care in Spain and included patients with a primary ED diagnosis, so the findings may not be generalizable to other settings.

Conclusions
This study adds to the growing literature supporting the validity of the OASIS and ODSIS as screening tools for anxiety and mood disorders. It also provides data about their sensitivity to change after a psychological treatment, which makes them useful tools to be included in routine care to rapidly assess the effectiveness of interventions. Additionally, this investigation provides further evidence for the utility of both scales in a new country, namely, Spain.
The validation of the OASIS and the ODSIS in Spanish offers large potential benefits to public health services. First, the OASIS and ODSIS are easily administered screening tools with adequate psychometric properties, allowing practitioners to detect, in just three minutes, individuals who present anxious or depressive symptomatology. In addition, because these scales do not require specialized training for their application, the OASIS and ODSIS may help family physicians to make important decisions, such as referring a person who needs specialized care. In the case of mental health professionals working in specialized care, these scales can be useful to monitor the evolution of anxiety and depression symptoms over the course of treatment quickly and easily.
In conclusion, the OASIS and ODSIS have the potential to help improve patients' mental health on a scalable level, facilitating specialized, efficient, and prompt care for those who need it.

[Table body not recoverable from this copy; the surviving final row reads: score range 10-50, Cronbach's alpha (.89).]
Note: OASIS: Overall Anxiety Severity and Impairment Scale; ODSIS: Overall Depression Severity and Impairment Scale; BAI: Beck Anxiety Inventory; BDI-II: Beck Depression Inventory-II; NEO-FFI N: NEO-Five-Factor Personality Inventory Neuroticism; NEO-FFI E: NEO-Five-Factor Personality Inventory Extraversion; QLI: Quality of Life Index; SM: Scale of Maladjustment; PANAS POS: Positive and Negative Affect Schedule-Positive; PANAS NEG: Positive and Negative Affect Schedule-Negative. Minimum and maximum scores correspond to direct scale scores. Scale reliability corresponds to the Cronbach's alpha coefficient and is presented in parentheses. *p < .01 **p < .001

In all cases, p < .001.