Ethical strategy focus and mutual fund management: performance and persistence

The aim of this study is to analyze whether managers, practitioners and individual investors could obtain higher risk-adjusted returns by allocating their investments to funds that integrate specific levels of socially responsible (SR) criteria in their portfolios. This is achieved by comparing the performance of mutual funds according to their SR characteristics: environmental, governance, social, and sustainability attributes. For a large sample of 3,920 equity SR mutual funds around the world, performance is measured using a multifactor model that incorporates relevant benchmarks according to the fund investment objectives, and using Carhart’s (1997) methodology to measure mutual fund performance persistence. In general, fund performance is not significant, the average being negative and close to zero. Funds achieving relatively high levels of SR attributes in their portfolios seem to experience overall worse performances. This evidence is, however, mainly driven by the behavior of worst-performing funds. Moreover, investing in the previous best SR funds could lead investors to greater overall returns in most areas and levels of SR attributes considered. This evidence highlights the role of managers in enhancing the returns of a portfolio with a well-defined SR investment policy. Therefore, there is no incompatibility between pursuing higher ethical (and sustainable) values as well as greater financial performances from investments—provided managers have the skills necessary to choose the right SR funds.


Introduction
Sustainable investing is transforming the global financial industry. Interest in long-term returns is growing, and the recent worldwide trend towards integrating sustainability concerns into investment points to strong momentum and rapid growth. According to the Global Sustainable Investment Alliance (GSIA)'s review,1 from 2014 to 2016 sustainability-embedded investment represented a significant share of the market not only in Europe and Australia, where it accounts for approximately 50 percent of professionally managed assets, but also in the United States and Canada, where its share of the market ranges from 22 to 38 percent. Investors and investment managers are aware of the opportunities provided by Socially Responsible Investment (SRI); specifically, in 2017, more than one quarter of the world's professionally managed assets (roughly US$22.9 trillion) had some sort of sustainable investing mandate, with about US$8.7 trillion of that in the United States, US$12 trillion in Europe and the rest shared among other regions.2 In this context, global growth in sustainable investing has been accompanied by a growing specialized literature on organizations that integrate environmental, social and governance (ESG) criteria into their strategies (see Humphrey et al., 2012; O'Rourke, 2003, among others). Specifically, the motivations for centering attention on Socially Responsible (SR) mutual funds have been investigated by a number of authors, such as Renneboog et al. (2008b), Capelle-Blancard and Monjon (2012), and Silva and Cortez (2016), to cite just a few. Likewise, Busch et al. (2016) build on the idea that a reorientation towards a long-term paradigm for sustainable investments is important; their main argument is that embedding sustainable values into investments requires a progressive change in mindset toward long-term risk, and that the opportunities arising from incorporating such a perspective are countless. Moreover, Laurel-Fois (2016) provides evidence of a positive relationship between responsible investment and financial performance, owing to a risk mitigation effect derived from high screening intensity. The author argues that intensive screening lets fund managers gain advantages from selection strategies.
Despite this, there is a lack of studies analyzing the performance experienced by SR funds according to their SR strategy. In contrast to previous literature, our main goal is not to compare SR to conventional funds,3 but rather to analyze the performance of SR funds with similar characteristics. In other words, we aim to analyze whether managers, practitioners and individual investors could obtain higher risk-adjusted returns by allocating their investments to funds that integrate specific levels of SR criteria in their portfolios. This is achieved by comparing the performance of socially responsible (SR) mutual funds according to their SR characteristics: environmental, governance, social, and sustainability attributes.
1 Global Sustainable Investment Alliance, 2017: "2016 Global Sustainable Investment Review".
2 Morgan Stanley's 2018 edition of the Sustainable Signals series: "Asset Owners Embrace Sustainability".
To address this issue, we first compare the performance of SR funds according to their SR strategy, reflected in the level of SR attributes accomplished in their portfolio. These characteristics refer to Environmental, Governance, Social, and Sustainability attributes. SR funds not reporting information on these characteristics are also included in the analyses as "undefined" funds. With the aim of avoiding any local bias related to the investment geographical area, we build groups of funds and repeat the analysis for each of them.
The empirical work focuses on equity SR mutual funds around the world for the period from January 2000 to March 2018. The sample comprises 3,920 funds. Among them, only 180 funds have data for the full sample period, meaning that the rest of the funds either disappeared or were set up during this period. Our analysis is therefore free from survivorship bias. We apply a multifactor model to these data in order to estimate abnormal performance.
In sum, this study contributes to the previous literature in several ways. Firstly, we analyze the risk-adjusted returns experienced by SR funds. Rather than comparing their financial results with those achieved by their conventional peers, we assess SR fund performance in relation to the level of SR criteria integrated in their investment strategies by distinguishing among SR funds with higher and lower SR attributes in their portfolios. Secondly, we also differentiate SR funds according to their investment area in order to avoid any potential local bias. Bearing this aim in mind, we evaluate whether mutual funds achieving higher scores in their SR attributes underperform funds that focus less on their SR strategies. The analysis goes beyond the average (in order to prevent the emergence of effects mainly driven by the performance of the worst-performing funds), for which we follow a nonparametric approach based on kernel density estimation. As far as we know, this methodology has not been applied to analyze the differences in performance with regard to the SR characteristics of mutual funds, and constitutes the third contribution of our study. Finally, our fourth contribution lies in assessing the SR funds' performance persistence through the recursive portfolio approach. This methodology has frequently been applied to analyze the persistence of conventional funds but not of SR funds (see, for instance, Bollen and Busse, 2001, 2005; Busse and Irvine, 2006; Cremers and Petajisto, 2009; Cremers et al., 2013; Gottesman and Morey, 2007). It reveals whether the performance differences between the best and the worst SR funds are persistent over time, and for which SR categories and areas analyzed this persistence holds. Answering these questions might, a priori, be useful to professionals and investors who aim to enhance their risk-adjusted returns, since it helps them identify skilled managers who can provide greater overall performance while integrating a relevant degree of ethical values in their investment portfolios.
3 E.g., Bauer et al. (2005), Renneboog et al. (2008b) and Nofsinger and Varma (2014).
The remainder of the paper is structured as follows: Section 2 presents a brief review of relevant literature and states the main hypotheses to be tested. Section 3 provides details on the methodologies used to measure fund performance and persistence. Section 4 describes the data used in the study, while Section 5 reports the results. Finally, Section 6 presents some concluding remarks.

Literature review and hypotheses development
O'Rourke (2003) states that there are several reasons that explain the growth of SR funds in the financial markets around the world. Firstly, SR funds are a sophisticated option that adds value to the investment by following a non-purely financial orientation, as other parameters are also considered (such as reputation, good corporate governance practice and environmental responsibility, among others). In this regard, Koellner et al. (2005) propose the basic principles and methods on which a sustainability rating for mutual funds could be based; specifically, they state that a variety of impacts (economic, social and ecological) should be considered in order to embed sustainability into investment processes. In addition, some authors analyze the intention to invest in a socially responsible manner (Palacios-González and Chamorro-Mera, 2018), and show that environmental criteria, as sustainability criteria, matter in the decisions of mutual funds because of their effect on the company's future financial performance (Said et al., 2013; Cai and Li, 2018). Moreover, Helminen (2000) introduces the concept of "eco-efficiency" linked to sustainable development as the integration of ecological, economic and ethical dimensions at the firm level; indeed, ethics have also been integrated into mainstream business as a competitive strategy. In this sense, some studies highlight the impact of integrating ethics into the business strategy (Friede et al., 2015).
However, Jansson and Biel (2014) conduct a survey of the SR fund industry and show that future SR investment is not influenced by social and environmental concerns. Rather, their findings suggest that financial beliefs about risk, as well as beliefs about increased market shares, are a relevant driving force underlying SR investment. From a financial perspective, SR mutual funds are investment vehicles that investors can easily access, and their returns are comparable to those of conventional funds. Indeed, numerous studies have analyzed the financial performance of these portfolios by comparing them to their conventional peers, finding mixed results. On the one hand, some studies argue that SR funds outperform conventional funds (Galema et al., 2008; Kempf and Osthoff, 2007). Nonetheless, it appears that socially responsible investors are less sensitive to poor performance and the overall implicit benefit of the SRI practice dilutes any negative impact, as postulated by Bollen (2007), Benson and Humphrey (2008) and Renneboog et al. (2011), among others. On the other hand, studies such as Bauer et al. (2005) and Renneboog et al. (2008a,b) conclude that, in general, there are no significant differences between the performance of SR mutual funds and their conventional counterparts.
At this point, it is worth noting that SR mutual funds usually invest under different core values related to their ethical strategy. For instance, an SR mutual fund with a focus on environmental criteria invests in companies that potentially contribute to the environment. Hence, attributes such as sustainability or environment should be implicitly related to the main investment objectives of an SR portfolio. Given that SR funds aim to maximize their financial results while achieving their SR goals, it is of interest to observe the behavior of SR funds with the same characteristics or attributes. Most of the previous literature, however, does not focus on SR attributes. Some studies have compared the performance of different types of SR mutual funds, mainly in relation to green funds. Along these lines, Mallett and Michelson (2010) find no differences in their comparison of returns of green and other SR funds. The same result is found by Climent and Soriano (2011), although this evidence varies over time. Furthermore, observing the level of SR attributes may help investors to distinguish between funds with the same label but very different SR scores. This could be a response to Silva and Cortez's (2016) proposal that investors should pay attention to the social performance of green funds.
Our interest therefore lies in analyzing the relationship between SR attributes and performance. In this study, we aim to fill this gap in the literature by analyzing the performance of mutual funds focusing on similar SR attributes. Firstly, it should be noted that funds in the same SR categories (e.g., environmental) could achieve different scores in their SR portfolios attributes. In other words, two green funds, for example, can differ in the level of environmental characteristics related to the assets held in their portfolios. That is, one green fund could be considered "greener" than another. Accordingly, we should expect the first fund to yield higher SR scores related to environmental attributes than the second. Different levels of SR scores could therefore drive the performance distribution. For instance, in their comparison of US green funds with conventional funds Chang et al. (2012) find that green performance is not uniformly distributed across fund types. In another comparison, Nofsinger and Varma (2014) find that SR funds outperform (underperform) conventional funds during crisis (non-crisis) periods, but their most striking finding is that this asymmetric pattern is driven by SR attributes rather than differences in management or the characteristics in the fund portfolios.
We are interested in analyzing the potentially intricate relationship between SR scores and performance, for which it is not obvious what should be expected a priori.
As Sharpe (1992) points out, the performance of a fund depends mainly on the evolution of the asset class it invests in. Therefore, different SR scores could lead to differences in performance. In this sense, higher SR scores in the portfolio could imply higher policy constraints that restrict their investment to very specific types of assets. As a result, the performance of the SR funds would be linked to the behavior of the specific constrained SR securities. In this line, Jin and Han (2018) found that green funds tend towards industry concentration rather than diversification. Additionally, previous evidence for conventional funds (Huang et al., 2011) shows that increasing investment concentration leads to higher specific risks and worse performance. Consequently, SR funds with high scores in their attributes, and therefore with more concentrated portfolios investing in specific securities, should face more difficulties in providing investors with greater financial performance.
The analysis of the differences in performance according to SR levels is therefore the first hypothesis of our study:
Hypothesis 1 (H1): There are no differences in the performance of SR funds according to their SR attributes.

In order to better understand the behavior of SR funds over time, we are also interested in analyzing the persistence of their performance. This analysis is useful to distinguish whether mutual fund results are due to luck or to managers' ability. Thus, if there is persistence, the best-performing funds during a given period should experience greater subsequent performance than the worst-performing funds. The literature on performance persistence in conventional mutual funds is not conclusive. Some studies do not find general evidence of persistence (see Carhart, 1997; Cuthbertson et al., 2008; Massa and Patgiri, 2009), whereas others have uncovered some evidence of performance persistence in the mutual fund industry (see Brown and Goetzmann, 1995; Elton et al., 1996; Cremers and Petajisto, 2009).
The literature on the persistence of SRI funds is still scant. Leite and Cortez (2013), using Carhart's (1997) methodology, do not find evidence of persistence for a sample of French SRI funds; Lean et al. (2014) draw a similar conclusion for a sample of SRI funds from the Asia-Pacific region. However, although Lean et al. (2015) find weak evidence of performance persistence for a sample of European and North American SRI funds (also using Carhart's methodology), the evidence is stronger when applying contingency tables. In this vein, Matallín-Sáez et al. (2016) show how, compared to using Carhart's method, contingency tables are biased towards finding evidence of persistence too easily.
Unlike some of the previous literature, we do not analyze persistence for a regional sample but for a global sample of SRI funds, applying a more robust methodology that is also based on Carhart (1997). In our study, we contribute by incorporating information on the funds' SR attributes into the persistence analysis. In fact, as we will see, we find that persistence results differ depending on the level of the mutual funds' SR scores.
Accordingly, the analysis of the performance persistence constitutes the second hypothesis of our study:
Hypothesis 2 (H2): The differences in the abnormal performance between the best and the worst mutual funds, with similar SR attribute scores, are not persistent over time.

Performance measurement
This section is devoted to a succinct description of the measurement of mutual funds' performance and their persistence, for which we consider a linear model that adjusts each fund's returns for a set of given risk factors. This is a very popular approach in the literature, based on one of its seminal contributions (Jensen, 1968), although a number of subsequent contributions in the field have proposed variations of it in order to include more factors (Fama and French, 2015; Carhart, 1997). Multifactor models attempt to avoid the omitted benchmark bias in the performance evaluation. This bias, as pointed out by Pástor and Stambaugh (2002) and Matallín-Sáez (2006), is present when the performance model does not consider a relevant benchmark that proxies some asset classes in which the mutual fund invests. It must also be noted that the fund's return is the result of passive and active management. The return linked to active management is the value added by managers over the return from passive management. The return of passive management hinges critically on the funds' investment objectives. Thus, considering the characteristics of the mutual funds in the sample, and following Sharpe (1992) and Elton et al. (1993), we propose a multifactor model with specific benchmarks.
Given that we aim to evaluate funds with a specific investment strategy as well as a broad geographical scope for investment, we adopt the following linear model:

$$r_{p,t} = \alpha_p + \beta_{p,w}\, r_{w,t} + \beta_{p,s}\, r_{s,t} + \beta_{p,m}\, r_{m,t} + \varepsilon_{p,t} \qquad (1)$$

In the above expression, $r_{p,t}$ is the excess return over the risk-free asset of the assessed fund, the constant in the model, $\alpha_p$, measures the fund's abnormal performance, and the risk factors are the excess returns corresponding to: (i) a global benchmark, which represents investment in different markets around the world ($r_{w,t}$); (ii) a specific benchmark, representing investment constrained by SRI fundamentals ($r_{s,t}$); and (iii) a specific benchmark for investment in the emerging markets, taking into account the characteristics of some of the funds being evaluated ($r_{m,t}$). The $\beta$ coefficients are the fund's sensitivities to each risk factor and $\varepsilon_{p,t}$ is the error term.
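As an illustration of how the abnormal performance in the linear model described above could be estimated, the following is a minimal sketch using ordinary least squares. It assumes NumPy is available; the function name `estimate_alpha`, the variable names and the synthetic numbers are hypothetical and are not taken from the study, and the sketch does not compute the significance of the estimates.

```python
import numpy as np

def estimate_alpha(fund_excess, world_excess, sri_excess, emerging_excess):
    """OLS fit of the three-benchmark linear model; returns (alpha, betas)."""
    y = np.asarray(fund_excess, dtype=float)
    # Design matrix: intercept plus the three benchmark excess returns.
    X = np.column_stack([
        np.ones_like(y),
        np.asarray(world_excess, dtype=float),
        np.asarray(sri_excess, dtype=float),
        np.asarray(emerging_excess, dtype=float),
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Synthetic illustration (made-up numbers, not the paper's data).
rng = np.random.default_rng(0)
n = 500
r_w, r_s, r_m = rng.normal(0.0, 0.01, size=(3, n))
true_alpha = 0.0002
# Noise-free fund returns, so the fit should recover the inputs exactly.
r_p = true_alpha + 0.6 * r_w + 0.3 * r_s + 0.1 * r_m
alpha, betas = estimate_alpha(r_p, r_w, r_s, r_m)
```

With real data the residual term would not be zero, and a test on the intercept would be needed to decide whether a fund's performance is significantly different from zero.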

Performance persistence measurement
In order to measure performance persistence we will consider the so-called recursive portfolio approach (Carhart, 1997), which is probably the most popular approach in the literature to measure mutual fund performance persistence. Some successful variations of this approach have been proposed by Busse et al. (2010) and, most notably, Fama and French (2010). Carhart (1997) proposes to evaluate persistence by analyzing the abnormal performance of portfolios that invest according to mutual funds' past performance. Persistence is then calculated for two semiannual (half-yearly) symmetrical windows. The first of these windows estimates past performance, whereas the second one rebalances the recursive portfolio.
In addition, by estimating performance over non-overlapping rolling windows, we allow the model parameters to vary over time. This is an interesting feature, given the substantial literature on time-varying systematic risk.
Similarly to Abdelsalam et al. (2017), we apply the recursive portfolio approach by means of the following algorithm:
1. In the first step, the performance of the SR funds is estimated by means of Equation (1) for the first sample period.
2. SR funds are ranked in increasing order according to the performance achieved in the period in order to form quintiles within each group of funds, defined by investment area and SR attributes.
3. At the start of the following period we form five equally weighted portfolios according to quintile past performance, Q1, ..., Q5, where the first portfolio (Q1) invests in the worst-performing funds in the previous period and, conversely, the last portfolio (Q5) invests in the previous period's best funds. The same investment strategy is followed for the other quintiles.
4. This process is repeated at the beginning of each period (i.e., we would restart in step 1). Therefore, each portfolio represents a dynamic investment strategy that rebalances selected funds according to their previous performance.
5. We then compute the daily return of the five portfolios and estimate the abnormal performance of each portfolio, also using model (1).
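The ranking and rebalancing steps of the algorithm above can be sketched as follows. This is a simplified illustration in plain Python, not the authors' implementation: the function names and the toy data are hypothetical, past performance is assumed to be already estimated for each fund, and funds without returns in the holding period are simply dropped from the portfolio.

```python
def form_quintiles(past_alpha):
    """Split funds into five groups by past alpha; index 0 is Q1 (worst)."""
    ranked = sorted(past_alpha, key=past_alpha.get)  # increasing order
    n = len(ranked)
    # Integer boundaries, so group sizes differ by at most one fund.
    bounds = [(k * n) // 5 for k in range(6)]
    return [ranked[bounds[k]:bounds[k + 1]] for k in range(5)]

def equal_weight_returns(funds, next_returns):
    """Return series of an equally weighted portfolio of the given funds."""
    held = [f for f in funds if f in next_returns]  # drop funds with no data
    if not held:
        return []
    n_obs = min(len(next_returns[f]) for f in held)
    return [sum(next_returns[f][t] for f in held) / len(held)
            for t in range(n_obs)]

def recursive_portfolios(ranking_alphas, holding_returns):
    """Chain Q1..Q5 over time.

    ranking_alphas[i] holds each fund's alpha in ranking period i, and
    holding_returns[i] holds the returns of the period that follows it.
    """
    series = [[] for _ in range(5)]
    for past_alpha, next_returns in zip(ranking_alphas, holding_returns):
        for q, funds in enumerate(form_quintiles(past_alpha)):
            series[q].extend(equal_weight_returns(funds, next_returns))
    return series

# Toy example: ten funds, one ranking period, two days in the holding period.
funds = [f"F{i}" for i in range(10)]
alphas = {f: 0.01 * i - 0.05 for i, f in enumerate(funds)}
returns = {f: [0.001 * i, 0.002 * i] for i, f in enumerate(funds)}
portfolios = recursive_portfolios([alphas], [returns])
q1, q5 = portfolios[0], portfolios[4]
```

The resulting return series of each quintile portfolio would then be evaluated with the same linear model used for individual funds.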

Our hypothesis is therefore that, should persistence in mutual fund performance exist, a portfolio with an investment strategy based on a poor (good) past performance will show a negative (positive) abnormal performance.

Data
The data used in this study are from equity mutual funds with SR conditioned investment policies. Specifically, we analyze 3,920 SR equity mutual funds in the world according to investment geographical area. Given their representativeness in the global sustainable investment arena, we separately cover the following zones: Europe, US and Canada, and "Other" (including emerging economies), in order to prevent any local bias. The Morningstar database provides information on daily returns for these funds. The sample period analyzed spans from January 1, 2000 to March 29, 2018. We report some characteristics of the sample funds in Tables 1 through 4. On average, the highest growth rate is mainly concentrated in the category that includes emerging countries, grouped in "Other", with a value of 20.25%, practically double that of "Europe" and "US and Canada" (10.06% and 8.56%, respectively).
Since both surviving and non-surviving mutual funds are considered in the study, there is no survivorship bias in the results for performance and persistence. Rohleder et al. (2011) review the effects of this bias in mutual fund performance measurement. However, avoiding survivorship bias may also lead to other problems that are not usually addressed in the literature. Specifically, the inclusion of funds with limited data may reduce the robustness of the analysis. In this regard, Rohleder et al. (2011) point out that individual fund performance measurement requires a return history of a certain length to generate reliable regression estimates. In addition, comparing funds with different periods of existence could add some bias if the mutual funds' performance is correlated with the period for which data are available; for instance, the performance could differ depending on the economic cycle or for bull and bear markets (Kacperczyk et al., 2009; Kosowski, 2011; Sun et al., 2013). In order to avoid this type of problem, we present performance and persistence results in two ways, i.e., for all mutual funds and for survivor funds only. Funds are grouped by the number of semesters with available data. Among survivors, one group comprises funds with data for the whole sample period, i.e., 36.5 semesters; S ≥ 4 denotes survivor mutual funds with at least four semesters of data, and S < 4 denotes survivors less than four semesters old. Likewise, non-survivor mutual funds are collated into two groups: NS ≥ 4 comprises mutual funds with at least four semesters of data, and NS < 4, the rest.
As indicated in Table 2, only 4.59% (180/3,920) of the SR funds have complete data over the sample period. The largest group is that denoted by S ≥ 4, specifically 42.78% (1,677/3,920) of the funds. Funds in the top and bottom 30% of the score distribution are classified as funds with "High" and "Low" scores, respectively, and the middle 40% (between 30% and 70%) are categorized as "Mid". Interestingly, there are noticeable differences across the areas under analysis: Europe appears to concentrate the highest scores, while the lowest lie in "Other". For the US and Canada, the vast majority of funds are classified as "Mid", while "Low" funds outnumber "High" funds. Table 3 reports some summary statistics corresponding to the mutual funds' sample.
Regarding geographical area of investment, most funds focus on Europe, the US and Canada; specifically, 77.09% (3,022/3,920). A mean-variance analysis reveals that in Europe funds with "High" scores perform better since, on average, they show a higher net return and lower risk. Notably, this finding holds for all four SR attributes. For the US and Canada, however, the evidence is not as clear: although "Low" scores concentrate higher risk, the return is mixed depending on the attribute observed. Contrary to Europe, the "Other" group reports a better balance for the "Low" scores.
Connecting this with the evidence from Table 2, where "High" ("Low") scored funds predominate in Europe ("Other"), Table 3 provides insights into the risk-return trade-off and reinforces the view that Europe is a more mature, developed and experienced area in the SR management industry. By contrast, funds in the "Other" group do not appear to follow adequate SR investment strategies, since funds with higher scores obtain worse results than "Low" scored funds. These descriptive statistics give an idea of both the segmentation and the disparate evolution of the SR fund industry in different locations.
As mentioned in the methodology section, to evaluate mutual fund performance we apply the linear model (1) where funds' excess returns are adjusted to the excess returns corresponding to the types of assets in which the funds invest. Note that because the analyzed funds invest in very different geographical areas, the first benchmark is a global index representing global investments, specifically the FTSE World. We selected the DJ Sustain World to represent investments under SR conditions. A number of funds invest in less mature and developed markets, so we also included the FTSE index for emerging markets.
For these indices we calculate daily returns from information provided by Morningstar. We compute the excess return using the one-month Treasury bill rate as the risk-free asset. 4 Table 4 reports some summary statistics for the benchmarks used in expression (1). For the analyzed sample period, the most globalized indexes (for which financial markets in more advanced economies weigh more) show a more conservative mix of average return and risk than those for emerging markets (FTSE Emerging) for which there is higher risk and average return.

SRI mutual fund performance
In Table 5 we report results on funds' performance, not only for all funds (last row of the table) but also by geographical area and level of SR attribute scores. Unlike in Tables 2 and 3, the grouping into "High", "Mid" or "Low" attribute scores is not made over the whole set of funds, but within each geographical area (Europe, US and Canada, and "Other"). This procedure is due to the lack of uniformity of the scores across the geographical zones. For instance, in Europe there were more "High" funds and in "Other" more "Low" funds. As one of the objectives of the study is to compare the performance of the funds based on their higher or lower score, maintaining the previous grouping method could bias this comparison, since comparing "High" vs. "Low" could in some way mean comparing Europe with "Other".
Thus, if there is a specific component in these geographical areas that is not captured by the linear model that adjusts to systematic risk sources, the High-Low comparison could be biased by that component.
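The difference between grouping over the whole sample and grouping within each geographical area can be illustrated with a short sketch. It assumes that the "Low", "Mid" and "High" labels are assigned by rank using 30% and 70% cut-offs; the exact cut-off rule, the function names and the toy scores are assumptions made for illustration only.

```python
def classify_by_score(scores):
    """Label funds Low / Mid / High using 30% and 70% rank cut-offs."""
    ranked = sorted(scores, key=scores.get)  # lowest score first
    n = len(ranked)
    low_cut, high_cut = (3 * n) // 10, (7 * n) // 10
    labels = {}
    for pos, fund in enumerate(ranked):
        if pos < low_cut:
            labels[fund] = "Low"
        elif pos >= high_cut:
            labels[fund] = "High"
        else:
            labels[fund] = "Mid"
    return labels

def classify_within_area(funds):
    """funds: list of (name, area, score); cut-offs are computed per area."""
    by_area = {}
    for name, area, score in funds:
        by_area.setdefault(area, {})[name] = score
    labels = {}
    for area_scores in by_area.values():
        labels.update(classify_by_score(area_scores))
    return labels

# Toy data: every "Europe" fund scores above every "Other" fund.
sample = [(f"E{i}", "Europe", 60 + i) for i in range(10)]
sample += [(f"O{i}", "Other", 40 + i) for i in range(10)]

pooled = classify_by_score({name: score for name, _, score in sample})
within = classify_within_area(sample)
```

In this toy sample the pooled grouping places no "Other" fund in the "High" group, so a High-Low comparison would largely compare areas, whereas the within-area grouping yields "High" and "Low" funds in both areas.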
Results indicate that, on average, the number of funds whose performance is significantly different from zero is small. For all the funds, in the bottom row of Table 5, the percentage of funds with negative performance was slightly higher than that corresponding to funds with positive performance (53.34% vs. 46.66%). However, the percentage of funds for which results were significant was very low in both cases (5.46% and 3.27%, respectively). In addition, the funds do not outperform the market: there is an overall unweighted negative performance (−0.16%). This result is not very different from that found by most of the literature that has evaluated investment funds: a negative performance near zero is evident.
Results differ depending on the geographical investment area. On average, they are particularly bad for the US and Canada (−0.85%), whereas for Europe they are only slightly negative (−0.46%). In contrast, for "Other", the average performance is positive (1.58%).
In view of these results, we can infer that in more mature markets (the US and Canada, and Europe) it is harder for managers to beat the market, while in emerging markets, which are perhaps less efficient, there may be more opportunities for managers to add value through their management.
In contrast, looking at the average performance weighted by fund size, we observe that the total performance improves notably in Europe (0.69%) and in the US and Canada (0.66%), so the negative average seems to be driven by the smaller funds. For funds in the "Other" area the size-weighted average is lower than the unweighted one, although it remains the highest value (1.14%). Therefore, the differences found between zones are partly due to the behavior of smaller funds: in Europe and in the US and Canada these funds show more negative results, while in the "Other" area they show better results. It should be borne in mind that smaller funds have a greater capacity for active management than funds with large assets, which implicitly approximate the average behavior of the market given that they are an important part of it (Sharpe, 1992). In line with the above, it seems that more mature markets, such as Europe and the US and Canada, do not provide value-added opportunities to these smaller funds, which can be read as a measure of the efficiency of those markets. However, for funds from "Other" zones there are such opportunities.
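The contrast between the unweighted and the size-weighted average can be made concrete with a small worked example. The numbers below are invented for illustration and do not come from Table 5; the function name is hypothetical.

```python
def average_performance(alphas, sizes=None):
    """Unweighted mean of alphas, or size-weighted mean if sizes are given."""
    if sizes is None:
        return sum(alphas) / len(alphas)
    total = sum(sizes)
    return sum(a * s for a, s in zip(alphas, sizes)) / total

# Two small funds with negative alpha and one large fund with positive alpha.
alphas = [-0.02, -0.01, 0.01]
sizes = [10.0, 20.0, 970.0]  # assets under management, arbitrary units

unweighted = average_performance(alphas)       # each fund counts equally
weighted = average_performance(alphas, sizes)  # dominated by the large fund
```

Here the unweighted average is negative while the size-weighted average is positive, which is the pattern one would expect when a negative simple average is driven by the smaller funds.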

Performance and SR attributes' scores
With the aim of testing the first hypothesis of this study, we now focus on the performance differences of funds with different levels of SR attributes. The specific comparisons between funds grouped according to attribute scores are reported in Table 6. In general, the sign of the difference between the performance of funds with higher and lower scores is negative, which indicates that, on average, funds with "High" scores perform worse than those with "Low" scores. This evidence is clearly significant for the "Other" zone, and in most cases for the US and Canada. In Europe the sign of the difference in performance is generally negative, but in no case is it significant. Across the three geographic areas we therefore find a common pattern, an interesting finding that contributes to the SR literature: there are differences in performance among Sustainability, Environmental, Social and Governance attribute scores, in that the higher (lower) the SR attribute score, the worse (better) the performance. We can therefore reject the first hypothesis proposed in Section 2. In other words, we can infer that funds with weaker socially conscious schemes outperform those with a strong SR orientation. This result is consistent with Muñoz et al. (2014), who identify underperformance for a sample of US and European socially responsible mutual funds.
In this line, Silva and Cortez (2016) expanded the analysis of US and European towards green funds and their results highlight the idea that green funds present a tendency to underperform the benchmark. One explanation for these findings draws on the tenets of modern portfolio theory, which holds that the negative financial consequences from investing in socially responsible screened companies are due to increased information costs and the inherent difficulty of diversifying ethical fund portfolios (Martí-Ballester, 2015); both motives may lead to poorer risk-adjusted performance mainly for the US and Canada, but also in "Other" areas-as observed. However, concerning the general relationship between financial and socially responsible performance from the Morningstar SR attributes scores approach, the lack of research identified means this remains an open question.  (Figures 1b, 2b, 3b and 4b) the performance of the funds in each category is more alike, particularly for the latter, as shown by the much tighter densities. However, in the case of Europe and US and Canada, despite the higher homogeneity, densities reveal that some funds perform particularly well, as shown by some bumps in the upper end of the distributions-although some pockets of poor performance also exist, as revealed by some (fewer) bumps in the lower tail of the distribution.
This extreme behavior is also present in the "Other" geographical investment area, and particularly for the "Low" category.
In this sense, some specific "Low" funds perform particularly badly in Europe, as shown by the lower tail stretching beyond −0.3 for Europe funds (Figures 1a, 2a, 3a, 4a), whereas for US and Canada funds (Figures 1b, 2b, 3b and 4b) and "Other" (Figures 1c, 2c, 3c and 4c) this phenomenon is less pronounced. However, at the other extreme, the best funds are more frequently found in the US and Canada category, as shown by the relatively long tails corresponding to the solid and dashed lines in Figures 1b, 2b, 3b and 4b.
This overperformance is also found for the "Other" category, and particularly for "Low" funds. This result highlights the asymmetry in the performance differences according to "High"/"Low" categories; i.e., the main driver of the differences found is the behavior of the worst funds.
Canada, for which the sign is reversed. However, results for this particular comparison are not significant. Therefore, it seems that the results reported in Table 6 and Figures 1-4 are robust in the sense that they have not been driven by the performance of non-surviving funds.
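The "High" minus "Low" comparison discussed in this section can be sketched as a difference in group means accompanied by a t-statistic. The sketch below uses Welch's unequal-variance statistic as an assumption for illustration; the test actually used for Table 6 is not restated in this section, and the alphas are hypothetical.

```python
import math
from statistics import mean, variance


def high_minus_low(high, low):
    """Return (difference in mean alpha, Welch t-statistic).

    Welch's statistic is an illustrative choice, not necessarily
    the test used in the study.
    """
    diff = mean(high) - mean(low)
    # Standard error without assuming equal variances in both groups.
    se = math.sqrt(variance(high) / len(high) + variance(low) / len(low))
    return diff, diff / se


# Hypothetical annualized alphas (%) for funds with "High" and "Low" scores.
high_alphas = [-1.0, -2.0, -3.0]
low_alphas = [0.0, 1.0, 2.0]

diff, t_stat = high_minus_low(high_alphas, low_alphas)
print(f"High - Low = {diff:.2f}%, t = {t_stat:.2f}")
```

A negative difference with a large absolute t-statistic corresponds to the case in which "High" funds significantly underperform "Low" funds.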

Performance persistence and SR attributes' scores
We now analyze whether funds' performance persists over time in order to test the second hypothesis of the study. As explained in the methodology section, we apply the recursive portfolio approach (Carhart, 1997), forming portfolios that follow investment strategies based on past performance. Should persistence exist, we expect the quintile-portfolio that invests in the worst (best) past funds to obtain a worse (better) performance. Results are shown in Figure 9 and Table 8 for all funds, and Figure 10 and Table 9 for surviving funds only. Figure 9 shows the performance of the quintile-portfolios for funds grouped according to SR attribute scores and investment area. Should persistence exist, we would expect lines with positive slopes. In general, this positive slope exists in most cases and the difference Q5 − Q1 is positive, which means that investing in the best past funds (Q5) provides better results than investing in the worst past funds (Q1). These results support the existence of performance persistence. More specifically, Table 8 reports the performance of the extreme quintile-portfolios (Q1 and Q5), and the magnitude and significance of the difference between them. For most of the cases analyzed the difference is positive, implying that investing in the best past funds yields better performance than investing in the worst past funds. However, for the Europe zone this does not occur with the same intensity and significance in all cases. Specifically, for funds with "High" scores the difference is in no case significant; in fact, Q5 − Q1 takes smaller, even negative, values. By contrast, persistence takes high and significant values for the funds with "Low" scores.
Thus, investing in funds with lower scores that performed better in the past provides a differential with respect to the worst funds of between 5.64% per annum for the Social attribute and 5.01% for the Governance attribute. This differential is mainly due to the poor performance of the funds in the first quintile, which ranges from −4.24% for the Sustainability attribute to −3.65% for the Environmental attribute. That is, funds with "Low" scores show clear persistence in their bad results. This evidence also holds, in general, for surviving funds (Table 9 and Figure 10).
In the case of the US and Canada, the previous finding still holds, as funds with "High" scores are the ones with the lowest values of the difference Q5 − Q1, and these are in no case significant. Unlike in Europe, the groups of funds with greater persistence and significance are the "Mid" funds, ahead of the "Low" funds. Finally, for the "Other" zone, funds with "High" scores are the ones with the highest persistence and significance. Thus, the difference Q5 − Q1 takes values that oscillate between 7.24% for the Environmental attribute and 4.16% for the Governance attribute. These results are especially driven by the high performance of funds in Q5. For the "Mid" and "Low" funds, although in some cases the difference is positive and weakly significant, in others it takes negative but not significant values (Social and Governance scores).
In conclusion, we find greater persistence in Europe and the US and Canada than in "Other", leading us to reject the second hypothesis of the study (H2, see Section 2). In other words, the differences in abnormal performance between the best and the worst funds with similar SR attribute scores persist over time. In particular, it is observed that "Low" funds are the ones with the greatest persistence, while in "Other" those scored as "High" predominate.
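The quintile-portfolio procedure used in this section can be illustrated with a simplified sketch. It ranks funds on their previous-period return, forms five equal-weighted portfolios, records each portfolio's return in the following period, and reports the average Q5 − Q1 spread. This is an assumption-laden simplification: the study evaluates the quintile-portfolios with a multifactor model, whereas the sketch uses raw returns, and the fund names and figures are hypothetical.

```python
def quintile_portfolios(returns):
    """Build past-performance quintile portfolios.

    returns: dict mapping fund name -> list of per-period returns
             (all lists of equal length, at least five funds).
    Result:  five lists, Q1 (worst past funds) to Q5 (best past funds),
             each holding the portfolio's equal-weighted return per period.
    """
    funds = sorted(returns)
    n_periods = len(returns[funds[0]])
    series = [[] for _ in range(5)]
    for t in range(1, n_periods):
        # Rank funds on the previous period, then hold during period t.
        ranked = sorted(funds, key=lambda f: returns[f][t - 1])
        n = len(ranked)
        for q in range(5):
            members = ranked[q * n // 5:(q + 1) * n // 5]
            series[q].append(sum(returns[f][t] for f in members) / len(members))
    return series


# Hypothetical data: ten funds with persistent skill differences
# plus a common shock in each period.
shocks = [0.5, -0.5, 0.5]
returns = {f"F{i}": [i + s for s in shocks] for i in range(10)}

q = quintile_portfolios(returns)
q1_mean = sum(q[0]) / len(q[0])
q5_mean = sum(q[4]) / len(q[4])
spread = q5_mean - q1_mean  # 8.0 in this toy example
print(f"Q1 = {q1_mean:.2f}, Q5 = {q5_mean:.2f}, Q5 - Q1 = {spread:.2f}")
```

A positive Q5 − Q1 spread, as in this toy example, is what performance persistence would produce; without persistence the spread would be close to zero.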

Conclusions
The performance of SR funds has been and continues to be an issue of interest worldwide.
The popularity and reputation that these ethically oriented assets have enjoyed for decades offer academics and investors a wide variety of research and investment opportunities, respectively. For both groups, the likely underperformance of SRI compared to its conventional peers has always been at issue, and contributions in this field are diverse, with the question still far from consensus.
The fact that SR investors are less sensitive to poor past performance suggests that so-
Our study analyzes the financial performance of a sample of mutual funds around the world that integrate different levels of SR attributes in their investment portfolios. These attributes are related to several concerns, such as environmental issues, governance, social matters, and sustainability. Our results lead us to reject the first study hypothesis, indicating that most of the mutual funds achieving higher scores in these SR attributes, especially those investing in less-developed areas, experience lower overall performance than other similar funds that focus less on their SR strategies. This is in line with the tighter constraints these funds face in their investments, relative to other SR funds characterized by lower SR grades.
The financial value added by fund managers, however, is essential to the performance of the fund portfolio, as shown in previous studies analyzing conventional funds. Hence, assuming that funds with a greater extent of SRI in their portfolios will obtain lower performance is a distorted conclusion. Along these lines, we show that their overall underperformance is mainly driven by the worst-performing funds in each SR category.
A careful analysis of SR fund performance over time leads us to reject the hypothesis of a lack of persistence in SR fund performance, by showing that some skilled managers are able to persistently provide investors with greater risk-adjusted returns. For instance, investing in the previous worst-performing funds usually leads to negative alphas, while investing in the previous best funds, whose managers are likely to show greater managerial skills, results in positive risk-adjusted returns. Moreover, the performance differences between the previous best and the previous worst funds are significantly positive for most of the regions and SR categories analyzed.
In sum, the evidence reported in this study shows that managers, practitioners and individual investors could obtain higher risk-adjusted returns by allocating their investments to the previous best-performing funds that integrate specific levels of SR criteria in their portfolios.
It follows, therefore, that a potential SR investor should be more careful when selecting the SR funds in which to invest. Given that some SR funds face tighter constraints in their investment decisions, the role of managers is crucial to overcome these restrictions and to add value through their management. Hence, investors and managers willing to incorporate higher levels of SR criteria in their portfolios without experiencing worse risk-adjusted returns should invest in the previous best-performing funds in order to achieve both goals. In other words, pursuing higher ethical values in our portfolios should not imply a worsening of our financial objectives if we choose the right SR funds to invest in.

Implications for theory and practice
This study highlights the relevance of distinguishing between SR funds achieving higher and lower SR attributes in their portfolios, such as environmental or sustainability characteristics, among others. Overall, we find that funds achieving higher scores on their SR attributes seem to underperform their non-SR intensively oriented counterparts. This evidence is, however, mainly driven by the overall behavior of the worst-performing funds, and it is not associated with the sustainable levels accomplished in the funds' portfolios.
Accordingly, several funds with high SR scores do experience greater risk-adjusted returns over time, implying that correctly assessing financial performance is essential to improving the SR funds' results. These findings have several theoretical and practical implications.
For instance, the characteristics and specific risks related to the portfolio can determine the overall returns of a mutual fund. This is especially relevant for funds aiming to invest in sustainable assets to a greater extent, achieving higher socially-conscious objectives in their investments. Therefore, managers and investors should consider the specific features and SR scores related to each portfolio when assessing the relative performance of their funds; otherwise, their results and conclusions will be biased, potentially affecting the consistency and adequacy of their subsequent decisions.
Moreover, this study also highlights the importance of SR funds in the financial sector. As shown throughout the paper, several funds oriented to high levels of sustainability and SR traits are shown to persistently provide investors with greater performance. Accordingly, access to information on the socially responsible and financial features of these investment vehicles plays an important role in improving the investment decisions of all individuals who participate in the market. Therefore, the efforts made by policymakers must be focused on enhancing the transparency, quality and availability of the data reported in the SR mutual fund industry. This would lead managers and investors to optimize their financial and sustainable investment decisions.
Finally, it should be noted that our study only assesses the financial performance of funds with specific levels of environmental, sustainable and other SR attributes in their portfolios. Nonetheless, this perspective also offers many avenues for future research lines, especially those related to managerial perceptions. Some examples would be to assess managers' abilities to detect "green" opportunities, or determine the sensitivity of inflows and outflows in SR portfolios to new environmental information. Moreover, given the evidence from this study about the persistence in the abnormal performance of socially-conscious portfolios, developing sustainable performance measures that combine both ethical and financial criteria in the mutual fund industry is a potentially key topic for academics, investors, and other stakeholders who want a better understanding of the behavior of sustainable portfolios.
Figure legend: High; Low.
Notes: All figures contain densities estimated using kernel density estimation for the selected funds. We chose a Gaussian kernel, and the bandwidths were implemented using the plug-in methods of Sheather and Jones (1991). The vertical lines represent the average for each category.
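As a rough illustration of the density estimates described in the figure notes, the sketch below evaluates a Gaussian kernel density estimate for a small hypothetical sample of fund alphas. Note the substitution: the figures use the plug-in bandwidth of Sheather and Jones (1991), whereas this sketch uses the simpler normal-reference rule h = 1.06 σ n^(−1/5) as a stand-in, so it will not reproduce the published densities.

```python
import math
from statistics import stdev


def normal_reference_bandwidth(sample):
    """Normal-reference rule of thumb (a stand-in, NOT Sheather-Jones)."""
    return 1.06 * stdev(sample) * len(sample) ** (-1 / 5)


def gaussian_kde(sample, h):
    """Return a function x -> estimated density using a Gaussian kernel."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


# Hypothetical fund alphas for one category.
sample = [-0.30, -0.12, -0.05, 0.00, 0.03, 0.08, 0.21]
h = normal_reference_bandwidth(sample)
density = gaussian_kde(sample, h)

# Riemann-sum check that the estimate integrates to roughly one.
area = sum(density(-3 + 0.01 * k) for k in range(601)) * 0.01
print(f"bandwidth = {h:.3f}, area under density = {area:.4f}")
```

Tighter densities, such as those described for some categories, correspond to samples with smaller dispersion and hence a narrower estimated curve.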