Output complexity, environmental conditions, and the efficiency of municipalities

Over the last few years, many studies have analyzed the efficiency of local governments in different countries. An accurate definition of their output bundles—i.e., the services and facilities they provide to their constituencies—is essential to this research. However, several difficulties emerge in this task. First, since in most cases the law only establishes the minimum amount of services and facilities to provide, it may well be the case that some municipalities go beyond the legal minimum and, consequently, might have an uncertain effect on efficiency when compared to other municipalities which stick to the legal minimum. Second, municipalities face very different environmental conditions, which raises some doubts about the plausibility of an unconditional analysis. This study tackles these problems by proposing an analysis in which the efficiency of municipalities is evaluated after splitting them into clusters according to various criteria (output mix, environmental conditions, level of powers). We perform our estimations using order-m frontiers, given their robustness to outliers and immunity to the curse of dimensionality. We provide an application to Spanish municipalities, and results show that both output mix and, more especially, environmental conditions, should be controlled for, since efficiency differences between municipalities in different groups are notable.


Introduction
Over the last few years, a wide range of studies has analyzed the efficiency of municipalities from multiple perspectives. The empirical evidence now available is increasing, and can be divided into two groups. On the one hand, some studies analyze efficiency in the provision of a specific service such as refuse collection (Brueckner 1981; Ruggiero 1995; Bosch et al. 2000). In some countries, however, this type of study presents certain disadvantages related to the difficulty of assigning input usage to each specific service. On the other hand, many studies have considered a global perspective, taking into account that local governments provide their constituencies with a wide variety of services and facilities from the municipal budget. The literature is fairly extensive, yet scattered in time: there has not been a continuous flow of research; rather, studies have appeared sporadically. For more comprehensive reviews, see Tang (1997) or De Borger and Kerstens (2000).
In today's complex scenario of economic and financial downturn and increasing public deficits in several euro-area countries, efficient management of resources at all levels of government (central, regional, and municipal) is essential. As indicated by De Witte and Geys (2011), a reasonable way to deal with increasing tasks and tightening budget requirements is to improve productive or technical efficiency, understood in terms of providing a maximum amount of output for a given level of inputs (Koopmans 1951; Fried et al. 2008). Yet in the particular case of municipalities, a relevant problem (shared by the second category of studies referred to in the first paragraph) is the difficulty of accurately defining, and measuring, what it is that local governments produce. In most cases the problems arise from the impossibility of directly quantifying the supply of public services. The Spanish case, on which we focus, is not free from that criticism, although its magnitude is lower, largely due to the availability of data on most of the public services that municipalities are legally bound to provide.
However, the law only establishes the minimum services and facilities each municipality must provide, which depends upon population. Nothing prevents a particular municipality from going beyond this legal minimum and providing not only more of each compulsory service (such as discretionally increasing the area of public parks), but also providing additional services and facilities whose input usage may be substantial. Although the implications for efficiency are not clear, it does not seem a priori fair to compare municipalities which stick to the legal minimum with those that go beyond it. These and other difficulties when modeling municipalities' provision of services and facilities have led some authors to talk about output complexity (Haynes 2003).
The difficulties of measuring local governments' service provision are exacerbated not only when some municipalities go beyond the legal minimum, but also when some of them choose to provide services and facilities for which information is unavailable. Some studies that estimate the amount of these extra services and facilities acknowledge this complex reality, but their results are generally based on surveys carried out on a limited number of municipalities (see Vilalta and Mas 2006). Previous studies dealing with similar issues in other countries found comparable results, i.e., that the additional costs incurred by many municipalities are quite large, although they also vary a great deal across observations (see, for instance, Bennett and DiLorenzo 1982; Marlow and Joulfaian 1989; Merrifield 1994). These studies constitute further evidence to support the hypothesis that some municipalities provide their constituencies with a larger amount of services and facilities, not only because different constituencies have different needs but also because of the varying environmental conditions each municipality faces.
There is a large body of literature on how the different environmental conditions facing DMUs (Decision Making Units) affect their efficiency in different contexts, not only in local government. Since the pioneering contributions by Banker and Morey (1986a, b), many studies have analyzed the issue. Following De Witte and Kortelainen (2008), one may consider four main families of approaches for incorporating the exogenous environment in nonparametric efficiency analysis. The cited studies by Banker and Morey (1986a, b) can be classified in the first category, which considers a one-stage approach; this approach has some drawbacks shared by some of the later variants (see, for instance, Färe et al. 1989; Ferrier and Lovell 1990). The second category of studies considers a two-stage approach, examples of which would be Ray (1991) and Simar and Wilson (2007). There is a third category of studies based on the concept of metafrontier, or frontier separation. Although the concept has been popularized by Battese and Rao (2002) and Battese et al. (2004), among others, some of their ideas had been developed years before. According to metafrontier approaches, the efficiencies of DMUs that operate under a given production technology are not comparable with those of other DMUs operating under different technologies. In our setting, this would imply that it is not possible to compare the efficiencies of municipalities in different groups formed according to different criteria. Battese and Rao (2002) propose a solution based on the metafrontier using stochastic frontier analysis (SFA), which was later refined by Battese et al. (2004). In their paper, Battese and Rao (2002) assume there are two different data-generation mechanisms, one with respect to the stochastic frontier and the other with respect to the metafrontier model. In contrast, Battese et al. (2004) assume that the metafrontier function is an overarching function that encompasses the deterministic components of the stochastic frontier production functions for those DMUs operating under different technologies. Despite the advantages of the metafrontier, some problems that will be described below remain. Finally, a fourth approach is based on the conditional efficiency measures by Daraio and Simar (2005, 2007b), which developed ideas previously outlined by Cazals et al. (2002). Although the various advantages of these methods have contributed to their increase in popularity (see, for instance, Bonaccorsi and Daraio 2008; De Witte and Dijkgraaf 2009; Daraio and Simar 2005, among other applications in different contexts), some questions such as the issue of bandwidth selection remain unsolved.
We propose a contribution to this literature in the particular sub-field of the metafrontier approaches, in combination with cluster analysis. Specifically, we propose a method which combines both the second and third approaches referred to here, i.e., the two-stage method and the metafrontier method, since groups of observations are formed in a second stage. As indicated by De Witte and Kortelainen (2008), the metafrontier models have certain disadvantages: they can only be applied to categorical environmental variables, in practice it is not possible to include several environmental factors, and comparison and statistical testing of efficiency differences between more than two classes can be challenging, to say the least. As we shall see below, our approach is more robust to these criticisms, since we consider a three-stage process in which efficiencies are estimated in the first stage, groups are built in the second stage (using cluster analysis), and conditional efficiency measures are estimated in the third stage, taking into account the information on cluster membership. This way of proceeding was encouraged by O'Donnell et al. (2008), who indicated the advantages of using multivariate statistical techniques when natural boundaries between groups are unavailable. However, as far as we know, few authors, if any, have combined the metafrontier with cluster methods.
In the particular case of metafrontier methods, most of the ensuing literature has been of an applied nature (see, for instance, Bos and Kool 2006), although several theoretical refinements to the initial methodologies have also been proposed (see, for instance, Ruggiero 2004). In the case of municipalities, the implications of metafrontier methods, as well as of other methods for controlling for environmental conditions, would be related, among other issues, to the fact that different local governments face different constituencies in terms of economic and social conditions, to whose needs municipalities may be more or less responsive irrespective of the amount of services they are obliged to provide. Municipalities also have different characteristics in terms of geography (including, for instance, rugged terrain, or urban sprawl); different sectoral production characteristics (in terms of tourism, etc.); and other characteristics.
We consider that municipalities may provide more services and facilities than the legally established minimum because of the different environmental conditions they face. Tourist municipalities may face budget strains due to their greatly increased personnel needs during the high season. Other municipalities may face higher costs because of urban sprawl. According to Hortas-Rico and Solé-Ollé (2010), the urban spatial structure of many Spanish cities not only has an environmental impact, but also a major impact on municipal finances. In other cases, the reasons might be more involved, such as wealthy constituencies asking for additional services, or rising expenditures on security because of the rapid population increases experienced by some Spanish cities. As documented by Vilalta and Mas (2006), in a study applied to a sample from the province of Barcelona, more than 30% of municipal expenditures were discretionary. In these circumstances, one may reasonably expect that some municipalities will be mislabeled as inefficient (or, at least, more inefficient than other municipalities) simply because of this wide variety of scenarios. Therefore, it would be more appropriate to compare only local governments facing similar environmental conditions. This study approaches these problems through a three-stage analysis that first analyzes the productive efficiency of municipalities, then divides them into groups according to different classifications, and finally measures conditional efficiencies. In the second stage we consider three criteria to classify municipalities into different groups. The first one considers clusters according to local governments' output mixes, in order to compare the efficiency of municipalities with similar output bundles; in this way we can control for the fact that some of them might provide services and facilities beyond the legal minimum and hence have more complex output bundles.
The second criterion constructs groups of municipalities for which we include information on environmental conditions. The third criterion forms groups according to the level of powers, which is determined by the population living in each municipality (the levels of services each municipality must provide hinge on the number of inhabitants) since, as documented by Balaguer-Coll et al. (2010a, b), some links may exist between municipalities' efficiency and decentralization issues. In our application to Spanish municipalities, results show that both output mix and the different environments that municipalities face are issues to control for, since the differences between municipalities belonging to different groups turned out to be statistically significant.
Therefore, in the second stage municipalities are classified into groups according to different criteria given that, according to the hypotheses formulated, local governments in each group might be facing different production opportunities. They respond by making choices from different sets of feasible input-output combinations. These so-called technology sets differ because of variations in available stocks of physical, human and financial capital, economic infrastructure, resource endowments and any other characteristics of the physical, social and economic environment in which production, or service provision, takes place. As indicated by O'Donnell et al. (2008), such differences have led efficiency researchers to estimate separate frontiers for different groups of DMUs. In public sector applications such as the measurement of local government efficiency, most studies have used nonparametric techniques such as DEA (Data Envelopment Analysis), or its nonconvex version (Free Disposable Hull, FDH), for a variety of reasons (Fox 2001). O'Donnell et al. (2008) have recently filled this gap in the literature, by extending the metafrontier to DEA and alternative SFA approaches for estimating both metafrontiers and group frontiers. However, their solutions are not entirely satisfactory, mainly because of the curse of dimensionality that generally affects efficiency scores obtained using DEA. As indicated by Daraio and Simar (2007a), increasing the number of inputs or outputs, or decreasing the number of units being compared, leads to higher efficiencies, simply as a result of a statistical artifact. Multiple applications in disparate fields are affected by this issue (see, among many others, Maudos et al. 2002), which is especially severe in fields such as mutual fund evaluation, where difficulties arise in defining the number of inputs and outputs (Joro and Na 2002).
Our approach to dealing with this problem is based on the order-m frontier initially proposed by Cazals et al. (2002). We modify their algorithm to control for the existence of municipalities facing different technologies (i.e., municipalities in different groups), in such a way that the efficiencies found are now comparable because the curse of dimensionality problem is alleviated significantly. Therefore, we contribute to this growing field of research in which, as pointed out by Battese et al. (2004) in their conclusions, "further theoretical and applied studies with other models for technical inefficiency effects are clearly desirable."
The plan of the paper is as follows. Section 2 provides details on the techniques used to measure efficiency, Sect. 3 specifies the particularities of the data employed, and Sect. 4 presents and comments on the most relevant results. Finally, Sect. 5 summarizes with some concluding remarks.
2 Measuring the efficiency of municipalities using nonparametric techniques

Free disposable hull
In the first stage we measure efficiency for all municipalities in our sample, regardless of their different characteristics. Therefore, we consider a common non-convex Free Disposable Hull technology (FDH, see Deprins et al. 1984), under which the cost efficiency coefficient $\alpha_s$ of municipality s solves

$$
\begin{aligned}
\min_{\alpha_s,\, z} \quad & \alpha_s \\
\text{s.t.} \quad & \sum_{j=1}^{S} z_j\, y_{j,i} \ge y_{s,i}, \quad i = 1, \ldots, I, \\
& \sum_{j=1}^{S} z_j\, TC_j \le \alpha_s\, TC_s, \\
& \sum_{j=1}^{S} z_j = 1, \quad z_j \in \{0, 1\}, \quad j = 1, \ldots, S,
\end{aligned}
\tag{1}
$$

where $TC_s$ is total cost for municipality s, s = 1, …, S, $y_{s,i}$ represents the value of its ith output, i = 1, …, I, and $z_j$ denotes the intensity level at which observation j is conducted. The FDH methodology is particularly suited to detecting the most obvious cases of inefficiency, as this technique is very stringent with regard to inefficiency measurement. For each municipality labeled as FDH-inefficient, at least one other municipality with superior performance can be found in the sample. Under some other technological assumptions (e.g., for the convex Data Envelopment Analysis, DEA, models) it may well be the case that the inefficiency coefficient depends entirely on the assumption of convexity.
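To make the FDH comparison concrete, the following is a minimal sketch rather than the authors' implementation. It assumes total cost is the only input, in which case an FDH cost efficiency score reduces to comparing each municipality with the cheapest municipality that produces at least as much of every output. The function name and the toy figures are hypothetical.

```python
def fdh_cost_efficiency(total_cost, outputs):
    """Illustrative FDH cost efficiency with a single cost input.

    total_cost: list of costs, one per unit.
    outputs: list of output vectors, one per unit.
    Returns one score per unit; 1.0 means no cheaper dominating unit exists.
    """
    scores = []
    for tc_s, y_s in zip(total_cost, outputs):
        # Costs of all units producing at least as much of every output as s.
        # Unit s always qualifies, so the list is never empty.
        dominating_costs = [
            tc_j
            for tc_j, y_j in zip(total_cost, outputs)
            if all(a >= b for a, b in zip(y_j, y_s))
        ]
        scores.append(min(dominating_costs) / tc_s)
    return scores


# Hypothetical toy data: four units, two outputs each.
toy_cost = [100.0, 80.0, 120.0, 60.0]
toy_outputs = [[10, 5], [10, 6], [12, 4], [8, 3]]
print(fdh_cost_efficiency(toy_cost, toy_outputs))
```

In this toy sample the first unit is dominated by the second, which produces at least as much at a cost of 80 instead of 100. The third unit is not dominated by any other unit, so it is rated efficient simply because no comparable unit exists.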
2.2 Measuring efficiency using robust techniques: order-m partial frontiers
At this point, two aspects of the FDH methodology deserve special attention: efficiency by default and outliers. In the absence of a sufficient number of similar municipalities for a comparison, a municipality is labeled as efficient by default. This ranking of efficiency does not result from any effective superiority, but rather is due to the lack of information that would allow pertinent comparisons. In addition, by construction, the FDH concept of efficiency applies both to the municipality that presents the lowest level of spending and to those with the highest values for at least one output indicator. This extreme form of the sparsity bias that characterizes the FDH technique leads to a lack of discrimination among production units and constitutes a shortcoming of the FDH approach. As for outliers, nonparametric frontiers are by definition determined by the extreme values in the space of inputs and outputs. Thus, the appearance of outliers (atypical observations that differ significantly from the rest of the data) may considerably influence efficiency computations. It is therefore necessary to verify that the divergence does not result from evaluation errors. However, once the reliability of the data set has been confirmed, such observations may prove to be valuable information.
Recent work has established the statistical properties of the FDH estimator (Kneip et al. 1998; Simar and Wilson 2000) so that inference is now possible either by using asymptotic results or by means of the bootstrap. Simar and Wilson (2000) present a survey on this issue as well as a detailed examination of the statistical properties of the nonparametric estimators in a multivariate context. Like other nonparametric measures, FDH estimators suffer from the curse of dimensionality due to their slow convergence rate.
Taken together, the above-mentioned problems may be serious enough to jeopardize the FDH estimates. To solve these problems some additional procedures are required in order to make FDH estimates more robust. Several approaches have already been proposed in the literature. Wilson (1993, 1995) introduced descriptive methods to detect influential observations in nonparametric efficiency estimations. More recent developments include the order-m frontiers (Cazals et al. 2002; Simar 2003). The order-m approach, based on the concept of expected maximal (minimal) output (input or cost) function, yields frontiers of varying degrees of robustness. Daraio and Simar (2007a, p. 82) describe two ways to compute the order-m efficiency coefficients. In what follows, we briefly describe the second one, based on a Monte Carlo algorithm.
Consider a positive fixed integer m. For a given level of input ($x_0$) and output ($y_0$), the estimation defines the expected value of the maximum of m random variables ($Y_1, \ldots, Y_m$), drawn from the conditional distribution of the output matrix Y observing the condition $Y_m \ge y_0$. Formally, the proposed algorithm (algorithm I) to compute the order-m estimator has the following steps:
1. For a given level of $y_0$, draw a random sample of size m with replacement among those $y_{sm}$ such that $y_{sm} \ge y_0$.
2. Compute Program (1) and estimate $\tilde{\alpha}_s$.
3. Repeat steps 1 and 2 B times and obtain B efficiency coefficients $\tilde{\alpha}_s^b$ (b = 1, 2, …, B). The quality of the approximation can be tuned by increasing B, but in most applications B = 200 seems to be a reasonable choice.
4. Compute the empirical mean of the B samples as:

$$
\alpha_s^m = \frac{1}{B} \sum_{b=1}^{B} \tilde{\alpha}_s^b .
$$

As m increases, the number of observations considered in the estimation approaches the observed units that meet the condition $y_{sm} \ge y_0$, and the expected order-m estimator in each one of the b iterations ($\tilde{\alpha}_s^b$) tends toward the FDH coefficient ($\alpha_s^{FDH}$). Thus, m is an arbitrary positive integer value, but it is always convenient to observe the fluctuations of the $\tilde{\alpha}_s^b$ coefficients depending on the level of m. For acceptable values of m, $\alpha_s^m$ will normally present values smaller than unity (this indicates that these units are inefficient, as total costs can be reduced without modifying the production plan). When $\alpha_s^m > 1$, unit s can be labeled as super-efficient, as the order-m frontier exhibits a higher total cost.
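The Monte Carlo steps of algorithm I can be sketched as follows. This is an illustrative reading under stated assumptions, not the authors' code: total cost is treated as the single input, so solving the FDH program on each subsample amounts to taking the lowest sampled cost relative to the cost of the unit under evaluation. The function name, the `seed` parameter and the toy data are hypothetical.

```python
import random


def order_m_cost_efficiency(total_cost, outputs, m, B=200, seed=0):
    """Illustrative order-m cost efficiency via Monte Carlo (single cost input)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    scores = []
    for tc_s, y_s in zip(total_cost, outputs):
        # Units producing at least as much of every output as unit s.
        ref_costs = [
            tc_j
            for tc_j, y_j in zip(total_cost, outputs)
            if all(a >= b for a, b in zip(y_j, y_s))
        ]
        draws = []
        for _ in range(B):
            # Step 1: sample m reference units with replacement.
            sample = rng.choices(ref_costs, k=m)
            # Step 2: FDH score of unit s against the sampled units only.
            draws.append(min(sample) / tc_s)
        # Steps 3 and 4: average the B coefficients.
        scores.append(sum(draws) / B)
    return scores


# Hypothetical toy data: four units, two outputs each.
toy_cost = [100.0, 80.0, 120.0, 60.0]
toy_outputs = [[10, 5], [10, 6], [12, 4], [8, 3]]
for m in (1, 5, 50):
    print(m, order_m_cost_efficiency(toy_cost, toy_outputs, m))
```

Because unit s itself need not appear in a given subsample, a score can exceed one, which corresponds to the super-efficient case described above. Printing the scores for several values of m is one simple way to observe how they move toward the FDH scores as m grows.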

Adapting the order-m estimators to different technologies
As mentioned above, the order-m estimation is an excellent tool to mitigate the problems of dimensionality and the presence of extreme observations and outliers. However, this evaluation will be of little use if part of the inefficiency found hinges on output complexity or on the different environmental conditions that local governments face, which could lead to biased estimates of the frontier, and hence to misleading policy implications. Therefore, our objective is to define a process that can estimate the impact on efficiency of output complexity, environmental conditions and, in general, any other criterion for classifying municipalities into different groups. This estimation is possible following the proposals of Battese et al. (2004) and O'Donnell et al. (2008) for estimating a metafrontier production function. This process (algorithm II) contains the following steps:
1. Use cluster analysis to classify the S units into groups $S_1, S_2, \ldots, S_C$.
2. Following the algorithm to estimate the order-m efficiency coefficients, complete steps 1 to 4 of algorithm I to estimate the efficiency coefficients $(\alpha_s^{m,S_1}, \alpha_s^{m,S_2}, \ldots, \alpha_s^{m,S_C})$ for the municipalities classified in each one of the clusters $S_1, S_2, \ldots, S_C$.
In order to facilitate the cross-comparison of the results, irrespective of the number of units classified in each cluster, the same value for m will be assigned in all the estimations. By doing this, the problems of dimensionality and the potential impact of the outliers will be neutralized. Figure 1 presents an illustration of a simple case with one output and the total cost. At a given output level, the technology gap ratio (TGR) is defined as the lowest possible cost within the metafrontier divided by the lowest total cost at the conditional specific cluster.
5 We have borrowed the concept of technology gap ratio from Battese and Rao (2002), Battese et al. (2004) and O'Donnell et al. (2008). They use the term technology because they consider DMUs operating under different technologies. Although we also talk about different technologies, in order to facilitate comparisons with their papers, we must acknowledge we do not explicitly test whether the municipalities in the different groups have different technologies. Therefore, when referring to technology we are only suggesting whether the effect of each hypothesis (output mix or environmental conditions) matters or not.
J Prod Anal (2013) 39:303-324
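A rough sketch of algorithm II and the technology gap ratio follows. It rests on several assumptions that are not in the text: cluster labels are taken as given (any clustering routine could supply them), total cost is the single input, and the TGR of a unit is computed as its pooled order-m score divided by its within-cluster order-m score, which is one way of expressing the ratio of the two frontier costs for the same unit. All names and figures are hypothetical.

```python
import random


def order_m_score(tc_s, y_s, reference, m, B, rng):
    """Order-m cost score of one unit against a list of (cost, outputs) pairs."""
    ref_costs = [
        tc for tc, y in reference if all(a >= b for a, b in zip(y, y_s))
    ]
    draws = [min(rng.choices(ref_costs, k=m)) / tc_s for _ in range(B)]
    return sum(draws) / B


def clustered_order_m(total_cost, outputs, labels, m, B=200, seed=0):
    """Pooled and within-cluster order-m scores plus their ratio (TGR)."""
    rng = random.Random(seed)
    units = list(zip(total_cost, outputs))
    results = []
    for (tc_s, y_s), label in zip(units, labels):
        # Reference set restricted to the unit's own cluster.
        own_group = [u for u, l in zip(units, labels) if l == label]
        # The same m is used for the pooled and the within-cluster estimates.
        pooled = order_m_score(tc_s, y_s, units, m, B, rng)
        within = order_m_score(tc_s, y_s, own_group, m, B, rng)
        results.append(
            {"cluster": label, "pooled": pooled, "within": within,
             "tgr": pooled / within}
        )
    return results


# Hypothetical toy data: four units with identical output, two clusters.
toy_cost = [100.0, 80.0, 90.0, 60.0]
toy_outputs = [[10], [10], [10], [10]]
toy_labels = ["A", "A", "B", "B"]
for row in clustered_order_m(toy_cost, toy_outputs, toy_labels, m=200, B=50):
    print(row)
```

In this toy example the cheapest unit overall belongs to cluster B, so units in cluster A look less inefficient against their own cluster than against the pooled sample, and their ratio falls below one. With a large m the draws almost always contain the cheapest reference unit, so the scores are close to their FDH counterparts; smaller values of m give the more robust partial-frontier scores.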

Data, inputs, and outputs
We perform the analysis for a sample of Spanish municipalities with populations between 1,000 and 50,000 inhabitants for the year 2000. Both input and output data are provided by the Spanish Ministry for Public Administration. Information on outputs is gathered through the survey on local infrastructures and facilities (Encuesta de Infraestructuras y Equipamientos Locales), which is conducted every five years and, consequently, constrains our sample period. Information on inputs basically consists of different types of costs, and is taken from local government budgetary data, which are available for every year. The regions that meet our criteria (data for the year 2000, and data for both inputs and outputs) were Andalusia, Aragon, Asturias, the Canary Islands, Cantabria, Castile-Leon, Castile-La Mancha, Extremadura, Murcia, La Rioja, and the Valencian Community. The final sample was made up of 1,198 municipalities. There was no information for the remaining regions for several reasons. At the time of the study, Madrid had not yet presented information on its outputs. Catalonia, the Basque Country and Navarra do not have to provide the Spanish Ministry for Public Administration with this information.
Measuring the production process at municipal level is usually more difficult than in other sectors/industries. We can distinguish three stages in this process of transforming inputs into outputs (Bradford et al. 1969). In the first stage primary inputs (labor, equipment and external services) are transformed into intermediate outputs (e.g., hours of traffic control or the extension of police services). In the second stage, intermediate outputs are transformed into direct outputs. This is what Bradford et al. (1969) call D-outputs, which are ready for consumption by the population. In the third stage, the direct outputs ultimately have welfare effects on consumers (e.g., increasing perceptions and feelings of safety and welfare). The third stage of the process can be directly captured by outcome indicators (labeled C-outputs by Bradford et al. 1969), which reflect the degree to which direct outputs translate into welfare improvements as perceived by consumers.
The efficiency of municipalities can be measured at each stage of this production process. However, under normal circumstances this will be difficult because data might be either unavailable or simply poor, making it difficult to distinguish between primary inputs, intermediate outputs, direct outputs, and final welfare effects. For this reason the analysis is usually confined to analyzing the first and second phases of this process, i.e., the links between primary inputs and direct outputs. We base our selection of outputs on the services and facilities provided by each municipality. All local authorities must provide public street lighting, cemeteries, waste collection and street cleaning services, drinking water to households, access to population centers, surfacing of public roads, and regulation of food and drink. In some cases we must select proxies for these services and facilities. As pointed out by De Borger and Kerstens (1996), population is assumed to proxy for the various administrative tasks undertaken by municipalities, but it is clearly not a direct output of local production. Other relevant outputs, such as provision of primary and secondary education, are not the responsibility of Spanish municipalities.
Spanish law requires municipalities to provide minimum services depending on their size. Some of the minimum services and facilities must be provided by all municipalities, while others are only binding for larger municipalities (with populations of over 5,000, 20,000, and 50,000, the boundaries that define the different categories). The second column in Table 1 reports information on the minimum services that each category of municipalities must provide. The third column indicates the selected output indicators to measure the different services and facilities. Our output choice was driven by the minimum services and facilities. The list of outputs for the year 2000, along with summary statistics, is reported in Table 2. The choice was also driven by previous studies on local government efficiency in other European countries, since for the most part they are endowed with the same competencies. Therefore, we will be considering eight outputs which measure the minimum amount of services and facilities that municipalities must provide to their constituencies. However, some of these outputs are not proper outputs but proxies for them. This is the case of population size ($Y_1$), which is used as a proxy for several services (those indicated in Table 2). We admit this assumption is crude, but the available information, although relatively abundant, has to measure a very complex reality. Indeed, population would presumably be a better indicator of "needs" or "demand" than of municipal "production". This assumption has wide acceptance in the literature, as shown by studies such as De Borger and Kerstens (1996) or Vanden Eeckaut et al. (1993), basically due to a generalized lack of information on municipal services. Other services and facilities can be measured more directly.
This is the case of public street lighting (measured by the number of lighting points, $Y_2$), waste collection (measured by the tons of waste collected, $Y_3$), provision of markets (measured by the market surface area, $Y_6$), or public parks (measured by the registered surface area of public parks, $Y_7$). Other outputs measure several services and facilities. For instance, street infrastructure surface area ($Y_4$) measures not only the paving of public roads and street cleaning but is also a proxy for access to population centers, fire prevention and extinction, or urban passenger transport service, and public buildings surface area ($Y_5$) measures the provision of public libraries, the provision of social services, or the provision of public sports facilities. Finally, some services and facilities such as the provision of social services are measured by several outputs (total population, $Y_1$, public building surface area, $Y_5$, and the surface area of assistance centers, $Y_8$).
De Borger and Kerstens (1996) or, more recently, Geys et al. (2010), although to a lesser extent than those focusing on the US case (see, e.g., Hughes and Edwards 2000). With respect to these previous contributions we have, in general, a more complete database which a priori would lead to more accurate results. However, some intricate issues still remain. For instance, in their recent paper, De Witte and Geys (2011) propose a methodology to analyze efficient public good provision in stage one only, as compared to the two stages we are focusing on. These authors consider that if the observed value of a given output such as, for instance, educational outcomes (although not in the case of Spain, where these powers are in the hands of regional governments) is demand-driven, then it would lie partially beyond the control of the public service provider.
According to this line of reasoning, some particular services such as waste collection would fall in this category of outputs, since residents partially choose the amount of waste they produce (by recycling, etc.). Although local governments still have a say here (they can, for instance, either facilitate or hamper recycling), we must admit it might not be as appropriate as some of the other variables chosen to measure municipal output. Therefore, our view of public good provision partly coincides with several previous contributions, but differs from other recent ones such as De Witte and Geys (2011), who adapt Hammond's (2002) approach, which accounts for the fact that municipal outcomes might not be solely determined by the public service provider, and that a distinction between service potential and final outputs should be made (two stages). The benefit of this approach is that it avoids the bias induced by using final outputs that are influenced by citizens' coproduction. Although undoubtedly interesting, this approach has some limitations such as: (1) the need to assume that public service providers provide services fitting local preferences; and (2) the distinction between the first and second stage is not always clear, and public service providers could be thought to have some influence over the second stage of the production process (De Witte and Geys 2011, p. 322).
Our definition of inputs is based on budgetary variables reflecting the economic structure of Spanish local government expenditures (costs), details of which are determined by Spanish legislation, 10 which considers three basic categories: current or ordinary expenditures, capital expenditures, and financial expenditures. Within these, current expenditures are further divided into four chapters, or categories, which account for: (1) personnel expenditure; (2) current goods and services expenditures; (3) financial expenditures; (4) current transfers. Capital expenditures are also broken down into either real investments or capital transfers. The former is what Table 2 refers to as capital expenditures (X 4 ), i.e., all expenditures local governments implement: (1) to produce or acquire capital goods; (2) to acquire necessary goods to provide local services in the right conditions; or (3) financial expenditures that are suitable for amortization. On the other hand, capital transfers (X 5 ) refer to the payments to institutions to finance certain investments. Descriptive statistics for year 2000 are provided in Table 2. Since our analysis is entirely confined to overall cost efficiency, the fact that some local government departments may actually be sharing some costs does not raise any particular issue. 11 Another issue concerning the data is the consideration of the aggregate total costs (in monetary units), or overall cost, as opposed to considering the amount of the inputs (in physical units) and their corresponding input prices, which may be non-competitive. In such circumstances, following Camanho and Dyson (2008), what we are estimating is a coefficient of economic efficiency that depends on two basic sources of inefficiency: market efficiency, dependent on the input prices, and Farrell efficiency, dependent on the physical units consumed. 
Unfortunately, the lack of data on physical units prevents us from decomposing the economic efficiency, but the meaning of the economic inefficiency is maintained as the cost excess due to the two aforementioned basic factors.
We must also select those variables to include when carrying out the cluster analysis for classifying municipalities into groups according to the different hypotheses. This task is not easy given that we face certain relevant constraints. First, there is no well-established theory as to which variables constitute the "environmental conditions" that might impact on each municipality's cost structure. Second, the available information is also limited. In a number of contexts the relevance of considering groups is unquestionable. As indicated by O'Donnell et al. (2008), in most practical settings DMUs can be grouped a priori on the basis of geographical, economic and/or political boundaries, to name a few. 12 However, it is not entirely clear how groups must be formed. In the absence of "natural" boundaries for the different groups, multivariate statistical techniques such as cluster analysis are available for determining both the number of groups and group membership. In our particular setting, the a priori classification or "natural" boundaries would be those based on size, since the limits of each group are determined by the extent of their powers. In contrast, we need the cluster analysis technique to classify municipalities according to either output mix or environmental conditions.
8 We are grateful to one of the referees for this comment.
9 Other approaches have been reviewed by, for instance, Cordero-Ferrera et al. (2008).
10 See Ministerial Order (Orden Ministerial), September 20th, 1989.
11 There is a certain amount of subcontracting going on in some Spanish municipalities. However, this information is not publicly available. The only way to account for it would be to assume that an abnormally high value for X 2 (expenditures on goods and services) and a low value of X 1 (wages and salaries) would signal the existence of subcontracting, but we consider such an assumption to be too heroic. This and related issues have been considered recently by, for instance, Zafra Gómez et al. (2011).
Regarding the classification based on output complexity, the variables selected to construct the groups are similar to those chosen as outputs, divided by population. This helps to control for the fact that, as pointed out in the introduction, some municipalities might go beyond the legal minimum and provide an amount of services which does not correspond to their size. Therefore, we only compare municipalities with similar output mixes, i.e., only those in the same group. We have selected as many variables to construct the clusters as outputs. The details are reported in Table 3.
Regarding the classification based on environmental conditions, we have chosen some of the variables provided by the Anuario Estadístico de La Caixa. 13 Although they can be partly judged as ad hoc, and although some factors influence the amount, allocation and distribution of local public spending, we consider these provide a rough idea of the environmental conditions facing each municipality that might have an impact on municipal finances. Summary statistics are reported in Table 4, and the particular definition of each variable considered for performing the cluster analysis follows: 14
Total surface area (ENV1): divided by population. This indicator is similar to density, which is usually defined as urbanized land per person. Although some authors (Hortas-Rico and Solé-Ollé 2010) consider it is more appropriate to use urbanized land per person, this variable was not available for all municipalities in our sample. We consider this is an appropriate proxy for urban sprawl, although other indicators can be considered (for instance, street surface area divided by total surface).
Tourist index (ENV2): this index may contribute to rising municipal expenditures, at least during some months of the year. Comparing these municipalities only with their peers might be more appropriate. This index is constructed by taking into account information on the economic activity tax ("Impuesto de Actividades Económicas", IAE), based on the type of touristic accommodation (number of rooms and annual occupancy rates). The value of the index indicates each municipality's share (of 100,000) of total national revenues due to this type of activity. 15
Economic status (ENV3): municipalities with wealthier populations might be facing higher requirements. These constituencies may be willing to pay more taxes but, in return, they will demand more services and facilities. Some additional reasons are explained in the next section. This is an indicator of gross disposable income per capita, estimated for each municipality, which is divided into ten classes, from the poorest (level 1) to the richest (level 10). We define this indicator as the sum of total family income in the analyzed period, which is estimated using several covariates.
Industrial activities (ENV4): divided by population. Municipalities in which industrial activity is high may be affected by a different, probably higher, cost structure, since this type of activity requires higher investments in infrastructures, security, anti-pollution policies, etc., which may (although not necessarily) be offset by higher tax revenues. This variable is defined as the number of industrial activities subject to taxes. This number is very similar to the number of industrial establishments per municipality.
Total number of cars (ENV5): divided by population. This might be related to ENV3. Some wealthy constituencies also have higher levels of education (in relative terms) and might have a preference for non-polluting means of transport.
Unemployment (ENV6): higher unemployment might entail higher crime and, therefore, increased demands for security. This variable is lower than what the official statistics indicate because it is defined as total unemployment divided by total population instead of working population. We cannot use the unemployment figures issued by the Spanish Bureau of Statistics (INE, Instituto Nacional de Estadística) because they are constructed at the provincial, regional and national levels, but not at municipal level.
Total population growth, 1991-1998, percentage (ENV7): municipalities facing higher population growth might have had to increase the services and facilities at their own expense, because of the speed at which the population's demands have increased. If central and regional governments have not reacted promptly, they might face a sudden imbalance between their revenues and their costs.
Construction (ENV8): divided by population. Municipalities where construction was higher have also raised high tax revenues, which might have led to inefficient management of these increased revenues (see, for instance, Silkman and Young 1982). In addition, the population levels in these municipalities might have increased sharply, 16 which may have driven some municipalities to increase their social expenditures, as well as other expenses such as civil protection, security, etc., some of which are not included in the list of minimum services municipalities must provide. This variable is constructed using criteria analogous to those used for the industrial activities.
Agricultural vehicles/Total number of vehicles, percentage (ENV9): this information would indicate whether it is a rural municipality whose needs might differ from others with different sectoral specializations. It might also be a proxy for urban sprawl.
Number of bank branches (ENV10): divided by total population. This would indicate another type of specialization and, in addition, it might proxy for the economic level of the municipality.
12 As also indicated by O'Donnell et al. (2008), if the analysis were conducted in an SFA framework, it would also be possible to conduct statistical tests concerning the number of groups. El-Gamal and Inanoglu (2005) propose other methods to circumvent the use of multivariate analysis techniques.
13 An annual report provided by the largest Spanish savings bank, La Caixa.
14 More specific details on the definitions of the variables and the methodologies used to construct them are available from La Caixa Foundation upon request.
15 Please note that we only have information for municipalities with population between 1,000 and 50,000.
We admit the selection of these variables is somewhat ad hoc. However, we consider they represent a realistic summary of the different socioeconomic conditions affecting each municipality.
Two of the decisions involved in performing cluster analysis are the choice of the measure of similarity and the choice of the clustering method. Regarding the former, one of the most popular choices is the squared Euclidean distance. Regarding the latter, although there are several alternatives, the Ward hierarchical clustering method has the advantage of maximizing intra-group homogeneity and inter-group heterogeneity. In addition, the technique is robust to outliers and groups are not too dissimilar in size. However, the criterion that must ultimately guide the decision on both the methodology and the optimal number of groups is whether the final groups are sensible in any way, and whether statistical differences exist among group centroids. Battese et al. (2004) point out the importance of analyzing whether all municipalities share the same technology. If all municipality-level data were generated from a single production function and the same underlying technology, there would be no good reason for estimating the efficiency levels of municipalities relative to a metafrontier. We assume that if statistically significant differences existed among the different groups, it would constitute evidence in favor of comparing municipalities with those in their group only.
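The clustering step described above can be illustrated with a minimal sketch of Ward's minimum-variance criterion. This is not the authors' implementation: the function name `ward_clusters`, the toy data and the stopping rule (a fixed number of groups `k`) are assumptions made for illustration, and the naive search over all pairs is only suitable for small samples. At each step the two clusters whose merger causes the smallest increase in within-group sum of squares, n_a n_b / (n_a + n_b) times the squared Euclidean distance between centroids, are joined.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerate points into k groups using Ward's merge cost (illustrative)."""
    # Each cluster is stored as (member indices, centroid coordinates).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(pair):
        (members_a, centroid_a), (members_b, centroid_b) = clusters[pair[0]], clusters[pair[1]]
        n_a, n_b = len(members_a), len(members_b)
        squared_distance = sum((u - v) ** 2 for u, v in zip(centroid_a, centroid_b))
        # Increase in within-group sum of squares caused by merging a and b.
        return n_a * n_b / (n_a + n_b) * squared_distance

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2), key=merge_cost)
        (members_a, centroid_a), (members_b, centroid_b) = clusters[a], clusters[b]
        n_a, n_b = len(members_a), len(members_b)
        merged_centroid = [(n_a * u + n_b * v) / (n_a + n_b)
                           for u, v in zip(centroid_a, centroid_b)]
        merged = (members_a + members_b, merged_centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical per-capita indicators for five municipalities (not from the paper).
toy_data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
groups = ward_clusters(toy_data, k=3)
print(groups)
```

In practice the variables would typically be put on a comparable scale before clustering, and the number of groups would be chosen by inspecting the merge costs and testing for differences among group centroids, as the text indicates.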

Results
Table 5 provides summary statistics on unconditioned efficiency for order-m efficiency scores. Results are reported for all municipalities, and also for the different size categories, given the differences in their powers. Results have been obtained by specifying a common frontier-the metafrontier-for all 1,198 observations. Thus, although results are split into different municipality size categories, they correspond to the same common frontier. The results corresponding to all municipalities, regardless of their powers, are displayed in the first row. Average efficiency is 91.18 %, which is a high value given that municipalities would become fully efficient if they were able to decrease their total costs by 8.82 % only. However, this is an average effect which varies across municipalities. The values at both tails of the distribution suggest that a remarkable variety of behaviors exist, since the minimum is 23.44 %, whereas the maximum is 142.07 %. In the former case, cost inefficiency is high, whereas the latter refers to cases of super-efficiency-units which lie beyond the frontier and can be regarded as outliers. This finding is important, since it constitutes a clear advantage of the order-m frontiers over DEA or FDH, which are strongly affected by the existence of outliers. In the case of order-m frontiers these extreme observations are labeled as super-efficient and do not affect the efficiencies found for other observations. Table 5 also reports order-m efficiencies for the different categories of municipalities split by population and, consequently, levels of powers. The smallest municipalities in the sample (those with populations between 1,000 and 5,000) are, on average, the most inefficient. Mean efficiency is 89.04 %, close to the global mean value of 91.18 %. These results are partly similar because this is, by and large, the category with the most observations. 
In contrast, medium sized municipalities (with populations between 5,000 and 20,000) and large municipalities (with populations over 20,000) show higher efficiencies. Not only is average efficiency higher (94.17 and 97.87 %, respectively), but also the number of municipalities lying on the frontier (i.e., fully efficient municipalities) is much higher (59.79 and 90.14 %, respectively). 17 However, the most interesting results are those obtained for the different clusters, constructed either using output mix or environmental variables. Descriptive statistics (medians) on the variables used to build the clusters are provided in Table 3 (clusters based on output mix) and 4 (clusters based on environmental variables). In order to facilitate interpretations, we provide a lower panel below each table that reports a summary of the variables included to form the clusters.
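The order-m scores summarized in Table 5, including values above 100 %, can be illustrated with a minimal Monte Carlo sketch of a cost-oriented order-m estimator in the spirit of Cazals et al. (2002). It assumes a single aggregate cost per municipality; the function name, the default values of `m` and `reps`, and the toy data are hypothetical and are not taken from the paper.

```python
import random


def order_m_cost_efficiency(costs, outputs, unit, m=25, reps=200, seed=0):
    """Monte Carlo order-m cost efficiency of one unit (illustrative sketch)."""
    rng = random.Random(seed)
    own_cost, own_outputs = costs[unit], outputs[unit]
    # Reference set: costs of units producing at least as much of every output.
    # The evaluated unit always belongs to it, so the set is never empty.
    peer_costs = [c for c, y in zip(costs, outputs)
                  if all(a >= b for a, b in zip(y, own_outputs))]
    total = 0.0
    for _ in range(reps):
        # Draw m peers with replacement and benchmark against the cheapest one.
        drawn = [rng.choice(peer_costs) for _ in range(m)]
        total += min(drawn) / own_cost
    return total / reps


# Hypothetical data: total cost and a single output for three municipalities.
toy_costs = [10.0, 20.0, 5.0]
toy_outputs = [(1.0,), (1.0,), (2.0,)]
for i in range(len(toy_costs)):
    print(i, round(order_m_cost_efficiency(toy_costs, toy_outputs, i), 4))
```

Because the evaluated unit is not necessarily among the m drawn peers, a unit that is cheaper than its peers can obtain a score above 1. This is the super-efficiency discussed above, and it also shows why a single extreme observation does not dictate the benchmark for all other units.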
Regarding the clusters based on output mixes, as reported in Table 3, differences between the municipalities in each group (in terms of the selected variables) are noteworthy, even though the size of some of these clusters is remarkable. For instance, group 1 is made up of 300 municipalities, roughly 1/4 of the total sample. Ideally, it would be desirable to have clusters containing fewer observations to facilitate comparisons. However, some clusters were difficult to split into further groups, despite considering the Ward method to cluster observations (which tends to form equally-sized clusters). This is an interesting finding, which would corroborate the fact that many municipalities indeed do different things, making comparisons misleading.
In some cases, the medians for some clusters and variables differ substantially; this is the case for cluster 2 in OUTMIX5, cluster 3 in OUTMIX6, cluster 4 in OUTMIX3, cluster 5 in OUTMIX4, cluster 6 in OUTMIX2 and OUTMIX5, cluster 7 in OUTMIX6, cluster 8 in OUTMIX2, and cluster 9 in OUTMIX5. Therefore, the clusters excel in some particular variables, even taking into account that some of them contain many observations compared with the total sample size. In addition, although there is a wide consensus that the multivariate technique of cluster analysis is flawed, especially because of the multiple decisions it involves, the MANOVA analysis indicated that the differences between the identified groups and variables were indeed significant. 18 As indicated in Table 4, the differences found among the clusters based on environmental variables (with respect to the variables included in the analysis) are also noteworthy, and significant at the 1 % level. In this case, differences are more difficult to distinguish, because of the narrow range of variation for some of these variables (for instance, ENV2 or ENV3). However, certain groups excel in some variables. See, for instance, cluster 3 in ENV1, cluster 8 in ENV4, clusters 2 and 6 in ENV7, cluster 8 in ENV8, cluster 7 in ENV9 or cluster 6 in ENV10.

Describing the clusters
As indicated in the introduction, Battese et al. (2004) propose a method for comparing the efficiencies of DMUs in different groups in the context of Stochastic Frontier Analysis, which has been extended to the DEA context by O'Donnell et al. (2008). Our methodology, based on order-m indicators, allows us to control for group membership in the context of efficiency measurement via nonparametric techniques and, at the same time, takes into account the severity of the curse of dimensionality. We report the results obtained following our approach in Table 6. On average, municipalities are much closer to their frontiers, as documented by average efficiencies closer to unity. This result holds for both clusters based on output mix variables (99.10 %) and clusters based on environmental variables (98.92 %). In the case of clusters based on size, or level of powers, as one might a priori expect, results are quite similar to those of the unconditioned case (Table 5).
However, Table 6 also reveals that classifying municipalities into different groups does not per se explain away the remaining efficiency differentials. The maximum values for the three clustering criteria are well above unity, suggesting that a non-negligible number of outliers exists. 19 These are what Andersen and Petersen (1993) call super-efficient units. 20 More specific results are reported in Tables 7, 8 and 9. They report basic summary statistics of the technology gap ratio (TGR), the efficiencies obtained from the group frontiers ($CE^{g}$), and those obtained from the metafrontier ($CE^{*}$). In the case of clusters based on output mix, the widest gap between group efficiencies and metafrontier efficiencies corresponds to municipalities in cluster 2, for which average $CE^{g}_{2} = 1.0508$ and average $CE^{*}_{2} = 0.9071$. As a result, the technology gap ratio is the lowest ($TGR_{2} = 0.8776$). In contrast, for cluster 8 the technology gap ratio is the highest ($TGR_{8} = 1.0009$), which should be interpreted inversely, i.e., the efficiencies for municipalities in this group are very similar for the group frontier ($CE^{g} = 1.0006$) and the metafrontier ($CE^{*} = 1.0016$). In general, although some groups show remarkable discrepancies between group efficiencies and metafrontier efficiencies (clusters 1 and 2), for many others the gap is narrower (technology gap ratios well above 0.90). Although this result might constitute evidence against our initial hypothesis, we should bear in mind that clusters 1 and 2 are indeed the largest ones, with 300 and 407 observations, respectively. Therefore, for more than half of the municipalities in our sample, it is more reasonable to compare them only with the municipalities in their output mix group (Table 7). Table 8 reports analogous information to that in Table 7 for clusters constructed using environmental variables. In this case, supporting evidence for our initial hypothesis is stronger as, on average, the technology gap ratios are much lower than in Table 7. The widest gap is found for cluster 8, whose average $TGR_{8} = 0.8718$, whereas the narrowest gap is found for cluster 3 ($TGR_{3} = 0.9421$). However, in this case the clusters are, on average, much closer to their respective group frontiers than in the case of output mix clusters: most of them show average $CE^{g}$ values in the vicinity of 1. Therefore, regardless of the cluster considered, the environmental conditions faced by the different municipalities play a remarkable role, and ignoring them would lead us to mislabel municipalities as inefficient. Note also that the differences found between $CE^{g}$ and $CE^{*}$ do not depend on the number of observations in each cluster.
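The technology gap ratios discussed above can be made concrete with a short arithmetic sketch. It assumes, following the usual metafrontier convention of O'Donnell et al. (2008), that the ratio for a unit is its metafrontier efficiency divided by its group-frontier efficiency; the text does not spell out whether the reported cluster figures average unit-level ratios, so that part is an assumption. The function names and the municipality-level scores are hypothetical; only the two cluster averages are taken from the text.

```python
def technology_gap_ratio(ce_meta, ce_group):
    """Metafrontier efficiency relative to group-frontier efficiency."""
    return ce_meta / ce_group


def mean(values):
    return sum(values) / len(values)


# Cluster averages reported in the text for output mix cluster 2.
ratio_of_reported_means = technology_gap_ratio(0.9071, 1.0508)

# Hypothetical municipality-level scores (not from the paper).
ce_meta = [0.60, 1.20]
ce_group = [1.00, 1.20]
mean_of_ratios = mean([technology_gap_ratio(a, b) for a, b in zip(ce_meta, ce_group)])
ratio_of_means = technology_gap_ratio(mean(ce_meta), mean(ce_group))

print(round(ratio_of_reported_means, 4))
print(round(mean_of_ratios, 4), round(ratio_of_means, 4))
```

The ratio of the two reported averages for cluster 2 is roughly 0.863, slightly below the reported average ratio of 0.8776. One possible explanation, illustrated by the hypothetical scores, is that an average of unit-level ratios generally differs from the ratio of the averages.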
The cluster with the highest (average) discrepancies between the group efficiency and the metafrontier is cluster 8 ($CE^{*}_{8} = 0.8476$), although there are also discrepancies within the group ($CE^{g}_{8} = 0.9655$). Multiple reasons related to the characteristics of the cluster could explain this relatively low level of efficiency. This group is characterized by high levels of economic status (ENV3), industrial activities (ENV4), number of cars (ENV5), population growth (ENV7) and, especially, construction activities (ENV8), as well as a high number of bank branches (ENV10), i.e. municipalities populated by rich constituencies and whose level of economic activity is quite high. In some particular cases (ENV3, ENV4, ENV5 and ENV8) the values are the highest among the different groups. Especially remarkable is the case of population growth (ENV7), with a 16.15 % increase between 1991 and 1998, which might partly be due to both foreign-born and Spanish national migration, 21 attracted by the relatively high industrial and construction activities of the towns in this cluster. Although this is a hypothesis that would require specific testing, some recent papers such as those by Hierro and Maza (2009) or Peri and Requena-Silvente (2010), although dealing with different issues, indicate it is a plausible conjecture. Such a remarkable population increase in a relatively short period of time might have caused some municipalities to increase the amount of services and facilities provided to their inhabitants proportionally, without this being reflected in the list of outputs (civil protection, etc.) which, as stated in the introduction, would a priori lead to a deterioration in their efficiency levels.
The low efficiency levels could also be due to the increased revenues of municipalities resulting from the increased construction activities, which could be managed inefficiently. This view is supported in the literature by several studies such as Spann (1977) or Silkman and Young (1982), who indicate that, "at the local level, higher incomes increase the fiscal capacity of municipalities and may foster featherbedding of politicians and public managers, thereby increasing the scope for inefficient operation" (De Borger and Kerstens 1996). Another cluster whose mean efficiency is relatively low is cluster 6 ($CE^{*}_{6} = 0.8805$). It is also made up of relatively rich municipalities, since the median for the economic status variable (ENV3) is among the highest and the unemployment rate (ENV6) is the lowest (2.4000). The variables most strongly linked to economic activity (industry and construction, i.e. ENV4 and ENV8) are also among the highest, second only to group 8. Of special note is the number of bank branches per inhabitant (ENV10), which presents the highest median (2.1816). The low efficiencies found for the municipalities in this group may have a myriad of causes, among which we might mention not only those pointed out in the previous paragraph but also other rationales offered by De Borger and Kerstens (1996), who point out that the incomes and wealth of citizens affect the incentives of both politicians and taxpayers to monitor expenditures, or the fact that rich constituencies might ask for more services and facilities. These patterns are similar for cluster 4, whose environmental conditions are similar to those found for cluster 6, although less marked (for instance, the economic status, ENV3, is slightly worse). Consequently, in this case the average group efficiency (compared to the metafrontier) is higher ($CE^{*}_{4} = 0.9176$). 
In contrast to clusters 6 and 8, cluster 2 has a higher average efficiency (compared to the metafrontier, $CE^{*}_{2} = 0.9144$) and, in addition, disparities within the cluster are lower. The trends found for population growth (ENV7) are similar to those found for cluster 8, since they are much higher than in the rest of the clusters; for cluster 2, the median value is even higher (25 %). These similarities are also shared by other variables such as the tourist index (ENV2) or the economic status (ENV3). The median number of cars (ENV5) is also similar, which was foreseeable because it is related to economic status. However, this variable is also strongly related to urban sprawl which, as indicated by Hortas-Rico and Solé-Ollé (2010), may have a relevant impact in terms of costs (and efficiency) for the municipality. Compared to the metafrontier, this cluster's mean efficiency is higher than that of cluster 8 ($CE^{*}_{2} = 0.9144 > 0.8476 = CE^{*}_{8}$), as is the share of efficient municipalities in it (46.30 vs. 31.82 %). This superior relative efficiency might also be related to the lower levels found for the variables reflecting economic activity, either industry (ENV4) or construction (ENV8), which require the provision of some services and facilities, despite raising some additional revenues (in terms of higher taxes) for the municipality.
Cluster 3 has the highest average cost efficiency (metafrontier), $CE^{*}_{3} = 0.9326$. Some of its main characteristics in terms of environmental conditions differ substantially from those corresponding to the clusters described above. Unemployment is higher, and economic activities (industry, construction and agriculture) are relatively low. As a consequence, the number of bank branches divided by population (ENV10) is also much lower. The relatively lower level of economic activities might have translated into lower municipal tax revenues. Although the average density of this cluster is the lowest (ENV1 is the inverse of population density) and, as indicated by the literature (De Borger and Kerstens 1996), cost inefficiency rises with lower population density, in this case the lower revenues that require more efficient management might have offset the effect of lower density.
The rest of the clusters based on environmental variables (clusters 1, 5 and 7) also have some characteristics which differentiate the municipalities included in them from those in other clusters. Nonetheless, these characteristics are not as clear as those for the groups described above. However, the cluster analysis performed indicates that the differences among them are indeed significant.
In contrast to the clusters described in the previous paragraphs, clusters 1 and 5 are those with the lowest economic status (ENV3), which is also proxied by the number of cars (ENV5). This variable is also the lowest of all the clusters (0.4057 and 0.4087 for clusters 1 and 5, respectively). The economic activity variables, not only industry (ENV4) and construction (ENV8) but also agriculture (ENV9, measured by the number of agricultural vehicles), are also the lowest. In the case of cluster 1, the tourist index (ENV2) is also the lowest. In these dismal environments one may reasonably expect that the levels of unemployment will be among the highest, especially for cluster 5 (5.7 %), and that population growth (ENV7) will either be negative (cluster 1) or moderate (cluster 5). The levels of mean efficiency in these clusters (compared with the metafrontier) are among the highest (0.9183 and 0.9204 for clusters 1 and 5, respectively), and clearly higher than those found for clusters 6 and 8, whose main environmental conditions are in direct contrast. Therefore, the reasons that might explain the relative underperformance of clusters 6 and 8 apply here, with the opposite sign.
Densities for each specific cluster
Figure 2 shows densities, estimated using kernel smoothing, for the unconditioned and cluster-specific frontiers. The tighter probability mass at unity shows that observations are indeed much closer to those in their groups than to observations in other groups. These differences are also significant, as indicated by the p values in Table 10, which were obtained by applying the Simar and Zelenyuk (2006) test, whose details are provided in the Appendix. The substantial amount of probability mass found at the upper tail of the output mix distribution (Fig. 2a) indicates that controlling for group membership contributes more modestly to explaining efficiency differentials than in the case of clusters formed using environmental variables. Figures 3, 4 and 5 provide analogous information to that in Fig. 2, revealing that the differences exist for most groups, especially when conditioning either for output mix or environmental conditions. This is shown by tighter densities when conditioning either for output mix or environmental group membership (solid lines in Figs. 3, 4).
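As a rough illustration of the kernel smoothing behind these density plots, the following sketch estimates a density from a handful of efficiency scores with a plain Gaussian kernel. It is a simplified stand-in, not the procedure used in the paper: the bandwidth follows a normal-reference rule of thumb chosen here for convenience, no boundary correction is applied, and the scores are hypothetical.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function evaluating a Gaussian kernel density estimate."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not the paper's choice).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    normalizer = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum of Gaussian bumps centred on each observation.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / normalizer

    return density


# Hypothetical efficiency scores, unconditioned and within-cluster.
unconditioned = [0.55, 0.70, 0.85, 0.95, 1.00, 1.20]
within_cluster = [0.95, 0.98, 1.00, 1.00, 1.02, 1.05]

f_unconditioned = gaussian_kde(unconditioned)
f_within = gaussian_kde(within_cluster)
# A tighter distribution places more probability mass near unity.
print(f_within(1.0) > f_unconditioned(1.0))
```

Comparing the two estimated curves at and around unity is the visual counterpart of the statement that conditioning on group membership concentrates probability mass at the frontier.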

Concluding remarks
Over the last few years, a relevant area of research in the field of public economics and regional science and urban economics has been the analysis of the efficiency of lower layers of government such as regional governments or, as in the present study, local governments. The topic is not only relevant per se, but also because of recent events such as the economic and financial downturn. In some euro-area countries, economic activity has stalled, resulting in a sharp reduction of tax revenues and a simultaneous rapid increase in public spending. In these circumstances, the efficient management of public resources at all levels of government becomes even more important. This is particularly pertinent from a viewpoint of European integration and the Stability and Growth Pact (SGP), in order to facilitate and maintain European Economic and Monetary Union.
Interest is even higher in certain European countries such as Spain, where municipalities face tighter budget constraints since the approval of the law on budget stability (''Ley General de Estabilidad Presupuestaria'', 2001), which establishes mechanisms to control public debt and public spending in pursuit of a balanced budget. In addition, the current economic and financial scenario that has emerged since 2008 has led this country into a recession of unprecedented depth and duration. A critical consequence has been the deterioration of the country's fiscal position, which has worsened dramatically since 2007, contrasting with the regular and significant improvement since the recession of 1993-from a surplus of 1.9 % of GDP in 2007, the fiscal balance moved to a deficit of 11.1 % in 2009. Since most fiscal deterioration is structural, stronger measures to cut expenditures or raise taxes are necessary to improve the fiscal balance. These measures are contemplated in the update of the stability program for Spain (''Plan de Acción Inmediata (immediate action plan) 2010'' and ''Plan de Austeridad (austerity plan) 2011-2013''), which combines a tight control of public expenditures with a modest increase of revenues, in line with empirical research which shows that expenditure-driven consolidations tend to be more sustainable than revenue-driven consolidations (Guichard et al. 2007).
There is now a relatively large body of literature devoted to the analysis of municipality efficiency. However, although the number of theoretical and applied contributions is high, some decisive issues still remain unsolved. One refers to the definition of municipality output. This is a thorny question, especially if we take into account that municipalities are facing budget constraints while at the same time the law requires them to provide a minimum amount of services and facilities. A related issue is the remarkably varied environmental conditions-in our case defined as socioeconomic variables-that each municipality faces, which may have a marked effect on their performance.
We deal with these issues using a three-stage procedure. In the first stage, efficiency is assessed unconditionally. In the second stage, municipalities are classified into groups using cluster analysis, taking into account variables based on their output mixes and environmental conditions. We identify groups of municipalities that share some important features in these fields, and the differences among them are statistically significant. In the third stage we assess how results vary when each municipality is compared with those in its group rather than with all municipalities.
The efficiency literature has dealt with the existence of groups and varying environmental conditions following different criteria. Several approaches now exist to handle this and related issues (see, for instance, Banker and Morey 1986a, b; Ray 1991; Daraio and Simar 2007b; De Witte and Kortelainen 2008, among others). We focus on one which has gained popularity over the last few years, namely that proposed by Battese and Rao (2002), Battese et al. (2004), or O'Donnell et al. (2008), which compares the decision making units in different groups under the assumption that technology differs across groups and, therefore, one should estimate both group frontiers and a metafrontier. For our particular interests, one of the advantages of these methods is, precisely, their ability to handle different observations in different groups. We deal with these concepts in the context of the order-m frontiers proposed by Cazals et al. (2002), in order to tackle relevant problems such as the existence of outliers, or the curse of dimensionality (see Simar and Wilson 2008, p. 441).
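As an illustration of the order-m idea, the following is a minimal Monte Carlo sketch of an input-oriented order-m score. The input orientation, the function name, and the default values of `m` and `n_draws` are assumptions made for the example, not the specification used in this study. Instead of benchmarking a unit against the full frontier, the unit is compared with the best of `m` peers drawn at random from those producing at least as much output, and the result is averaged over many draws.

```python
import random


def order_m_input_efficiency(x0, y0, X, Y, m=25, n_draws=200, seed=0):
    """Monte Carlo approximation of an input-oriented order-m score.

    x0, y0 : input and output vectors of the evaluated unit
    X, Y   : lists of input and output vectors of the reference sample
    m      : number of peers drawn in each replication (assumed value)
    """
    rng = random.Random(seed)
    # Peers producing at least as much of every output as the evaluated unit.
    dominating = [
        xi for xi, yi in zip(X, Y)
        if all(yi_k >= y0_k for yi_k, y0_k in zip(yi, y0))
    ]
    if not dominating:
        raise ValueError("no reference unit produces at least y0")
    total = 0.0
    for _ in range(n_draws):
        # Draw m peers with replacement and benchmark against the best one.
        sample = rng.choices(dominating, k=m)
        total += min(
            max(xi_j / x0_j for xi_j, x0_j in zip(xi, x0)) for xi in sample
        )
    return total / n_draws


# Hypothetical toy data: two units with the same output, different inputs.
X = [[2.0], [1.0]]
Y = [[1.0], [1.0]]
eff_high_input = order_m_input_efficiency([2.0], [1.0], X, Y)
eff_low_input = order_m_input_efficiency([1.0], [1.0], X, Y)
```

Because the evaluated unit need not appear in every draw, the score can exceed one, and an extreme observation affects only the draws in which it is selected, which is the source of the robustness to outliers.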
Our results indicate that both hypotheses are relevant, especially that referring to the importance of variations in environmental conditions; i.e., it is essential to control for the environment surrounding each municipality. This is particularly important in contexts, such as Spain, with a very high number of municipalities, which differ in several dimensions that go beyond the definition of inputs and outputs, and that cannot be considered as determinants of efficiency. If we factor in the financial strains some municipalities are now experiencing, especially due to the drop in municipal revenues brought about by the bursting of the housing bubble (Spanish municipalities are responsible for approving construction permits), we realize the importance of having an additional instrument to measure their performance more accurately.
Although the literature on efficiency has proposed some ways to deal with environmental conditions, our methods are more robust to outliers, and they alleviate the curse of dimensionality. However, we must admit that the groups constructed, using both output mix and environmental variables, are somewhat subjective in terms of number of groups and composition because of the cluster analysis technique employed in their formation. Ideally, the research agenda should address how to define groups more objectively when natural boundaries between them do not exist.

and Lahdelma 1995). We consider some methods which provide us with more accurate information (see, for instance, Li 1996; Li et al. 2009). If we based our interpretations on a number of summary statistics only, we would miss a considerable amount of relevant information. Most of these methods are based on kernel smoothing to nonparametrically estimate the density functions corresponding to both $\alpha_m^{s}$ and $\alpha_m$; where $b = 1, \ldots, B$, and $B$ is the number of bootstrap replicates.
The p values are then adapted to our context, where the true efficiency scores are replaced by order-m estimates. Simar and Zelenyuk (2006) consider somewhat ad hoc methods to solve the discontinuity problem generated by the spurious probability mass at unity (recall that, by construction, at least one observation will always be on the frontier, and in most circumstances the number will be quite large). We adopt one of their proposed methods (Algorithm II; see Simar and Zelenyuk 2006), based on computing and bootstrapping the Li (1996) statistic using the sample of order-m estimates where those equal to unity are "smoothed" away from the boundary. We add a small noise, within, say, 5 % of the empirical distribution of $\hat{\alpha}_m^{s}$, disregarding those equal to unity, but with an order of magnitude smaller than the noise of the estimation. The smoothing procedure is performed via a noise term $\varepsilon_i \sim \mathrm{Uniform}\left(0, \min\{S^{-2/(I+1+1)}, a\}\right)$ (for the DEA estimator), where $a$ is the $\alpha$-quantile of the empirical distribution of $\hat{\alpha}_m^{s}$ ignoring those equal to unity.
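The smoothing step can be sketched as follows, under several assumptions. The bound on the noise follows the expression as printed above. The direction in which unit scores are moved is not recoverable from the text, so it is exposed as a hypothetical `sign` parameter. The nearest-rank quantile rule, the tolerance used to detect scores equal to unity, and all function and argument names are illustrative choices.

```python
import math
import random


def noise_bound(scores, S, I, alpha=0.05, tol=1e-12):
    """Upper bound of the uniform noise: min{S^(-2/(I+1+1)), a}.

    a is taken as the alpha-quantile (nearest-rank rule, an assumption)
    of the scores that are not equal to unity.
    """
    non_unit = sorted(s for s in scores if abs(s - 1.0) > tol)
    if not non_unit:
        raise ValueError("all scores equal unity")
    rank = max(0, math.ceil(alpha * len(non_unit)) - 1)
    a = non_unit[rank]
    return min(S ** (-2.0 / (I + 1 + 1)), a)


def smooth_unit_scores(scores, bound, sign=-1, seed=0, tol=1e-12):
    """Move scores equal to unity away from the boundary.

    sign=-1 pushes them below one, sign=+1 above one; which direction
    applies depends on the orientation of the scores (an assumption).
    """
    rng = random.Random(seed)
    return [
        s + sign * rng.uniform(0.0, bound) if abs(s - 1.0) <= tol else s
        for s in scores
    ]


# Hypothetical scores with a probability mass at unity.
scores = [1.0, 0.8, 1.0, 0.6, 0.9]
bound = noise_bound(scores, S=len(scores), I=3)
smoothed = smooth_unit_scores(scores, bound)
```

Only the observations on the frontier are perturbed; the remaining scores are left unchanged, so the smoothed sample no longer has a spike at unity when its density is estimated by kernel methods.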