The effect of income on democracy revisited: a flexible distributional approach

We reexamine the effect of economic development on the level of democracy based on the data sets of Acemoglu et al. (Am Econ Rev 98(3):808–842, 2008) with a novel regression specification utilizing a zero-one-inflated beta distribution for the response variable democracy. Contrary to the results of Acemoglu et al. (2008), some support for a positive association between income and democracy is found when assuming that the variance of explanatory variables is heterogeneous. In particular, our results show that rising income is associated with a higher probability of becoming fully democratic, but income is not generally associated with the mean level of democracy.


Introduction
The relationship between income and democracy has been widely investigated since the beginning of the twentieth century. While Aristotle already argued more than twenty centuries ago that there is a positive association between both factors, Lipset's law formalized it by stating that higher economic growth leads to a higher level of democracy (Lipset 1959). This law is (likely) the foundation of modernization theory, which asserts that economic development is the major factor influencing the political environment. A number of authors, including Barro (1999), Dahl (1971), Huntington (1993), and Stephens et al. (1992), additionally contributed findings showing that higher incomes are associated with higher levels of democracy.
Nevertheless, recent empirical findings tell a less clear story. Some support for a positive association between income and democracy is indeed found by Londregan and Poole (1996) when using panel data to estimate a causal relationship as stated by Lipset (1959), but only after considering leadership type and political context as control factors. Murtin and Wacziarg (2014) observe that the transition to democracy is linked to a fractional shift of illiterate to primary school graduates and, to a lesser extent, to income per capita. Moral-Benito and Bartolucci (2011) show instead a nonlinear effect of income on democracy. Fayad et al. (2012) specifically distinguish between income from natural resources and other income. By applying heterogeneous panel techniques, the authors find that income is significant in explaining democracy only when it comes from non-resource sources. Meanwhile, evidence of no causal relation has also been found by other authors. Przeworski et al. (2000) do not find any significant relationship between income per capita and the transition to democracy when using a Markov transition model. This lack of evidence, which challenges Lipset's law, is supported by Acemoglu et al. (2008), who use a panel data approach. Their study concludes that a causal effect from income to democracy cannot be found. However, a similar approach from Cervellati et al. (2014) reveals that the effect of income on democracy exists and that it is heterogeneous for former colonies and non-colonies.
One of the reasons why findings are inconclusive could be that the assumptions underlying the theoretical developments are inadequate. In this paper, we assume that causality goes from economic performance to democracy. In this setting, an important issue is the choice of distributional assumption to approximate democracy when modeling its mean in a regression specification. In particular, most quantitative research assumes that the democracy variable is an unbounded continuous variable that has a homogenous variance which fits with the normal distribution implicitly assumed in least squares estimation. Nevertheless, democracy measurements are in general finite with the upper limit stated as "democratic" and the lower limit as "autocratic." Hence, the main novelty of this paper is to focus on the distributional assumption of democracy, which has not yet been investigated in the related literature.
We focus on the framework of Acemoglu et al. (2008) and contribute to the understanding of this topic by evaluating the distributional assumption of democracy and its influence on the estimates. The main results indicate that when democracy is modeled with a zero-one-inflated beta regression (Ferrari and Cribari-Neto 2004), higher incomes in the past increase the probability of a country being democratic. This finding is robust to changes in the data sources in most cases.
The paper is organized as follows. In Sect. 2, we briefly discuss why the research in this field generally comes to different conclusions and how this could be related to our primary concern, namely distributional assumptions that are questionable. The zero-one-inflated beta distribution and regression are outlined in Sect. 3. We present our methodology in Sect. 4. The main results are presented in Sect. 5. Concluding remarks are given in Sect. 6.

Distributional specification
The recent empirical literature on the income-democracy nexus has dealt with causality identification and omitted variable bias by using lags of the explanatory variables instead of levels on the right-hand side. Additionally, country fixed effects are used to control for time-invariant unobserved heterogeneity [see, e.g., Acemoglu et al. (2008, 2014)]. However, there are other issues, namely other sources of endogeneity, incomplete data, measurement error, and the distributional assumption for the variable democracy, all of which have either not been fully addressed or have even been ignored. In the related literature, some attention has been given to endogeneity, incomplete data, and measurement error (Acemoglu et al. 2008; Moral-Benito and Bartolucci 2011; Treier and Jackman 2008). In contrast, in this paper we focus on the last of these issues and explore the zero-one-inflated beta distribution as an alternative distributional assumption for democracy. A parametric regression model relies on a specific distribution to derive the results. Assuming the normal distribution for the response variable given the explanatory variables is a handy approximation to fulfill the parametric assumption in the class of linear models. However, violations of this assumption make any results questionable. Moreover, a bounded variable is by definition not normally distributed, particularly when most observations are close to the boundaries. If this is the case, the variable of interest should not be used as a dependent variable in an ordinary least squares regression, which (at least implicitly) assumes normality for inference.
For illustration purposes, Table 1 reports summary statistics of the variables representing the level of democracy from the Freedom House Political Rights Index and the Polity IV data set as proxies for the level of democracy in a particular country. 1 The arithmetic mean is a natural characterization of the central tendency of a data set, in particular for normally distributed variables. Having the normality assumption in mind, the usual interpretation of a mean around 0.5 is that most of the countries are half democratic. The next step is to plot a histogram and a density estimate to examine whether these approximate something close to a bell shape, which would indicate a normal distribution for the democracy variables. Figure 1 illustrates that neither Freedom House nor Polity IV shows such a bell-shaped curve. Instead, their distributions are closer to a U-shaped curve with two peaks. As a consequence, the unimodal interpretation no longer holds and the arithmetic mean does not represent the true central tendency, because it is a product of a compromise between two modes that center around zero and one.
1 Freedom House and Polity IV democracy variables are from Acemoglu et al. (2008). Among the various proxies of democracy that are available, we stick to the perspective of Acemoglu et al. (2008) by using their standardized indices from Freedom House and Polity IV for comparison purposes. The Freedom House index is based on a rating system ranging from 1 to 7, where smaller numbers represent a higher Freedom Rating. Polity IV is a multidimensional measure of political environment that is compressed into a scalar ranging from −10 to 10. Positive numbers are in favor of democracy while negative numbers symbolize autocracy. Standardization transforms both scales into the identical range between zero and one.
Table 1 note: The trimmed mean is an arithmetic mean that discards observations at both tails of the distribution. This table discards the lowest 5% and the highest 5% of values.
Therefore, it is the shape of the distributions and not the means that tells us something well known, which is that most of the countries are either highly democratic or highly autocratic. A few data points are in between, and some of them could be countries in transition to democracy or to authoritarian regimes. If the conclusion drawn from the arithmetic mean is misleading under a misspecified distribution, it will also be potentially misleading for the parameters of a regression model based on the misspecified distributional assumption.
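As a minimal sketch of this point, the following Python snippet draws a hypothetical U-shaped sample on [0, 1] and compares the share of observations near the arithmetic mean with the share near the boundaries. The Beta(0.3, 0.3) draw is chosen only for illustration and is not the Freedom House or Polity IV data; the helper `trimmed_mean` and all variable names are assumptions introduced here.

```python
import random
import statistics

def trimmed_mean(values, share=0.05):
    """Arithmetic mean after discarding `share` of the sample at each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * share)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)

random.seed(0)
# Hypothetical U-shaped sample: Beta(0.3, 0.3) piles mass near 0 and near 1.
sample = [random.betavariate(0.3, 0.3) for _ in range(10_000)]

mean = statistics.fmean(sample)
# Share of observations within 0.1 of the mean versus within 0.1 of a boundary.
near_mean = sum(abs(v - mean) < 0.1 for v in sample) / len(sample)
near_bounds = sum(v < 0.1 or v > 0.9 for v in sample) / len(sample)

print(f"mean={mean:.3f} trimmed={trimmed_mean(sample):.3f}")
print(f"share near mean={near_mean:.3f} share near bounds={near_bounds:.3f}")
```

In a sample of this shape the mean lands near the middle of the interval although far more observations sit near the boundaries than near the mean, which is the sense in which the mean is a compromise between two modes.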
An additional issue is that the values of democracy are bounded. Without considering this aspect when modeling the distribution of the data, the fitted values could lie outside the interval [0, 1]. In this case, we should consider nonlinear models that take care of the nonlinearity and the bounded characteristics of the response variable.
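The out-of-range problem can be illustrated with a small sketch. The data below are hypothetical (a single covariate and a response clustered at zero and one), and `ols_fit` is a helper written for this example rather than anything used in the paper.

```python
def ols_fit(x, y):
    """Slope and intercept of a simple least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical covariate and bounded response with most mass at 0 and 1.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0.0, 0.0, 0.0, 0.0, 0.1, 0.9, 1.0, 1.0, 1.0, 1.0]

slope, intercept = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]

# Fitted values that violate the bounds of the response.
out_of_range = [round(f, 3) for f in fitted if f < 0.0 or f > 1.0]
print(out_of_range)
```

The linear fit produces predictions below zero at the low end of the covariate and above one at the high end, which a model with a bounded response distribution avoids by construction.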
It is important to take note of another prominent feature shown in Fig. 2. In particular, the plot of the distributions indicates that the world is polarized into two clear political regimes. We visually examined whether the lower mode comes from non-OECD countries and the higher one from OECD countries by plotting the OECD and non-OECD subsets according to Freedom House and Polity IV in Fig. 2. The visual examination of Fig. 2 suggests that the OECD group approximates the upper mode of the distribution, while the non-OECD subsample represents the lower mode. Moreover, the OECD group shows more variability. We anticipate that the high variation within the OECD subsample comes from the earlier period of the sample, seeing how nowadays all OECD countries are democratic. We will incorporate these features into the model to assess the statistical differences between both groups in the following parts.

Zero-one-inflated beta distribution and regression
A number of issues related to the suitable modeling strategy for bounded response variables have been discussed by Papke and Wooldridge (1996) under the heading of fractional response models. Possible extensions have also been recently summarized by Ramalho et al. (2011). The authors find that it is not reasonable to assume that the effect of explanatory variables is constant throughout the entire range of the response variable when the latter is bounded. They also argue that a beta distribution is not suitable for modeling bounded responses if values on the boundaries are observed with nonzero probability. However, while allowing for values on the boundaries, fractional response models only restrict the expectation of the response to the interval (0, 1) and not the complete distribution. Rather than using a fractional response specification, we therefore inflate the beta distribution with point masses in zero and one to account for the nonzero probability of observing these boundary values.
The mixed discrete-continuous density of a zero-one-inflated beta random variable is given by

$$
f(y) =
\begin{cases}
p_0, & y = 0, \\
(1 - p_0 - p_1)\,\dfrac{1}{B(a, b)}\, y^{a-1} (1 - y)^{b-1}, & 0 < y < 1, \\
p_1, & y = 1,
\end{cases}
\tag{1}
$$

where $p_0$ and $p_1$ are the probabilities of observing zero and one, respectively, and $B(a, b)$ is the beta function with parameters $a$ and $b$ given by

$$
B(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a + b)}.
$$

The zero-one-inflated beta regression, where the zero-one-inflated beta distribution is considered as the conditional distribution of the response, was introduced by Ospina and Ferrari (2010). For the sake of interpretability, they propose a parameterization based on the expectation $\mu = \frac{a}{a+b}$ and the scale parameter $\sigma = \frac{1}{a+b+1}$ with $\mu \in (0, 1)$ and $\sigma \in (0, 1)$. They also replace the probabilities for zero and one by the parameters $\nu = p_0 / p_2$ and $\tau = p_1 / p_2$, where $p_2 = 1 - p_0 - p_1$ is the probability of observing a response from the continuous part of the zero-one-inflated beta distribution. This parameterization ensures that the probabilities for zero, one, and the continuous part add up to one.
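To make the parameterization concrete, the following sketch maps (μ, σ, ν, τ) back to the shape parameters a and b and the point masses p0 and p1, and evaluates the mixed density. It relies only on the relations stated in the text (μ = a/(a+b), σ = 1/(a+b+1), ν = p0/p2, τ = p1/p2); the function names are assumptions introduced here, and this is not the gamlss implementation.

```python
import math

def zoib_natural_params(mu, sigma, nu, tau):
    """Map (mu, sigma, nu, tau) to (a, b, p0, p1) using the relations in the text."""
    total = 1.0 / sigma - 1.0        # a + b, from sigma = 1 / (a + b + 1)
    a = mu * total                   # from mu = a / (a + b)
    b = (1.0 - mu) * total
    p2 = 1.0 / (1.0 + nu + tau)      # probability of the continuous part
    return a, b, nu * p2, tau * p2

def zoib_density(y, mu, sigma, nu, tau):
    """Point mass at 0 and 1, scaled beta density on the open interval (0, 1)."""
    a, b, p0, p1 = zoib_natural_params(mu, sigma, nu, tau)
    if y == 0:
        return p0
    if y == 1:
        return p1
    if not 0.0 < y < 1.0:
        return 0.0
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_kernel = (a - 1.0) * math.log(y) + (b - 1.0) * math.log1p(-y)
    return (1.0 - p0 - p1) * math.exp(log_kernel - log_beta)

# Illustrative parameter values, not estimates from the paper.
a, b, p0, p1 = zoib_natural_params(mu=0.5, sigma=0.25, nu=1.0, tau=1.0)
print(a, b, p0, p1, zoib_density(0.5, 0.5, 0.25, 1.0, 1.0))
```

Because p2 = 1/(1 + ν + τ), any positive ν and τ yield probabilities for zero, one, and the continuous part that sum to one, which is the property the parameterization is meant to guarantee.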
Furthermore, we let $y_{it}$ be independent random variables where each $y_{it}$ follows the density in (1) with mean $\mu_{it}$, unknown scale parameter $\sigma_{it}$, and zero/one inflation parameters $\nu_{it}$ and $\tau_{it}$, while $t = 1, \ldots, T$ and $i = 1, \ldots, N$ index the time dimension and the individuals, respectively. To relate the parameters of the zero-one-inflated beta distribution to regression predictors, we apply suitable link functions, i.e.,

$$
\operatorname{logit}(\mu_{it}) = \eta^{\mu}_{it}, \quad
\operatorname{logit}(\sigma_{it}) = \eta^{\sigma}_{it}, \quad
\log(\nu_{it}) = \eta^{\nu}_{it}, \quad
\log(\tau_{it}) = \eta^{\tau}_{it},
$$

where $\eta^{\mu}_{it}$, $\eta^{\sigma}_{it}$, $\eta^{\nu}_{it}$, and $\eta^{\tau}_{it}$ are regression predictors constructed from a set of covariates. The logit transformation applied to the mean and scale parameter enables a log odds ratio interpretation for two observations that only differ by one unit in the variable of interest. In contrast, the natural log transformation for the zero/one inflation parameters is directly interpretable since it is approximately proportional to differences. Note that the model allows us to account for heteroscedasticity due to the regression effects on $\sigma_{it}$ and $\mu_{it}$, since the variance of $y_{it}$ is also a function of the mean $\mu_{it}$ and proportional to the scale parameter $\sigma_{it} = 1/(1 + a_{it} + b_{it})$. Even though the approach by Papke and Wooldridge (1996) also does not exclude the boundary values, it is more suitable when the truly fractional component of the response is dominant. Conversely, the inflated beta regression better matches our data sets because we observe a large fraction of zeros and ones. Furthermore, the fully parametric approach used by assuming a beta distribution for the fractional response variable leads to more efficient ML estimators (Ospina and Ferrari 2010).
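The link functions can be inverted to move from the four predictors back to quantities on the probability scale. The sketch below does this under the parameterization described in the text (logit links for μ and σ, log links for ν and τ, and σ = 1/(a+b+1), so that the variance of the beta part equals μ(1−μ)σ). The function and key names are assumptions introduced for illustration.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a real predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def zoib_from_predictors(eta_mu, eta_sigma, eta_nu, eta_tau):
    """Translate the four predictors into interpretable distribution summaries."""
    mu = inv_logit(eta_mu)
    sigma = inv_logit(eta_sigma)
    nu = math.exp(eta_nu)            # ratio p0 / p2
    tau = math.exp(eta_tau)          # ratio p1 / p2
    p_between = 1.0 / (1.0 + nu + tau)
    p_zero = nu * p_between
    p_one = tau * p_between
    return {
        "mu": mu,
        "sigma": sigma,
        "p_zero": p_zero,
        "p_one": p_one,
        "p_between": p_between,
        # Variance of the continuous (beta) part: mu * (1 - mu) / (a + b + 1).
        "beta_variance": mu * (1.0 - mu) * sigma,
        # Mean of the full mixed distribution: 0 * p0 + 1 * p1 + mu * p2.
        "overall_mean": p_one + p_between * mu,
    }

example = zoib_from_predictors(0.0, 0.0, 0.0, 0.0)
print(example)
```

The sketch also makes visible why a covariate can matter for the probability of the value one without moving μ: the predictor for τ enters the point masses only, while μ describes the continuous part alone.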

Model specification
Our study estimates a model similar to that of Acemoglu et al. (2008). 4  values. Hence, we have the combination of two democracy variables and two income per capita variables. We add a dummy variable for OECD membership, which acts as an additional regressor in each model. The OECD dummy is used as a parsimonious way to control for other factors, besides income, that could affect democracy and are also associated with economic development. Nevertheless, since being an OECD member is surely associated with income levels, the model is also estimated without the OECD dummy in order to obtain the full impact of income on democracy and not only the partial impact. Moreover, another version of the model is estimated with the OECD dummy lagged several periods to avoid endogeneity issues.
We implement a linear model structure with fixed effects under the assumption that the response follows the zero-one-inflated beta distribution, where the basic predictor structure is given by

$$
\eta_{it} = \beta_1 x_{1,it-s} + \beta_2 x_{2,it} + \vartheta_i + \delta_t, \tag{2}
$$

where $x_{1,it-s}$ is log income per capita of country $i$ at time $t - s$, $x_{2,it}$ is the OECD dummy of country $i$ at time $t$, $\beta_1$ and $\beta_2$ are the corresponding regression coefficients, $\vartheta_i$ is a country-specific fixed effect, $\delta_t$ is a time-specific fixed effect, and the predictor is linked to the parameters of the response distribution via the link functions discussed above. For the lagged part in the predictor, we used s = 1 for yearly data, s = 5 for 5-year, s = 10 for 10-year, and s = 20 for 20-year data, respectively. We use 5-year averages of the data ($t = \bar{x}_5$) and their first lag in Eq. (2) to mitigate endogeneity. We also employ the lagged values of explanatory variables for the same purpose. To fit zero-one-inflated beta regression models, we used the R package gamlss (R Core Team 2016; Rigby and Stasinopoulos 2005; Stasinopoulos et al. 2008).
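The lag structure can be illustrated with a small data-preparation sketch. It is not the authors' code (the models in the paper are fitted in R with gamlss), and the field names `country`, `year`, `democracy`, and `log_income` as well as the toy values are hypothetical.

```python
def lagged_panel(rows, lag):
    """Attach log income from `lag` years earlier to each country-year row.

    Rows without a matching earlier observation are dropped, as they
    cannot enter a model with a lagged regressor.
    """
    income = {(r["country"], r["year"]): r["log_income"] for r in rows}
    out = []
    for r in rows:
        key = (r["country"], r["year"] - lag)
        if key in income:
            out.append({**r, "log_income_lag": income[key]})
    return out

# Toy 5-year panel for two made-up countries.
rows = [
    {"country": "A", "year": 1990, "democracy": 0.0, "log_income": 7.0},
    {"country": "A", "year": 1995, "democracy": 0.5, "log_income": 7.2},
    {"country": "A", "year": 2000, "democracy": 1.0, "log_income": 7.5},
    {"country": "B", "year": 1995, "democracy": 1.0, "log_income": 9.1},
    {"country": "B", "year": 2000, "democracy": 1.0, "log_income": 9.3},
]

panel_5y = lagged_panel(rows, lag=5)
for r in panel_5y:
    print(r["country"], r["year"], r["democracy"], r["log_income_lag"])
```

With s = 5 the first observation of each country is lost, so longer lags such as s = 20 shrink the usable sample considerably, which is worth keeping in mind when comparing models across interval lengths.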
Because the zero-one-inflated beta regression allows us to estimate not only the mean as a function of the explanatory variables but also the scale parameter, which is proportional to the variance, and the two probabilities for zero and one inflation, we can infer the causes of potential nonconstant variance, as well as other distributional features of democracy at time t. Despite having a relatively suitable distributional assumption and some treatment for other statistical challenges, we do not claim that our estimation has a rigorous causal interpretation. Instead, our intention is to provide a benchmark for future related research.

Key findings
The main results of our model for different time intervals are presented in Table 2. The first column shows the model estimated with 5-year data (model M1), the second and third with 10-year (M2) and 20-year (M3) interval data, and the last column is for 5-year average data (M4). In each model, estimated coefficients are presented for the equation for μ, which represents the mean of the beta distribution, the equation for σ, which relates to the scale parameter of the beta distribution, and the equations for ν and τ, which relate to the probabilities for zero and one inflation, respectively. 7 The estimated coefficients for income per capita in the equation for μ are only significant in model M2, in which a 10-year interval and a 10-year lag structure are used. In the equation for σ, income is significant in models M1 and M4 and for yearly data, suggesting that for annual, 5-year, and 5-year average data, income influences the variance of democracy. The negative and significant income coefficient found for the 5-year, 5-year average, and 10-year lag in the equation for ν indicates that a higher income per capita level leads to a lower probability of a country having a value of zero (autocracy) than a value between zero and one in the next 5 and 10 years. Further evidence comes from the equation for τ. The positive and significant coefficient of income (for 5-, 10-, 20-year, and 5-year average lags) suggests that a higher income induces a higher probability of a country having a value of one (democracy outcome) than a value between zero and one. 8 The OECD dummy is also significant in the equations for μ and σ in some cases. The positive sign in the equation for μ reflects the higher level of democracy on average for OECD members relative to non-OECD countries. Meanwhile, the positive sign in the equation for σ indicates that the OECD group has a higher variance. This confirms the findings in Fig. 2. The diagnostic plots for 10-year intervals are provided in Fig. 3.
Notes to Table 2: The coefficients are in logit form for the equations for μ and σ, and in log form for the equations for ν and τ. The equation for σ only shows the direction of relationship and its significance level. Significance levels are 0.1 (*), 0.05 (**) and 0.01 (***). Standard errors are in parentheses with "qr" type, where qr denotes an assumption that there is no correlation among the parameters. Models M1-M3 are estimated using 5-, 10-, and 20-year intervals, respectively.
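Because ν and τ are modeled on the log scale, a coefficient in those equations acts multiplicatively on the ratios p0/p2 and p1/p2. The sketch below works through this with hypothetical numbers: the baseline probabilities and the coefficient 0.5 are made up for illustration and are not estimates from Table 2, and `shift_tau` is a helper introduced here.

```python
import math

def shift_tau(p_zero, p_one, beta_tau, delta_x):
    """Update the three probabilities after the tau predictor rises by beta_tau * delta_x.

    nu = p0 / p2 is held fixed; tau = p1 / p2 is multiplied by exp(beta_tau * delta_x).
    """
    p_between = 1.0 - p_zero - p_one
    nu = p_zero / p_between
    tau = p_one / p_between * math.exp(beta_tau * delta_x)
    new_between = 1.0 / (1.0 + nu + tau)
    return nu * new_between, tau * new_between, new_between

# Hypothetical baseline: 30% at zero, 20% at one, 50% strictly in between.
new_zero, new_one, new_between = shift_tau(0.3, 0.2, beta_tau=0.5, delta_x=1.0)
print(round(new_zero, 4), round(new_one, 4), round(new_between, 4))
```

Holding ν fixed, a positive coefficient raises the probability of the value one relative to the continuous part; the probability of zero changes as well because the three probabilities must still sum to one.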
As a comparison, we provide results for the Polity IV data in Table 3. 9 Table 3 suggests that our findings are not robust for the equations for μ and ν, yet they are more robust for the equations for σ and τ. Past income explains the nonconstant variance of democracy through the equation for σ, and the probability of being democratic is consistently significant through the equation for τ. Further, the latter evidence from τ also indicates that in most cases, rising income is significantly associated with the probability of a country achieving complete democratization, whereas decreasing income is only in a few cases associated with the probability of a country becoming fully autocratic. This fact indicates the existence of an asymmetry in the way countries move along the "degree of democracy" line.
7 The result for yearly data is available on request.
8 Yearly data shows mixed signs.
9 See Tables 5 and 6 in the "Appendix" for the results obtained using other data set combinations.

[Figure: Normal Q-Q plot; axes: Theoretical Quantiles, Sample Quantiles]
The difference between the OECD and non-OECD groups is less apparent here. The dummy for OECD countries is significant and positive in the equation for μ in only two cases. The OECD dummy is also positive and statistically significant in the equation for τ in one case.
Results for the overall sample from the two alternative data sets generally indicate a similar effect of lagged income for the equations for σ and τ. 10 Additionally, the data sets were to a large extent robust for the OECD dummy in the equations for μ and σ. Nevertheless, a detailed examination suggests that there is a sort of selection bias. The differences in results mainly depend on which income variable is used in the model. On the one hand, when using income data from the Penn World Table, a positive association between income and democracy is found more often than when using income data from Maddison. On the other hand, Maddison GDP favors significance for the OECD dummy. Hence, we conclude that even though the democracy indices are subject to measurement error, in our model specification they are more robust than the income per capita variables. 11 Our further estimation for the OECD versus non-OECD subsamples (see Table 4) shows that the positive association between income and democracy is only statistically significant in the OECD countries when using 10-year intervals (mean equation), whereas the probability is significant for both subsamples (one-inflation equation). However, there is no evidence of a positive association between OECD membership history and democracy. 12

Discussion of the results
In this subsection, we provide specific examples to support inference and the interpretation of the sizes of the coefficients reported in the main table of results (Table 2).
Firstly, in order to infer to what extent a higher level of income increases the level of democracy, we make use of a predictive analysis. Two countries with an identical level of democracy but different levels of income are selected. Those are India, which represents lower middle-income countries, and Brazil, which represents upper middle-income countries. In 2000, both appeared to be at the upper level of democracy, but never committed to being completely democratic. Figure 4 shows the predicted probabilities of being fully democratic, given the top five deciles of income for the whole sample. It suggests that, provided with artificially higher levels of lagged income, Brazil is more likely to become fully democratic than India. The results using 5-year interval data, 5-year average data, 10-year interval data, and 20-year interval data are in favor of full democracy in Brazil when income increases drastically, i.e., to at least the 80th percentile. Meanwhile, India's full democratization is only supported by two data sets. 13 In fact, the original levels of income (see Table 8 in "Appendix") could not boost the likelihood of becoming fully democratic. The probabilities for Brazil never exceed 0.5, while the chance for India is virtually zero. The outcome that Brazil has a higher probability of becoming democratic than India could come as a surprise for some readers. However, the recent social demonstrations in Brazil show that its democracy is robust and vibrant, whereas in India, the levels of corruption are still considerable and it is a younger democracy than Brazil, which returned to democracy in 1985 after 21 years of military dictatorship. In India, it was only in 1991 that a number of economic reforms transformed its economy from a restrictive state-driven model to a more open system. The income differences between Brazil and India, and the fact that the predicted probabilities for full democracy increase when using the high level of artificial income, indicate that income is an important factor in determining the probability of reaching a fully democratic regime.
11 We rerun these regressions only for the sample of countries where all data are available, and the results still differ depending upon which data source(s) is used. Results are available on request. Therefore, we conclude that it is more likely that differences in data are driving results as opposed to differences in the countries in the sample.
12 See Table 7 in "Appendix."

Table 4
Freedom House and Penn World Table GDP per
Our findings also show that income is not generally associated with the mean level of democracy. We suggest that this lack of significance could arise because a transition period is occurring between the two extremes and income is not so strongly associated with this transition. To check this intuition, we report the number of countries that are located in between the two extremes. Those are countries that did not achieve full democracy in the sample period but remained in a state of "partial democracy," which could be closer to full democracy in the latest years of the sample. By examining the data, we observe that the number of countries that were never fully democratic or fully autocratic is 53 (122) when the democracy proxy comes from Freedom House (Polity IV) (see Table 11 in "Appendix" for the list of countries). The size suggests that a moderate fraction of countries according to the first source (a large fraction according to the second) have always been partially democratic during the period analyzed. The pattern of the democracy path over time is provided in Fig. 5. The results using both sources (Freedom House on the left side of Fig. 5 and Polity IV on the right side) consistently support a similar story: for instance, there is no sign of mean reversion for countries that were partially democratic. Instead, after a sharp decrease in the early 1970s, there is a gradual upward trend from a lower baseline to a more democratic regime on average. In particular, the transition seems to be slower for Freedom House than for Polity IV, especially in the period from 1990 to 2000. Our estimations are in line with this visualization, indicating that there is a good opportunity for countries in transition to become fully democratic because they do not appear to get persistently trapped in the middle level of the democracy score.

Concluding remarks
In this paper, we claim that the usual distributional assumption for democracy as a response variable could be inappropriate. In particular, the use of an unbounded distribution, such as a normal distribution, for a bounded variable that has dominant observations around the boundaries of its domain could cause problems. Furthermore, the conclusions derived from an analysis that relies on the wrong underlying assumptions could be misleading.
Although we find almost no support for income causing democracy when modeling the mean of democracy, we find that heteroscedasticity is an issue and that higher lagged income increases the probability of a country being democratic. As the baseline evidence shows, we only find partial support for a positive correlation between income and democracy when modeling the mean of democracy with data every 10 years and using income from the Penn World Table and democracy from Freedom House. We acknowledge that we do not address endogeneity issues in the way it is usually done in the literature (using instrumental variables approaches). Hence, we should not strictly talk about causality, but about correlation.
We also find systematic differences between OECD and non-OECD samples in the mean, variance, and probabilities of zero and one inflation. OECD countries are on average more democratic, and evidence that higher income is positively associated with higher democracy is only present for this group. This finding supports the literature arguing that the relationship between income and democracy is heterogeneous. Moreover, we find that when using Maddison GDP, being an OECD member increases the probability of being completely democratic, while this is not the case when using Penn World Table data for income. The differences encountered when using Penn World Table and Maddison data indicate that economic measurement seems to matter and can influence the inferences that we draw. A caveat of our approach is that we are unable to address the potential existence of a selection bias, since countries' accession to the OECD is partly based on their income per capita. However, as pointed out in the discussion, the results are robust to the exclusion of the OECD dummy in the model.
Table notes: The coefficients are in logit form for the equations for μ and σ, and in log form for the equations for ν and τ. The equation for σ only shows the direction of relationship and its significance level. Significance levels are 0.1 (*), 0.05 (**), and 0.01 (***). Standard errors are in parentheses with "qr" type, where qr denotes an assumption that there is no correlation among the parameters. Country fixed effects and year fixed effects are used only when the algorithms converge. Models with odd numbers use the Freedom House variable, and models with even numbers use the Polity IV variable. The income variable for all models is from the Penn World Table.

Table 11
List of countries in Fig. 5 (columns: No., Country)