Explaining German outward FDI in the EU: a reassessment using Bayesian model averaging and GLM estimators

The last decades have seen an increasing interest in FDI and the process of production fragmentation. This has been particularly important for Germany as the core of the European Union (EU) production hub. This paper attempts to provide a deeper understanding of the drivers of German outward FDI in the EU for the period 1996–2012 by tackling the two main challenges faced in the modeling of FDI, namely the variable selection problem and the choice of the estimation method. For that purpose, we first extend the previous Bayesian model averaging (BMA) analysis developed by Camarero et al. (Econ Model 83:326–345, 2019) by including country-pair-fixed effects to select the appropriate set of variables. Second, we compare several estimation methods in their multiplicative form, namely four versions of the generalized linear model (GLM). The results of the empirical application indicate that Gamma pseudo-maximum likelihood is the best performing estimator. Furthermore, our results point to horizontal-ness as the primary strategy for German investment in core EU countries, while vertical-ness seems to prevail in peripheral EU countries.


Introduction and motivation
Over the past two decades, the global economy has witnessed an upsurge in Foreign Direct Investment (FDI) as well as in the process of production fragmentation across borders, referred to as Global Value Chains (GVCs). Three interconnected production hubs have been established around the world: North America (centered in the USA), Asia (with China playing a dominant role) and Europe (with Germany as the core). Overall, the participation of Europe in GVCs is significantly higher than that of North America and Asia, and it has steadily increased with the creation of the Single Market and the launching of the euro (Huidrom et al. 2019). Indeed, euro area countries are more integrated into regional than into global supply chains, and thereby, value chain participation at a regional level has major economic implications for the euro area economy (Gunnella et al. 2019). As countries have increased their participation in GVCs, more and more firms have decided to relocate some of their production through FDI (Amador and Cabral 2016). This has been particularly important for Germany, the core hub country in Europe, which has seen a sharp rise in outward FDI. Despite the global pattern of German FDI stocks, the European Union (EU) has been the largest recipient, with Western and Eastern European countries being natural trading partners. According to UNCTAD's Bilateral FDI Statistics, developed countries accounted for 87% of its overall FDI stocks at the end of 2012, of which more than half were held by the EU. Within the EU, the distribution of German FDI presents a core-periphery pattern, with the bulk of foreign investments concentrated in core countries. Nonetheless, with the acceleration of the economic integration process in the EU, peripheral countries have gained prominence.
Since tariffs and non-tariff barriers were already eliminated in the 1990s, the accession of the Central and Eastern European Countries (CEECs) into the EU in 2004 provides a quasi-natural experimental setting that can be used to investigate the importance of behind-the-border barriers across integrated markets. Motivated by these developments and the role of Germany as a major hub in the EU, it is increasingly important to understand the factors underlying German investment decisions across European countries using robust statistical techniques.
However, ascertaining the drivers behind FDI is not an easy task. As was made clear in the knowledge-capital model (Markusen and Maskus 2002), FDI is a combination of both vertical FDI (VFDI) and horizontal FDI (HFDI). In the first case, firms split activities between different geographical regions, while in the second, firms replicate domestic activities in a foreign country. The expansion and complexity of GVCs led Yeaple (2003) to coin the term "complex FDI" for the FDI generated by these mixed motives, and more recently, Baldwin and Okubo (2014) developed the concepts of "horizontal-ness" and "vertical-ness" to systematically account for these more complex forms of FDI. Many researchers employ the gravity model to approximate the cross-country patterns of FDI, which has proved to be solid not only with respect to its good fit to the data, but also considering its underlying theoretical foundations. The earliest and most influential theoretical contributions include Bergstrand and Egger (2007) and Head and Ries (2008), who derived general equilibrium theories for FDI. Later, Kleinert and Toubal (2010) showed that gravity equations can be used to discriminate between different theoretical approaches. Recent developments in the literature that set the ground for structural gravity models can be found in Yotov et al. (2016) and Anderson et al. (2019, 2020). However, the empirical literature suggests a lack of consensus on the drivers of FDI due to the variety of model specifications (that is, the choice of variables) and estimation methods applied by researchers. This problem is referred to in the statistical literature as the model uncertainty problem and involves two main challenges. The first one, also known as the variable selection problem, has been addressed by employing model averaging techniques in studies such as Blonigen and Piger (2014), Eicher et al. (2012) and Camarero et al. (2019).
The latter applies a Bayesian model averaging (BMA) analysis to identify the long-run correlates of German outward FDI. More specifically, this approach consists of attaching probabilities to each of the possible model specifications over the model space. Overall, these studies found that the robust FDI specification is more parsimonious than those previously suggested in the literature.
The second challenge is related to the uncertainty about the econometric specification of the FDI gravity model, and more specifically to the choice of the estimation method. New developments in the literature (such as new theoretical approaches, the use of panel data and other econometric improvements) have highlighted several empirical problems in estimating the gravity equation and generated a debate with divergent opinions about the best performing estimator. A primary concern is related to the econometric problems encountered when estimating the gravity equation in its additive form (i.e., log-log form). Santos Silva and Tenreyro (2006) argued that the conventional practice in the literature of log-linearizing the gravity model and subsequently estimating it in its additive form through ordinary least squares (OLS) could not deal with zero-valued bilateral FDI observations and heteroskedasticity in the data and thereby led to misleading estimates. Consequently, they propose to estimate the gravity model in its multiplicative form. Another concern involves the choice of the most suitable estimation method to deal with zero-valued bilateral FDI observations and with heteroskedasticity in the error term, which may give rise to biased estimates. Zero values are frequent in FDI data, and neglecting them might provide inconsistent estimates. Moreover, estimating the gravity model in its log-linear form rather than in levels can lead to very misleading conclusions in the presence of heteroskedasticity, because the log transformation makes the errors generally correlated with the covariates. Therefore, several alternatives to address this issue have been proposed in the literature.
The most successful and frequently used has been the Poisson pseudo-maximum likelihood (PPML) estimator, a special case of the generalized linear model (GLM) framework, proposed by Santos Silva and Tenreyro (2006). They argue that the PPML estimator naturally deals with zero FDI observations and is consistent in the presence of heteroskedasticity.
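The argument of Santos Silva and Tenreyro (2006) can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the sample size and the helper `ppml` are illustrative assumptions. It draws a multiplicative error with conditional mean one whose variance depends on the regressor, and then compares OLS on the log-linearized model with a PPML fit obtained by iteratively reweighted least squares.

```python
import numpy as np

def ppml(X, y, n_iter=100, tol=1e-10):
    """Poisson pseudo-maximum likelihood with a log link, fitted by IRLS."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())            # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu           # working response
        XtW = X.T * mu                    # Poisson IRLS weights equal mu
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Hypothetical data-generating process (an assumption for illustration only)
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.0, 1.0])
mu = np.exp(X @ beta_true)

# Log-normal error with E[eps | x] = 1 and Var[eps | x] = 1 / mu,
# so that Var[y | x] is proportional to the conditional mean.
sigma2 = np.log(1.0 + 1.0 / mu)
eps = np.exp(rng.normal(size=n) * np.sqrt(sigma2) - 0.5 * sigma2)
y = mu * eps

# OLS on the log-linearized model versus PPML on the model in levels
beta_ols = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
beta_ppml = ppml(X, y)

print("true slope:", beta_true[1])
print("log-linear OLS slope:", beta_ols[1])
print("PPML slope:", beta_ppml[1])
```

In this design the error is strictly positive, so no observation is lost when taking logs; the gap between the two slope estimates is therefore driven by heteroskedasticity alone, not by the treatment of zeros.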
Nevertheless, although the literature has pointed toward the use of GLM estimators, particularly the PPML, as opposed to OLS, more recent studies have questioned the choice of the PPML by default and point to a model selection approach to choose the best GLM estimator in each specific case. Empirical contributions that address this issue for the gravity model of trade include Martin and Pham (2008), Burger et al. (2009), Siliverstovs and Schumacher (2009), Westerlund and Wilhelmsson (2011), Martínez-Zarzoso (2013), Gómez-Herrera (2013), Head and Mayer (2014) and Egger and Staub (2016). The results obtained are still inconclusive.
Against this backdrop, we aim to contribute to the literature on the drivers of FDI by tackling the model uncertainty problem related to two empirical issues: the variable selection and the choice of the estimation method. Given our interest in explaining the drivers of outward German FDI in the EU for the period 1996-2012, our paper delivers new evidence in two respects. First, we extend the previous analysis developed by Camarero et al. (2019) by including country-pair-fixed effects in the BMA analysis to select the appropriate set of variables. We follow the approach proposed by Moral-Benito (2012) for BMA, and the recommendations in Yotov et al. (2016). Using time and country-pair-fixed effects to account for any unobservable time-invariant FDI and trade cost components has been proven to be a better measure of the bilateral costs than the standard set of gravity variables. Moreover, this also deals with the endogeneity of preferential trade agreements by accounting for the observable and unobservable linkages between the endogenous trade policy covariate and the error term. Second, considering the results obtained in the BMA with fixed effects, we analyze the performance of different estimators in a GLM framework: Poisson pseudo-maximum likelihood (PPML), Gamma pseudo-maximum likelihood (GPML), negative binomial pseudo-maximum likelihood (NBPML) and Gaussian GLM. Once the appropriate estimator is selected, we provide a robust estimation of the main drivers of German FDI in Europe.
The remainder of the paper proceeds as follows. Section 2 presents the analytical framework and the econometric specification. Section 3 briefly lays out the BMA methodology as well as the alternative estimators considered together with the data. Section 4 reports the results, and Sect. 5 concludes.

Model uncertainty in FDI gravity model estimation
The gravity approach to FDI describes the volume of bilateral FDI between two countries as positively related to their economic sizes and negatively to the distance between them. During the last decade, some of the literature on FDI tried to generalize the use of the gravity approach to analyze FDI patterns (Brainard 1997; Eaton and Tamura 1994). Nonetheless, there was a lack of theoretical foundation for the gravity equations for FDI. Since Bergstrand and Egger (2007), such a theoretical foundation does exist. They extend the 2 × 2 × 2 knowledge-capital model in Markusen and Maskus (2002) by adding an extra factor and country, and derive a specification for the FDI gravity equation that explains its empirical fit to the data. This seminal paper has been followed by Head and Ries (2008), Kleinert and Toubal (2010) and, more recently, by the structural gravity models in Yotov et al. (2016) and Anderson et al. (2019, 2020). Nowadays, the theoretical justification of the gravity model for FDI is no longer questioned. In this paper, we adopt as a starting point the approach by Kox and Rojas (2019), which is based on the above-mentioned structural gravity model of Anderson et al. (2019, 2020). Although we augment the classical gravity covariates with other variables suggested by the literature, the most important difference with the above-mentioned models is that our specification cannot be structural, as we analyze the German outward FDI stock instead of bilateral FDI. The value of bilateral FDI originating from country i and hosted in country j is represented by $FDI^{stock}_{ij}$, where i is Germany in our case. This variable is positively affected by the size of the origin country ($E_i$), because larger economies tend to invest more in capital.
Likewise, FDI stock is positively affected by the size of the destination country ($Y_j$), because larger economies can in principle absorb more foreign investment, and inversely related to the amount of technology capital in the source country ($M_i$), due to diminishing returns. At the same time, bilateral FDI is hindered by barriers or frictions. Those include first the standard bilateral trade frictions (PTAs, distance, contiguity, common language and colonial ties), but also the explicit barriers to FDI. The latter refers to all other possible bilateral FDI frictions between i and j, such as infrastructure, legal system (enforceability of contracts), government characteristics (regulatory quality, government effectiveness, political stability and corruption), labor quality and labor costs, corporate tax rate and other restrictive measures, as well as the impact of Bilateral Investment Treaties (BITs) and currency unions.
The following equation summarizes these relationships:

$$FDI^{stock}_{ij} = \alpha \, \omega_{ij}^{\eta} \, \frac{E_i}{P_i} \, \frac{Y_j}{M_i} \qquad (1)$$

where $Y_j/M_i$ measures country j's potential absorption capacity for FDI-related technology having its origin in country i, whereas $\omega_{ij}^{\eta}$ is the degree of openness of country j to i's technology (German, in this case). $\omega_{Gj}$ is assumed to take values between 0 and 1 and captures the above-mentioned explicit barriers or frictions to FDI. In addition, parameter $\eta$ is the elasticity of FDI revenue flows to openness. Finally, $\alpha$ represents a set of fixed parameters from the theoretical model and $P_i$ denotes the multilateral resistance as described by Anderson et al. (2020) or Kox and Rojas (2019) (see footnote 3).
As we are interested in the long-run drivers of outward FDI, we employ stock data, which fluctuate less and are in general more reliable than year-to-year FDI flow data. This is also in line with the theoretical approach adopted in the paper. Indeed, the aforementioned models build on the technology capital or knowledge-capital interpretation of FDI. Since knowledge-capital flows are largely intangible and hence difficult to measure, the stock of FDI is used as a proxy for the flow of knowledge (technology) capital between two countries. The effects of technology capital on FDI are captured by country-specific levels of GDP as in Nguyen et al. (2020).
Footnote 3: It should be noted that the strategy followed to account for the multilateral resistance term was somewhat restricted by the focus of interest of the study. Particularly, we include importer fixed effects (and year dummies) instead of the standard approach proposed by Anderson and van Wincoop (2003), consisting of the introduction of exporter-and-year and importer-and-year fixed effects. This is because the inclusion of these country-specific dummies would absorb all other observable and unobservable characteristics which are country specific and time varying, including various national policies, institutions and exchange rates,
In this study, we estimate an enlarged version of Eq. (1) which includes some additional country-specific variables along the lines of Blonigen and Piger (2014). More precisely, we also consider other related GDP and population measures, factor endowments and productivity, economic risk and exchange rate variables and trade openness measures. To the extent that we include variables capturing horizontal and vertical FDI motivations or a combination of both, we are able to give some hints regarding German FDI strategies in the EU. In relation to the variables included, it should be noted that despite our efforts to capture all possible drivers of FDI highlighted by the literature, there may still exist other relevant factors (such as environmental or geopolitical variables, which are omitted here) that may impact FDI. It is thus possible that the omission of such additional covariates may lead to omitted variable bias in the empirical specification (Blonigen 2005). Nevertheless, despite this potential limitation, this study provides a robust analysis of a broad set of plausible correlates of FDI.
Inspired by the framework represented by Eq. (1), we would arrive at a regression of the type given in Eq. (2), where $FDI_{Gjt}$ denotes outward FDI stock from country G (Germany) to country j in any period t. Matrix $X_{jkt}$ denotes all k FDI long-run drivers specific to the destination country, for example, Productivity, Skilled labor, Education level, etc.; $Z_{Gjt}$ contains bilateral covariates such as Similarity of HOST and PARENT real GDP, Squared GDP difference or Squared education difference. Additionally, we include host country fixed effects $\lambda_j$ and time fixed effects $\gamma_t$; lastly, $\varepsilon_{ijt}$ is an error term such that $\varepsilon_{ijt} \sim N(0, \sigma^2)$. Despite the well-established theoretical foundation of the FDI gravity model and its popularity in the empirical literature, its estimation suffers from a number of econometric issues, which, as reviewed in this section, have led to a debate on the estimation method. In particular, heteroskedasticity in the data and the treatment of zero values in the dependent variable are the two specific problems most often encountered in gravity model estimation (Matyas 2017).
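As an illustration of the regression described above, one multiplicative specification consistent with the stated definitions, together with its log-linearized counterpart, is sketched below. The coefficient symbols $\beta_0$, $\beta_k$ and $\delta$, and the exponential functional form, are assumptions made here for exposition; they are not taken from the paper's own equations. In the second line, $\zeta_{Gjt}$ denotes the residual of the equation in logarithms.

```latex
\begin{align}
FDI_{Gjt} &= \exp\Big(\beta_0 + \sum_{k} \beta_k X_{jkt} + \delta' Z_{Gjt}
             + \lambda_j + \gamma_t\Big)\,\varepsilon_{ijt}, \\
\ln FDI_{Gjt} &= \beta_0 + \sum_{k} \beta_k X_{jkt} + \delta' Z_{Gjt}
             + \lambda_j + \gamma_t + \zeta_{Gjt},
\qquad \zeta_{Gjt} = \ln \varepsilon_{ijt}.
\end{align}
```

Under this sketch, the second line is only defined for strictly positive FDI stocks, and the conditional mean of $\zeta_{Gjt}$ depends on the regressors whenever the variance of $\varepsilon_{ijt}$ does, which is the source of the two problems mentioned above.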
Footnote 3 continued: because of perfect collinearity (Yotov et al. 2016). However, we are aware that this approach only partially controls for multilateral resistance and hence may lead to biased coefficient estimates for the time-varying country-specific characteristics by not accounting for time variation in the multilateral resistance term (Berden et al. 2012).

Based on Jensen's inequality, which states that $E[\ln(\varepsilon_{ij})] \neq \ln E(\varepsilon_{ij})$, Santos Silva and Tenreyro (2006) demonstrate that the commonly used OLS estimation of the log-linearized gravity model provides biased and inconsistent estimates. This is because, with heteroskedastic data, the expected value of the log-linearized error term, $E[\ln(\varepsilon_{ij})]$, will depend on the regressors, thus leading to fallacious inferences. Furthermore, OLS fails to model zero FDI flows, as these observations are dropped from the sample when taking logarithms. Consequently, Santos Silva and Tenreyro (2006) recommend estimating constant-elasticity models (such as the gravity model) in their original multiplicative form and propose the use of the PPML estimator instead. The PPML estimator is a special case of the GLM framework, in which the variance is assumed to be proportional to the mean. This implies that the PPML estimator weights all observations equally. According to Santos Silva and Tenreyro (2006), the PPML estimator has a number of interesting properties. First, it provides a natural way of dealing with zero-valued FDI observations, as the functional form allows the dependent variable to be included in levels. Second, even though the proportionality assumption does not usually hold, it provides consistent estimates in the presence of heteroskedasticity once a robust covariance matrix is considered. In light of these considerations, the literature has turned toward multiplicative functional form estimators (namely, GLMs), and more precisely, the PPML estimator, which has been considered the "workhorse" estimator of the gravity equation since then.
This would be equivalent to taking logarithms in Eq. (2) and obtaining a log-linear equation in which $\zeta_{Gjt}$ are the new residuals.
Later on, the availability of panel data made it possible to improve the estimation of the gravity model by including exporter-and-year and importer-and-year fixed effects in addition to dyadic fixed effects, thus controlling for unobserved heterogeneity and endogeneity issues (Baldwin and Taglioni 2007; Martínez-Zarzoso et al. 2009). However, the inclusion of such a number of fixed effects may lead to computational difficulties. One strand of the literature has shown that the nonlinear estimation of the gravity model with different levels of fixed effects may introduce an incidental parameter problem (IPP), leading to biased estimates (small sample bias). In this respect, the literature has proposed bias correction methods for fixed effects estimators. In the gravity model context, Weidner and Zylkin (2020) show that analytical bias corrections account for this bias in a "three-way" fixed effects PPML estimator.
On the other hand, part of the literature has raised concerns about the PPML estimator and questions its ad hoc choice for gravity model estimation. Based on the argument that it allows for overdispersion, some researchers have recommended the use of the NBPML (Burger et al. 2009; Egger and Staub 2016). In particular, Egger and Staub (2016) compare the performance of GLM estimators by conducting a set of Monte Carlo simulations together with an empirical application and find that the NBPML is the preferred estimator for the chosen specification of the trade gravity equation. The NBPML estimator assumes that the variance is a specific quadratic function of the mean, which implies that it down-weights observations with larger means. As highlighted by Egger and Staub (2016), this could lead to efficiency gains whenever the observations with larger means also exhibit a larger variance (i.e., noisier observations). The primary reason for applying NBPML is to improve efficiency, as it comprises both the PPML and GPML assumptions (Bosquet and Boulhol 2014). Other studies have pointed toward the use of GPML. The GPML estimator is based on the assumption that the variance is a function of higher powers of the mean, and thereby, this estimator also down-weights observations with larger means. In relation to this, Santos Silva and Tenreyro (2006) point out that GPML might give excessive weight to the observations that are more prone to measurement errors. In light of the alternative estimators proposed by the literature, Santos Silva and Tenreyro (2011) extend their simulation study in Santos Silva and Tenreyro (2006) and demonstrate that their results validate the use of the PPML estimator even under overdispersion. Even though they acknowledge that the PPML estimator can be outperformed by alternative estimators in some applications, they consider that it should remain the benchmark against which alternative estimators are compared.
Similarly, Head and Mayer (2014) show that PPML and GPML are consistent in the presence of overdispersion. Nonetheless, they posit that GPML performs better than PPML for certain empirical applications. In a recent contribution, Pfaffermayr (2019) proves that the standard errors of the PPML estimated parameters are downward biased in cross-sectional data.
Aside from these estimators, the Gaussian GLM has also been applied in the gravity model empirical literature. The Gaussian GLM estimator assigns more weight to noisier observations (in the sense of a larger variance) and thus leads to a reduction in efficiency. It has been found to perform very poorly under heteroskedasticity and to suffer from sample selection bias (Santos Silva and Tenreyro 2006). Despite these limitations, it has been frequently used in the literature alongside alternative estimators for the sake of comparison. A comprehensive survey of alternative estimation techniques for the trade gravity model can be found in Gómez-Herrera (2013).
In what follows, we explain the methodological approach followed in this paper to address both variable selection and estimation method uncertainty.

Bayesian model averaging
Following the latest developments in the FDI gravity literature, we apply a model averaging approach to deal with the variable selection problem on the drivers of FDI. We rely on the same variable selection approach as Camarero et al. (2019) which is implemented in R using the package BayesVarSel (Garcia-Donato and Forte 2018).
The BMA approach is based on the notion that we are uncertain which of the potential competing models $M_\gamma$ generated the data, and thus, the information contained in the posterior probability $\Pr(M_\gamma \mid y)$, summarized in the posterior inclusion probability (PIP), captures this uncertainty. Furthermore, model averaged estimated coefficients (known as the posterior mean) can be obtained by averaging over all entertained models using the posterior probabilities as weights (see footnote 12). In an attempt to improve the BMA implementation, our analysis explicitly accounts for the panel structure of the data at hand and the potential endogeneity of variables. As Desbordes et al. (2018) argue, data poolability may not be a valid assumption in panel data applications. Accordingly, following the approach proposed by Moral-Benito (2012), and unlike in Camarero et al. (2019), we include fixed effects in order to capture unobserved common factors across countries. This implies that all the variables are expressed in deviations from the mean of their cross-sectional unit (i.e., the data are time-demeaned). In doing so, the coefficients of the explanatory variables with low or no time variability (such as distance, population or land area among others) cannot be estimated because of perfect collinearity with the fixed effects (Baltagi et al. 2014). In relation to endogeneity concerns, the fact that we consider German outward FDI stock by host countries instead of bilateral FDI stocks might lead to potential endogeneity problems. To address this problem, we include the one-year lagged values of those variables suspected to be endogenous, that is, GDP and trade-related variables.
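The mechanics of posterior model probabilities, PIPs and posterior means can be sketched on a toy linear regression. The example below is only an illustration under loudly stated assumptions: it approximates marginal likelihoods with the BIC and uses a uniform prior over models, which is not the prior structure of the BayesVarSel package used in the paper, and the function name `bma_bic` and the simulated data are hypothetical.

```python
import itertools
import numpy as np

def bma_bic(X, y):
    """Toy BMA over all subsets of the columns of X.

    Assumptions (not the paper's setup): uniform prior over models and
    posterior model probabilities approximated by exp(-BIC / 2).
    Returns the posterior inclusion probabilities and posterior means.
    """
    n, k = X.shape
    masks, bics, coefs = [], [], []
    for mask in itertools.product([False, True], repeat=k):
        idx = np.flatnonzero(mask)
        Z = np.column_stack([np.ones(n), X[:, idx]])   # intercept always included
        b = np.linalg.lstsq(Z, y, rcond=None)[0]
        rss = np.sum((y - Z @ b) ** 2)
        bics.append(n * np.log(rss / n) + Z.shape[1] * np.log(n))
        full = np.zeros(k)
        full[idx] = b[1:]                              # excluded coefficients are 0
        coefs.append(full)
        masks.append(mask)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))       # subtract min for stability
    weights /= weights.sum()                           # approximate Pr(M | y)
    pip = weights @ np.array(masks, dtype=float)       # sum of weights of models with x_j
    post_mean = weights @ np.array(coefs)              # model-averaged coefficients
    return pip, post_mean

# Hypothetical data: only the first of four candidate regressors matters
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=200)

pip, post_mean = bma_bic(X, y)
print("PIP:", np.round(pip, 3))
print("posterior mean:", np.round(post_mean, 3))
```

With fixed effects, the same enumeration would be applied to variables that have already been demeaned within each host country.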

Generalized linear models
While the literature points toward the multiplicative functional specification of the gravity model, there is uncertainty about the optimal nonlinear estimator. As alternatives to the PPML, some studies recommend other exponential family models (see footnote 13). Furthermore, these studies all claim that the proper estimator for the gravity model largely depends on the data, and thereby, there is scope for additional empirical analysis. To contribute to this strand of the literature, which has attracted the interest of many researchers, we compare several estimators in a GLM framework.
GLMs estimate the gravity models in their multiplicative form as:

$$y_{ijt} = \exp(x_{ijt}\beta)\,\varepsilon_{ijt}$$

where $E(\varepsilon_{ijt} \mid x) = 1$, $y_{ijt}$ is outward FDI from country i to country j in time t, $x_{ijt}$ are the explanatory variables, $\beta$ are the parameters to be estimated, and $\varepsilon_{ijt}$ is a composite error term which includes time-invariant host country fixed effects as well as time fixed effects together with the remainder of the error term. One should keep in mind that the estimation of such models may suffer from an IPP, causing a small-sample bias. To correct for this bias, the literature has proposed several methods. A feasible approach to the IPP is to implement analytical and jackknife bias corrections as derived by Fernández-Val and Weidner (2016). However, we argue, based on recent work by Hinz et al. (2019), that in our setting the issue is of minor importance. GLM estimators are maximum likelihood estimators based on an assumed linear exponential family (LEF) density, a linear predictor and a link function, which provides the relationship between the linear predictor and the mean (Nelder and Wedderburn 1972; McCullagh and Nelder 1989). Our modeling framework includes GLMs with a logarithmic link function and four exponential family distributions, whose key distinguishing attribute is the assumption on the functional form of $V[y_i \mid x]$.
Footnote 12: We refer the interested reader to Camarero et al. (2019) for more details on the BMA methodology.
Footnote 13: See Martínez-Zarzoso (2013), Head and Mayer (2014) and Egger and Staub (2016).
Accordingly, all four estimators are consistent as long as the conditional mean is correctly specified, and the question of which one performs better refers to relative efficiency gains, which depend on the specification of the variance function (see Head and Mayer 2014). Table 1 shows the conditional mean-variance relationships of each of the LEF distributions studied here. We obtain the PPML estimator under the assumption that the conditional variance is proportional to the conditional mean. GPML and NBPML, in turn, are obtained when the variance is a function of higher powers of the mean, whereas the Gaussian GLM is obtained when the variance function is constant (equal to 1). In the following subsection, we describe the data.
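The role of the variance function can be made concrete with a short sketch in which the four estimators differ only in the assumed $V(\mu)$. This is an illustrative implementation, not the authors' code: the function `fit_log_link`, the simulated data and the choice of a negative binomial dispersion parameter fixed at 1 are assumptions. With a log link, the iteratively reweighted least squares weight is $\mu^2 / V(\mu)$, so PPML weights by $\mu$, GPML weights all observations equally on the log scale, NBPML lies in between, and the Gaussian GLM weights by $\mu^2$.

```python
import numpy as np

# Assumed variance functions V(mu), up to a scale factor
VARIANCE = {
    "PPML": lambda mu: mu,
    "GPML": lambda mu: mu ** 2,
    "NBPML": lambda mu: mu + mu ** 2,        # dispersion fixed at 1 (assumption)
    "Gaussian": lambda mu: np.ones_like(mu),
}

def fit_log_link(X, y, variance, n_iter=200, tol=1e-10):
    """Pseudo-maximum likelihood for E[y | x] = exp(x'beta), fitted by IRLS."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())               # intercept-only starting value
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        w = mu ** 2 / variance(mu)           # (d mu / d eta)^2 / V(mu)
        z = eta + (y - mu) / mu              # working response for the log link
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Hypothetical data with a correctly specified conditional mean
rng = np.random.default_rng(2)
n = 2000
X = np.column_stack([np.ones(n), rng.uniform(size=n)])
beta_true = np.array([0.5, 1.0])
mu_true = np.exp(X @ beta_true)
y = mu_true * rng.gamma(shape=4.0, scale=0.25, size=n)   # E[error] = 1

estimates = {name: fit_log_link(X, y, v) for name, v in VARIANCE.items()}
for name, est in estimates.items():
    print(name, np.round(est, 3))
```

Because the conditional mean is correctly specified in this simulation, all four sets of estimates should be close to the true coefficients; they differ in how much sampling noise they carry, which is the efficiency question discussed above.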

Data
The analysis makes use of the data set described in Camarero et al. (2019), which provides information on German outward FDI stock over the period 1996-2012 in 59 destination countries (38 developed and 21 developing). For the purpose of the current study, we focus only on European Union countries and time-demean the data.
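The time-demeaning step, and the one-year lags used for variables suspected to be endogenous, can be sketched as follows. The function names, the toy panel and the reading of "time-demean" as subtracting each host country's mean over time are assumptions made for illustration.

```python
import numpy as np

def within_transform(values, country_ids):
    """Subtract from each observation the over-time mean of its country."""
    values = np.asarray(values, dtype=float)
    country_ids = np.asarray(country_ids)
    demeaned = np.empty_like(values)
    for c in np.unique(country_ids):
        rows = country_ids == c
        demeaned[rows] = values[rows] - values[rows].mean(axis=0)
    return demeaned

def lag_one_year(values, country_ids, years):
    """Return the previous year's value for the same country (NaN if missing)."""
    values = np.asarray(values, dtype=float)
    lookup = {(c, t): i for i, (c, t) in enumerate(zip(country_ids, years))}
    lagged = np.full(values.shape, np.nan)
    for i, (c, t) in enumerate(zip(country_ids, years)):
        j = lookup.get((c, t - 1))
        if j is not None:
            lagged[i] = values[j]
    return lagged

# Hypothetical panel: two host countries observed over three years
countries = ["A", "A", "A", "B", "B", "B"]
years = [1996, 1997, 1998, 1996, 1997, 1998]
gdp = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

gdp_demeaned = within_transform(gdp, countries)
gdp_lagged = lag_one_year(gdp, countries, years)
print(gdp_demeaned)
print(gdp_lagged)
```

A variable that does not vary over time within a country is mapped to zero by `within_transform`, which is why coefficients on time-invariant regressors cannot be estimated once fixed effects are included.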
Note that our study can be divided into two parts. First, we employ the variables reported in Table 2 to conduct the BMA analysis. Second (and most importantly), we consider those variables that exhibit a PIP above the recommended threshold of 0.50 in the BMA (see Table 2, or Fig. 3 in "Appendix" for a more intuitive representation) as our explanatory variables for the comparison of the alternative GLM estimators. A detailed description of variable definitions and data sources is presented in Table 5, while Table 6 reports the countries included.
A short note should be made regarding our FDI measure. We rely on FDI stocks extracted from the Bilateral UNCTAD FDI Statistics. Nevertheless, we bear in mind that this FDI measure may be somewhat distorted due to differences in corporate accounting practices and valuation methods across countries, and hence, results should be interpreted with caution. Although UNCTAD FDI statistics do not report FDI to special purpose entities (SPEs), they may still capture statistical artefacts, such as round tripping. In 2014, the IMF's Balance of Payments and International Investment Position Manual (BPM6) and the fourth edition of OECD's Benchmark Definition of Foreign Direct Investment (BD4) provided new guidelines for FDI compilation in order to improve the quality of the data. However, Blanchard and Acalin (2016) posit that these practices do not completely remove the uncertainty surrounding the quality of FDI data. They examine the correlation between FDI inflows and outflows for the USA, as well as the correlation between outflows and the US policy rate, and provide evidence of the speculative nature of FDI measures. More recently, Dellis et al. (2017), using a new OECD database on FDI statistics (OECD BMD4) that filters out the distortive effects from the data together with the approach proposed by Blanchard and Acalin (2016), found that their results were robust to the use of the "non-cleaned" FDI data set from UNCTAD. Therefore, even though our data might not be completely free of such distortions, they allow us to provide insights on the long-run behavior of investment decisions.
In the next section, we focus on the choice of the most appropriate estimation method for the FDI gravity model and present the results in two subsections. In Sect. 4.1, we report the results for the comparative assessment of the alternative GLM estimators. Once the appropriate estimator is considered, Sect. 4.2 reports the drivers of German outward FDI in the EU distinguishing between core and peripheral EU countries.

A comparative assessment of GLM estimators
In this section, we report the results obtained using the alternative GLM estimators. In order to assess the performance of the different GLM estimators, we rely on different measures of goodness of fit. First, the Ramsey (1969) Regression Equation Specification Error Test (RESET) is computed to assess the general misspecification of the estimators. If the test rejects the null hypothesis of correct specification, it would mean either that the model is inappropriate due to its functional form or that some relevant information is missing. We also compare the deviance and dispersion of the residuals in the different GLM families. Smaller values indicate a better model fit. We also provide three goodness-of-fit functions: the bias, the mean squared error (MSE) and the absolute error loss. The latter is considered more appropriate than the bias, as shown in Martínez-Zarzoso (2013). Finally, we also compute the mean absolute percent error (MAPE) and the root-mean-square error (RMSE). In both cases, a smaller value generally indicates a better model fit.
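The loss-based measures listed above can be written down compactly. The sketch below uses the textbook definitions; the sign convention for the bias (fitted minus observed) and the exclusion of zero observations from the MAPE are assumptions, since the paper's exact formulas are not reproduced here.

```python
import numpy as np

def fit_measures(y, y_hat):
    """Bias, MSE, RMSE, absolute error loss (MAE) and MAPE of fitted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    err = y_hat - y                      # assumed sign convention: fitted - observed
    mse = np.mean(err ** 2)
    positive = y > 0                     # MAPE is undefined where y equals zero
    return {
        "bias": np.mean(err),
        "mse": mse,
        "rmse": np.sqrt(mse),
        "mae": np.mean(np.abs(err)),
        "mape": 100.0 * np.mean(np.abs(err[positive]) / y[positive]),
    }

# Small worked example with hypothetical numbers
measures = fit_measures([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
for name, value in measures.items():
    print(name, round(float(value), 4))
```

In this example the errors are 0, 1 and -2, so the bias is negative while the absolute error loss equals 1: positive and negative errors offset each other in the bias but not in the absolute loss, which is one reason why the latter can be the more informative of the two.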
Tables 3 and 4 report the estimated coefficients as well as the goodness-of-fit statistics for core and peripheral EU countries, respectively. Furthermore, we also use graphical techniques to assess the validity of the models. We provide plots of the residuals most widely used in model selection for GLMs, the Pearson and deviance residuals (McCullagh and Nelder 1989). To informally check the validity of the assumed variance function, we examine the scatterplots of the Pearson residuals in the upper half of Figs. 1 and 2. An incorrectly specified variance function will result in a trend in the mean. Therefore, for a proper specification of the variance function, we should expect the Pearson residuals to be mean independent of the conditional mean (i.e., a horizontal line). Nevertheless, the deviance residuals are generally preferred to the Pearson residuals, as pointed out by McCullagh and Nelder (1989). Thus, in the lower half of Figs. 1 and 2, we plot the density of the deviance residuals for the different GLM estimators in order to gain further insights into the adequacy of the variance function. The deviance residuals are approximately normally distributed if the model is correctly specified. Following Egger and Staub (2016), we plot the kernel density of the deviance residuals, illustrated by the black dashed curve, together with a normal density plot based on the same variance for readability.
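For reference, the two residual types can be sketched for the Poisson and Gamma variance functions using the standard textbook formulas. The function names are hypothetical, the residuals are left unscaled (no dispersion parameter), and the sketch is not the code used to produce Figs. 1 and 2.

```python
import math


def pearson_residual(y, mu, family):
    """Raw residual divided by the square root of the variance function V(mu)."""
    if family == "poisson":
        variance = mu          # V(mu) = mu
    elif family == "gamma":
        variance = mu * mu     # V(mu) = mu^2
    else:
        raise ValueError("family must be 'poisson' or 'gamma'")
    return (y - mu) / math.sqrt(variance)


def unit_deviance(y, mu, family):
    """Contribution of one observation to the (unscaled) deviance."""
    if family == "poisson":
        # y * log(y / mu) is taken as 0 when y == 0 (its limit).
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        return 2.0 * (log_term - (y - mu))
    if family == "gamma":
        # Requires y > 0, as the Gamma deviance is undefined at zero.
        return 2.0 * ((y - mu) / mu - math.log(y / mu))
    raise ValueError("family must be 'poisson' or 'gamma'")


def deviance_residual(y, mu, family):
    """Signed square root of the unit deviance."""
    d = max(unit_deviance(y, mu, family), 0.0)  # guard against rounding below 0
    return math.copysign(math.sqrt(d), y - mu)


if __name__ == "__main__":
    # Toy observation y = 4 with fitted mean mu = 2.
    for fam in ("poisson", "gamma"):
        print(fam, pearson_residual(4.0, 2.0, fam), deviance_residual(4.0, 2.0, fam))
```

Plotting the Pearson residuals against the fitted means, and the density of the deviance residuals against a normal density, gives the two informal checks described above.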
Comparing the results of the different estimators, we observe that, for both samples, GPML and NBPML yield very similar results, with estimated coefficients of similar magnitude and sign. The same holds for the PPML and Gaussian GLM estimators. Because we consider GLMs with a logarithmic link function, the estimated coefficients can be interpreted as semi-elasticities (Cameron and Trivedi 2009). Among the alternative estimators, the Gaussian GLM fails to pass the RESET test at the 1% significance level for the sample of core EU countries, whereas for the peripheral countries' sample all the estimators pass the test.
Notes to Tables 3 and 4: Country-pair clustered standard errors are in parentheses. *, ** and *** denote significance at the 10%, 5% and 1% levels, respectively. Posterior inclusion probabilities larger than 0.5. The smallest values of the goodness-of-fit statistics are highlighted in bold, as they denote a better model fit.
Concerning the overall deviance and its dispersion, the GPML estimator presents the best fit. As regards the goodness-of-fit functions, our results show that all the estimators exhibit a bias, variance and error loss of similar magnitudes. However, Gaussian GLM and GPML display the lowest bias for the core and peripheral samples, respectively. The smallest variance is shown by GPML for both country groups, and the smallest error loss by NBPML for both country groups. Finally, MAPE and RMSE point to quite similar results: the lowest values are displayed by GPML for core countries, whereas for peripheral countries NBPML exhibits the lowest values. According to the graphical evidence, the Pearson residuals indicate that GPML and NBPML perform better than PPML and Gaussian GLM. Likewise, the deviance residuals provide further evidence in favor of GPML.
Overall, the goodness-of-fit criteria might seem to provide somewhat conflicting results; yet the difference between GPML and NBPML is negligible, as one has to look at the third and fourth decimals to identify the best estimator. Moreover, the graphical techniques and, more precisely, the deviance residuals point toward GPML as the best estimator, in line with previous results. Taken together, the goodness-of-fit statistics and the visual inspection of the residuals therefore point to GPML as the best performing estimator.
Our findings concur with previous empirical studies that question the ad hoc estimation of the gravity model by PPML and recommend that it should be compared against alternative estimators. In particular, our results are supported by Martínez-Zarzoso (2013) who shows through Monte Carlo simulations that GPML outperforms PPML. Likewise, Egger and Staub (2016) compare several estimators of the gravity model of trade through a Monte Carlo experiment and conclude that NBPML appears to be the best estimator for their application.
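To make the difference between PPML and GPML concrete, the sketch below fits E[y|x] = exp(a + b·x) by iteratively reweighted least squares for a single regressor. It is a stylised illustration only: the function name `fit_log_link`, the toy data and the one-regressor setting are assumptions, and the paper's specifications include several covariates and fixed effects. The only difference between the two estimators in this sketch is the weight each observation receives: proportional to the fitted mean under the Poisson variance function, and constant under the Gamma variance function.

```python
import math


def fit_log_link(x, y, family, iterations=50):
    """Fit E[y|x] = exp(a + b*x) by iteratively reweighted least squares.

    Hypothetical single-regressor helper; family is 'poisson' (PPML)
    or 'gamma' (GPML).
    """
    if family not in ("poisson", "gamma"):
        raise ValueError("family must be 'poisson' or 'gamma'")
    a, b = math.log(sum(y) / len(y)), 0.0  # start from a constant-mean model
    for _ in range(iterations):
        eta = [a + b * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response for the log link: eta + (y - mu) / mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # IRLS weights (dmu/deta)^2 / V(mu): mu for Poisson, 1 for Gamma.
        w = mu if family == "poisson" else [1.0] * len(x)
        sw = sum(w)
        sx = sum(wi * xi for wi, xi in zip(w, x))
        sz = sum(wi * zi for wi, zi in zip(w, z))
        sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        sxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        # Closed-form weighted least squares of z on a constant and x.
        det = sw * sxx - sx * sx
        b = (sw * sxz - sx * sz) / det
        a = (sz - b * sx) / sw
    return a, b


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [1.0, 3.0, 2.5, 6.0, 5.0]
    print("PPML:", fit_log_link(x, y, "poisson"))
    print("GPML:", fit_log_link(x, y, "gamma"))
```

At convergence, the Poisson fit sets the sum of (y − μ)·x to zero, whereas the Gamma fit sets the sum of ((y − μ)/μ)·x to zero. GPML therefore gives relatively less weight to observations with large fitted values, which is why the two estimators can disagree when the variance grows faster than the mean.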
Following the study of Camarero et al. (2019), we tackle the heterogeneity in FDI destinations by disaggregating the European region into two different country groups. In the next subsections, we describe the results obtained with our preferred estimation method, GPML. In Sect. 4.2.1, we report the results for core EU countries, whereas those for peripheral EU countries are reported in Sect. 4.2.2.
Our findings show that, consistent with previous literature (Faeth 2009), the drivers of German outward FDI do not lie in one specific FDI theory but in a combination of vertical and horizontal FDI drivers.

German FDI in EU countries
The estimated coefficients for core and peripheral EU countries are reported in Tables 3 and 4, respectively. Columns (1), (3) and (4) show results for the competing GLM estimators. In this section, we discuss the results for GPML, which are displayed in column (2), as it turns out to perform best in the assessment conducted in the previous section. For the sake of comparison, the posterior means of the robust variables identified by BMA are also included in column (5).

German FDI in core EU countries
The first point worth mentioning in Table 3 is the fact that out of the 9 variables singled out by the BMA analysis, only 4 are found to be significant FDI drivers once the specification of the model is refined by using the appropriate GLM estimator and including the allowed set of fixed effects. Comparing GPML and BMA estimates (namely columns (2) and (5)), we see that the magnitude and sign of the coefficients remain stable with a few exceptions.
As expected, the effect of market size of the host country as measured by GDP (Lag GDP) is positive and highly statistically significant. This finding suggests that German outward FDI follows horizontal motivations in core EU countries. Concerning factor endowments, the level of education is found to be marginally significant, yet the estimated sign is contrary to the expected effect. A plausible explanation for this finding might be that the variable is acting as a proxy for wages and hence providing some evidence for vertical motivations.
As regards telecommunications infrastructure, the coefficient estimate of the number of fixed telephone subscriptions (Telephone subscriptions) of about −0.715 supports the notion that, in recent times, mobile cellular subscriptions have increased to the detriment of fixed ones. This result may reflect that a modern telecommunications infrastructure is required for attracting new investments. Furthermore, telecommunications can also reduce transaction costs, facilitate business operations and thereby increase efficiency (Gholami et al. 2006).
Finally, FDI positively responds to the quality of institutions as captured by the regulatory quality index (Regulatory Quality). The estimated parameter of about 0.014 is aligned with the findings in Berden et al. (2012), although with a slightly smaller size.
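As a worked illustration of how coefficients from a log-link model translate into percentage effects, the sketch below converts a coefficient into the implied change in the conditional mean. The function names are hypothetical, and whether a given regressor enters the specification in levels or in logs is an assumption made here for illustration.

```python
import math


def percent_effect_level(beta, delta=1.0):
    """Percent change in E[y|x] when a regressor in levels rises by delta units."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


def percent_effect_log(beta, pct_change=1.0):
    """Percent change in E[y|x] when a regressor in logs rises by pct_change percent."""
    return 100.0 * ((1.0 + pct_change / 100.0) ** beta - 1.0)


if __name__ == "__main__":
    # 0.014: Regulatory Quality coefficient, assumed to enter in levels.
    print(percent_effect_level(0.014))
    # -0.715: fixed telephone subscriptions coefficient, assumed to enter in logs.
    print(percent_effect_log(-0.715))
```

Under these assumptions, a one-point increase in the regulatory quality percentile rank is associated with roughly 1.4% more FDI, and a 1% increase in fixed telephone subscriptions with roughly 0.7% less. For small coefficients the exact transformation and the coefficient itself nearly coincide, which is why the raw estimate is commonly read directly as a (semi-)elasticity.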
Overall, we may conclude tentatively from our results that the main strategy of German outward FDI in core EU countries is market-driven (horizontal FDI). However, this motivation seems to coexist with vertical motives as well as the quality of institutions.

German FDI in peripheral EU countries
For peripheral EU countries, Table 4 shows that 4 variables, out of the 7 posited by the BMA analysis, are found to be statistically significant by our GPML estimation. The magnitude and sign of the coefficients are quite similar to the BMA estimates.
In line with previous studies, host country population (Population) is negative and significant at the 10% level; see Brenton and Di Mauro (1999) or Gutiérrez-Portilla et al. (2019), among others. This is because, for a given GDP, a larger population implies a lower GDP per capita, which has a negative effect on FDI. This outcome is consistent with the gravity model and points to horizontal FDI motivations.
On the other hand, we find that a real depreciation of the host country currency has a positive effect on German outward FDI. The estimated parameter of 0.419 for the exchange rate (Exchange rate) is in line with previous findings by Blonigen (1997). The rationale behind this outcome is that a depreciation reduces the price of domestic assets for foreigners, hence lowering production costs, which should enhance FDI; see Froot and Stein (1991), Blonigen (1997) or Chakrabarti (2001). It is also found that globalization, as measured by the KOF Globalization Index (KOF Globalization Index), has a positive and statistically significant impact on outward FDI. The results support the notion that globalization and openness factors promote economic growth and therefore attract FDI, as in, for example, Potrafke (2015).
Finally, the coefficient on the number of fixed telephone subscriptions (ln_h_lines) has the same sign and a similar magnitude as that reported for the group of core EU countries. Again, this finding suggests that advanced telecommunications technology might be considered a driving force in attracting FDI and is associated with cost reductions.
In summary, our results allow us to infer that German outward FDI follows mainly vertical motivations in peripheral EU countries. Indeed, with the acceleration of the EU economic integration process and the resulting reduction of trade costs, German MNEs have relocated parts of the production process to new EU member states, primarily the Czech Republic, Hungary, Poland and Slovakia (IMF 2013). However, not only cost efficiency matters for investment in these countries, but also horizontal motivations.

Concluding remarks
Since the 1990s, the global economy has been affected by two major trends: a sharp rise in FDI and a gradual increase in countries' participation in GVCs. In this context, Germany has established itself as the core of the EU production hub. Consequently, an accurate estimation of the factors that drive German FDI across European countries is of utmost importance to policy makers seeking to attract FDI to create new job opportunities and foster growth.
The gravity model has become a popular tool to identify the drivers of the bilateral distribution of FDI. Even though the theoretical foundations of the FDI gravity model are nowadays well established, there is no consensus concerning its empirical estimation. In this respect, the literature on the drivers of FDI faces a model uncertainty problem that involves two main challenges: the choice of the variables considered as its drivers and the estimation methods. The first problem has recently been addressed using model averaging techniques. Yet, there exists controversy regarding the second one.
Since Santos Silva and Tenreyro (2006), the golden rule in the empirical studies has been the implementation of multiplicative functional form estimators and, in particular, the PPML estimator. However, some literature has also argued that the PPML estimator is not always the best performing estimator and thus, additional estimators of the same GLM family have been suggested.
This paper contributes to the extant discussion in the literature and adds to current knowledge in several respects. First, we build on previous BMA studies by adding fixed effects to select the appropriate set of variables to include in the FDI gravity model. Second, we follow a model selection approach based on several goodness-of-fit statistics and graphical techniques in order to assess the performance of different GLMs and show that GPML is the estimator best matched to our data. Third, our refined GPML estimation provides some guidelines on German outward FDI motivations in core and peripheral EU countries for the period 1996–2012.
Our findings suggest that, consistent with recent theoretical and empirical contributions, German outward FDI does not follow a single FDI motivation. Nonetheless, we may conclude with caution that different FDI motivations seem to prevail for each group of host countries. More precisely, the strategy that better captures the pattern of German outward FDI in core EU countries corresponds to horizontal FDI, whereas efficiency-seeking (vertical FDI) seems to prevail in peripheral EU countries.

Appendix A
See Tables 5 and 6

Institutions
Regulatory quality: Regulatory quality, in percentile rank (ranges from 0 (lowest) to 100 (highest)). Source: World Governance Indicators (WGI), World Bank