Residual Kriging for Functional Spatial Prediction of Salinity Curves

Recently, several methodologies to perform geostatistical analysis of functional data have been proposed. All of them assume that the spatial functional process considered is stationary. However, in practice, we often have nonstationary functional data because there exists an explicit spatial trend in the mean. Here, we propose a methodology to extend kriging predictors for functional data to the case where the mean function is not constant through the region of interest. We consider an approach based on the classical residual kriging method used in univariate geostatistics. We propose a three steps procedure. Initially, a functional regression model is used to detrend the mean. Then we apply kriging methods for functional data to the regression residuals to predict a residual curve at a non-data location. Finally, the prediction curve is obtained as the sum of the trend and the residual prediction. We apply the methodology to salinity data corresponding to 21 salinity curves recorded at the Ciénaga Grande de Santa Marta estuary, located in the Caribbean coast of Colombia. A cross-validation analysis was carried out to track the performance of the proposed methodology.


Introduction
In the last years the number of situations where the data to be analyzed are functions have increased. Since the beginning of the nineties, functional data analysis (FDA) (Ramsay and Silverman, 2005) has been used to describe, analyze, and model this kind of data. Functional versions for a wide range of statistical tools (ranging from exploratory and descriptive data analysis to linear models and multivariate techniques) have been developed (González-Manteiga and Vieu, 2007). An overview about the state of the art in methodological, practical, and theoretical aspects of functional data analysis can be found in Ferraty and Romain (2011).
The standard statistical techniques for FDA are focused on assuming independence among functions. However, in several disciplines of applied sciences there exists interest in modeling correlated functional data: this is the case when functions are observed over a discrete set of time points or when these functions are observed in different sites of a region. In this article, we focus on spatially correlated functional data, and particularly in predicting curves in sites of a region with spatial continuity. Several works have been devoted to solve this problem. Giraldo et al. (2011) proposed an ordinary kriging predictor for functional data whose parameters, as in the univariate case, are scalars. Other approaches are based on considering kriging predictors with functional parameters Nerini et al., 2010). All of these methods assume that the spatial functional process considered is stationary, that is, the mean function is constant (no trend), the variance function is constant, and the covariance function depends on the distance between the locations. However in many practical applications, particularly when we analyze environmental data, the assumption of constant mean function could be not realistic.
In classical geostatistics there are several alternatives to solve the problem of spatial prediction for nonstationary processes when the mean is not constant. Universal kriging (UK), kriging with external drift (KED), and residual kriging (RK) are used to perform spatial prediction when there is a trend in the mean. All of them are based on the estimation of regression models with spatially correlated errors. The semivariance and covariance must be estimated from the detrended data (Gotway and Hartford, 1996). UK is the name used when the coordinates are the predictors. If some auxiliary variables are used, rather than the coordinates, the term KED is preferred. In the case of UK or KED, the predictors are included in the kriging solution system and additional unbiasedness constraints are to be considered. In the RK approach, the drift and regression residuals are considered separately, and then ordinary kriging based on residuals and the trend are summed up (this method is also known as regression-kriging). Any of these methods have theoretical problems when ordinary least squares (OLS) is used to estimate the regression parameters, because the estimation of the semivariance based on residuals is biased. To solve this problem Cressie (1993) proposed using median-polish kriging. This procedure is similar to RK but in this case the trend is estimated by median-polish, and an ordinary kriging is applied on the median-polish residuals. Also restricted maximum likelihood method (REML) can be used to estimate both the trend model and the covariance parameters of the residual process from the data (Gotway and Hartford, 1996;Minasny and McBratney, 2007). Another possibility is to compute the semivariogram from studentized or recursive OLS residuals (Schabenberger and Gotway, 2005). These approaches allow to solve the problem of using OLS residuals to estimate the semivariance. However real data studies have revealed that the difference between UK, KED, and RK based on OLS residuals in comparison with other methods is not significant (Knotters et al., 1995;Minasny and McBratney, 2007). This is a consequence that the bias of a residual-based variogram estimator is small at lags near the origin, but more substantial at distant lags; but the kriging is carried out in local neighborhoods, and the fitted variogram is evaluated at smaller lags, precisely where it has been well fitted (Cressie, 1993). From a practical point of view, RK is easier than UK or KED, and has proved to be a robust technique in practical applications (Minasny and McBratney, 2007). This methodology has been widely applied in modeling environmental data (see, for example, Knotters et al., 1995;Alsamamra et al., 2009). In this article, we consider the extension of RK to the case where data are functions, that is, when we have a spatially correlated data set where the mean function is not constant through the region of interest. Initially a functional regression model is used to detrend the mean. OLS method is used to estimate the functional parameters (Ramsay and Silverman, 2005). Then we apply kriging methods for functional data  to the OLS regression residuals to predict a residual curve at a non data location. Finally, the prediction curve is obtained as the sum of the trend and the residual prediction. This approach does not consider bias correction because the use of maximum likelihood (ML) methods for estimating parameters of functional regression models is, to the best of our knowledge, an open problem in FDA, particularly when we have a functional regression model with functional response.
We analyze a real data set corresponding to salinity curves obtained at 21 monitoring stations of the Ciénaga Grande de Santa Marta estuary located in the Caribbean coast of Colombia. A cross-validation analysis was also carried out. The results show that the methodology proposed is a good alternative to perform spatial prediction of functional data when the mean function is not constant through the region of interest. A comparison between three alternatives for kriging-based prediction with OLS regression residuals is shown. The results indicate that the most simple method (based on ordinary kriging for functional data) is the best option in this case.
This article is organized as follows. Section 2 gives a brief overview about kriging predictors for functional data and describes the methodology proposed. In Sec. 3, the analysis of the salinity data is shown. The paper ends with a brief discussion and suggestions for further research.

Residual Kriging for Functional Data
In this section we first show the basics of three tentative methods to perform kriging prediction of functional data assuming stationarity. Then we show how these methods can be extended to the nonstationary case developing the residual kriging predictor for functional data.
Let χ s (t), t ∈ T , s ∈ D ⊂ R d be a spatial functional process ) defined on some compact set T of R, where s is a generic data location in the d-dimensional Euclidean space (d is usually equal to 2) and χ s (t) are functional random variables (Ferraty and Vieu, 2006), defined as random elements taking values in an infinite dimensional space (or functional space). We assume that these random elements live in a separable Hilbert space H of square integrable functions defined on T. The kriging predictors for functional data  assume that the functional random process is secondorder stationary and isotropic, that is, the mean and variance functions are constant and the covariance and semivariance depends only on the distance between sampling points. Formally, we assume that Three kriging predictors based on stationarity are recalled. The simplest is called ordinary kriging for functional data (Giraldo et al., 2011), and has the same expression of a classical kriging predictor but considering curves instead of data. This predictor is defined asχ The parameters λ i , i = 1, . . . , n are found as the solution of the linear system ⎛ This function can be estimated by a method-of-moments aŝ where Once we have estimated the trace-semivariogram for a sequence of L values h l , we fit a parametric model (any of the classical and widely used models such as spherical, Gaussian, exponential or Matérn could well be used) to the points (h l ,γ (h l )), l = 1, . . . , L, as if they were obtained in the classical geostatistical setting. A second alternative considers a kriging predictor with functional parameters where the influence of curves on the prediction is given in the same argument t ∈ T . This predictor is considered in Giraldo et al. (2010) and it is called continuous time-varying kriging for functional data. This predictor is defined by the expression The estimation of the functional parameters λ i (t), i = 1, . . . , n, is carried out by using an approach based on the use of basis functions. The curves χ s i (t) and the functional parameters are represented in terms of K basis functions. Thus a new parametrization is given in which we have K new variables (formed with the coefficients of fitting the basis functions to the data) and n × K parameters (given by the coefficients of fitting the basis functions to the functional parameters). In order to estimate the parameters, a linear model of coregionalization between the new variables must be estimated. A detailed overview to this approach is given in Giraldo et al. (2010). A third alternative for functional kriging allows the functional parameters to be defined in T × T . Then, the predictor of χ s 0 (t) is given bŷ such that λ 1 (t, v), . . . , λ n (t, v) : T × T → R. This kriging predictor has been separately proposed by Giraldo (2009)  An overview about the estimation of the parameters in (4) can be found in Giraldo (2009).
Here, we propose a new predictor which is useful when the mean function is not constant. Thus we now avoid the assumption of stationarity and assume that the mean function depends on the location, that is, when E(χ s (t)) = μ s (t). Our proposal follows three steps. Initially, a functional regression model (FRM) (Ramsay and Silverman, 2005) is used for detrending the mean. We then apply some kriging method for functional data to the regression residuals to perform prediction of a residual curve at a non-data location. Finally, the predicted curve is obtained as the sum of the trend and the residual prediction. Now we want to show a more detailed description of each step. Ramsay and Silverman (2005) have shown several alternatives to perform functional regression analysis, that is, to estimate a FRM where either the response or predictor variables are functions. The first step of the proposed methodology is to estimate a FRM with functional response and scalar covariates (Ramsay and Silverman, 2005, Ch. 13). Specifically, we use the model where Z s i (t), i = 1, · · · , n are the functions at visited locations, s i = (x i , y i ) are the geographical coordinates, α(t), β 1 (t), β 2 (t) are the functional parameters of interest, and (t) is a white noise for each t ∈ T . The parameters are estimated by least squares using an approach based on the use of basis functions (Ramsay and Silverman, 2005). The model in Eq. (5) has scalar covariates. However, other FRM could also be applied, for instance considering explanatory functional variables. Also, approaches based on nonparametric and semiparametric models for functional data could also be considered (Ferraty and Romain, 2011;. In the same sense there is the possibility of using regression trees for functional data (Nerini and Ghattas, 2007).
Once estimated the regression model, in a second step we obtain the residuals and based on them, we can perform spatial prediction of a residual curve at non-visited locations by using any of the predictors defined in Eqs.
(1),(3), and (4), respectively. Thus we predict a residual function by using one or several of these predictorŝ where e i (t), i = 1, . . . , n are the residual curves obtained from the estimated FRM.
Finally, in a third step the prediction at a nonvisited location is achieved bŷ whereχ s 0 (t) is the function predicted on the location s 0 ,Ẑ s 0 (t) =α(t)+β 1 (t)x 0 +β 2 (t)y 0 is the trend estimated on the location with coordinates s 0 = (x 0 , y 0 ), andê 0 (t) is the prediction of a residual function at a non-visited location s 0 . We callχ s 0 (t) in Eq. (10) residual kriging predictor for functional data. We have mentioned above that there are several alternative models for estimating the trend in data. However, in practice it could be difficult to have data on explanatory variables on the prediction site. For this reason the model in Eq. (5), although rather simple, could be preferible because the geographical coordinates are always available.

Spatial Prediction of Salinity Curves
We analyze salinity data measured at 21 monitoring stations of the lagoonal-estuarine system comprised by the Ciénaga Grande de Santa Marta (CGSM) and Complex of Pajarales (CP) (Fig. 1). The CGSM-CP is the largest coastal lagoon system in Colombia. Several studies have shown that salinity is one of the variables that better describe the changes of this ecosystem (Blanco et al., 2006). In semi-closed ecosystems, as is the case in CGSM and CP, salinity is highly variable due to river discharges, winds, and the movement of tides. Identification of spatial and temporal variability in salinity provides important ecological information and for this reason it is important to know its spatial distribution. Classical univariate and multivariate geostatistical methods have been used to this aim. Here, we provide a different modeling strategy to predict the spatio-temporal salinity behavior. Biweekly data from October 1988 to March 1991 were recorded (left panel, Fig. 2). Data were smoothed by using a B-splines basis with K = 15 functions (Fig. 2). The number of basis function was chosen by cross-validation (Ramsay and Silverman, 2005).  According to left panel in Fig. 2 the salinity levels in the monitoring stations of CP (west side in the right panel of Fig. 1) are higher than those levels in the CGSM. We want to outline that there are two stations at CGSM showing considerably lower salinity values than others. These correspond to stations RFU and RSE which are located at the mouths of Fundación and Sevilla rivers, respectively. This result indicates empirically that there is a decreasing trend from west to east. Thus the stationarity assumption with this data set could not be valid and the application of the predictors from Eqs. (1), (3), or (4) to predict salinity curves at non-visited locations from CGSM and CP could be inappropriate. In this case, the application of the residual kriging for functional data seems to be a better option. Taking into account this result we carried out a functional regression analysis with functional response (the smoothed salinity curves) and scalar covariates (the geographical coordinates). We assume the population model We estimate the functional parameters in model (11) by ordinary least squares, following Ramsay and Silverman (2005). We use the R library fda (Ramsay et al., 2010) to fulfill this task. Specifically we obtain the estimated model where Z i (t) are the smoothed salinity curves (Fig. 2, right panel),Ẑ i (t) are the regression estimations of the salinity curves,α(t),β 1 (t), andβ 2 (t) are the estimations of the functional parameters, and e i (t) are the residual curves. A plot with the estimated functional parameters is shown in Fig. 3. We observe that parameters are significantly different from zero (only β 1 is not significantly different from zero in some time periods). The estimation of the meanα(t) shows that the level of salinity increased in the period of study. According to the sign ofβ 1 we note that the salinity decreases from west to east, that is, when the longitude increases (right panel in Fig. 1) the salinity level decreases (in those time periods whereβ 1 has negative sign and is significantly different from zero). This is coherent with the result mentioned before, that the level of salinity is higher in CP stations than in CGSM stations (Fig. 2). The sign ofβ 2 indicates that the salinity level increases from south to north (Fig. 1). This trend is particularly due to the stations located in the south of the system (RFU and BRF, Fig. 1) that have a low level of salinity as they have a direct influence of the Fundación river.
Once estimated the functional regression model we used the residual regression curves e i (t), i = 1, · · · , 21 (left panel in Fig. 4) to perform spatial prediction of a residual curve at a nonvisited location (with coordinates (945000, 1692000)) by means of the predictors given in Eqs. (7), (8), and (9). As in the case of the original salinity curves, a functional cross-validation analysis (Ramsay and Silverman, (2005) indicated that a B-splines basis with K = 15 functions was appropriate to smooth the residual data (left panel, Fig. 4).
For comparison purposes we also obtained predictions applying directly the predictors from Eqs. (1), (3), and (4) to the salinity data. The predictions obtained by the six methods are shown in the right panel of Fig. 4. We note that there are some important differences between the predictions. Two questions arise naturally by observing this plot: (1) Which method is better? (2) Which methods based on residual kriging are better than those methods for stationary data? To give a proper answer and to verify the goodness-of-fit of the proposed predictors we used a functional cross-validation analysis. Each individual smoothed curve was temporally removed, and further predicted from the remaining ones. This procedure was followed for each one of the six predictors (three based on the stationarity assumption and three considering the residual kriging approach), and then we evaluated the crossvalidation predictions obtained by each method. Summary statistics of the sum of squared errors of functional cross-validation (SSE F ) resulting from applying the six methods to the salinity data set are shown in Table 1. The summary statistics of SSE F values (Table 1), and in particular the sum of SSE F values indicated that those methods based on residual kriging (ROKFD, RCTVFD, and RFKTM) had better performance than other stationary-based predictors. We also note that though the differences between methods based on residual kriging are small, ROKFD provided slightly better results when using the sum of SSE F values as goodness-of-fit indicator. In all cases LBA, located in the CGSM (Fig. 1), represented the station with the worst prediction. According to the maximum SSE F values, RCTVFD and ROKFD  methods had the best behavior in LBA station. The salinity level in this station was as high as that in those stations located in CP (separated from LBA more than 15 km). However, the salinity at stations closer to LBA (as RJA and BRS stations) showed a very different behavior across time. Taking into account that in stationary-based kriging prediction curves from locations closer to the prediction site will have greater influence than others more separated, it was then expected that RJA and BRS stations also should show bad predictions. But this was not the case, and the use of the residual kriging prediction, in particular of RCTVFD and ROKFD methods, allowed to improve the prediction in those cases. There was a significant difference between the maximum of SSE F values under the methods based on stationarity compared with those methods based on residual kriging. In the latter case, the effect of the estimated functional parameterβ 2 (t) in Eq. (12) (which was positive and indicated that there was a trend from south to north) allowed to improve the prediction by kriging. The differences of SSE F values between ROKFD, RCTVFD, and RFKTM were very small in a high proportion of sites. However, taking into account that from practical and computational points of view, ROKFD is simpler than RCTVFD and RFKTM we considered that in this case ROKFD was the best option. A plot of cross-validation residuals obtained by ROKFD is shown in Fig. 5. The residual standard deviation is greater at the end of the considered time period (Fig. 5) where the smoothed and the predicted curves had greater variation. The residual mean varied around zero indicating that the predictions obtained by ROKFD were unbiased (Fig. 5). Similar results were obtained with RCTVFD and RFKTM. regression estimation with functional kriging predictions based on the regression residual in order to perform prediction at non-visited locations. We applied the methodology proposed to a data set corresponding to salinity curves obtained in a estuary located in the Colombian Caribbean. Results indicated that although several alternatives to perform functional kriging prediction with the regression residuals could be used, the method based on ordinary kriging for functional data was the best alternative. In univariate geostatistics several works have shown that the use of residual kriging (and other alternatives such as UK and KED) with OLS estimation causes biased estimations for the covariance parameters. The solution proposed in the literature is based on the use of maximum likelihood estimation for estimating simultaneously both the regression parameters and the covariance parameters. In our approach based on residual kriging for functional data we propose to estimate the parameters from the functional regression model by OLS. The effect of using OLS in this case must be studied. In particular we need to establish in the case of OKFD, the effect of estimating the trace-variogram function by using Eq. (2) based on functional regression residuals, or in the case of CTKFD and FKTM, which is the effect of estimating a linear model of coregionalization based on the coefficients of fitting a basis function to functional regression residuals. We consider that an intensive simulation study must be carried out to solve these questions. The alternative of using maximum likelihood estimation of functional parameters when we have a functional regression model with functional response and spatially correlated errors is yet an open problem in FDA.
We considered a functional linear regression model (Ramsay and Silverman, 2005) for estimating the mean. However other alternatives such as functional nonparametric  and semiparametric (Ferraty and Romain, 2011) models could be considered to fulfill this task. In particular, the use of Kernel regression with a functional response is a very attractive possibility. Another option if we consider many explanatory variables which interact in complicated, nonlinear ways, could be using regression trees for functional data (Nerini and Ghattas, 2007).
Results of cross-validation analysis carried out with the salinity data set showed a good performance of the proposed predictor, and indicated from a descriptive point of view that our method can be adopted as a valid method for modeling spatially correlated functional data when the mean function is not constant through the region of interest.