High throughput FT-MIR indirect analysis of sugars and acids in watermelon

a Unidad Mixta de Investigación Mejora de la Calidad Agroalimentaria UJI-UPV. COMAV. Universitat Politècnica de València, Cno. De Vera s/n, 46022 València, Spain. E-mail: raumarre@upvnet.upv.es (R.M.), jaicecor@btc.upv.es (J.C.C.)
b Rijk Zwaan Iberica S.A. Ctra. Viator, Pje. El Mamí, S/N. 04120 La Cañada, Almería, Spain. E-mail: g.sanchez@rijkzwaan.es (G.S.)
c Unidad Mixta de Investigación Mejora de la Calidad Agroalimentaria UJI-UPV. Departament de Ciències Agràries i del Medi Natural, Universitat Jaume I, Avda. Sos Baynat s/n, 12071 Castelló de la Plana, Spain. E-mail: valcarce@uji.es (M.V.), rosello@uji.es (S.R.)
1 These authors contributed equally to the work


Introduction
The shape, size, color and gloss of a fruit or vegetable make the first impression on the consumer and thus drive the decision to purchase. However, flavor, taste and aroma have the largest impact on acceptability and on the desire to consume the product again (Barrett, Beaulieu, & Shewfelt, 2010). Although certain consumer segments are mainly driven by price, perceived quality appears to be more important than price in determining consumer choice. In fact, consumers are more willing to pay for quality when they are confident of the premium taste they are going to obtain. Several examples of this are available for model products, including i.a. orange juice (Lange, Issanchou, & Combris, 2000) and apple (Harker, Gunson, & Jaeger, 2003). As a result, market demand for premium quality characterizes modern consumer societies and, in order to reinstate flavor in fruits and vegetables, flavor improvement must become a main objective of breeding programs.
In watermelon, Citrullus lanatus (Thunb.) Matsum. & Nakai, a key factor determining quality and commercial value is sweetness (Kyriacou, Leskovar, Colla, & Rouphael, 2018). The compounds responsible for this trait are mainly the soluble sugars fructose, glucose, and sucrose, with a total sugar content that ranges from 24 to 91.0 mg g⁻¹ fresh weight (fw) in watermelon cultivars and is strongly influenced by the environment and by the use of different rootstocks (Yoo, Bang, Lee, Crosby, & Patil, 2012). Fructose is usually the main sugar found in ripe fruits, with a concentration ranging from 24 to 51 mg g⁻¹ fw, followed by sucrose with mean contents of 10 to 42 mg g⁻¹ fw, and finally glucose with contents ranging from 9 to 23 mg g⁻¹ fw (Fredes et al., 2017; Yoo et al., 2012). Nevertheless, sugar profiles are variable, and watermelon genotypes can be divided into those high in fructose and glucose and those high in sucrose (Yativ, Harary, & Wolf, 2010; Yoo et al., 2012).
Crosses between watermelon and wild relatives demonstrated a high genetic variability for sugar profile, and it has been suggested that sucrose accumulation is affected by phloem unloading and sugar metabolism, with a high negative correlation with insoluble acid invertase activity and a positive correlation with sucrose synthase activity (Yativ et al., 2010).
Sugar profiles also vary during ripening. The concentrations of fructose and glucose decrease during maturation, reaching their lowest levels at 50 and 45 days post-anthesis, respectively. Sucrose concentration follows the inverse trend, increasing to a maximum at 45 days post-anthesis (Soteriou, Kyriacou, Siomos, & Gerasopoulos, 2014).
Traditionally, sweetness has been measured indirectly as the total soluble solids (TSS) content, because a high percentage of the soluble solids are mono- and disaccharides (Maynard, 2001). Although TSS provides a rapid indirect measure of sugar accumulation, it does not reveal the relative content of each soluble sugar. Because each sugar contributes unequally to sweetness, this gross measurement hinders harnessing the genetic potential for sweetness improvement in watermelon. In fact, in other crops, such as tomato, sucrose equivalents, calculated as the weighted sum of sugar concentrations using the relative sweetening power of each sugar (1 for sucrose, 1.73 for fructose and 0.74 for glucose), have been more closely related to overall acceptability and sweetness perception than TSS (Baldwin et al., 1998).
Organic acids are also present in watermelon, but to a lesser extent. Among them, malic acid accumulates to considerably higher levels in this species, followed at a distance by citric acid (Çandir, Yetişir, Karaca, & Üstün, 2013; Fredes et al., 2017). Although acid accumulation in watermelon is usually overlooked because of the fruit's faint acidity, its variation may modulate the perception of sweetness (Soteriou et al., 2014). Apart from a genotypic effect, total acidity and the accumulation of citric and malic acids are affected by grafting in watermelon (Fredes et al., 2017; Soteriou et al., 2014).
A reliable quantification of individual sugar and acid accumulation would therefore be necessary in order to supply high-quality markets, whether by developing breeding programs that consider taste improvement or by assessing the effect of the environment or of different agricultural practices, such as grafting. Instrumental determinations, including high performance liquid chromatography (HPLC) coupled with different detectors and capillary electrophoresis, can provide such quantifications (Cebolla-Cornejo, Valcárcel, Herrero-Martínez, Roselló, & Nuez, 2012; Cunha, Fernandes, & Ferreira, 2002; Ma, Sun, Chen, Zhang, & Zhu, 2014; Xu, Liang, & Zhu, 2015). However, the time and cost requirements of these methods make them impractical for such programs, in which a high number of samples must be analyzed in a short period of time and at minimum cost.
In this context, indirect determinations using, for example, infrared spectroscopic methods are of clear interest. Flores et al. (2008) used near infrared (NIR) spectroscopy as a non-invasive method to measure TSS content in melons and watermelons, successfully classifying the fruits into low, medium and high sweetness levels. However, classification into categories does not reflect the concentration of individual compounds. Moreover, thick rinds, as in the case of watermelon, have been reported to interfere with the measurement of internal quality using non-destructive NIR methods (Arendse, Fawole, Magwaza, & Opara, 2018).
In other crops, Fourier-transform mid-infrared (FT-MIR) spectroscopy performed better than NIR spectroscopy for the indirect quantification of individual compounds such as fructose, glucose, sucrose, total sugar, and citric acid (de Oliveira, de Castilhos, Renard, & Bureau, 2014). This may be explained by the fact that MIR spectra provide information from the frequencies of fundamental molecular vibrations and are less sensitive than NIR spectra to factors influencing light diffusion (de Oliveira et al., 2014). FT-MIR has been successfully applied to the quantification of individual sugars in different crops, including i.a. apricot (Bureau et al., 2009), grapes (Barnaba, Bellincontro, & Mencarelli, 2014), peach (Bureau et al., 2013), tomatoes (Ścibisz et al., 2011; Wilkerson et al., 2013) and passion fruit (de Oliveira et al., 2014).
Little information regarding indirect quantification methods is available for watermelon. The objective of this work was therefore to assess the performance of partial least squares (PLS) regression models using FT-MIR spectra to evaluate the content of soluble sugars (fructose, glucose, and sucrose) and organic acids (malic and citric acids) in watermelon, in order to provide a rapid and accurate discrimination and indirect quantification of sugar and acid composition in large numbers of samples, such as those obtained in breeding programs or in production controls targeted at providing high-quality fruits.

Plant material and cultivation
Three sample sets were used in the study. The first two sets included samples of breeding lines grown and provided by Rijk Zwaan Ibérica S.A. In this case, cultivation was performed in a greenhouse in Paraje el Mamí, Almería, Spain (36.851988N; -2.421842W) in the spring cycle, with a spacing of 1.4 m × 1 m and following commercial practices in the area. Ripe fruits were harvested at 20-45 days post pollination. The assays were performed during two consecutive years. In the first year (2017), 59 different breeding lines were grown, representing the variability available in watermelon for the development of new varieties. These lines represented different fruit sizes: 33 belonged to the 1-3 kg group, 18 to the 4-8 kg group and 8 had a fruit weight higher than 8 kg. Different skin types were represented, including the "Tiger Stripe", "Sugar Baby", "Charleston Gray" and "Crimson Sweet" types. During the second year (2018), 64 breeding lines were grown, including those of 2017 (except one of them) and five new lines. The same skin types were included and the distribution by fruit weight was the following: 37 in the 1-3 kg group, 19 in the 4-8 kg group and 8 in the >8 kg group.
A third sample set was obtained in 2018. In this case, 60 fruits belonging to the 4-8 kg group were obtained from local markets in order to represent commercial materials of different varieties and areas of production.

Sampling
From each breeding line, one representative fruit was sampled. Ripe watermelons were collected and a 5 cm cross-section was obtained from the equatorial plane of each fruit. The edible part was obtained by discarding the pericarp and approximately 2 mm of flesh. After removing the seeds (if present), the samples were blended in a crusher until completely homogeneous and then stored at -80 °C until analysis.
Before the analyses, sample supernatants were obtained by centrifuging the defrosted samples at 13,000 rpm for 5 min at 4 °C to remove any pulp, using a 5415R microcentrifuge with a fixed-angle F45-24-11 rotor (Eppendorf, Hamburg, Germany). The resulting supernatants were divided into three equivalent aliquots. The first was used for the determination of total soluble solids content (TSS) using a PAL-1 electronic refractometer (ATAGO, Tokyo, Japan) with 0.1 °Brix precision. The remaining aliquots were used for FT-MIR analysis and for the analysis of sugars and acids by capillary electrophoresis (CE).

Analysis of FT-MIR spectra
A portable Cary 630 FTIR spectrometer (Agilent Technologies, Waldbronn, Germany), equipped with a temperature-stabilized DTGS detector and a 5-bounce ATR crystal with ZnSe beam splitter, was used to record the mid-infrared spectral range of 4000-700 cm⁻¹. The effective path length at 1700 cm⁻¹ with this ATR is 13.0 µm. Data were acquired at a spectral resolution of 4 cm⁻¹ using Microlab FTIR Software B.05.3 (Agilent Technologies, Waldbronn, Germany). To improve the signal-to-noise ratio, a total of 64 scans were averaged for each measurement.
The spectral acquisition was performed in less than 2 min by placing 150 µL of sample directly on the crystal (ATR). Spectra were independently measured twice and the crystal was cleaned between samples with distilled water and cellulose tissue.
To build the models, only data from the fingerprint spectral region (1500-900 cm⁻¹) were included.
This spectral region is associated with C-O and C-C stretching modes and O-C-H, C-C-H, and C-O-H bending vibrational modes (Irudayaraj & Tewari, 2003; Stewart, 2004). Moreover, selecting this spectral region avoids including the O-H stretching modes of water (Wilkerson et al., 2013).
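As an illustration of the region selection described above, the following minimal Python sketch trims a spectrum to the 1500-900 cm⁻¹ window. It is not the software used in this work; the function name `select_region` and the toy spectrum (one point every 100 cm⁻¹) are hypothetical and serve only to show the operation.

```python
def select_region(wavenumbers, absorbances, low=900.0, high=1500.0):
    """Keep only the points whose wavenumber (cm^-1) lies in [low, high]."""
    if len(wavenumbers) != len(absorbances):
        raise ValueError("wavenumbers and absorbances must have the same length")
    kept = [(w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high]
    return [w for w, _ in kept], [a for _, a in kept]


# Hypothetical toy spectrum: 4000 down to 700 cm^-1 in steps of 100 cm^-1.
wn = [4000.0 - 100.0 * i for i in range(34)]
ab = [0.01 * i for i in range(34)]

# Only the fingerprint region is passed on to modelling.
wn_fp, ab_fp = select_region(wn, ab)
```

A real spectrum acquired at 4 cm⁻¹ resolution would contain many more points, but the filtering step would be the same.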

Analysis of sugar and acid content
Prior to the capillary electrophoresis analysis, the supernatants were diluted 1:20 with ultrapure water and filtered using a 0.22 µm nylon centrifuge tube filter (Costar Spin-X, Corning, NY, USA). The quantification of organic acids and sugars was performed using a 7100 CE system equipped with a diode array detector and a thermostated sample compartment (Agilent Technologies, Waldbronn, Germany). The procedure described by Cebolla-Cornejo et al. (2012) was followed, using uncoated fused silica capillaries of 67 cm total length, 60 cm effective length, 375 µm od and 50 µm id (Polymicro Technologies, Phoenix, AZ, USA). Prior to their first use, capillaries were flushed with 1 M NaOH for 300 s at 50 °C, 0.1 M NaOH for 300 s, and water for 600 s. Each working session started with a rinse of the capillary with 58 mM SDS for 120 s and running buffer for 300 s. The running buffer was prepared using a 20 mM PCA solution with 0.1% (w/v) HDM and adjusted to pH 12.1. A hydrodynamic injection at 3400 Pa for 10 s was used. The voltage applied for separation was -25 kV at 20 °C, with indirect detection at 214 nm.

Data analysis
PLS regression models were obtained for each sample set, as well as a general model with the whole set of samples (183). For each model, 75% of the samples were used as a calibration group (calibration and cross-validation). The remaining 25% of the samples were included in a validation group to obtain predictions using the model. Samples were randomly assigned to the calibration and validation groups of the specific models.
In order to enable a comparison between the specific models and the general model, the calibration and validation groups of the general model included the same samples as the corresponding groups of the specific models.
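The splitting scheme just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample identifiers, the random seed, the set names and the rounding of the 75% fraction are hypothetical, since the text does not specify them. The sketch shows how the general model can reuse the per-set assignments so that both kinds of model are compared on the same samples.

```python
import random


def split_calibration_validation(sample_ids, calibration_fraction=0.75, seed=0):
    """Randomly assign samples to calibration and validation groups."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    rng.shuffle(ids)
    n_cal = round(calibration_fraction * len(ids))
    return sorted(ids[:n_cal]), sorted(ids[n_cal:])


# Sizes of the three sample sets reported in the text (59 + 64 + 60 = 183).
set_sizes = {"lines_2017": 59, "lines_2018": 64, "commercial_2018": 60}

# The general model pools the groups already defined for each specific model.
general_cal, general_val = [], []
for name, size in set_sizes.items():
    ids = [f"{name}_{i}" for i in range(size)]  # hypothetical sample labels
    cal, val = split_calibration_validation(ids)
    general_cal += cal
    general_val += val
```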
PLS calibration models were obtained correlating the FT-MIR spectral absorbance data (X matrix) with the concentration of each measured variable (Y vector): TSS, malic acid, citric acid, fructose, glucose, sucrose, total sugars, sucrose equivalents, or citric acid equivalents. Sucrose equivalents were calculated as the weighted sum of sugar concentrations using the relative sweetening power of each sugar: 1 for sucrose, 1.73 for fructose and 0.74 for glucose (Baldwin et al., 1998). Citric acid equivalents were calculated as the weighted sum of citric and malic acid concentrations considering their relative sourness: 1 for citric acid and 1.14 for malic acid (Stevens, Kader, & Algazi, 1977). All partial least squares (PLS) models were built using Matlab v 9.4 (Mathworks Inc, Natick, MA, USA) and the PLS Toolbox 8.2.1 for Matlab (Eigenvector Research Inc, Wenatchee, WA, USA).
Prior to modelling, the X matrix was transformed using the multiplicative scatter correction (MSC) function, while the Y matrices were autoscaled (using the mean and standard deviation). These pretreatments were selected because they improved the performance of the models compared to other alternatives and had been used in previous studies dealing with MIR spectra (Wilkerson et al., 2013). The spectral data were then analyzed by PLS using Venetian blinds as the cross-validation method. The performance of the resulting calibration models was evaluated in terms of outlier diagnostics, the number of latent variables (LV), the coefficient of determination of calibration (R²C), the root mean square error of calibration (RMSEC), and the root mean square error of cross-validation (RMSECV). The most suitable calibration model was chosen by minimizing the RMSECV and the number of LV and maximizing the R² values (Wilkerson et al., 2013).
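The two derived variables can be written out as a short Python sketch. The weights are those given above (Baldwin et al., 1998; Stevens et al., 1977); the function names and the example concentrations are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Relative sweetening power and relative sourness, as given in the text.
SWEETNESS = {"sucrose": 1.0, "fructose": 1.73, "glucose": 0.74}
SOURNESS = {"citric": 1.0, "malic": 1.14}


def weighted_sum(concentrations, weights):
    """Sum of concentrations (mg/g fw), each multiplied by its weight."""
    return sum(weights[name] * concentrations[name] for name in weights)


def sucrose_equivalents(fructose, glucose, sucrose):
    sugars = {"fructose": fructose, "glucose": glucose, "sucrose": sucrose}
    return weighted_sum(sugars, SWEETNESS)


def citric_acid_equivalents(citric, malic):
    return weighted_sum({"citric": citric, "malic": malic}, SOURNESS)


# Hypothetical sample: 40, 20 and 30 mg/g of fructose, glucose and sucrose.
se = sucrose_equivalents(fructose=40.0, glucose=20.0, sucrose=30.0)
# 1.73 * 40 + 0.74 * 20 + 1 * 30 = 69.2 + 14.8 + 30 = 114.0

# Hypothetical sample: 1.0 and 2.0 mg/g of citric and malic acid.
cae = citric_acid_equivalents(citric=1.0, malic=2.0)
# 1 * 1.0 + 1.14 * 2.0 = 3.28
```

In this example fructose contributes more than half of the sucrose equivalents although it represents less than half of the total sugars, which is why the sugar profile matters beyond total content.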
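The pretreatments and the cross-validation layout can be sketched in plain Python as follows. This is a simplified illustration, not the PLS Toolbox implementation: it assumes that MSC uses the mean spectrum as reference, that autoscaling uses the sample standard deviation, and that Venetian blinds assigns every k-th sample to the same split. The number of splits and the toy data are hypothetical.

```python
from statistics import mean, stdev


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum is regressed on the reference (x = a + b * ref) and
    corrected as (x - a) / b.
    """
    n_points = len(spectra[0])
    reference = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        sxy = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s))
        slope = sxy / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def autoscale(values):
    """Center to zero mean and scale to unit (sample) standard deviation."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def venetian_blinds(n_samples, n_splits):
    """Indices of the test samples of each split: every n_splits-th sample."""
    return [list(range(k, n_samples, n_splits)) for k in range(n_splits)]


# Toy spectra that differ only by an offset and a multiplicative factor.
base = [0.1, 0.4, 0.9, 0.3]
spectra = [base, [2.0 * v + 0.5 for v in base], [0.5 * v - 0.1 for v in base]]
corrected = msc(spectra)           # the three corrected spectra coincide
scaled = autoscale([1.0, 2.0, 3.0])
splits = venetian_blinds(7, 3)     # [[0, 3, 6], [1, 4], [2, 5]]
```

The toy spectra show what MSC removes: additive and multiplicative differences between spectra that carry no compositional information.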
Outlier identification was performed using a graphical evaluation of Q residuals and leverage. Any outlier point that showed a large Q residual or unusual distribution was removed and the model was recalculated.
Normalized residuals and leverage parameters were also considered for outlier identification (values <-3 or >3) and elimination in response variables.
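The ±3 criterion for the response variables can be illustrated with a short Python sketch. It is a simplification under stated assumptions: residuals are normalized here by their own mean and sample standard deviation, without the leverage term that chemometric software may include, and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(measured, predicted, limit=3.0):
    """Return indices whose normalized residual falls outside +/- limit."""
    residuals = [p - m for m, p in zip(measured, predicted)]
    center, spread = mean(residuals), stdev(residuals)
    return [
        i for i, r in enumerate(residuals)
        if abs((r - center) / spread) > limit
    ]


# Hypothetical data: 20 samples predicted within 0.1 units, except sample 7.
measured = [10.0] * 20
predicted = [10.1 if i % 2 == 0 else 9.9 for i in range(20)]
predicted[7] = 15.0

outliers = flag_outliers(measured, predicted)  # only sample 7 is flagged
```

After removing the flagged samples the model would be recalculated, as described above.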
Once the final model was constructed, prediction matrices were used to evaluate the expected error when the model was applied to predict new samples (validation group). For that purpose, the root mean square error of prediction (RMSEP) and the coefficient of determination of prediction (R²P) were calculated. %RMSEC and %RMSEP were also calculated as a percentage of the mean value of each group in order to contextualize the results.
%RMSEP (maximum) was calculated using the maximum value to provide a reference for selection programs considering high content samples.
To evaluate the predictive capacity of the model, the residual prediction deviation (RPD) was calculated as the ratio between the standard deviation of the prediction group and the RMSEP. This ratio is an adaptation of the original RPD definition, which uses SEP instead of RMSEP; using RMSEP yields lower RPD values and thus represents a more conservative approach. Models are usually considered useful when RPD values are higher than 2 (Fearn, 2002).
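The validation statistics defined above can be sketched in Python as follows. The function names and the four-sample example are hypothetical, and the use of the sample (n - 1) standard deviation in the RPD is an assumption, since the text does not state which estimator was used.

```python
from math import sqrt
from statistics import mean, stdev


def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    return sqrt(mean((p - m) ** 2 for m, p in zip(measured, predicted)))


def percent_rmsep(measured, predicted):
    """RMSEP as a percentage of the mean measured value."""
    return 100.0 * rmsep(measured, predicted) / mean(measured)


def percent_rmsep_max(measured, predicted):
    """RMSEP as a percentage of the maximum measured value."""
    return 100.0 * rmsep(measured, predicted) / max(measured)


def rpd(measured, predicted):
    """Standard deviation of the validation group divided by RMSEP."""
    return stdev(measured) / rmsep(measured, predicted)


# Hypothetical validation group (mg/g fw): every prediction is off by 2 units.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 42.0, 48.0]

error = rmsep(measured, predicted)                     # 2.0
error_pct = percent_rmsep(measured, predicted)         # 100 * 2 / 35
error_pct_max = percent_rmsep_max(measured, predicted) # 100 * 2 / 50 = 4.0
ratio = rpd(measured, predicted)                       # about 6.45
```

Because %RMSEP (maximum) divides by the largest value, it is always equal to or smaller than %RMSEP, which is why it serves as a reference for high-content samples.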

Variation present in the samples used for modelling
The TSS values obtained in the samples of the breeding lines grown in 2017 ranged between 5°Brix and 12.7°Brix and those of breeding lines of 2018 between 9.5°Brix and 14.1°Brix (Table 1). The commercial materials from 2018 ranged from 4.6°Brix to 11.7°Brix. Commercial high-quality watermelons are supposed to be included in the range of 10-14°Brix (Wehner, 2008) though the lower limit can be as low as 4-5% in specific materials (Yoo et al., 2012). Therefore, the usual commercial range of variation was covered with the different sample groups used in the study.
Similar mean fructose and sucrose contents were obtained in both years in the breeding lines (Table 1). The accumulation of glucose was much lower than that of the other sugars. Although higher contents of fructose and glucose were found in the breeding lines during the 2018 season, the difference was especially important in the case of glucose (Table 1). Samples from the second year tended to have higher ratios of glucose to fructose and glucose to sucrose. Another important difference between the two years was that samples from 2018 tended to have higher citric acid contents (Table 1). The range of variation for fructose, glucose and sucrose contents of the commercial materials was narrower and, in general, fell within the range of variation of the breeding lines (Table 1), although a lower minimum fructose content was registered in the commercial materials. In the case of the acids, commercial materials reached a higher accumulation of malic acid and a lower accumulation of citric acid.
Considering all the samples, that is, the general model, the range of fructose accumulation (11.13-55.44 mg g⁻¹) exceeded the limits of previously reported variation, between 24.0 and 51.0 mg g⁻¹ (Fredes et al., 2017; Yoo et al., 2012). An even wider difference between the range of variation in the present study and previous reports was obtained for glucose (5.48-37.83 mg g⁻¹ vs. 9.0-23.0 mg g⁻¹) and sucrose (0.00-69.65 mg g⁻¹ vs. 10.0-42.0 mg g⁻¹). In the case of the acids, the range of variation represented in the general model for citric acid (0.08-1.71 mg g⁻¹) was wider than that observed in previous literature (Çandir et al., 2013; Fredes et al., 2017), while for malic acid the minimum and maximum contents (0.77-3.71 mg g⁻¹) were lower than previously reported (1.63-4.57 mg g⁻¹).
As regards the level of variation, for the three sample sets and the general model, the %RSD for fructose was, in general, considerably lower than that for glucose and sucrose, especially in the breeding lines. The highest levels of variation were always detected for sucrose. A similar imbalance in the %RSD applied to the organic acids, with higher levels of variation detected for citric acid, the one with lower mean contents.
The statistical parameters of the samples of the validation group had similar values to those of the calibration group of each sample set.
A correlation analysis was performed among the variables (Supplementary Table 1). As expected, total soluble solids were highly correlated with total sugar accumulation. Among the sugars a higher correlation was observed between TSS and sucrose. Unexpectedly, a high correlation (0.71) was found between citric acid contents and TSS. Consequently, moderate correlations were found between this acid and total sugars. Malic acid showed low or non-significant correlations with the rest of variables. Glucose and fructose showed a positive moderate correlation (0.51), and both of them a negative correlation with sucrose accumulation.
This is an expected result. As reviewed by Soteriou et al. (2014), during the initial development of watermelons α-galactosidase and acid invertases keep increasing contents of reducing sugars and restrict sucrose accumulation, while later sucrose accumulates at the expense of fructose and glucose due to an increased activity of sucrose phosphate synthase and sucrose synthase and a reduced activity of soluble acid invertase.

PLS regression models.
The model obtained for the 2017 breeding lines was excellent ( values were higher than 2 for TSS and sugars and slightly lower than 2 for citric and malic acids. The model for commercial materials offered a similar performance, with better results for TSS and sugars than for acids (Table 2). In this case, although the R²P for fructose was low (0.44), the %RMSEP was similar to that of the previous models. In the case of sugars, %RMSEP ranged from 8.8% for glucose to 11.7% for fructose. RPD values were higher than 2 for all the compounds except fructose (1.4) and malic acid (1.5).
For the three specific models, PLS models were also obtained for total sugars, sucrose equivalents and citric acid equivalents. In the case of total sugars and sucrose equivalents, the %RMSEP values obtained were similar to that of the best-predicted individual compound, while in the case of citric acid equivalents the %RMSEP values were better than for the individual compounds. These results are interesting because, using these variables, selection can be performed with an error level lower than the one that would be obtained by combining the errors of the individual compounds that enter their calculation.
A general model was calculated to compare the performance of a model with a higher number of data and wider levels of variation with that of the specific models. R²P values obtained with the general model were similar to those obtained with the best specific model (Table 2). The %RMSEP values were on average 30% higher than those of the best model. Specifically, the predictions with the general model were worse for fructose and citric acid compared to the best model. On the other hand, the performance was better than that of the worst specific model (%RMSEP on average 12% lower), especially for glucose, sucrose and citric acid. For the general model, RPD values were higher than 2.5 for TSS and sugars and lower for citric and malic acid. The RPD values for total sugars and sucrose equivalents were also higher than 2. It can therefore be said that a general model would be preferred to specific models, as it represents a compromise between the best and the worst specific models and would have a wider applicability. The performance of this general model was especially good for sugars, the main target of breeding programs and of watermelon commercialization, as %RMSEP values for fructose, glucose and sucrose were 11.3%, 11.1% and 11.7%, respectively.
In order to try to improve the resulting models, the loadings of the PLS model were reviewed (Supplementary Fig. 1), and new models were developed using the wavenumbers with the highest absolute loadings. With a similar purpose, reverse interval PLS models were also developed. In both cases, the performance of the models obtained was lower and no improvement was achieved with these model variations. Models considering a wider spectral range were also evaluated. Wilkerson et al. (2013), in their work with tomato, also considered including the wavenumbers between 1800 cm⁻¹ and 1500 cm⁻¹ for certain models, but the authors already warned about the high absorption of water in this region. The addition of these wavenumbers did not improve the models. Additionally, the possibility of removing the MSC pretreatment of the spectra was considered, but this pretreatment considerably increased the performance of the models for organic acids. On the other hand, noise from the CE analytical procedure was ruled out as a factor affecting the performance of the models, as the mean %RSD of the determinations of citric acid, malic acid, fructose, glucose and sucrose was 2.4%, 1.6%, 1.2%, 1.3% and 1.0%, respectively.
FT-MIR has been applied to the analysis of more than 40 genera of plant species, but most works have dealt with authenticating products and identifying adulteration (Bureau, Cozzolino, & Clark, 2019). Nevertheless, FT-MIR has been applied satisfactorily to predict individual sugar and acid contents in different species. For example, Bureau et al. (2009) developed prediction models for apricot using eight varieties, obtaining R² values ranging from 0.74 for fructose to 0.97 for malic acid, with %RMSEP values of about 12% for glucose, malic acid and citric acid, 16% for sucrose, and 18% for fructose. Barnaba et al. (2014), working with a single grape variety, "Sangiovese", obtained R²P values of 0.93 and 0.92 for glucose and fructose, respectively, with %RMSEP values of 4.5% and 5.5%, which represented RPD values around 2.5. In this case, the prediction for malic acid was notably worse, with an RPD value of 1.15. In peach, Bureau et al. (2013) Taken together, the performance of the predictions made in the present work in watermelon can be considered excellent in view of previous results in other crops. In the case of the acids, the performance decreases, especially for citric acid. However, it should be considered that MIR spectroscopy is generally regarded as insensitive to compounds present at concentrations lower than 1 mg g⁻¹ (Bureau et al., 2019), as is the case for this acid in several of the analyzed samples. In fact, other works in tomato using FT-MIR spectra revealed a lower performance in the case of acids, which has been related to their lower concentrations (Ścibisz et al., 2011; Ibañez et al., 2019).

Applicability of PLS models: selection pressure and external assays.
One of the main applications of the indirect prediction of individual compounds in watermelon is related to the development of breeding programs, in which a high number of samples must be processed in the shortest possible time and at minimum cost. In this case, it would be interesting to know whether the models are reliable when applied to select the samples with the highest contents, given a certain selection pressure.
For this purpose, the general model was applied to identify the 20% or 30% of the samples of the validation group with the highest values (Table 3). The values of sensitivity (true positive rate) for a selection pressure of 30% were higher than 90% for individual sugars, with specificities (true negative rate) higher than 95% (Table 3). Sensitivities for acids were higher than 70% and for derived variables higher than 75%. More interestingly, the observed mean percentile of the selected samples was close to the expected mean percentile considering 100% sensitivity. This means that the samples that were wrongly selected had high contents, close to the selection threshold. With a selection pressure of 20% the sensitivities were somewhat lower, probably because of the small number of samples considered at this pressure. It is important to note that the sensitivity for sucrose was 100% and the mean percentiles for fructose and glucose were 11.4% and 15.6%, again close to the expected value for 100% sensitivity (11.1%).
In order to check the robustness of FT-MIR predictions, two new general models were calculated, each using all the samples of one of the breeding-line sample sets together with the set of commercial materials.
These new models were then applied to predict the contents of the remaining breeding-line sample set.
This is a demanding test, as it involves the prediction of a high number of samples from an external assay, and it is rarely found in the literature. The test evaluates whether internal calibrations are required to develop specific models for each assay, increasing the cost of analysis, or whether reliable general models can be applied without assay-specific calibrations. The models developed offered a lower performance than the previous general model (Table 4). This result is expected, as a principal component analysis of response variables showed that the range of variation of the three sample sets was rather different, both from the point of view of FT-MIR spectra (Fig. 1) and of sugar and acid contents (Fig. 2), especially in the case of the breeding lines grown in 2018.
The need to use a high number of samples representing the widest possible variability to reinforce the robustness of FT-MIR models therefore seems evident. Despite this limitation, the %RMSEP obtained for sugars in the prediction of the breeding lines grown in 2018 ranged from 12.6% for sucrose to 16% for glucose. In the case of the predictions of sugars for 2017, %RMSEP values ranged from 10.1% for sucrose to 28.3% for glucose. These figures support the reliability of the methodology, considering its indirect nature and its application to assays not included in the calibration.
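The evaluation of a selection pressure can be sketched in Python as follows. The sketch assumes that the truly best samples are the top fraction ranked by the measured (CE) values and that the selected samples are the top fraction ranked by the predicted values; the function names and the ten-sample example are hypothetical.

```python
def top_fraction(values, fraction):
    """Indices of the samples with the highest values."""
    n_selected = max(1, round(fraction * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:n_selected])


def selection_performance(measured, predicted, fraction):
    """Sensitivity and specificity of selecting by predicted values."""
    truly_top = top_fraction(measured, fraction)
    selected = top_fraction(predicted, fraction)
    true_pos = len(truly_top & selected)
    false_neg = len(truly_top - selected)
    false_pos = len(selected - truly_top)
    true_neg = len(measured) - true_pos - false_neg - false_pos
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical validation group of 10 samples and a 30% selection pressure.
measured = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
predicted = [12.0, 18.0, 33.0, 41.0, 48.0, 62.0, 81.0, 75.0, 92.0, 97.0]

sensitivity, specificity = selection_performance(measured, predicted, 0.30)
# Samples with measured 90 and 100 are selected correctly; the sample with
# measured 70 is selected instead of the one with measured 80.
# sensitivity = 2/3, specificity = 6/7
```

In this toy example the wrongly selected sample is the one immediately below the selection threshold, which mirrors the behavior described above for the mean percentile of the selected samples.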

Conclusions
FT-MIR-based PLS regression models in watermelon can offer precise indirect predictions of gross variables such as TSS and of the individual accumulation of specific sugars (fructose, glucose, and sucrose). The error in the prediction of acids is considerably higher, especially for citric acid, which is found at low concentrations in this crop. The performance of the models depends on the specific material included in the calibration. Therefore, the use of general models is recommended, as they offer a compromise between the performance of the best and worst models and can be applied to a wider range of variation. These general models can be satisfactorily applied to selection programs, with high sensitivity levels. Even the samples wrongly selected would have high contents, close to the selection threshold. General models are robust and can be applied to the prediction of external assays not included in the calibration. Nevertheless, in such cases general models should be developed with samples covering a wide range of variation.