The distribution of index futures realised volatility under seasonality and microstructure noise

Previous research documents that the distribution of realised volatility appears approximately log-normal. However, formal tests reject normality fairly convincingly, which may indicate intrinsic features in the intraday data series, namely, the presence of seasonal intraday patterns and microstructure noise. Because many models are based on a normality assumption, this must be verified in order to validate the results. We find departures from normality due to the seasonal and noise components of intraday data, such that, after controlling for both features, the volatility estimates follow a log-normal distribution. Our results reveal that failing to account for these market imperfections can have important implications for analyses of volatility transmission and for investment and hedging decisions.


Introduction
The distributional characteristics of asset returns are pivotal in financial economics for asset pricing, portfolio selection, performance evaluation, and managerial decision-making, among others. Because the second moment structure is probably the most critical feature of the conditional return distribution, it has triggered a growing body of research (see Bollerslev and Ole Mikkelsen, 1999; Bekaert and Wu, 2000; Campbell et al., 2001; and Andersen et al., 2003, among others). The pioneering study of Clark (1973) states that the log-normal distribution is appropriate for daily volatility. Indeed, adjusting the volatility distribution towards a log-normal distribution becomes crucial, because normality is assumed in many models, such as option pricing models (e.g. Scott, 1987; Hull and White, 1987; Heston, 1993). Furthermore, when a statistical model includes a normality assumption, this must be verified in order to validate the results. Thus, we investigate the effects of departures from normality in the volatility distribution.
Traditionally, (G)ARCH models have been popular parametric approaches for modelling financial asset return volatilities, correlations and distributions. However, the increasing availability of high-frequency data (HFD) has produced an explosive growth in the financial econometrics of volatility dynamics, allowing for the construction of more accurate daily volatility measures: realised volatility (RV).¹ Additionally, as volatility becomes observable, it can be modelled directly, rather than being treated as a latent variable. Therefore, we can model and forecast it using standard time-series techniques (Andersen et al., 2001, 2003). In general, under the assumption that RV follows a log-normal distribution, standard linear Gaussian approaches are used. However, if the hypothesis of normality is not supported, inferences derived from these models may be biased. Previous research documents the RV distribution. Andersen et al. (2001c) study the RV of exchange rates, Andersen et al. (2001b) examine the RV of individual stocks in the Dow Jones, and Areal and Taylor (2002) analyse the RV of index futures contracts. Although these studies find that the RV distribution appears approximately Gaussian, formal tests reject normality fairly convincingly, which may indicate intrinsic features in the data series. Our main objective is to analyse the extent to which the presence of such biases may affect normality. These biases are widely studied in the literature. They include the presence of seasonal intraday patterns (behavioural characteristics of financial markets) and the presence of a noisy component generated by intrinsic operations (trading characteristics arising from issues such as price discreteness, bid-ask spreads, or non-synchronous trades/quotes).² Additionally, we examine whether the results are influenced by the frequency of the data, and analyse the economic effects of such corrections in areas that rely on volatility estimates, such as investment/hedging decisions and spillover analyses.
We depart from previous studies and make several novel contributions. First, we extend the data: the empirical analysis is performed using CAC and DAX index futures contracts over a long time horizon of more than 12 years. The frequency of observations is critical, because it may affect the accuracy of estimates. Thus, we use different frequencies (1, 5, 10, 20, and 30 minutes). As noted below, the early literature focuses on one frequency only.
Second, we remove those features that introduce bias into the variance estimates, namely, intraday seasonality and market microstructure noise. Andersen and Bollerslev (1997a) provide a stylised intraday return model based on the Fourier flexible methodology to uncover intraday seasonality.³ This enables us to obtain deseasonalised intraday returns and, therefore, the deseasonalised RV. Later, Barndorff-Nielsen and Shephard (2002) derived a model to measure the RV without the microstructure noise caused by the intrinsic operating and trading limitations.⁴ In practice, RV is often modelled by ignoring the effects of these features on the results. In this study, we combine these two models to produce a more accurate estimate of actual volatility. First, to account for the presence of intraday patterns in the volatility of financial market returns (see, among others, Wood et al., 1985; Harris, 1986; and Tse, 1999), we estimate the deseasonalised⁵ RV ($RV_t^{s}$) using the model of Andersen and Bollerslev (1997a). Prior studies that address intraday seasonal patterns using different methods include those of Andersen et al. (2001b) and Areal and Taylor (2002). The former remove the serial correlation in high-frequency returns using a de-meaned MA(1) filtered return series; the latter follow Taylor and Xu (1997) to determine the optimal weights of intraday returns, from which they calculate the RV. Then, to address the non-negligible component associated with microstructure frictions, we apply the realised kernel estimator (Barndorff-Nielsen and Shephard, 2002; Barndorff-Nielsen et al., 2009) and estimate the RV free of market microstructure noise ($RV_t^{k}$). In contrast, previous studies have relied on certain frequencies to overcome noise frictions. For example, Andersen et al. (2001b) and Areal and Taylor (2002) employ five-minute returns to mitigate market microstructure noise problems, whereas Andersen et al. (2003) use observations sampled at a frequency of 30 minutes. Finally, we consider both effects, calculating the deseasonalised RV free of microstructure noise ($RV_t^{sk}$) as the closest approximation to the actual volatility.⁶
1 RV, constructed as a sum of squared intraday returns, has theoretical advantages in the construction of interdaily volatility forecast evaluation criteria (Andersen and Bollerslev, 1998a; Andersen et al., 2001b).
2 The theory of quadratic variation suggests that under certain conditions (basically, frequencies that tend to infinity and markets free of frictions; Merton, 1980; Andersen and Bollerslev, 1998a; Andersen et al., 2001c), RV is an unbiased and efficient estimation of return volatility; however, these circumstances are rarely met in practice.
3 Intraday volatility patterns induce serial correlations and result in difficult statistical inferences (e.g. standard variance-ratio statistics or comparable tests). Andersen and Bollerslev (1997a) propose the Fourier flexible form (FFF) regression (Gallant, 1981, 1982) to accommodate this seasonal intraday component, allowing for robust inferences and reliable hypothesis testing procedures in empirically realistic settings (Andersen et al., 2001a).
4 The model of Barndorff-Nielsen and Shephard (2002) neglects intraday periodicities. Their paper states that the model has a repeating intraday component (i.e. diurnal), and that this diurnal effect may not be completely ignorable when the relation between the deterministic periodic component and the noise component is not additive. However, they neglect this deficiency.
5 Deseasonalised volatility refers to the RV in which the intraday seasonal pattern of volatility has been removed.
6 Note that s, k, and sk denote data that are deseasonalised, free of market microstructure noise, and free of both effects, respectively.
Third, we study the implications of removing the noise and seasonality for the unconditional distribution of the RV. Given the importance of obtaining log-normal distributions (Clark, 1973; Areal and Taylor, 2002; Andersen et al., 2001b), and because many models are based on normality assumptions, we analyse how volatility estimates evolve to meet a log-normal distribution when considering both features jointly and explicitly. Our results reveal that remarkable reductions in skewness and kurtosis occur for both indices and at all frequencies after removing both components. As such, the adjustment to a normal distribution becomes almost perfect, to the extent that the hypothesis of normality cannot be rejected in certain cases.
Fourth, we analyse the effect on the conditional distribution, that is, the autocorrelation function and the asymmetric effect. Series that account for intraday seasonality and noise frictions show lower autocorrelation coefficients that remain high for up to about 100 days. Then, they decay slowly to zero at a hyperbolic rate, suggesting the removal of a deterministic structure and the presence of a common degree of fractionally integrated long-memory process across intraday sampling frequencies. With regard to the asymmetric responses of volatility, we find that positive returns have a smaller effect on future volatility than do negative returns, but this response is more visible if the seasonal and noise components are neglected. In other words, the 'true volatility process' (no seasonality and noise) has a minor asymmetric response.
Fifth, we extend the analysis to the multivariate case, focusing on the effects of seasonality and microstructure noise on the covariances and correlations. We find significant differences between the unconditional distributions of covariances based on raw data and those based on data from which the intraday seasonality and market microstructure noise have been removed. Additionally, we explore two features of the conditional distribution: cross-correlation patterns, and the asymmetric behaviour of the correlations. Our results reveal a significant reduction in cross-correlations after accounting for intraday seasonality. In addition, the greater co-movement of international index futures is mainly observed during negative return scenarios; this response is more obvious if the seasonal and noise components are neglected. Furthermore, when we take these features into account, we observe a decrease in the correlations during positive return scenarios, which is not observed when using raw data.
Finally, we analyse the potential economic effects of the observed differences caused by the seasonality and microstructure noise when estimating RV. The results show that ignoring these features has a significant effect on investment and hedging decisions, where we find considerable deviations in the optimal weights. These differences are present at every point in time at which an investor makes investment/hedging decisions. They are also large in magnitude, involving the full wealth of the investor at certain points in time. In addition, the effect on the spillover analysis is substantial, with differences in magnitude and even in the direction of the net transmission of volatility.
The remainder of this paper is organised as follows. Section 2 describes our data and explains the methodology we employed to compute the RV measures. Section 3 presents the empirical results for the univariate volatility distribution. Section 4 extends the results to the multivariate case, and Section 5 analyses the potential economic effects on investment/hedging decisions and spillover analyses. Finally, Section 6 summarises the results and concludes the paper.

Data
Our empirical data set, obtained from TickData™, comprises high-frequency observations at different frequencies of transaction prices from two index futures contracts (CAC and DAX) for the period 2 January 2003 to 30 September 2015.⁷ This yields 3,263 and 3,243 trading days for the CAC and DAX index futures, respectively; see Table 1. We rely on the study of Andersen and Bollerslev (1997a) to filter the data and construct the series of intraday returns. At the opening of each trading day, observations incorporate adjustments to the information accumulated overnight and, as a result, display much higher variability. Therefore, we remove the first observation of each trading day to avoid biased results. We also handle missing prices using linear interpolation, which softens the effect of a sharp price change following a trading interruption.
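The interpolation step can be illustrated with a minimal sketch. The function name `interpolate_missing` and the use of `None` to mark a missing price are assumptions made for illustration; the text does not say how gaps are represented, and leading or trailing gaps are simply left untouched here.

```python
def interpolate_missing(prices):
    """Fill interior gaps (None) by linear interpolation between known neighbours.

    Hypothetical helper: the None marker and the handling of edge gaps are
    illustrative assumptions, not the procedure documented in the paper.
    """
    filled = list(prices)
    known = [i for i, p in enumerate(filled) if p is not None]
    # Walk over consecutive pairs of known prices and fill what lies between.
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


day_prices = [100.0, None, None, 103.0, 102.0]
print(interpolate_missing(day_prices))
```

Because the filled prices lie on a straight line between the last price before the interruption and the first price after it, the jump is spread evenly over the missing intervals instead of landing in a single return.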
Then, the continuously compounded intraday returns are computed at each interval by taking the logarithm of the price and subtracting the logarithm of the previous price. Thus, the raw intraday returns $R_{t,n}$ at the $n$th interval of day $t$, for $n = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, are computed as follows:
$$R_{t,n} = \ln(P_{t,n}) - \ln(P_{t,n-1}), \tag{1}$$
where $P_{t,n}$ represents the futures price level at interval $n$ of day $t$.
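As a small illustration of the return definition above, the following sketch computes continuously compounded returns from one day of prices. The function name `intraday_log_returns` and the sample prices are hypothetical.

```python
import math


def intraday_log_returns(prices):
    """Log-price differences between consecutive intervals of one trading day."""
    return [math.log(curr) - math.log(prev) for prev, curr in zip(prices, prices[1:])]


prices = [4500.0, 4504.5, 4501.0, 4503.0]  # hypothetical futures prices
returns = intraday_log_returns(prices)
print(returns)
```

A useful property of log returns is that they add up across intervals: the sum of the intraday returns equals the log return from the first to the last price of the day.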

Construction of realised measures of volatility under seasonality and microstructure noise
Once the raw intraday returns $R_{t,n}$ have been calculated, the realised variance at day $t$ ($RV_t$), defined as the sum of all available intraday high-frequency squared returns, is estimated as follows:⁸
$$RV_t = \sum_{n=1}^{N} R_{t,n}^{2}. \tag{2}$$
Here, $RV_t$ is an accurate approximation of the integrated latent volatility ($IV_t$) (see Andersen et al., 2001c; Barndorff-Nielsen and Shephard, 2002). However, microstructure effects (noise) and intraday periodicities can introduce a severe bias into the daily volatility estimation. Therefore, it follows that
$$RV_t = IV_t + v_t, \tag{3}$$
where $v_t = f(c, u_t)$, $c$ is the deterministic periodic component (intraday seasonality), and $u_t$ is the noise component. As noted by Barndorff-Nielsen and Shephard (2002), when the relationship between $c$ and $u_t$ is additive, the seasonal (or diurnal) effect is nonsignificant. However, when this relation is not additive, the diurnal effect may not be completely ignorable. Because this relation is unknown, we empirically analyse whether these components have an effect on the RV distribution. To this end, we also estimate the RV distributions after removing these biases.
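The raw realised variance described above is a plain sum of squared intraday returns, which the following sketch makes concrete. The function name `realised_variance` and the sample returns are hypothetical.

```python
def realised_variance(returns):
    """Sum of squared intraday returns for a single trading day."""
    return sum(r * r for r in returns)


day_returns = [0.001, -0.002, 0.0015]  # hypothetical intraday returns
print(realised_variance(day_returns))
```

Applying the same function to deseasonalised returns instead of raw returns gives the deseasonalised counterpart, since only the input series changes.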
To construct the deseasonalised realised variance at day $t$ ($RV_t^{s}$), we first check that the intraday pattern is confirmed for both indices and all frequencies. Figure 1 exhibits the mean absolute return for each interval during a trading day and corroborates the presence of intraday seasonality.⁹ This pronounced pattern is also evident in the autocorrelogram for the absolute intraday returns, depicted up to a lag of 10 days in Figure 2.
8 A detailed explanation of the RV methodology can be found in Andersen et al. (2003).
Consider, for instance, the intraday volatility pattern for a five-minute frequency represented by Figure 1 (CAC 5 MINUTES). A distorted double U-shaped pattern is evident during the trading day. All markets show a decaying pattern in intraday volatility until 14:30 (corresponding to interval 66). At 14:35 (interval 67), the return volatility increases considerably, before decreasing until 15:30 (interval 77). Then, a marked increase occurs again, and volatility remains relatively high until 17:30, reaching its peak at 16:05 (interval 84). Note that the last three five-minute returns for the trading day (intervals 99, 100, and 101) also constitute quite unusual intervals. Similar patterns are found by Harju and Hussain (2011).
In view of this outcome, we apply the FFF, originally proposed by Gallant (1981, 1982), to account for the intraday seasonality exhibited by the data. To capture irregularities in the seasonal pattern, we include time-of-day volatility dummies in the Fourier regression.¹¹ Thus, after implementing the FFF, we obtain the deseasonalised returns $R_{t,n}^{s}$. These returns are then used to calculate the deseasonalised realised variance at day $t$ ($RV_t^{s}$):
$$RV_t^{s} = \sum_{n=1}^{N} \left(R_{t,n}^{s}\right)^{2}. \tag{4}$$
To overcome the microstructure noise problem, we adopt the realised kernel estimator¹² of Barndorff-Nielsen et al. (2009). Under this specification, the realised variance at day $t$, free of market microstructure noise, is defined as the following weighted sum (by a kernel function) of the intraday returns:
$$RV_t^{k} = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right) \sum_{n=|h|+1}^{N} R_{t,n}\, R_{t,n-|h|}, \tag{5}$$
where $k\!\left(\frac{h}{H+1}\right)$ is the kernel function, $H$ is the optimal bandwidth, and $R_{t,n}$ is the raw intraday return.
9 A well-known stylised feature of the intraday statistical characteristics of many financial markets is that volatility is higher at the opening and closing of the trading day, and lower in the middle (e.g. see Wood et al., 1985; Harris, 1986; and Tse, 1999).
10 For example, the producer price index, retail sales, consumer price index, consumer confidence, and so on.
11 Dummy variables in which these observations have been removed are incorporated to capture irregularities in the seasonal pattern, except for the last 15 minutes of the trading day, when the frequency used is one minute. Andersen and Bollerslev (1997a) also remove the returns for the last 15 minutes. Further details on the dummy variables are available upon request. A more detailed explanation of the FFF procedure is given in Appendix A.1.
12 See the estimation details of the realised kernel estimator in Appendix A.2.
Finally, to jointly account for intraday seasonality and microstructure noise, we estimate the deseasonalised realised variance free of market microstructure noise ($RV_t^{sk}$). To do so, we use the deseasonalised returns ($R_{t,n}^{s}$) calculated previously, and then apply the following Barndorff-Nielsen et al. (2009) kernel estimator:
$$RV_t^{sk} = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right) \sum_{n=|h|+1}^{N} R_{t,n}^{s}\, R_{t,n-|h|}^{s}. \tag{6}$$
Thus, for each index and for each frequency, four series of RV (i.e. $RV_t$, $RV_t^{s}$, $RV_t^{k}$, and $RV_t^{sk}$) are available for analysis.¹³
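The kernel-weighted sum of autocovariances described above can be sketched as follows. Two things here are assumptions rather than details taken from the text: the Parzen weight function is used as the kernel, and the bandwidth is supplied by hand instead of being chosen optimally (the estimation details are deferred to Appendix A.2). All function names are hypothetical.

```python
def parzen(x):
    """Parzen weight function (assumed kernel choice for this sketch)."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0


def autocovariance(returns, h):
    """Realised autocovariance of order |h|: sum of products of returns |h| apart."""
    h = abs(h)
    return sum(returns[n] * returns[n - h] for n in range(h, len(returns)))


def realised_kernel(returns, bandwidth):
    """Kernel-weighted sum of realised autocovariances for lags -H, ..., H.

    The bandwidth is a user-supplied integer here; selecting it optimally is
    outside the scope of this sketch.
    """
    return sum(
        parzen(h / (bandwidth + 1)) * autocovariance(returns, h)
        for h in range(-bandwidth, bandwidth + 1)
    )


# Alternating returns mimic bid-ask bounce, a typical source of noise.
noisy = [0.001, -0.001] * 10
print(realised_kernel(noisy, 0))  # bandwidth 0 reduces to the plain sum of squares
print(realised_kernel(noisy, 1))  # negative first-order autocovariance lowers the estimate
```

Passing deseasonalised returns instead of raw returns gives the estimate that accounts for both effects, because the estimator itself is unchanged.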

Empirical results for univariate volatility distributions
This section examines the effects on the distributional properties, autocorrelation function, and asymmetric responses of the return volatility after controlling for intraday seasonality and noise. For this purpose, we use as input data the four RV series obtained in the previous section.

Unconditional distribution of volatility
We extend the findings in the extant literature based on HFD to study the distributional characteristics of volatility by jointly considering intraday seasonality and market microstructure noise. To evaluate the extent to which these two factors may affect the distributional characteristics of the RV series, we start by investigating the similarities between the series of realised variances. To this end, we analyse whether the distributions, medians, and variances of the series $RV_t$ versus $RV_t^{k}$, $RV_t$ versus $RV_t^{s}$, and $RV_t$ versus $RV_t^{sk}$ are equal.
13 Consequently, for each index, we have 20 estimations of volatility (five frequencies × four RV series).
We find that removing seasonality mainly affects the second moment of the series (see Panel A of Table 2), whereas addressing noise affects the median and the distribution (see Panels B and C of Table 2). Considering these results, it seems that the choice of the series matters. Thus, we expect the distributions of the daily realised logarithmic standard deviations to differ after controlling for both components.¹⁵
14 With the exception of the equality of the variances for the CAC index and a 10-minute frequency.
Table 3 summarises the moments for the distribution of the daily realised logarithmic standard deviations ($lrv_t$, $lrv_t^{s}$, $lrv_t^{k}$, and $lrv_t^{sk}$) obtained at frequencies of 1, 5, 10, 20, and 30 minutes for both index futures, where $lrv_t = \log\left(\sqrt{RV_t}\right)$ and the other series are defined analogously. Based on our samples of high-frequency returns for the two stock index futures, we find that when we use raw returns, the unconditional distributions of the logarithmic realised standard deviations are close to a normal distribution. These results are consistent with those reported in the literature, which documents that the distribution of logarithmic monthly standard deviations constructed using daily returns is approximately Gaussian (French et al., 1987; Andersen et al., 2001b; Areal and Taylor, 2002). However, note that the hypothesis of normality is rejected for all frequencies and indices (see the results for the Jarque-Bera test in Table 3). For instance, for a frequency of 30 minutes, the Jarque-Bera statistic is equal to 52.08 and 27.90 for the CAC and DAX stock index futures, respectively.
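For readers who want to reproduce this kind of check, the sketch below computes the sample skewness, kurtosis, and the Jarque-Bera statistic using the standard textbook formulas. It also assumes that the logarithmic standard deviation is half the logarithm of the realised variance. The function names are hypothetical; under normality the statistic is asymptotically chi-squared with two degrees of freedom, so values above roughly 5.99 reject at the 5% level.

```python
import math


def log_realised_std(realised_var):
    """Logarithm of the realised standard deviation: log(sqrt(RV)) = 0.5 * log(RV)."""
    return 0.5 * math.log(realised_var)


def jarque_bera(sample):
    """Return (skewness, kurtosis, Jarque-Bera statistic) for a sample."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb


daily_rv = [1.2e-4, 0.8e-4, 2.5e-4, 1.0e-4, 1.6e-4]  # hypothetical daily RV values
lrv = [log_realised_std(v) for v in daily_rv]
print(jarque_bera(lrv))
```

Because the statistic scales with the sample size, even small departures in skewness or kurtosis lead to rejection in samples of several thousand trading days.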
Nevertheless, after considering either seasonality or market microstructure noise, the results change considerably. Note the reduction in skewness and kurtosis after removing the intraday periodic component of volatility ($lrv_t^{s}$ series), market microstructure noise ($lrv_t^{k}$ series), and both features ($lrv_t^{sk}$ series), leading to a significant improvement in the closeness of fit to the normal distribution for both indices and all frequencies. For instance, Table 3 shows that for the CAC index and a frequency of 1 minute, the skewness decreases from 0.394 to 0.160 ($lrv_t^{sk}$ series), and the kurtosis decreases from 3.180 to 3.059 ($lrv_t^{sk}$ series); similar results hold for the rest of the data. Furthermore, note that from the 10-minute frequency onwards, the adjustment to a normal distribution is superior. Thus, the hypothesis of normality cannot be rejected for the CAC index when the frequency of observations is 20 or 30 minutes, whereas for the DAX index, it cannot be rejected for frequencies of 10, 20, or 30 minutes (see the values of the Jarque-Bera test in the last column of Table 3: 2.79, 2.38, 0.95, 2.51, and 8.74, respectively). These findings are consistent with the literature that suggests that using data at the highest available frequency to measure volatility is not always the best approach, because the measure might include additional microstructure effects (Meddahi, 2002; Andersen et al., 2003). These results corroborate the improvement towards the normal distribution for all frequencies studied, suggesting the importance of considering both adjustments up to frequencies of 30 minutes. Failing to account for these effects might lead to bias in a model that relies on a normality assumption.
Table 3 shows the first four moments of the distributions of the daily logarithmic standard deviations ($lrv_t$, $lrv_t^{sk}$, $lrv_t^{s}$, and $lrv_t^{k}$) for the CAC and DAX index futures and the Jarque-Bera test for normality. The rows within each panel display the results for 1-, 5-, 10-, 20-, and 30-minute intraday returns; ** represents rejection at the 5% significance level of the null hypothesis that the daily logarithmic standard deviations are normally distributed.
Finally, in Table 4, we display the mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE), taking as reference the raw RV ($RV$), for each adjustment made in the estimate: noise ($RV^{k}$), seasonality ($RV^{s}$), and both effects ($RV^{sk}$). We analyse how these differences evolve as the data frequency changes for both the CAC index futures (Panel A) and the DAX index futures (Panel B). The results show that microstructure noise is mostly present at the 1-minute frequency, but the effect on the estimate of RV becomes less important as the frequency decreases. However, we still obtain adjustments due to market microstructure noise for the 30-minute frequency (e.g. MSE of 0.2283 in the DAX index, and MSE of 0.5442 in the CAC index). Thus, although the effect of noise at this frequency is lower, it is still advisable to adjust for this bias. The effect of seasonality shows the opposite relationship, where we observe a higher effect at lower frequencies (e.g. MSE of 1.1622 and 1.1518 for the CAC and DAX index, respectively, at the 30-minute frequency), and the effect at the 1-minute frequency is much lower (e.g. MSE of 0.2402 and 0.2950 for the CAC and DAX index, respectively). Finding an optimal frequency is not trivial, owing to the trade-off between the opposite effects (Bandi and Russell, 2006, 2008). These results reinforce the importance of considering market microstructure and seasonality when using HFD (at any frequency), and show the potential deviations in estimates of the RV if we ignore them.
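The three error measures reported in Table 4 can be computed as in the sketch below, which takes one series as the reference and measures how far an adjusted series departs from it. The function name `error_metrics` and the sample numbers are hypothetical, and any scaling applied to the RV series before comparison is not specified in the text.

```python
import math


def error_metrics(reference, adjusted):
    """Return (MAE, MSE, RMSE) of an adjusted series relative to a reference series."""
    if len(reference) != len(adjusted):
        raise ValueError("series must have the same length")
    diffs = [a - r for r, a in zip(reference, adjusted)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, math.sqrt(mse)


raw_rv = [1.0, 2.0, 3.0]       # hypothetical raw RV values
adjusted_rv = [1.0, 1.0, 5.0]  # hypothetical adjusted RV values
print(error_metrics(raw_rv, adjusted_rv))
```

The MSE penalises large deviations more heavily than the MAE, so comparing the two gives a sense of whether an adjustment shifts all days slightly or a few days substantially.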

Conditional distribution of volatility
Volatility clustering is one of the most common stylised facts in financial time series. A quantitative way to view this clustering property is to use an autocorrelogram. To this end, we plot the 300-day autocorrelation function for the series $lrv_t$ and $lrv_t^{sk}$ for a 30-minute frequency¹⁶ (see Figure 3).
Note that, owing to the intraday periodicities in the return volatility, the autocorrelations of the $lrv_t$ series are consistently above the corresponding values of the autocorrelation function of the $lrv_t^{sk}$ series. This is clear in Figure 3, where the autocorrelation coefficient of the daily realised logarithmic standard deviations ($lrv_t$) at lag one is approximately 0.7, but is nearly 0.5 for the $lrv_t^{sk}$ series. In addition, there are significant differences between the autocorrelation coefficients of the series $lrv_t$ and $lrv_t^{sk}$ until approximately lag 250. Furthermore, note that after removing the noise and intraday seasonality, the autocorrelation function remains high up to about 100 days, before decaying slowly to zero. This suggests that a deterministic structure has been removed from the returns and that long-memory volatility dependencies are present.¹⁷ Neglecting this deterministic structure could lead to spurious inferences about the dynamics of the return volatility (Andersen and Bollerslev, 1997a).
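The autocorrelogram discussed above is built from sample autocorrelation coefficients, which can be computed as in this minimal sketch. It uses the usual estimator that divides the lagged cross-products by the total sum of squared deviations; the function name `autocorrelation` and the sample series are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


lrv = [-4.6, -4.4, -4.5, -4.1, -4.0, -4.2, -4.3]  # hypothetical daily log volatilities
correlogram = [autocorrelation(lrv, lag) for lag in range(4)]
print(correlogram)
```

Evaluating the function for lags 1 to 300 on each daily series and plotting the two sets of coefficients together would give a comparison of the kind shown in Figure 3.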
To test the suggested long-memory effect, we implement fractional integration (I(d)) testing procedures.¹⁸ Table 5 shows the estimates of $d$ for the raw against the denoised and deseasonalised absolute returns series. As expected, the estimated coefficients are statistically significant, remarkably stable across the high-frequency returns, and similar to the estimates reported in the extant literature with longer periods of daily data (Table 5 shows that the degree of fractional integration $d$ ranges from 0.2148 to 0.3546). Other studies, such as those of Andersen and Bollerslev (1997b, 1998b, 1998c) and Andersen et al. (2000), reach the same conclusion. Thus, the commonality across intraday sampling frequencies reveals that the long-memory component of volatility is an intrinsic feature of the return-generating process, with a similar degree of fractional integration across different measures of volatilities ($RV_t$, $RV_t^{sk}$) and frequencies (Andersen et al., 2000).
Moreover, previous studies suggest that good and bad news have different predictive abilities for future volatility. In general, negative returns have a greater effect on volatility than positive returns. This asymmetry has important implications for portfolio selection and asset pricing, and should be considered to better estimate volatilities. To this end, we also analyse these asymmetries under microstructure noise and intraday seasonality, and re-examine the underlying empirical evidence for the realised measures $RV_t$ versus $RV_t^{sk}$.
16 The pattern is similar for other frequencies.
17 All frequencies follow the same pattern. The results are not reported here, but are available upon request.
18 The results thus far reveal the importance of removing intraday seasonality and noise. Thus, the following discussion focuses on the measure $RV_t^{sk}$.
Table 5 reports estimates [...] and Robinson (1995) ($d_{AP}$) for the fractional difference parameter $d$ (standard errors in parentheses) using absolute intraday returns. Rows represent estimates at different data frequencies, and columns are divided into two panels (the CAC and DAX indices, respectively). Each panel displays information for raw and denoised deseasonalised absolute returns; ***, **, and * indicate that the null hypothesis of $d = 0$ (no fractional integration) is rejected at the 1%, 5%, and 10% significance level, respectively.
Following Andersen et al. (2001b), we use the ordinary least squares method to fit the regression model in equation (7), where $\ln(S_t)$ denotes the daily realised logarithmic standard deviations $lrv_t$ and $lrv_t^{sk}$, the indicator $I^{-}$ takes the value zero when $r_{t-1} > 0$, and one otherwise, $r_{t-1}$ is the daily return on day $t-1$ (calculated using raw returns when the dependent variable is $lrv_t$, and using denoised deseasonalised returns when the dependent variable is $lrv_t^{sk}$), and $u_t$ is an error term (denoted as $\varepsilon_t$ and $\varepsilon_t^{sk}$ for the regressions with dependent variables $lrv_t$ and $lrv_t^{sk}$, respectively).
Panel A in Table 6 shows the results of the previous regression for these realised measures.¹⁹ Note that all parameters are significant. Specifically, the parameters $w_3$ are negative, indicating that negative returns have a greater effect on volatility. Nevertheless, the findings reveal that this asymmetric effect decreases after removing the market microstructure noise and the seasonal component. For instance, the coefficient $w_3$ for the CAC index changes from -0.510 to -5.097.²⁰ Panel B in Table 6 shows the main descriptive statistics for the residuals of equation (7). As expected, in view of the results thus far, the skewness and kurtosis in the regression in which the noise and seasonality are considered decrease considerably. Turning to the CAC index, we find that the skewness decreases from 0.3477 to 0.2098, and the kurtosis decreases from 3.3710 to 3.0473. The DAX index follows a similar pattern. Finally, Panel C in Table 6 presents equality tests between the variables $lrv_t$ and the residuals $\varepsilon_t$, and between the variables $lrv_t^{sk}$ and $\varepsilon_t^{sk}$. Note that the equality of distributions does not hold in either case, which means that considering the leverage effect makes a difference. Again, neglecting the noise and seasonal components reveals noticeable differences in the asymmetric effect.
Table 6: Estimates of the asymmetric behaviour of volatility. Panel A in Table 6 shows estimates (p-values in parentheses) of equation (7) for the CAC and DAX index futures for a 30-minute frequency, where $\ln(S_t)$ denotes the daily realised logarithmic standard deviations $lrv_t$ and $lrv_t^{sk}$, the indicator $I^{-}$ takes the value zero when $r_{t-1} > 0$, and one otherwise, $r_{t-1}$ is the daily return on day $t-1$ (calculated using raw returns when the dependent variable is $lrv_t$, and using denoised deseasonalised returns when the dependent variable is $lrv_t^{sk}$), and $u_t$ is an error term (denoted as $\varepsilon_t$ and $\varepsilon_t^{sk}$ for the regressions with dependent variables $lrv_t$ and $lrv_t^{sk}$, respectively). In Panel A, *** indicates statistical significance at the 1% level. The descriptive statistics for the residuals are displayed in Panel B. Finally, Panel C displays the median, variance, and distribution equality tests for the series $lrv_t$ versus $\varepsilon_t$ and $lrv_t^{sk}$ versus $\varepsilon_t^{sk}$; *** indicates that the null hypothesis of equality is rejected at the 1% significance level.

Empirical results for multivariate volatility distributions
In this section we analyse the differences in the multivariate volatility distribution due to microstructure noise and intraday seasonality. Studies neglecting these characteristics may suffer from model misspecification and provide misleading results. Therefore, understanding the effects of ignoring these two issues when estimating realised measures of volatility in a multivariate setting is important to corroborate previous findings.

Unconditional distribution of volatility for the multivariate case
Although previous studies have analysed the distribution of individual RV (Andersen et al., 2001b; Areal and Taylor, 2002), few works examine the distribution of realised covariances and correlations. Advantages of realised measures of covariances and correlations include that co-movements do not rely on a parametric model and their computation is straightforward. The realised covariance is defined as the sum of the cross-products of all available intraday returns in both markets:
$$RCov_{i,j,t} = \sum_{n=1}^{N} R_{i,t,n}\, R_{j,t,n},$$
where $RCov_{i,j,t}$ represents the realised covariance between markets $i$ and $j$, and $R_{i,t,n}$, $R_{j,t,n}$ are the intraday raw returns at the $n$th interval of day $t$ for markets $i$ and $j$, respectively.
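The realised covariance defined above can be sketched in a few lines. The realised correlation shown alongside it uses the usual normalisation by the two realised variances, which is an assumption here because the text does not spell out that formula; the function names and sample returns are hypothetical.

```python
import math


def realised_covariance(returns_i, returns_j):
    """Sum of cross-products of synchronous intraday returns in two markets."""
    if len(returns_i) != len(returns_j):
        raise ValueError("both markets need the same number of intervals")
    return sum(a * b for a, b in zip(returns_i, returns_j))


def realised_correlation(returns_i, returns_j):
    """Realised covariance scaled by the two realised standard deviations (assumed form)."""
    var_i = realised_covariance(returns_i, returns_i)
    var_j = realised_covariance(returns_j, returns_j)
    return realised_covariance(returns_i, returns_j) / math.sqrt(var_i * var_j)


cac = [0.001, -0.002, 0.0005]   # hypothetical CAC intraday returns
dax = [0.0012, -0.0015, 0.001]  # hypothetical DAX intraday returns
print(realised_covariance(cac, dax), realised_correlation(cac, dax))
```

Note that the realised covariance of a series with itself is its realised variance, so the univariate measure is a special case of the multivariate one.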
The other covariance specifications considering seasonality and microstructure noise follow similar expressions to those in the univariate case. For the deseasonalised realised covariance, we use the deseasonalised returns ($R^{s}_{t,n}$) calculated earlier, and then compute the deseasonalised realised covariance as follows:

$$RCov^{s}_{i,j,t} = \sum_{n=1}^{N} R^{s}_{i,t,n} R^{s}_{j,t,n},$$

where $R^{s}_{i,t,n}$ and $R^{s}_{j,t,n}$ denote the deseasonalised intraday returns for every interval n of day t for markets i and j, respectively.

20 Note that exp(-0.510) = 0.600 and exp(-5.097) = 0.006.
To compute the realised covariance free of microstructure noise, we use a variation of the realised kernel in equation (5): where $k\left(\frac{h}{H+1}\right)$ is the kernel function, H is the optimal bandwidth, and $R_{i,t,n}$, $R_{j,t,n}$ are the intraday raw returns.
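A kernel-weighted covariance of this kind can be sketched as follows. The lag convention used here (cross-products of one market's returns with the other market's lagged returns, with the roles swapped for negative lags) is an assumption borrowed from the multivariate realised kernel literature and may differ from the exact expression used in the paper. The function names are illustrative, and the bandwidth H is taken as given.

```python
import numpy as np

def parzen(x):
    """Parzen kernel weight for a (possibly negative) scaled lag x."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x**2 + 6.0 * x**3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def cross_autocov(r_i, r_j, h):
    """Sum of r_i[n] * r_j[n - h] for h >= 0; the roles swap for h < 0."""
    if h < 0:
        return cross_autocov(r_j, r_i, -h)
    n = len(r_i)
    return float(np.dot(r_i[h:], r_j[: n - h]))

def realised_kernel_cov(returns_i, returns_j, H):
    """Kernel-weighted sum of realised cross-autocovariances up to lag H."""
    r_i = np.asarray(returns_i, dtype=float)
    r_j = np.asarray(returns_j, dtype=float)
    if r_i.shape != r_j.shape:
        raise ValueError("return series must be aligned and of equal length")
    if not 0 <= H < len(r_i):
        raise ValueError("bandwidth H must satisfy 0 <= H < number of returns")
    return sum(
        parzen(h / (H + 1)) * cross_autocov(r_i, r_j, h)
        for h in range(-H, H + 1)
    )
```

With H = 0 only the contemporaneous term survives and the estimator collapses to the raw realised covariance. Feeding deseasonalised returns into the same function gives a version that accounts for both effects.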
Finally, the estimate of the realised covariance considering both effects (seasonality and microstructure noise) is obtained using equation (11) and the deseasonalised returns ($R^{s}_{i,t,n}$, $R^{s}_{j,t,n}$) instead of the raw returns ($R_{i,t,n}$, $R_{j,t,n}$).

Table 7 (Panels A-D) summarises the first four moments of the distributions of the daily logarithmic covariances (lRCov t , lRCov k t , lRCov s t , and lRCov sk t ) between the CAC and DAX index futures. 21 Additionally, Table 7 (Panel E) shows the equality tests for the variances, medians, and distributions between the series RCov t and RCov sk t . Similarly to the individual volatilities, we observe significant deviations in the observed levels of skewness and kurtosis among the series (in the mildest case, for denoised deseasonalised realised covariances, the skewness and kurtosis reach values of 0.0532 and 3.0211, respectively; for the raw realised covariances, the skewness and kurtosis reach values of 0.1940 and 3.1225, respectively). The equality tests also reveal significant differences. Indeed, the null hypothesis of equality is rejected when comparing RCov t versus RCov sk t , that is, when removing seasonality and noise (with the exception of a frequency of 5 minutes). 22 Although the results for the distribution of the realised covariance are not conclusive, they extend our previous findings to the multivariate level. Furthermore, they show the perils of ignoring seasonality and microstructure noise when analysing volatility interactions, because different specifications of realised covariances will lead to different conclusions and decisions.

Table 7: Moments and equality test of the distribution of unconditional daily covariances. Panels A-D in Table 7 show the first four moments of the distributions of the daily logarithmic covariances (lRCov t , lRCov k t , lRCov s t , and lRCov sk t ) between the CAC and DAX index futures. Panel E shows the equality tests for the variances, medians, and distributions between RCov t and RCov sk t . The associated p-values are shown below each metric; ***, **, and * denote that the null hypothesis of equality is rejected at the 1%, 5%, and 10% significance levels, respectively.

Conditional distribution of volatility for the multivariate case
Similarly to the analysis conducted for the univariate case, we analyse two important features of the conditional distribution for the multivariate case: the serial cross-correlation, and the asymmetric response to positive and negative shocks.  Figure 4 shows the 50-day cross-correlation function for the series of daily realised logarithmic standard deviations (lrv t and lrv sk t ) of the CAC and DAX indices for a 30-minute frequency. This plot shows how the past volatility in one index affects the volatility in the other index. Note that the contemporaneous cross-correlation is very high for both measures of volatility, with values ranging between 0.8 and 0.95. The persistence of the cross-effects is severely reduced, with the decline being more obvious in the series in which we consider the noise and the periodic components (lrv t versus lrv sk t series). Note that the estimates when using raw data (lrv t ) show much higher persistence up to a lag of 35 days (approximately). From this lag onwards, the differences in persistence between the two series are less visible, but still have significant effects on the contemporaneous volatility of the other index. We argue that this higher persistence observed in the cross-correlation for series when neglecting seasonality is due to the repetition of a deterministic structure in the intraday periodic component. This structure may lead to biased results on the transmission of information between markets (Alemany et al., 2019). We could conclude that this transmission exists, when in fact there is a mere repetition of an intraday scheme. Hence, a proper volatility estimator is essential because different choices lead to significant differences.
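A cross-correlation function of this type can be computed directly from two daily series. The sketch below assumes two aligned arrays of daily log volatilities and uses the sample correlation on the overlapping observations at each lag; the function name and the orientation of the lag (the second series lagged relative to the first) are illustrative choices, not the paper's stated procedure.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Sample correlation between x_t and y_{t-lag} for lag = 0, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("series must be aligned and of equal length")
    out = []
    for lag in range(max_lag + 1):
        current = x[lag:]              # x_t for t >= lag
        lagged = y[: len(y) - lag]     # y_{t-lag} on the same dates
        out.append(np.corrcoef(current, lagged)[0, 1])
    return np.array(out)
```

Calling the function once per volatility measure (raw and filtered) and plotting the two outputs against the lag gives a comparison of the kind discussed above.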
Previous research suggests that correlations behave differently during good and bad economic scenarios, increasing during crisis periods. However, there is still a debate in the literature about the reasons for this empirical pattern, and even about what to call it: some authors refer to it as financial contagion, while others state that it is just interdependence (see Forbes and Rigobon, 2002, and Corsetti et al., 2005, for further discussion). This phenomenon has important implications for portfolio selection and diversification because, during a financial crisis, the possibilities for risk diversification decrease when they are needed the most. Next, we analyse how different estimates of the RV affect this observed phenomenon.
To do so, we start by computing the realised correlations in our framework, as follows:

$$RCorr^{k,s,sk}_{i,j,t} = \frac{RCov^{k,s,sk}_{i,j,t}}{\sqrt{RV^{k,s,sk}_{i,t}\, RV^{k,s,sk}_{j,t}}},$$

where $RCorr^{k,s,sk}_{i,j,t}$ is the correlation between indices i and j on day t, for t = 1, 2, ..., T. For the different specifications that consider noise (k), intraday seasonality (s), and both effects (sk), $RV^{k,s,sk}_{i,t}$ and $RV^{k,s,sk}_{j,t}$ are the RVs (obtained from equations 2, 4, 5, and 6), and $RCov^{k,s,sk}_{i,j,t}$ is the realised covariance (obtained from equations 8 to 11).
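As a small illustration, the realised correlation can be computed from one covariance and two variance estimates of the same specification. The sketch assumes that RV denotes a realised variance, so that the covariance is scaled by the square root of the product of the two RVs; the function name and the toy returns are hypothetical.

```python
import numpy as np

def realised_correlation(rcov, rv_i, rv_j):
    """Realised covariance scaled by the two realised variances."""
    if rv_i <= 0 or rv_j <= 0:
        raise ValueError("realised variances must be positive")
    return rcov / np.sqrt(rv_i * rv_j)

# Hypothetical raw intraday returns for one day.
r_i = np.array([0.001, -0.002, 0.0015, 0.0005])
r_j = np.array([0.0012, -0.0018, 0.001, -0.0002])
rcorr = realised_correlation(np.sum(r_i * r_j), np.sum(r_i**2), np.sum(r_j**2))
```

For raw estimates the result lies between -1 and 1 by the Cauchy-Schwarz inequality. That bound is not automatic when the covariance and the variances come from kernel-based estimators computed separately.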
Then, we regress the following two models using data from the CAC and DAX indices at a 30-minute frequency: 23 where the variable RCorr k,s,sk CAC−DAX,t stands for the different correlations at day t between the CAC and DAX indices, with k, s, sk representing the adjustment made for noise, seasonality, and both effects, respectively. Furthermore, I (++) is a dummy variable that takes the value one when the return of the CAC and the return of the DAX are both positive, and zero otherwise, and I (−−) is a dummy variable that takes the value one when the return of the CAC and the return of the DAX are both negative, and zero otherwise. Thus, in this regression, we can see the increase in correlation during bull and bear markets compared with the average correlation.
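A regression of this type can be sketched with ordinary least squares. The specification below (an intercept plus the two joint-sign dummies) is an assumption made for illustration, since the exact equations are not reproduced here; the function names and the coefficient ordering (w0, w1, w2) are likewise hypothetical.

```python
import numpy as np

def joint_sign_dummies(ret_cac, ret_dax):
    """Indicators for days on which both returns are positive or both negative."""
    ret_cac = np.asarray(ret_cac, dtype=float)
    ret_dax = np.asarray(ret_dax, dtype=float)
    both_up = ((ret_cac > 0) & (ret_dax > 0)).astype(float)
    both_down = ((ret_cac < 0) & (ret_dax < 0)).astype(float)
    return both_up, both_down

def fit_dummy_regression(rcorr, both_up, both_down):
    """OLS of the realised correlation on a constant and the two dummies."""
    X = np.column_stack([np.ones_like(both_up), both_up, both_down])
    coef, *_ = np.linalg.lstsq(X, np.asarray(rcorr, dtype=float), rcond=None)
    return coef  # assumed ordering: intercept, joint-up effect, joint-down effect
```

Under this assumed form, the intercept is the average correlation on days with mixed signs, and each dummy coefficient measures the change in average correlation on joint-up or joint-down days. Standard errors are omitted from the sketch.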
In addition, to link the estimates of the correlations with the magnitudes of the past returns, we regress the following model: where r t−1 is the daily return on day t − 1, calculated using the intraday returns corresponding to each correlation. 24

The results for these regressions are displayed in Table 8. Panel A shows that the average estimate of the correlation increases when the indices move downwards together (the coefficients w 2 are positive and significant). This result is robust for all RV estimations, corroborating that financial markets tend to co-move during a crisis. However, the differences in this asymmetric effect of the correlation are more obvious in panel B of Table 8, where we include the magnitude of past returns. Here, the estimates differ depending on whether we consider the intraday seasonality. When we ignore the seasonality, the correlation is positively affected by both positive and negative past returns, with negative past returns leading to a larger increase in the correlation. For instance, the w 1 coefficient is equal to 0.9951, whereas the w 2 coefficient is equal to -2.0710 when we consider RCorr CAC−DAX,t as the dependent variable and r CAC,t−1 as the independent variable in regression (14). However, the results after considering intraday seasonality paint a different picture. With these volatility estimates, positive past returns decrease the correlation between the indices (the w 1 coefficients are negative: -0.0407 and -0.0749 for r CAC,t−1 ; -0.0629 and -0.0726 for r DAX,t−1 ), while negative past returns increase the levels of correlation (the w 2 coefficients are negative: -0.0480 and -0.0520 for r CAC,t−1 ; -0.0284 and -0.0481 for r DAX,t−1 ). In addition, note that the coefficients w 1 and w 2 are lower in absolute terms after controlling for noise and seasonality, indicating a weaker effect of past returns on the correlation.
In these estimates, although we still see an increase in the co-movement during crisis periods, the role of positive returns on the correlation is different. Obviously, this has an important effect on all decisions that rely on estimates of correlations among financial markets, such as international asset allocation.

Effects of seasonality and microstructure noise on economic decisions
Conclusions in different fields of economics rely on the features of multivariate volatility distributions, individual standard deviations, and correlations. The decision-making process in areas such as asset allocation, risk management, or volatility transmission is based directly on estimates of volatility. Therefore, poor estimates might result in suboptimal decisions. Thus, exploring the effects of correcting for seasonality and microstructure noise in such applications provides a measure of their economic impact.
We analyse two of these applications. First, we examine the effects of seasonality and microstructure noise on the weights of a 'volatility timing' strategy (Fleming et al., 2001;Wang et al., 2020) between the CAC and DAX indices. Second, we construct a minimum variance portfolio (suitable for an investor who wants to hedge her investment) using the CAC and DAX indices.
The top plot in Figure 5 displays the differences (using RV against RV sk ) in the estimated weights of a 'volatility timing' strategy between the CAC and DAX indices, where we rebalance the portfolio on a monthly basis. The allocation rule in this strategy is the inverse of the estimated individual volatility; thus, a higher share of wealth is placed on the asset with the lower volatility. This allocation strategy has been shown to provide advantages over more sophisticated allocation strategies (Kirby and Ostdiek, 2012) because it does not require optimization, does not require covariance matrix inversion, and generates positive weights. Specifically, the weight invested in asset i at time t is computed as

$$\hat{\omega}_{i,t} = \frac{1/\widehat{RV}_{i,t}}{\sum_{j=1}^{N} 1/\widehat{RV}_{j,t}},$$

where $\widehat{RV}_{i,t}$ represents the estimate of the RV for the raw (RV ) or denoised and deseasonalised (RV sk ) cases of asset i at period t, and N is the number of assets (two, in our case).

We observe differences in the allocated weights for the CAC index 25 across the sample period, depending on the volatility estimate. The magnitude and direction of these deviations do not follow a clear pattern, with periods of overestimates and underestimates in the allocated weights of the CAC index futures. The sizes of these deviations are not marginal, and lead to different allocations, up to approximately 10%, at certain points of time.
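The inverse-volatility allocation rule described above can be sketched in a few lines. The function name is illustrative, and the input is assumed to hold one RV estimate per asset (optionally one row per rebalancing period).

```python
import numpy as np

def volatility_timing_weights(rv):
    """Weights proportional to the inverse of each asset's RV estimate.

    `rv` has the assets along the last axis; rows, if any, are periods.
    """
    rv = np.asarray(rv, dtype=float)
    if np.any(rv <= 0):
        raise ValueError("RV estimates must be positive")
    inverse = 1.0 / rv
    return inverse / inverse.sum(axis=-1, keepdims=True)

# Hypothetical RV estimates for two assets in one period.
weights = volatility_timing_weights([1.0, 3.0])
```

Computing the weights once with raw estimates and once with filtered estimates, and then differencing the first column, gives the type of comparison plotted for the CAC index.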
The bottom plot in Figure 5 shows the differences (using RV against RV sk ) in the estimated weights of a hedging strategy using the CAC and DAX indices. In this strategy, the investor is interested in obtaining an allocated portfolio with a minimum variance. The vector of weights at every point of time t ($\omega^{(h)}_t$) can be estimated using the following expression:

$$\min_{\omega^{(h)}_t} Var_p \;\Rightarrow\; \omega^{(h)}_t = \frac{\left(\Sigma^{(h)}_t\right)^{-1}\mathbf{1}}{\mathbf{1}'\left(\Sigma^{(h)}_t\right)^{-1}\mathbf{1}},$$

where $\min_{\omega^{(h)}_t} Var_p$ represents an optimization problem that minimises the variance of the allocated portfolio, $\Sigma^{(h)}_t$ is a 2 × 2 covariance matrix of the CAC and DAX indices in period t for the raw ($\Sigma_t$) or denoised and deseasonalised ($\Sigma^{(sk)}_t$) cases, and $\mathbf{1}$ is a vector of ones.

The differences in the allocated weights under this strategy are more evident. The average absolute deviation in weights between using the raw estimated RV and the estimate considering noise and seasonality is around 22%. We also observe differences in the weights at certain points higher than 100%, which represent not only a change in the amount invested in the asset, but also a different position in the asset: a short (long) position instead of a long (short) position. In this dynamic hedging strategy, the selection of the RV estimator turns out to be crucial for its proper implementation, because differences lead to very different decisions.
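The minimum variance weights can be illustrated with the standard closed-form solution for a fully invested portfolio. The sketch assumes a positive definite covariance matrix whose diagonal holds the realised variances and whose off-diagonal element is the realised covariance; the function name and the numbers are hypothetical.

```python
import numpy as np

def min_variance_weights(cov):
    """Fully invested minimum variance weights for a covariance matrix."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve cov @ x = 1 instead of forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / x.sum()

# Hypothetical covariance matrices for the same day under two estimators.
w_low_cov = min_variance_weights([[0.04, 0.01], [0.01, 0.09]])
w_high_cov = min_variance_weights([[0.04, 0.05], [0.05, 0.09]])
```

In the second toy matrix the covariance exceeds the smaller variance, so the weight on the second asset turns negative. This shows how a change in the covariance estimate alone can flip a long position into a short one, which is the kind of deviation above 100% noted in the text.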
Certainly, asset allocation and risk management are two applications in which the effects of seasonality and microstructure noise on the distribution of the RV are evident. However, there are other fields in which these features may have a significant effect. In Figure 6, we plot the net pairwise volatility spillover (NPVS) measure of Diebold and Yilmaz (2012) between the CAC and DAX indices. This measure identifies which markets act as net transmitters and net receivers of volatility spillovers, allowing us to disentangle how international markets are interconnected. The figure shows that the magnitude and the direction of the spillovers between the CAC and DAX indices also depend on the choice of the RV estimator. 26 When using the raw RV, we observe that the CAC index acts as a net transmitter of volatility in periods around 2006 and 2008. However, after considering seasonality and microstructure noise, the DAX index shows a more dominant role during these periods. In addition, in this second specification, we observe that the DAX index acts as a net transmitter for most of the sample periods analysed, where the magnitude of the estimated spillover is also higher. These results highlight again the potential economic effects of microstructure noise and intraday seasonality, and the importance of considering these characteristics in economic applications based on HFD.

25 Note that the differences in the weights allocated in the DAX index are the negative values of the weights in the plot.
26 Positive values of the spillover index indicate that the CAC index is acting as a net transmitter of volatility, and negative values indicate that the DAX index is acting as a net transmitter of volatility.

Conclusion
This study examines the conditional and unconditional distributions of RV using HFD, while controlling for features that introduce bias in variance estimates, that is, intraday seasonality and market microstructure noise. We evaluate these estimates for the univariate and multivariate case using two index futures contracts (CAC and DAX) and several frequencies of observations (1, 5, 10, 20, and 30 minutes).
The results of the univariate analysis reveal that filtered series (RV sk t ) better fit the log-normal distribution for all frequencies (to the extent that the hypothesis of normality cannot be rejected), have lower autocorrelation coefficients, and have a lower asymmetric volatility response. The results from the multivariate case are in line with those of the univariate analysis. Here, we find significant differences in the bivariate distribution of volatility and lower cross-correlation coefficients. The asymmetric response of volatility remains visible, although the effect decreases after accounting for intraday seasonality and noise.
Overall, our findings show that accounting for intraday seasonality and market microstructure noise is key to achieving normality. Because normality is the basis of many models, considering both features becomes crucial. Divergences of the empirical distribution from the normal distribution have important implications for both the conditional and the unconditional distributions of volatility, the degree of autocorrelation, cross-correlations, and asymmetric effects. Failing to account for these features leads to RV estimates that are far from the ideal integrated volatility, resulting in inconsistent conclusions in fields such as information transmission, asset allocation, or risk management.

A.1 Modelling intraday seasonality
The HFD literature documents an intraday periodic pattern in volatility that suggests either a U-shaped or a W-shaped form (see, among others, Wood et al., 1985; Harris, 1986; Andersen and Bollerslev, 1997a; and Tse, 1999). Owing to this intraday periodicity, volatility models might lead to spurious inferences about the dynamics of the returns, suggesting the importance of addressing the seasonality exhibited by the data. The approach of Gallant (1981, 1982), based on the FFF, has proven particularly adept at overcoming this drawback, making it possible to obtain deseasonalised data (Andersen and Bollerslev, 1997a).
We employ the FFF methodology to approximate the intraday periodic component of volatility. This method uses linear polynomial regressions and Fourier methods, which consider sines and cosines. The decomposition for the intraday returns can be expressed as follows:

$$R_{t,n} = E(R_{t,n}) + \frac{\sigma_t S_{t,n} Z_{t,n}}{N^{1/2}},$$

where $E(R_{t,n})$ indicates the unconditional mean, N indicates the number of return intervals per day, $S_{t,n}$ is the periodic component for the nth intraday interval, $\sigma_t$ is the conditional volatility factor for day t, and $Z_{t,n}$ is an independent and identically distributed (i.i.d.) error term with mean zero and unit variance that is assumed to be independent of the daily volatility process. By moving the unconditional mean to the left-hand side, squaring both sides, and applying logarithmic transformations, we have

$$\log\left(R_{t,n} - E[R_{t,n}]\right)^2 = \log\left(\frac{\sigma_t^2 S_{t,n}^2 Z_{t,n}^2}{N}\right).$$
To apply the Fourier approach, a two-step procedure is followed. In the first stage, $\hat{X}_{t,n}$ is computed from equation (18). Then, $\hat{X}_{t,n}$ is used as the dependent variable in the Fourier regression (19), which is estimated using a non-linear regression. 28 Once $\hat{f}(\theta; \sigma_t; n)$ is calculated, the intraday periodic component $\hat{S}_{t,n}$ for interval n on day t, which provides a close approximation to the overall volatility patterns in each market, is retrieved as: 29

$$\hat{S}_{t,n} = T \exp\left(\hat{f}_{t,n}/2\right)$$

Finally, the deseasonalised intraday returns series are defined as follows:

$$R^{s}_{t,n} = \frac{R_{t,n}}{\hat{S}_{t,n}}.$$
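The expressions referred to as equations (18) and (19) are not reproduced in this section. For orientation only, the flexible Fourier form of Andersen and Bollerslev (1997a), which the coefficient names in the accompanying footnote suggest is the specification used, can be written as below. This is an assumption about the intended form rather than a transcription of the paper's equations: calendar-effect dummies are omitted, $\bar{R}$ denotes the sample mean return, $N_1$ and $N_2$ are normalising constants, and the last line is the normalised periodic component from that reference, with T counting all intraday returns in the sample.

```latex
\begin{align*}
\hat{X}_{t,n} &\equiv 2\log\left|R_{t,n}-\bar{R}\right| - \log\hat{\sigma}_t^{2} + \log N
  = f(\theta;\sigma_t,n) + u_{t,n}, \\
f(\theta;\sigma_t,n) &= \sum_{j=0}^{J}\sigma_t^{\,j}\left[\mu_{0j} + \mu_{1j}\frac{n}{N_1}
  + \mu_{2j}\frac{n^{2}}{N_2}
  + \sum_{p=1}^{P}\left(\gamma_{pj}\cos\frac{2\pi p n}{N}
  + \delta_{pj}\sin\frac{2\pi p n}{N}\right)\right], \\
N_1 &= \frac{N+1}{2}, \qquad N_2 = \frac{(N+1)(N+2)}{6}, \\
\hat{S}_{t,n} &= \frac{T\exp\left(\hat{f}_{t,n}/2\right)}
  {\sum_{t}\sum_{n=1}^{N}\exp\left(\hat{f}_{t,n}/2\right)}.
\end{align*}
```

The first line follows from the logged decomposition: subtracting $\log\sigma_t^2$ and adding $\log N$ leaves $\log S_{t,n}^2 + \log Z_{t,n}^2$, so the regression function approximates the log of the squared periodic component.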

A.2 Microstructure noise consistent estimators
Previous studies have shown that measures of RV can be sensitive to microstructure noise at short intervals. The observed price process at these high frequencies is the result of two components: the true price-generating process and noise. However, we can only observe both components together. To minimise the effect of microstructure noise in the estimates of the realised variance, studies have proposed techniques that design efficient estimators robust to certain types of frictions.

27 We employ the widely used parametric GARCH(1,1) model to capture daily volatility. In most empirical applications, the GARCH(1,1) is enough to reproduce the volatility dynamics of financial series; thus, it is popular with both academics and practitioners (Engle, 2001).
28 Following Andersen and Bollerslev (1997), the choice of (J, P) is determined by choosing the model that best matches the basic shape of the periodic pattern with the fewest number of parameters. Our selected model for both indices sets J = 1 and P = 2. Expanding the Fourier form beyond this produces nonsignificant estimates for any additional µ0j, µ1j, µ2j, γpj, and δpj coefficients.
29 For more detail, see Andersen and Bollerslev (1997a).
In our application, we follow Barndorff-Nielsen et al. (2009) to compute the realised kernel estimator 30 as the sum of the intraday returns weighted by a kernel function:

$$RV^{k}_{t} = \sum_{h=-H}^{H} k\left(\frac{h}{H+1}\right)\gamma_h, \qquad \gamma_h = \sum_{j=|h|+1}^{n} R_{t,j} R_{t,j-|h|},$$

where k(x) is a kernel weight function and $\gamma_h$ is the hth realised autocovariance of the intraday returns. We implement the Parzen kernel applied to the non-flat-top case, which is given by:

$$k(x) = \begin{cases} 1 - 6x^2 + 6x^3, & 0 \le x \le 1/2, \\ 2(1-x)^3, & 1/2 \le x \le 1, \\ 0, & x > 1. \end{cases}$$

One of the critical steps when implementing this kernel is the selection of the bandwidth H. Previous studies have shown that an optimal choice for this bandwidth is:

$$H^{*} = c^{*}\,\varepsilon^{4/5}\, n^{3/5},$$

where $c^{*} = 3.5134$ for the Parzen kernel, n is the number of intraday intervals, and the variable $\varepsilon^2$ is defined as

$$\varepsilon^2 = \frac{\omega^2}{\sqrt{T\int_0^T \sigma_u^4\,du}}.$$

We estimate $\varepsilon$ simply by:

$$\hat{\varepsilon}^2 = \frac{\hat{\omega}^2}{\widehat{IV}},$$

where $\hat{\omega}^2$ is an estimate of $\omega^2$, and $\widehat{IV}$ is an estimate of $IV = \int_0^T \sigma_u^2\,du$. We use this last approximation because $IV^2 \approx T\int_0^T \sigma_u^4\,du$, and it is simpler to obtain an estimate of IV than of $T\int_0^T \sigma_u^4\,du$. In our case, we use the raw RV at the corresponding frequency as $\widehat{IV}$.

The estimate of $\omega^2$ is obtained as follows. By varying the starting point, we obtain k distinct auxiliary realised variances, denoted $RV^{(1)}_{aux}, RV^{(2)}_{aux}, \ldots, RV^{(k)}_{aux}$. For each of the auxiliary realised variances, we compute $\hat{\omega}^2_{(i)} = RV^{(i)}_{aux}/(2n)$. Finally, we compute the average of these k estimates as our final estimate of $\omega^2$:

$$\hat{\omega}^2 = \frac{1}{k}\sum_{i=1}^{k}\hat{\omega}^2_{(i)}.$$

Both techniques (FFF and realised kernel estimator) are combined to obtain the deseasonalised RV free of market microstructure noise. In a first step, the deseasonalised returns ($R^{s}_{t,n}$) are obtained using equations (17) to (22). Then, a similar procedure to that described for equations (23) to (26) is implemented, except that we use the deseasonalised returns ($R^{s}_{t,n}$) instead of the raw returns ($R_{t,n}$).
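The bandwidth selection steps can be sketched as follows. The construction of the auxiliary realised variances (aggregating fine-grid returns into k-period returns from k different starting offsets) is one possible reading of "varying the starting point" and is an assumption, as are all function names; the constant 3.5134 is the value reported above for the Parzen kernel, and n is supplied by the caller.

```python
import numpy as np

C_STAR_PARZEN = 3.5134  # constant reported in the text for the Parzen kernel

def auxiliary_realised_variances(fine_returns, k):
    """Realised variances of k-period returns, one per starting offset 0..k-1."""
    r = np.asarray(fine_returns, dtype=float)
    log_price = np.concatenate(([0.0], np.cumsum(r)))  # implied log-price path
    return [
        float(np.sum(np.diff(log_price[start::k]) ** 2))
        for start in range(k)
    ]

def noise_variance(aux_rvs, n):
    """Average of RV_aux / (2n) over the auxiliary realised variances."""
    return float(np.mean([rv / (2.0 * n) for rv in aux_rvs]))

def parzen_bandwidth(omega2_hat, iv_hat, n):
    """H* = c* (eps^2)^(2/5) n^(3/5), with eps^2 = omega2_hat / iv_hat."""
    eps2 = omega2_hat / iv_hat
    return C_STAR_PARZEN * eps2 ** (2.0 / 5.0) * n ** (3.0 / 5.0)
```

The returned bandwidth is generally not an integer. How it is rounded before being used as the number of lags is left open here, since the text does not specify it.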