Long-run expectations in a learning-to-forecast experiment: a simulation approach

In this paper, we elicit short-run as well as long-run expectations on the evolution of the price of a financial asset in a Learning-to-Forecast Experiment (LtFE). In each period, subjects have to forecast the asset price for each one of the remaining periods. The aim of this paper is twofold. First, we fill a gap in the experimental literature on LtFEs, where great effort has been devoted to investigating short-run expectations, i.e. one-step-ahead predictions, while no contributions elicit long-run expectations. Second, we propose a new computational algorithm to replicate the main properties of the short- and long-run expectations observed in the experiment. This learning algorithm, called the Exploration-Exploitation Algorithm (EEA), is based on the idea that agents anchor their expectations around the last realized price rather than on the fundamental value, with a range proportional to the past observed price volatility. When compared to the Heuristic Switching Model (HSM), our algorithm performs equally well in describing the dynamics of short-run expectations and the realized price dynamics. The EEA, additionally, is able to reproduce the dynamics of long-run expectations.


Introduction
The economy can be formally thought of as an expectation feedback system, i.e. a system where agents' expectations, formed on the basis of available information and on past realizations of economic variables, influence future realizations of those variables (Hommes 2001). Understanding how individual agents form their expectations is crucial to understanding the evolution of the economic system and to designing efficient economic policies that guide the system towards socially desirable outcomes. One of the main problems when dealing with expectations is that they are not directly observable, in contrast to prices, volumes, interest rates and all the economic variables recorded every day in world-wide markets. There are several cleverly designed methods to estimate agents' expectations, directly or indirectly, based on surveys (for an overview, see Manski 2004). However, surveys do not typically provide incentives that depend on the performance of the respondents, so their validity turns out to be limited.
A valid alternative to surveys is controlled laboratory experiments, which have the advantage of perfect monitoring of the information available to the subjects and, furthermore, allow for the elicitation of expectations using appropriate performance-based incentives. Learning-to-Forecast Experiments (LtFEs), introduced by Marimon and Sunder (1993), are controlled laboratory experiments used to elicit subjects' expectations in an expectation-feedback environment, where the feedback between the subjects' expectations and the aggregate quantities, typically prices, is designed by the experimenter. They turn out to be a powerful and flexible tool to study subjects' expectation formation under different feedback systems. Many LtFEs have been conducted to study how agents form their short-run expectations in financial markets (Hommes et al. 2005b), real estate markets (Bao and Ding 2016), commodity markets (Bao et al. 2013) and in simple macroeconomic frameworks (Assenza et al. 2011; Anufriev et al. 2013a; Cornand and Kader M'baye 2013).
We conduct a LtFE in which, unlike the standard settings,$^{1}$ subjects submit a prediction for the asset price at different time horizons. In other words, we explicitly elicit subjects' long-run expectations at the beginning of every period, giving them the possibility to revise their expectations as new information becomes available. The novelty of our experimental design is that it incorporates into the LtFEs the elicitation of long-run expectations, in order to study how expectations form and co-evolve with the price. Our setting is extremely simple, since long-run expectations do not enter directly in the feedback mechanism, which is influenced just by one-step-ahead predictions.$^{2}$ We can study how subjects form their long-run expectations based solely on price dynamics, considering our setting as a first step in the direction of a better understanding of the dynamics of expectations in more complex environments. Our results show that, as for short-run expectations, the Rational Expectation Equilibrium (REE) is not a good benchmark for describing the dynamics of subjects' long-run expectations. Instead, the anchor-and-adjustment principle seems to work well in describing subjects' behavior. Subjects, in fact, learn to coordinate their expectations around the last realized price, following some simple adaptive rules. More specifically, we observe that the degree of coordination of expectations as well as the influence of the last realized price decrease with the forecasting time horizon.
$^{1}$ For a comprehensive survey of the macroeconomic experiments on expectations, see Assenza et al. (2014).
$^{2}$ We use the term "prediction" to refer to the forecasts submitted by the subjects during the experiment. We assume that subjects submit their predictions based on their expectations, which are not observable. Therefore, throughout the paper, we use the words "prediction" and "expectation" as (almost) interchangeable.
Finally, we show that eliciting subjects' long-run expectations does not affect their short-run forecasts and, therefore, does not impact the price dynamics. This experimental finding constitutes an important contribution to the literature on LtFEs.
To the best of our knowledge, the only experimental work that elicited long-run expectations in an asset market with bubbles is that of Haruvy et al. (2007). Moreover, Hanaki et al. (2016) investigate the impact of forecast elicitation on mispricing in an experimental asset market. A kind of natural experiment has been conducted by Galati et al. (2011), who elicit short-, medium- and long-run inflation expectations from professional forecasters at central banks, academics and students. They provide, when possible, a reward based on performance. Although some effort has been devoted to eliciting long-run expectations using data from surveys (Ashiya 2003; Fujiwara et al. 2013), these studies are not immune to the critical aspects of the survey methodology.
Why are we interested in eliciting long-run expectations? Several empirical as well as theoretical contributions have stressed the importance of taking into account the whole time spectrum of agents' expectations when designing effective economic policies (Gurkaynak et al. 2005; Coeuré 2013). Since the early 2000s, central banks have followed a "forward guidance" communication strategy (Woodford 2001) to try to influence expectations by releasing public announcements on different macroeconomic indicators. Central banks might also establish medium-term inflation targets to discipline the expectations of economic actors. In both cases, central banks, when devising their monetary policies, take into consideration agents' expectations at different time horizons. Focusing on the regulation of financial markets, Joyce et al. (2008) highlight the important link between expectations in financial markets and monetary policy measures. They stress the need to monitor the behavior of traders in financial markets and to gather information on their short- and long-run expectations to better evaluate the effect of monetary policies. However, they underline how difficult it is to gather information on traders' expectations. Along the same lines, Draghi (2008) points out that a clear idea of the effect of public disclosure on long-term expectations could be useful to control the emergence of new bubbles. These are just examples of why understanding the way agents form and revise their expectations at different time horizons is a relevant issue for policy design. The experiment we propose is a powerful instrument to observe expectations in financial markets, which might be useful for setting specific monetary policies.
In the second part of the paper, we introduce a learning algorithm capable of reproducing the properties of the short- as well as long-run expectations observed in our experiment. In the LtFE literature, we find several computational attempts to describe short-run expectations, under different experimental settings and information sets at the disposal of the subjects, using learning algorithms to simulate human behavior. Some examples are Heemeijer et al. (2009), Assenza et al. (2011), Bao et al. (2013), and Hommes and Lux (2013) in the context of LtFEs with positive and negative feedback, with exogenous shocks, or in a macroeconomic environment. A unified framework has been proposed with the aim of reproducing all experimental results of LtFEs, the so-called Heuristic Switching Model (HSM). This approach, which builds on the seminal paper of Brock and Hommes (1998), is based on the idea that each subject considers a limited number of simple extrapolative rules (heuristics). The artificial agents can switch heuristic depending on its forecasting performance in the recent past. The learning mechanism is based on a performance measure proportional to the quadratic forecasting error (see, for example, Anufriev and Hommes 2012).
We propose an alternative approach to the HSM to model individual behavior in a LtFE. We introduce an algorithm that we can loosely define as "non-parametric", since it does not impose any predetermined forecasting rule. The main idea arises from analyses of professional forecasters such as Campbell and Sharpe (2009) and Nakazono (2012). According to these studies, professional forecasters use the last observed price as an anchor in order to reduce the uncertainty about the future. Looking at the experimental results in LtFEs, especially the one-step-ahead predictions, subjects predict the next price by anchoring their predictions around the last realized price. A similar mechanism holds for the long-run predictions, as shown in Colasante et al. (2018). The alternative algorithm proposed in this paper, called the Exploration-Exploitation Algorithm (EEA), is similar to the algorithms used to solve the multi-armed bandit problem (see Auer et al. 2002 and Koulouriotis and Xanthopoulos 2008). In this kind of computational problem, artificial agents face a trade-off between exploitation and exploration, i.e. between taking a decision using "known and cheap" information up to period t and gathering "new and costly" information about the environment by exploring the "neighborhood space". The EEA is close in spirit to some existing models based on Genetic Algorithms, as in Arifovic and Masson (2004) and Hommes and Lux (2013).
All in all, the HSM and the EEA are both based on the well-known behavioral principle of anchor-and-adjustment, with the difference that the HSM has a predetermined set of few rules, while the EEA gives agents a wider set of feasible actions. We show that the EEA and the HSM both perform well in describing the dynamics of short-run expectations and the realized price. We then test the capability of the EEA to describe the dynamics of the long-run expectations we observe in our experiment. The simplicity of the EEA, moreover, allows for the estimation of the key parameters of the algorithm, providing us with an individual classification of the subjects based on how they form their expectations.

The learning to forecast experiment

Experimental design
Our goal is to study expectations formation in the short run as well as in the long run. We implement a LtFE similar to Heemeijer et al. (2009), where the task of subjects is to predict the future price of an asset. In each of the seven sessions, six subjects play the role of professional forecasters for 20 periods (see the translated instructions in the complementary material). At the beginning of period $t$, subject $i$ submits his short-run prediction for the asset price at the end of period $t$, denoted as ${}_ip^e_{t,t}$, as well as his set of long-run predictions for the price at the end of each one of the $20 - t$ remaining periods. Long-run predictions are denoted as ${}_ip^e_{t,t+k}$ with $1 \le k \le 20 - t$. Over the 20 periods, each subject therefore submits 20 short-run and 190 long-run predictions, since we elicit both short- and long-run expectations in every period.
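The number of predictions per subject follows directly from the design described above. The short Python sketch below only restates that arithmetic; the function name and the constant are illustrative and do not come from the experiment's software.

```python
HORIZON = 20  # number of periods in each session, as stated in the design


def predictions_in_period(t: int, horizon: int = HORIZON) -> int:
    """Predictions submitted at the beginning of period t:
    one short-run prediction plus one long-run prediction for each
    of the (horizon - t) remaining periods."""
    if not 1 <= t <= horizon:
        raise ValueError("t must lie between 1 and horizon")
    return 1 + (horizon - t)


short_run_total = HORIZON
long_run_total = sum(HORIZON - t for t in range(1, HORIZON + 1))
all_predictions = sum(predictions_in_period(t) for t in range(1, HORIZON + 1))

print(short_run_total, long_run_total, all_predictions)  # 20 190 210
```

The first period requires 20 predictions and the last period only one, which matches the response times reported for the first and last periods.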
When submitting their predictions, subjects are informed about: (i) the constant interest rate ($r$) and the average dividend ($d$), (ii) the realized asset prices until period $t-1$, and (iii) all their own (short- and long-run) past predictions and their corresponding profits. However, they are not informed about the predictions submitted by the other subjects and have just qualitative information on the price-generating mechanism. In the instructions, subjects are informed that there is a positive relationship between their one-step-ahead predictions and the next realized price. (In Appendix A, Fig. 21 shows a screenshot of the experiment.) We follow the approach of Heemeijer et al. (2009) in deriving the pricing equation. The function connecting the short-run predictions with the price at the end of period $t$ is the following:
$$p_t = \frac{1}{1+r}\left(\bar{p}^e_{t,t} + d\right) + \varepsilon_t \qquad (1)$$
where $r = 0.05$ in all sessions and $d$ is equal to 3.5 or 3.25 depending on the session. The fundamental price is computed as $p^f = d/r$, so that we have some sessions with a fundamental price of 70 and other sessions with 65. $\bar{p}^e_{t,t}$ is the average of the six one-step-ahead predictions submitted at the beginning of period $t$, $\bar{p}^e_{t,t} = \frac{1}{6}\sum_{i=1}^{6} {}_ip^e_{t,t}$, and the term $\varepsilon_t \sim N(0, 0.25)$ is an iid normal shock. Note that Eq. (1) describes a positive feedback between the short-run predictions and the realized price. Under a positive feedback expectation system, an increase in average expectations yields a corresponding increase in the realized price (Hommes 2013). This positive feedback seems particularly well suited to describe financial market dynamics, especially when the coefficient of proportionality between expectations and prices is close to 1, as in our experimental setting.
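To make the feedback mechanism concrete, the sketch below assumes that Eq. 1 takes the asset-pricing form used by Heemeijer et al. (2009), $p_t = (\bar{p}^e_{t,t} + d)/(1+r) + \varepsilon_t$, which is consistent with the fundamental price $p^f = d/r$ stated above. It also assumes that 0.25 in $N(0, 0.25)$ is the variance of the shock. The function names and the six forecasts are hypothetical.

```python
import random


def fundamental_price(d: float, r: float = 0.05) -> float:
    """Fixed point of the assumed pricing rule: p_f = d / r."""
    return d / r


def realized_price(avg_forecast: float, d: float, r: float = 0.05,
                   shock: float = 0.0) -> float:
    """Assumed form of Eq. 1: discounted sum of average forecast and dividend."""
    return (avg_forecast + d) / (1.0 + r) + shock


def draw_shock(rng: random.Random, variance: float = 0.25) -> float:
    """Normal shock; treating 0.25 as a variance is an assumption."""
    return rng.gauss(0.0, variance ** 0.5)


rng = random.Random(1)
forecasts = [52.0, 55.0, 50.0, 58.0, 54.0, 49.0]  # hypothetical one-step-ahead predictions
avg = sum(forecasts) / len(forecasts)
price = realized_price(avg, d=3.5, shock=draw_shock(rng))
print(round(fundamental_price(3.5), 2), round(fundamental_price(3.25), 2))
```

Under this form the slope of the price with respect to the average forecast is $1/(1+r) \approx 0.95$, the "coefficient of proportionality close to 1" mentioned in the text.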
Individual earnings at the end of each period depend on both short- and long-run prediction errors and are computed as ${}_i\pi_t = {}_i\pi^s_t + {}_i\pi^l_t$. We denote as ${}_i\pi^s_t$ the subject's pay-off that depends on his short-run prediction error, and as ${}_i\pi^l_t$ the pay-off that depends on his long-run prediction errors. We define ${}_i\pi^l_t = \sum_{j=1}^{t-1} {}_i\pi^l_{t-j,t}$, where ${}_i\pi^l_{t-j,t}$ represents the individual profit associated with the accuracy of the prediction submitted by subject $i$ at the beginning of period $t-j$ about the asset price in period $t$, with $1 \le j \le t-1$. It is computed according to a step-function payment schedule. The final payment of each subject is the sum of pay-offs across all periods. We calibrated the parameters of the pay-off functions such that approximately $\max \sum_{t=1}^{20} {}_i\pi^s_t = \max \sum_{t=1}^{20} {}_i\pi^l_t$, in order to give the subjects the same incentive to provide accurate predictions in the short run as in the long run. We implement the payment scheme for the long-run expectations as a step function because subjects have an immediate feedback about the accuracy of their short-run predictions, while they experience a delay in evaluating the accuracy of their long-run predictions. Forecasting prices in the long run is a more demanding task than forecasting in the short run, so we think that a step function makes it easier for subjects to evaluate their long-run forecasting accuracy. Our choice of 20 periods, in contrast to the approximately 50 one-step-ahead predictions typically used in the literature on LtFEs, is based on a trade-off between having a time series sufficiently long to conduct a meaningful statistical analysis and, at the same time, avoiding a too demanding task for the subjects. It is important to underline that subjects' attention does not decrease over time. To stress this aspect, we compute the average time taken by subjects to submit their predictions.
Each subject takes approximately 2.5 minutes to submit 20 predictions in the first period and, at the end of the session, takes on average 30 seconds to submit just one prediction. Figure 1 illustrates the average number of seconds per prediction in each period. We can therefore reasonably assume that the predictions subjects submit are not driven by boredom. We are aware that one drawback of our setting in comparison with other LtFEs is the lower number of periods. The main advantage, instead, is that we can monitor the entire time spectrum of expectations and its evolution over time. Given the absence of any reference in the literature on eliciting long-run expectations, we decided that obtaining experimental evidence on long-run behavior could significantly contribute to the LtFE literature.
The experiment, conducted in the Laboratory of Experimental Economics at University Jaume I, involved 42 undergraduate students. Each session lasted approximately 50 minutes and the average gain was 20 Euros.

Working hypotheses
According to the law of motion in Eq. 1, the REE implies that the realized price $p_t$ converges to the fundamental value with very small fluctuations due to the idiosyncratic shock $\varepsilon_t$. If we assume that all subjects follow rational expectations, their predictions in each period $t$ and for each forecasting horizon $k$ should fluctuate around the constant fundamental value, i.e. ${}_ip^e_{t,t+k} \approx p^f$.

Hypothesis 1
Under REE, short- and long-run predictions as well as prices fluctuate around the fundamental value. When testing hypothesis 1, we will consider three possible alternatives: (i) we can observe in our experiment that prices as well as short-run expectations converge to the fundamental value, a case often reported in the literature on LtFEs, whereas long-run expectations exhibit diverging paths; (ii) it might be that long-run expectations converge to the fundamental value, while short-run expectations together with the price dynamics do not converge; (iii) finally, we could observe that neither short- nor long-run expectations converge to the fundamental value. In all these cases, the REE would not be a good benchmark to describe the experimental data.
From Eq. 1, each subject has an incentive to coordinate his expectations around the others' expectations: subjects' expectations are strategic complements. A subject therefore has to guesstimate the expectations of the other subjects when submitting short-run predictions. The subjects have a strong incentive to mutually coordinate their short-run predictions.
Hypothesis 2
Subjects learn to coordinate their short-run predictions.
Whereas subjects have a direct incentive to coordinate their short-run predictions, their coordination motive for long-run expectations is more complex. When submitting long-run predictions, the subject's task is to forecast, at the beginning of period $t$, the price at the end of period $t+k$, with $k > 0$. The price at the end of period $t+k$ depends on the other subjects' short-run predictions submitted at the beginning of period $t+k$. Therefore, each subject should guesstimate, $k$ periods in advance, the short-run expectations of the other subjects. Given the increasing uncertainty in guesstimating the future short-run behavior of other subjects, we should expect long-run expectations to exhibit a lower degree of coordination, i.e. a higher dispersion across subjects, the longer the forecasting horizon.

Hypothesis 3
The heterogeneity of subjects' long-run expectations increases with the forecasting horizon.
If hypotheses 2 and 3 hold true, at which level do subjects coordinate their expectations? Since the last realized price is publicly available, it plays two different roles: on the one hand, it determines subjects' profits via their forecasting errors; on the other hand, the realized price is a public signal, carrying information on the others' expectations. Our conjecture is that, due to this double role, the realized price becomes an anchor for the coordination of expectations. We expect a strong coordination of subjects' short- and long-run predictions around the last realized price.

Hypothesis 4
The last realized price acts as an anchor for the coordination of subjects' expectations.
Note that in Eq. 1 we explicitly exclude the dependence of the price on long-run expectations. Therefore, one might expect that the elicitation of long-run expectations does not significantly impact the price dynamics. However, subjects' pay-offs depend on the accuracy of their long-run expectations. Therefore, we cannot exclude a priori that a subject forms his short-run expectations as a function of his past long-run expectations, using some kind of inter-temporal hedging strategy. Under such an incentive scheme, subjects might tend to submit short-run predictions consistent with their past long-run predictions. Following this reasoning, we conjecture that:

Hypothesis 5
Past long-run expectations influence short-run expectations.

Experimental results
As a first step, we look at the dynamics of prices and expectations at different time horizons. Figure 2 shows individual short-run predictions and realized prices for all groups. Figure 3 shows the average individual long-run predictions for 2 periods, 3 periods and 4 periods ahead, together with the realized price. As an example, Fig. 4 shows the evolution over time of the price together with the individual long-run predictions of one of the groups. Figures 22 to 28 show individual long-run predictions as well as the evolution of the price for the 20 periods and for all 7 groups.

Convergence of price and expectations to the REE
In line with the LtFE literature implementing a positive feedback system, Fig. 2 shows that, in our experimental markets, there is no immediate convergence of realized prices to the fundamental value. In some cases prices exhibit a slow monotonic or oscillatory pattern towards the fundamental value, whereas in other cases the price seems to diverge. Therefore, we observe that the price does not converge to the REE. If the price does not converge to the REE, do the individual expectations converge? In order to test whether individual expectations converge to the fundamental value, we compute the Root Mean Square Error (RMSE) between the fundamental value and the individual predictions submitted in period $t$ for the price $k$ periods ahead. From Fig. 5, it is evident that short-run expectations do not converge to the fundamental value, although the RMSE reduces over time. The same pattern is observed for long-run expectations. When comparing the degree of convergence for different time horizons, from Fig. 5 we observe a slight increase of the RMSE with the forecasting time horizon. As a first approximation, we can state that the degree of convergence of expectations to the fundamental value is largely independent of the horizon. It seems that the fundamental value is not the main determinant of the dynamics of short- and long-run expectations. Given our results, we can conclude that the REE is not a good benchmark to describe either the dynamics of subjects' expectations or the price dynamics and, therefore, we reject hypothesis 1.
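As an illustration of the convergence measure, the sketch below computes an RMSE between a set of individual predictions and the fundamental value. It assumes the average is taken over the predictions passed in (for example, the six subjects of a group for a given period $t$ and horizon $k$); the exact averaging used for Fig. 5 is not specified here, and the function name and data are hypothetical.

```python
import math
from typing import Sequence


def rmse_to_fundamental(predictions: Sequence[float], p_f: float) -> float:
    """Root mean square distance between predictions and the fundamental value."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    squared = [(p - p_f) ** 2 for p in predictions]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predictions of six subjects for one period and one horizon.
group_predictions = [58.0, 61.0, 60.5, 59.0, 62.0, 60.0]
print(round(rmse_to_fundamental(group_predictions, p_f=70.0), 3))
```

A value that stays well above zero across periods, as described for Fig. 5, indicates that predictions do not settle on the fundamental value.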

Coordination of short and long-run expectations
In order to measure the degree of coordination of subjects' expectations, Fig. 6 shows the average standard deviation of subjects' predictions submitted in period $t$ for the price at the end of period $t+k$. In line with the LtFE literature, we observe a fast coordination of subjects' short-run predictions. The heterogeneity of subjects' short-run predictions declines rapidly during the first five periods and afterwards reaches an almost stable value. Similarly, the degree of coordination of long-run predictions increases over time. However, long-run expectations clearly need more time to reach the same degree of coordination as short-run predictions. Moreover, we observe that the heterogeneity of subjects' expectations submitted in period $t$ systematically increases with the time horizon. Subjects' long-run expectations are persistently heterogeneous across periods and the heterogeneity increases with the time horizon. Hypotheses 2 and 3 are supported by the data. An important question arises: what is the origin of such heterogeneity? It cannot be a difference in the subjects' information sets, since the past price dynamics is common knowledge among subjects. We can conjecture that subjects have different interpretations of whether and how past prices influence future prices. In this respect, our experimental design allows us to measure the heterogeneity in the way individual subjects form their expectations from the available information better than other LtFEs in the literature, since we have a more comprehensive measure of their expectations.
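The coordination measure can be sketched as the cross-sectional standard deviation of the predictions submitted in one period, computed separately for each horizon. The snippet assumes the sample standard deviation (the text does not say whether the sample or population version is used), and the data and names are hypothetical.

```python
from statistics import stdev
from typing import Dict, List


def dispersion_by_horizon(forecasts: Dict[int, List[float]]) -> Dict[int, float]:
    """Map each horizon k to the standard deviation, across subjects,
    of the predictions submitted in a given period for period t + k."""
    return {k: stdev(values) for k, values in forecasts.items()}


# Hypothetical predictions of six subjects in one period (k = 0 is the short run).
period_forecasts = {
    0: [50.0, 51.0, 49.0, 50.0, 52.0, 48.0],
    3: [45.0, 56.0, 48.0, 53.0, 60.0, 41.0],
}
dispersion = dispersion_by_horizon(period_forecasts)
print({k: round(v, 2) for k, v in dispersion.items()})
```

Lower values mean tighter coordination; hypothesis 3 corresponds to the dispersion rising with $k$.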

Realized price as an anchor for the coordination of expectations
Subjects learn to coordinate their expectations: at what level do they coordinate? From a visual inspection of Figs. 22 to 28 in Appendix B, one observes that the last realized price acts as an anchor in the formation of subjects' expectations. This effect is particularly evident if we compare the whole set of expectations submitted in the first period, when subjects have no past prices available, to those submitted in period 2 (see panels (a) and (b) in Figs. 22 to 28 in Appendix B). In particular, the anchor effect of the last realized price on the dynamics of expectations can be clearly identified in the strong reduction in the heterogeneity of the entire spectrum of expectations submitted in the second period with respect to the first; this reduction persists for several of the subsequent periods. Figure 7 shows the distribution of the correlation coefficients between the time series of previous realized prices $p_{t-1}$ and the individual predictions ${}_ip^e_{t,t+k}$ at different time horizons, represented as box plots. The median of the correlation coefficient decreases with the time horizon, starting from a value very close to 1. Even for a longer horizon ($k = 10$), the correlation is significantly different from zero. It is clear that the last realized price becomes an anchor for the short-run predictions. Moreover, the realized price remains a stable anchor even for longer horizons, helping the subjects to reduce the uncertainty in guesstimating the others' future short-run expectations. Hypothesis 4 is supported by the experimental data.
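The anchoring statistic behind Fig. 7 pairs each prediction submitted at the beginning of a period with the price realized in the previous period. A minimal sketch of that pairing with a Pearson correlation is given below; the helper names and the toy series are hypothetical, and the correlation is written out explicitly to keep the example dependency-free.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def anchor_correlation(prices: Sequence[float], predictions: Sequence[float]) -> float:
    """Correlate predictions submitted in period t with the price of period t - 1.
    prices[t] and predictions[t] refer to the same period (0-indexed)."""
    return pearson(prices[:-1], predictions[1:])


prices = [50.0, 52.0, 51.0, 55.0]
predictions = [40.0, 50.5, 52.5, 51.5]  # hypothetical: last price plus 0.5
print(round(anchor_correlation(prices, predictions), 3))
```

Repeating the calculation for each subject and each horizon $k$ would give the distributions summarized by the box plots.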

The Exploration-exploitation algorithm
We introduce the Exploration-Exploitation Algorithm to describe subjects' short- and long-run expectations together with the price dynamics of our LtFE. Within the EEA, agents choose their actions from a distribution of feasible actions whose mean and variance evolve adaptively as a function of past price and expectations dynamics. In particular, agents learn to adapt their range of actions as they acquire more information on the past price dynamics. We show in Fig. 8 how the heterogeneity among individual predictions in our experiment reduces over time. In early periods, subjects have too few realized prices to form a precise idea about the price evolution, and therefore the range of their short- and long-run predictions is wide (exploration phase). After a few periods, subjects learn to coordinate their predictions, using the last realized price as an anchor (exploitation phase), i.e. to submit forecasts narrowly centered on $p_{t-1}$. This is particularly true for the one-step-ahead predictions, for which subjects receive an immediate feedback (Colasante et al. 2018), whereas the heterogeneity of subjects' long-run forecasts remains fairly high. Within the framework of the EEA, we can interpret this experimental evidence as the subjects' tendency to explore a wider range of actions in the first periods due to the lack of information, adopting later on the exploitation strategy, i.e. learning how to coordinate their predictions using the information on the price dynamics in order to gain higher profits.
Fig. 8 Evolution of the heterogeneity across subjects' predictions (one-step-ahead and four-steps-ahead predictions) as a function of the period. The continuous line represents the realized price and the shadowed area refers to one standard deviation of individual predictions. The data refer to Group 1
Let us formalize the EEA. All agents have a set of $n = 101$ feasible actions denoted as $A_t = \{a_{1t}, a_{2t}, \ldots, a_{nt}\}$, where $a_{1t} < a_{2t} < \ldots < a_{nt}$ and $a_{jt}$ denotes a single element of $A_t$, with $j \in \{1, \ldots, 101\}$. The set $A_t$ changes every period depending on the last realized price and the standard deviation of past prices. The range of the set is given by $(p_{t-1} - 5\sigma_{t-1},\; p_{t-1} + 5\sigma_{t-1})$, common to all agents, where $\sigma_{t-1}$ is the standard deviation of the last three realized prices.$^{8}$ Note that $a_{1t} = p_{t-1} - 5\sigma_{t-1}$ and $a_{101t} = p_{t-1} + 5\sigma_{t-1}$. At the beginning of period $t$, agent $i$ selects an action ${}_i\tilde{a}_t = {}_ip^e_{t,t}$ from $A_t$, which corresponds to agent $i$'s expected price for the end of period $t$, i.e. its one-step-ahead prediction. Besides the one-step-ahead prediction, agent $i$ chooses three more actions ${}_i\tilde{a}^k_t = {}_ip^e_{t,t+k}$ from the corresponding sets of actions $A^k_t = \{a^k_{1t}, a^k_{2t}, \ldots, a^k_{nt}\}$, where $k \in \{1, 2, 3\}$. Each one of those actions represents agent $i$'s long-run expectations up to four steps ahead.$^{9}$ The elements of the sets $A^k_t$ evolve over time. In particular, the range of $A^k_t$ is centered on $p_{t-1}$, as in the case of short-run predictions, but its width is constant over time and does not depend on the price evolution (see Fig. 8, panel b). Since the long-run profit function employed in the experiment is a step function, we assume that the maximum range for the agents' long-run predictions is 15.$^{10}$ The range of the actions that belong to $A^k_t$ is therefore $(p_{t-1} - 15,\; p_{t-1} + 15)$.$^{11}$ Once all agents have chosen their actions, the price is computed according to Eq. 1. Agents then evaluate the performance of all feasible actions using a fitness function that accounts for the last realized price as an anchor and for the last predictions. The second term in the fitness measures is used to enhance heterogeneity in the individual distributions. We can interpret these fitness measures as a sort of adaptive adjustment of the expectations.
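The construction of the action sets can be sketched as follows. The text fixes the number of actions (101), their ordering and the end points of the ranges; the equal spacing of the grid and the use of the sample standard deviation are assumptions made only for this illustration, as are the function names.

```python
from statistics import stdev
from typing import List, Sequence


def short_run_actions(prices: Sequence[float], n: int = 101,
                      width: float = 5.0) -> List[float]:
    """Grid over (p_{t-1} - 5*sigma, p_{t-1} + 5*sigma), where sigma is the
    standard deviation of the last three realized prices."""
    if len(prices) < 3:
        raise ValueError("at least three realized prices are required")
    last = prices[-1]
    sigma = stdev(prices[-3:])          # assumption: sample standard deviation
    step = 2.0 * width * sigma / (n - 1)  # assumption: equally spaced actions
    return [last - width * sigma + j * step for j in range(n)]


def long_run_actions(prices: Sequence[float], n: int = 101,
                     half_range: float = 15.0) -> List[float]:
    """Grid over (p_{t-1} - 15, p_{t-1} + 15), with constant width over time."""
    last = prices[-1]
    step = 2.0 * half_range / (n - 1)
    return [last - half_range + j * step for j in range(n)]


history = [60.0, 62.0, 61.0]  # hypothetical realized prices
a_short = short_run_actions(history)
a_long = long_run_actions(history)
print(round(a_short[0], 2), round(a_short[-1], 2), round(a_long[0], 2), round(a_long[-1], 2))
```

With these three prices the standard deviation equals 1, so the short-run grid spans 56 to 66 while the long-run grid spans 46 to 76; the short-run range shrinks automatically when recent prices become less volatile.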
We introduce two different measures: $V_t$ to evaluate the individual actions in the set $A_t$ (short-run predictions), and $V^k_t$ to evaluate the individual actions in the sets $A^k_t$ (long-run predictions). The values of the fitness measures are given in Eqs. 3 and 4, where the parameters $\phi_s$ and $\phi_l$ represent the weights assigned to the past expectations. Note that, in the fitness functions, we consider quadratic terms to evaluate short-run predictions, while we consider the absolute distance to evaluate the subject's long-run predictions. This difference is introduced in order to replicate the profit functions we use in the experiment. We then introduce a probability distribution associated with the set $A_t$ for each agent $i$. Essentially, all agents have the same set of actions; however, the fitness measures and the associated probability distributions differ among agents depending on their individual past performance. For the choice of short-run predictions, let ${}_iP_{jt}$ be the probability that agent $i$ selects action ${}_ia_{jt}$ from the set $A_t$, such that $0 \le {}_iP_{jt} \le 1$ and $\sum_{j=1}^{101} {}_iP_{jt} = 1$. The probability of selecting action ${}_ia_{jt}$ is given by Eq. 5, where $\gamma \in [0, \infty)$ represents the intensity of choice, which determines how an agent evaluates the relative performance of the actions. For the choice concerning the long-run predictions, we introduce the corresponding probability distributions in Eq. 6. According to the probability distributions in Eqs. 5 and 6, in each period $t$ each agent randomly chooses four actions, ${}_i\tilde{a}_t$ and ${}_i\tilde{a}^k_t$ with $k \in \{1, 2, 3\}$ (one short-run prediction and three long-run predictions). Figure 9 illustrates an example of the probability distributions associated with each action for the different agents for one-step-ahead and four-step-ahead predictions.

8 The range of the actions turns out to be influential in the dynamics of the EEA if it is sufficiently wide. The range of actions will be determined by the parameters $\phi_s$ and $\gamma$. See the comments on the estimators of those parameters in Appendix C.

9 Although, in the experiment, we elicit the expectations for the whole time horizon, we replicate the individual expectations only up to four steps ahead. Our choice represents a good compromise between considering the whole time span and having sufficient statistics to analyze the properties of the EEA as a function of the time horizon and compare them to the experimental data.

10 Note that, following the payment schedule used to reward the subjects' long-run expectations, if the absolute difference between the price and the long-run prediction is higher than 15, the profit is equal to zero.

11 Except for the first period, the long-run expectations that we consider always lie in the chosen interval.
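The short-run choice step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the fitness form `-(a - p)^2 - phi_s * (a - e)^2` and the logit form of the probabilities are inferred from the verbal description (quadratic terms, last price as anchor, weight $\phi_s$ on the past prediction, intensity of choice $\gamma$) and are not copied from Eqs. 3 and 5. The function names, the example prices and the use of the population standard deviation are likewise hypothetical.

```python
import numpy as np

def action_grid(p_last, sigma_last, n_actions=101, half_width=5.0):
    """Equally spaced actions from p_last - 5*sigma_last to p_last + 5*sigma_last."""
    return np.linspace(p_last - half_width * sigma_last,
                       p_last + half_width * sigma_last, n_actions)

def choice_probabilities(actions, p_last, e_last, phi_s, gamma):
    """Logit probabilities over the action grid.

    ASSUMPTION: fitness = -(a - p_last)^2 - phi_s * (a - e_last)^2, a quadratic
    form consistent with the text but not taken from the paper's Eq. 3.
    """
    fitness = -(actions - p_last) ** 2 - phi_s * (actions - e_last) ** 2
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(gamma * (fitness - fitness.max()))
    return weights / weights.sum()

rng = np.random.default_rng(0)
prices = np.array([48.0, 50.0, 49.0])   # hypothetical last three realized prices
sigma = prices.std()                     # population std (ddof=0), an assumption
grid = action_grid(prices[-1], sigma)
probs = choice_probabilities(grid, p_last=prices[-1], e_last=51.0,
                             phi_s=0.12, gamma=0.57)
forecast = rng.choice(grid, p=probs)     # one random draw = one-step-ahead prediction
```

Under this assumed fitness, the most likely action is the grid point closest to $(p_{t-1} + \phi_s\, {}_ip^e_{t-1,t-1})/(1 + \phi_s)$, which is why the choice distribution is well approximated by a Gaussian centered between the last price and the last forecast.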

The heuristic switching model
In this subsection we describe the HSM, which we use as a benchmark to assess the performance of the EEA in describing and forecasting the short-run predictions. Below we list the four heuristics of the HSM as introduced in the original paper by Anufriev and Hommes (2012), adapting them to our notation. We label the four rules by the index $h = 1, \dots, 4$ and denote the corresponding price forecast by ${}_hp^e_{t,t}$. Note that the left sub-index $h$ now denotes the heuristic rule instead of the subject. The heuristics are:

• Adaptive rule (ADA): a weighted average of the last prediction and the last realized price.

• Weak trend following rule (WTR): according to this rule, agents take into account the last realization of the price and adjust their prediction by extrapolating the market trend. The coefficient of proportionality is smaller than one, hence the "weakness" in the name of the rule.

${}_2p^e_{t,t} = p_{t-1} + w\,(p_{t-1} - p_{t-2}), \quad w = 0.4$

• Strong trend following rule (STR): it is structurally identical to the WTR. The difference lies in the weight assigned to the extrapolative term, in this case higher than 1.

${}_3p^e_{t,t} = p_{t-1} + s\,(p_{t-1} - p_{t-2}), \quad s = 1.3$

• Learning and adjustment rule (LAA): the first part is a time-dependent anchor given by the average of the last observed price and the mean of past prices. The second term of the equation is the extrapolative term, which here has a unit coefficient.

The learning mechanism is based on the relative profitability of each forecasting rule among the four fixed rules. Agents neither learn new rules nor modify the existing ones. They rank the different rules and choose the one that performed best in the recent past. The switching mechanism is based on a performance measure $U_{h,t}$ that depends on the quadratic forecasting error and on a parameter $0 \le \eta \le 1$, which represents the "memory" of agents, i.e. the weight assigned to past errors. We set $\eta = 0.7$, following Hommes (2013).
To generate the price, it is necessary to compute the proportion of agents who use each of the heuristics. We compute the proportion of agents choosing each rule, i.e. $n_{h,t}$, using the discrete choice model with asynchronous updating as in Diks and Van Der Weide (2005) and Hommes et al. (2005a), as a generalization of Brock and Hommes (1997). In the updating equations, $0 < \delta \le 1$ denotes the share of agents who update their choices; the parameter $\beta \ge 0$ represents the intensity of choice, which determines the switching speed to the most successful rule; and $Z_{t-1}$ is a normalization factor. We consider $\delta = 0.9$ and $\beta = 0.4$ as in Hommes (2013). We compute the expected price as a weighted average across the different expectations given by the four rules:

$p^e_{t,t} = \sum_{h=1}^{4} n_{h,t-1}\; {}_hp^e_{t,t},$

and we insert this value in Eq. 1 to compute the resulting price.
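One iteration of the HSM can be sketched as follows. The sketch rests on assumptions that go beyond this section: the ADA weights (0.65 on the last price, 0.35 on the last forecast) and the exact LAA expression follow the commonly used specification of Anufriev and Hommes (2012); the performance update is taken to be $U_{h,t} = -(p_t - {}_hp^e_{t,t})^2 + \eta\, U_{h,t-1}$; and the share update is taken to be $n_{h,t} = \delta\, n_{h,t-1} + (1-\delta)\exp(\beta U_{h,t-1})/Z_{t-1}$. If $\delta$ is read strictly as the share of agents who update, the two weights in `update_shares` should be swapped. All function names and example numbers are hypothetical, and the price equation (Eq. 1) is deliberately left out.

```python
import numpy as np

def heuristic_forecasts(prices, last_forecasts):
    """Forecasts of the four rules (ADA, WTR, STR, LAA) for the next price."""
    p1, p2 = prices[-1], prices[-2]
    ada = 0.65 * p1 + 0.35 * last_forecasts[0]         # assumed ADA weights
    wtr = p1 + 0.4 * (p1 - p2)                          # w = 0.4
    str_ = p1 + 1.3 * (p1 - p2)                         # s = 1.3
    laa = 0.5 * (np.mean(prices) + p1) + (p1 - p2)      # anchor + unit extrapolation
    return np.array([ada, wtr, str_, laa])

def update_performance(U_prev, realized, forecasts, eta=0.7):
    """Negative squared forecast error plus discounted past performance."""
    return -(realized - forecasts) ** 2 + eta * U_prev

def update_shares(n_prev, U, delta=0.9, beta=0.4):
    """Discrete choice with asynchronous updating (assumed form, see lead-in)."""
    w = np.exp(beta * (U - U.max()))                    # stabilised logit weights
    return delta * n_prev + (1.0 - delta) * w / w.sum()

prices = np.array([50.0, 52.0, 51.0])                   # hypothetical price history
shares = np.full(4, 0.25)                               # equal initial weights
f = heuristic_forecasts(prices, last_forecasts=np.full(4, 50.0))
expected_price = shares @ f                             # weighted average forecast
U = update_performance(np.zeros(4), realized=50.5, forecasts=f)
shares = update_shares(shares, U)
```

In this toy example the weak trend rule has the smallest forecast error for the hypothetical realized price of 50.5, so its share rises slightly above 0.25 while the others fall.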
The literature on LtFEs has shown that the HSM can reproduce fairly well the properties of short-run expectations in different experimental settings. By comparing the goodness of fit of the HSM in describing the short-run expectations with that reported for similar LtF experiments in the literature, we provide a quantitative criterion to support our conjecture that the elicitation of long-run expectations does not affect the short-run expectations dynamics through a potential inter-temporal hedging activity of subjects trying to smooth their forecasting errors. That is, if the dynamics of our experimental data are replicated by the HSM as well as in other similar experiments, we can conjecture that the elicitation of long-run expectations has a negligible effect on the price dynamics.

Estimation and calibration of EEA
The simplicity of the EEA allows for the estimation of its main parameters, namely $\phi_s$, $\gamma$ and $\phi_l$, at the individual level using the maximum likelihood (ML) procedure. Interestingly, the estimators of the parameters ${}_i\phi_s$ and $\gamma_i$ can be expressed in closed form. For the parameter ${}_i\phi_l$, we rely on a numerical optimization algorithm (see Appendix C). The individual probability distribution of one-step-ahead predictions from Eq. 5 can be approximated by a Gaussian distribution, whose mean and variance are given in Eqs. 10 and 11, respectively. Note that the mean varies for each subject and for each period, depending on the previous realized price and short-run forecast, while the variance is constant over time. Note also that Eq. 10 fixes the interval of variability of the parameter, $\phi_s > -1$. The mean can be rewritten as follows:

$\mu_{t-1} = (1 - \alpha_i)\, p_{t-1} + \alpha_i\; {}_ip^e_{t-1,t-1},$

where $\alpha_i = \dfrac{{}_i\phi_s}{1 + {}_i\phi_s}$. The mean of the distribution turns out to be a convex combination of the previous realized price and the previous individual short-run forecast. Estimating ${}_i\phi_s$ gives us information on how subjects adjust their short-run expectations over time. In Appendix C, we include the details of the estimation procedure. Depending on the value of ${}_i\phi_s$, a subject can adjust his short-run predictions attaching a higher weight to the realized price or to the past forecast: (i) for $0 < {}_i\phi_s < 1$, the subject on average adjusts his forecast following a smooth correction towards $p_{t-1}$; (ii) a value of ${}_i\phi_s > 1$ implies a smooth adjustment towards the past forecast ${}_ip^e_{t-1,t-1}$; (iii) for $-\tfrac{1}{2} < {}_i\phi_s < 0$ there is an overcorrection, i.e. the agent revises his expectations in the opposite direction from the past forecasting error. Table 1 shows how many of the subjects follow a particular adjustment behavior, according to the estimated value ${}_i\hat{\phi}_s$.$^{12}$ In many instances we observe a negative value of $\hat{\phi}_s$ (18 out of 40 subjects), which indicates that the overcorrection adjustment is quite a common behavior.
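The mapping from ${}_i\phi_s$ to the three adjustment behaviors can be made concrete with a short sketch. The function names and the example numbers (a last price of 50 and a last forecast of 52) are hypothetical; the weight $\alpha_i = {}_i\phi_s/(1+{}_i\phi_s)$ is placed on the past forecast, which is the reading consistent with cases (i) to (iii).

```python
def alpha_from_phi(phi_s):
    """Weight on the past forecast implied by phi_s (requires phi_s > -1)."""
    if phi_s <= -1.0:
        raise ValueError("phi_s must be larger than -1")
    return phi_s / (1.0 + phi_s)

def mean_forecast(p_last, e_last, phi_s):
    """Mean of the approximating Gaussian: (1 - alpha) * price + alpha * forecast."""
    a = alpha_from_phi(phi_s)
    return (1.0 - a) * p_last + a * e_last

def adjustment_type(phi_s):
    """Classify a subject according to cases (i)-(iii) in the text."""
    if 0.0 < phi_s < 1.0:
        return "smooth correction towards the last price"
    if phi_s > 1.0:
        return "smooth adjustment towards the past forecast"
    if -0.5 < phi_s < 0.0:
        return "overcorrection"
    return "not covered by cases (i)-(iii)"

# Hypothetical subject: last price 50, last forecast 52 (forecast was too high).
for phi in (0.12, 1.5, -0.3):
    print(phi, adjustment_type(phi), round(mean_forecast(50.0, 52.0, phi), 3))
```

With a negative ${}_i\phi_s$ the weight $\alpha_i$ is negative, so the mean forecast moves below the last price even though the previous forecast was above it, which is the overcorrection described in case (iii).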
Interestingly, in 26 out of 40 cases, subjects weight the last realized price more than their last one-step-ahead prediction. Figure 10 shows the distribution of the estimated values of ${}_i\hat{\phi}_s$ (panel a) and $\hat{\gamma}_i$ (panel b).

Fig. 11 Simulation results of the HSM and the EEA. The continuous black line is the experimental realized price, the blue line is the price generated using the HSM and the dashed grey line is the price generated using the EEA. The simulated data are an average over 100 Monte Carlo iterations of the EEA and HSM
For the long-run expectations, we estimate the individual values of ${}_i\phi_l$, common to the two-, three- and four-step-ahead predictions of a given subject, and use the value of $\gamma_i$ obtained from short-run expectations (see Appendix C for the details). Figure 10 (panel c) shows the distribution of the estimated values of ${}_i\phi_l$. Note that we obtain only positive values of ${}_i\phi_l$. Although we do not have a closed-form solution for the probability distributions of the long-run actions (see Appendix C), such consistently positive estimates of ${}_i\phi_l$ indicate that subjects give more weight to their own past long-run expectations when forming their long-run expectations than they give to their past short-run forecasts when forming their short-run expectations.

Performance of the HSM and EEA in describing short-run expectations
Having described the two algorithms, in this section we compare their performance in replicating the experimental results. We calibrate the HSM using the experimental prices in the first three periods, since three prices are needed to compute some of the heuristics. At the beginning of the simulation, we assign the same weight to each rule, i.e. $n_{h,1} = 0.25$, $\forall h$. Starting from period 3, we compute the fitness measures and the weights $n_{h,3}$ associated with each heuristic. For the subsequent periods, we iterate the HSM algorithm detailed in the previous section. We simulate the EEA using the median values of the estimated parameters ${}_i\phi_s$ and $\gamma_i$, which turn out to be 0.12 and 0.57, respectively. We calibrate the EEA using the experimental individual predictions and the first three realized prices, since the range of the action sets depends on the three past realized experimental prices. To compute the price using the EEA, in each period we draw the individual actions from the distributions of Eq. 5. We then use Eq. 1 to compute the price and iterate the algorithm.
As we can see from Fig. 11, the two algorithms reproduce the experimental prices reasonably well. For a more quantitative comparison, Table 2 shows the Mean Squared Error (MSE) of the two algorithms in forecasting the experimental prices. We observe that the errors are of the same order of magnitude, so we can safely conclude that the EEA achieves results similar to those of the HSM in replicating the experimental prices.

The next step is to test the performance of the two algorithms in replicating coordination and convergence of individual expectations. To analyze the coordination of expectations, we compute the standard deviation among the individual one-step-ahead predictions in each group of the EEA. In the case of the HSM, the procedure is not straightforward, since the HSM does not replicate the individual predictions but rather the frequencies in the use of the heuristics across a population of agents. We therefore compute the standard deviation of the expectations considering the frequencies in the use of heuristics, as given by Eq. 8. Figure 12 shows the comparison of the two algorithms with respect to the experimental data. Note that the first three periods coincide with the experimental data because of the calibration procedure of the two algorithms. Figure 13 shows good agreement between simulated and experimental data. The qualitative behavior of the experimental data is well captured by the simulated results of the two algorithms, without any systematic difference.
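The two summary statistics used in this comparison can be sketched as follows. The MSE is the standard per-period mean squared difference between simulated and experimental series. For the HSM dispersion, the sketch assumes one natural reading of "considering the frequencies in the use of heuristics": a standard deviation of the four rule forecasts weighted by the shares $n_{h,t}$. The function names and example values are hypothetical.

```python
import numpy as np

def mse(simulated, observed):
    """Mean squared error between a simulated and an experimental series."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((simulated - observed) ** 2))

def weighted_std(forecasts, shares):
    """Standard deviation of rule forecasts weighted by rule frequencies.

    ASSUMPTION: this is one possible way to turn heuristic frequencies into a
    dispersion measure; the paper's exact formula is not reproduced here.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    shares = np.asarray(shares, dtype=float)
    shares = shares / shares.sum()                  # normalise, just in case
    mean = np.sum(shares * forecasts)
    return float(np.sqrt(np.sum(shares * (forecasts - mean) ** 2)))

rule_forecasts = [50.65, 50.6, 49.7, 50.0]          # hypothetical ADA, WTR, STR, LAA
dispersion = weighted_std(rule_forecasts, [0.4, 0.3, 0.2, 0.1])
error = mse([50.2, 50.9, 51.4], [50.0, 51.0, 52.0])
```

With equal shares the weighted measure reduces to the ordinary population standard deviation of the four forecasts, which gives a quick sanity check on the implementation.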
Comparing the results of Table 2 to the performance of the HSM reported in other papers (Anufriev et al. 2013), we can see that the MSEs are of the same order of magnitude.$^{13}$ When filtering our experimental prices with the HSM, we obtain essentially the same aggregate results as in other LtFEs. We infer that eliciting long-run expectations does not affect the price dynamics with respect to the baseline LtFE, i.e. when eliciting just one-step-ahead expectations. We additionally estimate a three-parameter linear individual forecasting rule for short-run expectations following eq. (9) in Heemeijer et al. (2009), who classified the subjects according to the significant coefficients of the regression. Such a comparison is possible, since the two experiments use comparable price equation mechanisms, fundamental prices and time series properties of the price dynamics. We observe that our subjects follow a very similar distribution among the different categories.$^{14}$ The similarity in the categorization of subjects constitutes further evidence of the comparability of our experiment to other similar LtFEs without the elicitation of long-run expectations. Moreover, Colasante et al. (2018) have shown, using a panel regression, that one-step-ahead forecasts do not significantly depend on past long-run predictions. These three pieces of evidence, namely similar performance in forecasting the price under the HSM, a comparable categorization of the subjects under linear regression rules and the absence of a significant correlation of past long-run expectations with short-run predictions, allow us to rule out a significant impact of inter-temporal hedging strategies followed by the subjects on short-run forecasting and price dynamics, and thus to reject Hypothesis 5.

13 The MSE reported in those papers is approximately 0.019 per period.

14 Details on the estimation results and comparative analysis are available from the authors upon request.

Long-run expectations in EEA
To the best of our knowledge, this is the first attempt in the literature on LtFEs to reproduce individual long-run expectations using a learning algorithm. We simulate the EEA using the median value of the estimated parameters ${}_i\phi_l$, which turns out to be 1.06. For $\gamma_i$, we assume that it is equal to the value estimated from the time series of short-run predictions, as stated in Eq. 6.
We analyze the performance of the EEA in replicating the main properties observed in the experimental data: (i) the time series of individual and aggregate expectations, (ii) the role of the realized price in forming long-run expectations, (iii) the coordination of long-run expectations as a function of the time horizon and, finally, (iv) the convergence of long-run expectations to the fundamental value. Note that here we do not have an aggregate variable such as the realized price, but just individual long-run predictions. Figures 14, 15 and 16 show the simulated individual long-run expectations compared with the experimental data of three representative subjects belonging to groups 1, 4 and 5. Each line represents the predictions submitted by the subject in period $t$ for the price 2, 3 and 4 periods ahead. In other words, each series represents, respectively, ${}_ip^e_{t,t+1}$, ${}_ip^e_{t,t+2}$ and ${}_ip^e_{t,t+3}$, $\forall t$. The EEA describes the individual long-run predictions fairly well. However, the simulated time series of expectations exhibit a "rougher" path compared to the smooth time series observed in the experimental data. We conjecture that the experimental expectations possess a higher degree of time correlation across time horizons than the data simulated by the EEA. Consider that, in the EEA, the three predictions (${}_i\tilde{a}^1_t$, ${}_i\tilde{a}^2_t$ and ${}_i\tilde{a}^3_t$) are independent draws from the corresponding probability distributions of actions, which evolve independently without any explicit conditional dependence.

Fig. 18 Box-plots of the correlation coefficients between the time series of prices and the individual subjects' expectations at different time horizons compared to the corresponding values for the simulated data. The simulated data reported are an average over 100 Monte Carlo iterations of the EEA
Our results speak in favor of the existence of a conditional dependence among the long-run expectations that can be implemented in future modifications of the EEA or, alternatively, in an "augmented" version of the HSM, accounting for long-run expectations.
We cannot rely on a benchmark from the literature to compare the performance of the EEA in describing long-run expectations, as we did with the HSM for short-run expectations. Therefore, we introduce an alternative formulation of the EEA in which long-run expectations are anchored to the fundamental value with a constant range. We label this formulation EEA($p_f$). More precisely, we modify Eq. 4 by substituting $p_{t-1}$ with $p_f$ and assuming ${}_i\phi_l = 0$ for all subjects. This alternative formulation of the EEA can be interpreted as a sort of "noisy rational expectations" benchmark. Table 3 shows the MSE as a measure of the performance of the two algorithms in replicating the across-subjects average long-run expectations for different time horizons. It is evident that the longer the horizon, the worse the performance of the EEA. Comparing the MSEs of the EEA and the EEA($p_f$), we observe that the EEA performs significantly better.$^{15}$ Figure 17 shows an example of a time series of average long-run expectations using the EEA and the EEA($p_f$). Note that the EEA($p_f$) generates a time series with narrow fluctuations around the fundamental value. The results included in Table 3 give further support for rejecting hypothesis 1 also for long-run expectations.
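The difference between the two formulations can be illustrated with a small sketch of the long-run choice probabilities. Everything specific here is an assumption: the fitness is taken to be `-|a - anchor| - phi_l * |a - e_prev|` (absolute distances, as the text states for long-run predictions), `e_prev` stands for whichever past long-run prediction enters Eq. 4, the grid has 101 points, and the fundamental value of 60 is a placeholder. The EEA($p_f$) variant is obtained by setting the anchor to `p_f` and `phi_l = 0`.

```python
import numpy as np

def long_run_probabilities(anchor, e_prev, phi_l, gamma,
                           half_range=15.0, n_actions=101):
    """Logit probabilities over a constant-width grid centred on the anchor.

    ASSUMPTION: fitness = -|a - anchor| - phi_l * |a - e_prev|; the grid size
    and the meaning of e_prev are illustrative, not taken from Eq. 4.
    """
    actions = np.linspace(anchor - half_range, anchor + half_range, n_actions)
    fitness = -np.abs(actions - anchor) - phi_l * np.abs(actions - e_prev)
    weights = np.exp(gamma * (fitness - fitness.max()))
    return actions, weights / weights.sum()

# EEA: anchored on the last realized price, with weight phi_l on the past forecast.
actions_eea, probs_eea = long_run_probabilities(anchor=50.0, e_prev=53.0,
                                                phi_l=1.06, gamma=0.57)

# EEA(p_f): anchored on a placeholder fundamental value, with phi_l = 0.
p_f = 60.0
actions_pf, probs_pf = long_run_probabilities(anchor=p_f, e_prev=53.0,
                                              phi_l=0.0, gamma=0.57)

mode_eea = actions_eea[probs_eea.argmax()]
mode_pf = actions_pf[probs_pf.argmax()]
```

Under these assumptions, a value of $\phi_l$ above one places the most likely action at the past long-run forecast rather than at the last price, whereas the EEA($p_f$) distribution is symmetric around the fundamental value regardless of the price history.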
The good performance of the EEA when compared with the EEA($p_f$) leads us to infer that the last price is a meaningful anchor in accounting for long-run expectations (at least up to four periods ahead). Figure 18 shows a stable correlation between the realized price and the expectations at different time horizons, with a fairly good description of the experimental coefficients. Figure 19 displays the standard deviation of individual predictions for the price 2, 3 and 4 periods ahead. It shows that the degree of coordination resulting from the simulated data is fairly close to the degree of coordination of the experimental data. Additionally, the EEA is able to reproduce the more persistent heterogeneity observed for the long-run predictions as compared to the one-step-ahead predictions. In order to measure the performance of the EEA in reproducing the convergence of individual expectations to the fundamental value, in Fig. 20 we compare the RMSE of simulated and experimental data. We conclude that the EEA is also able to replicate the lack of convergence.

15 A rank-sum test shows that the difference between the EEA and the experimental data is not statistically significant at the 5% level in 17 out of 21 cases, whereas, in the case of the EEA($p_f$), this ratio falls to three out of 21 cases (essentially group 6).

Conclusion
In this paper, we present the results of an LtFE where we elicit subjects' short and long-run expectations about the future prices of a financial asset, generalizing the usual focus of the existing literature on LtFEs. Our experimental results generalize previous findings showing that subjects' expectations are not consistent with the REE, either in the short or in the long run. The realized price dynamics exhibit in some cases an oscillatory or monotonic tendency to converge to the asset's fundamental value; in other instances the price seems to diverge. We show that the elicitation of subjects' long-run expectations does not have a significant impact on the price dynamics. On the contrary, we observe a clear influence of price dynamics on the formation and evolution of subjects' long-run expectations. Subjects learn to coordinate the entire spectrum of expectations. In particular, subjects anchor their short and long-run expectations around the last realized price. Short-run expectations coordinate faster than long-run expectations, most probably due to the higher degree of uncertainty faced by each subject when predicting the future behavior of the others.
In the second part of the paper, we introduce an adaptive learning algorithm in order to reproduce individual short and long-run expectations: the Exploration-Exploitation Algorithm. The algorithm incorporates the boundedly rational behavior of subjects by assuming that their expectations are centered on the last observed price and that their range varies according to the most recent price fluctuations. We can cast our algorithm into the well-known anchor and adjustment behavioral framework. In order to evaluate the goodness of fit of our algorithm, we compare it to the well-established Heuristic Switching Model proposed by Anufriev and Hommes (2012) to model short-run expectations and price dynamics. The computational part of the paper shows that the two learning algorithms perform equally well in describing the short-run dynamics of the experimental data. Although structurally different, the two algorithms share the same basic behavioral principle: anchor and adjustment. The fact that both satisfactorily describe the experimental data might signal that subjects follow a similar general heuristic principle when forming their expectations. Additionally, the good performance of the EEA in reproducing the long-run expectations dynamics generalizes and reinforces this conclusion. Interestingly, the simplicity of the EEA permits the use of the ML procedure to estimate the key parameters of the learning algorithm at the individual level, allowing for a behavioral characterization of the subjects.
Based on our experimental and computational analysis, it seems that the subjects condition their long-run expectations on shorter-term expectations. Such a feature can possibly be cast in a more general version of the HSM and/or the EEA, which should incorporate a conditional dependence of expectations on past prices and past individual expectations. Future research will focus on incorporating a more complex dependence of current expectations on past expectations in order to take into account the empirical evidence of our experiment. Moreover, the flexibility of our experimental setting allows us to include other information sources, such as aggregate information on subjects' long-run expectations, public announcements of policy measures (monetary policies with or without a targeted level of inflation) or future changes of the fundamentals.

In order to illustrate the ML procedure to compute the individual values of $\phi_s$ and $\gamma$, let us introduce a parametrization for the action as a function of the discrete index $J$:

Figure 21
where $N = 101$ is the number of steps into which the range of actions is divided and $\lambda = 5$. The width of the step is $2\lambda\sigma_{t-1}/N$. We approximate the discrete variable $a_{jt}$ by a continuous variable $y$, where $y \in [p_{t-1} - \lambda\sigma_{t-1},\ p_{t-1} + \lambda\sigma_{t-1}]$. We can then rewrite Eq. (3) in terms of $y$ and approximate the distribution of Eq. (5) by a Gaussian distribution, Eq. (15), where $\mu_{t-1}$ and $\sigma_i$ are given by Eqs. (10) and (11), respectively. Essentially, the conditional probability of realization of ${}_ip^e_{t,t}$ can be approximated by a Gaussian distribution multiplied by the width of the step of the range of the actions. The log-likelihood function can be easily computed:

$\ln P\left[{}_ip^e_{t,t} \mid p_{t-1},\, {}_ip^e_{t-1,t-1};\, {}_i\phi_s, \gamma_i\right] = \sum_{t=4}^{T} \ln(\lambda\sigma_{t-1}) + \frac{T-4}{2}\left[\ln\gamma_i + \ln(1 + {}_i\phi_s)\right] - \gamma_i\,(1 + {}_i\phi_s) \sum_{t=4}^{T} \left({}_ip^e_{t,t} - \mu_{t-1}\right)^2 + \text{const}. \quad (16)$

It is then straightforward to compute the estimators for ${}_i\phi_s$ and $\gamma_i$; the estimator of ${}_i\phi_s$ is:

${}_i\hat{\phi}_s = \dfrac{\sum_{t=4}^{T} \left({}_ip^e_{t,t} - p_{t-1}\right)\left(p_{t-1} - {}_ip^e_{t-1,t-1}\right)}{\sum_{t=4}^{T} \left({}_ip^e_{t,t} - {}_ip^e_{t-1,t-1}\right)\left({}_ip^e_{t-1,t-1} - p_{t-1}\right)}. \quad (17)$

The probability of the actions from Eq. (15) depends linearly on the width of the range. Note, however, that the individual estimators of $\gamma$ and $\phi_s$ do not depend on the range of the actions. Therefore, under the Gaussian approximation, we do not have to worry about the choice of the range. On the other hand, it is important to check that the experimental expectations submitted by the subjects lie in the range of actions, i.e. that $p^e_{t,t}$ can be expressed in terms of the continuous variable $y$. In our implementation of the EEA, this condition always holds, for short- as well as long-run expectations. In order to estimate the parameters governing the long-run expectations, we make use of a numerical optimization algorithm, computing the likelihood using Eq. (6). The main obstacle to arriving at a closed-form solution for the individual estimator of $\phi_l$ is the non-analyticity of the absolute value in Eq. (4).
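The closed-form estimator of Eq. (17) is straightforward to compute from a subject's forecast series and the realized prices, as the sketch below illustrates. Two points are assumptions rather than statements from the text: the arrays are aligned so that `forecasts[t]` is the one-step-ahead forecast for period `t` and `prices[t]` is the realized price of period `t`, with all available periods used instead of starting at $t = 4$; and the estimator of $\gamma_i$ is derived here from an assumed Gaussian variance $1/(2\gamma_i(1+{}_i\phi_s))$, since the corresponding formula is not shown in the text. Function names are hypothetical.

```python
import numpy as np

def estimate_phi_s(forecasts, prices):
    """Ratio-of-sums estimator of phi_s, following the structure of Eq. (17)."""
    f = np.asarray(forecasts, dtype=float)
    p = np.asarray(prices, dtype=float)
    x, e, p_last = f[1:], f[:-1], p[:-1]     # current forecast, past forecast, past price
    num = np.sum((x - p_last) * (p_last - e))
    den = np.sum((x - e) * (e - p_last))
    return num / den

def estimate_gamma(forecasts, prices, phi_s):
    """ASSUMED estimator: gamma = n / (2 * (1 + phi_s) * sum of squared residuals)."""
    f = np.asarray(forecasts, dtype=float)
    p = np.asarray(prices, dtype=float)
    x, e, p_last = f[1:], f[:-1], p[:-1]
    mu = (p_last + phi_s * e) / (1.0 + phi_s)   # mean of the Gaussian approximation
    return len(x) / (2.0 * (1.0 + phi_s) * np.sum((x - mu) ** 2))

# Noise-free check: forecasts generated exactly from the mean rule with phi_s = 0.12.
rng = np.random.default_rng(1)
prices = 50.0 + np.cumsum(rng.normal(0.0, 2.0, 200))   # hypothetical price path
forecasts = np.empty(200)
forecasts[0] = 50.0
for t in range(1, 200):
    forecasts[t] = (prices[t - 1] + 0.12 * forecasts[t - 1]) / 1.12
phi_hat = estimate_phi_s(forecasts, prices)
```

Algebraically, the ratio in Eq. (17) equals $\hat{\alpha}/(1-\hat{\alpha})$, where $\hat{\alpha}$ is the least-squares slope of ${}_ip^e_{t,t} - p_{t-1}$ on ${}_ip^e_{t-1,t-1} - p_{t-1}$, which is why the estimator does not depend on the range of the actions.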