Heuristic Switching Model and Exploration-Exploitation Algorithm to Describe Long-Run Expectations in LtFEs: a Comparison

We elicit individual expectations in a series of Learning-to-Forecast Experiments (LtFEs) with different feedback mechanisms between expectations and market price: positive and negative feedback markets. We implement the Exploration-Exploitation Algorithm (EEA) proposed by Colasante et al. (J Evol Econ 2018b. https://doi.org/10.1007/S00191-018-0585-1) and compare its performance with that of the Heuristic Switching Model (HSM) in replicating individual short- and long-run expectations. Moreover, we modify the existing version of the HSM in order to incorporate long-run predictions. Although the two algorithms provide a fairly good description of prices in the short run, the EEA outperforms the HSM in replicating the main characteristics of individual expectations in the long run, both in terms of coordination of individual expectations and convergence of expectations to the fundamental value.


Introduction
The origin of heterogeneity across individual expectations and the role that it plays in shaping aggregate outcomes is an important topic in theoretical as well as empirical research in macroeconomics. Unlike stock prices, volumes of sold books, interest rates of bonds, downloads or number of likes, which can be precisely measured and recorded in a world where almost "everything" is now in an electronic format, expectations are not directly observable. This is a significant limitation when it comes to fully understanding the precise role played by expectations in driving macroeconomic aggregates. One way to circumvent this problem and rigorously model the expectations of individuals is to assume consistent expectations, i.e. rational expectations, following the seminal idea of Muth (1961), which has been further developed by Lucas Jr and Prescott (1971). From a formal point of view, the main advantage of rational expectations is that agents can be rational in one way only. The argument put forward by Friedman and Friedman (1953) on the irrelevance of "irrational" individuals in the long run gives further intuitive appeal to the rationality assumption.
The main alternative to the rational expectations paradigm is the bounded rationality assumption for economic agents introduced by Simon (1966). In a world of boundedly rational agents, we typically lose the uniqueness of behavior, as agents can be "non-rational" in many different ways. Laboratory experiments have proven to be an essential methodology for shedding light on the degree of bounded rationality of individuals. Numerous experiments have shown that in complex environments, subjects follow simple adaptive rules, called heuristics, in order to form expectations, changing their mind as the environment evolves and adapting to the new circumstances. The principle of anchor and adjustment, introduced by Kahneman and Tversky (1973), is a sufficiently general and flexible framework that fits naturally into the bounded rationality paradigm and is able to realistically describe the way individuals form and adapt their expectations.
Laboratory experiments are one of the methodologies that allow us to directly elicit individual expectations using performance-based incentives. In particular, within the experimental literature, Learning-to-Forecast Experiments (LtFEs hereafter; see Marimon et al. 1993) are designed to study the formation of individual expectations within different expectations feedback systems in a market where the price depends on subjects' (short-run) predictions. Such an experimental framework makes it possible to study the conditions under which individual predictions converge to the rational expectations equilibrium (REE). Moreover, it enables efficient testing of alternative formulations of expectation formation models. A large number of LtFEs have been conducted to analyse the way individuals form and adapt their short-run expectations in different economic environments: in financial markets (Hommes et al. 2005), real estate markets (Bao and Ding 2016), commodity markets (Bao et al. 2013) and in simple macroeconomic frameworks (Assenza et al. 2011; Anufriev et al. 2013; Cornand and M'baye 2016). The vast majority of those LtFEs focus on eliciting individual short-run expectations, i.e. providing incentives to forecast the next period market price. The novel contribution of Colasante et al. (2018a, 2019) to this experimental literature is to focus on the entire spectrum of expectations, as they simultaneously elicit short- and long-run expectations about the evolution of the market price under positive and negative feedback systems. In particular, they explicitly give incentives to the subjects to submit their predictions regarding the market price at the beginning of every period, giving them the possibility to revise the predictions as new information becomes available. This experimental design makes it possible to study how expectations form and co-evolve with the price at different forecast horizons.
Their results concerning short-run predictions are in line with the literature (see Heemeijer et al. 2009): fast convergence and slow coordination in the markets with negative feedback, and slow convergence and fast coordination in the markets with positive feedback. Regarding the long-run predictions, Colasante et al. (2018a, 2019) observe that in markets with positive feedback the market price plays a pivotal role in the expectations formation process, whereas in markets with negative feedback subjects use the fundamental value as their main reference point. Only a few computational learning algorithms have been proposed in the literature to describe individual expectations in LtFEs (see Heemeijer et al. 2009; Assenza et al. 2011; Bao et al. 2013; Hommes and Lux 2013). The most commonly used is the so-called Heuristic Switching Model, HSM hereafter (see Brock and Hommes 1998). Using experimental data on long-run expectations, Colasante et al. (2018b) introduced an alternative adaptive learning model of bounded rationality, the Exploration-Exploitation Algorithm (EEA), which, contrary to the HSM, accounts simultaneously for subjects' short- and long-run expectations.
The aim of this paper is to evaluate the performance of the EEA and the HSM in reproducing the long-run expectations in markets with positive and negative feedback. Since the original version of the HSM can account for short-run expectations only, we introduce a modified version of the HSM to capture short- as well as long-run expectations. In the present work, we propose a very basic generalization of this algorithm based on the idea that subjects linearly extrapolate their short-run predictions. The aim of this simple modification is to provide a benchmark for understanding to what extent subjects implement such a simple extrapolation rule, or whether they take into account different information to make predictions at different forecast horizons. One can evaluate the performance of this benchmark to precisely measure deviations from the linear extrapolation hypothesis. We take the position that comparing the capability of different algorithms in describing the dynamical properties of the formation mechanism of individual expectations is a valuable contribution. Comparing the basic constituents of the algorithms helps to identify the key determinants in designing reliable models of expectations formation. Furthermore, it allows us to devise new models by possibly combining the crucial ingredients of the considered algorithms in a more efficient architecture to describe the data and to model expectations.
The paper is organized as follows. In Sect. 2, we illustrate the experimental setting of the LtFE and the experimental results. In Sect. 3, we describe the details of the two learning algorithms, namely, the HSM and its modified version and the EEA. In Sects. 4 and 5, we describe the simulation results and propose a modified version of the EEA to include a different anchor, respectively. Finally, in Sect. 6 we present the main conclusions.

Experimental Design
This paper builds upon the LtFEs of Colasante et al. (2018a, 2019). In this section, we briefly describe their novel experimental design to elicit subjects' expectations at different time horizons. In those LtFEs the authors elicit subjects' short- and long-run expectations in markets characterized by either a positive or a negative expectations feedback system. They conduct a total of 15 sessions: 7 with positive and 8 with negative feedback. In each session, 6 subjects play the role of professional forecasters for 20 periods. More precisely, at the beginning of period $t$, subject $i$ submits her short-run prediction for the price at the end of period $t$, denoted as ${}_i p^e_{t,t}$, as well as her long-run predictions for the price at the end of each of the $20 - t$ remaining periods, denoted as ${}_i p^e_{t,t+k}$, with $1 \le k \le 20 - t$. To compute the market price, the pricing equation proposed by Heemeijer et al. (2009) is implemented. In the positive feedback treatment, the law of motion of the price is given by Eq. (1), while in the negative feedback treatment the market price is computed according to Eq. (2). In both equations, $r = 0.05$ in all sessions and $p_f$ is the constant fundamental value in a given session. The term $\bar{p}^e_{t,t}$ in the equations is the average of the six one-step-ahead predictions submitted at the beginning of period $t$, $\bar{p}^e_{t,t} = \frac{1}{6}\sum_{i=1}^{6} {}_i p^e_{t,t}$. The main difference between Eqs. (1) and (2) is how expectations affect the market price: Eq. (1) describes a positive feedback system where subjects' predictions are self-fulfilling; i.e., the higher (lower) the average forecast, the higher (lower) the price. Equation (2) describes, instead, a system in which there is a negative feedback between expectations and price; i.e., the higher (lower) the average forecast, the lower (higher) the price. Even if the market price is solely determined by short-run predictions, all predictions are rewarded.
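The difference between the two feedback systems can be illustrated with a minimal Python sketch. It assumes a linear specification in the spirit of Heemeijer et al. (2009), in which the price deviates from $p_f$ by the discounted deviation of the average forecast, with a positive or negative sign depending on the treatment; the exact form of Eqs. (1) and (2) may differ. The function name `market_price`, the fundamental value of 60 and the six forecasts are hypothetical.

```python
import random

def market_price(avg_forecast, p_f, feedback, r=0.05, noise_sd=0.0, rng=random):
    """Price for one period under an ASSUMED linear feedback rule.

    feedback: "positive" (self-fulfilling) or "negative" (self-defeating).
    noise_sd: standard deviation of the idiosyncratic shock (0 disables it).
    """
    if feedback not in ("positive", "negative"):
        raise ValueError("feedback must be 'positive' or 'negative'")
    sign = 1.0 if feedback == "positive" else -1.0
    shock = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return p_f + sign * (avg_forecast - p_f) / (1.0 + r) + shock

# Hypothetical one-step-ahead forecasts of the six subjects in a group.
forecasts = [62.0, 64.0, 66.0, 68.0, 70.0, 72.0]
avg = sum(forecasts) / len(forecasts)

price_pos = market_price(avg, p_f=60.0, feedback="positive")
price_neg = market_price(avg, p_f=60.0, feedback="negative")
print(price_pos, price_neg)
```

With an average forecast above the fundamental value, the positive feedback price also lies above it, while the negative feedback price lies below it, which is the qualitative property described in the text.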
Individual earnings at the end of each period depend on the forecast errors and are computed as ${}_i\pi_t = {}_i\pi^s_t + {}_i\pi^l_t$, where ${}_i\pi^s_t$ denotes the payoff for the short-run predictions, Eq. (3), and ${}_i\pi^l_t$ denotes the subject's payoff that depends on the subject's long-run forecast errors, with ${}_i\pi^l_t = \sum_{j=1}^{t-1} {}_i\pi^l_{t-j,t}$. The term ${}_i\pi^l_{t-j,t}$ represents the individual profit associated with the prediction submitted by subject $i$ at the beginning of period $t - j$ for the price in period $t$, where $1 \le j \le t - 1$. The long-run predictions are rewarded according to the scheme in Eq. (4). Note that subjects receive immediate feedback on their short-run forecast accuracy, while they experience a delay in evaluating the accuracy of their long-run predictions. The final payment of each subject is the sum of payoffs across all periods. In the positive feedback treatment, subjects are informed about the value of the constant interest rate ($r$), the average dividend ($d$), the asset prices until period $t - 1$ and all their own (short- and long-run) past predictions. In the negative feedback treatment, subjects receive only information about their own past predictions and past market prices. As is typically done in the literature on LtFEs, the subjects receive some qualitative information on the feedback system between expectations and market price and on the profit functions. See Colasante et al. (2018a, 2019) for additional details about the experimental design. According to Eqs. (1) and (2), the rational expectations equilibrium (REE) predicts that the price $p_t$ converges to the fundamental value with fairly small fluctuations proportional to the idiosyncratic shock term $\varepsilon_t$. What would be the REE for long-run expectations? When eliciting long-run expectations, subjects submit at the beginning of period $t$ their predictions for the end of period $t + k$, for all $k > 0$. The price at the end of period $t + k$ is a function of the subjects' short-run predictions submitted at the beginning of period $t + k$.
Indeed, each subject has to guess, $k$ periods in advance, the short-run predictions of the other subjects. If we assume that all subjects follow rational expectations, then their predictions in each period $t$, and for each forecast horizon $k$, are ${}_i p^e_{t,t+k} \approx p_f$, independently of the expectations feedback system.

Experimental Results
In the following we summarize the experimental results in Colasante et al. (2018a). Figure 1 displays the dynamics of the market price in all 15 markets. As an illustrative example, Figs. 2 and 3 show the individual short-run predictions and the market price dynamics in two representative groups for each treatment. From a visual inspection of those figures, we observe that market prices follow qualitatively different patterns, depending on the feedback mechanism implemented in the particular treatment. In the positive feedback treatment, short-run predictions coordinate after only a few periods, although not necessarily around the fundamental value. The market price exhibits an oscillatory pattern without converging to the fundamental value. Meanwhile, in the negative feedback treatment, we observe that market prices quickly converge to the fundamental value after a few periods of uneven fluctuations, while individual short-run predictions need more periods to coordinate on the fundamental value. To quantify the convergence of subjects' expectations, the Mean Absolute Deviation (MAD) between individual predictions and the fundamental value is computed for both short- and long-run predictions at forecast horizons $k = 0, 1, 2, 4, 6, 9$, where the notation $\langle \cdots \rangle_g$ denotes the average across groups. To quantify the degree of coordination, the MAD between individual predictions and the (within-group) average prediction is computed for each period $t$ and for a given forecast horizon $k$, with $k = 0, 1, 2, 4, 6, 9$ (Eq. 6). Figure 6 summarizes the main results in terms of convergence and coordination. A comparison between treatments in terms of convergence leads to a well-known conclusion in line with the LtFE literature (see Heemeijer et al. 2009): the predictions for different forecast horizons converge to the fundamental value only in the negative feedback treatment, while, in the positive feedback treatment, predictions systematically deviate from the fundamental value.
Focusing on coordination of expectations, in the positive feedback treatment, subjects' one-step-ahead predictions coordinate faster than in the negative feedback treatment, whereas for the long-run predictions the disagreement among forecasts increases with the horizon, i.e., the longer the forecast horizon, the higher the dispersion of predictions. This is strongly connected to the absence of a long-term anchor for subjects' expectations. In fact, the long-run predictions are characterized by a cone-shaped trajectory, compatible with subjects using a linear trend extrapolation rule with heterogeneous slopes. In the negative feedback treatment, where subjects learn the fundamental value, coordination of short- and long-run predictions is driven by the convergence of the market price to the fundamental value. Subjects are able to learn the REE and, as a consequence, both price and expectations converge to it over time. It is only in the negative feedback system that the REE constitutes a good benchmark to describe the market price dynamics and the evolution of subjects' expectations.
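The convergence and coordination measures described above can be sketched as follows. This is a minimal illustration that assumes a plain (unnormalized) mean absolute deviation within each group, followed by an average across groups; any normalization used in the original measures is not reproduced. The function names and the sample forecasts are hypothetical and are not experimental data.

```python
def mad_to_fundamental(predictions, p_f):
    """Convergence: mean absolute distance of forecasts from the fundamental value."""
    return sum(abs(p - p_f) for p in predictions) / len(predictions)

def mad_to_group_mean(predictions):
    """Coordination: mean absolute distance of forecasts from their own average."""
    mean = sum(predictions) / len(predictions)
    return sum(abs(p - mean) for p in predictions) / len(predictions)

def average_across_groups(values):
    """Average of a per-group statistic over all groups of a treatment."""
    return sum(values) / len(values)

# Hypothetical forecasts of two groups for one period t and one horizon k.
group_a = [58.0, 59.0, 60.0, 61.0, 62.0, 60.0]   # dispersed around p_f
group_b = [70.0, 70.0, 70.0, 70.0, 70.0, 70.0]   # coordinated away from p_f
p_f = 60.0

convergence = average_across_groups(
    [mad_to_fundamental(g, p_f) for g in (group_a, group_b)])
coordination = average_across_groups(
    [mad_to_group_mean(g) for g in (group_a, group_b)])
print(convergence, coordination)
```

The second hypothetical group shows why the two measures are kept separate: its forecasts are perfectly coordinated (zero dispersion) even though they are far from the fundamental value, which is the pattern reported for the positive feedback treatment.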

Learning Algorithms: HSM and EEA
In this section we describe the two learning algorithms that we use to reproduce the experimental data.
The HSM is a well-established learning algorithm proposed to explain subjects' behavior in many LtFEs (see Assenza et al. 2014; Hommes, forthcoming). According to this model, subjects forecast future prices by selecting, among a given set of simple heuristics, the prediction rule that performed best in the recent past. The process of learning is based on the feedback they receive from their profits, which constitutes a fitness measure to rank those rules. In every period, each subject selects a rule with a probability that increases with its fitness measure, so that the best performing rule has a higher chance of being selected. Similarly to the HSM, the EEA is based on the very basic principle of anchor-and-adjustment (Kahneman and Tversky 1973). Differently from the HSM, which can be thought of as a "parametric" learning algorithm, the EEA allows a higher degree of flexibility in the range of possible prediction rules; see Colasante et al. (2018b). Its key learning mechanism is the identification of the range of feasible actions around the anchor, represented by the market price in the previous period. As more information on the price dynamics becomes available, each subject adjusts her own range of possible predictions and then selects her specific prediction according to the fitness measure. The process of selection consists of two main phases: (i) the exploration phase, in which subjects have scant information and try to form their expectations about the future evolution of the market price and (ii) the exploitation phase, in which subjects refine their predictions in order to locally "optimize" their performance. Note that in the EEA we are not imposing any precise parametrization of the subjects' predictions, which are free to vary in a range and whose evolution adapts to the past market conditions.

The Heuristic Switching Model
In the following, we list the four heuristics of the HSM as introduced by Bao et al. (2012). We label the four rules with the index $h = 1, \ldots, 4$ and denote the corresponding price forecast by ${}_h p^e_{t,t}$. Note that the left sub-index $h$ now denotes the heuristic instead of the subject. The heuristics are:
• Adaptive rule (ADA)
• Trend following rule (TFR)
• Contrarian rule (CR)
• Learning and adjustment rule (LAA)
The learning mechanism is based on the possibility for the subjects to switch among the four given rules, choosing the one that provided the highest relative profitability in the recent past. The performance of each heuristic is computed from its past forecast errors, where the parameter $0 \le \eta \le 1$ represents the "memory" of agents, i.e., the weight assigned to past errors. We set $\eta = 0.7$, following Hommes (2013). It is assumed that only a fraction of agents changes rules in every period. The share of agents using a specific rule is computed by using the discrete choice model with asynchronous updating, as in Diks and Van Der Weide (2005), where $0 < \delta \le 1$ denotes the share of agents that update their choice, the parameter $\beta \ge 0$ represents the intensity of choice, which determines the switching speed to the most successful rule, and $Z_{t-1}$ is a normalization factor. As in Hommes (2013), we consider $\delta = 0.9$ and $\beta = 0.4$. We compute the expected price as a weighted average across the different expectations given by the four rules. Plugging the value of $\bar{p}^e_{t,t}$ into Eqs. (1) and (2), we compute the market price in the positive and negative feedback treatments, respectively.
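One switching step of the HSM can be sketched as follows. The functional forms of the four heuristics, the squared-error fitness with memory $\eta$ and the exponential (logit) weights are assumptions made for illustration, loosely following the verbal description above and the standard heuristic switching literature; they are not guaranteed to coincide with the specification used in the paper. The coefficient values are the calibrated ones reported in the text, while all function names and the sample prices are hypothetical.

```python
import math

ALPHA, W, S = 0.63, 0.44, -0.44     # ADA, TFR and CR coefficients from the text
ETA, DELTA, BETA = 0.7, 0.9, 0.4    # memory, updating share, intensity of choice

def heuristic_forecasts(prices, last_ada):
    """Forecasts of the four rules (ASSUMED functional forms)."""
    p1, p2 = prices[-1], prices[-2]
    trend = p1 - p2
    anchor = 0.5 * (sum(prices) / len(prices) + p1)
    return [
        ALPHA * p1 + (1 - ALPHA) * last_ada,  # ADA: revise the previous forecast
        p1 + W * trend,                       # TFR: extrapolate the last change
        p1 + S * trend,                       # CR: bet against the last change
        anchor + trend,                       # LAA: moving anchor plus last change
    ]

def update_performance(perf, forecasts, realized):
    """Fitness: minus squared error plus discounted past fitness (ASSUMED)."""
    return [-(realized - f) ** 2 + ETA * u for u, f in zip(perf, forecasts)]

def update_shares(shares, perf):
    """Discrete choice with asynchronous updating: a share DELTA of agents
    re-chooses with logit weights, the remainder keeps its previous rule."""
    top = max(perf)                                   # shift for numerical stability
    weights = [math.exp(BETA * (u - top)) for u in perf]
    z = sum(weights)
    return [DELTA * w / z + (1 - DELTA) * n for w, n in zip(weights, shares)]

def expected_price(shares, forecasts):
    """Share-weighted average of the four heuristic forecasts."""
    return sum(n * f for n, f in zip(shares, forecasts))

# Hypothetical history: three past prices and the last adaptive forecast.
prices = [50.0, 52.0, 55.0]
forecasts = heuristic_forecasts(prices, last_ada=54.0)
perf = update_performance([0.0] * 4, forecasts, realized=56.5)
shares = update_shares([0.25] * 4, perf)
print(shares, expected_price(shares, forecasts))
```

Subtracting the maximum fitness before exponentiating does not change the resulting shares; it only avoids numerical underflow when squared errors are large.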
In this paper, we generalize the HSM in order to account for the long-run expectations. This constitutes a first attempt to make the HSM, which is usually implemented to explain short-run expectations, suitable for generating long-run predictions as well. Also in this case, the left sub-index $h$ denotes the heuristic instead of the subject, so, for instance, ${}_1 p^e_{t,t+k}$ denotes the individual prediction submitted in period $t$ for period $t + k$ generated according to rule number 1. We simply assume that the subjects linearly extrapolate their short-run prediction rules to determine their long-run expectations, which yields the following rules:
• Long-run adaptive rule (L-ADA)
• Long-run trend following rule (L-TFR)
• Long-run contrarian rule (L-CR)
• Long-run learning and adjustment rule (L-LAA)
We assume that, once an agent selects one of the short-run prediction rules, she extrapolates that rule linearly in order to predict prices at horizons up to four steps ahead.
In other words, we assume that the selection of the rule at time $t$ is based on the feedback from the profits associated with the short-run predictions only. Our simple modification of the HSM to account for long-run predictions constitutes a first step in this direction in the literature on LtFEs. It is based on simple intuition and is easy to implement.
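The idea of linear extrapolation can be illustrated with a minimal sketch for the two trend-based rules. The sketch assumes that a forecast $k$ periods beyond the one-step-ahead horizon scales the trend term by $k + 1$; this particular form is an assumption for illustration and may differ from the long-run rules used in the paper. The function name `extrapolate` and the sample prices are hypothetical.

```python
def extrapolate(p_last, p_prev, coeff, k):
    """Linearly extrapolated forecast for horizon k (k = 0 is one step ahead).

    ASSUMPTION: the trend term of the short-run rule is scaled by (k + 1).
    """
    return p_last + (k + 1) * coeff * (p_last - p_prev)

horizons = range(4)                  # one to four steps ahead
p_last, p_prev = 55.0, 52.0          # hypothetical last two prices

trend_path = [extrapolate(p_last, p_prev, 0.44, k) for k in horizons]
contrarian_path = [extrapolate(p_last, p_prev, -0.44, k) for k in horizons]

# Distance between the two rules widens with the horizon (cone shape).
spread = [a - b for a, b in zip(trend_path, contrarian_path)]
print(trend_path, contrarian_path, spread)
```

Under this assumption, agents using rules with different slopes produce forecasts that fan out as the horizon grows, which is consistent with the cone-shaped pattern of long-run predictions described in the experimental results.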

The EEA: the Exploration-Exploitation Algorithm
The EEA, as outlined in Colasante et al. (2018b), is a "non-parametric" learning algorithm implemented to characterize the dynamics of both short- and long-run expectations in LtFEs. We assume that each subject, represented by an artificial agent, has a set of available actions in every period. To compute short-run predictions, the range of actions of the artificial agents evolves adaptively as a function of the past prices and expectations. In particular, the set of actions is centered on the last realized price, and its range, which represents the exploration space, is proportional to the standard deviation of the past market prices. The probability of choosing a particular action follows a given distribution, whose mean and standard deviation depend on past prices. During the experiment, the range of variability of subjects' predictions diminishes over time, from a very high level of dispersion in the first few periods to a very narrow band at the end of the experiment (see Figure 6b). Within the EEA, the behavior of the subjects can be cast into two distinct phases: (i) the exploration phase, i.e., the tendency to explore the range of actions to acquire knowledge on their environment and (ii) the exploitation phase, i.e., learning to coordinate their actions using the acquired information on the price-generating process in order to gain higher profits. In the second phase, the range of actions significantly reduces around the last market price. Let us formalize the EEA. All agents are characterized by a set of $n = 101$ feasible actions, $A_t = \{a_{1t}, a_{2t}, \ldots, a_{nt}\}$, where $a_{1t} < a_{2t} < \cdots < a_{nt}$ and $a_{jt}$ denotes a single element in the set. The set $A_t$ changes every period depending on the last realized price and the magnitude of the fluctuations of past prices.
More precisely, the range of the set of actions lies in the interval $(p_{t-1} - 5\sigma_{t-1},\, p_{t-1} + 5\sigma_{t-1})$, and it is common to all agents, where $\sigma_{t-1}$ is the standard deviation of the last three market prices. In each period $t$, agent $i$ selects an action ${}_i\tilde{a}_t \equiv {}_i p^e_{t,t}$ from $A_t$, which corresponds to its price forecast for the end of period $t$, i.e., its one-step-ahead prediction.
Agent $i$ chooses three additional actions ${}_i\tilde{a}^k_t \equiv {}_i p^e_{t,t+k}$ from the corresponding sets of actions $A^k_t = \{a^k_{1t}, a^k_{2t}, \ldots, a^k_{nt}\}$, where $k \in \{1, 2, 3\}$. They represent the spectrum of agent $i$'s long-run expectations up to four steps ahead. The interval of variability of $A^k_t$ is centered on $p_{t-1}$, similarly to the case of short-run predictions, with a constant width independent of the evolution of the price. The maximum range for the agents' long-run predictions is 15, similar to the maximum deviation of a long-run prediction that is still rewarded with a positive profit in the experiment; see Eq. (4). The range of actions that belong to $A^k_t$ is therefore $(p_{t-1} - 15,\, p_{t-1} + 15)$. Once all agents choose their actions, the price is computed according to Eqs. (1) and (2).
Agents then evaluate the performance of all feasible actions using a fitness function that accounts for the last realized price as an anchor and for their last predictions. We introduce two different measures: $V_t$ to evaluate the individual actions in the set $A_t$ (short-run predictions) and $V^k_t$ to evaluate the individual actions in the sets $A^k_t$ (long-run predictions). We can interpret these fitness measures as a sort of adaptive adjustment of the expectations. The parameters $\phi_s$ and $\phi_l$ constitute the relative weight assigned to the past prediction. Note that, in the fitness functions, we reproduce the structure of the payoff functions in Eqs. (3) and (4), i.e. a quadratic term to evaluate short-run predictions, and a term proportional to the absolute distance to evaluate the subject's long-run predictions. We then introduce a probability distribution over the range of the possible actions of each agent $i$. Essentially, all agents have the same set of actions; however, the fitness measures and the associated probability distributions differ among agents, depending on their individual past performance. Let ${}_i P_{jt}$ be the probability that agent $i$ selects action ${}_i a_{jt}$ (i.e., a short-run prediction) from the set $A_t$, such that $0 \le {}_i P_{jt} \le 1$ and $\sum_{j=1}^{101} {}_i P_{jt} = 1$. The probability of selecting an action ${}_i a_{jt}$ is given by Eq. (19), where $\gamma \in [0, \infty)$ represents the intensity of choice. It determines the way an agent ranks the relative performance of its actions. An analogous formula, Eq. (20), applies to the long-run predictions. According to the probability distributions in Eqs. (19) and (20), in period $t$ each agent randomly chooses four actions, ${}_i\tilde{a}_t$ and ${}_i\tilde{a}^k_t$ with $k \in \{1, 2, 3\}$ (one short-run prediction and three long-run predictions).
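The short-run part of the EEA can be sketched in a few lines. The sketch relies on several assumptions that are not confirmed by the text: the 101 actions are equally spaced, the standard deviation is the population one, the fitness is minus the squared distance to the last price plus $\phi_s$ times minus the squared distance to the last prediction, and the choice probabilities are proportional to $\exp(\gamma V)$. All function names and the sample numbers are hypothetical.

```python
import math
import random
import statistics

def action_grid(last_prices, n=101, width=5.0):
    """Equally spaced actions (ASSUMED) centered on the last price, spanning
    +/- width times the standard deviation of the last three prices."""
    center = last_prices[-1]
    sigma = statistics.pstdev(last_prices[-3:])   # population std: an assumption
    lo = center - width * sigma
    step = 2.0 * width * sigma / (n - 1)
    return [lo + j * step for j in range(n)]

def choice_probabilities(actions, last_price, last_forecast, phi_s, gamma):
    """Probabilities proportional to exp(gamma * fitness) (ASSUMED form)."""
    fitness = [-(a - last_price) ** 2 - phi_s * (a - last_forecast) ** 2
               for a in actions]
    top = max(fitness)                            # shift for numerical stability
    weights = [math.exp(gamma * (v - top)) for v in fitness]
    z = sum(weights)
    return [w / z for w in weights]

def draw_forecast(actions, probs, rng):
    """Randomly select one action according to the choice probabilities."""
    return rng.choices(actions, weights=probs, k=1)[0]

# Hypothetical last three prices and last short-run forecast of one agent.
prices = [58.0, 61.0, 60.0]
grid = action_grid(prices)
probs = choice_probabilities(grid, last_price=60.0, last_forecast=62.0,
                             phi_s=0.4, gamma=0.4)
forecast = draw_forecast(grid, probs, random.Random(0))
print(forecast)
```

As past prices settle down, the standard deviation shrinks and the grid narrows around the last price, which mimics the transition from exploration to exploitation described above.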

HSM Calibration
In order to calibrate the extrapolative trend parameters of the heuristics, we run the regressions given by Eqs. (7) and (8) on the time series of predictions of each individual subject. For the adaptive rule, we obtain similar results in the two treatments (see Tables 1 and 2). Interestingly, in the negative feedback treatment, the majority of subjects adopt a contrarian behavior, i.e., subjects form their predictions by assigning a negative coefficient to the observed trend. To obtain the parameter values, we compute the average of all the individual estimated significant coefficients for each treatment. We use the value obtained in the positive feedback treatment for the TFR and the value obtained in the negative feedback treatment for the CR. In order to simulate the HSM, we use the following common specification: $\alpha = 0.63$, $w = 0.44$ and $s = -0.44$. Note that we do not implement exactly the heuristics described in the original paper by Anufriev and Hommes (2012) but rather the rules implemented in Bao et al. (2012).

EEA Calibration
We can estimate the main parameters of the EEA, namely $\phi_s$, $\gamma$ and $\phi_l$, for each subject using the maximum likelihood procedure. The two parameters related to the short-run predictions can be expressed in closed form, while for $\phi_l$ we rely on a numerical optimization. The probability distribution for one-step-ahead predictions of Eq. (19) can be approximated by a Gaussian distribution, whose parameters are given in Eq. (21). The mean ${}_i\mu_{t-1}$ is determined by the past prediction and the past market price with a given weight ($\phi_s$). The variance is invariant over time and depends on the given subject. By rearranging Eq. (21), it is possible to express the mean of the distribution as a convex combination of past expectations and past market price, where the assigned weight $\alpha_i$ is a function of ${}_i\phi_s$. According to the values of ${}_i\phi_s$, we can determine how subjects combine past predictions and prices in different ways to form their expectations. In particular, we identify the following behaviors: (i) $0 < {}_i\phi_s < 1$ translates into $0 < \alpha_i < 1/2$, so that subjects try to adjust their future predictions towards the market price; (ii) a value of ${}_i\phi_s > 1$ implies $\alpha_i > 1/2$, and, as a consequence, subjects forecast values close to previous predictions; (iii) for negative values of ${}_i\phi_s$, we infer that subjects "overcorrect" their past expectations by submitting predictions opposite to either the market price ($-\frac{1}{2} < {}_i\phi_s < 0$) or past predictions ($-\frac{1}{2} < {}_i\phi_s < -\frac{1}{3}$). Figure 7 shows the distribution of the estimated values of ${}_i\hat{\phi}_s$ (panel a), $\hat{\gamma}_i$ (panel b) and ${}_i\hat{\phi}_l$ (panel c) for both positive and negative feedback treatments.
In Table 3, we classify subjects according to the different behaviors. As can be seen, future predictions are computed differently depending on whether subjects participate in a positive or negative feedback treatment. In the positive feedback treatment, almost half of the subjects "overcorrect" their forecasts (18 out of 40) either towards $p_{t-1}$ or ${}_i p^e_{t-1,t-1}$. In the negative feedback treatment, the large majority of subjects (30 out of 44) adjust their predictions to be close to the market price. Interestingly, only a small minority (6 out of 44) take into account past expectations.
For the long-run expectations we estimate the individual values of ${}_i\phi_l$, common to the two-, three- and four-steps-ahead predictions of a given subject, and use the value of $\gamma_i$ obtained from short-run expectations. In order to simulate the EEA we use the following set of parameters: $\phi_s = 0.4$, $\phi_l = 1.4$ and $\gamma = 0.4$ for the positive feedback treatment; $\phi_s = 0.15$, $\phi_l = 1.4$ and $\gamma = 0.4$ for the negative feedback treatment. Note that those values are the median values of the estimated coefficients.
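The link between ${}_i\phi_s$ and the weight on past expectations can be made concrete with a small sketch. It assumes that the weight takes the form $\alpha_i = {}_i\phi_s / (1 + {}_i\phi_s)$, which is what a quadratic fitness with anchor $p_{t-1}$ and relative weight $\phi_s$ on the past prediction would imply, and which is consistent with cases (i) and (ii) above; the exact expression in the paper is not reproduced here. The function names and category labels are hypothetical.

```python
def weight_on_past_forecast(phi_s):
    """ASSUMED weight alpha = phi_s / (1 + phi_s) on the past prediction."""
    if phi_s == -1.0:
        raise ValueError("alpha is undefined for phi_s = -1")
    return phi_s / (1.0 + phi_s)

def classify(phi_s):
    """Coarse behavioral classification following cases (i) to (iii)."""
    if 0.0 < phi_s < 1.0:
        return "adjusts toward market price"
    if phi_s > 1.0:
        return "stays close to past prediction"
    if phi_s < 0.0:
        return "overcorrects"
    return "boundary case"

def predicted_mean(last_price, last_forecast, phi_s):
    """Mean forecast as a weighted combination of past forecast and past price."""
    alpha = weight_on_past_forecast(phi_s)
    return alpha * last_forecast + (1.0 - alpha) * last_price

# Median phi_s values reported for the positive and negative feedback treatments.
for phi in (0.4, 0.15):
    print(phi, weight_on_past_forecast(phi), classify(phi))
```

Under this assumption, both median values imply a weight on the past prediction below one half, so the simulated agents mainly adjust toward the last market price.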

Simulation Results
After a comprehensive description of the two algorithms, in this section we compare their performance in replicating the experimental results in markets with positive and negative feedback and for short-as well as long-run expectations.

Comparing the HSM and EEA to Describe Short-Run Expectations
In order to simulate the HSM, in the first three periods we use the experimental data to initialize the algorithm, assigning the same weight to each rule, i.e., $n_{h,1} = 0.25$ for all $h$. Starting from period three, we compute the fitness measure and the weights $n_{h,3}$ associated with each heuristic. For the subsequent periods we iterate the algorithm detailed in the previous section. In order to initialize the EEA, we use the experimental individual predictions and the first three realized prices, as the range of the actions depends on the (last) three past realized prices. Individual predictions are independent realizations from different distributions, so that we have six (the number of subjects in the group) different distributions and six short-term predictions in every period and for every group. Once short-term predictions are determined, we compute the market price by using either Eq. (1) or (2) according to the treatment the group belongs to. Figures 8 and 9 show the simulated market prices compared to the experimental market prices for all groups in the positive and negative feedback treatments, respectively. From a preliminary inspection, both algorithms perform well in replicating experimental prices in the two treatments. Table 4 displays the values of the Mean Squared Error (MSE) for the two algorithms. The ability to predict experimental prices is similar for both algorithms. Interestingly, we observe a systematically higher value of the MSE in the negative feedback treatment compared to the positive feedback treatment. Thus, it seems that the two algorithms perform better in capturing the price time series in the positive than in the negative feedback treatment. This is in line with the literature on the application of the HSM in reproducing the experimental data.
The two algorithms reproduce fairly well the stylized facts regarding the mutual coordination of expectations and convergence to the fundamental value described in Sect. 2.2. Figures 16 and 17 compare the experimental data with the simulated data from the two algorithms. Both algorithms capture the faster coordination of short-term predictions in the positive feedback markets as compared to the negative feedback markets. At the same time, the HSM and EEA can reproduce the convergence to the REE of the short-term expectations in the negative feedback treatment. In the positive feedback treatment, however, neither prices nor predictions converge to the fundamental value, which is well captured by the two algorithms.
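The comparison between simulated and experimental prices relies on the MSE, which can be computed as in the following minimal sketch. It assumes that the error is averaged over the periods of a single group; how the values are aggregated across groups in Table 4 is not specified here. The function name and the two short price series are hypothetical and are not experimental data.

```python
def mean_squared_error(simulated, experimental):
    """Average squared difference between two price series of equal length."""
    if len(simulated) != len(experimental):
        raise ValueError("series must have the same length")
    if not simulated:
        raise ValueError("series must not be empty")
    return sum((s - e) ** 2 for s, e in zip(simulated, experimental)) / len(simulated)

# Hypothetical price paths for one group (periods after initialization).
experimental = [60.0, 61.0, 63.0, 62.0]
simulated = [60.5, 61.0, 62.0, 63.0]

mse = mean_squared_error(simulated, experimental)
print(mse)
```

A lower value indicates that the simulated price path tracks the experimental one more closely, so the same function can be applied to the output of either algorithm for a like-for-like comparison.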

Comparing the HSM and EEA to Describe Long-Run Expectations
One contribution of the paper is the extension of the HSM to describe long-term expectations, with a simple linear extrapolative modification of the existing heuristics. The comparison with the EEA and its extension gives a rough idea of how well the modified HSM reproduces long-term expectations. This is a relevant step toward a reliable framework for modeling long-term expectations in a realistic environment.
To the best of our knowledge, this is the first attempt in the literature on LtFEs to reproduce individual long-run expectations using the HSM. Although in the experiment subjects submit in each period t their expectations for the remaining 20 − t periods, we replicate the individual expectations only up to four steps ahead. This choice is a good compromise between considering the whole time span and having a sufficient number of observations to analyze the properties of the two algorithms as a function of the forecast horizon and to compare them with the experimental data.
We study the performance of the two algorithms in replicating the main statistical properties of the experimental data, namely, (i) the time series of the individual long-term expectations, (ii) the coordination of long-term predictions as a function of the forecast horizon and (iii) the convergence of long-run expectations to the fundamental value. It is worth mentioning that for the long-run predictions we do not have an aggregate variable such as the market price, but only individual long-run predictions. Therefore, when necessary, we rely on the average of the long-run expectations. Figures 10 and 11 illustrate the evolution of the average long-run expectations (across subjects) for two, three and four steps ahead in the positive and negative feedback treatments. Once again, both algorithms seem to replicate the experimental data with reasonable accuracy. Figures 12, 13, 14 and 15 show the evolution of individual long-run expectations in a representative group in the two treatments and for the two algorithms. At first glance, those figures show that the individual long-run expectations generated by the two algorithms resemble the expectations elicited in the experiment. In particular, in the positive feedback treatment we observe the cone shape of the predictions submitted in a given period for different forecast horizons. In the negative feedback treatment, instead, we observe the dynamical process of convergence to the REE of short- as well as long-run predictions.
For a more quantitative comparison between the two algorithms, Tables 5 and 6 show the mean squared error of the simulated data (averaged over 100 Monte Carlo simulations) in describing the average long-run expectations. The EEA describes the long-run predictions significantly better than the modified HSM with the linear extrapolation heuristics.
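The error measure can be sketched as follows. This is a minimal illustration, assuming that the MSE is computed for each simulation run against the experimental series and then averaged across runs; the function names and the numbers are hypothetical and do not come from the experiment.

```python
def mean_squared_error(simulated, observed):
    """Mean squared distance between two series of equal length."""
    if len(simulated) != len(observed):
        raise ValueError("series must have the same length")
    return sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)


def monte_carlo_mse(runs, observed):
    """Average the per-run MSE over independent simulation runs."""
    return sum(mean_squared_error(run, observed) for run in runs) / len(runs)


# Hypothetical average long-run expectations (three periods) and two
# simulated runs; the paper uses 100 Monte Carlo simulations.
observed = [60.0, 62.0, 61.0]
runs = [
    [61.0, 62.0, 60.0],
    [60.0, 64.0, 61.0],
]
score = monte_carlo_mse(runs, observed)
```

An alternative reading would average the simulated series first and compute a single MSE afterwards; the two conventions generally give different numbers, so the one used should be stated alongside the tables.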
In Figs. 16 and 17, we compare the dynamical properties of the mutual coordination of long-run expectations and their convergence to the fundamental value. Figure 16 displays the standard deviation of individual predictions for the price two, three and four periods ahead. On first inspection, the two algorithms replicate the behavior of the experimental data fairly well, as the degree of coordination in the simulated data is fairly close to that in the experimental data. Additionally, the tendency of the dispersion to increase with the forecast horizon is systematic in both the negative and positive feedback treatments. In the positive feedback treatment, such linear behavior is similar to the empirical data. In the negative feedback treatment, however, the increase of the dispersion over the horizons is much less evident, with some periods showing no such systematic increasing tendency. Note that the HSM with the linear extrapolative trend for the long-run predictions has this characteristic built in. Any (linear) measure of dispersion of the predictions as a function of the time horizon will, therefore, exhibit a linear increase. This property is counterfactual in the negative feedback treatment. It is intuitive, in fact, that a simple linear extrapolation of prices cannot predict the convergence of prices to the fundamental value. Thus, our numerical exercise shows that the HSM should be modified by introducing more complex heuristic rules to account for the behavior of long-run expectations. Additionally, the EEA is able to reproduce the more persistent heterogeneity observed for the long-run predictions as compared to the degree of coordination of the one-step-ahead predictions.
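The built-in linear increase of dispersion can be checked with a minimal sketch. It assumes that each subject's k-step-ahead prediction equals a common anchor plus k times a subject-specific trend; this functional form, the function name `extrapolate` and all numbers are illustrative assumptions, not the exact specification of the modified HSM.

```python
import statistics


def extrapolate(anchor, trend, horizon):
    """Linear extrapolation: anchor plus horizon times the trend."""
    return anchor + horizon * trend


# Hypothetical common anchor and per-subject trends (six subjects).
anchor = 50.0
trends = [-1.0, 0.5, 2.0, 3.5, -0.5, 1.5]

# Cross-sectional standard deviation of predictions at each horizon.
dispersion = {
    k: statistics.pstdev([extrapolate(anchor, g, k) for g in trends])
    for k in (1, 2, 3, 4)
}
```

Because the anchor is common to all subjects, the standard deviation at horizon k equals k times the standard deviation at horizon one, whatever the values of the trends. Under this assumption the dispersion cannot shrink with the horizon, which is why such a rule cannot reproduce the pattern observed in the negative feedback treatment.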

EEA with the Fundamental Value as Anchor
Our analysis of how subjects form their expectations shows that they follow an anchor-and-adjustment mechanism, where the anchor depends on the feedback mechanism that drives the formation of market prices: in markets with positive feedback, subjects' expectations are driven by the past market price dynamics, whereas in markets with negative feedback, subjects learn the fundamental value and use it as an anchor to form their short-run expectations of the price. Table 7 illustrates our conclusion. We estimate two equations to explain individual short-run expectations for the positive and negative feedback treatments implemented in the laboratory experiments. In Model (1), we consider as explanatory variables the lagged values of individual short-run expectations and of the market price. Model (2), instead, includes the fundamental value among the explanatory variables. Model (2) clearly highlights the pivotal role of the fundamental value in determining short-run expectations in negative feedback markets, with a coefficient close to 1, compared to its role in the formation of short-run expectations in markets with positive feedback (with a coefficient of 0.09).
Focusing now on individual expectations in the long run, recall that the EEA was developed on the basis of the anchor-and-adjustment mechanism, and the choice of the right anchor is therefore crucial. Colasante et al. (2018b), using the market price as an anchor in markets with positive feedback, obtain a fairly good replication of the subjects' expectations elicited in the laboratory experiment. In their work, they also compare the performance of the EEA with an alternative model, the so-called noisy rational expectations, which is based on the idea that subjects have homogeneous rational expectations. 7 They conclude that this model provides a good approximation of the expectations elicited in the negative feedback treatment. However, even though (experimental) predictions in markets with negative feedback quickly converge to the fundamental value, they observe a persistent heterogeneity of expectations in the experimental data that is not found in the simulated expectations.
Table note: We compute the average (across groups) quadratic distance between the average long-run experimental and EEA simulated predictions. Predictions are computed using the last observed price as an anchor.
In order to reproduce the main features typically observed in (experimental) subjects' expectations regarding heterogeneity and convergence of long-run expectations, we implement an alternative version of the EEA, where we use the fundamental value as an anchor. 8 We then compute agents' individual predictions for different forecast horizons k, where k = 1, 2, 3, using the fundamental value (instead of past market prices) as an anchor. In other words, the fitness measure is computed using Eqs. (17) and (18) considering instead the fundamental value as a reference point as follows: Setting φ = 0, we are able to obtain heterogeneous predictions as a result of past individual expectations. Figures 18 and 19 show, as an example, the average of individual predictions two, three and four steps ahead in two representative groups in markets with negative and positive feedback. Table 8 displays the MSE comparing the experimental data and the simulated expectations in markets with positive and negative feedback. The results show that the EEA with the fundamental value as an anchor replicates the experimental data better, on average, in the markets with negative feedback than in the markets with positive feedback. These results provide further evidence that subjects form their expectations following different rules depending on the feedback system. In markets with negative feedback, subjects' expectations are mainly driven by the fundamental value, while in the positive feedback system the reference point is the market price.

Conclusion
In this paper we present the results of the application of two adaptive learning algorithms to describe the experimental data of LtFEs, in which we elicit short- and long-run expectations. In particular, we elicit subjects' predictions in two alternative environments: positive and negative feedback systems. The main difference between those expectations' feedback systems lies in the sign of the relation between expectations and market price. The descriptive analysis shows that the dynamical properties of the predictions are markedly different in terms of both the coordination and convergence of expectations to the fundamental value. In order to understand the process of expectations formation in both feedback systems, we consider two evolutionary learning algorithms: the Heuristic Switching Model and the Exploration-Exploitation Algorithm. The main difference between these algorithms is that the HSM can be defined as "parametric," meaning that it is based on a few predetermined heuristics, whereas the EEA is "non-parametric," so that the predictions are chosen according to a specific probability distribution. The two algorithms are based on the common principle of the anchor-and-adjustment rule. Regarding short-run predictions, we observe that both algorithms perform well in replicating individual predictions. In order to simulate long-run predictions, we have introduced a straightforward extension of the HSM: a linear extrapolation of short-run predictions across different horizons. A considerable difference emerges between the two algorithms. The linear extrapolation provides a good approximation of the main stylized facts in the positive feedback treatment. However, in the negative feedback system, the coordination and convergence properties are better explained by the EEA.
Moreover, we have performed an exercise to test whether a change in the anchor can lead to better results in replicating the experimental data. We have considered the fundamental value as an alternative anchor to the market price. This modification leads to better simulation results in the negative feedback system, while in the positive feedback market we obtain worse performance. The comparison between such different algorithms helps us to understand that subjects form their expectations using different anchors. The EEA is a good and flexible tool to replicate short- and long-run predictions in both positive and negative feedback systems. The linear extrapolation we have implemented in this paper as an extension of the HSM is not sufficiently flexible to capture the observed behavior in the different feedback systems. A more complex structure is needed to replicate the properties of long-run predictions, which is the focus of future research. Nevertheless, the simple benchmark model is a useful first step toward detecting deviations from pure linear extrapolation rules.
Table note: We compute the quadratic distance between the average experimental and EEA simulated predictions. Predictions are computed using the fundamental value as an anchor.

Fig. 20
Screenshot of the experiment