Markovian jump system approach for the estimation and adaptive diagnosis of decreased power generation in wind farms

: In this study, a Markovian jump model of the power generation system of a wind turbine is proposed and the authors present a closed-loop model-based observer to estimate the faults related to energy losses. The observer is designed through an ℋ ∞ -based optimisation problem that optimally fixes the trade-off between the observer fault sensitivity and robustness. The fault estimates are then used in data-based decision mechanisms for achieving fault detection and isolation. The performance of the strategy is then ameliorated in a wind farm (WF) level scheme that uses a bank of the aforementioned observers and decision mechanisms. Finally, the proposed approach is tested using a well-known benchmark in the context of WF fault diagnosis.


Introduction
Wind energy is considered as a powerful source of sustainable energy. However, wind turbines (WTs) are expensive systems and their maintainability and reliability must be high. Fouling of the rotor blades with ice or insects, as well as erosion, is an important root cause of faults and it is expected to increase in significance as more WTs are situated in locations with higher wind speeds (WSs) [1]. As stated in [2], the major problem related to debris build-up and erosion is the reduction of the overall WT aerodynamic efficiency leading to unpredicted reductions in power production. Besides possible human safety risks, debris build-up may also cause mass and aerodynamic imbalances, damaging all WT components.
As discussed in [3], the diagnosis of debris build-up can be done using direct or indirect measurements. Direct methods are based on the detection of some change of physical properties such as mass, electrical or thermal properties. Hence, they often require extra equipment, which augment the installation and maintenance costs and the weight and space requirements. For its part, indirect methods are mainly based on detecting the reduction in power production. These methods do not require extra hardware because they use WT control measurements. They require, however, the WTs to be in operation. Their main disadvantage is that energy losses maybe also the consequence of other phenomena which are, nonetheless, generally easily distinguishable from debris build-up. For instance, consider the occurrence of an electric fault in the WT generator. In this case, the deviation of the generator torque from its reference, which can be deduced from the generator torque measurement, provides an immediate and accurate diagnosis of the electric fault leading to its straightforward isolation from debris build-up or erosion [4].
The indirect fault diagnosis (FD) of decreased power generation due to debris build-up or erosion is one of the objectives of the realistic and widely accepted wind farm (WF) benchmark presented in [5]. The benchmark includes the WF power generation model which is based on the well-known power curve of a WT. Basically, two kinds of residuals are used in [6][7][8][9][10] to diagnose power reductions in the benchmark. At a WT level, temporal residuals represent the inconsistencies between the power generation model output and the generated power measurement of a single WT. At a WF level, spatial residuals represent the inconsistencies between the generated power measurements of different WTs. Borcehrsen et al. [6] utilise open-loop temporal residuals in dynamical cumulative sums. Alternatively, Duviella et al. [7] present a FD approach based on spatial residuals and Blesa et al. [8] compute both spatial and open-loop temporal residuals via non-linear parameter varying parity equations. For its part, Badihi et al. [9] merge both approaches. The authors use the generation model of the WTs to compute the power differences among the WF; then, they compare them to the measured power differences. They also present an approach which replaces the generation model by fuzzy inference mechanisms. Similarly, Simani et al. [10] present fuzzy and neural networks techniques. The power generation model of a WT is affected by various non-negligible disturbances: mean WS estimation errors, turbulences, vibrations and measurement noises [5]. Hence, the WT level open-loop temporal residuals in [6,8] may be significantly disturbed. The spatial residuals in [7][8][9] are not affected by mean WS estimation errors because they compare WTs working under the same wind conditions. For instance, the methods in [9] require that all the WTs in the WF operate under the same wind conditions. However, in most cases, the WTs operate in different conditions and WS estimation errors still affect the spatial residuals. In all, it is of interest to develop closed-loop strategies with a better performance w.r.t. disturbances at a WT level and at a WF level for groups of WTs affected not only by identical but also by similar wind conditions. Especially, if the FD objectives require not only the information about the appearance and the location of a fault (fault detection and isolation or FDI) but also about its size and shape (fault estimation or FE). FE is of paramount importance for both real-time decisions and active fault-tolerant control (AFTC) [11] such as power demand redistribution among the WF.
The closed-loop power generation system of a WT is a parametric loop because the WT operates in different WS zones and under different WT operating modes [12]. Moreover, the stochastic characteristic of the wind brings further difficulty to the design of closed-loop FE strategies. In the literature, many authors conclude that WS behaviour can be explained as a Markovian process [13,14]. This behaviour has been exploited in recent WT control schemes as the ones in [15][16][17]; however, up to date, there has been no work taking advantage of the Markovian behaviour of the wind to design FE strategies.
Motivated by the above background, in this work, we develop a Markovian jump system approach for closed-loop FE of decreased power generation in a WF. First, we develop a WT level approach; then, we extend it to the WF level for groups of nearby WTs working under similar wind conditions. FE techniques must be simultaneously robust against uncertainties and noises [18] and sensitive to faults through the accomplishment of the certain trade-off between these properties [19]. In this paper, we use the ℋ ∞ performance of the proposed model-based observer to characterise these properties and to set up an optimal observer design strategy based on linear matrix inequalities (LMIs). Providing a systematic performance-based optimal approach for tuning the FE observer is an advantage when compared to other algorithm designs, where some user expertise is necessary. For instance, numerical extensive simulations and trial and error procedures are necessary to tune the algorithms in [10].
To achieve FDI, we feed the fault estimates provided by the proposed observer into decision mechanisms based on thresholds. Model-based thresholding usually leads to too conservative results which are characterised by poor fault detectability and isolability [20], notably in the cases of Markovian jump systems [21]. Therefore, we utilise a data-driven FDI approach based on adaptive thresholds for evaluating the FE output provided by the modelbased observer. The proposed adaptive mechanism enhances a tight FDI adjustment in the different WS zones and WT operating modes when compared to the constant thresholds in [6,9,10].

Challenges
The proposed Markovian jump system approach for FE and adaptive FDI entails the following challenges.
1. Obtaining a linear parameter varying state-space model of the power generation system of a WT, which is based on nonlinear power curves and consists of two operation modes. To do so, we define a parameter vector containing the WS acting on a WT and the difference between the demanded and the generated power. In this work, the power curves in the WF benchmark [5] are utilised. In reality, these curves can be obtained through different methods such as random forest, method of bins, k-nearest neighbours or support vector regression [22]. 2. Modelling the stochastic behaviour of the parameters as Markovian processes. Obtaining a suitable partition of the parameter set taking its discontinuities into account. Computing the transition probabilities between subsets. 3. Building an augmented model-based observer for FE of decreased power generation in a WT. Designing the observer through a multiobjective optimisation problem that guarantees certain optimal trade-off between the robustness against disturbances/uncertainties and the sensitivity to faults. To do so, we use the ℋ ∞ performance of the observer, which we formulate using LMIs for Markovian processes and convex polytopic sets. 4. Building an adaptive threshold-based decision mechanism for FDI of decreased power generation in a WT. Proposing an algorithm for computing a tight adaptive threshold piecewise on the WT operation mode. Tuning the algorithm with faultfree datasets using data-driven techniques. 5. Extending the WT level model-based FE and data-driven FDI approach to the WF level. To do so, we group the WTs operating under similar wind conditions and we use a bank of the previous observers and decision mechanisms that automatically merges the information provided by the temporal and spatial relations in the WF according to the degree of shared uncertainty among the WTs.

Structure and notation
The remainder of this paper is organised as follows. Section 2 gives the problem formulation and Section 3 presents a Markovian jump modelling of the power generation system of a WT. In Section 4, we develop a WT level model-based FE strategy and, in Section 5, the FE output is utilised in a data-based FDI algorithm. Then, in Section 6, these strategies are extended to the WF level. Simulation results are reported in Section 8 followed by some concluding remarks in Section 9. Let a be some vector and A and B be some matrices. The size of a is denoted as n a and a[i] denotes the ith element in a. A ⪯ 0 means that A is negative semidefinite and similar applies to ⪰. The direct sum is represented as ⊕ and the Kronecker product is represented as ⊗. I n is the identity matrix of size n × n, 1 n × m is a matrix of ones of size n × m and 0 n × m is the zero matrix of size n × m (all of these matrices being of appropriate dimensions when the subindex is omitted). Expected value and absolute value are denoted by E{ ⋅ } and ⋅ . Let x k be a vector of stochastic signals at a sample k.

Wind farm benchmark description
Consider the WF benchmark [5] with 9 WTs of 4.8 MW and the layout shown in Fig. 1. We name the WTs according to the existing wind direction (see the example in Fig. 1) and we consider that the wind is perpendicular to the rows of WTs (which are numbered as i = 1, …, Y and Y is the number of rows for the considered wind direction) and parallel to the columns of WTs (which are numbered as j = 1, …, Z and Z is the number of columns for the considered wind direction). We denote the distance between two rows i and i + 1 as l i . There is a wind mast that measures the WS at row i = 0 (i.e. v 0 = v^0 + ṽ 0 , with v 0 being the WS at the wind mast, v^0 being its measurement and ṽ 0 being the corresponding sensor noise). The WS acting on the WT (i,j), denoted as v i, j , can be modelled as where v i is the mean WS acting on the WTs in the ith row and ṽ t i, j is a zero-mean turbulence component of known variance (σ t 2 = 0.2 m 2 /s 2 ).
The static available power in the WT (i,j), denoted as P s i, j , represents the theoretical maximal generated power in the WT and it depends on v i, j as shown in Fig. 2. This power curve is extensively used in monitoring of WFs (e.g. [23]) and it consists of four zones delimited by three different WSs: the cut-in (v cut-in =4 m/s), the rated (v rated = 12.5 m/s) and the cut-out (v cut-out = 25 m/s) WS [12]: In this zone, the aerodynamic torque is not sufficient to overcome the WT inertia and P s i, j = 0.
• Zone II or partial load region with v cut-in ≤ v i, j < v rated . Maximum power point tracking (MPPT) techniques are performed in this region. For MPPT, the pitch angle is held at zero degrees and the generator moment is adjusted to keep the power coefficient at a maximum value. Hence, P s i, j becomes a non-linear function of v i, j .
• Zone III or full load region with v rated ≤ v i, j < v cut-out . In this zone, the pitch angle is controlled to keep the static available power not higher than the WT nominal power (i.e. P s i, j = P nom ).
The turbine is pitched out to stop the rotation due to security reasons and P s i, j = 0.
According to [5], changes in the generated power will be instantaneous; thus, the static available power in the WT (i,j), denoted as P s i, j , is filtered by a first-order transfer function. This transfer function is characterised by the wind-dependent transfer coefficient τ s in Fig. 2. In all, the dynamic available power in a WT, named after P a i, j , is modelled as The WF controller feeds each turbine with the WT static power reference, denoted as P t . The WT also behaves as a low-pass filter to changes in the power reference; thus, the dynamic power reference of a turbine, P r , fulfils where τ r is a known transfer function coefficient (τ r = 1.2 rad/s).
Depending on the values of the dynamic available power P a i, j and the dynamic power reference P r , each WT operates in one of the following two main working modes: • If P a i, j < P r , the control objective is to maximise the amount of power harvested from the wind and, thus, the generated power equals the dynamic available power in the WT.
• If P a i, j ≥ P r , the control objective is to maintain the generated electrical power equal to the reference.
This behaviour can be modelled as where P i, j is the power generated by the turbine (i,j) and s i, j is a disturbance modelling the drive train oscillations that influence the electrical power generation (i.e. s i, j = γ p sin(σ p 2 π t) with γ p = 1000 W and σ p = 10 Hz). The additive element f i, j is the fault representing a decreased power generation and it is caused by changes in the aerodynamics of the WT due to phenomena such as debris build-up or blade erosion.
The power P i, j is measured by a sensor whose noise, w i, j , can be realistically described by a zero-mean Gaussian noise with variance σ w 2 = 2.5 ⋅ 10 5 W 2 . Denoting the generated power measurement as y i, j , we have

FE and FDI signals
The objective of this work is to detect, isolate and estimate the faults representing a decreased power generation in the WTs of the WF. The generated power measurements y i, j (i = 1, …, Y, j = 1, …, Z), the dynamic power reference P r and the WS measurement at the wind mast v^0, are available for FE and FDI (see Fig. 3). We assume that the WS at the WTs is not measured and, thus, v i, j must be estimated. As detailed in Appendix 1 in Section 12.1, the measurements v^0 can be used to estimate the mean WS v i through some propagation strategy. Denoting the propagated mean WS as v^i, it yields where ṽ p i is the propagation error (i.e. ṽ p i = v i − v^i), which derives from both the propagation model mismatch and the use of the noisy measurement v^0 in the propagation scheme. Note that the total WS estimation error verifies ṽ i, j = ṽ p i + ṽ t i, j , where the propagation error ṽ p i is common for all the turbines in the ith row while the It is well known that most continuous-time control systems are implemented digitally [24]. Thus, we present a discrete-time FE and FDI algorithm. The following assumption on the fault f i, j is made.
. Remark 1: Assumption 1 is common in FE and it considers faults whose variations are slow with respect to the dynamics of the system and it can cover the typical faults in engineering systems such as abrupt faults and incipient faults [24]. In any case, the strategies developed in this work can be easily extended to faults with more complex dynamics (cf. [25]. In this section, we develop the discrete state-space model of the power generation system described in Section 2. Let us first define with T s being the sampling time that we fix to 1 s. Introducing the state x i, j = P a i, j and the inputs u(v i, j ) = P s i, j (v i, j ) and r = P r ; it yields, with Δ i, j being the power difference defined as The faults verifying Assumption 1 can be modelled through where δ i, j is the fault variation. Defining the parameter vector For ease of readability, let us omit hereafter the dependence of the variables on the number of turbine (i,j) (e.g. z k stands for z k i, j ).
The extended system model (11) results in The system dynamics (12) depends on the system matrices A(θ k ), B(θ k ), C(θ k ) and D(θ k ). According to (6) and (11), the behaviour of A(θ k ) and B(θ k ) switches between the estimated WS zones as shown in Fig. 2. According to (9) and (11), C(θ k ) and D(θ k ) are switching matrices whose value depends on the WT working mode determined by the sign of the parameter Δ k .

Markovian jump discrete state-space modelling
The parameters v and Δ can be considered to be bounded by the sets where v, v¯, Δ and Δ are the minimum and maximum possible values of these parameters (which we fix to v = 0 m/s, v¯= 30 m/s, Δ = − 4.8 MW and Δ = 4.8 MW). The parameter vector with N = N v ⋅ N Δ (see Fig. 4). The partition is one such that each interval Ω v (p v ) lies in a single WS zone and each interval Ω Δ (p Δ ) lies in a single WT working mode (see Table 1). For such partition, the system state-space matrices C(θ k ) and D(θ k ) associated with θ k ∈ Θ (p) are constant and they can be expressed as   The system state-space matrices A(θ k ) and B(θ k ) associated with , Ω Δ (p Δ ) )) lies in the WS zones I, III or IV (i.e. , these matrices are not constant because the discrete transfer coefficient a(v) is described by the function in Fig. 2. In all, we express the matrices A(θ k ) and B(θ k ) associated with θ k ∈ Θ (p) as Define the membership signal ξ v whose value at a sample k equals the number p v of the interval Ω v (p v ) in which the WS v lies at the sample k; similar applies to ξ Δ w.r.t. Δ and to ξ w.r.t. θ: Research has shown that the stochastic behaviour of the mean WS can be represented as a Markovian process [13,14]. This behaviour has been exploited in recent WT control strategies as the ones in [15][16][17]. Motivated by this research, we assume that that {ξ k v } is a discrete homogenous Markov chain taking values in the finite set {1, …, N v }. Since the mean WS affects the WT working mode, we also assume that {ξ k Δ } is a discrete homogenous Markov chain taking values in the finite set {1, …, N Δ }. Both assumptions are considered in the following assumption over {ξ k }. Assumption 2: The membership {ξ k } is governed by a discrete homogenous Markov chain whose states are in the finite set S = {1, …, N} with the transition probability matrix and π pq being the transition probability defined as where ∑ q = 1 N π pq = 1 for any p ∈ S.
being defined in analogy to (19) w.r.t. ξ v and ξ Δ [26]. Remark 3: The probabilities π pq can be obtained through numerical simulations by sampling the data in the separate subsets Θ (p) and computing π pq = observed transitions from state p to q observed occurrences of state p [27]. Similar applies to π p v q v and π p Δ qΔ .

FE architecture
To achieve FE, we build the following model-based observer for the extended system (12): where θ^= v^Δ T is the estimated parameter vector in which v^ is the open-loop propagated WS (see (5)) and Δ is the closed-loop estimated power difference computed as Δ = P r − x^ (with x^ being the first state of z^). The estimated parameter vector θ^ does also lie in the set Θ, partitioned into the subsets {Θ (p) } p ∈ {1, …, N} as explained in Section 3.2.
The matrices L(θ^k) ∈ ℝ 2 × 1 and K(θ^k) ∈ ℝ 1 are the parameterdependent design gain matrices of appropriate dimensions. Provided the switching behaviour of the system state-space matrices in (15) and (16), we fix L(θ^k) ∈ ℝ 2 × 1 and K(θ^k) ∈ ℝ 1 as where L p and K p are constant matrices associated to θ^k ∈ Θ (p) and must be designed. Given (17), these matrices can be also expressed as L(θ^k) = L p if ξ^k = p and K(θ^k) = K p if ξ^k = p. Remark 4: Note that a certain degree of conservatism is introduced when using switched observer gain matrices in the form of (21) for the regions lying in the WS zone II, where the statespace matrices A(θ^k) and B(θ^k) are not constant (see (16)). Even though switched polytopic gain matrices could be used instead, the small variations experienced by these matrices (see Fig. 2) justify the choice in (21) that reduces the observer complexity.
Define the extended state estimation error as z k = z k − z^k and the FE error as f~k = f k − f^k. It follows that: with R = 1 0 T . The error depends then on the fault variation δ k , the oscillation s k , the noise w k and the disturbances g k = g k (θ^k, θ k ) and h k = h k (θ^k, θ k ), which stem from using θ^ instead of θ in the estimation algorithm. As (22) refers to the turbine (i,j), g k and h k refer in fact to g k i, j and h k i, j defined as where e k i is caused by the wind propagation error and ε k i, j is caused by the wind turbulence, i.e.
In (25), the first summands are negligible w.r.t. to the second summands. Hence, we omit the dependence of the disturbance e i on the column j. The following assumption on e i , ε i, j and h i, j is made.
The disturbances e i , ε i, j and h i, j can be considered to be bounded as (see (26)) .
Details on the computation of these bounds are included in Appendix 2 in Section 12.2. Note that it would have been possible to consider a row-dependent bound of the disturbance caused by the propagation error (i.e. e i ≤ e¯p i ) because this error generally increases with the distance to wind mast. For the sake of simplicity, we introduce some conservatism by neglecting this dependence. Hence, e¯p verifies e¯p = max i e¯p i .
For brevity, we omit again the dependence on the number of row i and on the number column j. Taking (24) and (26) into account, the summands R g k and h k in (22) can be expressed as with λ k ∈ [ − 1, 1], μ k ∈ [ − 1, 1] and φ k ∈ [ − 1, 1] satisfying if θ k ∈ Θ (p) and where G(θ^k), G(θ^k) and H(θ^k) are defined as In all, we rewrite (22) as Note that A p (θ^k), ℬ p , C p and D p are defined using the corresponding matrices associated with θ k ∈ Θ (p) (e.g.

FE design
The FE error caused by δ describes the fault sensitivity of the observer and the error caused by W describes the robustness of the observer against disturbances and noises. We characterise them through the ℋ ∞ performance of the observer w.r.t. δ and W, respectively. To that end, we use the formulation based on LMIs in the following theorem. For the sake of generality, we denote the size of the inputs in (30) as n δ , n λ , n μ , n φ , n s and n w verifying n δ = n λ = n μ = n φ = n s = n w = 1 in this case of FE at a WT level. Theorem 1: Consider the observer (20) applied to the system (12). If there exist positive scalars γ λ , γ μ , γ φ , γ s , γ w and γ δ ; full matrices of appropriate dimensions X p , K p and V p ; and symmetric matrices of appropriate dimensions Q p , for p, q = 1, …, N and m = 1, …M fulfilling the LMIs with Γ = γ λ I n λ ⊕ γ μ I n μ ⊕ γ φ I n φ ⊕ γ s I n s ⊕ γ w I n w ⊕ γ δ I n δ , then defining L p = V p −1 X p , the following holds for all θ^k ∈ Θ.
(1) In the absence of disturbances, noises and faults (i.e. W k = 0 and δ k = 0), the extended state estimation error (30) converges asymptotically to zero in average.
(2) Under null initial conditions, the expected value of the FE error is bounded as Proof: If (31) holds for p, q = 1, …, N, we have that Q q ≻ 0 and that V p is a non-singular matrix because V p + V p T − Q q ≻ 0. Hence, Thus, the matrix inequalities which result from replacing (31) are also positive definite. Let us substitute X p by V p L p and apply a congruence transformation by  to V k = z k T Q p z k for θ^k ∈ Θ (p) and p = 1, …, N.
(2) For brevity, let us denote E{V k + 1 θ^k ∈ Θ (p) } as E{V k + 1 θ^k} and let us not include in the next that the inequalities are fulfilled for all θ^k ∈ Θ. Taking conditional expectation given θ^k − 1 over expression (33) leads to because E{V k + 1 θ^k θ^k − 1 } = E{V k + 1 θ^k} and the exogenous signals are independent of θ^k − 1 . In (34), we have also taken into account that λ is a deterministic signal and (similar applies to μ, φ and δ), and that E{s k T s k } = ∥ s ∥ RMS 2 because s k is zero-mean (similar applies to w k ).
Under null initial conditions (V 0 = 0), adding the aforementioned expression from k = 0 to k = K − 1 leads to because ∑ k = 0 K − 1 E{V k + 1 θ^k} − E{V k θ^k − 1 } = E{V K + 1 θ^K} and E{V K + 1 θ^K} > 0. Dividing this expression by K and taking the limit when K tends to infinity leads to (32).
□ There exists thus a trade-off between ameliorating the fault tracking ability of the observer and its robustness [18]. From the practical viewpoint, it is of considerable interest to achieve certain tracking ability and to minimise the effect of the disturbances and noises on the estimations [19]. Thus, we propose to design the observer gain matrices through the following optimisation problem: along the variables γ λ , γ μ , γ φ , γ s , γ w , γ δ , X p , K p , V p , Q p and P p (p = 1, …, N) and with γ¯δ being the required ℋ ∞ performance describing the fault tracking ability of the observer. Remark 5: The use of the signals (28) in the design process allows considering the differences in the bounds (26) among the different subsets Θ (p) . Effectively, we now have that the value of these bounds is introduced through the matrices G p , G p and H p and If the signals e, ϵ and h were used instead, the maximum value of the bounds (26) (e.g. max p e¯p) would be the only information introduced in the design. Note also that if the bounds (26) are not completely accurate, ∥ λ ∥ ∞ 2 , ∥ μ ∥ ∞ 2 and ∥ φ ∥ ∞ 2 can be considered as tuning parameters regarding the bounding accuracy. Similar applies to ∥ s ∥ RMS 2 and ∥ w ∥ RMS 2 that can be easily derived from the known parameters γ p and σ w but whose values can be increased with the uncertainty over these parameters.
Taking Remarks 4 and 5 into account, we deduce that the conservatism is reduced when a lot of gridding intervals are utilised for the partition of the set Θ. Hence, the density N of the grid {Θ (p) } p ∈ {1, …, N} is to be determined from a trade-off between having a few gridding intervals that ensure reduced computational burden but introduce conservatism or having a lot of gridding intervals causing heavy computational times but reduced performance conservatism [28]. As specified in Appendix 2 in Section 12.2, we choose N v = 6 and N Δ = 4. This gridding is a posteriori validated through the numerical simulations in Section 8.

FDI at a WT level
We set the following decision for FDI: where J is an adaptive threshold covering the uncertainties affecting the fault estimate in fault-free conditions. As recalled in [20], the model-based setting of thresholds usually leads to too conservative thresholds which result in poor fault diagnosability. It is the state-of-the-art in real applications to optimally set thresholds on the basis of tests in the real application environment. In this context, we propose to compute the threshold through a multivariate linear model over the estimated parameters θ^ and to obtain the coefficients of this model using a set of fault-free training data as detailed hereafter. Remark 6: Norm-based constant thresholding can be performed using the RMS-norm bound of the FE error in (32) (with γ δ = 0). Assuming zero-mean disturbances, one can apply the Chebyshev's inequality and obtain a stochastic threshold for certain confidence level. However, this approach ignores the real statistical distribution of the error and leads to too loose bounds [29].
Provided the switching behaviour of the system, we use different coefficients for each WT working mode. For this purpose, we define η M (M = 1, 2, 3) as a signal indicating wether the WT is estimated to be operating in mode M: with ξ^k Δ being the estimated membership function of power ) and n′ being a design parameter that defines an uncertain intermediate mode. Effectively, mode M = 1 refers to the cases where Δ k < Δ n 1 ′ − n′ ≤ 0 and the generated power equals the dynamic available power in the WT. Mode M = 3 refers to the cases where 0 ≤ Δ n 1 ′ − n′ ≤ Δ k and the generated power equals the dynamic power reference (see Table 1). For its part, mode M = 2 is the transition mode defined by n′ > 0 that takes the working mode estimation error into account. In all, the proposed multivariate linear model for computing the adaptive threshold is with X k being the variable vector defined as otherwise. Similarly, it would be possible to take account of the process dynamic behaviour by using the alternative variable vector X k ′ = X k … X k − O (with β¯′ ∈ ℝ 3O × 1 and O being the number of past considered samples). However, the simplified form in (38) reduces the complexity of the threshold computation while it has provided satisfactory results in the simulations in Section 8.
To obtain β, we collect the fault-free training data of N T samples, we run the estimator (20) and we construct the matrices (39) Remark 8: To obtain a fault-free training dataset two approaches are possible. If a realistic simulator of the WF is available, the designer can excite it with all possible power demands and WS profiles and collect the output data. Alternatively, the designer can collect historical data covering all possible power demands and WS profiles. For newly-built systems, one can use either the corresponding simulator or the historical data of its initial operation by assuming that no faults affect the system during this initial period.
Then, we solve the following optimisation problem with σ ∈ [0, 1] being a weighting factor. The optimisation problem (40) guarantees that the collected fault estimates do not exceed the threshold and it minimises a weighted combination of the quadratic accumulated difference and the maximum difference (denoted as ϵ) between the threshold and the simulated data. Note that the multiplicand N T 2 in (40) is used to regularise the minimised terms. The proposed FE and FDI strategies are summarised in Fig. 5.
Remark 9: The sampling false alarm rate (FAR) obtained from (40) is equal to 0. If the resulting threshold is too conservative, one can fix FAR=ϕ/T by performing ϕ iterations in which the optimisation problem (40) is solved and the most conservative data (i.e. f^k¯ and X k¯ such that k¯= argmin k ∈ [1, N] f^k − X k β) is eliminated from ℱ and X for the next iteration.
Remark 10: For brevity, we have omitted the dependence of the variables on the number of row i and on the number of column j. Hence, let us remark that the inequality in (36) stands in fact for f^k i, j ≥ J k . The threshold J does not depend on the number of WT (i,j) because the disturbances and noises affecting the estimation error f~i , j are equally bounded for all the WTs in the WF.

FE and FDI at a WF level
The performance of the observer (20) and the diagnoser (36) may be compromised if the bounds of the disturbances in Assumption 3 are too large. A solution to mitigate this effect is to totally decouple the FE error from these signals instead of attenuating their effect through the robust design in (35). However, a necessary condition for achieving disturbance decoupling is that the faults and the disturbances have a totally decoupled effect on the measurements and this is not possible from a WT level perspective because there is just one measurement which is simultaneously affected by the faults and the disturbances. At the WF level, we can consider altogether the WTs in the same row because the disturbance caused by the wind propagation error (i.e. the disturbance e i ) is common to all these WTs. As detailed hereafter, when all the WTs in a row are prone to the power fault, e i does not verify disturbance decoupling conditions either. However, in this case, it is possible to build a bank of observers and diagnosers that allow to achieve the decoupling from e i at the cost of some fault simultaneity restrictions (cf. [30]).

Remark 11:
We have here considered that e i is caused by the wind propagation error and that ϵ i, j takes account of the uncertainties caused by the individual turbulence. If for any wind direction or for any other WF layouts there were no sufficient WTs per row to build the bank of observers, we would consider groups i of close WTs and divide the uncertainty caused by the wind propagation error into a common disturbance e i and a non-common disturbance which would be considered together with the turbulence by ϵ i, j .
To do so, let us first define the auxiliary vectors As previously specified, λ k (i) is a single disturbance modelling the wind propagation error and it affects all the turbines simultaneously. In all, the signals affecting the lth observer are the fault generators δ l (with n δ = Z − 1), the disturbances in W (with n λ = 1 and n μ = n φ = n s = n w = Z) and the fault f i, l which can be now considered as a new 'disturbance'. Following the approaches in [30], one can verify that it is now possible to decouple f l from λ if the new 'disturbance' f i, l is ignored. Note that if f i, l was not ignored and its dynamics were considered together with the dynamics of f l , λ would not verify disturbance decoupling conditions because it would not be possible to distinguish this disturbance from the occurrence of simultaneous faults in all the turbines in the row.
Hence, we propose to ignore the presence of the fault f i, l and to design the observer (45) through the optimisation problem with Ξ p m built in analogy to Ξ p m with the matrices in (47) replacing the matrices in (30). The new constraint γ λ = 0 ensures now the decoupling from λ.
In the absence of the ignored fault f i, l , we can thus set the following decisions ( j = 1, …, l − 1, l + 1, …, Z): otherwise No fault f l [ j] .
The adaptive threshold J does not depend on the number of turbine (i,j) nor on the number of ignored fault l (see Remark 10) and it is designed following the strategy in Section 5. The decoupling from λ enhances FDI because J in (50) is now smaller than J in (36). However, if the fault f i, l is present in the system, the decision mechanisms (50) are no longer reliable. Hence, we build a bank of l = 1, …Z observers, each of them taking account of the fault f l and ignoring the fault f i, l . An observer l and the corresponding decision mechanisms are reliable if the absence of the fault f i, l is diagnosed by the decision mechanisms of at least another reliable observer l′ (l′ ≠ l). In turn, the reliability of l′ implies that the decision mechanisms of the lth observer diagnose the absence of the fault f i, l′ . In all, the proposed bank of observer and decision mechanisms enables FE and FDI whenever 2 of the faults in a row i are not present in the system (i.e. there are no more than Z − 2 simultaneous faults in a row). FE and FDI are then achieved with any of the reliable observers and decision mechanisms of the bank.

Benefits and practical considerations of the proposed approach
Compared to the relevant existing literature, the benefits of the proposed approach are the following.
• As detailed in Section 2.2, the proposed approach utilises a reduced number of measurements: the power reference, the generated power and the WS at the wind mast. This reduces the information needs w.r.t. other techniques such as the one in [6] (requiring the rotor speed measurement) or in [10] (requiring the collective pitch angle measurement). It does not either require the information about the presumed fault size as it is in [6].
• The residual-based techniques in [6][7][8] are focused on FDI tasks. Contrariwise, the proposed approach is focused on both FE (Section 4) and FDI (Section 5), being more suitable for the development of AFTC strategies. • The proposed closed-loop approach is more actively robust against disturbances and uncertainties than the open-loop methods in [6][7][8][9]. • The Markovian jump system approach is a non-conservative procedure to handle the non-linearities in the power generation model. It facilitates the adaptation to the different levels of uncertainties and disturbances along the parameter set, leading to a less restrictive compromise between fault sensitivity and robustness.
• The proposed observer design in Section 4.2 guarantees certain fault sensitivity with optimal disturbance rejection. This performance is not guaranteed in [6][7][8][9]. where more ad-hoc and user knowledge-based tuning procedures are utilised to set this trade-off.
• In contrast to the constant thresholds in [6,9,10], the adaptive FDI mechanisms presented in Section 5 allow the adjustment of the thresholds to the different WS zones and WT operating modes. The proposed data-driven design of the FDI thresholds provides tight bounds for the fault estimates obtained via Markovian observers. This allows a more rapid detection and isolation of small faults w.r.t. model-based adaptive thresholds as the ones used in [8], which moreover require precise bounds of the uncertainties. • The WF level scheme proposed in Section 6 is based on a systematic multi-input multi-output (MIMO) observer that automatically merges the information acquired from temporal and spatial inconsistencies depending on the level of shared uncertainties among the group of considered WTs. Hence, it is more easily extensible to different WF layouts and wind direction than the residuals in [7][8][9][10], which assume identical wind conditions among groups of WTs.
For its part, practitioners should pay attention to the following considerations when applying the proposed FE and FDI approach.
• The FE observer (20) is based on the model in Section 3. Although this observer is robust against uncertainties (see Section 4.2), a minimum fitting of the model parameters (e.g. τ r ) is necessary.
• Practitioners must take into account the variations of the uncertainties along the parameter set in order to suitably grid it. See Appendix 12.2 including appropriate gridding for the WF benchmark.
• In practice, the transition probabilities among the subsets of the parameter set can be computed as detailed in Remark 3. • The design of the FE observer (20) is based on the convex optimisation problem (35), which can be addressed using semidefinite programming solvers (e.g. MOSEK [31]). • The data-driven design of the FDI decision mechanism (36) requires a fault-free training dataset. See Remark 8 for practical details on the obtention of this dataset. • The WT level approach is independent of the wind direction (WD); however, the WF level approach depends on the WD. Hence, different designs must be performed for different WDs. We propose to group these WDs (e.g. [0-45]° and (45-90]°) and perform a design for each of these groups (considering the worstcase WD in each group). The number of groups of WDs depends on the performance sought by the user. As the number of groups increases, the level of shared uncertainties among the WTs increases and the performance ameliorates at the cost of more offline design computations.
Remark 13: The proposed FE and FDI strategy consists on a diagnostic tool which processes the available control measurements to diagnose power faults (Fig. 5). This diagnostic tool does not decide any control action, i.e. it does not affect the system. Remedial actions based on the fault information provided by the tool will be considered in future works.

Simulation results
In this section, the proposed strategies are numerically validated using the WF benchmark simulator [5]. The benchmark is used as a reference in the literature because it contains a realistic number of non-linearities, uncertainties and disturbances. The simulator includes a WS profile of 4400 s that covers the whole WS range with challenging WS variations (detailed in Fig. 11 of Appendix 12.1) and it considers the wind directions in Fig. 6.
The proposed WT level strategy is independent of the WF layout and of the wind direction. The WF level approach requires to group the nearby turbines and to bound the shared uncertainties caused by the wind (see Remark 11). In the first wind direction, we simply group the turbines by rows. In the second wind direction, we group the WTs as shown in Fig. 6. Due to space constraints, in the following, we just include the simulation results for the first wind direction. Similar WF level results are obtained for the other wind direction.
First, we create a fault-free training dataset by considering ten different dynamic power reference cases (e.g. Case A in Fig. 7). Note that it is not possible to consider different WS scenarios because only one WS profile is available in the benchmark; however, this WS profile is complex and considers a wide range of complex variations. The transitions experienced by the training dataset (e.g. Case A in Fig. 7) are utilised for computing the transition probability matrix of the Markovian process. These probabilities are computed with the open-loop estimated power difference because the closed-loop estimates are not yet available. Then, we perform the design of the WT level observer (20) through the optimisation problem (35) and of the WF level observer (45) through the optimisation problem (49). For both designs, we fix the required ℋ ∞ performance describing the observer fault sensitivity to γ¯δ = 1. The values included in the designs are specified in Tables  2 and 3 of Appendix 12.2. The fault-free training dataset is subsequently utilised to obtain the adaptive threshold algorithms of the data-based decision mechanisms. For each training case, we run the designed estimator ((20) for the WT level and (45) for the WF level). Then, we construct the matrices (39) with all the estimated data and we solve the optimisation problem. All the designs are solved using MOSEK. For simplification, we omit the numerical results. Fig. 8 shows the simulation results for three validation cases (Cases 1-3 with 10 3 validation samples). The estimates are in black and the thresholds are in red. For ease of comprehension, we include both the figures comparing the real estimates in MW with the real thresholds and the figures comparing normalised FE variables with unitary thresholds. First, we deduce that more accurate fault estimates are obtained when the WT is able to generate the amount of requested power. Effectively, in this WT working mode, the uncertainties derived from the WS estimation error disappear. Secondly, we verify that the WF level approach enhances fault diagnosability w.r.t. the WT level approach: the minimum diagnosable faults are ∼ 0.2 MW at the WF level and ∼ 0.8 MW at the WT level. Let us remark that the norm-based constant thresholds in Remark 6 are ∼ 10 2 MW (WF level) and ∼ 10 3 MW (WT level), being not applicable in practice without any further assumption on the statistical distributions of the disturbances. Now, let us consider Case 1 and suppose that the turbine (i = 1, j = 1) is affected by a 40% power degradation during the period T1 (defined as [1000, 1100] s) and by a 3% power degradation during the period T2 (defined as [3200, 3300] s). The WT level and WF level FE and FDI results are depicted in Fig. 9. The first fault is only diagnosed at the WF level and the second fault is diagnosed at both levels (although being more clearly distinguishable at the WF level). Effectively, relative power degradations are more difficult to diagnose during T1 than T2 because the amount of generated power is smaller and the level of uncertainties is bigger.
Finally, we consider the whole WF and the Case 2 in Fig. 8. We assume that the turbine (i = 1, j = 3) is affected by a relative power degradation of 50% during T1 and that the turbine (i = 2, j = 1) is affected by a relative power degradation of 20% during T2. Fig. 10 shows the WF level simulation results for all the WTs in the WF. These 4 ⋅ 10 4 samples prove that the proposed approach allows FE and FDI in the WF. Moreover, the proposed approach allows simultaneous faults in the farm: if the WT level approach is utilised, no simultaneity restrictions apply; if the WF level approach is utilised, the only restriction for achieving FE is that   there should be two simultaneous fault-free turbines per row (i.e. only a faulty turbine per row in this case).

Conclusion
This paper proposed a novel closed-loop scheme for the estimation of decreased power generation in WTs due to blade erosion and debris build-up. The model-based strategy is based on the Markovian jump system model of the power generation system of a WT and it thus covers all the WS zones and WT operating modes. The FE output is also utilised for FDI in tighten data-based adaptive decision mechanisms. The approach is first applied to a single WT and then generalised from a WF perspective. The WF approach has the advantage of being easily applicable to different wind directions and layouts by simply adjusting the level of shared uncertainty by the considered groups of nearby turbines. The proposed strategies are tested in a realistic numerical simulator in the context of FD in WFs. Future work includes their application to a small-scale physical simulator. Also, the extension of the proposed approach to other WT systems and the use of the fault estimates in AFTC strategies highlight as immediate future work.