Optimal sampling pattern for free final time linear quadratic regulator: the scalar case

The optimal sampling problem is the selection of the optimal sampling instants together with the optimal control actions such that a given cost function is minimised. In this article, we solve the optimal sampling problem for the free final time linear quadratic regulator with a scalar dynamical system. The solution provides the optimal sampling instants, control actions and final time in a recursive and constructive way for any number of samples N, as it is not based on asymptotic arguments. An application example shows the feasibility of the approach.


Introduction
Technological developments in sensor networks, the internet of things and networked control systems have led to a multitude of nodes sending information for supervision (Zheng & Jamalipour, 2009), estimation (Ge et al., 2017) and control (Tran & Ha, 2015). Reduction in the number of sampling instants N saves energy consumption, computing power and communication bandwidth, which may be scarce resources.
Digital control systems were initially developed on the basis of periodic sampling (Astrom & Wittenmark, 1997). However, for a given number of sampling instants N, non-periodic sampling may lead to better performance than a periodic one. The optimal sampling problem was defined in Bini and Buttazzo (2014) as the selection of the N interarrival times μ_k := t_{k+1} − t_k and control inputs u_k for k ∈ {0, 1, ..., N − 1} that minimise a cost function J.
Solving the sampling problem analytically by deriving necessary conditions for the optimum is not feasible. Instead, numerical gradient-based optimisation algorithms have been derived from the necessary conditions, and these are of high complexity. In Bini and Buttazzo (2014), a numerical optimisation algorithm was proposed in which, for systems of order n and N sampling instants, the computation of the gradient had complexity O(N²n³). In a similar way, the control of switched systems (Egerstedt et al., 2006; Xu & Antsaklis, 2004) has also been founded on numerical optimisation methods in which the gradient of the cost function with respect to the switching times was derived, and gradient descent algorithms were employed for numerical minimisation. The derived numerical algorithms were computationally costly and did not scale with the number of sampling instants N; moreover, the problem being nonconvex, they lacked any guarantee of finding the global minimum (Egerstedt et al., 2006; Xu & Antsaklis, 2004).
Event-triggered and self-triggered sampling rules also yield non-periodic sampling patterns (De La Sen, 1996) that preserve performance while decreasing the information exchange required in estimation (Batmani, 2020) and control (Xu et al., 2020). Inspired by the results in Bini and Buttazzo (2014), a new self-triggered control strategy was proposed in Velasco et al. (2015) and implemented in Rosero Chandi et al. (2017), showing significant cost reductions compared to the optimal periodic controller. However, the optimality of event-triggered and self-triggered sampling rules is still an open problem. The optimal sampling problem is also similar to move blocking (MB) in model predictive control (Son et al., 2020). Starting from a periodic sampling, MB keeps the input unchanged over several periods with the aim of reducing the number of variables in the optimisation problem. However, since the sampling instants are fixed beforehand, MB leads to suboptimal solutions. Furthermore, the resulting problems are of a combinatorial nature and do not scale well with an increasing number of variables (Shekhar & Manzie, 2015). In Kowalska and Von Mohrenschildt (2012), the switching times were incorporated into the optimisation problem. However, the resulting problem provided no guarantee of optimality of the solution and was hard to solve in general because the number of decision variables also increased with the number of switching times N.
To overcome these problems, a new approach to solve the optimal sampling problem, called quantisation-based sampling, was proposed in Bini and Buttazzo (2014). The basic idea of quantisation-based sampling is to approximate the optimal continuous-time control input u(t)* by a piecewise constant function, which provides the optimal interarrival times. Once the optimal interarrival times are calculated, the optimal control inputs are computed by a standard linear quadratic regulator (LQR). The approach is computationally tractable because for LQR the optimal control action u(t)* is readily calculated. Furthermore, it was shown in Bini and Buttazzo (2014) that quantisation-based sampling is optimal for first-order systems when the number of samples N is large.
In this article, we solve the optimal sampling problem for first-order systems with a quadratic cost function and free final time T_N. This is in contrast with the problem proposed in Bini and Buttazzo (2014), where the final time T is fixed beforehand. As a result, we provide not only the optimal interarrival times μ*_k and optimal control inputs u*_k but also the optimal final time T*_N. The resulting algorithm is: (i) optimal for arbitrary N ≥ 1 because it is not based on asymptotic arguments, (ii) scalable because the optimal interarrival times are obtained by solving N one-dimensional optimisation problems, (iii) recursive because the optimal interarrival times are computed backwards from the last stage in a sequential manner and (iv) constructive because, given the N optimal interarrival times, the solution of the new problem with N + 1 interarrival times requires solving only one more one-dimensional optimisation problem to obtain the new optimal interarrival time.

Problem statement
Consider the free final time T_N linear quadratic regulator problem

  P_{T_N}:  min J = ∫_0^{T_N} [ q x(t)^2 + r u(t)^2 ] dt + s x(T_N)^2,
  subject to  ẋ(t) = a x(t) + b u(t),  x(0) = x_0,  u ∈ P_N,    (1)

where x(t) ∈ R is the state, u(t) ∈ R the input signal, a ∈ R and b ∈ R are the dynamical system parameters, and q ∈ R, r ∈ R and s ∈ R are the cost function weights, with q, r and s positive. The control input signal u(t) is constrained to lie in the set P_N of N-complexity piecewise constant signals, that is, u(t) is a linear combination of N indicator functions of intervals,

  u(t) = Σ_{k=0}^{N−1} u_k 1_{[t_k, t_{k+1})}(t),

where the instants t_0 < t_1 < ... < t_N = T_N are called sampling instants. For t_0 = 0, the sampling pattern information is also given by the sequence {μ_0, μ_1, ..., μ_{N−1}} of interarrivals μ_k := t_{k+1} − t_k for k ∈ {0, 1, ..., N − 1}, defined as the time between two consecutive sampling instants. Consequently, in what follows the sampling pattern T is defined by the interarrival sequence. For an arbitrary sampling pattern T = {μ_0, μ_1, ..., μ_{N−1}}, with Σ_{k=0}^{N−1} μ_k = T_N, the exact discretisation of the LQR problem P_{T_N} given by (1) is

  min J = Σ_{k=0}^{N−1} [ q(μ_k) x_k^2 + 2 p(μ_k) x_k u_k + r(μ_k) u_k^2 ] + s x_N^2,
  subject to  x_{k+1} = φ(μ_k) x_k + γ(μ_k) u_k.    (2)

This result follows by direct integration of the cost function and exact discretisation of the dynamical system with the arbitrary sampling pattern T = {μ_0, μ_1, ..., μ_{N−1}}. The details can be found in Appendix 1. The discrete-time problem (2) is solved by providing the optimal control actions sequence U* := {u*_0, u*_1, ..., u*_{N−1}} and the optimal sampling pattern T* := {μ*_0, μ*_1, ..., μ*_{N−1}}.
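Since the state response over one interarrival is available in closed form, the discrete-time parameters in (2) can be computed explicitly. The following sketch (the function names and interface are my own, not the authors') evaluates φ(μ), γ(μ), q(μ), p(μ) and r(μ) for a ≠ 0 by direct integration, together with a numerical quadrature cross-check of the closed forms:

```python
import math

def discretise(a, b, q, r, mu):
    """Exact discretisation of the scalar LQR over one interarrival mu (a != 0).

    Returns (phi, gamma, qk, pk, rk) such that x_{k+1} = phi*x_k + gamma*u_k
    and the interval cost equals qk*x_k^2 + 2*pk*x_k*u_k + rk*u_k^2.
    """
    e1 = math.exp(a * mu)                  # e^{a mu}
    e2 = math.exp(2.0 * a * mu)            # e^{2 a mu}
    phi = e1
    gamma = (b / a) * (e1 - 1.0)
    I1 = (e1 - 1.0) / a                    # integral of e^{a t} over [0, mu]
    I2 = (e2 - 1.0) / (2.0 * a)            # integral of e^{2 a t} over [0, mu]
    qk = q * I2
    pk = q * (b / a) * (I2 - I1)
    rk = q * (b / a) ** 2 * (I2 - 2.0 * I1 + mu) + r * mu
    return phi, gamma, qk, pk, rk

def discretise_numeric(a, b, q, r, mu, n=20000):
    """Trapezoidal cross-check of the closed-form cost coefficients above."""
    h = mu / n
    qk = pk = rk = 0.0
    for i in range(n + 1):
        t = i * h
        w = 0.5 if i in (0, n) else 1.0
        phi_t = math.exp(a * t)
        gam_t = (b / a) * (phi_t - 1.0)
        qk += w * q * phi_t ** 2
        pk += w * q * phi_t * gam_t
        rk += w * (q * gam_t ** 2 + r)
    return qk * h, pk * h, rk * h
```

The quadrature version is only a sanity check; the closed forms are what an implementation would use at every evaluation of the stage cost.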

Remark 2.1:
For the discrete-time problem (2), the cost function parameters q(μ_k), p(μ_k) and r(μ_k), except s, become functions of the interarrivals μ_k of the sampling pattern T, and the same applies to the dynamical system parameters φ(μ_k) and γ(μ_k). For clarity of notation, we define

  q_k := q(μ_k),  p_k := p(μ_k),  r_k := r(μ_k),  φ_k := φ(μ_k),  γ_k := γ(μ_k).

On the other hand, x_k and u_k are the state and control action values at sampling instant t_k, that is, x_k := x(t_k) and u_k := u(t_k).

Remark 2.2:
The cost function of the discretised problem (2) depends on the product of x_k and u_k with weight p_k, even though the continuous cost function (1) lacks the product x(t)u(t). This is not an obstacle to applying dynamic programming.

Dynamic programming
The optimal sampling pattern T* is computed using a two-phase procedure. First, we find the dynamic programming solution for the discrete-time problem (2) considering an arbitrary sampling pattern T = {μ_0, μ_1, ..., μ_{N−1}}, with Σ_{k=0}^{N−1} μ_k = T_N, as shown in Figure 1. Second, on the basis of the dynamic programming solution, the optimal interarrival time is computed at each stage of the dynamic program by solving a univariate optimisation problem. In what follows, we discuss the dynamic programming solution of the first phase.

Control action u k
For an arbitrary intermediate stage k, the cost-to-go is

  J_k(x_k) = q_k x_k^2 + 2 p_k x_k u_k + r_k u_k^2 + J*_{k+1}(x_{k+1}),    (8)

with J*_{k+1}(x_{k+1}) the optimal cost-to-go from stage k + 1, calculated in the previous iteration of the dynamic program, and given by

  J*_{k+1}(x_{k+1}) = K*_{k+1} x_{k+1}^2.    (9)

Substituting the discrete-time dynamical system x_{k+1} = φ_k x_k + γ_k u_k in Equations (8)-(9) and arranging terms yields

  J_k = (q_k + φ_k^2 K*_{k+1}) x_k^2 + 2 (p_k + φ_k γ_k K*_{k+1}) x_k u_k + (r_k + γ_k^2 K*_{k+1}) u_k^2.    (10)

The optimal control action is obtained by minimising J_k with respect to u_k, resulting in

  u*_k = −C_k x_k,  with  C_k := (p_k + φ_k γ_k K*_{k+1}) / (r_k + γ_k^2 K*_{k+1}).    (11)

For an arbitrary interarrival μ_k, it follows that the control action is proportional to the current state x_k through C_k, like in the classical linear quadratic regulator. Substitution of the optimal control action u*_k in the cost-to-go function (10) (see Appendix 2 for the details) yields the following cost-to-go at stage k:

  J_k(x_k) = K_k x_k^2,  with  K_k := q_k + φ_k^2 K*_{k+1} − (p_k + φ_k γ_k K*_{k+1})^2 / (r_k + γ_k^2 K*_{k+1}).    (12)

The cost-to-go function is proportional to the squared state through K_k, again like in the classical linear quadratic regulator. However, in contrast to the periodic discrete-time case (Anderson & Moore, 1979), the proportional factors C_k and K_k are not constant but depend on the interarrival μ_k.
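The backward update of Equations (11)-(12) is a scalar Riccati-type recursion with a cross term p_k. A minimal sketch (the function name and interface are my own):

```python
def dp_step(qk, pk, rk, phi, gamma, K_next):
    """One backward dynamic-programming step for the sampled scalar LQR.

    Given the optimal cost-to-go gain K_next = K*_{k+1}, returns the feedback
    gain C_k (with u*_k = -C_k * x_k) and the cost-to-go gain K_k: a scalar
    Riccati-type update that keeps the cross-weight pk of the discretised cost.
    """
    num = pk + phi * gamma * K_next
    den = rk + gamma ** 2 * K_next
    C = num / den
    K = qk + phi ** 2 * K_next - num ** 2 / den
    return C, K
```

With p_k = 0, φ_k = γ_k = 1 this collapses to the textbook discrete-time scalar Riccati step.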

Interarrival µ k
The optimal interarrival time μ*_k is obtained by minimising K_k with respect to the interarrival μ_k. Assume we have computed the optimal interarrival μ*_k by any means; then the optimal control action u*_k and optimal cost-to-go function J*_k are given by

  u*_k = −C_k(μ*_k) x_k,   J*_k(x_k) = K_k(μ*_k) x_k^2 =: K*_k x_k^2.

At this point, the optimal cost-to-go gain K*_k is available to repeat the computations at stage k − 1.

Sampling pattern solution
Given the optimal cost-to-go gain at stage k + 1, that is K*_{k+1}, it is possible to derive the cost-to-go function gain at stage k, K_k(μ_k), as shown by Equation (12). Equation (12) shows how the optimisation of the interarrival times can be embedded into a family of similar problems such that each member of the family is easily related to the solution of the previous problem. By minimising the term K_k(μ_k) in Equation (12), we obtain at stage k the optimal interarrival μ*_k and the optimal cost-to-go gain K*_k = K_k(μ*_k). Beginning from the last stage N − 1, we can compute backwards the optimal sampling pattern {μ*_{N−1}, μ*_{N−2}, ..., μ*_0} by solving N one-dimensional optimisation problems. Once the optimal sampling pattern T* := {μ*_0, μ*_1, ..., μ*_{N−1}} has been computed backwards, the optimal control actions sequence U* := {u*_0, u*_1, ..., u*_{N−1}} is computed forward from the initial state x_0 by applying Equation (11).

Sampling pattern cost function
The optimal interarrival μ*_k at stage k is obtained from the minimisation of the cost-to-go gain K_k, with q_k, p_k, r_k, φ_k and γ_k functions of the interarrival μ_k. The functional form of K_k with respect to the interarrival μ_k is intricate due to the exponential relationship. However, we simplify the expression of K_k by applying dimensional analysis (Balaguer, 2013) and a change of variable. First, we define the dimensionless time μ̄_k := aμ_k, as a has inverse time dimensions (Palanthandalam-Madapusi et al., 2007). We also define b̄ := b/a, q̄ := q/(2a) and r̄ := r/a. Furthermore, we perform the change of variable τ := e^{μ̄_k}. As a result, Equations (3)-(7) are rewritten in terms of τ, with coefficients involving terms of the form q̄b̄²(τ² − 4τ + 2 ln τ + 3) + r̄ ln τ, and the cost function (13) to be minimised is rewritten as Equation (14) (see Appendix 2), with q̄, R and K given therein. The parameters q̄ and R are constant for all stages because they depend only on the cost function parameters q and r and on the dynamical system parameters a and b. On the contrary, the parameter K changes at each stage because it depends on the optimal cost-to-go at stage k + 1, K*_{k+1}. Note also that the optimisation problem does not depend on the magnitude of q̄ but only on its sign: q̄ is a constant that is negative (q̄ < 0) for stable systems and positive (q̄ > 0) for unstable systems.

Figure 2. Graphs of the sampling pattern cost function for stable systems, given by Equation (15), for several values of R > 2 and K ≤ 1. The dots mark the minimum value for each cost function. Note that the domain is τ ∈ [0, 1]. The cost function is unimodal for sufficiently small values of K. Otherwise the optimal solution is τ* = 1.
The success of the proposed approach lies in the solvability of the optimisation problem defined by the cost function (14). In general, it is not possible to show convexity of the cost function (14) because that property depends on the values of q̄, R and K. Instead, we show the solvability of the optimisation problem through numerical analysis of the resulting cost function for several values of q̄, R and K, which depend on the stability of the dynamical system. In the following sections, we perform the analysis for stable, unstable and critically stable systems.
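This numerical analysis can also be probed programmatically. The sketch below (my own construction, assuming the standard exact-discretisation formulas for a ≠ 0 and the scalar gain recursion) scans the stage cost K_k(μ) on a grid and counts interior local minima, which is a crude but useful unimodality check:

```python
import math

def discretise(a, b, q, r, mu):
    """Exact discretisation of the scalar LQR over one interarrival (a != 0)."""
    e1, e2 = math.exp(a * mu), math.exp(2.0 * a * mu)
    I1, I2 = (e1 - 1.0) / a, (e2 - 1.0) / (2.0 * a)
    phi, gam = e1, (b / a) * (e1 - 1.0)
    qk = q * I2
    pk = q * (b / a) * (I2 - I1)
    rk = q * (b / a) ** 2 * (I2 - 2.0 * I1 + mu) + r * mu
    return phi, gam, qk, pk, rk

def K_of_mu(mu, a, b, q, r, K_next):
    """Stage cost-to-go gain K_k as a function of the interarrival mu."""
    phi, gam, qk, pk, rk = discretise(a, b, q, r, mu)
    num, den = pk + phi * gam * K_next, rk + gam ** 2 * K_next
    return qk + phi ** 2 * K_next - num ** 2 / den

def interior_minima(f, lo, hi, n=4000):
    """Count strict interior local minima of f on a uniform grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    return sum(1 for i in range(1, n) if ys[i - 1] > ys[i] < ys[i + 1])
```

A unimodal stage cost should report at most one interior minimum over any search interval; parameter combinations violating the unimodality conditions would show up here as extra minima or a boundary optimum.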

Stable systems (a < 0)
For stable dynamical systems, it follows that τ ∈ [0, 1] because a < 0. Furthermore, the parameter q̄ is negative. As a result, we investigate the minimisation problem (15) with R > 2 and K ≤ 1, because r̄/(q̄b̄²) > 0, K*_{k+1} > 0 and q̄ < 0. In Figure 2, we plot the cost function for several combinations of the parameters R and K, together with the optimal point. It can be seen that, for any R, the cost function is unimodal (i.e. having only one maximum or minimum) for sufficiently small values of K.
The graphical results of Figure 2 show that if the derivative of the cost function at the point τ = 1 is greater than zero, that is dK_k(τ)/dτ |_{τ=1} > 0, the function is unimodal. On the contrary, if the derivative is smaller than zero, the optimal point is τ* = 1. As a result, we require this derivative condition for a unimodal optimisation problem. The derivative is finite because R > 2, and the condition is equivalent to condition (16). In Figure 3, we analyse condition (16) graphically: for any value R > 2, all values of K to the left of K_s and to the right of K_u yield unimodal optimisation problems. Summing up, for problem unimodality, condition (16) must hold. Condition (16) implies a relation between the control action cost r and the final cost s. In words, the final cost s must be large enough with respect to the control action cost r for it to be worthwhile to apply a control action during some finite time greater than zero.

Figure 3. Graphs of the conditions derived from Equations (15) and (17). Given an R > 2, there are values K_s and K_u such that the optimisation problem is unimodal for stable and unstable systems if K < K_s or K > K_u.

Unstable systems (a > 0)
The graphical results of Figure 4 show that if the derivative of the cost function at the point τ = 0 is smaller than zero, that is dK_k(τ)/dτ |_{τ=0} < 0, the function is unimodal. On the contrary, if the derivative is greater than zero, the optimal point is τ* = 0. As a result, we require this derivative condition for a unimodal optimisation problem, which turns out to be the same as the first condition of (16). Hence, stable and unstable systems share equivalent unimodality conditions.

Integrator systems (a = 0)
In this case, the optimisation of the sampling pattern is simpler because Equations (3)-(7) for a = 0 reduce to

  φ(μ) := 1,  γ(μ) := bμ,  q(μ) := qμ,  p(μ) := qbμ²/2,  r(μ) := rμ + qb²μ³/3,

Figure 4. Graphs of the sampling pattern cost function for unstable systems, given in Equation (17), for several values of R > 2 and K ≥ 1. The dots mark the minimum value for each cost function. Note that the domain is τ ∈ [1, ∞). The cost function is unimodal for sufficiently large values of K. Otherwise the optimal solution is τ* = 0.
with R := r/(qb²). We show in Figure 5 the plot of the cost function for several combinations of the parameters R and K, together with the optimal point. In this case, the unimodality condition is given by Equation (18); the derivative condition is finite for R > 2 and is equivalent to condition (19). As a result, the problem is unimodal if condition (19) holds.
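For a = 0, the discrete-time parameters follow by direct integration of the cost along x(t) = x_k + b u_k t over one interval. The closed forms below are my reconstruction of Equations (3)-(7) specialised to a = 0 (the function names are my own), cross-checked by quadrature:

```python
def discretise_integrator(b, q, r, mu):
    """Discrete-time parameters of the sampled scalar LQR for a = 0.

    Reconstructed under the assumption x(t) = x_k + b*u_k*t on the interval:
    phi = 1, gamma = b*mu, q(mu) = q*mu, p(mu) = q*b*mu^2/2,
    r(mu) = r*mu + q*b^2*mu^3/3.
    """
    return (1.0, b * mu, q * mu,
            q * b * mu ** 2 / 2.0,
            r * mu + q * b ** 2 * mu ** 3 / 3.0)

def cross_check(b, q, r, mu, n=20000):
    """Trapezoidal integration of the interval cost coefficients for a = 0."""
    h = mu / n
    qk = pk = rk = 0.0
    for i in range(n + 1):
        t = i * h
        w = 0.5 if i in (0, n) else 1.0
        qk += w * q                       # weight of x_k^2
        pk += w * q * b * t               # weight of x_k*u_k (half of the 2p term)
        rk += w * (q * (b * t) ** 2 + r)  # weight of u_k^2
    return qk * h, pk * h, rk * h
```

Note that the cross term p(μ) again appears even though the continuous cost has no x(t)u(t) product, in line with Remark 2.2.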

Proposed optimisation algorithm
In this section, we design the optimisation algorithm that solves the optimal sampling problem. The algorithm first checks the unimodality of the problem, then computes the optimal sampling pattern and finally computes the optimal control. The pseudocode of the algorithm is presented in Figure 6 for stable and unstable systems and in Figure 7 for integrator systems. Since the cost functions are unimodal, the optimisation may be performed by a search method (e.g. golden-section search or the bisection method) or by gradient descent starting from μ = 0 for integrator systems or from τ = 1 for stable or unstable systems. This problem must be solved at each stage, from stage N − 1 down to stage 0. As a result, we obtain the optimal sampling pattern T* := {μ*_0, μ*_1, ..., μ*_{N−1}}. Finally, once the optimal sampling pattern is computed, the optimal control actions sequence is computed forward from stage 0 to stage N − 1 by applying the definition of the optimal control action evaluated at the optimal interarrival μ*_k, that is C_k(μ*_k), as given in Equation (11). Therefore, we obtain the optimal control actions sequence U* := {u*_0, u*_1, ..., u*_{N−1}} and the problem is solved.
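The backward-sweep-then-forward-rollout procedure can be sketched end to end for a ≠ 0. The code below is an illustrative implementation, not the authors' (Figures 6-7); it uses the standard exact discretisation, the scalar gain recursion, and a hand-rolled golden-section search, and the lower bound 1e-6 and interval mu_max are my own assumptions:

```python
import math

GR = (math.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio

def golden_min(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimiser of a unimodal f on [lo, hi]."""
    c, d = hi - GR * (hi - lo), lo + GR * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - GR * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + GR * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def discretise(a, b, q, r, mu):
    """Exact discretisation over one interarrival mu (a != 0)."""
    e1, e2 = math.exp(a * mu), math.exp(2.0 * a * mu)
    I1, I2 = (e1 - 1.0) / a, (e2 - 1.0) / (2.0 * a)
    phi, gam = e1, (b / a) * (e1 - 1.0)
    qk = q * I2
    pk = q * (b / a) * (I2 - I1)
    rk = q * (b / a) ** 2 * (I2 - 2.0 * I1 + mu) + r * mu
    return phi, gam, qk, pk, rk

def optimal_sampling(a, b, q, r, s, x0, N, mu_max=10.0):
    """Backward sweep over stages N-1..0, then forward rollout from x0."""
    K = s                      # terminal cost-to-go gain
    stages = []                # (mu*, phi, gamma, C) collected backwards
    for _ in range(N):
        def K_of_mu(mu, K_next=K):
            phi, gam, qk, pk, rk = discretise(a, b, q, r, mu)
            num, den = pk + phi * gam * K_next, rk + gam ** 2 * K_next
            return qk + phi ** 2 * K_next - num ** 2 / den
        mu_star = golden_min(K_of_mu, 1e-6, mu_max)
        phi, gam, qk, pk, rk = discretise(a, b, q, r, mu_star)
        C = (pk + phi * gam * K) / (rk + gam ** 2 * K)
        stages.append((mu_star, phi, gam, C))
        K = K_of_mu(mu_star)   # optimal cost-to-go gain for the next stage back
    stages.reverse()           # stage 0 first
    x, inputs = x0, []
    for mu_star, phi, gam, C in stages:
        u = -C * x             # u*_k = -C_k x_k
        inputs.append(u)
        x = phi * x + gam * u
    return [st[0] for st in stages], inputs, K * x0 ** 2
```

Because the backward sweep starts from K = s in every problem instance, the last stages of an N-sample solution coincide with those of a smaller-N solution, which is the recursive and constructive property exploited throughout the article.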

Stable system
Consider problem (1) with cost function parameters q = 1, r = 1, s = 1, dynamical system parameters a = −1, b = 1, and initial state x_0 = 1, that is

  min J = ∫_0^{T_N} [ x(t)^2 + u(t)^2 ] dt + x(T_N)^2,
  subject to  ẋ(t) = −x(t) + u(t),  x(0) = 1,  u ∈ P_N.    (20)

We solve the optimisation problem for N = 2, N = 4 and N = 6 for comparative purposes. The first row of Figure 8 shows the cost-to-go gains K_k and the optimal interarrivals μ*_k, whereas the second row shows the optimal control action u*_k and optimal state trajectory x*_k.

Figure 8. Optimal sampling solutions for N = 2 (first column), N = 4 (second column) and N = 6 (third column) for a stable system as defined in problem (20). The first row shows the cost-to-go gains together with their minimum values, marked by blue dots, which provide the optimal interarrival times μ*_k. The second row shows the optimal control action u*_k and state trajectory x*_k (in black) together with the continuous-time optimal solution (in blue) for comparison.

Consider, for instance, the first column of Figure 8, which corresponds to the case N = 2. The upper plot shows the cost-to-go
gains K_1 (blue) and K_0 (green) as functions of τ := e^{aμ_k}. The minimum of each function, marked by a blue dot, provides the optimal interarrival times μ*_1 and μ*_0, respectively. The lower plot shows the optimal control u*_k and state trajectory x*_k, together with the optimal continuous-time solution u(t)* and x(t)* plotted in blue. As can be seen, even for N = 2, the state evolution x*_k and the optimal continuous-time state x(t)* are almost indistinguishable, whereas the discrete control action u*_k appears to be a piecewise constant approximation of the optimal continuous-time control action u(t)*.
Consider now the solution for N = 4, plotted in the middle column of Figure 8. The upper plot shows the cost-to-go gains K_3 (blue), K_2 (green), K_1 (red) and K_0 (cyan) as functions of τ := e^{aμ_k}. Note that the first two cost-to-go gains, K_3 and K_2, are equal to the cost-to-go gains of the previous problem with N = 2. As a result, for the sampling pattern with N = 4, the only new information lies in the cost-to-go gains K_1 and K_0. This shows the recursive and constructive nature of the optimal sampling pattern.
Finally, examining the solution for N = 6, it can be seen that the optimal state trajectory x*_k is equal to the optimal continuous-time state trajectory x(t)*, whereas the optimal control action u*_k again appears as a piecewise approximation of the optimal continuous-time control action u(t)*. This result agrees with the quantisation-based sampling proposed in Bini and Buttazzo (2014), where only the continuous-time control input is approximated, while the optimal continuous-time state trajectory is not considered in the computation of the optimal sampling pattern.

Unstable system
Consider problem (1) with cost function parameters q = 1, r = 1, s = 10, an unstable dynamical system with parameters a = 1, b = 1, and initial state x_0 = 1, that is

  min J = ∫_0^{T_N} [ x(t)^2 + u(t)^2 ] dt + 10 x(T_N)^2,
  subject to  ẋ(t) = x(t) + u(t),  x(0) = 1,  u ∈ P_N.    (21)

Again we solve the optimisation problem for N = 2, N = 4 and N = 6. The first row of Figure 9 shows the cost-to-go gains K_k and the optimal interarrivals μ*_k, whereas the second row shows the optimal control action u*_k and optimal state trajectory x*_k.

Figure 9. Optimal sampling solutions for N = 2 (first column), N = 4 (second column) and N = 6 (third column) for an unstable system as defined in problem (21). The first row shows the cost-to-go gains together with their minimum values, marked by blue dots, which provide the optimal interarrival times μ*_k. The second row shows the optimal control action u*_k and state trajectory x*_k (in black) together with the continuous-time optimal solution (in blue) for comparison.

Consider the first column of Figure 9, which corresponds to the case N = 2. The upper plot shows the cost-to-go gains K_1 (blue) and K_0 (green) as functions of τ := e^{aμ_k}. In this case, the domain is τ ∈ [1, ∞). The minimum of each function, marked by a blue dot, provides the optimal interarrival times μ*_1 and μ*_0, respectively. The lower plot shows the optimal control u*_k and state trajectory x*_k together with the optimal
continuous-time solution u(t)* and x(t)* in blue. The state evolution x*_k differs more significantly from x(t)* than in the stable case, especially near the final time T*_2. Consider now the solution for N = 4, plotted in the middle column of Figure 9. The upper plot shows the cost-to-go gains K_3 (blue), K_2 (green), K_1 (red) and K_0 (cyan) as functions of τ := e^{aμ_k}. As in the previous example, the first two cost-to-go gains, K_3 and K_2, are equal to the cost-to-go gains of the previous problem with N = 2, showing again the recursive and constructive nature of the proposed solution.
Finally, the solution for N = 6 shows that the optimal state trajectory x*_k is equal to the optimal continuous-time state trajectory x(t)*, whereas the optimal control action u*_k again appears as a piecewise approximation of the optimal continuous-time control action u(t)*. As a result, although for unstable systems the convergence towards the optimal continuous-time solution is slower than for stable systems, even for low values of N the state trajectory resembles the optimal continuous-time one, in agreement with the quantisation-based sampling proposed in Bini and Buttazzo (2014).

Conclusion and future work
In this article, we have solved the optimal sampling problem for the free final time linear quadratic regulator. First, dynamic programming is applied to find, for each stage, a cost-to-go function dependent on the interarrival time. Second, it is shown how to compute the interarrival time that minimises the cost-to-go, thus solving the optimal sampling pattern problem. The computational approach is feasible because the algorithmic procedure is scalable, recursive and constructive.
The results are important because this is the first time that an algorithm solves the optimal sampling problem without any form of approximation. Existing results in the literature provide solutions that are approximate and computationally costly, because they are either based on asymptotic arguments to show optimality or on the minimisation of non-convex problems. Furthermore, the proposed solution also provides the optimal final time.
The problem is stated for scalar systems. One important question is the generalisation to systems of arbitrary order. Preliminary results show that dynamic programming is again applicable and the sampling pattern problem may be cast as a maximum eigenvalue minimisation problem. Current research is focused on the solvability of the maximum eigenvalue minimisation problem by means of semi-algebraic optimisation techniques.

Disclosure statement
No potential conflict of interest was reported by the author(s).