# Discrete-Time LQR Example #1

This simple example illustrates the effects that the open-loop stability of the system and the values of the weighting matrices in the performance index have on the solution to the optimal discrete-time LQR problem. Although the model represents an extremely simple system, it allows us to see the important characteristics of the solution more easily than a realistic model would. The model in state space form is
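The equation itself is not reproduced here; consistent with the scalar quantities discussed throughout (and the input matrix B referenced later), the model is presumably the scalar linear difference equation

$$x_{k+1} = A x_k + B u_k$$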

The scalar A will be given two different values, one representing an unstable open-loop system and the other a stable open-loop system. We will see the effect that this difference has on the control signal, the LQR gain, the state trajectory, and other signals.

A fixed (finite) final-time quadratic performance index (PI) will be used, so the Q, R, and SN weighting matrices (scalars in this example) will have to be specified. The value of SN will be fixed; Q and R will each take on two values. The final time is chosen to be 4 time steps so that we can easily list and examine the numerical values of the solutions. The form of the PI is:
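The expression is not reproduced here; the standard finite-horizon quadratic PI consistent with the weighting matrices named above would be

$$J = \tfrac{1}{2} x_N^T S_N x_N + \tfrac{1}{2}\sum_{k=0}^{N-1}\left(x_k^T Q x_k + u_k^T R u_k\right), \qquad N = 4$$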

A total of 4 experiments will be performed with this system, each experiment corresponding to one set of values for {A, Q, R}. The values for each of those parameters are listed below for the various experiments. As previously mentioned, A will take on values representing both stable and unstable systems. For each value of A, Q and R will take on a pair of values representing heavy weighting of the state variable relative to the control variable, and the reverse of those values. The specific values for the parameters in these experiments are:

The solution to each of these experiments will be presented one at a time, and then plots of the various signals will be given to compare the results. For the first experiment, we have an unstable open-loop system and heavy weighting of the state variable relative to the control variable. The Riccati equation is initialized and solved backward in time for 4 time steps. The corresponding gain matrix is also calculated. The state is then initialized, the feedback control signal is computed, and the resulting new state is determined. This process is continued until the state for the final time is computed. The expressions for the control law, LQR gain matrix, and Riccati equation are
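The expressions referred to above are not reproduced here; the standard finite-horizon discrete-time LQR relations are

$$u_k = -G_k x_k$$

$$G_k = \left(R + B^T S_{k+1} B\right)^{-1} B^T S_{k+1} A$$

$$S_k = A^T S_{k+1} A + Q - A^T S_{k+1} B \left(R + B^T S_{k+1} B\right)^{-1} B^T S_{k+1} A, \qquad S_N \text{ given}$$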

The "cost-to-go" values for the performance index (the value of J from k =k0 to k=N) are also computed for each k from 0 to N, using the expression

For the case of {A, Q, R} = {2, 20, 2}, the results of the design and simulation are shown in the table below.

Note that no values are recorded for the LQR gain Gk and the control signal uk at the final time. With the finite final time performance index, these values are only computed for k=0 to k=N-1. The optimization problem is over when the state reaches the final time. Since the LQR gain at time k depends on the Riccati matrix at time k+1 and the state at time k+1 depends on the control at time k, the values of the gain and control signal at the final time are not needed or used. Other control strategies must be used when the system reaches the final time.

Note that the value of the state variable is reduced approximately 86% from its initial condition in one time step. The large value of Q relative to R forces this to happen. The final state of approximately 0.01 illustrates the effect of a large value of SN. Even though the final time is only 4, it can be seen that the Riccati solution Sk and the LQR gain Gk quickly reach steady-state values.
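The backward-recursion/forward-simulation procedure described above can be sketched in a few lines of Python for this first experiment. The values of B, SN, and the initial state x0 are not given in the text, so B = 1, SN = 5, and x0 = 1 are assumptions made for illustration only:

```python
# Scalar finite-horizon LQR, experiment 1: {A, Q, R} = {2, 20, 2}.
# B = 1, SN = 5, and x0 = 1 are assumed values, not taken from the text.
A, B, Q, R, SN, N = 2.0, 1.0, 20.0, 2.0, 5.0, 4

# Backward pass: Riccati recursion for S_k and gains G_k, k = N-1, ..., 0.
S = [0.0] * (N + 1)
G = [0.0] * N
S[N] = SN
for k in range(N - 1, -1, -1):
    G[k] = (B * S[k + 1] * A) / (R + B * S[k + 1] * B)
    S[k] = A * S[k + 1] * A + Q - A * S[k + 1] * B * G[k]

# Forward pass: simulate x_{k+1} = A x_k + B u_k with u_k = -G_k x_k,
# and record the cost-to-go J_k = 0.5 * S_k * x_k^2 at each step.
x = [0.0] * (N + 1)
J = [0.0] * (N + 1)
x[0] = 1.0
for k in range(N):
    u = -G[k] * x[k]
    x[k + 1] = A * x[k] + B * u
    J[k] = 0.5 * S[k] * x[k] ** 2
J[N] = 0.5 * SN * x[N] ** 2
```

With these assumed values, the qualitative behavior described above is reproduced: the state drops sharply in the first step, and the Riccati solution is essentially at steady state after one or two backward steps.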

For the case of {A, Q, R} = {2, 2, 20}, the results of the design and simulation are shown in the table below.

When the values of Q and R are interchanged, the performance index places more emphasis on keeping the magnitude of the control signal small than on making the magnitude of the state variable small. With this new PI, the value of the control gain is smaller, so the magnitude of the control signal will also be smaller, and the state variable will not be driven toward the origin as rapidly. This is seen in the entries of the table above. Also seen in the table is the fact that the Riccati equation has not reached a steady-state solution in 4 time steps. One conclusion that can be drawn from these first two experiments is that when Q is larger than R, all the variables in the system will respond more rapidly, and that this more rapid response requires larger control signals for a given value of the state variable.

For the case of {A, Q, R} = {0.5, 20, 2}, the results of the design and simulation are shown in the table below.

The model now represents an open-loop stable system. Controlling this system should be easier than in the previous two experiments, since the natural tendency of an asymptotically stable system is to move toward the origin without any feedback. The large value of Q relative to R will also cause the state to move from its initial value toward 0 quickly. The open-loop stability of the system is reflected in the smaller gain values than were seen in the previous experiments. The large value of Q relative to R in this performance index is seen by the large reduction in the value of the state variable in the first time step, and by the very small values for the state variable and the performance index at the final time. As in the first experiment, the Riccati equation solution reaches steady state very quickly.

For the case of {A, Q, R} = {0.5, 2, 20}, the results of the design and simulation are shown in the table below.

In the last experiment, the open-loop system is stable, but we have a small value of Q relative to R. Just as was seen in the second experiment, the state variable does not approach the origin as closely as before, and the value of the performance index is higher at the final time. The Riccati equation solution has not reached steady state in this time interval.

## Riccati equation solutions
The numbers next to the curves in each of these figures are the experiment numbers given in the table at the beginning of this example. In each experiment, the Riccati variable is initialized at the same value, namely at SN. The two solutions where the performance index has a large value of Q relative to R reach steady state essentially in one time step. The two graphs for unstable open-loop systems have larger steady-state values than the solutions for the stable open-loop systems.

For the first-order system considered in this example, the steady-state solutions of the Riccati equation can be found by setting Sk = Sk+1 = Sss, and using the quadratic formula to solve for Sss, remembering that it has to be non-negative. The steady-state value of the LQR gain is determined by substituting Sss into the normal gain equation. The expressions for the steady-state solutions are:
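The expressions themselves are not reproduced here. Carrying out that calculation for the scalar case gives the following (a sketch, with B retained symbolically since its value is not stated in the text). Setting Sk = Sk+1 = Sss in the Riccati equation and clearing the denominator yields the quadratic

$$B^2 S_{ss}^2 - \left[Q B^2 + \left(A^2 - 1\right)R\right] S_{ss} - QR = 0$$

whose non-negative root, together with the corresponding gain, is

$$S_{ss} = \frac{Q B^2 + \left(A^2 - 1\right)R + \sqrt{\left[Q B^2 + \left(A^2 - 1\right)R\right]^2 + 4QRB^2}}{2B^2}, \qquad G_{ss} = \frac{B S_{ss} A}{R + B^2 S_{ss}}$$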

When there is a very large or very small ratio of Q to R and a very large or very small magnitude for A, the following limiting values for the steady-state solutions can be obtained. Although the values in this example do not exactly fit the limiting conditions, the solutions for S0 and G0 should be compared to those limiting steady-state expressions to see the applicability of the expressions. The computed values for the initial time are shown next to the limiting expressions. Note that the ordering of the rows is the same in the two tables.
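The limiting tables are not reproduced here. For the scalar case, the following limits can be derived from the steady-state quadratic above (a sketch, with B carried symbolically; each line assumes the indicated regime):

$$\frac{Q}{R} \to \infty: \qquad S_{ss} \to Q, \qquad G_{ss} \to \frac{A}{B}, \qquad A - B G_{ss} \to 0$$

$$\frac{Q}{R} \to 0,\ |A| < 1: \qquad S_{ss} \to \frac{Q}{1 - A^2}, \qquad G_{ss} \to 0, \qquad A - B G_{ss} \to A$$

$$\frac{Q}{R} \to 0,\ |A| > 1: \qquad S_{ss} \to \frac{\left(A^2 - 1\right)R}{B^2}, \qquad G_{ss} \to \frac{A^2 - 1}{A B}, \qquad A - B G_{ss} \to \frac{1}{A}$$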

## LQR gains
The gains for the two experiments where the open-loop system is unstable are significantly larger than the gains for the stable open-loop systems. More control effort will be required to force the state of the system to the origin when the natural tendency of the system is for the state to become unbounded.

Although eigenvalues do not have all the same interpretations for time-varying systems that they do for time-invariant systems, it is instructive to look at the eigenvalues of the closed-loop system with the steady-state gain. For this example, the "steady-state" eigenvalues are given by
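With the steady-state gain in the feedback law, the closed-loop system is $x_{k+1} = (A - B G_{ss})\, x_k$, so the "steady-state" eigenvalue is

$$\lambda_{cl} = A - B\, G_{ss}$$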

Of particular interest are the results for experiments 2 and 4, when R > Q. In this situation, the "cost" of control is expensive, and it is more important to keep the control signal small than it is to make the state variable small. If the open-loop system is stable (experiment 4), the optimal solution is to use very little control (in the limit, no control), and the corresponding gain is near 0, and the closed-loop eigenvalue is near the open-loop eigenvalue (0.5). If the open-loop system is unstable (experiment 2), control is required, and the optimal policy is to reflect the unstable open-loop eigenvalue (2) inside the unit circle to its reciprocal value (0.5).
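The reflection and near-open-loop behavior described above can be checked numerically by iterating the scalar Riccati recursion to convergence for each of the four experiments. This is a sketch with B = 1 assumed, since the text does not give its value:

```python
# Steady-state LQR gain for a scalar system, found by iterating the
# Riccati recursion from S = 0 until it settles. B = 1 is assumed below.
def steady_state_gain(A, B, Q, R, iters=500):
    S = 0.0
    for _ in range(iters):
        S = A * A * S + Q - (A * S * B) ** 2 / (R + B * B * S)
    return B * S * A / (R + B * B * S)

# {A, Q, R} for experiments 1-4, as listed in this example.
experiments = {1: (2.0, 20.0, 2.0), 2: (2.0, 2.0, 20.0),
               3: (0.5, 20.0, 2.0), 4: (0.5, 2.0, 20.0)}
gains = {}
eigs = {}
for n, (A, Q, R) in experiments.items():
    G = steady_state_gain(A, 1.0, Q, R)
    gains[n] = G
    eigs[n] = A - G  # closed-loop eigenvalue A - B*Gss with B = 1
    print(f"experiment {n}: Gss = {G:.4f}, closed-loop eigenvalue = {A - G:.4f}")
```

With B = 1, experiment 2 places its closed-loop eigenvalue near 0.5, the reciprocal of the unstable open-loop eigenvalue, while experiment 4 keeps the gain near 0 and the eigenvalue near the open-loop value.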

## Optimal state trajectories
In this set of graphs, it is easy to see the effect of changing the relative values of Q and R. When Q is larger than R, the state variable approaches 0 quickly, but when Q is smaller than R, the state decays more slowly. In the 4 time steps of this example, the state has not gotten very close to 0 when Q is small. For both sets of Q and R values, the stable open-loop system gets closer to the origin in the 4 time steps than does the unstable open-loop system. If the goal is to drive the state to the origin quickly, the obvious choice is to make Q large relative to R. Note that the value of the state at time 1 equals the initial state multiplied by the closed-loop eigenvalue at the initial time.

## Optimal control signals
Since the initial condition for the state variable in each experiment is the same, the value for the control signal u0 is proportional to the gain G0. But the value of G0 depends on the open-loop stability of the system as well as on Q and R. For given values of the weighting parameters, the unstable system will have a larger gain and larger initial control signal than the open-loop stable system does. For a given case of open-loop stability, the LQR gain will be larger when Q is larger than R. The larger control signal, however, will drive the state closer to 0 than the smaller signal, so the situation becomes more complex for k > 0. The graphs show that although the larger value of Q produces a larger value of u at the initial time, the control signal is smaller for the later time steps, particularly when the open-loop system is unstable. As the state variable becomes small in value, the value of the control signal will also become small.

## Values of the performance indices
The "cost-to-go" values of the performance indices are plotted on a semilog scale since the values have a large dynamic range. With the quadratic performance index, for a given set of {A, B, Q, R} values, the optimal control that results from solving the Riccati equation is unique. The value of J0 is the value of the quadratic performance index when the optimal control is applied to the system from the given initial condition. Therefore, each of the curves represents the optimal performance for a given set of conditions.

The values of A and B are dictated by the physics of the system, and the control engineer generally has little or no influence on those values. On the other hand, the values of Q and R can be chosen by the designer so that performance requirements are (hopefully) satisfied.

Consider first the situation with the unstable open-loop system (experiments 1 and 2). From the values of those two performance indices, having Q larger than R seems to be the better choice since that leads to smaller values of Jk for each k. The state trajectories would also favor that choice since the state variable decays much more rapidly with the larger Q and ends much closer to the origin. The only bad news with this choice is in the value of the control signal at the initial time. With the larger Q, the initial control signal is 23% larger in magnitude than with the smaller value for Q. If this larger magnitude control signal is acceptable, then the values of the weighting parameters in experiment 1 should be chosen.

Now consider the situation with the stable open-loop system (experiments 3 and 4). The choice is perhaps not as clear as it was for the other experiments. Looking only at the value of J0, the obvious choice is to have Q smaller than R; this reduces the value of the performance index by a factor of 7.8. However, making that choice also means accepting the fact that the state variable does not decay very rapidly and does not approach the origin very closely at the final time. When the control signals are examined, the only significant difference is at the initial time. Having Q smaller than R reduces the magnitude of that initial control value by a factor of 7.7, which could be a significant difference. Based on the lower total cost and the lower magnitude of the initial control signal, the proper choice might be to use the smaller value of Q, that is, the weighting parameters of experiment 4.