Latin American Applied Research 39:207-211 (2009)

PARTIAL DIFFERENTIAL EQUATIONS FOR MISSING BOUNDARY CONDITIONS IN THE LINEAR-QUADRATIC OPTIMAL CONTROL PROBLEM

V. COSTANZA† and C. E. NEUMAN‡

† INTEC (UNL-CONICET), Güemes 3450, 3000 Santa Fe, Argentina. tsinoli@ceride.gov.ar
‡ Dep. de Matemática (FIQ), Universidad Nac. del Litoral, Sgo. del Estero 2829, 3000 Santa Fe, Argentina. ceneuman@fiqus.unl.edu.ar

Abstract-- New equations involving the unknown final states and initial costates corresponding to families of LQR problems are found, and their solutions are computed and validated. Having the initial values of the costates, the optimal control can then be constructed, for each particular problem, from the solution to the Hamiltonian equations, now achievable through on-line integration. The missing boundary conditions are obtained by solving (off-line) two uncoupled, first-order, quasi-linear partial differential equations for two auxiliary $n \times n$ matrices, whose independent variables are the time-horizon duration $T$ and the final-penalty matrix $S$. The solutions to these PDEs give information on the behavior of the whole two-parameter family of control problems, which can be used for design purposes. The mathematical treatment takes advantage of the symplectic structure of the Hamiltonian formalism, which allows one of Bellman's conjectures related to the "invariant-imbedding" methodology to be reformulated. Results are tested against solutions of the differential Riccati equations associated with these problems, and the attributes of the two approaches are illustrated and discussed.

Keywords-- optimal control, linear-quadratic problem, first order PDEs, boundary-value problems, Riccati equations.

I. INTRODUCTION

The linear-quadratic regulator (LQR) problem is probably the most studied and used in the state-space optimal control literature. The main line of work in this direction has evolved around the algebraic (ARE, for infinite-horizon problems) and differential (DRE, for finite-horizon ones) Riccati equations. When expressed in 2n-phase space, i.e. introducing the costate (the spatial derivative of the value function), the dynamics of the optimal control problem takes the form of the classical Hamilton's equations of fundamental Physics.

Since the early sixties, the Hamiltonian formalism has been at the core of the development of modern optimal control theory (Pontryagin et al., 1962). When the problem concerning an n-dimensional system and an additive cost objective is regular (Kalman et al., 1969), i.e. when the Hamiltonian of the problem can be uniquely optimized by a control value $u^0$ depending on the remaining variables $(t,x,\lambda)$, then a set of 2n ordinary differential equations (ODEs) with a two-point boundary-value condition, known as Hamilton's (or Hamiltonian) equations (HE), has to be solved. This is often a rather difficult numerical problem. For the linear-quadratic regulator (LQR) with a finite horizon there exist well known methods (see for instance Sontag, 1998) to transform the boundary-value problem into an initial-value one. For the infinite-horizon, bilinear-quadratic regulator and change of set-point servo, there is a recent attempt to find the missing initial condition for the costate variable, which makes it possible to integrate the equations on-line with the underlying control process (Costanza and Neuman, 2006).

Hamiltonian systems (modelled by a 2n-dimensional ODE whose vector field can be expressed in terms of the partial derivatives of an underlying "total energy" function, called "the Hamiltonian", which is constant along trajectories) are key objects in Mathematical Physics. The ODEs for the state and costate of an optimal control problem referred to above constitute a Hamiltonian system from this general point of view. Richard Bellman contributed to both fields, but was particularly interested in symplectic systems coming from Physics (see for instance Abraham and Marsden, 1978) when he devised a partial differential equation (PDE) for the final value of the state $x(t_f)=r(T,c)$ as a function of the duration of the process $T=t_f-t_0$ and of the final value imposed on the costate $\lambda(t_f)=c$ (one of the boundary conditions, the other being the fixed initial value of the state $x(t_0)=x_0$; see Bellman and Kalaba, 1963). Bellman exploited in that case ideas common to the "invariant imbedding" numerical techniques, also associated with his name.

In Costanza (2008) the invariant imbedding approach is generalized and proved for the one-dimensional nonlinear-quadratic optimal control situation, where the final value of the costate depends on the final value of the state, i.e. $c=c(r)$. The procedure followed in this proof induces another PDE for the initial value $\sigma$ of the costate $\lambda(t_0)$, which was actually the main concern from the optimal control point of view. The first-order quasilinear equation for $\sigma$ developed here is new. It can be integrated after the PDE for the final state $\rho$ (independent of $\sigma$) has been solved. The "initial" condition for $\sigma$ depends on the final value of the state and on the weight matrix $S$ involved in the quadratic final penalty $x'(T)Sx(T)$. Therefore it seems more natural to consider here $(T,S)$ as the independent parameters of the family of control problems under consideration. Having found the solution $\sigma(T,S)$, the HE can be integrated for each particular value of the parameters. However, the whole curves $\rho(\cdot,S)$, $\sigma(\cdot,S)$ can be useful in real time, as a kind of safeguard against unexpected departures of the numerical solution to the HE. Normal solutions to Hamiltonian systems are unstable near equilibrium, a characteristic inherent to the spectrum of their linearizations (if $\lambda$ is an eigenvalue of the linear approximation, so is $-\lambda$; see Abraham and Marsden, 1978).

In what follows, after some notation and general characteristics of the problem are presented in Section II, the main PDEs for the missing boundary conditions are proved in Section III. Numerical validations and illustrations are provided in Section IV, and the whole approach is discussed in the Conclusions. An Appendix is added to substantiate the general set-up valid for the nonlinear case, and the corresponding equations for the one-dimensional case are reviewed (see Costanza, 2008, for additional details).

II. FORMULATION OF THE PROBLEM

The classical finite-horizon formulation of the "LQR problem" for finite-dimensional, constant systems attempts to minimize the (quadratic) cost

\[ J_{T,0,x_0}(u(\cdot)) = \int_0^T \left[ x'(\tau)Qx(\tau)+u'(\tau)Ru(\tau) \right] d\tau + x'(T)Sx(T) \tag{1} \]

with respect to all admissible control trajectories $u(\cdot)$ of duration $T$ applied to some fixed, deterministic (linear) plant; i.e. those affecting the $\mathbb{R}^n$-valued states $x$ of the system through some dynamic restriction

\[ \dot{x} = f(x,u) = Ax+Bu, \quad x(0) = x_0 \neq 0. \tag{2} \]

The (real, time-constant) matrices in Eqs. (1, 2) will have the following properties: $Q$, $R$, $S$ symmetric, $Q, S \geq 0$, $R>0$, $A \in M_n(\mathbb{R})$, $B$ is $n \times m$. The expression under the integral is usually known as the "Lagrangian" $L$ of the cost, i.e.,

\[ L(x,u) \equiv x'Qx+u'Ru. \tag{3} \]

The Hamiltonian of such a problem, namely the $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ function defined by

\[ H(x,\lambda,u) \equiv L(x,u)+\lambda' f(x,u), \tag{4} \]

is known to be regular, i.e. $H$ is uniquely minimized with respect to $u$ by the control value

\[ u^0(x,\lambda) = -\tfrac{1}{2} R^{-1} B'\lambda \tag{5} \]

(in this case, independent of $x$), which is usually called "the H-minimal control." The "Hamiltonian" form of the problem (see for instance Sontag, 1998) then requires solving the two-point boundary-value problem

\[ \dot{x} = H^0_\lambda(x,\lambda); \quad x(0) = x_0, \tag{6} \]

\[ \dot{\lambda} = -H^0_x(x,\lambda); \quad \lambda(T)=2Sx(T), \tag{7} \]

where $H^0(x,\lambda)$ stands for $H(x,\lambda,u^0(x,\lambda))$, and $H^0_\lambda$, $H^0_x$ for the column vectors with i-components $\partial H^0/\partial\lambda_i$, $\partial H^0/\partial x_i$ respectively, which here take the form

\[ \dot{x} = Ax - \tfrac{1}{2}W\lambda \tag{8} \]

\[ \dot{\lambda} = -2Qx - A'\lambda \tag{9} \]

with $W \equiv BR^{-1}B'$. There is no general solution method for boundary-value problems. In the following section a novel approach to overcome this difficulty, by imbedding the individual situation into a two-parameter family of similar problems, will be presented and substantiated.

III. EQUATIONS FOR THE MISSING BOUNDARY CONDITIONS

For the nonlinear-quadratic one-dimensional case two quasilinear first-order PDEs (53, 55) have been found (see the Appendix), one for each of the missing boundary conditions of the problem, namely the final state $x(T)$ and the initial costate $\lambda(0)$. Unfortunately such PDEs cannot be extrapolated to higher dimensions in an obvious way. However, the immersion of the problem into a two-parameter $(T,S)$ family is still fruitful, as will be evident from what follows.

It is well known that the LQR problem has a unique solution via the differential Riccati equation (DRE)

\[ \dot{\pi} = \pi W\pi - \pi A - A'\pi - Q; \quad \pi(T)=S, \tag{10} \]

leading to the optimal feedback

\[ u^*(t) = -R^{-1}B'\pi(t)x(t). \tag{11} \]

An alternative classical approach (see for instance Bernard, 1972) transforms the original boundary-value problem into an initial-value one, by introducing the following auxiliary objects:

(i) the Hamiltonian matrix $\mathbf{H}$,

\[ \mathbf{H} = \begin{pmatrix} A & -\tfrac{1}{2}W \\ -2Q & -A' \end{pmatrix}, \tag{12} \]

(ii) and the augmented Hamiltonian system (a linear matrix ODE) defined for two $n \times n$ matrices $X(t)$, $\Lambda(t)$, $t \in [0,T]$, through

\[ \begin{pmatrix} \dot{X} \\ \dot{\Lambda} \end{pmatrix} = \mathbf{H} \begin{pmatrix} X \\ \Lambda \end{pmatrix}; \quad \begin{pmatrix} X(T) \\ \Lambda(T) \end{pmatrix} = \begin{pmatrix} I \\ 2S \end{pmatrix}. \tag{13} \]

The solution to system (13) is

\[ \begin{pmatrix} X(0) \\ \Lambda(0) \end{pmatrix} = e^{-\mathbf{H}T} \begin{pmatrix} I \\ 2S \end{pmatrix}, \tag{14} \]

and since in this case Eqs. (6-7) read

\[ \begin{pmatrix} \dot{x} \\ \dot{\lambda} \end{pmatrix} = \mathbf{H} \begin{pmatrix} x \\ \lambda \end{pmatrix}, \tag{15} \]

the missing boundary conditions can be explicitly found:

\[ x(T) = X^{-1}(0)\,x_0, \tag{16} \]

\[ \lambda(0) = \Lambda(0)X^{-1}(0)\,x_0 \tag{17} \]

(see Sontag, 1998, for the invertibility of $X$ and other details). Actually, the whole solution to the DRE can be recovered from the solution to the augmented system (13), namely

\[ \pi(t) = \tfrac{1}{2}\Lambda(t)X^{-1}(t), \quad t \in [0,T], \tag{18} \]

where the factor 1/2 is required by the terminal values $X(T)=I$, $\Lambda(T)=2S$ and $\pi(T)=S$.

However, it is desirable to have the missing boundary values available for different values of the parameters $T$, $S$ without solving either the DRE or the augmented system described above. A method to solve the whole $(T,S)$-family of LQR problems (with common $A$, $B$, $Q$, $R$, $x_0$ values) was then developed. To be precise, just the case $S=sI$, with $s \geq 0$, will be presented, the extension to non-scalar matrices being operationally more involved but not conceptually different. Nevertheless, the present set-up is pertinent to many applications. From now on, the notation for the relevant missing boundary values and matrices will be

\[ \rho \equiv x(T); \quad \sigma \equiv \lambda(0); \quad U \equiv e^{-\mathbf{H}T}. \tag{19} \]

The method starts by defining, for each particular $(T,S)$-problem,

\[ \begin{pmatrix} \alpha(T,S) \\ \beta(T,S) \end{pmatrix} \equiv \begin{pmatrix} X(0) \\ \Lambda(0) \end{pmatrix}, \tag{20} \]

which allows Eq. (14) to be rewritten in the form

\[ \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \equiv U \begin{pmatrix} I \\ 2S \end{pmatrix}. \tag{21} \]

Since the underlying Hamiltonian system is linear, solutions depend smoothly on parameters and initial conditions, and then derivatives of Eq. (21) with respect to $(T,S)$ can be taken:

\[ \begin{pmatrix} \alpha_T \\ \beta_T \end{pmatrix} = -\mathbf{H}U \begin{pmatrix} I \\ 2S \end{pmatrix}, \tag{22} \]

\[ \begin{pmatrix} \alpha_S \\ \beta_S \end{pmatrix} = U \begin{pmatrix} 0 \\ 2I \end{pmatrix}. \tag{23} \]

Now, by partitioning the matrix $U$ in the obvious way,

\[ U = \begin{pmatrix} U_1 & U_2 \\ U_3 & U_4 \end{pmatrix}, \tag{24} \]

Eq. (23) reads

\[ \tfrac{1}{2}\alpha_S = U_2, \quad \tfrac{1}{2}\beta_S = U_4, \tag{25} \]

which combined with Eq. (21) gives

\[ U_1 = \alpha - S\alpha_S, \quad U_3 = \beta - S\beta_S, \tag{26} \]

and then, by inserting these results in Eq. (22) (recalling that $\mathbf{H}$ and $U$ commute), the following (main) relations are obtained:

\[ \alpha_T - \alpha_S M = -\alpha N, \tag{27} \]

\[ \beta_T - \beta_S M = -\beta N, \tag{28} \]

where $M \equiv A'S + SA + Q - SWS$ and $N \equiv A - WS$. Boundary conditions for a process of zero horizon are imposed in view of Eqs. (20, 16, 17), i.e.

\[ \alpha(0,S)=I, \quad \beta(0,S)=2S. \tag{29} \]

The desired missing values for the state and costate, for any $(T,S)$-problem, may then be recovered from the solutions $\alpha$ and $\beta$ through

\[ \rho = \alpha^{-1}x_0, \quad \sigma = \beta\alpha^{-1}x_0 = \beta\rho. \tag{30} \]

It can be easily checked that the PDEs and boundary conditions (53-54, 55-56), whose validity is already known in the one-dimensional nonlinear case (see the Appendix), are also verified by the scalar version of Eqs. (30), namely

\[ \rho = x_0/\alpha, \quad \sigma = \beta x_0/\alpha \tag{31} \]

($\alpha$ is always nonzero), provided that $\alpha$, $\beta$ satisfy Eqs. (27, 28, 29).

Two other possible reformulations of these equations for linear n-dimensional systems may be explored (with theoretical rather than practical purposes): the scalar (internal) product form

\[ \rho'\rho_T - (M\rho)'\rho_S \approx \left((A-WS)\rho\right)'\rho \tag{32} \]

and the matrix (external product) form

\[ \rho_T\rho' - \rho_S(M\rho)' \approx \rho\left((A-WS)\rho\right)' \tag{33} \]

(and the corresponding analogous equations for $\sigma$). Both proved to be insufficient to predict the desired values of $\rho$, $\sigma$, so it was necessary to generalize to matrix equations for $\alpha$ and $\beta$ in order to solve the general linear problem. In the next section some results of numerical calculations involving the solutions to Eqs. (27, 28) are examined and compared against the DRE approach.

IV. NUMERICAL CALCULATIONS AND ADDITIONAL VALIDATIONS

Equations (27, 28, 29) were solved numerically with standard software in several cases, and the solutions were tested to verify the following identities stemming from the symplectic structure of the problem (see Katok and Hasselblatt, 1999; Jacobson, 1974):

\[ U_1'U_4 - U_3'U_2 = I = U_4'U_1 - U_2'U_3, \tag{34} \]

\[ U_1'U_3 - U_3'U_1 = 0 = U_2'U_4 - U_4'U_2. \tag{35} \]

Also, to illustrate the theoretical results, some components of the solutions $(\rho, \sigma)$ for the linear system with

\[ A = \begin{pmatrix} -2 & 0 \\ 3 & -1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \tag{36} \]

subject to the quadratic Lagrangian defined by the matrices

\[ Q = \begin{pmatrix} 1 & 0.3 \\ 0.3 & 2 \end{pmatrix}, \quad r = 1, \]

were plotted for the range $(T,S) \in [0,1] \times [0,1]$ in Figs. 1 and 2.

Figure 1: First component $\rho_1(T,s)$ of the final state value, calculated from matrix $\alpha$ (surface over $T, S \in [0,1]$).

Figure 2: First component $\sigma_1(T,s)$ of the initial costate, from matrices $\alpha$ and $\beta$ (surface over $T, S \in [0,1]$).

The results were also compared with the DRE solution for several intermediate $(T,S)$-problems, which showed almost no difference, as can be seen in Figure 3.

Figure 3: Difference $\delta$ (vertical scale $\times 10^{-4}$) between $\rho$ calculated from matrix $\alpha$ and by solving the DRE, over $T, s \in [0,1]$.

The time trajectories were also calculated from the initial conditions $\sigma(0.5,sI)$ for several values of $s$, just to illustrate how the regulation capacity (the approach to (0,0)) increases with increasing final penalty (Fig. 4). It is also interesting to observe that the two components of the state tend to the origin, since the optimal LQR control is stabilizing, but they do not decrease monotonically as the results in Fig. 1 may suggest. Actually, for small final penalty ($s=0.25$), the second component first grows from its initial value (0.1), and only after some time does it head towards equilibrium (Figs. 4 and 5).

Figure 4: Trajectories in phase space.

Figure 5: State time-trajectories ($x_1$, $x_2$ versus $t$) for small final penalty ($s=0.25$).

For this example both members of relations (32) and (33) were also evaluated. An agreement of the order of $10^{-3}$ was observed. Since the values of the variables involved in the calculations are relatively small, firm hypotheses on the soundness of Eqs. (32, 33) cannot be raised yet.

V. CONCLUSIONS

New equations to transform the classical two-point boundary-value ODE system associated with the Hamiltonian formulation of finite-horizon LQR problems into an initial-value set-up with unique solution have been derived, solved and illustrated. The approach is based on invariant-imbedding ideas, although the original Bellman methodology proved somewhat inadequate in this case. By solving the two matrix, quasilinear, first-order PDEs for the auxiliary variables $\alpha$, $\beta$ proposed here, the missing boundary conditions are effectively recovered after simple manipulations. Actually, the auxiliary variables are found for a two-parameter family of LQR problems posed for fixed plant dynamics and trajectory costs, but with variable final penalty and horizon spans. This immersion allows a whole range of $(T,S)$-problems to be assessed by looking at the final reachable state $\rho(T,S)$ and the associated marginal cost $\sigma(T,S)$.

It is remarkable that the solution to a twice-infinite family of LQR problems requires little numerical effort, roughly similar to that involved in running the associated DRE for just one individual situation. The solution for a range of $(T,S)$-values provides design information, useful when there is freedom to choose the parameters so as to improve performance.

The soundness of this approach for linear plants also seems promising in suggesting an algorithm to solve multidimensional nonlinear problems with regular Hamiltonians, even allowing for more general Lagrangians than those described by quadratic forms. This idea is already under development.
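As a small numerical illustration of Eqs. (27)-(31) of Section III, the following sketch treats the scalar case n = 1. It is not part of the computations reported in the paper: the parameter values (a, w, q, s, T, x0), the function names, the central finite differences and the fixed-step RK4 integrator are all assumptions chosen for illustration. The sketch relies on the fact that, for n = 1, the square of the Hamiltonian matrix is (a² + wq) times the identity, so U = exp(-HT) has a closed form. It then evaluates the residuals of the PDEs (27)-(28) and compares σ with 2π(0)x0, where π(0) comes from integrating the scalar DRE (10) backwards.

```python
import math

def alpha_beta(T, s, a, w, q):
    """Scalar alpha(T, s), beta(T, s) from Eq. (21), using the closed form
    exp(-H T) = cosh(mu T) I - (sinh(mu T) / mu) H, valid since H @ H = mu^2 I."""
    mu = math.sqrt(a * a + w * q)          # assumes a^2 + w q > 0
    ch, sh = math.cosh(mu * T), math.sinh(mu * T) / mu
    u1, u2 = ch - a * sh, 0.5 * w * sh     # first row of U
    u3, u4 = 2.0 * q * sh, ch + a * sh     # second row of U
    return u1 + 2.0 * s * u2, u3 + 2.0 * s * u4

def pde_residuals(T, s, a, w, q, h=1e-5):
    """Residuals of Eqs. (27)-(28); derivatives by central differences."""
    M = 2.0 * a * s + q - w * s * s        # scalar M = A'S + SA + Q - SWS
    N = a - w * s                          # scalar N = A - WS
    al, be = alpha_beta(T, s, a, w, q)
    aTp, bTp = alpha_beta(T + h, s, a, w, q)
    aTm, bTm = alpha_beta(T - h, s, a, w, q)
    aSp, bSp = alpha_beta(T, s + h, a, w, q)
    aSm, bSm = alpha_beta(T, s - h, a, w, q)
    al_T, be_T = (aTp - aTm) / (2 * h), (bTp - bTm) / (2 * h)
    al_S, be_S = (aSp - aSm) / (2 * h), (bSp - bSm) / (2 * h)
    return al_T - al_S * M + al * N, be_T - be_S * M + be * N

def riccati_initial_value(T, s, a, w, q, steps=2000):
    """pi(0) from the scalar DRE (10), integrated backwards from pi(T) = s (RK4)."""
    f = lambda p: w * p * p - 2.0 * a * p - q
    p, h = s, -T / steps                   # negative step: from t = T down to t = 0
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return p

# Illustrative (assumed) data: stable scalar plant with unit weights.
a, w, q, x0 = -1.0, 1.0, 1.0, 1.0
T, s = 1.0, 0.5
alpha, beta = alpha_beta(T, s, a, w, q)
rho = x0 / alpha                           # final state x(T), Eq. (31)
sigma = beta * x0 / alpha                  # initial costate lambda(0), Eq. (31)
res_alpha, res_beta = pde_residuals(T, s, a, w, q)
sigma_dre = 2.0 * riccati_initial_value(T, s, a, w, q) * x0
```

The comparison with the DRE uses λ(0) = 2π(0)x0, which follows from the H-minimal control u⁰ = -(1/2)R⁻¹B'λ and the feedback u* = -R⁻¹B'πx. For n > 1 no such closed form of U is available in general, and a numerical matrix exponential or a PDE solver would be needed instead.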
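The two-dimensional example of Section IV can be sketched in the same spirit. The code below is a hedged illustration, not the authors' implementation: it builds U = exp(-HT) with a simple scaling-and-squaring Taylor series (an assumed substitute for the "standard software" mentioned in the text), checks the symplectic identities (34)-(35), recovers ρ and σ from Eqs. (21) and (30) with S = sI, and verifies that the Hamiltonian equations (15) started at (x0, σ) reach (ρ, 2sρ) at time T. The initial state x0 = (0.2, 0.1)' is an assumption; only the value 0.1 of its second component is mentioned in the text.

```python
def matmul(A, B):
    """Plain-Python product of two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, squarings=8, terms=20):
    """Matrix exponential: truncated Taylor series of A / 2^k, squared k times."""
    n = len(A)
    B = [[a / 2.0 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, B)]
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

# Hamiltonian matrix (12) for A, b, Q, r of Section IV, with W = b b' / r.
H = [[-2.0,  0.0, -0.5, -0.5],
     [ 3.0, -1.0, -0.5, -0.5],
     [-2.0, -0.6,  2.0, -3.0],
     [-0.6, -4.0,  0.0,  1.0]]

def missing_values(T, s, x0):
    """Return rho = x(T), sigma = lambda(0) and the blocks of U for S = s I."""
    U = expm([[-h * T for h in row] for row in H])          # Eq. (19)
    U1, U2 = [r[:2] for r in U[:2]], [r[2:] for r in U[:2]]
    U3, U4 = [r[:2] for r in U[2:]], [r[2:] for r in U[2:]]
    alpha = [[U1[i][j] + 2 * s * U2[i][j] for j in range(2)] for i in range(2)]
    beta = [[U3[i][j] + 2 * s * U4[i][j] for j in range(2)] for i in range(2)]
    det = alpha[0][0] * alpha[1][1] - alpha[0][1] * alpha[1][0]
    rho = [(alpha[1][1] * x0[0] - alpha[0][1] * x0[1]) / det,   # alpha^{-1} x0
           (alpha[0][0] * x0[1] - alpha[1][0] * x0[0]) / det]
    sigma = [beta[i][0] * rho[0] + beta[i][1] * rho[1] for i in range(2)]
    return rho, sigma, (U1, U2, U3, U4)

T, s, x0 = 0.5, 0.25, [0.2, 0.1]            # x0 is an assumed initial state
rho, sigma, (U1, U2, U3, U4) = missing_values(T, s, x0)

# Symplectic identities (34)-(35): should be the identity and the zero matrix.
sym_I = sub(matmul(transpose(U1), U4), matmul(transpose(U3), U2))
sym_0 = sub(matmul(transpose(U1), U3), matmul(transpose(U3), U1))

# Forward solution of Eq. (15) from (x0, sigma): expected to give (rho, 2 s rho).
E = expm([[h * T for h in row] for row in H])
z = [sum(E[i][j] * v for j, v in enumerate(x0 + sigma)) for i in range(4)]
```

Nested lists keep the sketch dependency-free; with a numerical library the same steps reduce to a matrix exponential, a block partition and one linear solve.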