Numerical Dynamic Programming in Economics¹

John Rust
Yale University

Contents

1. Introduction
2. Markov Decision Processes (MDP’s) and the Theory of Dynamic Programming
   2.1 Definitions of MDP’s, DDP’s, and CDP’s
   2.2 Bellman’s Equation, Contraction Mappings, and Blackwell’s Theorem
   2.3 Error Bounds for Approximate Fixed Points of Approximate Bellman Operators
   2.4 A Geometric Series Representation for MDP’s
   2.5 Examples of Analytic Solutions to Bellman’s Equation for Specific “Test Problems”
   2.6 Euler Equations and Euler Operators
3. Computational Complexity and Optimal Algorithms
   3.1 Discrete Computational Complexity
   3.2 Continuous Computational Complexity
   3.3 Computational Complexity of the Approximation Problem
4. Numerical Methods for Contraction Fixed Points
5. Numerical Methods for MDP’s
   5.1 Discrete Finite Horizon MDP’s
   5.2 Discrete Infinite Horizon MDP’s
       5.2.1 Successive Approximation, Policy Iteration, and Related Methods
       5.2.2 Numerical Comparison of Methods in Auto Replacement Problem
       5.2.3 New Approaches to Solving Large Scale Problems
   5.3 Continuous Finite Horizon MDP’s
       5.3.1 Discrete Approximation Methods
       5.3.2 Smooth Approximation Methods
   5.4 Continuous Infinite Horizon MDP’s
       5.4.1 Discrete Approximation Methods
       5.4.2 Smooth Approximation Methods
6. Conclusions
7. References

¹ Revised November 1994 draft for Handbook of Computational Economics, H. Amman, D. Kendrick and J. Rust (eds.). I am grateful for helpful comments by Hans Amman, Ken Judd, David Kendrick, Martin Puterman, and two not very anonymous referees, Charles Tapiero and John Tsitsiklis.

1. Introduction

This chapter surveys numerical methods for solving dynamic programming (DP) problems. The DP framework has been extensively used in economic modeling because it is sufficiently rich to model almost any problem involving sequential decision making over time and under uncertainty.² By a simple re-definition of variables virtually any DP problem can be formulated as a Markov decision process (MDP) in which a decision maker who is in state s_t at time t = 1,...,T takes an action a_t that determines current utility u(s_t,a_t) and affects the distribution of next period’s state s_{t+1} via a Markov transition probability p(s_{t+1}|s_t,a_t). The problem is to determine an optimal decision rule α that solves

   V(s) ≡ max_α E_α { Σ_{t=0}^{T} β^t u(s_t,a_t) | s_0 = s },

where E_α denotes expectation with respect to the controlled stochastic process {s_t,a_t} induced by the decision rule α ≡ {α_1,...,α_T}, and β ∈ (0,1) denotes the discount factor. What makes these problems especially difficult is that instead of optimizing over an ex ante fixed sequence of actions {a_0,...,a_T} one needs to optimize over sequences of functions {α_0,...,α_T} that allow the ex post decision a_t to vary as a best response to the current state of the process, i.e. a_t = α_t(s_t). The method of dynamic programming provides a constructive, recursive procedure for computing α using the value function V as a “shadow price” to decentralize a complicated stochastic/multiperiod optimization problem into a sequence of simpler deterministic/static optimization problems.³ In infinite horizon problems V is the unique solution to Bellman’s equation, V = Γ(V), where the Bellman operator Γ is defined by:

   Γ(V)(s) = max_{a ∈ A(s)} [ u(s,a) + β ∫ V(s′) p(ds′|s,a) ],      (1.1)

and the optimal decision rule can be recovered from V by finding a value α(s) ∈ A(s) that attains the maximum in (1.1) for each s ∈ S.

² See Stokey and Lucas 1987 for examples of DP models in economic theory. See Rust 1994a, 1994b for examples of DP models in econometrics.

³ In finite horizon problems V actually denotes an entire sequence of value functions, V ≡ {V_0,...,V_T}, just as α ≡ {α_0,...,α_T} denotes a sequence of decision rules. In the infinite-horizon case, the solution (V,α) reduces to a pair of functions of the current state s.
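To fix ideas, the listing below is a minimal illustrative sketch (not taken from the chapter) of a single application of the Bellman operator (1.1) to a discrete MDP with finitely many states and actions, returning both Γ(V) and a decision rule attaining the maximum. The arrays u and p and the discount factor beta are hypothetical inputs with u[s,a] = u(s,a) and p[s,a,s2] = p(s2|s,a).

import numpy as np

def bellman_operator(V, u, p, beta):
    # One application of the Bellman operator (1.1) to a value vector V
    # for a discrete MDP: u[s, a] = u(s,a), p[s, a, s2] = p(s2 | s, a).
    EV = p @ V                                # E[V(s') | s, a] for every (s, a) pair
    Q = u + beta * EV                         # u(s,a) + beta * E[V(s') | s, a]
    return Q.max(axis=1), Q.argmax(axis=1)    # Gamma(V)(s) and a maximizing alpha(s)

Backward induction in a finite horizon problem applies this operator recursively from the terminal period, while the infinite horizon methods discussed below instead look for its fixed point V = Γ(V).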
We review the main theoretical results about MDP’s in section 2, providing an intuitive derivation of Bellman’s equation in the infinite horizon case and showing that the value function V_α corresponding to a policy α is simply a multidimensional generalization of a geometric series. This implies that V_α can be computed as the solution to a system of linear equations, the key step in the Bellman 1955, 1957 and Howard 1960 policy iteration algorithm. The Bellman operator has a particularly nice mathematical property: Γ is a contraction mapping. A large number of numerical solution methods exploit the contraction property to yield convergent, numerically stable methods for computing approximate solutions to Bellman’s equation, including the classic method of successive approximations. Since V can be recovered from α and vice versa, the rest of this chapter focuses on methods for computing the value function V, with separate discussions of numerical problems involved in approximating α where appropriate.⁴

⁴ Special care must be taken in problems with a continuum of actions, since discontinuities or kinks in numerical estimates of the conditional expectation of V can create problems for numerical optimization algorithms used to recover α. Also, certain solution methods focus on computing α directly rather than indirectly by first computing V. We will provide special discussions of these methods in section 5.

From the standpoint of computation, there is an important distinction between discrete MDP’s whose state and control variables can assume only a finite number of possible values, and continuous MDP’s whose state and control variables can assume a continuum of possible values. The value functions for discrete MDP’s live in a subset B of the finite-dimensional Euclidean space R^|S| (where |S| is the number of elements in S), whereas the value functions of continuous MDP’s live in a subset B of the infinite-dimensional Banach space B(S) of bounded, measurable real-valued functions on S. Thus, discrete MDP problems can be solved exactly (modulo rounding error in arithmetic operations), whereas the solutions to continuous MDP’s can generally only be approximated. Approximate solution methods may also be attractive for solving discrete MDP’s with a large number of possible states |S| or actions |A|.

There are two basic strategies for approximating the solution to a continuous MDP: 1) discrete approximation and 2) smooth approximation. Discrete approximation methods solve a finite state MDP problem that approximates the original continuous MDP problem over a finite grid of points in the state space S and action space A. Since the methods for solving discrete MDP’s have been well developed and exposited in the operations research literature (e.g. Bertsekas 1987, Porteus 1980 and Puterman 1990, 1994), this chapter will only briefly review the standard approaches, focusing instead on recent research on discretization methods for solving continuous MDP’s.
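As a point of reference for the standard discrete methods reviewed in section 5, the sketch below (again purely illustrative, using the same hypothetical u, p and beta arrays as above) implements Howard policy iteration for a discrete infinite horizon MDP. The policy evaluation step exploits the geometric series representation mentioned above: V_α solves the linear system V_α = u_α + β M_α V_α, so V_α = (I − β M_α)^{-1} u_α.

import numpy as np

def policy_iteration(u, p, beta, max_iter=1000):
    # Policy iteration for a discrete infinite horizon MDP with
    # u[s, a] = u(s,a) and p[s, a, s2] = p(s2 | s, a).
    n_states = u.shape[0]
    alpha = np.zeros(n_states, dtype=int)          # arbitrary initial policy
    for _ in range(max_iter):
        idx = np.arange(n_states)
        M = p[idx, alpha, :]                       # M_alpha[s, s2] = p(s2 | s, alpha(s))
        u_alpha = u[idx, alpha]                    # u_alpha[s] = u(s, alpha(s))
        # policy evaluation: solve (I - beta * M_alpha) V = u_alpha
        V = np.linalg.solve(np.eye(n_states) - beta * M, u_alpha)
        # policy improvement: one step of the Bellman operator
        new_alpha = (u + beta * (p @ V)).argmax(axis=1)
        if np.array_equal(new_alpha, alpha):
            return V, alpha
        alpha = new_alpha
    return V, alpha

The method of successive approximations would instead iterate V ← Γ(V) directly, converging geometrically at rate β by the contraction property.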
Although smooth approximation methods also have a long history (dating back to Bellman 1957, Bellman and Dreyfus 1962, and Bellman, Kalaba and Kotkin 1963), there has been a recent resurgence of interest in these methods as a potentially more efficient alternative to discretization methods (Christiano and Fisher 1994, Johnson et al. 1993, Judd and Solnick 1994, Marcet and Marshall 1994, and Smith 1991). Smooth approximation methods treat the value function V and/or the decision rule α as smooth, flexible functions of s and a finite-dimensional parameter vector θ; examples include linear interpolation and a variety of more sophisticated approximation methods.
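As a purely illustrative example of this parametric approach, the sketch below represents V by its values θ at a fixed (increasing) grid of points and evaluates it between grid points by linear interpolation; the functions utility and transition and the quadrature nodes and weights are hypothetical model primitives, and the more sophisticated methods referred to above would replace the interpolation step with a richer parameterization.

import numpy as np

def approx_bellman(theta, grid, actions, utility, transition, nodes, weights, beta):
    # One step of an approximate Bellman operator for a continuous-state MDP.
    # V is parameterized by theta, its values at the grid points, and is
    # evaluated off the grid by linear interpolation; the conditional
    # expectation of V is replaced by a quadrature rule over shocks.
    new_theta = np.empty_like(theta)
    for i, s in enumerate(grid):
        values = []
        for a in actions:
            s_next = transition(s, a, nodes)   # next states at the quadrature nodes
            EV = weights @ np.interp(s_next, grid, theta)
            values.append(utility(s, a) + beta * EV)
        new_theta[i] = max(values)
    return new_theta

Iterating this mapping on θ (or solving for its fixed point) yields an approximate value function whose accuracy depends on both the grid and the quadrature rule.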