Zoppoli R. et al. Neural approximations for optimal control and decision

New York: Springer. 2020. — 532 p.

The Basic Infinite-Dimensional or Functional Optimization Problem
IDO and FDO Problems
From the Ritz Method to the Extended Ritz Method (ERIM)
Approximation of Functions
From Function Approximation to Approximate Infinite-Dimensional Optimization
Relationships with Parametrized Control Approaches
Contents and Structure of the Book
Statement of the Problem
Finite-Dimensional Versus Infinite-Dimensional Optimization
General Conventions and Assumptions
Existence and Uniqueness of Minimizers
Other Definitions, Conventions, and Assumptions
A Deterministic Continuous-Time Optimal Control Problem
A Continuous-Time Network Flow Problem
A T-Stage Stochastic Optimal Control Problem
An Optimal Estimation Problem
A Static Team Optimal Control Problem

From Functional Optimization to Nonlinear Programming by the Extended Ritz Method
Fixed-Structure Parametrized (FSP) Functions
The Sequence of Nonlinear Programming Problems Obtained by FSP Functions of Increasing Complexity
The Case of Problem P
The Case of Problem PM
Solution of the Nonlinear Programming Problem Pn
Optimizing FSP Functions
Polynomially Complex Optimizing FSP Functions
The Growth of the Dimension d
Polynomial and Exponential Growths of the Model Complexity n with the Dimension d
Approximating Sets
Polynomially Complex Approximating Sequences of Sets
The Worst-Case Error of Approximation of Functions
Polynomial and Exponential Growth of the Model Complexity n with the Dimension d
Connections Between Approximating Sequences and Optimizing Sequences
From 𝒮d-Approximating Sequences of Sets to Pd-Optimizing Sequences of FSP Functions
From Pd-Optimizing Sequences of FSP Functions to Polynomially Complex Pd-Optimizing Sequences of FSP Functions
Final Remarks
Notes on the Practical Application of the ERIM

Some Families of FSP Functions and Their Properties
Linear Combinations of Fixed-Basis Functions
The Structure of OHL Networks
A More Abstract View
Tensor-Product, Ridge, and Radial Constructions
Multi-Hidden-Layer Networks
Terminology
Kernel Smoothing Models
𝒞- and ℒp-Density Properties
The Case of Ridge OHL Networks
The Case of Radial OHL Networks
From Universality of Approximation to Moderate Model Complexity
The Maurey–Jones–Barron Theorem
Case Study: Sigmoidal OHL Networks
Lower Bounds on Approximation Rates
On the Approximation Accuracy of MHL Networks
Extensions of the Maurey–Jones–Barron Theorem
Geometric Rates of Approximation
An Upper Bound in Terms of the 𝒢-Variation Norm
Estimating the 𝒢-Variation
Some OHL Networks with the Same Approximation Rate
The ERIM Versus the Classical Ritz Method
The IDO Problem Used for the Comparison
Approximate Solution by the Ritz Method
The Curse of Dimensionality in the Ritz Method and the Possibility of Avoiding it in the ERIM
Rates of Optimization by the ERIM

Design of Mathematical Models by Learning From Data and FSP Functions
Learning from Data
The Expected Risk
The Empirical Risk
The Generalization Error
The Approximation and Estimation Errors are Components of the Generalization Error
Bounds on the Generalization Error in Statistical Learning Theory
Probabilistic Convergence of the Empirical Risk
The VC Dimension of a Set of Functions
Bounds on the Estimation Error
Bounds on the Generalization Error for Gaussian OHL Networks
Bounds on the Generalization Error for Sigmoidal OHL Networks
Estimation of Functions Without Noise: The Expected Risk, the Empirical Risk, and the Generalization Error
Deterministic Convergence of the Empirical Risk
Discrepancy of a Sequence
Variation of a Function and Conditions for Bounded Variations
Examples of Functions with Bounded Hardy and Krause Variation
The Koksma–Hlawka Inequality
Sample Sets Coming from (t,d)-Sequences
Bounds on Rate of Convergence of the Estimation Error
Estimation of Functions in the Presence of Additive Noise

Numerical Methods for Integration and Search for Minima
Numerical Computation of Integrals
Integration by Regular Grids
Integration by MC Methods
Integration by Quasi-MC Methods
Limitations of the Quasi-MC Integration with Respect to the MC One
Numerical Computation of Expected Values
Computation of Expected Values by Quasi-MC Methods
Improving MC and Quasi-MC Estimates of Integrals and Expected Values
Minimization Techniques for the Solution of Problem Pn
Direct Search Methods
Gradient-Based Methods
Stochastic Approximation: Robbins–Monro Algorithm for the Solution of Nonlinear Root-Finding Problems
Robbins–Monro Algorithm and the Stochastic Gradient Method are Particular Cases of a General Stochastic Algorithm
Convergence of the General Stochastic Algorithm
Convergence of the Root-Finding Robbins–Monro Stochastic Approximation Algorithm
Choice of the Stepsize Sequence
Convergence of the Stochastic Gradient Algorithm
Global Optimization via Stochastic Approximation
Monte Carlo Random Search
Quasi-Monte Carlo Search

Deterministic Optimal Control over a Finite Horizon
Statement of Problem C
A Constructive Example for Understanding the Basic Concepts of Dynamic Programming
Solution of Problem C by Dynamic Programming: The Exact Approach
Sampling of the State Space
The Recursive Equation of ADP
Solving the Minimization Problems in the ADP Equation
The Kernel Smoothing Models in ADP
Computation of the Optimal Controls in the Forward Phase of ADP
Generalization Error Owing to the Use of Sampling Procedures and FSP Functions
Approximation Errors and Approximating FSP Functions
Estimation Errors in the Context of Deterministic Learning Theory
Solution of Problem C by Gradient-Based Algorithms
The Final Time of Problem C is not Fixed

Stochastic Optimal Control with Perfect State Information over a Finite Horizon
Statement of Problems C and C'
Solution of Problems C and C' by Dynamic Programming: The Exact Approach
Sampling of the State Space and Approximations by FSP Functions
Computation of Expected Values in ADP Recursive Equations
Computation of the Optimal Controls in the Forward Phase of Dynamic Programming
Error Propagation of the Optimal Cost-to-Go Functions Starting from Approximations of the Optimal Control Functions
Discretization Issues
Approximation Issues
Some Structural Properties of the Optimal Cost-to-Go and Control Functions
Reinforcement Learning
Q-Learning
Approximation of the IDO Problem C by the Nonlinear Programming Problems Cn
Approximation of Problems C' by the Nonlinear Programming Problems Cn'
Solution of Problem Cn by Nonlinear Programming
Computation of the Gradient ∇u,wn J(u,wn,x,ξ)
Computation of the Gradient ∇u,wn J(u,wn,x,ξ) When MHL Networks Are Used
The Model of the Reservoir System
Computational Aspects
Concluding Comments
Stochastic Optimal Control Problems Equivalent to Problem C'
Measurable Time-Invariant Random Variables
The Random Variables ξ0, ξ1, …, ξT-1 are Measurable
Tracking a Stochastic Reference

Stochastic Optimal Control with Imperfect State Information over a Finite Horizon
Statement of Problem C
Solution of Problem C by Dynamic Programming
Reduction of Problem C to Problem C
Derivation of the Conditional Probability Densities Needed by Dynamic Programming
Approximate Solution of Problem C by the ERIM
Approximation of Problem C by the Nonlinear Programming Problems Cn
Solution of Problem Cn
Specialization of Stochastic Gradient Algorithms When MHL Networks are Used
Limited-Memory Information Vector
Approximate Solution of Problem C-LM by the ERIM
Solution of Problem Cn-LM by Stochastic Approximation and Specialization to MHL Networks
Approximate Solution of Problem C-LM by the Certainty Equivalence Principle and the ERIM
Certainty Equivalence Principle
Applying the CE Principle to the LM Controller
Comparison Between an LQG Optimal Control Law and Limited-Memory Optimal Control Laws
Freeway Traffic Optimal Control

Team Optimal Control Problems
Basic Concepts of Team Theory
Statement of the Team Optimal Control Problem
Partially Nested Information Structures
Person-by-Person Optimality
LQG Static Teams
LQG Dynamic Teams with Partially Nested Information Structure
Sequential Partitions
Statement of Problem DC
Approximation of Problem DC by the Nonlinear Programming Problems DCn
The Witsenhausen Counterexample
Statement of the Problem
Application of the ERIM for the Approximate Solution of Problem W
Numerical Results Obtained by the ERIM
Improved Numerical Results and Variants of the Witsenhausen Counterexample
Theoretical Results on the Application of the ERIM
Dynamic Routing in Traffic Networks
Modeling the Communication Network
Statement of the Dynamic Routing Problem
Approximate Solution of Problem ROUT by the ERIM
Computation of the Neural Routing Functions via Stochastic Approximation
Numerical Results

Optimal Control Problems over an Infinite Horizon
Deterministic Optimal Control over an Infinite Horizon
Stochastic Optimal Control with Perfect State Information over an Infinite Horizon
Stochastic Optimal Control with Imperfect State Information over an Infinite Horizon
From Finite to Infinite Horizon: The Deterministic Receding-Horizon Approximation
Stabilizing Properties of Deterministic RH Control Laws
Stabilizing Approximate RH Control Laws
Offline Design of the Stabilizing Approximate RH Control Law Obtained by the ERIM
Numerical Results