
Poznyak A.S., Najim K., Gomez-Ramirez E. Self-Learning Control of Finite Markov Chains

Marcel Dekker, 2000. — 315 p.
The theory of controlled Markov chains originated several years ago in the work of Bellman and other investigators. This theory has seen tremendous growth in the last decade. In fact, many engineering and theoretical problems can be modelled or rephrased as controlled Markov chains. These problems cover a very wide range of applications in the framework of stochastic systems. The central problem in the control of Markov chains is to establish a control strategy that achieves some requirements on system performance (the control objective). The system performance can be captured in two principal ways:
a single cost function which represents any quantity measuring the performance of the system;
a cost function in association with one or several constraints.
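As a concrete (purely hypothetical) instance of the first case, the sketch below defines a small controlled chain with known transition probabilities and a single cost function, and estimates the long-run average cost of stationary policies by simulation; the chain, the costs, and all numbers are invented for illustration and are not taken from the book.

```python
import random

# A toy controlled Markov chain: 2 states, 2 actions.
# P[a][s] is the transition distribution over next states when
# action a is taken in state s (illustrative numbers).
P = [
    [[0.9, 0.1], [0.4, 0.6]],   # action 0
    [[0.2, 0.8], [0.7, 0.3]],   # action 1
]
# cost[s][a]: single-stage cost of taking action a in state s.
cost = [[1.0, 2.0], [0.5, 3.0]]

def average_cost(policy, steps=200_000, seed=0):
    """Estimate the long-run average cost of a stationary policy
    (a list mapping state -> action) by simulation."""
    rng = random.Random(seed)
    s, total = 0, 0.0
    for _ in range(steps):
        a = policy[s]
        total += cost[s][a]
        s = rng.choices((0, 1), weights=P[a][s])[0]
    return total / steps

# Compare the four deterministic stationary policies on the
# single-cost-function criterion.
for policy in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(policy, round(average_cost(policy), 3))
```

In the second (constrained) formulation, one would minimise this same average cost subject to a bound on the average value of one or more additional cost functions accumulated along the trajectory.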
The use of controlled Markov chains presupposes that the transition probabilities, which completely describe the system dynamics, are known in advance. In many applications the information concerning the system under consideration is incomplete or unavailable. As a consequence, the transition probabilities are usually unknown or depend on some unknown parameters. In such cases there is a real need for the development of control techniques that involve adaptability. By collecting and processing the available information, such adaptive techniques should be capable of adjusting their parameters as time evolves to achieve the desired objective.
Broadly speaking, adaptive control techniques can be classified into two categories: indirect and direct approaches. The indirect approach is based on the certainty equivalence principle: the unknown parameters are estimated on-line and used in lieu of the true but unknown parameters to update the control accordingly. In the direct approach, the control actions are estimated directly from the available information. In the indirect approach, the control strategy interacts with the estimation of the unknown parameters, since the information used for identification purposes is provided by the closed-loop system. As a consequence, identifiability (consistency of the parameter estimates) cannot be guaranteed, and the certainty equivalence approach may fail to achieve optimal behaviour, even asymptotically.
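To make the indirect route concrete, here is a minimal, purely illustrative Python sketch of certainty-equivalence control: transition counts are accumulated on-line, and the resulting empirical estimates are plugged in place of the unknown probabilities when choosing actions. The chain, the costs, and the crude greedy one-step criterion are all assumptions made for this sketch; they are not algorithms from the book.

```python
import random

# Toy 2-state, 2-action chain whose true transition probabilities
# are hidden from the controller (illustrative numbers).
TRUE_P = [
    [[0.9, 0.1], [0.4, 0.6]],    # action 0 (unknown to the controller)
    [[0.2, 0.8], [0.7, 0.3]],    # action 1 (unknown to the controller)
]
COST = [[1.0, 2.0], [0.5, 3.0]]  # COST[s][a]: known single-stage cost

def indirect_adaptive_control(steps=50_000, explore=0.1, seed=1):
    rng = random.Random(seed)
    # Transition counts n[a][s][s'], initialised to 1 (uniform prior).
    n = [[[1, 1] for _ in range(2)] for _ in range(2)]

    def estimated_cost(s, a):
        # Certainty equivalence: plug the current empirical estimates
        # in lieu of the true probabilities in a one-step-ahead cost.
        tot = n[a][s][0] + n[a][s][1]
        return COST[s][a] + sum(
            n[a][s][k] / tot * min(COST[k]) for k in range(2))

    s, total = 0, 0.0
    for _ in range(steps):
        if rng.random() < explore:
            a = rng.randrange(2)   # occasional exploration, since pure
        else:                      # closed-loop data need not identify P
            a = min(range(2), key=lambda act: estimated_cost(s, act))
        total += COST[s][a]
        s_next = rng.choices((0, 1), weights=TRUE_P[a][s])[0]
        n[a][s][s_next] += 1       # on-line identification step
        s = s_next
    return total / steps

print(round(indirect_adaptive_control(), 3))
```

The explicit exploration step reflects the identifiability caveat above: actions chosen purely from closed-loop estimates may never visit the state-action pairs needed to make those estimates consistent.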
This book presents a number of new and potentially useful direct adaptive control algorithms, together with theoretical as well as practical results, for both unconstrained and constrained controlled Markov chains. It consists of eight chapters and two appendices; following an introductory section, it is divided into two parts. The detailed table of contents gives a general idea of the scope of the book.
Controlled Markov Chains
Unconstrained Markov Chains:
  Lagrange Multipliers Approach
  Penalty Function Approach
  Projection Gradient Method
Constrained Markov Chains:
  Lagrange Multipliers Approach
  Penalty Function Approach
Nonregular Markov Chains
Practical Aspects