
Monday, August 1, 2011

Draft: Notes on dynamic programming equations which solve cost models for dynamic systems

 

Deterministic Cost Models

Description

Cost Model

Dynamic Programming Equations

Restrictions

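Finite Horizon Total Cost

$$J^{\pi}(x_0) = \sum_{k=0}^{K} \alpha^k c_k(x_k, \pi(x_k))$$

$$V^{\pi}_k(x) = c_k(x, \pi(x)) + \alpha V^{\pi}_{k+1}(f(x, \pi(x))), \quad k \in \{0, \ldots, K-1\}$$

$$V^{\pi}_K(x) = c_K(x, \pi(x))$$

$$0 \le \alpha < 1$$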

Infinite Horizon Total Cost

$$J^{\pi}(x_0) = \sum_{k=0}^{\infty} \alpha^k c(x_k, \pi(x_k))$$

$$V^{\pi}(x) = c(x, \pi(x)) + \alpha V^{\pi}(f(x, \pi(x)))$$

$$0 \le \alpha < 1$$

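Finite Horizon Shortest Path

$$J^{\pi}(x_0) = \sum_{k=0}^{K} \alpha^k c_k(x_k, \pi(x_k))$$

$$V^{\pi}_k(x) = c_k(x, \pi(x)) + \alpha V^{\pi}_{k+1}(f(x, \pi(x))), \quad k \in \{0, \ldots, K-1\}$$

$$V^{\pi}_K(x) = c_K(x, \pi(x))$$

$$0 \le \alpha \le 1$$

$$\{x \in \chi \mid c(x, \pi(x)) = 0\} \neq \emptyset$$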

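Infinite Horizon Shortest Path

$$J^{\pi}(x_0) = \sum_{k=0}^{\infty} \alpha^k c(x_k, \pi(x_k))$$

$$V^{\pi}(x) = c(x, \pi(x)) + \alpha V^{\pi}(f(x, \pi(x)))$$

$$0 \le \alpha \le 1$$

$$\{x \in \chi \mid c(x, \pi(x)) = 0\} \neq \emptyset$$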

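Average Cost

$$J^{\pi}(x_0) = \lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K} \alpha^k c(x_k, \pi(x_k))$$

$$V^{\pi}(x) + \lambda = c(x, \pi(x)) + V^{\pi}(f(x, \pi(x)))$$

$$0 \le \alpha < 1$$

$V^{\pi}(x_{\mathrm{ref}}) = 0$ for some $x_{\mathrm{ref}} \in \chi$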
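As a rough illustration of the deterministic total cost and shortest path equations above, the following Python sketch evaluates a fixed policy by backward recursion (finite horizon) and by fixed-point iteration (infinite horizon). The four-state system, the functions `policy`, `f`, and `c`, and all numbers are hypothetical and chosen only to keep the arithmetic easy to check; the stage cost is assumed time-invariant ($c_k = c$).

```python
# Hypothetical toy system (not from the notes): states 0..3 on a line.
# The policy always steps toward state 0, which is absorbing and cost-free.
STATES = range(4)

def policy(x):
    return -1 if x > 0 else 0        # pi(x): move left until state 0 is reached

def f(x, u):
    return x + u                     # deterministic dynamics f(x, u)

def c(x, u):
    return float(x)                  # stage cost c(x, u); zero at state 0

def finite_horizon_values(K, alpha):
    """Backward recursion: V_K(x) = c(x, pi(x)), then
    V_k(x) = c(x, pi(x)) + alpha * V_{k+1}(f(x, pi(x))) for k = K-1, ..., 0."""
    V = {x: c(x, policy(x)) for x in STATES}           # terminal stage k = K
    for _ in range(K):                                  # stages K-1 down to 0
        V = {x: c(x, policy(x)) + alpha * V[f(x, policy(x))] for x in STATES}
    return V                                            # V_0

def infinite_horizon_values(alpha, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration on V(x) = c(x, pi(x)) + alpha * V(f(x, pi(x)))."""
    V = {x: 0.0 for x in STATES}
    for _ in range(max_iter):
        V_new = {x: c(x, policy(x)) + alpha * V[f(x, policy(x))] for x in STATES}
        if max(abs(V_new[x] - V[x]) for x in STATES) < tol:
            return V_new
        V = V_new
    return V
```

With `alpha = 0.5`, both `finite_horizon_values(3, 0.5)[3]` and `infinite_horizon_values(0.5)[3]` evaluate to 4.25, because the policy reaches the zero-cost state 0 within three steps. Since such a zero-cost absorbing state exists, the iteration also terminates for `alpha = 1.0` (the shortest path case), giving 3 + 2 + 1 = 6 from state 3.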

 

Stochastic Cost Models

Description

Cost Model

Dynamic Programming Equations

Restrictions

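Finite Horizon Total Cost

$$J^{\pi}(x_0) = E_W\left[ \sum_{k=0}^{K} \alpha^k c_k(x_k, \pi(x_k), w) \right]$$

$$V^{\pi}_k(x) = E_W\left[ c_k(x, \pi(x), w) + \alpha V^{\pi}_{k+1}(f(x, \pi(x), w)) \right]$$

$$V^{\pi}_K(x) = E_W\left[ c_K(x, \pi(x)) \right]$$

$$0 \le \alpha < 1$$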

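Infinite Horizon Total Cost

$$J^{\pi}(x_0) = E_W\left[ \sum_{k=0}^{\infty} \alpha^k c(x_k, \pi(x_k), w) \right]$$

$$V^{\pi}(x) = E_W\left[ c(x, \pi(x), w) + \alpha V^{\pi}(f(x, \pi(x), w)) \right]$$

$$0 \le \alpha < 1$$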

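Finite Horizon Shortest Path

$$J^{\pi}(x_0) = E_W\left[ \sum_{k=0}^{K} \alpha^k c_k(x_k, \pi(x_k), w) \right]$$

$$V^{\pi}_k(x) = E_W\left[ c_k(x, \pi(x), w) + \alpha V^{\pi}_{k+1}(f(x, \pi(x), w)) \right]$$

$$V^{\pi}_K(x) = E_W\left[ c_K(x, \pi(x)) \right]$$

$$0 \le \alpha \le 1$$

$$\{x \in \chi \mid c(x, \pi(x)) = 0\} \neq \emptyset$$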

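Infinite Horizon Shortest Path

$$J^{\pi}(x_0) = E_W\left[ \sum_{k=0}^{\infty} \alpha^k c(x_k, \pi(x_k), w) \right]$$

$$V^{\pi}(x) = E_W\left[ c(x, \pi(x), w) + \alpha V^{\pi}(f(x, \pi(x), w)) \right]$$

$$0 \le \alpha \le 1$$

$$\{x \in \chi \mid c(x, \pi(x)) = 0\} \neq \emptyset$$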

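Average Cost

$$J^{\pi}(x_0) = E_W\left[ \lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K} \alpha^k c(x_k, \pi(x_k), w) \right]$$

$$V^{\pi}(x) + \lambda = E_W\left[ c(x, \pi(x), w) + V^{\pi}(f(x, \pi(x), w)) \right]$$

$$0 \le \alpha < 1$$

$V^{\pi}(x_{\mathrm{ref}}) = 0$ for some $x_{\mathrm{ref}} \in \chi$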
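The stochastic equations above can be sketched in the same way. The following Python example evaluates a fixed policy on a hypothetical two-state chain, first for the discounted infinite horizon equation and then for the average cost equation with the normalization $V^{\pi}(x_{\mathrm{ref}}) = 0$. The table `OUTCOMES`, its probabilities and costs, and the use of relative value iteration for the average cost case are assumptions made for illustration, not something taken from the notes.

```python
# Hypothetical two-state chain under a fixed policy (numbers are illustrative only).
# OUTCOMES[x] lists one tuple per noise outcome w:
#   (probability of w, stage cost c(x, pi(x), w), next state f(x, pi(x), w))
OUTCOMES = {
    0: [(0.5, 0.0, 0), (0.5, 2.0, 1)],
    1: [(0.5, 2.0, 0), (0.5, 4.0, 1)],
}

def expected_backup(V, x, alpha):
    """E_W[ c(x, pi(x), w) + alpha * V(f(x, pi(x), w)) ]."""
    return sum(p * (cost + alpha * V[nxt]) for p, cost, nxt in OUTCOMES[x])

def discounted_values(alpha, tol=1e-12, max_iter=100_000):
    """Fixed-point iteration on V(x) = E_W[c + alpha * V(f)] for 0 <= alpha < 1."""
    V = {x: 0.0 for x in OUTCOMES}
    for _ in range(max_iter):
        V_new = {x: expected_backup(V, x, alpha) for x in OUTCOMES}
        if max(abs(V_new[x] - V[x]) for x in OUTCOMES) < tol:
            return V_new
        V = V_new
    return V

def average_cost(x_ref=0, tol=1e-12, max_iter=100_000):
    """Relative value iteration for V(x) + lam = E_W[c + V(f)] with V(x_ref) = 0."""
    V = {x: 0.0 for x in OUTCOMES}
    lam = 0.0
    for _ in range(max_iter):
        T = {x: expected_backup(V, x, 1.0) for x in OUTCOMES}
        lam = T[x_ref]                       # current estimate of the average cost
        V_new = {x: T[x] - lam for x in OUTCOMES}   # keeps V(x_ref) pinned at 0
        if max(abs(V_new[x] - V[x]) for x in OUTCOMES) < tol:
            return lam, V_new
        V = V_new
    return lam, V
```

For this chain the expected stage costs are 1 and 3 and both states are visited equally often, so `average_cost()` returns an average cost of 2 with relative values `{0: 0.0, 1: 2.0}`, while `discounted_values(0.5)` converges to approximately `{0: 3.0, 1: 5.0}`.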

 

Risk Aware/Averse Stochastic Cost Models

Description

Cost Model

Dynamic Programming Equations

Restrictions

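Certainty Equivalence with Exponential Utility

$$J^{\pi}(x_0) = \limsup_{K \to \infty} \frac{1}{K} \frac{1}{\gamma} \ln\left( E_W\left[ \exp\left( \gamma \sum_{k=0}^{K-1} c(x_k, \pi(x_k), w) \right) \right] \right)$$

Mean-Variance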
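To give a feel for the certainty-equivalence cost model, the sketch below computes the per-stage quantity $\frac{1}{K}\frac{1}{\gamma}\ln E_W[\exp(\gamma \sum_k c_k)]$ by brute-force enumeration. It assumes the standard form in which $\gamma$ multiplies the accumulated cost inside the exponential, and it assumes hypothetical independent, identically distributed stage costs (`COST_DIST`), which is a simplification of the policy-dependent costs in the notes.

```python
import itertools
import math

# Hypothetical i.i.d. stage costs as (probability, cost) pairs; mean cost is 1.0.
COST_DIST = [(0.5, 0.0), (0.5, 2.0)]

def certainty_equivalent_per_stage(K, gamma):
    """(1/K) * (1/gamma) * ln E[ exp(gamma * sum of K stage costs) ],
    with the expectation computed by enumerating every cost sequence."""
    expectation = 0.0
    for path in itertools.product(COST_DIST, repeat=K):
        prob = math.prod(p for p, _ in path)         # probability of this sequence
        total = sum(cost for _, cost in path)        # accumulated cost along it
        expectation += prob * math.exp(gamma * total)
    return math.log(expectation) / (gamma * K)
```

Under these assumptions a positive `gamma` (risk averse) yields a value above the mean stage cost of 1.0, a negative `gamma` (risk seeking) yields a value below it, and the value approaches the mean as `gamma` tends to zero.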
       
       
       

 

Cost Models That Don’t Work or Have Issues

Description

Cost Model

Issues

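Expected Exponential Disutility

$$J^{\pi}(x_0) = \limsup_{K \to \infty} \frac{1}{K} E_W\left[ \mathrm{sgn}(\gamma) \exp\left( \gamma \sum_{k=0}^{K-1} c(x_k, \pi(x_k), w) \right) \right]$$

Does not discriminate among policies.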
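Different Version of Expected Exponential Disutility

$$J^{\pi}(x_0) = \limsup_{K \to \infty} \frac{1}{\gamma} \log\left( E_W\left[ \exp\left( \frac{\gamma}{K} \sum_{k=0}^{K-1} c(x_k, \pi(x_k), w) \right) \right] \right)$$

Generally reduces to the average cost.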
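The two issues listed above can be illustrated numerically under simplifying assumptions. The sketch reads the first model as $\frac{1}{K}\,\mathrm{sgn}(\gamma)\exp(\gamma \sum_k c_k)$ with a constant stage cost, and the second as $\frac{1}{\gamma}\log E_W[\exp(\frac{\gamma}{K}\sum_k c_k)]$ with hypothetical i.i.d. stage costs (`COST_DIST`). Both readings, the cost distribution, and the function names are assumptions for illustration only.

```python
import math

# Hypothetical i.i.d. stage costs as (probability, cost) pairs; mean cost is 1.0.
COST_DIST = [(0.5, 0.0), (0.5, 2.0)]

def disutility(c_per_step, K, gamma):
    """(1/K) * sgn(gamma) * exp(gamma * K * c) for a constant stage cost c."""
    return math.copysign(1.0, gamma) * math.exp(gamma * K * c_per_step) / K

def scaled_certainty_equivalent(K, gamma):
    """(1/gamma) * ln E[ exp((gamma/K) * sum of K i.i.d. stage costs) ].
    Independence lets the expectation factor into a K-th power of a one-step term."""
    one_step = sum(p * math.exp(gamma * cost / K) for p, cost in COST_DIST)
    return K * math.log(one_step) / gamma
```

With `gamma = -1.0` and `K = 200`, `disutility` is numerically indistinguishable from zero for a policy costing 1 per step and for one costing 2 per step, so the two policies cannot be ranked; for positive `gamma` the same expression grows without bound for any positive stage cost. In contrast, `scaled_certainty_equivalent(10000, gamma)` is close to the mean stage cost 1.0 for both `gamma = 1.0` and `gamma = -1.0`, which matches the observation that this version reduces to the average cost.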
     
     
     

 


This work is licensed under a Creative Commons Attribution By license.
