## Deterministic Cost Models

 Description Cost Model Dynamic Programming Equations Restrictions Finite Horizon Total Cost $$J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right)\right)$$ $$V_{k}^{\pi}\left(x\right)=c_{k}\left(x,\pi\left(x\right)\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)$$,$$\forall k\in\left\{ 0,\cdots,K-1\right\}$$ $$V_{K}^{\pi}\left(x\right)=c_{K}\left(x,\pi\left(x\right)\right)$$ $$0\leq\alpha<1$$ Infinite Horizon Total Cost $$J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right)\right)$$ $$V^{\pi}\left(x\right)=c\left(x,\pi\left(x\right)\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)$$ $$0\leq\alpha<1$$ Finite Horizon Shortest Path $$J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right)\right)$$ $$V_{k}^{\pi}\left(x\right)=c_{k}\left(x,\pi\left(x\right)\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)$$,$$\forall k\in\left\{ 0,\cdots,K-1\right\}$$ $$V_{K}^{\pi}\left(x\right)=c_{K}\left(x,\pi\left(x\right)\right)$$ $$0\leq\alpha\leq1$$ $$\left\{ x\in\chi|c\left(x,\pi\left(x\right)\right)=0\right\} \neq\left\{ \oslash\right\}$$ Infinite Horizon Shortest Path $$J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right)\right)$$ $$V^{\pi}\left(x\right)=c\left(x,\pi\left(x\right)\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)$$ $$0\leq\alpha\leq1$$ $$\left\{ x\in\chi|c\left(x,\pi\left(x\right)\right)=0\right\} \neq\left\{ \oslash\right\}$$ Average Cost $$J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\lim}\frac{1}{K}\sum_{k=0}^{K}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right)\right)$$ $$V^{\pi}\left(x\right)+\lambda=c\left(x,\pi\left(x\right)\right)+V^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)$$ $$0\leq\alpha<1$$ $$V^{\pi}\left(x_{ref}\right)=0$$ for some $$x_{ref}\in\chi$$

## Stochastic Cost Models

 Description Cost Model Dynamic Programming Equations Restrictions Finite Horizon Total Cost $$J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right),w\right)\right]$$ $$V_{k}^{\pi}\left(x\right)=E^{W}\left[c_{k}\left(x,\pi\left(x\right),w\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]$$ $$V_{K}^{\pi}\left(x\right)=E^{W}\left[c_{K}\left(x,\pi\left(x\right)\right)\right]$$ $$0\leq\alpha<1$$ Infinite Horizon Total Cost $$J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right),w\right)\right]$$ $$V^{\pi}\left(x\right)=E^{W}\left[c\left(x,\pi\left(x\right),w\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]$$ $$0\leq\alpha<1$$ Finite Horizon Shortest Path $$J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right),w\right)\right]$$ $$V_{k}^{\pi}\left(x\right)=E^{W}\left[c_{k}\left(x,\pi\left(x\right),w\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]$$ $$V_{K}^{\pi}\left(x\right)=E^{W}\left[c_{K}\left(x,\pi\left(x\right)\right)\right]$$ $$0\leq\alpha\leq1$$ $$\left\{ x\in\chi|c\left(x,\pi\left(x\right)\right)=0\right\} \neq\left\{ \oslash\right\}$$ Infinite Horizon Shortest Path $$J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right),w\right)\right]$$ $$V^{\pi}\left(x\right)=E^{W}\left[c\left(x,\pi\left(x\right),w\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]$$ $$0\leq\alpha\leq1$$ $$\left\{ x\in\chi|c\left(x,\pi\left(x\right)\right)=0\right\} \neq\left\{ \oslash\right\}$$ Average Cost $$J^{\pi}\left(x_{0}\right)=E^{W}\left[\underset{K\rightarrow\infty}{\lim}\frac{1}{K}\sum_{k=0}^{K}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right),w\right)\right]$$ $$V^{\pi}\left(x\right)+\lambda=E\left[c\left(x,\pi\left(x\right),w\right)+V^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]$$ $$0\leq\alpha<1$$ $$V^{\pi}\left(x_{ref}\right)=0$$ for some $$x_{ref}\in\chi$$

## Risk Aware/Averse Stochastic Cost Models

 Description Cost Model Dynamic Programming Equations Restrictions Certainty Equivalence with exponential utility $$J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\limsup}\frac{1}{K}\cdot\frac{1}{\gamma}\cdot\ln\left(E^{W}\left[\exp\left(\sum_{k=0}^{K-1}c\left(x,\pi\left(x\right),w\right)\right)\right]\right)$$ Mean-Variance

## Cost Models That don’t work or have issues

 Description Cost Model Issues Expected exponential disutility $$J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\limsup}\frac{1}{K}\cdot E^{W}\left[\textrm{sgn}\left(\gamma\right)\cdot\exp\left(\gamma\cdot\sum_{k=0}^{K-1}c\left(x,\pi\left(x\right),w\right)\right)\right]$$ Does not discriminate among policies Different version of expected exponential disutility $$J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\limsup}\frac{1}{\gamma}\cdot\log\left(E^{W}\left[\exp\left(\gamma\cdot\frac{\gamma}{K}\sum_{k=0}^{K-1}c\left(x,\pi\left(x\right),w\right)\right)\right]\right)$$ Generally reduces to cost average