Problem Statement
How do you validate a model of a system against a physical system when a controller is necessary to make the system operate and the the operational policies of the controllers were developed independently.
Discussion
Consider a development process with a well defined operation cost model which drives both the model and physical system to optimization that operations cost. If the model uses full state feedback for control and the physical system uses either full state feedback or implements output feedback with state estimators and relies on certainty equivalence, there is no guarantee that under identical test conditions the trajectories of the cost and the trajectories of the states will be identical even if both system are optimal with respect to the operational costs. This is because optimal operational costs are unique, but there are no guarantees that optimal tratectories are unique. Furthermore, if the controls in both cases are not globally optimal, by only near optimal, then likelihood of non-unique trajectories is even more likely. However, because the operational costs can be unique, the validation exercise can be decomposed into two validation steps.
First, the equations which model the physics can be validated against test data on the physical system by measuring the states in the real system, then substituting the integrator in the model with the state measurements. Ideally, the physical system could execute both policies. The error in costs, and derivative calculations can be compared to quantify the error between the model of the physics and the real physics.
Second, once the errors in the model of the physics are quantified, the error in the costs under the different controllers can be quantified. Ideally, from this step, the optimality of the controls wrt to a globally optimal controller (or minimizing controller) can be established. Once interesting possibility is to use policy improvement to see if the independently developed policies can merged for better performance. Alternatively, if there are unexplained differences, then the constraints respected by the different policies need to be reconciled. Things like robustness may also contribute to differences. Robustness will in general be a driven by different view of noise and risk sensitivity. Following on that, there is the possibility that the equation structure in the different policies lead to different performance limitations. Again, policy improvement may provide a way to identify these structure imposed limitations.