MDP

https://d2l.ai/chapter_reinforcement-learning/mdp.html

The whole RL part is too brief here, and I had to look for other, more detailed courses. The other chapters are prepared in much greater detail.

Thank you for the feedback.
Which topics do you find missing?
I’d like to note that in this version, our goal was to cover only the fundamental concepts of RL.

Rasool

Hi. Shouldn’t there be an expectation over a_0 for the first term in Eq. 17.2.2?
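For reference, the standard definition of the value function takes the expectation over every action in the trajectory, including the first action a_0 drawn from the policy at s_0 (a generic sketch of the usual textbook form, not necessarily the book's exact notation in 17.2.2):

```latex
V^{\pi}(s_0)
  = \mathbb{E}_{\,a_t \sim \pi(\cdot \mid s_t),\; s_{t+1} \sim P(\cdot \mid s_t, a_t)}
    \left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
```

Under this definition, the first reward term r(s_0, a_0) sits inside the expectation, so it too is averaged over a_0 ~ π(· | s_0) whenever the policy is stochastic.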

In Section 17.1.3, an example of a non-Markovian system is given: a robot whose motion depends on its velocity. The text concludes that the system is not Markovian because the next state depends on both the current and the previous position. That formulation is technically non-Markovian, but it is not representative of real life, because Newtonian mechanics is completely Markovian. A more realistic formulation would define the state as the position together with the velocity, i.e. the 6-vector (x, y, z, dx/dt, dy/dt, dz/dt). With the state defined that way, the problem is Markovian. In my opinion, a better example of a non-Markovian process is needed; autoregressive moving-average (ARMA) models fit the bill.
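The state-augmentation point above can be made concrete with a minimal sketch. The 1-D point mass, Euler time step, and the specific numbers below are all illustrative assumptions, not taken from the book: two trajectories pass through the same position with different velocities, so position alone does not determine the next position, while the pair (position, velocity) does.

```python
DT = 1.0  # Euler time step (hypothetical units)

def step(pos, vel, accel=0.0):
    """One Euler step of Newtonian dynamics for a 1-D point mass."""
    return pos + vel * DT, vel + accel * DT

# Two trajectories that reach the same position with different velocities.
p1, v1 = step(0.0, 1.0)    # arrives at pos 1.0 moving right
p2, v2 = step(2.0, -1.0)   # arrives at pos 1.0 moving left
assert p1 == p2 == 1.0     # identical position-only "state"...

n1, _ = step(p1, v1)       # ...but the next positions differ,
n2, _ = step(p2, v2)       # so position-only dynamics are non-Markovian.
print(n1, n2)              # 2.0 0.0

# Augmented with velocity, the transition is a deterministic function of
# the current state alone: (pos, vel) -> (pos', vel'). That is Markovian.
assert (p1, v1) != (p2, v2)
```

The same trick (folding history into the state) is why Newtonian systems are always Markovian once position and velocity are both included; a genuinely non-Markovian example needs hidden variables that cannot be recovered from any finite augmentation of the observation, which is what ARMA-type noise provides.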