https://d2l.ai/chapter_recurrent-neural-networks/sequence.html

In section 8.1.1.1

This leads to models that estimate x_t with \hat{x}_t = P(x_t \mid h_{t})

But going by Figure 8.1.2, isn't `\hat{x}_t` calculated through a combination of `h_t` and `x_{t-1}`?

\hat{x}_t is calculated based on h_t, where h_t = g(h_{t-1}, x_{t-1})
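
To make the dependency concrete, here is a minimal sketch of that recursion in plain Python. The function names (`g`, `predict`, `rollout`), the tanh update, and the weights are all made up for illustration and are not from the book; the only thing taken from the section is the structure: the prediction reads h_t alone, and x_{t-1} reaches it only through the state update.

```python
import math

def g(h_prev, x_prev, w_h=0.5, w_x=0.5):
    """Hypothetical state update: h_t = g(h_{t-1}, x_{t-1})."""
    return math.tanh(w_h * h_prev + w_x * x_prev)

def predict(h_t, w_out=2.0):
    """Hypothetical readout: the estimate of x_t depends on h_t only."""
    return w_out * h_t

def rollout(xs, h0=0.0, x0=0.0):
    """Return one prediction per step of the sequence xs.

    Assumption: x0 stands in for the (missing) observation before the
    first step, and h0 is the initial latent state.
    """
    h, x_prev = h0, x0
    preds = []
    for x in xs:
        h = g(h, x_prev)          # x_{t-1} is folded into h_t here
        preds.append(predict(h))  # the prediction sees only h_t
        x_prev = x                # the current observation feeds the next step
    return preds

preds = rollout([1.0, 2.0, 3.0])
```

So both readings agree: the figure draws the arrow from x_{t-1} into h_t, and the formula then conditions on h_t, which already summarizes x_{t-1} and everything before it.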