Bidirectional Recurrent Neural Networks

https://d2l.ai/chapter_recurrent-modern/bi-rnn.html

In equation 9.4.8, the dimensions of O_t are given as n x q. Is that a typo? Shouldn’t it be 1 x q (same as b_q)?

It's not a typo. There are q outputs and the batch size is n, so O_t is n x q: one row of q outputs for each example in the batch.
The bias b_q is 1 x q, but it is broadcast across all n rows during the addition, so the sum is still n x q.
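
If it helps to see the shapes, here is a minimal sketch in plain Python that writes the broadcast out by hand. It assumes the output layer has the form O_t = H_t W_hq + b_q with H_t of shape n x 2h and W_hq of shape 2h x q. The sizes n = 3, h = 2, q = 4, the fill values, and the helper names (`matmul`, `add_bias`, `shape`) are made up for illustration and are not from the book's code.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add_bias(m, bias_row):
    """Add the same 1 x q bias row to every row of m (explicit broadcast)."""
    return [[v + b for v, b in zip(row, bias_row)] for row in m]


def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(m), len(m[0]))


# Illustrative sizes (assumptions): batch n, hidden units h per direction, outputs q.
n, h, q = 3, 2, 4

H_t = [[1.0] * (2 * h) for _ in range(n)]   # n x 2h: forward and backward states concatenated
W_hq = [[0.5] * q for _ in range(2 * h)]    # 2h x q
b_q = [[0.0, 1.0, 2.0, 3.0]]                # 1 x q

# The single bias row is reused for each of the n rows of H_t W_hq.
O_t = add_bias(matmul(H_t, W_hq), b_q[0])

print(shape(b_q))  # (1, 4)
print(shape(O_t))  # (3, 4)
```

Frameworks such as NumPy or PyTorch do the same thing implicitly when you add a 1 x q tensor to an n x q tensor, so you never have to tile the bias yourself.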