Adagrad

https://d2l.ai/chapter_optimization/adagrad.html

At the beginning of this chapter, we have:

Imagine that we are training a language model. To get good accuracy we typically want to decrease the learning rate as we keep on training, usually at a rate of $O(t^{-1/2})$ or slower.

This rate confuses me. Can anyone explain where it comes from in more detail?
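My current understanding (a sketch of my own, not from the book): Adagrad divides the base learning rate $\eta$ by the square root of the accumulated squared gradients, so if gradient magnitudes stay roughly constant, the accumulator grows linearly in $t$ and the effective step size decays like $O(t^{-1/2})$. The values for `eta` and the constant gradient are arbitrary choices for illustration:

```python
import math

eta = 0.1   # base learning rate (arbitrary choice for illustration)
eps = 1e-6  # small constant for numerical stability
s = 0.0     # running sum of squared gradients

for t in range(1, 1001):
    g = 1.0                          # pretend the gradient magnitude is constant
    s += g * g                       # accumulator grows linearly in t
    lr_t = eta / math.sqrt(s + eps)  # effective learning rate
    if t in (1, 10, 100, 1000):
        # lr_t closely tracks eta * t**(-1/2)
        print(t, lr_t, eta / math.sqrt(t))
```

Is this the right way to read the $O(t^{-1/2})$ statement, or does it refer to something else?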