Attention Cues

In an otherwise exceptional book, the sections on the attention mechanism, which is a main component of many SOTA architectures, are pretty vague and difficult to understand.

Considering the questions below, do the following replies make sense?
What can be the volitional cue when decoding a sequence token by token in machine translation?

The volitional cue could be the tokens that hold significant information in the sequence of terms to be translated. Such tokens may be the entities (nouns), that is, the facts in the sequence that provide semantic meaning through their relations with other entities.

What are the nonvolitional cues and the sensory inputs?

They could be stopwords or any other tokens that appear frequently in the sequences to be translated without carrying relevant information.