Attention Pooling: Nadaraya-Watson Kernel Regression

A better idea was proposed by Nadaraya [Nadaraya, 1964] and Watson

