How is the KL divergence calculated?

I am confused by the KL divergence calculated here:
https://d2l.ai/chapter_appendix-mathematics-for-deep-learning/information-theory.html#example
According to this site, the KL divergence between two Gaussians with std=1 and means 0 and 1 is 0.5, which is quite different from the values computed in the example. I am also not sure how the KL divergence of two distributions can be calculated just from samples drawn from them and then sorting them.
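For context, this is how I arrive at 0.5 — a minimal sketch (assuming NumPy/SciPy, not the book's code) that compares the closed-form KL between two Gaussians with a Monte Carlo estimate from samples:

```python
import numpy as np
from scipy.stats import norm

mu_p, sigma_p = 0.0, 1.0   # P = N(0, 1)
mu_q, sigma_q = 1.0, 1.0   # Q = N(1, 1)

# Closed-form KL(P || Q) for two univariate Gaussians
kl_closed = (np.log(sigma_q / sigma_p)
             + (sigma_p**2 + (mu_p - mu_q)**2) / (2 * sigma_q**2)
             - 0.5)
print(f"closed-form KL: {kl_closed:.4f}")  # 0.5

# Monte Carlo estimate: average of log p(x) - log q(x) over samples x ~ P
rng = np.random.default_rng(0)
x = rng.normal(mu_p, sigma_p, size=1_000_000)
kl_mc = np.mean(norm.logpdf(x, mu_p, sigma_p) - norm.logpdf(x, mu_q, sigma_q))
print(f"Monte Carlo KL:  {kl_mc:.4f}")  # ~0.5
```

Both of these give roughly 0.5, so I don't understand how the sorted-sample approach in the linked example relates to this.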