The Dataset for Pretraining Word Embeddings

https://d2l.ai/chapter_natural-language-processing-pretraining/word-embedding-dataset.html