In BERT, the selected token should be replaced with a random word from the vocabulary, not with a random number.
The highlighted line should read:
vocab.to_tokens(random.randint(0, len(vocab) - 1))
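To make the fix concrete, here is a minimal sketch of BERT's 80/10/10 replacement rule for one selected token, using a toy stand-in for the `vocab` object (the `Vocab` class and `replace_mlm_token` helper are hypothetical, assumed only for illustration):

```python
import random

class Vocab:
    """Toy stand-in for the thread's `vocab` object: len(vocab) gives the
    vocabulary size, vocab.to_tokens(i) maps an index to a token string."""
    def __init__(self, tokens):
        self.idx_to_token = tokens

    def __len__(self):
        return len(self.idx_to_token)

    def to_tokens(self, idx):
        return self.idx_to_token[idx]

def replace_mlm_token(token, vocab, rng=random):
    """Apply BERT's 80/10/10 rule to one token chosen for prediction:
    80% -> '<mask>', 10% -> keep unchanged, 10% -> a random *word* drawn
    from the vocabulary (the fix discussed above, not a random number)."""
    p = rng.random()
    if p < 0.8:
        return '<mask>'
    elif p < 0.9:
        return token
    else:
        # Draw a random token from the vocabulary.
        return vocab.to_tokens(rng.randint(0, len(vocab) - 1))

vocab = Vocab(['<mask>', 'the', 'cat', 'sat', 'on', 'mat'])
replaced = [replace_mlm_token('cat', vocab) for _ in range(1000)]
```

Every replacement is then a real token string, so the corrupted sequence can still be fed through the same embedding lookup as the original.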
I agree. Contribute if you like.
See page 7 of the BERT paper: https://arxiv.org/abs/1810.04805
But what is the deeper reason for choosing 15%?
I saw your dataset in the attachment. I also need to build a custom dataset to pretrain our model, but I don't fully understand your approach. Could you explain your dataset in more detail or provide a sample walkthrough? I'm new to this area.