Natural Language Inference: Fine-Tuning BERT

The book says: "These two loss functions are irrelevant to fine-tuning downstream applications, thus the parameters of the employed MLPs in MaskLM and NextSentencePred are not updated (staled) when BERT is fine-tuned."

How is this achieved? Could you please share some sample code for this trick?

The model used here simply doesn't include the MLP layers for MLM and NSP, so those parameters never appear in the fine-tuned network at all.
By the way, if you want to "freeze" any layer, you can set its parameters' requires_grad attribute to False; this makes it more like classic transfer learning, like this:

class BERTClassifier(nn.Module):
    def __init__(self, bert):
        super().__init__()
        # Reuse the pretrained encoder, but freeze its parameters
        self.encoder = bert.encoder
        for param in self.encoder.parameters():
            param.requires_grad = False
        # Reuse and freeze the pretrained pooling MLP as well
        self.hidden = bert.hidden
        for param in self.hidden.parameters():
            param.requires_grad = False
        # Only this new output head (3 NLI classes) is trained
        self.output = nn.LazyLinear(3)

    def forward(self, inputs):
        tokens_X, segments_X, valid_lens_x = inputs
        encoded_X = self.encoder(tokens_X, segments_X, valid_lens_x)
        # Classify from the representation of the <cls> token
        return self.output(self.hidden(encoded_X[:, 0, :]))
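A minimal, self-contained sketch of the freezing trick is below. The `DummyBERT` class is a hypothetical stand-in (plain `nn.Linear` layers instead of the real Transformer encoder and pooling MLP) so the snippet runs without a pretrained checkpoint; the point is only to show that frozen parameters can be excluded when building the optimizer, so they never receive updates:

```python
import torch
from torch import nn

# Hypothetical stand-in for a pretrained BERT: the real `encoder` and
# `hidden` would be Transformer blocks and a pooling MLP.
class DummyBERT(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 16)  # placeholder encoder
        self.hidden = nn.Linear(16, 16)   # placeholder pooling MLP

class FrozenClassifier(nn.Module):
    def __init__(self, bert, num_classes=3):
        super().__init__()
        self.encoder = bert.encoder
        self.hidden = bert.hidden
        # Freeze the reused pretrained layers
        for param in self.encoder.parameters():
            param.requires_grad = False
        for param in self.hidden.parameters():
            param.requires_grad = False
        # Fresh, trainable output head
        self.output = nn.Linear(16, num_classes)

net = FrozenClassifier(DummyBERT())

# Pass only the still-trainable parameters to the optimizer, so the
# frozen encoder/hidden weights are guaranteed to stay fixed.
trainable = [p for p in net.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)

frozen = sum(1 for p in net.parameters() if not p.requires_grad)
print(f"frozen tensors: {frozen}, trainable tensors: {len(trainable)}")
```

Filtering with `requires_grad` when constructing the optimizer also saves memory, since no optimizer state is allocated for the frozen tensors.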