Fine-Tuning BERT for Sequence-Level and Token-Level Applications

https://d2l.ai/chapter_natural-language-processing-applications/finetuning-bert.html

Exercise 1. One idea would be to create negative samples by picking any article at random (except the labeled one) and labeling that pair with 0. The training dataset for BERT would then consist of (query, article) pairs together with their corresponding targets (1 or 0).
Finally, to get the ranking, we would run the model for a specific query against the n articles and sort them by the model's softmax output for the positive class to obtain the relevance ranking (see the sketch below).
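
Here is a minimal sketch of that idea, assuming the Hugging Face `transformers` library (not the book's own d2l code); the queries, articles, and relevance mapping are made-up placeholders, and the actual training loop on the generated pairs is omitted.

```python
import random
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical corpus: each query has exactly one labeled relevant article.
queries = ["how to fine-tune bert", "what is attention"]
articles = ["A guide to fine-tuning BERT ...",
            "Attention mechanisms explained ...",
            "An unrelated cooking recipe ..."]
relevant = {0: 0, 1: 1}  # query index -> index of its relevant article

def build_pairs(num_negatives=1):
    """Create (query, article, label) pairs: 1 for the relevant article,
    0 for randomly sampled non-relevant ones."""
    pairs = []
    for qi, ai in relevant.items():
        pairs.append((queries[qi], articles[ai], 1))
        negatives = [j for j in range(len(articles)) if j != ai]
        for j in random.sample(negatives, k=min(num_negatives, len(negatives))):
            pairs.append((queries[qi], articles[j], 0))
    return pairs

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)

def rank_articles(query):
    """Score every article against the query and sort by the softmax
    probability of the 'relevant' class (label 1)."""
    enc = tokenizer([query] * len(articles), articles,
                    padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**enc).logits, dim=-1)[:, 1]
    return sorted(zip(articles, probs.tolist()), key=lambda x: -x[1])

train_pairs = build_pairs()  # feed these into a standard classification fine-tuning loop
```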

For the question-answering task, in my opinion, the end of the answer span should be predicted relative to the start position (or the start token). The book, however, uses 3 independent FC layers for inference, which I don't quite agree with.
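
For reference, the common design (e.g., in the original BERT paper) scores start and end positions with independent projections over the token representations. Below is a minimal PyTorch sketch of that independent design, plus a simple post-hoc way to enforce end >= start at inference without making the end head depend on the start; the class name and the `max_len` cutoff are my own placeholders, not the book's code.

```python
import torch
from torch import nn

class SpanHead(nn.Module):
    """Two independent linear projections produce per-token start and end
    logits on top of the BERT encoder's token representations."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.start = nn.Linear(hidden_size, 1)
        self.end = nn.Linear(hidden_size, 1)

    def forward(self, token_reprs):  # (batch, seq_len, hidden_size)
        start_logits = self.start(token_reprs).squeeze(-1)  # (batch, seq_len)
        end_logits = self.end(token_reprs).squeeze(-1)      # (batch, seq_len)
        return start_logits, end_logits

def best_span(start_logits, end_logits, max_len=30):
    """Pick the highest-scoring (start, end) pair with end >= start and a
    bounded span length by masking invalid combinations."""
    scores = start_logits.unsqueeze(-1) + end_logits.unsqueeze(-2)  # (batch, S, S)
    seq_len = start_logits.size(-1)
    idx = torch.arange(seq_len)
    valid = (idx.unsqueeze(-1) <= idx) & (idx - idx.unsqueeze(-1) < max_len)
    scores = scores.masked_fill(~valid, float("-inf"))
    best = scores.flatten(-2).argmax(-1)
    return best // seq_len, best % seq_len  # start and end token indices
```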

Would it also be possible to provide case examples/code for fine-tuning BERT for text classification, text tagging, and question answering?
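
While waiting for an official example, here is a minimal sketch of the text classification case, assuming the Hugging Face `transformers` library rather than the book's d2l code, with a toy two-sentence batch as placeholder data; text tagging and question answering follow the same pattern with `BertForTokenClassification` and `BertForQuestionAnswering` and their corresponding label formats.

```python
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)

# Toy labeled data (placeholders).
texts = ["great movie", "terrible plot"]
labels = torch.tensor([1, 0])

enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few passes over the toy batch
    optimizer.zero_grad()
    out = model(**enc, labels=labels)  # cross-entropy loss on the [CLS] representation
    out.loss.backward()
    optimizer.step()
```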