Encoder-Decoder Architecture


  1. I think the encoder and decoder don’t have to be the same type of neural network. For example, a CNN encoder can feed an RNN decoder, as in image captioning.
  2. Question answering. I applied this approach to the Conversational Question Answering (CoQA) dataset and got a perplexity of 1.6, which is not too bad for a very simple model.
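
For anyone wondering how to read a perplexity number like the one above: perplexity is the exponential of the average per-token cross-entropy, so it can be computed from the log-probabilities the model assigns to the correct tokens. Here is a minimal sketch in plain Python. The `perplexity` helper and the example probabilities are made up for illustration, and I'm assuming natural logs; this is not the code used for the CoQA run.

```python
import math


def perplexity(token_log_probs):
    """Exponential of the average negative log-likelihood per token.

    `token_log_probs` holds the natural-log probability the model assigned
    to each correct target token (hypothetical input format).
    """
    if not token_log_probs:
        raise ValueError("need at least one token")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Made-up probabilities for the correct token at four decoding steps.
example_probs = [0.9, 0.5, 0.7, 0.4]
example_ppl = perplexity([math.log(p) for p in example_probs])
print(f"perplexity: {example_ppl:.3f}")

# A perplexity of 1.6 corresponds to this average cross-entropy (in nats).
print(f"cross-entropy at perplexity 1.6: {math.log(1.6):.3f}")
```

Read the other way, a perplexity of 1.6 means the model gives the correct token a geometric-mean probability of 1/1.6 = 0.625, while a model guessing uniformly over k tokens would score a perplexity of k.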

I’ve seen Encoder/Decoder approaches used in recommender system applications before.

I’ve seen the encoder initialized with pretrained weights. Can the decoder be pretrained as well? My hunch is that if the decoder is pretrained, then the encoder must also be pretrained. Otherwise, the representation going into the decoder will differ from the one it saw during pretraining.