Encoder-Decoder Architecture


  1. I think the encoder and decoder don’t have to be the same type of neural network.
  2. Question answering. I applied this approach to the Conversational Question Answering (CoQA) dataset. Perplexity was 1.6. Not too bad for a very simple model.
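
For context on the perplexity figure above, here is a minimal sketch of how perplexity is commonly computed: the exponential of the average negative log-likelihood per token. I don't know exactly how the 1.6 was measured (natural log and per-token averaging are assumptions here), and the probabilities below are made-up numbers, not CoQA results.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to each correct token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (cross-entropy in nats).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns the average loss back into an "effective number of choices".
    return math.exp(avg_nll)


# Hypothetical per-token probabilities for a 4-token answer (illustrative only).
example_probs = [0.9, 0.5, 0.7, 0.6]
print(perplexity(example_probs))
```

Under this definition, a perplexity of 1.6 means the geometric mean of the probabilities assigned to the correct tokens is 1/1.6 = 0.625, and a perfect model would score 1.0.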

I’ve seen encoder-decoder approaches used in recommender system applications before.

I’ve seen the encoder initialized with pretrained weights. Can the decoder be pretrained as well? My hunch is that if the decoder is pretrained, then the encoder must also be pretrained; otherwise the representation going into the decoder will differ from the one it saw during pretraining.