Generalization in Deep Learning | 2 | 1728 | August 22, 2023
Adadelta | 4 | 1900 | August 22, 2023
error with d2l.HyperParameters | 6 | 1810 | August 15, 2023
Transformers for Vision | 0 | 1191 | August 14, 2023
The Transformer Architecture | 0 | 1049 | August 14, 2023
Self-Attention and Positional Encoding | 0 | 1092 | August 14, 2023
Multi-Head Attention | 0 | 1185 | August 14, 2023
The Bahdanau Attention Mechanism | 0 | 1048 | August 14, 2023
Attention Scoring Functions | 0 | 669 | August 14, 2023
Attention Pooling by Similarity | 0 | 838 | August 14, 2023
Queries, Keys, and Values | 0 | 1240 | August 14, 2023
Encoder-Decoder Seq2Seq for Machine Translation | 0 | 729 | August 14, 2023
The Encoder-Decoder Architecture | 0 | 1104 | August 14, 2023
Machine Translation and the Dataset | 0 | 671 | August 14, 2023
Bidirectional Recurrent Neural Networks | 0 | 724 | August 14, 2023
Deep Recurrent Neural Networks | 0 | 1069 | August 14, 2023
Gated Recurrent Units (GRU) | 0 | 865 | August 14, 2023
Long Short-Term Memory (LSTM) | 0 | 1116 | August 14, 2023
Concise Implementation of Recurrent Neural Networks | 0 | 674 | August 14, 2023
Recurrent Neural Network Implementation from Scratch | 0 | 1211 | August 14, 2023
Recurrent Neural Networks | 0 | 446 | August 14, 2023
Language Models | 0 | 728 | August 14, 2023
Converting Raw Text into Sequence Data | 0 | 737 | August 14, 2023
Working with Sequences | 0 | 1219 | August 14, 2023
Designing Convolution Network Architectures | 0 | 672 | August 14, 2023
Densely Connected Networks (DenseNet) | 0 | 661 | August 14, 2023
Residual Networks (ResNet) and ResNeXt | 0 | 1132 | August 14, 2023
Batch Normalization | 0 | 787 | August 14, 2023
Multi-Branch Networks (GoogLeNet) | 0 | 1026 | August 14, 2023
Network in Network (NiN) | 0 | 742 | August 14, 2023