Topic | Replies | Views | Date
Generalization in Deep Learning | 2 | 1728 | August 22, 2023
Adadelta | 4 | 1899 | August 22, 2023
error with d2l.HyperParameters | 6 | 1810 | August 15, 2023
Transformers for Vision | 0 | 1189 | August 14, 2023
The Transformer Architecture | 0 | 1049 | August 14, 2023
Self-Attention and Positional Encoding | 0 | 1092 | August 14, 2023
Multi-Head Attention | 0 | 1185 | August 14, 2023
The Bahdanau Attention Mechanism | 0 | 1046 | August 14, 2023
Attention Scoring Functions | 0 | 668 | August 14, 2023
Attention Pooling by Similarity | 0 | 837 | August 14, 2023
Queries, Keys, and Values | 0 | 1239 | August 14, 2023
Encoder-Decoder Seq2Seq for Machine Translation | 0 | 729 | August 14, 2023
The Encoder-Decoder Architecture | 0 | 1103 | August 14, 2023
Machine Translation and the Dataset | 0 | 671 | August 14, 2023
Bidirectional Recurrent Neural Networks | 0 | 723 | August 14, 2023
Deep Recurrent Neural Networks | 0 | 1069 | August 14, 2023
Gated Recurrent Units (GRU) | 0 | 865 | August 14, 2023
Long Short-Term Memory (LSTM) | 0 | 1116 | August 14, 2023
Concise Implementation of Recurrent Neural Networks | 0 | 673 | August 14, 2023
Recurrent Neural Network Implementation from Scratch | 0 | 1211 | August 14, 2023
Recurrent Neural Networks | 0 | 446 | August 14, 2023
Language Models | 0 | 727 | August 14, 2023
Converting Raw Text into Sequence Data | 0 | 736 | August 14, 2023
Working with Sequences | 0 | 1218 | August 14, 2023
Designing Convolution Network Architectures | 0 | 671 | August 14, 2023
Densely Connected Networks (DenseNet) | 0 | 661 | August 14, 2023
Residual Networks (ResNet) and ResNeXt | 0 | 1131 | August 14, 2023
Batch Normalization | 0 | 787 | August 14, 2023
Multi-Branch Networks (GoogLeNet) | 0 | 1026 | August 14, 2023
Network in Network (NiN) | 0 | 741 | August 14, 2023