Custom Layers

https://d2l.ai/chapter_builders-guide/custom-layer.html

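So here the weights are initialized right after I call dense.initialize(). This is in contrast with deferred initialization, where parameter shapes are only inferred on the first forward pass. It makes sense because we specified in_units, so the weight shape is already known when initialize() runs.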
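To make the difference concrete, here is a minimal pure-Python sketch of the two behaviours. It is not MXNet's implementation: the ToyDense class, its _allocate helper, and the list-of-lists weights are all hypothetical and only mimic the idea that weights can be allocated at initialize() when the input size is known, and otherwise have to wait for the first input.

```python
import random


class ToyDense:
    """Hypothetical stand-in for a dense layer; NOT MXNet's implementation."""

    def __init__(self, units, in_units=None):
        self.units = units
        self.in_units = in_units  # None means the input size is not known yet
        self.weight = None
        self._init_requested = False

    def initialize(self):
        self._init_requested = True
        if self.in_units is not None:
            # The full weight shape is known, so allocate immediately.
            self._allocate()

    def _allocate(self):
        # Weight has shape (in_units, units), stored as a list of rows.
        self.weight = [[random.gauss(0, 0.01) for _ in range(self.units)]
                       for _ in range(self.in_units)]

    def forward(self, x):
        # x is a batch: a list of rows, each with in_units numbers.
        if self.weight is None:
            if not self._init_requested:
                raise RuntimeError("call initialize() before forward()")
            # Deferred case: infer in_units from the first input seen.
            self.in_units = len(x[0])
            self._allocate()
        columns = list(zip(*self.weight))  # one tuple per output unit
        return [[sum(xi * wi for xi, wi in zip(row, col)) for col in columns]
                for row in x]


eager = ToyDense(3, in_units=5)
eager.initialize()       # weights exist right away

deferred = ToyDense(3)
deferred.initialize()    # nothing allocated yet, input size unknown
```

After this runs, eager.weight is already a 5 x 3 list, while deferred.weight stays None until the first call such as deferred.forward([[1.0] * 4]).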

Is MyDense() as efficient as nn.Dense()?

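I’m afraid not. For one thing, mxnet.gluon.nn.Dense is a HybridBlock, so it can be hybridized (see 12.1.2), and its backend is implemented in C/C++, whereas MyDense subclasses nn.Block and its forward computation is driven from Python.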