Numerical Stability and Initialization

Hi, in section 5.4. Numerical Stability and Initialization — Dive into Deep Learning 1.0.3 documentation, there is this code:

M = torch.normal(0, 1, size=(4, 4))
print('a single matrix \n',M) # πŸ‘ˆπŸ» in this line
for i in range(100):
    M = M @ torch.normal(0, 1, size=(4, 4))
print('after multiplying 100 matrices\n', M)

Please add a space after the comma. Let’s keep the code clean :wink:

:point_right:t2: The line print('a single matrix \n',M) should be print('a single matrix \n', M)

Thanks!!!

my exercise:

  1. Different architectures may exhibit different kinds of symmetry. Could a linear network exhibit symmetry in its weight matrix?
  2. No, SGD alone cannot break this symmetry, so the network cannot train efficiently.
  3. The eigenvalues of a product of two matrices may approximate the product of the two matrices’ eigenvalues under some conditions? The ratio of the largest eigenvalue to the smallest eigenvalue (the condition number) should not be too large or too small, to keep the gradients stable.
  4. TBD
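For point 3, one quick sanity check is to look at the singular values of the product from the book's example (for non-symmetric random matrices, singular values rather than eigenvalues govern how gradient magnitudes scale). A minimal sketch, using torch.randn in place of torch.normal and torch.linalg.svdvals for the spectrum; float64 is used so the product does not overflow:

```python
import torch

torch.manual_seed(0)  # for reproducibility

# Multiply 100 random Gaussian matrices, as in the book's example.
M = torch.eye(4, dtype=torch.float64)
for _ in range(100):
    M = M @ torch.randn(4, 4, dtype=torch.float64)

# The spread between the largest and smallest singular value
# (the condition number) grows rapidly with the number of factors,
# which is exactly what destabilizes gradients in deep networks.
s = torch.linalg.svdvals(M)
print('largest singular value: ', s.max().item())
print('smallest singular value:', s.min().item())
print('condition number:       ', (s.max() / s.min()).item())
```

Running this shows the largest singular value blowing up and the condition number growing astronomically, consistent with the exploding values printed in the book's snippet.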