Numerical Stability and Initialization

Hi, in section 5.4. Numerical Stability and Initialization — Dive into Deep Learning 1.0.3 documentation, there is this code:

M = torch.normal(0, 1, size=(4, 4))
print('a single matrix \n',M) # πŸ‘ˆπŸ» in this line
for i in range(100):
    M = M @ torch.normal(0, 1, size=(4, 4))
print('after multiplying 100 matrices\n', M)

Please add a space after the comma. Let’s keep the code clean :wink:

:point_right:t2: The line print('a single matrix \n',M) should be print('a single matrix \n', M)

Thanks!!!

my exercise:

  1. Different architectures may exhibit different kinds of symmetry. Could a linear network exhibit symmetry in its weight matrix?
  2. No, SGD alone cannot break this symmetry, so the network cannot train efficiently.
  3. The eigenvalues of a product of two matrices may approximate the product of the two matrices’ eigenvalues under some conditions? The ratio of the largest eigenvalue to the smallest eigenvalue (the condition number) should not be too large or too small, to keep the gradients stable.
  4. TBD
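For point 3, one quick sanity check is to look at the singular values of the product from the book's example (for non-symmetric random matrices, singular values rather than eigenvalues govern how gradient magnitudes scale). A minimal sketch, using torch.randn in place of torch.normal and torch.linalg.svdvals for the spectrum; float64 is used so the product does not overflow:

```python
import torch

torch.manual_seed(0)  # for reproducibility

# Multiply 100 random Gaussian matrices, as in the book's example.
M = torch.eye(4, dtype=torch.float64)
for _ in range(100):
    M = M @ torch.randn(4, 4, dtype=torch.float64)

# The spread between the largest and smallest singular value
# (the condition number) grows rapidly with the number of factors,
# which is exactly what destabilizes gradients in deep networks.
s = torch.linalg.svdvals(M)
print('largest singular value: ', s.max().item())
print('smallest singular value:', s.min().item())
print('condition number:       ', (s.max() / s.min()).item())
```

Running this shows the largest singular value blowing up and the condition number growing astronomically, consistent with the exploding values printed in the book's snippet.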