
Where does the field_dim come from?
I think there might be a bug in the evaluation step of the model. To measure accuracy, the current d2l.accuracy function casts the predictions to the same type as y and then counts how many match. The issue is that although y is a float, it only takes the binary values 0 or 1, so we are comparing probabilities to binary labels: unless the model outputs exactly 0 or 1, the prediction is treated as a misclassification. I did a simple comparison of the original accuracy calculation, (d2l.astype(net(X), y.dtype) == y).sum(), against (round(net(X)) == y).sum() and got wildly different results. The test was done on a single batch using X, y = next(iter(test_iter)). The round method puts the threshold at 50%, and on that batch it counted 1896 labels as classified correctly, versus only 536 with the original method.
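To make the difference concrete, here is a minimal NumPy sketch of the two ways of counting correct predictions. The helper names `count_exact` and `count_thresholded` and the toy arrays are made up for illustration; they are not part of d2l, and the real `d2l.accuracy` may differ in its details.

```python
import numpy as np

def count_exact(y_hat, y):
    # Hypothetical helper: cast predictions to y's dtype and test for equality.
    # Since y is already float, the cast leaves the probabilities unchanged.
    return float((y_hat.astype(y.dtype) == y).sum())

def count_thresholded(y_hat, y, threshold=0.5):
    # Hypothetical helper: turn probabilities into 0/1 labels before comparing.
    preds = (y_hat >= threshold).astype(y.dtype)
    return float((preds == y).sum())

# Toy batch (made-up values): binary labels stored as floats, plus probabilities.
y = np.array([1.0, 0.0, 1.0, 0.0], dtype=np.float32)
y_hat = np.array([0.9, 0.2, 1.0, 0.6], dtype=np.float32)

exact = count_exact(y_hat, y)              # only the prediction that is exactly 1.0 matches
thresholded = count_thresholded(y_hat, y)  # 0.9, 0.2 and 1.0 land on the right side of 0.5
print(exact, thresholded)
```

On this toy batch the exact comparison counts one correct prediction and the thresholded one counts three, which is the same kind of gap as in the numbers above.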
What’s the difference between MF and FM?
According to the formula in this section, each element of the embedding vector ($v_{i,l}$) has to be multiplied by $x_i$:
$$
\frac{1}{2} \sum_{l=1}^k \big ((\sum_{i=1}^d \mathbf{v}_{i, l} x_i)^2 - \sum_{i=1}^d \mathbf{v}_{i, l}^2 x_i^2 \big)
$$
But this is not seen in the code. The code just use the entire embedding vector $v_i$.
What’s going on here?
        square_of_sum = np.sum(self.embedding(x), axis=1) ** 2
        sum_of_square = np.sum(self.embedding(x) ** 2, axis=1)
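One possible explanation, assuming the inputs are one-hot encoded categorical fields (so every $x_i$ is either 0 or 1): looking up the embeddings of the active feature indices already drops every term with $x_i = 0$ and keeps $v_i$ unchanged where $x_i = 1$, so the multiplication by $x_i$ is implicit. Below is a small NumPy sketch that checks this on a single example. The table `V`, the indices `idx`, and the sizes are made up for illustration, and the sums run over axis 0 because there is no batch dimension here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 6, 3                       # number of features, number of latent factors
V = rng.normal(size=(d, k))       # stand-in for the embedding table, row i is v_i

idx = np.array([1, 4])            # indices of the active (one-hot) features
x = np.zeros(d)
x[idx] = 1.0                      # the same example written as a dense 0/1 vector

# Formula from the text, with the explicit multiplication by x_i.
weighted = V * x[:, None]         # row i is v_i * x_i
formula = 0.5 * np.sum(weighted.sum(axis=0) ** 2 - (weighted ** 2).sum(axis=0))

# Embedding-lookup version, mirroring the two lines of code above.
emb = V[idx]                      # only the rows of the active features
square_of_sum = emb.sum(axis=0) ** 2
sum_of_square = (emb ** 2).sum(axis=0)
lookup = 0.5 * np.sum(square_of_sum - sum_of_square)

# With two active features, both reduce to the single pairwise term <v_1, v_4>.
pairwise = V[1] @ V[4]
print(np.isclose(formula, lookup), np.isclose(lookup, pairwise))
```

If the features were real-valued rather than 0/1, the lookup alone would no longer match the formula and the embeddings would have to be scaled by $x_i$ explicitly.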