I'm training on a linearly separable dataset, without regularization, for thousands of iterations with a learning rate of 0.001 (in fact I have tried several combinations of these settings, and several datasets).
Given that the dataset is clearly linearly separable, I would expect two things: 100% accuracy, and a norm of the learnt weight vector (excluding the first weight, the bias) that keeps growing as training continues.
But when I compare the learnt weights with and without L2 regularization, they only differ in the fourth or fifth decimal digit...
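
For reference, here is a minimal sketch of the kind of experiment I mean. I take the setup to be logistic regression trained by plain batch gradient descent; the toy dataset and the `train` helper are illustrative, not my actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated Gaussian blobs -> a linearly separable dataset.
X_pos = rng.normal(loc=+3.0, size=(100, 2))
X_neg = rng.normal(loc=-3.0, size=(100, 2))
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(100), np.zeros(100)])

# Prepend a column of ones so w[0] is the bias term.
Xb = np.hstack([np.ones((X.shape[0], 1)), X])

def train(lam, lr=0.001, iters=10000):
    """Batch gradient descent on the mean logistic loss.
    lam is the L2 strength; the bias w[0] is not regularized."""
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))   # sigmoid predictions
        grad = Xb.T @ (p - y) / len(y)      # gradient of the log-loss
        grad[1:] += lam * w[1:]             # L2 term, bias excluded
        w -= lr * grad
    return w

for lam in (0.0, 1.0):
    w = train(lam)
    acc = np.mean((Xb @ w > 0) == (y == 1))
    print(f"lambda={lam}: accuracy={acc:.3f}, "
          f"||w[1:]|| = {np.linalg.norm(w[1:]):.6f}")
```

Comparing the printed weight norms for `lam=0.0` and `lam=1.0` is how I'm measuring the difference between the regularized and unregularized runs.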
Is that correct?