Regularization
Reduces overfitting.
Adding penalty terms to the loss (L1 norm, L2 norm).
Weight decay: w_new = w_old - η * (∇L + 2λ * w_old), where ∇L is the gradient of the original loss function and 2λ * w_old is the gradient of the L2 penalty λ * ||w||².
Dropout.
Batch Normalization (BatchNorm)
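
The weight decay update and dropout from the list above can be sketched in a few lines of NumPy. This is a minimal illustration rather than any particular library's implementation: the function names, the default values for lr and lam, and the choice of "inverted" dropout (rescaling the surviving units by 1 / (1 - p) during training) are assumptions made for the example.

```python
import numpy as np


def weight_decay_step(w, grad, lr=0.1, lam=0.01):
    """One gradient step on L(w) + lam * ||w||^2.

    Implements w_new = w_old - lr * (grad + 2 * lam * w_old),
    where grad is the gradient of the original (unpenalized) loss.
    """
    return w - lr * (grad + 2.0 * lam * w)


def dropout(x, p=0.5, rng=None, training=True):
    """Inverted dropout (assumed variant).

    During training, each unit is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p). At inference time the input
    is returned unchanged. Requires 0 <= p < 1.
    """
    if not training or p == 0.0:
        return x
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) >= p  # True for units that are kept
    return x * mask / (1.0 - p)


if __name__ == "__main__":
    w = np.array([1.0, -2.0, 0.5])
    grad = np.zeros_like(w)  # with a zero loss gradient, only the decay acts
    print(weight_decay_step(w, grad))
    print(dropout(np.ones(8), p=0.5, rng=np.random.default_rng(0)))
```

With a zero loss gradient, each step simply multiplies the weights by (1 - 2 * lr * lam), which is why the technique is called weight decay.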