General Guide

Large Loss in Training

Which One?

If the cause is Model Bias, switch to a larger, more flexible model. If it is an Optimization problem, then… When Gradient is Small.
Small Loss in Training
Large Loss in Test: Overfitting
Why is a more flexible model more prone to overfitting?
Solution for Overfitting
- More Training Data / Data Augmentation

- Constrain Model
Do not over-constrain the model! Otherwise you end up back at Model Bias.
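
The trade-off in the note above can be sketched with a toy example. The L2 penalty used here is only one common way to constrain a model and is an assumption, as are the helper names `fit_ridge_slope` and `mse` and the toy data; the notes do not name a specific technique.

```python
def fit_ridge_slope(xs, ys, penalty):
    """Fit y ~ w * x by least squares with an L2 penalty on w (closed form).

    Minimizes sum((w*x - y)^2) + penalty * w^2, which gives
    w = sum(x*y) / (sum(x*x) + penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data generated by y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# A larger penalty constrains the model more strongly.
for penalty in (0.0, 1.0, 1e6):
    w = fit_ridge_slope(xs, ys, penalty)
    print(f"penalty={penalty:g}  w={w:.4f}  train_mse={mse(w, xs, ys):.4f}")
```

With no penalty the slope is recovered exactly. With a very large penalty the slope is pushed toward zero and the training loss itself becomes large, which is the Model Bias situation again.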

How to Select Model
Use Cross Validation to select the model. Do not pay too much attention to the public test set, so that you do not overfit to the test data.
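
A minimal sketch of selecting a model by k-fold cross validation, using only the training data. The two candidate models (`fit_mean`, `fit_line`), the helper names, and the toy data are all hypothetical and chosen for illustration; folds are contiguous here, while real data would usually be shuffled first.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous validation folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_mean(xs, ys):
    """Candidate 1: always predict the mean of the training targets."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_score(fit, xs, ys, k=3):
    """Average validation MSE of `fit` over k folds."""
    total = 0.0
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit(train_x, train_y)
        total += sum((predict(xs[i]) - ys[i]) ** 2 for i in fold) / len(fold)
    return total / k


# Hypothetical training data, roughly y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.1, 4.9, 7.2, 9.0, 10.8]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: cv_score(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)  # lowest validation error wins
print(scores, "->", best)
```

The selection uses only validation error computed on held-out parts of the training set, so the public test score plays no role in the choice.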

Mismatch

Loss Function May Affect
Training Tips
Batch and Momentum
Adaptive Learning Rate
Summary
The most commonly used Optimizer today is Adam. However, decay is something you have to consider and specify yourself; Adam does not include decay.
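
A minimal sketch of that separation, assuming "decay" here means learning-rate decay (if weight decay is meant, it likewise has to be specified separately). The Adam update follows the standard formulation with bias correction; the function name `adam_minimize`, the `schedule` parameter, the inverse-square-root schedule, and the toy objective are hypothetical choices for illustration.

```python
import math


def adam_minimize(grad, x0, base_lr=0.5, steps=200,
                  beta1=0.9, beta2=0.999, eps=1e-8, schedule=None):
    """Minimize a 1-D function with Adam, given its gradient.

    `schedule` is an optional function t -> multiplier applied to base_lr.
    It lives outside the Adam update itself: with schedule=None the
    learning rate stays constant for the whole run.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        lr = base_lr if schedule is None else base_lr * schedule(t)
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate (momentum)
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, so the gradient is 2 * (x - 3).
grad = lambda x: 2.0 * (x - 3.0)

# One possible decay schedule (an assumption): lr_t = base_lr / sqrt(t).
inv_sqrt_decay = lambda t: 1.0 / math.sqrt(t)

x_const = adam_minimize(grad, x0=0.0)                           # no decay
x_decay = adam_minimize(grad, x0=0.0, schedule=inv_sqrt_decay)  # explicit decay
print(x_const, x_decay)
```

The point of the design is that the decay schedule is an explicit, separate argument: leaving it out does not fall back to any built-in decay.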