When training large or deep models, exploding gradients are common and cause instability. Clipping them to a small threshold, either per element or by overall norm, is an effective way of stabilizing training.
To implement this, I believe a method on the Gradients struct would be needed (correct me if I'm wrong).
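To make the request concrete, here is a rough sketch of what such methods could look like. Everything in it is an assumption for illustration: this `Gradients` is a simplified stand-in (a map from a parameter id to a flat `Vec<f32>`), not the library's actual struct, and the names `clip_value`, `clip_norm`, and `global_norm` are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real gradient container:
/// maps a parameter id to that parameter's flattened gradient.
#[derive(Default)]
pub struct Gradients {
    grads: HashMap<usize, Vec<f32>>,
}

impl Gradients {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn insert(&mut self, id: usize, grad: Vec<f32>) {
        self.grads.insert(id, grad);
    }

    pub fn get(&self, id: usize) -> Option<&[f32]> {
        self.grads.get(&id).map(|g| g.as_slice())
    }

    /// Clip by value: clamp every element into [-threshold, threshold].
    /// Assumes `threshold` is non-negative and not NaN (`clamp` panics otherwise).
    pub fn clip_value(&mut self, threshold: f32) {
        for grad in self.grads.values_mut() {
            for x in grad.iter_mut() {
                *x = x.clamp(-threshold, threshold);
            }
        }
    }

    /// L2 norm taken over all gradients together.
    pub fn global_norm(&self) -> f32 {
        self.grads
            .values()
            .flat_map(|grad| grad.iter())
            .map(|x| x * x)
            .sum::<f32>()
            .sqrt()
    }

    /// Clip by norm: if the global L2 norm exceeds `max_norm`, rescale
    /// every gradient by the same factor so the norm equals `max_norm`.
    /// The direction of the overall gradient is preserved.
    pub fn clip_norm(&mut self, max_norm: f32) {
        let norm = self.global_norm();
        if norm > max_norm {
            let scale = max_norm / norm;
            for grad in self.grads.values_mut() {
                for x in grad.iter_mut() {
                    *x *= scale;
                }
            }
        }
    }
}
```

The idea would be to call one of these between the backward pass and the optimizer update. Clipping by value is simpler but changes the gradient's direction, while clipping by norm only shrinks its length, so it may be worth exposing both.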