LLM-Foundry recently added a lot of eval support, and I think it should be possible to add several useful evals that run during training.