### Feature request: Investigate language/image diffusion models and improve the config part of this spec