@kushalthaman
Contributor

Integrates full runtime, compilation, and correctness evaluation for GPU kernels by spinning up a Modal image. Returns a dict, `verification_result_info`, with details such as `{'correct': 1, 'compile': 1}`, runtime statistics, and error messages.
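As a rough illustration of how a caller might consume the returned dict, here is a minimal sketch. It relies only on the two keys shown above (`correct` and `compile`). The helper name `summarize_verification` is hypothetical and not part of this PR, and the runtime-statistics and error-message fields are left out because their key names are not given here.

```python
from typing import Any


def summarize_verification(verification_result_info: dict[str, Any]) -> str:
    """Collapse a verification result into a single status label.

    Hypothetical helper: assumes 'compile' and 'correct' are 0/1 flags,
    as in the example dict from the PR description.
    """
    compiled = bool(verification_result_info.get("compile", 0))
    correct = bool(verification_result_info.get("correct", 0))

    # A kernel that fails to compile cannot be checked for correctness,
    # so the compile flag is tested first.
    if not compiled:
        return "compile_failed"
    if not correct:
        return "incorrect"
    return "ok"


if __name__ == "__main__":
    print(summarize_verification({"correct": 1, "compile": 1}))
    print(summarize_verification({"correct": 0, "compile": 0}))
```

Checking `compile` before `correct` keeps the two failure modes distinct, which matters when most generated kernels fail at the compile step.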

Tested manually on kernels generated by https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (most kernels generated by this model don't compile). Will be making a schema with a stronger model, e.g. DeepSeek-R1-Distill-Qwen-32B.
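To compare models on how often their kernels compile, the per-kernel result dicts could be aggregated along these lines. This is only a sketch: `pass_rates` is a hypothetical helper, and it assumes each result carries the 0/1 `compile` and `correct` flags from the PR description.

```python
from typing import Any


def pass_rates(results: list[dict[str, Any]]) -> dict[str, float]:
    """Return the fraction of kernels that compiled and that were correct.

    Hypothetical helper: each entry is assumed to hold 0/1 flags under
    the 'compile' and 'correct' keys.
    """
    total = len(results)
    if total == 0:
        # No kernels evaluated; report zero rather than dividing by zero.
        return {"compile_rate": 0.0, "correct_rate": 0.0}

    compiled = sum(1 for r in results if r.get("compile"))
    # Only count a kernel as correct if it also compiled.
    correct = sum(1 for r in results if r.get("compile") and r.get("correct"))
    return {
        "compile_rate": compiled / total,
        "correct_rate": correct / total,
    }
```

Both rates are computed over all generated kernels, so a model whose kernels rarely compile shows a low correctness rate as well.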
