[feat]: Adding azure NLP graders and python grader #4706
Merged
This pull request introduces significant enhancements to the reward scoring and grading functionality within the ACPT-RFT environment. The main updates include the addition of new modular grader scripts for Azure-based and Python-based grading, an extensible reward scoring entry point, and several dependency updates and Dockerfile improvements to support these features.
Reward scoring and grading enhancements:
- Added the `default_compute_score` function in `__init__.py` to serve as a unified entry point for reward scoring across multiple data sources, supporting both legacy and new grader modules.
- Added `azure_grader.py`, a flexible grader supporting both string matching and text similarity metrics (BLEU, ROUGE, METEOR, etc.), with fallback logic if the HuggingFace `evaluate` library is unavailable.
- Added `azure_python_grader.py`, a customizable Python code grader that validates syntax using AST parsing, intended for user extension and custom grading logic.

Dockerfile and dependency updates:
- Upgraded `verl` from version 0.6.0 to 0.6.1 and updated `vllm` from 0.11.1 to 0.13.0; added new dependencies `openai` and `DeepGEMM`; and copied the new grader scripts into the appropriate locations in the Docker image.
- Made the new grader scripts (`azure_grader.py`, `azure_python_grader.py`, and the updated `__init__.py`) available in the `verl.utils.reward_score` package by copying them in the Docker build.

These changes collectively provide a more modular, extensible, and robust reward scoring framework, enabling easier customization and improved evaluation capabilities for different data sources and grading requirements.
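For reviewers who want a concrete picture of the three pieces described above (a dispatching entry point, a similarity grader with a fallback, and an AST-based syntax check), here is a minimal sketch. It is not the code from this PR: the function names, the data source strings (`"exact_match"`, `"similarity"`, `"python_syntax"`), the argument names, and the choice of `difflib` as the fallback metric are all assumptions made for illustration.

```python
import ast
import difflib


def syntax_score(code: str) -> float:
    """Return 1.0 if `code` parses as Python source, else 0.0."""
    try:
        ast.parse(code)
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as strings with null bytes
        # on older Python versions.
        return 0.0
    return 1.0


def fallback_similarity(response: str, reference: str) -> float:
    """Standard-library similarity in [0, 1] (assumed fallback metric)."""
    return difflib.SequenceMatcher(None, response, reference).ratio()


def similarity_score(response: str, reference: str) -> float:
    """Use ROUGE-L from HuggingFace `evaluate` if it works, else fall back."""
    try:
        import evaluate  # optional dependency

        rouge = evaluate.load("rouge")
        result = rouge.compute(predictions=[response], references=[reference])
        return float(result["rougeL"])
    except Exception:
        # Library missing, metric not downloadable, etc.
        return fallback_similarity(response, reference)


def compute_score_sketch(data_source: str, solution_str: str, ground_truth: str) -> float:
    """Hypothetical dispatcher: route each data source to one grader."""
    if data_source == "exact_match":
        return float(solution_str.strip() == ground_truth.strip())
    if data_source == "similarity":
        return similarity_score(solution_str, ground_truth)
    if data_source == "python_syntax":
        return syntax_score(solution_str)
    raise NotImplementedError(f"No grader registered for {data_source!r}")
```

A call such as `compute_score_sketch("python_syntax", "x = 1", "")` routes to the syntax check, while an unknown data source raises `NotImplementedError` rather than silently returning a score. Catching a broad `Exception` around the `evaluate` path is a deliberate simplification here; a real grader might prefer to catch `ImportError` separately and log other failures.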
Test links