Medhelm Epic #3787
openpyxl~=3.1
python-docx~=1.1
transformers~=4.45,<4.50
evaluation-instruments @ git+https://github.com/epic-open-source/evaluation-instruments.git@1c4637e84fe4dc54f6695e438f3baca6b2cd4573
PyPI packages cannot depend on packages outside PyPI. You should instead provide instructions for users to install this package manually, either by printing the installation command in an error message or by documenting it in ReadTheDocs.
from helm.benchmark.annotation.model_as_judge import AnnotatorModelInfo, LLMAsJuryAnnotator
from helm.clients.auto_client import AutoClient

from evaluation_instruments import prep
Display an error if this is not installed:
    from helm.common.optional_dependencies import OptionalDependencyNotInstalled

    try:
        from evaluation_instruments import prep
        import evaluation_instruments.instruments.pdsqi_9.pdsqi_prompt as pdsqi
    except ModuleNotFoundError as e:
        # Provide manual instructions for installing evaluation-instruments from GitHub
        # because PyPI does not allow installing dependencies directly from GitHub.
        raise OptionalDependencyNotInstalled(
            f"Optional dependency {e.name} is not installed. "
            "Please run `pip install 'evaluation-instruments @ git+https://github.com/epic-open-source/evaluation-instruments.git@1c4637e84fe4dc54f6695e438f3baca6b2cd4573'` to install it."  # noqa: E501
        ) from e

def get_note_summary_spec(config_path: Optional[str] = None) -> RunSpec:
    if config_path is None:
        package = "helm.benchmark.scenarios"
        config_path = str(pkg_resources.files(package).joinpath("note_summary_scenario.yaml"))
You need to add *.yaml to the manifest, or this file will not actually get included in the package.
Line 3 in 89001e7
recursive-include src/helm/benchmark/ *.json
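One way to catch a missing data file before release is to check that it resolves from the installed package rather than from the source checkout. The following is a minimal sketch, assuming Python 3.9+; `resource_is_packaged` is a hypothetical helper for illustration and is not part of HELM.

```python
from importlib import resources


def resource_is_packaged(package: str, filename: str) -> bool:
    """Return True if `filename` is present inside the importable `package`."""
    try:
        # files() imports the package and returns a traversable root for its data.
        return resources.files(package).joinpath(filename).is_file()
    except ModuleNotFoundError:
        # The package itself is not installed in this environment.
        return False
```

Calling `resource_is_packaged("helm.benchmark.scenarios", "note_summary_scenario.yaml")` in an environment where the built wheel or sdist is installed (not an editable checkout) should indicate whether the YAML file actually shipped.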
        return instances

    def read_file(self, file_path: str) -> List[str]:
Delete unused method.
@dataclass
class Rubric:
I did not look at the rubric logic too closely, but let me know if there's anything you want me to check.
Ping - last activity was two weeks ago.
This PR adds a new scenario, NoteSummaryScenario, developed in partnership with Epic Systems. The scenario focuses on generating clinical note summaries for the Emergency Medicine specialty, reflecting real-world medical documentation needs.

To assess the quality of the model-generated summaries, we adopt the "LLM as a judge" evaluation framework based on PDSQI-9, a rubric co-developed by UW Madison and Epic Systems for systematic evaluation that doesn't require a gold standard response to compare against.
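
To make the jury idea concrete, here is a minimal sketch of how rubric scores from several judge models might be combined into one score per attribute. It does not reflect the actual behavior of HELM's `LLMAsJuryAnnotator` or the `evaluation-instruments` package; the judge names, attribute names, score values, and the `aggregate_jury_scores` helper are all illustrative assumptions.

```python
from statistics import mean
from typing import Dict, List, Mapping


def aggregate_jury_scores(
    judge_scores: Mapping[str, Mapping[str, float]],
) -> Dict[str, float]:
    """Average each rubric attribute over the judges that scored it."""
    per_attribute: Dict[str, List[float]] = {}
    for scores in judge_scores.values():
        for attribute, score in scores.items():
            per_attribute.setdefault(attribute, []).append(score)
    return {attribute: mean(values) for attribute, values in per_attribute.items()}


# Hypothetical judges and attributes, for illustration only.
example = {
    "judge_a": {"accurate": 4, "organized": 5},
    "judge_b": {"accurate": 2, "organized": 5},
}
result = aggregate_jury_scores(example)
```

Because no gold-standard summary is involved, each judge scores the generated summary directly against the rubric, and averaging is only one of several possible ways to reconcile disagreement between judges.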