Self-contained data pre-processing example for binary classification (cancer vs. no cancer) #7

@moritzknolle

Dear EMBED OPEN DATA team,
Would it be possible to adapt the 'NUS_example_pipeline.ipynb' notebook so that it is entirely self-contained and can be executed with access to only the .csv files available on AWS? Currently, the notebook relies heavily on additional columns (exam_birads, exam_path_severity, exam_path_desc, exam_outcome, etc.) that come from complicated pre-processing and are not available in any of the AWS files.

I think having a self-contained example showing how to process EMBED data for cancer vs. no cancer classification, which includes official patient-level train-test split information, would greatly accelerate community adoption of this great resource!
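
To illustrate what "patient-level" means here, below is a minimal sketch and not the official EMBED split. It assigns every row of a patient to the same partition by hashing the patient ID, so no patient appears in both train and test. The column names (`patient_id`, `image`, `label`), the toy rows, the salt, and the helper names `assign_split` and `split_rows` are all hypothetical and would need to be mapped to the actual AWS .csv columns.

```python
import hashlib


def assign_split(patient_id: str, test_fraction: float = 0.2,
                 salt: str = "embed-demo") -> str:
    """Deterministically map a patient ID to 'train' or 'test'.

    Hypothetical helper: the salt and the 80/20 default are assumptions.
    """
    digest = hashlib.sha256(f"{salt}:{patient_id}".encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


def split_rows(rows, id_key="patient_id"):
    """Group rows by partition; all rows sharing an ID land in the same partition."""
    splits = {"train": [], "test": []}
    for row in rows:
        splits[assign_split(str(row[id_key]))].append(row)
    return splits


# Toy rows with invented values; label 1 = cancer, 0 = no cancer (assumed encoding).
rows = [
    {"patient_id": "P001", "image": "a.png", "label": 0},
    {"patient_id": "P001", "image": "b.png", "label": 0},
    {"patient_id": "P002", "image": "c.png", "label": 1},
    {"patient_id": "P003", "image": "d.png", "label": 0},
    {"patient_id": "P003", "image": "e.png", "label": 1},
    {"patient_id": "P004", "image": "f.png", "label": 0},
]

splits = split_rows(rows)
train_ids = {r["patient_id"] for r in splits["train"]}
test_ids = {r["patient_id"] for r in splits["test"]}

# The property that matters: no patient leaks across partitions.
assert train_ids.isdisjoint(test_ids)
```

A hash-based assignment is only one option. Its advantage is that the split stays stable when rows are added or reordered, but the realized test fraction is only approximate, which is one more reason an official split file would be preferable.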

Thanks!
