Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource file path from simulation #1410

Open
wants to merge 52 commits into
base: master
Choose a base branch
from

Conversation

jkumwenda
Copy link
Collaborator

Created resource file path function and calling it from different modules.

@jkumwenda jkumwenda linked an issue Jul 2, 2024 that may be closed by this pull request
5 tasks
@@ -44,7 +44,7 @@ class Simulation:
"""

def __init__(self, *, start_date: Date, seed: int = None, log_config: dict = None,
show_progress_bar=False):
show_progress_bar=False, resourcefilepath = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add str type hint to resourcefilepath.

@@ -80,6 +81,7 @@ def __init__(self, *, start_date: Date, seed: int = None, log_config: dict = Non
data=f'Simulation RNG {seed_from} entropy = {self._seed_seq.entropy}'
)
self.rng = np.random.RandomState(np.random.MT19937(self._seed_seq))
self.resourcefilepath = resourcefilepath
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we could convert and store Path type and check that path exists.

thewati and others added 20 commits July 30, 2024 15:50
…tate cancer, fixed test equipment and dxmanager
…t.py and breast_cancer.py method updated for resource file path from simulation.py
…t.py and breast_cancer.py method updated for resource file path from simulation.py
…t.py isort the import to fix incorrectly sorted error
@thewati
Copy link
Collaborator

thewati commented Oct 2, 2024

Is there anyone who has an idea how I can solve this problem. Whenever I run scripts/hiv/batch_test_runs.py, I am
getting a file not found error. My path is getting mixed up with a path to ResourceFile_Demography_parameters.csv.
Can't pinpoint where these two paths are getting concatenated. The paths are C:\\Users\\User\\PycharmProjects\\TLOmodel\\src\\scripts\\hiv\\projections_jan2023\\batch_test_runs.py and demography\\ResourceFile_Demography_parameters.csv

@thewati thewati requested a review from mnjowe October 7, 2024 07:06
@tbhallett tbhallett requested a review from tamuri October 7, 2024 13:41
Copy link
Collaborator

@tbhallett tbhallett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done!!! This seems to do the job, and with a minimum if invasiveness! Thank you.

See a couple of comments/queries.

src/tlo/methods/epi.py Show resolved Hide resolved
Comment on lines 396 to 403
logger.info(key="message", data=f"Running draw {sample['draw_number']}, sample {sample['sample_number']}")

sim = Simulation(
start_date=self.scenario.start_date,
seed=sample["simulation_seed"],
log_config=log_config,
resourcefilepath=self.scenario.scenario_path)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confused about the need for these changes? Should they be on this PR?!?!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mostly following the Simulation objects to make changes to them wherever they are... If this change is not necessary for this PR, I can rollback

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah if it's not necessary for the PR, then I think it's best to roll it back.

src/tlo/scenario.py Outdated Show resolved Hide resolved
def __init__(
self,
*,
start_date: Date,
seed: Optional[int] = None,
log_config: Optional[dict] = None,
show_progress_bar: bool = False,
resourcefilepath: Optional[Path] = None,
resourcefilepath: str = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
resourcefilepath: str = None,
resourcefilepath: Optional[str|Path] = None,

Copy link
Collaborator

@tbhallett tbhallett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Wati and Joel. This all seems very good to me.

@tbhallett
Copy link
Collaborator

@tamuri and @matt-graham -- this looks ready to go, from my perspective

@tbhallett
Copy link
Collaborator

@jkumwenda -- please merge in master and resolve conflicts

@tamuri -- please can we merge this when done, so this doesn't keep getting out of date?

@jkumwenda
Copy link
Collaborator Author

Will fix that and merge

@tbhallett
Copy link
Collaborator

@jkumwenda --- please could you update this so we can merge it in? As we wait, it becomes out-of-date and requires more updates.

@jkumwenda
Copy link
Collaborator Author

jkumwenda commented Jan 8, 2025 via email

thewati and others added 3 commits January 9, 2025 17:01
# Conflicts:
#	src/scripts/epilepsy_analyses/analysis_epilepsy.py
#	src/tlo/methods/care_of_women_during_pregnancy.py
#	src/tlo/methods/contraception.py
#	src/tlo/methods/epi.py
#	src/tlo/methods/hiv.py
#	src/tlo/methods/hiv_tb_calibration.py
#	src/tlo/methods/labour.py
#	src/tlo/methods/malaria.py
#	src/tlo/methods/measles.py
#	src/tlo/methods/newborn_outcomes.py
#	src/tlo/methods/postnatal_supervisor.py
#	src/tlo/methods/pregnancy_supervisor.py
#	src/tlo/methods/schisto.py
#	src/tlo/methods/simplified_births.py
#	src/tlo/methods/tb.py
Copy link
Collaborator

@mnjowe mnjowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Wati and Joel, for the excellent work on this PR. Below are my comments, which may seem many but primarily revolve around the following key points:

  1. I suggest initialising resourcefilepath as a path object in both the analysis scripts and test files. This change will help eliminate repetitive code currently arising from creating a path object from resourcefilepath in each disease module.
  2. I suggest reverting the changes made to the simulation end_date and population sizes in some of the analysis scripts.
  3. I suggest removing str option for resourcefilepath argument in Simulation object.
  4. I suggest making resourcefilepath argument in the read_parameters section optional to improve readability
  5. I suggest removing the condition to check if resourcefilepath is empty in utils. There may be a more efficient way to handle this check.

For changing resourcefilepath from str to path, I couldn't provide a comment on every affected line. However, if you agree that it should be declared as a path object (rather than a string), you can apply this change consistently across all affected areas. Similarly, if you agree with making resourcefilepath in read_parameters section optional, you can apply this adjustment to all relevant sections.

Once again, thank you for the great work on this PR!

Comment on lines +40 to +41
end_date = Date(2011, 12, 31)
popsize = 5000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you make these changes just to make the script run faster? if yes can you now please revert the changes?

@@ -348,7 +348,7 @@ def plot_modal_gbd_deaths_by_age_group(self):
start_date = Date(2010, 1, 1)
end_date = Date(2030, 1, 1)

resourcefilepath = Path("./resources") # Path to resource files
resourcefilepath = './resources' # Path to resource files
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you're changing from Path to string here?

Comment on lines +41 to +42
end_date = Date(2011, 7, 1)
pop_size = 1000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, if the intention was to make this run faster, revert the changes


# Path to the resource files used by the disease and intervention methods
resources = "./resources"
resourcefilepath = "./resources"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we standardise path here i.e. making it to resourcefilepath = Path("./resources") and read it in read_parameters as read_csv_files(resourcefilepath / resourcefile_folder_name). I feel this will be good as we will initialise path once rather than each module initialising it.

@@ -25,7 +25,7 @@
# %%


resourcefilepath = Path("./resources")
resourcefilepath = './resources'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. why changing? I feel like it will be good to initialise path here rather than in the module

@@ -192,12 +191,12 @@ def __init__(self, name=None, resourcefilepath=None):
)
}

def read_parameters(self, data_folder):
def read_parameters(self, resourcefilepath=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it optional?

"""Setup parameters used by the module, now including disability weights"""

# Update parameters from the resourcefile
self.load_parameters_from_dataframe(
pd.read_excel(Path(self.resourcefilepath) / "ResourceFile_Breast_Cancer.xlsx",
pd.read_excel(Path(resourcefilepath) / "ResourceFile_Breast_Cancer.xlsx",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be good if you could have initialised resourcefilepath as path object and avoid creating it here

@@ -256,7 +255,7 @@ def __init__(self, name=None, resourcefilepath=None, do_log_df: bool = False, do
self.lms_event_death = dict()
self.lms_event_symptoms = dict()

def read_parameters(self, data_folder):
def read_parameters(self, resourcefilepath=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it optional?

@@ -273,7 +272,7 @@ def read_parameters(self, data_folder):
ResourceFile_cmd_events_hsi.xlsx = HSI parameters for events

"""
cmd_path = Path(self.resourcefilepath) / "cmd"
cmd_path = Path(resourcefilepath) / "cmd"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resourcefilepath could be passed as a path object already and avoid creating it here

def __init__(
self,
*,
start_date: Date,
seed: Optional[int] = None,
log_config: Optional[dict] = None,
show_progress_bar: bool = False,
resourcefilepath: Optional[Path] = None,
resourcefilepath: Optional[str | Path] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should only take Path. I think string option should be removed

@jkumwenda
Copy link
Collaborator Author

Thank you, Wati and Joel, for the excellent work on this PR. Below are my comments, which may seem many but primarily revolve around the following key points:

  1. I suggest initialising resourcefilepath as a path object in both the analysis scripts and test files. This change will help eliminate repetitive code currently arising from creating a path object from resourcefilepath in each disease module.
  2. I suggest reverting the changes made to the simulation end_date and population sizes in some of the analysis scripts.
  3. I suggest removing str option for resourcefilepath argument in Simulation object.
  4. I suggest making resourcefilepath argument in the read_parameters section optional to improve readability
  5. I suggest removing the condition to check if resourcefilepath is empty in utils. There may be a more efficient way to handle this check.

For changing resourcefilepath from str to path, I couldn't provide a comment on every affected line. However, if you agree that it should be declared as a path object (rather than a string), you can apply this change consistently across all affected areas. Similarly, if you agree with making resourcefilepath in read_parameters section optional, you can apply this adjustment to all relevant sections.

Once again, thank you for the great work on this PR!

Thanks for these comments, we will review and provide feedback line by line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Ready for EM review
Development

Successfully merging this pull request may close these issues.

Get resource file path from simulation
5 participants