PR to record the prevalence of disease/condition, still births, neonatal death, maternal mortality by RachelMurray-Watson · Pull Request #1455 · UCL/TLOmodel

RachelMurray-Watson · 2024-08-15T09:58:06Z

The point prevalence is recorded for a number of modules and conditions within modules (Alri, BladderCancer, BreastCancer, CardioMetabolicDisorders (chronic_ischemic_hd, chronic_kidney_disease, chronic_lower_back_pain, diabetes, hypertension), COPD, Depression, Diarrhoea, Epilepsy, Hiv, Labor (Intrapartum stillbirth), Malaria, Measles, NewbornOutcomes, OesophagealCancer, OtherAdultCancer, PostnatalSupervisor, PregnancySupervisor (Antenatal stillbirth), ProstateCancer, RTI, Schisto, TB, and Demography (maternal_deaths, newborn_deaths)).

Additional questions:
- Is it okay to calculate the prevalence of diarrhoea? It is not really a disease in its own right, more of a symptom.
- Some modules (e.g. RTI) may be more accurately described by incidence than by prevalence. Is that useful/okay, or should they be skipped?

Other notes:
- COPD is defined as ch_lung_function > 3.
- Have not included events in the CardioMetabolicDisorders module (ever_heart_attack and ever_stroke), as these would give a cumulative incidence among living people who have had such events rather than a point prevalence.
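
As a rough illustration of the point-prevalence calculation described above, the sketch below counts living people who meet a condition's definition at a single time point. The column names `is_alive` and `ch_lung_function` follow the notes above; the toy DataFrame and the helper `point_prevalence` are hypothetical and are not taken from the PR's implementation.

```python
import pandas as pd


def point_prevalence(df: pd.DataFrame, has_condition: pd.Series) -> float:
    """Proportion of living people who currently have the condition."""
    alive = df["is_alive"]
    n_alive = int(alive.sum())
    if n_alive == 0:
        return 0.0
    return float((has_condition & alive).sum() / n_alive)


# Toy population (hypothetical values, for illustration only).
population = pd.DataFrame(
    {
        "is_alive": [True, True, True, False, True],
        "ch_lung_function": [0, 4, 5, 6, 2],
    }
)

# COPD defined as ch_lung_function > 3, as in the notes above.
copd_prevalence = point_prevalence(population, population["ch_lung_function"] > 3)
```

Here two of the four living people meet the definition; the person who has died is excluded from both numerator and denominator.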

src/tlo/methods/demography.py

tdm32 · 2024-09-04T08:56:41Z

src/tlo/methods/healthburden.py

+            force_cols=self.recognised_modules_names,
+        )
        self._years_written_to_log += [year]
+    def write_to_log_prevalence_monthly(self):


Monthly prevalence logs seem reasonable, but bear in mind that some conditions, e.g. malaria, can develop and resolve within 1 week. In these cases, we could think about using the clinical counter, which counts episodes of disease. It may not make too much difference, but it would perhaps be useful to run a daily and then a monthly logger and check whether the prevalences vary considerably.

I have made this change below
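
To make the concern about short-lived conditions concrete, here is a minimal sketch comparing a once-a-month snapshot with a daily record. The case counts, dates and population size are made up for illustration, and this is not the logger used in the PR.

```python
import pandas as pd

# Hypothetical daily case counts in a fixed population of 1,000 over two months.
days = pd.date_range("2010-01-01", "2010-02-28", freq="D")
population_size = 1000
cases = pd.Series(0, index=days)
# A short outbreak in mid-January that resolves within a week.
cases.loc["2010-01-10":"2010-01-16"] = 50

daily_prevalence = cases / population_size

# Monthly logger: one snapshot on the first day of each month.
monthly_snapshot = daily_prevalence[daily_prevalence.index.day == 1]

# Daily logger, summarised per calendar month for comparison.
by_month = daily_prevalence.groupby(daily_prevalence.index.to_period("M"))
monthly_mean_of_daily = by_month.mean()
monthly_max_of_daily = by_month.max()
```

In this toy series the monthly snapshots never see the outbreak, whereas the daily record does, which is the kind of discrepancy the daily-versus-monthly comparison would reveal.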

src/tlo/methods/healthburden.py

src/tlo/methods/malaria.py

src/tlo/methods/rti.py

src/tlo/methods/schisto.py

tdm32 · 2024-09-04T10:00:18Z

src/tlo/methods/tb.py


        return health_values.loc[df.is_alive]

+    def report_prevalence(self):


I think here we should log only active cases, this way we can compare with WHO reports / GBD etc. Also the way that we assign latent cases is not identical to other models, we don't have infections -> latent -> active so we would under-estimate the latent infections. Best to stick to symptomatic active cases only.

Okay, that's great to know, thank you!

Thanks for making the change

tdm32 · 2024-09-04T10:07:28Z

tests/test_record_prevalence_healthburden_class.py

+    assert (df.dtypes == orig.dtypes).all()
+
+
+def find_closest_recording(prevalence, target_date, log_value, column_name, multiply_by_pop):


I'm not sure how useful it is to find the closest reported prevalence value. More useful tests might be: checking that all registered modules log prevalence every month, and the inverse (no prevalence reported if a module is not registered); setting the incidence to 0 for one disease and checking that the logger reports nothing above 0; or asserting that prevalence values for 2010 are within a reasonable range, e.g. testing for extreme or unlikely values.

We had a long discussion about the test yesterday on our call Part II. We decided to include a dummy disease with which to compare prevalences (this will be in the new test file), and to see if the prevalence it reports matches what has been reported in its own logging file.

I suppose by doing it the way that I was doing it, I was trying to see if the calculations themselves were working, as well as the general mechanics of logging. But do you think such a test is unnecessary? And that by showing e.g. with a dummy module and/or what you have suggested above, it would suffice?
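
The kinds of checks suggested above could be sketched roughly as follows. The log layout (one row per month, one column per module), the helper `check_prevalence_log` and the module name `DummyDisease` are assumptions for illustration, not the structure of the actual test file.

```python
import pandas as pd


def check_prevalence_log(log: pd.DataFrame, registered: set[str]) -> None:
    """Sanity checks on a prevalence log: one row per month, one column per module."""
    # Every registered module reports, and nothing unregistered does.
    assert set(log.columns) == registered
    # No gaps: every month has a value for every module.
    assert not log.isna().any().any()
    # Prevalence is a proportion, so extreme values indicate a bug.
    assert ((log >= 0) & (log <= 1)).all().all()


# Hypothetical log for two registered modules; "DummyDisease" has incidence set to 0.
log = pd.DataFrame(
    {"Malaria": [0.10, 0.12, 0.11], "DummyDisease": [0.0, 0.0, 0.0]},
    index=pd.period_range("2010-01", periods=3, freq="M"),
)

check_prevalence_log(log, registered={"Malaria", "DummyDisease"})
# With incidence fixed at zero, nothing above zero should ever be logged.
assert (log["DummyDisease"] == 0).all()
```

These checks exercise the mechanics of logging; comparing against a dummy module with a known prevalence would additionally check the calculation itself.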

src/tlo/methods/healthburden.py

tbhallett

just adding here the comment made in-person: I like the way this is being done overall.... and we should add this new method to the Module base class so that it's formally part of the definition of a disease module (in the same way that report_dalys is.)

RachelMurray-Watson · 2024-09-04T14:49:03Z

so that it's formally part of the definition of a disease module

Grand! Have that done

src/tlo/methods/healthburden.py

tests/test_record_prevalence_healthburden_class.py

src/tlo/methods/healthburden.py

RachelMurray-Watson · 2024-09-11T13:36:15Z

just adding here the comment made in-person: I like the way this is being done overall.... and we should add this new method to the Module base class so that it's formally part of the definition of a disease module (in the same way that report_dalys is.)

(Based on conversation yesterday) - changed so that it is no longer in base class (as not all modules are disease modules), but there is an assertion checking to see that if something uses the healthburden module, it must have the report_prevalence function
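
A minimal sketch of such an assertion, assuming only that each module is a Python object that may or may not define `report_prevalence`. The class names and the helper `assert_modules_report_prevalence` are hypothetical; the actual check in the healthburden module may be written differently.

```python
class DummyDisease:
    """Hypothetical disease module that implements the required method."""

    def report_prevalence(self):
        return {"DummyDisease": 0.0}


class NotADiseaseModule:
    """Hypothetical module with no prevalence to report."""


def assert_modules_report_prevalence(modules):
    """Raise AssertionError if any module lacks a callable report_prevalence."""
    for module in modules:
        assert callable(getattr(module, "report_prevalence", None)), (
            f"{type(module).__name__} must define report_prevalence()"
        )


# Passes: the only module supplied implements the method.
assert_modules_report_prevalence([DummyDisease()])
```

Checking at registration time, rather than requiring the method on the base class, leaves non-disease modules free of a method they have no use for.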

tbhallett · 2025-12-19T09:29:08Z

Thanks Rachel.
I could see a few things causing the failing tests, so have fixed those.
I've also proposed some changes to hopefully make this more an integrated part of the framework (using module name, allowing different sizes/shapes of returns etc.)
I've refactored to avoid duplicating the code for producing age/sex prevalence results -- but we could wind that back as there are a couple of exceptions, and I did it mostly to experiment with options relating to (1) below.

Some things to discuss:

1. The measure of 'prevalence' being used mostly is {Number with disease in age-group} / {Number alive}. This is an unusual measure of 'prevalence' and most readers will expect it to be {Number with disease in age-group} / {Number alive in age-group}. There are probably good reasons why you've done your original approach, but it might be that we want to be outputting just the numerators so that a 'denominator of choice' can be applied in the scripts. I haven't done this as I didn't want to break your scripts.
2. Hard-coded logic re MMR in demography feels misplaced to me for an addition to master. This kind of thing is coming out of the demography logs already, so the advantage of making a change to master to incorporate those metrics during the simulation seems small. If we need it to make the processing for future use-cases easier, then, it's an argument for ....
3. The whole thing would actually probably fit most neatly in its own module. This would avoid having to make those additions in demography and it's all quite self-contained really. This would be a quick change, but again, didn't want to break your scripts.

None of this affects you using this for your own purposes. I was just reviewing it with a view to it coming into master, so that others can use it in the way they expect and it can be maintained along with the rest of the codebase. (You can still use it as you want by branching off from before I've added any changes).

So, I'm pausing here to discuss....

- Whether these changes would disrupt your workflow. (In which case, just keep using your own versions)
- Whether any of these design choices are ones you've already considered and rejected for reasons I'm missing right now.

When we're aligned, I can make those last little changes (refactoring in its own class, updating what the prevalence metric actually is TBD), and get it merged (in the New Year).
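
To illustrate the difference between the two denominators discussed in this comment, here is a small sketch on a toy population. The column names `age_group` and `has_disease` and all values are hypothetical.

```python
import pandas as pd

# Toy population of living people (hypothetical values).
df = pd.DataFrame(
    {
        "age_group": ["0-4", "0-4", "5-14", "5-14", "5-14", "15+", "15+", "15+"],
        "has_disease": [True, False, True, True, False, False, False, True],
    }
)

numerator = df.groupby("age_group")["has_disease"].sum()  # cases per age group
alive_in_group = df.groupby("age_group").size()  # people alive per age group
alive_total = len(df)  # everyone alive

# {Number with disease in age-group} / {Number alive}
share_of_population = numerator / alive_total
# {Number with disease in age-group} / {Number alive in age-group}
age_specific_prevalence = numerator / alive_in_group
```

Logging only `numerator` (alongside the population age breakdown) would let either measure be computed afterwards in the analysis scripts.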

RachelMurray-Watson · 2025-12-19T10:03:41Z

Thanks for making all these changes!!

1. I can't remember the logic of why we did this (it may be that we were just interested in the population prevalence, but that doesn't explain the age group breakdown). I can definitely "just" output the other numbers, though, to allow more freedom in the end product.
2. Okay, that's grand - I think there was a lot of back and forth as to how to include the various maternal/neonatal stats anyway, so removing it is no problem.
3. I hadn't considered using it as its own module, but if that is what is more convenient, I could do that (perhaps after Christmas).

… - the number of individuals of each age and sex with a specific disease/condition. NB - no longer has a denominator, just raw numbers. This replaces the "report_prevalence" that was previously in the healthburden module.
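
A rough sketch of reporting raw numbers by age and sex with no denominator, as this commit message describes. The function name `disease_numbers_by_age_sex`, the column names and the key format of the returned dict are assumptions for illustration and are not the signature of the PR's `report_disease_numbers`.

```python
import pandas as pd


def disease_numbers_by_age_sex(df: pd.DataFrame, has_condition: pd.Series) -> dict:
    """Count living people with the condition in each (age group, sex) cell."""
    cases = df.loc[df["is_alive"] & has_condition]
    counts = cases.groupby(["age_group", "sex"]).size()
    # Flatten the (age group, sex) index into string keys for logging.
    return {f"{age}_{sex}": int(n) for (age, sex), n in counts.items()}


# Toy population (hypothetical values).
population = pd.DataFrame(
    {
        "is_alive": [True, True, True, False, True],
        "age_group": ["0-4", "0-4", "5-14", "5-14", "5-14"],
        "sex": ["F", "M", "F", "F", "M"],
        "infected": [True, True, True, True, False],
    }
)

numbers = disease_numbers_by_age_sex(population, population["infected"])
```

Only cells with at least one case appear in the result; a denominator of choice can then be applied when the logs are analysed.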

tbhallett · 2026-01-07T18:16:55Z

closing this PR in favour of #1772

RachelMurray-Watson requested review from joehcollins, marghe-molaro, tbhallett and tdm32 August 21, 2024 20:46

RachelMurray-Watson marked this pull request as ready for review August 22, 2024 09:19

tdm32 reviewed Sep 4, 2024

View reviewed changes

tbhallett reviewed Sep 4, 2024

View reviewed changes

tbhallett requested changes Sep 4, 2024

View reviewed changes

RachelMurray-Watson self-assigned this Sep 5, 2024

RachelMurray-Watson added 19 commits September 30, 2024 09:25

added record of live births

f407f23

redid name of column

2de2e8e

Added in test of neonatal deaths, maternal deaths, stillbirths

184df81

remove unused variable

caf9934

isort

dbaf020

isort

68c81a7

TB recording isn't working...

fd939f7

changed to allow for simplified births logging

ddca18c

removed loading

5ff10eb

removed unused imports

f1d4abd

isort

c1a2da7

removed incorrect assignation

8f0bb0b

Changed calculation of prevalence on simple whether or not person was…

42ab8f8

… involved in an RTI

Changed calculation of prevalence to be the number of people who are …

d4f04e8

…logged as non-infected

Get_Current_Prevalence have test variable that is true/false and chan…

037eb0d

…ges the frequency of logging. Way to move this outside of function? Removed print statements

Made test a parameter that is default set to False and can be modifie…

41a63d6

…d in the script. Then, changed the offset days to 0 and have daily logging,

Ensured indenting was correct. Removed tolerance buffer for days and …

3d7e20a

…%. Added function to avoid repetition

Attempted to make the prevalence logging daily, for the purpose of th…

74b1ce3

…e test. Still logging monthly

Tidied and used function to calculate prevalences.

ad715fe

tbhallett added 13 commits December 18, 2025 16:23

need to schedule the occurence of the event that does all the work!

c6ef4bb

direct pass the df to avoid saving on module

6930608

streamline logic for information collection and processing and removi…

3c23526

…ng special case conditions

update imports

92f3f22

handle arguement for frequency of updating

8c23ffe

move dummy module to test suite

0f7b5bc

get test on dummy module working refactor mechanics unify naming conventions

fully reguluarise returns into flatten returns to dict[str, float] fo…

4b243bd

…rmat

allow nested dict

98465e2

store population age/sex breakdown

442a693

log population age/sex breakdown

ec404f2

* use utility function for ease of refactoring (wherever possible)

b762e69

* use boolean property for RTI

simply refactoring logic

0a0b075

tidy pregnancy prevalence calculation and logic

7723c05

RachelMurray-Watson added 12 commits January 5, 2026 14:15

Remove MMR etc from Demography file

b58d6c9

Merge remote-tracking branch 'origin/rmw/log_prevalence_all_disease' …

6318b3c

…into rmw/log_prevalence_all_disease # Conflicts: # src/tlo/methods/demography.py

Renamed module file

35bcc5a

Included file with parameters Redid test to not use HealthBurden Module any more

--fix flag

2021f22

Added report_disease_numbers method

9cf0f2e

isort src tests

6a39e54

ruff check src tests --fix

e918f34

Rebased onto master

35f7f62

Removed repeat

2f1d9c0

Removed "prevalence" to "number"

2dd6c5b

Fixed line formatting

Removed OPTIONAL_INIT_DEPENDENCIES = {"DiseaseNumbers"}

5fb05e5

as not called until after the initialisation

tbhallett closed this Jan 7, 2026

tbhallett mentioned this pull request Jan 7, 2026

Record the prevalence of disease/condition, still births, neonatal death, maternal mortality #1772

Merged

