[RFC] Provide an explicit Cache API #891
from openfisca_core.periods import Period


class Cache:
This class interface looks very similar to the current `InMemoryStorage`. Maybe these classes could be merged?
class Cache:

    def get(self, variable: str, period: Period) -> Optional[Array]:
I initially thought about longer names like `get_cached_array` or `put_in_cache`, but as this class only does caching, they seemed redundant.
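To make the short names concrete, here is a minimal sketch of what the `get` / `put` contract could look like. It is only an illustration under assumptions: `Period` is replaced by a plain string and `Array` by a sequence of floats so that the snippet runs without OpenFisca or NumPy, and the dictionary-backed storage is hypothetical, not the implementation proposed in this PR.

```python
from typing import Dict, Optional, Sequence, Tuple

# Stand-ins for this sketch only: the PR uses openfisca_core's Period
# and a NumPy array type.
Period = str
Array = Sequence[float]


class Cache:
    """Hypothetical in-memory cache keyed by (variable name, period)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, Period], Array] = {}

    def get(self, variable: str, period: Period) -> Optional[Array]:
        # A cache miss returns None instead of raising.
        return self._store.get((variable, period))

    def put(self, variable: str, period: Period, value: Array) -> None:
        self._store[(variable, period)] = value


cache = Cache()
cache.put("salary", "2017-01", [1000.0, 2000.0])
hit = cache.get("salary", "2017-01")
miss = cache.get("toto", "2017-01")
```

Since the class name already says "cache", `cache.get(...)` and `cache.put(...)` read unambiguously at the call site.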
if period is not None:
    period = periods.period(period)

if variable.is_inactive(period):
This replaces the check on `variable.end`.
Avoid negation in the method name and use `is_active(period)` instead?
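For illustration, a positive-form check could look like the sketch below. The `Variable` class here is a hypothetical stand-in, the method takes a plain `date` (the period start) rather than an OpenFisca `Period`, and the rule "active while the period starts on or before `end`" is an assumption about what the existing `variable.end` check does.

```python
from datetime import date
from typing import Optional


class Variable:
    """Hypothetical stand-in for a variable with an optional end date."""

    def __init__(self, name: str, end: Optional[date] = None) -> None:
        self.name = name
        self.end = end

    def is_active(self, period_start: date) -> bool:
        # A variable without an end date is always active; otherwise it is
        # active as long as the period starts on or before the end date.
        return self.end is None or period_start <= self.end


ended = Variable("old_benefit", end=date(2015, 12, 31))
open_ended = Variable("salary")
```

The caller then reads `if not variable.is_active(period): return`, which keeps the negation at the call site rather than in the name.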
    return
self.get_holder(variable_name).set_input(period, value)
array = variable.cast_to_array(value)
This replaces the `_to_array` method of `Holder`.
Shorten the name to `to_array()` and add some documentation to the method, as the user might wonder why an OpenFisca variable is involved?
self.get_holder(variable_name).set_input(period, value)
array = variable.cast_to_array(value)

if len(array) != population.count:
I was tempted to abstract this check into a method somewhere else, but the error message is pretty specific to the `set_input` context, and needs access to both the `variable` and the `population`.
)

if variable.set_input:
    return variable.set_input(self, variable, period, array)
The `set_input` hooks need access to both `simulation` and `variable`. It could certainly be done better, but I'd rather make the minimum adaptation and keep further improvements out of scope, as there has been recent work on that by @Morendil.
if variable.set_input:
    return variable.set_input(self, variable, period, array)

variable.check_input_period(period)
This replaces the period checks that were spread across `holder.set_input` and `holder._set`.
As there is no mention of the input in this method, it seems more general. Rename to `check_period` / `is_valid_period` / `check_period_matching`... and adapt the documentation and log message accordingly?
@fixture
def cache():
Methodology question for @Morendil:
Let's assume that `Cache` uses, in its implementation, a bunch of `InMemoryStorage` instances to store data. This is `Cache`'s internal business, and is not reflected in the public methods.
To write a proper unit test on `Cache`, do I need to stub/mock `InMemoryStorage`, or can I just write tests on the `Cache` methods, as below?
In other words, can a unit test indirectly run some code that is outside of the class under test, or does that make it an integration test?
Thanks a lot in advance!
I try not to get too hung up on unit/integration test differences, it's a tactical matter rather than one of principle. It's fine if the methods of the object under test rely on some other collaborators, as long as you're primarily testing that object's behaviour, rather than indirectly testing the collaborators.
Ok, thanks a lot for the explanation 🙂
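As a sketch of the approach discussed in this thread, the tests below exercise `Cache` only through its public methods, even though it delegates to a collaborator internally. Both classes are simplified, hypothetical stand-ins (string periods, list values, an assumed `get(period)` / `put(value, period)` storage interface), not the real OpenFisca classes.

```python
from typing import Dict, Optional, Sequence


class InMemoryStorage:
    """Hypothetical, simplified collaborator: one instance per variable."""

    def __init__(self) -> None:
        self._arrays: Dict[str, Sequence[float]] = {}

    def get(self, period: str) -> Optional[Sequence[float]]:
        return self._arrays.get(period)

    def put(self, value: Sequence[float], period: str) -> None:
        self._arrays[period] = value


class Cache:
    """Hypothetical cache delegating to one storage per variable."""

    def __init__(self) -> None:
        self._storages: Dict[str, InMemoryStorage] = {}

    def get(self, variable: str, period: str) -> Optional[Sequence[float]]:
        storage = self._storages.get(variable)
        return None if storage is None else storage.get(period)

    def put(self, variable: str, period: str, value: Sequence[float]) -> None:
        self._storages.setdefault(variable, InMemoryStorage()).put(value, period)


# The tests state Cache's behaviour; InMemoryStorage is neither mocked
# nor asserted on directly.
def test_retrieves_what_was_put():
    cache = Cache()
    cache.put("salary", "2017-01", [1.0, 2.0])
    assert cache.get("salary", "2017-01") == [1.0, 2.0]


def test_does_not_retrieve_unknown_variable():
    cache = Cache()
    assert cache.get("toto", "2017-01") is None
```

If the storage were later swapped for another implementation, these tests would not need to change, which is a reasonable sign that they test `Cache` rather than its collaborators.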
""" | ||
pass

def get_known_periods(self, variable: str) -> List[Period]:
I'm not convinced this method needs to be part of the public contract. Who needs it? Under what circumstances?
Not sure about the exact circumstances, but it is definitely used in several places in IPP code, when they want to figure out everything that's in the cache.
The cache is sometimes updated to initialize a calculation in a reform.
We might need a clear interface for reforms and, in the meantime, use `get_known_periods`, `delete`...
def test_does_not_retrieve(cache):
    value = cache.get_cached_array('toto', period)
Suggested change:
- value = cache.get_cached_array('toto', period)
+ value = cache.get('toto', period)
    return variable.set_input(self, variable, period, array)

variable.check_input_period(period)
self.cache.put_in_cache(variable.name, period, array)
Suggested change:
- self.cache.put_in_cache(variable.name, period, array)
+ self.cache.put(variable.name, period, array)
Connected to #887
PR Content
This PR is WIP, as it only defines some method contracts rather than implementing them. Only one method is pseudo-implemented, to make the separation of tasks more explicit.
Some unit tests are also added, to make the behavior of the introduced methods explicit.
The intent is to reach a decision on the big picture before getting bogged down in implementation details.
Approach
The suggestion here is that:
- Caching calculated variables would be moved to a new class, `Cache`
- The rest of the logic would be moved to other existing classes
`simulation.set_input`
The user-facing method to set an input for a variable at a given time
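Pieced together from the diff excerpts in this conversation, the flow of `simulation.set_input` might look like the sketch below. It is a simplified illustration, not the PR's code: `Variable` and `Simulation` are hypothetical stubs, periods are plain strings (the `periods.period(period)` normalization is left out), a dictionary stands in for `Cache`, and the error message wording is invented.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


class Variable:
    """Hypothetical stub exposing only what set_input needs."""

    def __init__(self, name: str, active: bool = True,
                 set_input: Optional[Callable] = None) -> None:
        self.name = name
        self.active = active
        self.set_input = set_input  # optional hook taking over the storage step

    def is_inactive(self, period: str) -> bool:
        return not self.active

    def cast_to_array(self, value: Sequence[float]) -> List[float]:
        return [float(item) for item in value]

    def check_input_period(self, period: str) -> None:
        pass  # period validation is elided in this sketch


class Simulation:
    def __init__(self, variables: Dict[str, Variable], population_count: int) -> None:
        self.variables = variables
        self.population_count = population_count
        # A plain dict stands in for the proposed Cache class.
        self.cache: Dict[Tuple[str, str], List[float]] = {}

    def set_input(self, variable_name: str, period: str, value: Sequence[float]):
        variable = self.variables[variable_name]
        if variable.is_inactive(period):
            return  # nothing is stored for an inactive variable
        array = variable.cast_to_array(value)
        if len(array) != self.population_count:
            raise ValueError(
                f"Unable to set value for '{variable.name}': got {len(array)} "
                f"values, expected {self.population_count}."
            )
        if variable.set_input:
            return variable.set_input(self, variable, period, array)
        variable.check_input_period(period)
        self.cache[(variable.name, period)] = array


simulation = Simulation({"salary": Variable("salary")}, population_count=2)
simulation.set_input("salary", "2017-01", [1000, 2000])
```

In this shape, the simulation owns validation and dispatch, while the cache only stores and returns arrays.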