
Group YAML tests in a single OpenFisca run to speed up testing time #616


Closed
guillett opened this issue Jan 31, 2018 · 15 comments
Labels: kind:perf (A code change that improves performance), scope:developer-experience

Comments

@guillett
Member

guillett commented Jan 31, 2018

Relates to openfisca/openfisca-france#926

Hi,

Thanks a lot for that piece of software.

Currently, tests on a large country package are relatively slow.
I have a gut feeling that YAML tests could be merged into a single OpenFisca run.

I haven't created a proof of concept, but it would be interesting to give it a try.
In my experience, I rarely run tests on OpenFisca-France because they are too slow.

This issue is related to #535.

I identify more as a:
Developer (I create tools that use the existing OpenFisca code).

@benjello
Member

benjello commented Feb 2, 2018

For survey data we essentially use SurveyScenarios.

We should be able to find a way of:

  • merging all the tests into one survey-like dataframe
  • identifying every test by the ids of the individuals
  • computing all the needed variables at once
  • comparing all the results to detect the errors
  • notifying the erroneous tests
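The steps above could be sketched in plain Python. All names here (`run_merged`, the test dict shape) are hypothetical, not the OpenFisca or pandas API:

```python
# Sketch of the merging strategy above, with hypothetical names:
# each test contributes one "individual" to a single batched run.

def run_merged(tests, formula):
    """tests: list of dicts with 'name', 'input', 'expected'."""
    # 1. Merge all the tests into one survey-like batch; each test is
    #    identified by the id (index) of its individual.
    inputs = [t["input"] for t in tests]

    # 2. Compute all the needed variables at once
    #    (stand-in for one vectorised computation).
    results = [formula(x) for x in inputs]

    # 3. Compare all the results and collect the erroneous tests.
    errors = []
    for test, result in zip(tests, results):
        if result != test["expected"]:
            errors.append((test["name"], test["expected"], result))
    return errors

tests = [
    {"name": "double of 2", "input": 2, "expected": 4},
    {"name": "double of 3", "input": 3, "expected": 7},  # wrong on purpose
]
errors = run_merged(tests, lambda x: 2 * x)
```

A single pass over the batch is enough to point each failure back to its test name.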

@bonjourmauko
Member

Hi @benjello, when you talk about a DataFrame, do you mean pandas?

@benjello
Member

benjello commented Jul 9, 2018

Yep @maukoquiroga

@bonjourmauko
Member

So, the idea would then be to sort of parametrise or vectorise test execution?

I know we can't do that with numpy without a big refactoring, as there's a lot of mixed functional Python code and numpy computations all over the place (type checks, value checks, loops, etc.).

Alternatively, if pandas is more flexible than numpy when it comes to types (basically, if we can just add functions to a DataFrame and let it deal with execution), one option could be to try and adapt https://nose2.readthedocs.io/en/latest/params.html to inject a DataFrame with the scenarios and the expected results?

@benjello
Member

benjello commented Jul 9, 2018

pandas is built on top of numpy.
You can still run a simulation with many tests without using pandas. My point was survey-like.
pandas can be handy to ease manipulation, but it may come with some performance overhead.

@fpagnoux
Member

I don't think there is anything about numpy that prevents us from doing that.

We just need to be able to make a single simulation from 10/100/all tests, run it, and compare the results with the expected outcomes.
The "only" tricky parts are:

  • Generate 1 simulation out of n YAML tests
  • Generate a relevant test report out of this simulation
    • When failing, the error message should indicate which test in which file failed
    • Ideally, we want to keep displaying one . per test
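The report requirements above could be sketched as follows. The `cases` structure and `report` helper are hypothetical, not the actual OpenFisca test runner:

```python
# Hypothetical sketch of the test report generation: given one batched
# result vector covering n YAML tests, emit one '.' per passing test
# and point each failure back to the file and test it came from.

def report(cases, actual):
    """cases: list of (file, name, expected); actual: batched results."""
    dots, failures = [], []
    for (file_, name, expected), value in zip(cases, actual):
        if value == expected:
            dots.append(".")
        else:
            dots.append("F")
            failures.append(
                f"{file_}: '{name}' expected {expected}, got {value}")
    return "".join(dots), failures

progress, failures = report(
    [("irpp.yaml", "basic rate", 100), ("rsa.yaml", "single adult", 545)],
    [100, 540],
)
```

The per-test dots are preserved even though only one simulation ran, and each failure message names the file and test.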

@fpagnoux
Member

I created a branch, vectorized-tests, with a first refactor that enables inserting an extra vectorization step 🙂

@fpagnoux
Member

fpagnoux commented Jul 18, 2018

Hmm, thinking more about it, I think there are actually some hard limitations to running several tests in a single run.

Let's consider a tax and benefit system with 3 variables A, B and C, with the following dependencies:
A --> B --> C
(A depends on B, and B depends on C.)

Let's consider the following tests:

- name: "I - Check A formula"
  input_variables:
    B:
      2018-07: 10
  output_variables:
    A:
      2018-07: 20

- name: "II - Check B formula"
  input_variables:
    C:
      2018-07: 10
  output_variables:
    B:
      2018-07: 20

- name: "III - End to end test"
  input_variables:
    C:
      2018-07: 10
  output_variables:
    A:
      2018-07: 40

First, let's try to merge tests I and II. The input variables of the resulting single simulation would be B and C. But B is also an output of test II!
As B's value will be set as an input, its formula won't be used, and test II won't run properly.

Conclusion: If a variable is both an input of a test and an output of another test, then these tests can't run at the same time.

More subtle, let's consider merging I and III. They don't fall in the previous category. But we would run into the same issue anyway: B is an input variable of the resulting single simulation, so its formula won't be used. But that will prevent test III from running properly, as A depends on B.

Conclusion: If a variable B is an intermediate dependency of a variable A, then we can't merge a test where B is an input and A an output.

By "intermediate dependency" I mean a dependency that has a formula.

To be more precise, we need to take periods into account, and rephrase our conclusions:

  • If a variable-period couple is both an input of a test and an output of another test, then these tests can't run at the same time.
  • If a variable-period couple (B, p1) is an intermediate dependency of a variable-period couple (A, p2), then we can't merge a test where (B, p1) is an input and (A, p2) an output.

I may be missing something, but I'd say that reciprocally, if we don't fall in these two categories, our tests can be safely merged.
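One way to make the two conclusions concrete is to note that both reduce to a single shadowing check: no input of the merged simulation may be a couple whose formula some test needs to run. A sketch with hypothetical data structures, not OpenFisca code:

```python
# GRAPH maps each (variable, period) couple to its direct
# dependencies; the example system is A --> B --> C.
GRAPH = {("A", "2018-07"): [("B", "2018-07")],
         ("B", "2018-07"): [("C", "2018-07")]}

def formula_couples(test):
    """Couples whose formula must actually run for this test: its
    outputs plus their transitive dependencies, cut off at the
    test's own inputs."""
    needed, stack = set(), list(test["outputs"])
    while stack:
        couple = stack.pop()
        if couple in test["inputs"] or couple in needed:
            continue
        needed.add(couple)
        stack.extend(GRAPH.get(couple, []))
    return needed

def can_merge(tests):
    """No input of the merged simulation may shadow a couple whose
    formula some test in the group needs to run."""
    merged_inputs = set().union(*(t["inputs"] for t in tests))
    return all(not (merged_inputs & formula_couples(t)) for t in tests)

test_I = {"inputs": {("B", "2018-07")}, "outputs": {("A", "2018-07")}}
test_II = {"inputs": {("C", "2018-07")}, "outputs": {("B", "2018-07")}}
test_III = {"inputs": {("C", "2018-07")}, "outputs": {("A", "2018-07")}}
```

With this check, merging I with II is rejected (rule 1: B is both an input and an output), merging I with III is rejected (rule 2: B shadows an intermediate dependency of A), while II and III can safely share a run.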

@fpagnoux
Member

Now, having said that, is there still a chance to manage to run several tests at the same time?

It seems hard.

One way would be to have the test runner be very smart and only group tests that are "mergeable". But the condition is hard to check in practice (we would need to parse the variable-period graph).

Another would be to tweak the core of OpenFisca to overcome the issues stated above. The idea would be to enable "partially setting" an input variable. When requesting a partially set variable, the variable would be calculated, and the result merged with the input. That seems theoretically feasible, it would probably fix #564 as well, but that'd be complex and very experimental 🤓.
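The "partially setting" idea could look roughly like this masked merge. This is a sketch with plain numpy and hypothetical names, not an actual OpenFisca mechanism:

```python
import numpy as np

def resolve(partial_input, mask, formula):
    """When a variable is partially set: run the formula for everyone,
    then overwrite with the provided inputs where they exist.
    mask[i] is True where an input value was explicitly set."""
    computed = formula()
    return np.where(mask, partial_input, computed)

# Individuals 0 and 2 have B set by a test; individual 1 gets B's formula.
partial_b = np.array([10.0, 0.0, 30.0])
mask = np.array([True, False, True])
b = resolve(partial_b, mask, lambda: np.array([99.0, 99.0, 99.0]))
```

The cost is that the formula runs for everyone even when only a few individuals need it, which is part of what would make this complex and experimental.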

@MattiSG
Member

MattiSG commented Jul 19, 2018 via email

@fpagnoux
Member

If the initial need is to speed up testing I'm pretty sure focusing efforts on speeding up startup and computation times would yield good results with more positive side effects considering the complexity of the problems that are being thought about here :)

I agree with the conclusion, but the potential performance gain of test vectorization (x5 speed at least) will be hard to get by just "speeding up startup and computation" :)

@bonjourmauko
Member

bonjourmauko commented Jul 19, 2018

If the initial need is to speed up testing I'm pretty sure focusing efforts on speeding up startup

Agree! (there's #535)

and computation times

Also agree, with a disclaimer: I've done some profiling and experimentation to find ways to optimise computation. So far I've concluded that:

  • Core's code is already pretty efficient (if we do not consider JIT optimisation, C pre-compilation, etc.)
  • There are no performance hotspots we can easily deal with
  • There's a lot of work to be done to gain in performance, although none of it will yield a x5 speed-up

In the absence of function pattern matching, the most promising refactoring effort in terms of performance lies, IMHO, in the proper vectorisation of core's code (getting rid of every single if and for loop).

The results of my performance forays can be found in openfisca/openfisca-france#1014 (comment) (maybe I'm missing something evident).

@fpagnoux
Member

In the absence of function pattern matching, the most promising refactoring effort in terms of performance lies, IMHO, in the proper vectorisation of core's code (getting rid of every single if and for loop).

Not sure I understand. @benjello has always made sure no non-vectorial code slipped through 😉. There is probably some at simulation initialization, but not in calculations.

The results of my performance forays can be found here openfisca/openfisca-france#1014 (comment) (maybe I'm missing something evident).

Nice! I didn't see this one. Will try to have a look :).

(if we do not consider JIT optimisation, C pre-compile

Have we tried this?

@bonjourmauko
Member

bonjourmauko commented Jul 20, 2018

Not sure I understand. @benjello has always made sure no non-vectorial code slipped through 😉. There is probably some at simulation initialization, but not in calculations.

Have we tried this?

Yeah, I've tried turning:

# Assumes `import numpy as np`; max_ and min_ are OpenFisca's aliases
# for np.maximum and np.minimum.
def calc(self, base, factor = 1, round_base_decimals = None):
    base1 = np.tile(base, (len(self.thresholds), 1)).T
    if isinstance(factor, (float, int)):
        factor = np.ones(len(base)) * factor
    # np.finfo(np.float).eps is used to avoid np.nan = 0 * np.inf creation
    thresholds1 = np.outer(factor + np.finfo(np.float).eps, np.array(self.thresholds + [np.inf]))
    if round_base_decimals is not None:
        thresholds1 = np.round(thresholds1, round_base_decimals)
    a = max_(min_(base1, thresholds1[:, 1:]) - thresholds1[:, :-1], 0)
    if round_base_decimals is None:
        return np.dot(self.rates, a.T)
    else:
        r = np.tile(self.rates, (len(base), 1))
        b = np.round(a, round_base_decimals)
        return np.round(r * b, round_base_decimals).sum(axis = 1)

into:

@numba.jit(nopython=False)
def _tile(rates, shape):
    return np.tile(rates, shape)


@numba.jit(nopython=True)
def _T(tile):
    return tile.T


@numba.jit(nopython=False)
def _threshold1a(thresholds, inf=[np.inf]):
    return np.array(thresholds + inf)


@numba.jit(nopython=True)
def _threshold1b(factor, thresholds, eps=np.finfo(np.float).eps):
    return np.outer(factor + eps, thresholds)


@numba.jit(nopython=False)
def _round(thresholds, round_base_decimals):
    return np.round(thresholds, round_base_decimals)


@numba.jit(nopython=True)
def _max_min(base, thresholds):
    return np.maximum(np.minimum(base, thresholds[:, 1:]) - thresholds[:, :-1], 0)


@numba.jit(nopython=True)
def _shape(base):
    return (len(base), 1)


@numba.jit(nopython=True)
def _multiply(a, b):
    return a * b


@numba.jit(nopython=True)
def _sum(rounded):
    return rounded.sum(axis=1)


@profile
def calc(self, base, factor = 1, round_base_decimals = None):
    base1 = _T(_tile(base, _shape(self.thresholds)))
    if isinstance(factor, (float, int)):
        factor = np.ones(len(base)) * factor
    # np.finfo(np.float).eps is used to avoid np.nan = 0 * np.inf creation
    thresholds1 = _threshold1b(factor, _threshold1a(self.thresholds))
    if round_base_decimals is not None:
        thresholds1 = _round(thresholds1, round_base_decimals)
    a = _max_min(base1, thresholds1)
    if round_base_decimals is None:
        return np.dot(self.rates, a.T)
    else:
        r = _tile(self.rates, _shape(base))
        b = _round(a, round_base_decimals)
        return _sum(_round(_multiply(r, b), round_base_decimals))

We cannot apply JIT optimisation with numba to the entire calc function without completely vectorising it first; otherwise it just fails.

Note: nopython=False means the code can't currently be optimised.

Today, calc's signature is as follows:

def calc(self,
         base: ndarray,
         factor: Union[int, ndarray, float] = 1,
         round_base_decimals: Optional[int] = None) -> ndarray

To vectorise it, we first need to strongly type the function signature:

def calc(self,
         base: ndarray,
         factor: float = 1.0,
         round_base_decimals: int = 0) -> ndarray

Then, we can rewrite, for example:

if isinstance(factor, (float, int)):
    factor = np.ones(len(base)) * factor

into:

factors = np.ones(base.shape[0]) * factor

Which is C-optimisable.

And so on...
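As a small illustration (plain numpy, not OpenFisca code), the branching version and the vectorised rewrite agree for a scalar factor, thanks to numpy broadcasting:

```python
import numpy as np

base = np.array([100.0, 200.0, 300.0])

# Original, branching version: wrap the scalar factor into an array.
factor = 2
if isinstance(factor, (float, int)):
    factor = np.ones(len(base)) * factor
old = factor

# Vectorised version: one typed code path, no isinstance check.
new = np.ones(base.shape[0]) * 2.0
```

Removing the type check leaves a single code path over ndarrays, which is what makes the function a candidate for numba's nopython mode.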

@Morendil
Contributor

Morendil commented Apr 4, 2019

Closing this, as an exploration that didn't pan out, of a particular solution (grouping several tests into a single vector computation) to an underlying problem (France tests are slow). The problem may still exist, but the discussion on this issue is too anchored on the proposed solution.

@Morendil Morendil closed this as completed Apr 4, 2019
@bonjourmauko bonjourmauko added the kind:perf (A code change that improves performance) label and removed the meta:performance and kind:feat labels Sep 30, 2024