The current implementation is too complicated, with too many parts tightly coupled. Here I will give an overview - for others and for myself - and a proposed solution.
A summary:
- Each backend implements a `Profiler`. `Profiler` extends `Pdata`. `Pdata` calls `parse.py` functions to parse the code.
- The parsing reads the specific format that we created, which is a table, but that is converted to a dictionary.
- The `Pdata` is responsible for creating the cumulative function values.
- Each backend implements the `plot` function.
- The `parser_options` and `profiler_options` are obtained from the CLI arguments.
- A single call is done: `BACKEND.Profiler(parser_options, profiler_options)`.
Issues with this approach:
- Since the tool was built with the CLI in mind, everything is mixed from the start.
- There is no separation of parsing tasks and performance profiling tasks, evidenced by the passing of both arguments to the same class.
- There is also no separation of backend and profiling, since the backend class is the profiler class.
- Accessing perprof as a library is not possible without creating a file (or file-like structure).
Proposal of refactoring:
- Create separate classes for separate functionalities.
  - `SolverData` reads the data of a single solver. It does the parsing and stores the raw values.
  - `ProfileData` computes the cumulative function values from an input of `SolverData`.
  - `ProfileBackend` plots the data.
- Consider using `numpy` arrays and `pandas.DataFrame`s to store these values.
- Make sure that it works with the API.
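
To make the proposed split concrete, here is a minimal sketch of how the three classes could fit together. Only the class names `SolverData`, `ProfileData` and `ProfileBackend` come from the proposal above; the methods `from_rows` and `cumulative`, the `TextBackend` class, and the data layout are hypothetical. It also assumes that the cumulative value at `tau` is the fraction of problems a solver solves within `tau` times the best time, that times are positive, and that a failure is recorded as `math.inf`. Plain dictionaries and lists are used to keep the sketch dependency-free; `numpy` arrays or a `pandas.DataFrame` could replace them.

```python
import math
from dataclasses import dataclass


@dataclass
class SolverData:
    """Raw results of one solver. The field layout is an assumption."""
    name: str
    times: dict  # problem name -> time; math.inf marks a failure

    @classmethod
    def from_rows(cls, name, rows):
        """Build from in-memory (problem, time) pairs, so no file is needed."""
        return cls(name, {problem: float(time) for problem, time in rows})


class ProfileData:
    """Computes performance ratios from several SolverData objects."""

    def __init__(self, *solvers):
        problems = sorted(set().union(*(s.times for s in solvers)))
        self.ratios = {s.name: [] for s in solvers}
        for problem in problems:
            # A solver that did not report a problem counts as a failure.
            times = [s.times.get(problem, math.inf) for s in solvers]
            best = min(times)
            for solver, time in zip(solvers, times):
                # If every solver failed, the ratio is infinite for all.
                ratio = time / best if math.isfinite(best) else math.inf
                self.ratios[solver.name].append(ratio)

    def cumulative(self, name, tau):
        """Fraction of problems solved within tau times the best time."""
        ratios = self.ratios[name]
        return sum(r <= tau for r in ratios) / len(ratios)


class ProfileBackend:
    """Base class: a backend only knows how to render a ProfileData."""

    def plot(self, profile, taus):
        raise NotImplementedError


class TextBackend(ProfileBackend):
    """Hypothetical backend that renders the profile as plain text."""

    def plot(self, profile, taus):
        lines = []
        for name in profile.ratios:
            values = ", ".join(
                f"{profile.cumulative(name, tau):.2f}" for tau in taus
            )
            lines.append(f"{name}: {values}")
        return "\n".join(lines)


# Library-style usage: no CLI arguments and no files involved.
alpha = SolverData.from_rows("alpha", [("p1", 1.0), ("p2", 4.0), ("p3", math.inf)])
beta = SolverData.from_rows("beta", [("p1", 2.0), ("p2", 2.0), ("p3", 3.0)])
profile = ProfileData(alpha, beta)
report = TextBackend().plot(profile, taus=[1.0, 2.0])
print(report)
```

With this layout, parsing options would only reach `SolverData`, profiling options would only reach `ProfileData`, and the CLI could become a thin layer that builds these objects and hands them to a backend.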
Some things to keep in mind:
- Keep the CLI working. Since the system is so tightly coupled, it should be possible to change anything inside `Pdata` without worrying about users losing access to the API (nobody should be using it anyway).
- If only the most basic performance profile can be integrated, it should already be possible to have a better system.
- Since the API changes will be small, we can release it as 1.2.0.
- This should lead to Proposal: Perprof-py revamp #209 eventually.