Guiding issues:
- Nicer parameter indexing ModelingToolkit.jl#318
- Parameter representation for controls ModelingToolkit.jl#1088
- Indexing and iterating in ODEProblem and modelingtoolkitize / error messages ModelingToolkit.jl#1678
- Potential gradient issues with Flux chains when changing parameter type NeuralPDE.jl#533
- Consistency in the type behavior of restructure FluxML/Optimisers.jl#95
- https://discourse.julialang.org/t/parameter-types-in-diffeqflux-jl-versus-differentialequations-jl/79635
- Extensive and recursive ForwardDiff promotion DiffEqBase.jl#773
Guiding designs:
- https://github.com/FluxML/Functors.jl
- https://github.com/Metalenz/FlexibleFunctors.jl (https://discourse.julialang.org/t/ann-flexiblefunctors-jl-functors-with-per-instance-parameters/64935)
- https://github.com/rafaqz/ModelParameters.jl
- https://github.com/rafaqz/Flatten.jl
- https://github.com/invenia/ParameterHandling.jl
- https://github.com/SciML/RecursiveArrayTools.jl
- https://github.com/jonniedie/ComponentArrays.jl
- https://github.com/paschermayr/ModelWrappers.jl
For a long time we were able to get away with just "let `p` be any struct", though an edge case came up, and then a second one, and then a third one, etc. until it became clear we need to at least have an interface defined for "what can be done with `p`". Similarly, we have allowed a loose definition of `u0` as "anything that can broadcast effectively, and if required can do linear algebra", but this has become limiting.
The main issue really started cropping up with adjoint sensitivity analysis. In that case, one might want to extend the differential equation to include `p`, e.g. `[u0;p]` for `InterpolatingAdjoint`, which then means that `p` needs to be able to "act" like a `u0`. If that's the case, and we assume `u0` needs to be able to broadcast and do linear algebra, then suddenly this requirement is also on `p`. While this works for "most" `p`'s that people use in practice, it does impose a limitation: using, say, a tuple of arrays for parameters works for "standard" equation solving, but fails with adjoints.
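To make the tuple-of-arrays limitation concrete, here is a minimal sketch (plain Julia, not taken from any solver code):

```julia
# Minimal sketch of the problem (no solver involved).
u0 = [1.0, 2.0]
p  = ([0.5, 0.3], [1.0 0.0; 0.0 1.0])      # tuple of arrays as parameters

# Fine for "standard" solving: the RHS only needs to index into p.
f(u, p, t) = p[2] * u .- p[1]

# An InterpolatingAdjoint-style extension wants a single state carrying both
# u0 and p, i.e. something like [u0; p]. We can flatten by hand...
flat = vcat(u0, p[1], vec(p[2]))

# ...but rebuilding `p` from `flat` needs structural knowledge (lengths, shapes,
# container types) that the solver does not have. Defining exactly that
# flatten/rebuild contract is what the interface is for.
```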
This then requires that we ask the question: what is "the" SciML parameters or structure interface? Currently it has essentially been `AbstractArray`, since `AbstractArray` is the only thing that effectively defines linear algebra, with extensions allowed as long as the extra functionality is not used. However, it's probably more than overdue to both tighten and enlarge this interface by directly defining what its actions should be.
If we do this right, then people should be able to use weird structures everywhere and have things "just work", or be told which functions to define in order for things to just work. How do we get there?
What should a SciML Parameters interface do?
- It needs to be fast and type-stable. If it's not, people will ignore it.
- It needs to be overloadable. If someone's structure doesn't work, they should have a clear way to make it work.
- It needs to be able to throw appropriate errors about missing structural information. For example, which function needs to be defined.
- It needs to support standard Julia types out of the box.
- It needs to have clearly defined behavior for restructuring and destructuring to/from arrays. In particular, restructuring should always use the values from the restructure vector, and not change it like the Optimisers.jl behavior on Complex (Consistency in the type behavior of restructure FluxML/Optimisers.jl#95)
- Restructuring/destructuring on the simplest types needs to be a no-op. This is required, or else supporting calls to C/Fortran is almost impossible, which is needed for many wrapper solver libraries (for example, supporting Sundials) but also for things as pervasive as supporting linear algebra. (A minimal sketch of the intended semantics follows this list.)
- It needs to support naming and named indexing. This would give ModelingToolkit a way to interpret code down to its DSL in a way that preserves naming, and generate structures that are more interpretable for users than its flat vectors.
- It needs to be able to add tags indicating that some things are "parameters" while other things are not: what things should be optimized, what things are constant.
- Non-SciMLStructure `p`'s still need to work in the solvers: e.g. only the places that need more control should check `isscimlstructure` (and if false, throw an error that this is required).
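As a concrete reading of points 5 and 6 above, here is a minimal sketch of the intended destructure/restructure semantics. The names `destructure`/`restructure` are hypothetical (this is not Optimisers.jl or any existing API): rebuilding always takes its values from the vector it is handed, and the plain-vector case is a no-op.

```julia
# Hypothetical sketch of the desired destructure/restructure semantics.
# `destructure` returns a flat vector plus a closure that rebuilds the original
# structure from the values of whatever vector it is given.

# Simplest case: a plain numeric vector round-trips as a no-op.
destructure(x::AbstractVector{<:Number}) = (x, identity)

# A tuple of numeric arrays: flatten by concatenation, rebuild by slicing.
# (Not type-stable as written; a real implementation would need to be.)
function destructure(x::Tuple)
    flat = reduce(vcat, vec.(collect(x)))
    shapes = map(size, x)
    function restructure(v)
        out = []
        i = 1
        for s in shapes
            n = prod(s)
            push!(out, reshape(v[i:i+n-1], s))
            i += n
        end
        return Tuple(out)
    end
    return flat, restructure
end

p = ([0.5, 0.3], [1.0 0.0; 0.0 1.0])
flat, re = destructure(p)
re(2 .* flat)   # values come from the new vector: ([1.0, 0.6], [2.0 0.0; 0.0 2.0])
```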
Ideas from Other Libraries / Do we need a separate library for this? Can we use another one?
- This would be core enough to SciML that we would need to make sure we had admin rights all over it and the ability to hotfix etc. as necessary, so from that fact alone it would need to live in SciML. I think that's a no-brainer.
- https://github.com/FluxML/Functors.jl: there have been a number of factors for why it was not adopted more widely throughout SciML, the main reason being that restructure/destructure was undeniably broken until fairly recently, and it still has some unintuitive behaviors for the purpose of equation solvers (Potential gradient issues with Flux chains when changing parameter type NeuralPDE.jl#533, Consistency in the type behavior of restructure FluxML/Optimisers.jl#95). However, the general idea of Functors.jl is a good starting place, and I think no matter what structure we settle on we should support Functor definitions at least as a subset of the interface. It doesn't check all of the boxes (and nothing else does either), like having a hierarchical naming system for ModelingToolkit. (A minimal `fmap` example follows this list.)
- https://github.com/Metalenz/FlexibleFunctors.jl might be a better starting point than Functors.jl (to the point where starting as a fork may be helpful). Its design is very simple and intuitive, but it's missing a lot, like potentially necessary mapping operations, hierarchical naming, etc.
- https://github.com/rafaqz/ModelParameters.jl and https://github.com/jonniedie/ComponentArrays.jl come from a very different angle and support the hierarchical naming, but require very specific type wrappers, meaning they aren't "interfaces" but types. While their types are great (and a current good solution for using DiffEq), if we're going to modify the solvers we probably want to do it around interfaces so that "simple" and intuitive types that users gravitate towards (like arrays of arrays) are supported.
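For reference, the "general idea of Functors.jl" mentioned above is a structure-preserving map over the numeric leaves of a nested object. A minimal example (the nested named tuple is just an illustrative stand-in for a user's parameter struct):

```julia
using Functors

# fmap applies the function at the numeric leaves (arrays and numbers) and
# rebuilds the same nesting around the results.
p = (layers = ([1.0, 2.0], [3.0 4.0; 5.0 6.0]), decay = 0.1)

fmap(x -> 2 .* x, p)
# (layers = ([2.0, 4.0], [6.0 8.0; 10.0 12.0]), decay = 0.2)
```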
Questions
Do we extend the solvers to allow for a SciML Structure/Parameter for `u0`, or always flatten?
This is potentially a considerable amount of work, but with a potentially considerable amount of gain. It would be an interesting solution to the adjoint sensitivity problem: if `[u0,p]` "just works" because arrays of arrays are supported in the structure interface, then some code becomes easier. But other things, like how to define mass matrices, become 😕. Maybe all matrices need to only be defined on the flattened form, and then LinearSolve.jl can do the restructure/destructure to make it all work nicely?
Supporting this in OrdinaryDiffEq.jl is probably not "too" hard. Every `2 * k1` would need to become something like `fmap(*, k1, 2)`. It would probably be best to just define that as an operator, `2 ⊗ k1` or something, so the code is easier to read, but that would raise the barrier to entry. For the broadcast operations, it would just require that we specialize `@..` effectively, which would probably be the hardest part (`if all(isSciMLStructure, vals)` -> special dispatches?). `@..` would no longer just be "Fast Broadcast" anymore, though, since it would have a different definition/operation on things like arrays of arrays (but it would then allow their support without requiring a RecursiveArrayTools.VectorOfArray wrapper, which groups like DynamicalSystems.jl have been requesting for years).
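A minimal sketch of what such an operator could look like (the recursion scheme is hypothetical, not OrdinaryDiffEq code): recurse through containers and bottom out with ordinary `*` on numbers and numeric arrays, so the simple-type case stays exactly the current `2 * k1`.

```julia
# Hypothetical structure-aware scaling operator: recurses through containers
# and bottoms out with ordinary `*` on numbers and numeric arrays.
⊗(a::Number, x::Number) = a * x
⊗(a::Number, x::AbstractArray{<:Number}) = a * x
⊗(a::Number, x::AbstractArray) = map(xi -> a ⊗ xi, x)        # e.g. arrays of arrays
⊗(a::Number, x::Union{Tuple, NamedTuple}) = map(xi -> a ⊗ xi, x)

k1 = ([1.0, 2.0], [3.0 4.0; 5.0 6.0])   # a "structured" stage value
2 ⊗ k1                                   # ([2.0, 4.0], [6.0 8.0; 10.0 12.0])
```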
For other solvers, we'd have to do destructure/restructure in the wrapper code so that it just presents as arrays to the underlying solver. To not have a performance hit, we will need point 6 above (no-op restructure/destructure on the simplest types) to hold.
I think we should do it, but it will be a challenge.
Do we provide a tables interface for reading/writing data?
I can see this as very valuable for people working with CSVs and the like. https://github.com/rafaqz/ModelParameters.jl showed it's possible.