Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defining a Parameter/Structures Interface #881

Closed
ChrisRackauckas opened this issue Jul 10, 2022 · 12 comments
Closed

Defining a Parameter/Structures Interface #881

ChrisRackauckas opened this issue Jul 10, 2022 · 12 comments

Comments

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Jul 10, 2022

Guiding issues:

Guiding designs:

For a long time we were able to get away with just "let p be any struct", though an edge case came up, and then a second one, and then a third one, etc. until it became clear we need to at least have an interface defined for "what can be done with p". Similarly, we have allowed a loose definition of u0 as "anything that can broadcast effectively, and if required can do linear algebra", but this has become limiting.

The main issue really started cropping up with adjoint sensitivity analysis. In that case, one might want to extend the differential equation to include p, e.g. [u0;p] for InterpolatingAdjoint, which then means that p needs to be able to "act" like a u0. If that's the case, then if we assume u0 needs to be able to broadcast and do linear algebra, suddenly this requirement is also on p. While this works for "most" p's that people use in practice, it does impose a limitation where using say a tuple of arrays for parameters works for "standard" equation solving, but fails with adjoints.

This then requires that we ask the question: what is "the" SciML parameters or structure interface? Currently it has been essentially AbstractArray, since AbstractArray is the only thing that effectively defines linear algebra, with extensions allowed with extra functionality is not used. However, it's probably more than overdue to both tighten and enlarge this interface by directly defining what it's actions should be.

If we do this right, then people should be able to use weird structures everywhere and have things "just work", or tell them which functions to define in order for things to just work. How do we get there?

What should a SciML Parameters interface do?

  1. It needs to be fast and type-stable. If it's not that, people with ignore it.
  2. It needs to be overloadable. If someone's structure doesn't work, they should have a clear way to make it work.
  3. It needs to be able to throw appropriate errors about missing structural information. For example, which function needs to be defined.
  4. It needs to support standard Julia types out of the box.
  5. It needs to have clearly defined behavior for restructuring and destructuring to/from arrays. In particular, restructuring should always use the values from the restructure vector, and not change it like the Optimisers.jl behavior on Complex (Consistency in the type behavior of restructure FluxML/Optimisers.jl#95)
  6. Restructuring/destructuring on the simplest types needs to be a no-op. This is required or else supporting calls to C/Fortran is almost impossible, which is required for many wrapper solver libraries (for example, supporting Sundials) but also even things as pervasive as supporting linear algebra.
  7. It needs to support naming and named indexing. This would give ModelingToolkit a way to interpret code down to its DSL in a way that preserves naming, and generate structures that are more interpretable for users than its flat vectors.
  8. It needs to be able to add tags that some things are "parameters" while other things are not. What things should optimize, what things are constant.
  9. Non-SciMLStructure p's still need to work in the solvers: e.g. only the places that need more control should check isscimlstructure (and if false, throw an error that this is required).

Ideas from Other Libraries / Do we need a separate library for this? Can we use another one?

  • This would be core enough to SciML that we would need to make sure we had admin rights all over it and the ability to hotfix etc. as necessary, so from that fact alone it would need to live in SciML. I think that's a no brainer.
  • There have been a number of factors for why it was not adopted more widely throughout SciML, the main reason being that restructure/destructure was undeniably broken until fairly recently, where it still has some unintuitive behaviors for the purpose of equation solvers (Potential gradient issues with Flux chains when changing parameter type NeuralPDE.jl#533, Consistency in the type behavior of restructure FluxML/Optimisers.jl#95). However, the general idea of Functions.jl is a good starting place, and I think no matter what structure we settle on we should support Functor definitions at least as a subset of the interface. It (and nothing else) check all of the boxes, like having a hierarchical naming system for ModelingToolkit.
  • https://github.com/Metalenz/FlexibleFunctors.jl might be a better starting point than Functors.jl (to the point where starting as a fork may be helpful). Its design is very simple and intuitive, but it's missing a lot, like potentially necesssary mapping operations, hierarchical naming, etc.
  • https://github.com/rafaqz/ModelParameters.jl and https://github.com/jonniedie/ComponentArrays.jl come from a very different angle and support the hierarchical naming, but require very specific type wrappers, meaning they aren't "interfaces" but types. While their types are great (and a current good solution for using DiffEq), if we're going to modify the solvers we probably want to do it around interfaces so that "simple" and intuitive types that users gravitate towards (like arrays of arrays) are supported.

Questions

Do we extend the solvers to allow for a SciML Structure/Parameter for u0 or always flatten?

This is potentially a considerable amount of work, but with a potentially considerable amount of gain. It would be an interesting solution to the adjoint sensitivity problem: if [u0,p] "just works" because arrays of arrays just work because arrays of arrays are supported in the structure interface, then some code becomes easier. But other things, like how to define mass matrices, become 😕. Maybe all matrices need to only be defined on the flattened form, and then LinearSolve.jl can do the restructure/destructure to make it all work nicely?

Supporting this in OrdinaryDiffEq.jl is probably not "too" hard. Every 2 * k1 would need to be like a fmap(*,k1,2). It would probably be best to just define that as an operator 2 ⊗ k1 or something so the code is easier to read, but that would up the barrier to entry. For the broadcast operations, it would just require that we specialize @.. effectively, which would probably be the hardest part (if all(isSciMLStructure,vals) -> special dispatches?). @.. would no longer just be "Fast Broadcast" anymore though, since it would have a different definition/operation on things like arrays of arrays (but then would allow their support without requiring a RecursiveArrayTools.VectorOfArray wrapper, which groups like DynamicalSystems.jl have been asking for years).

For other solvers, we'd have to do destructure/restructure in the wrapper code so it just presents as arrays to the underlying solver. To not have a performance hit, we will need (6) to hold.

I think we should do it, but it will be a challenge.

Do we provide a tables interface for reading/writing to data?

I can see this as very valuable for people working with CSVs and the such. https://github.com/rafaqz/ModelParameters.jl showed it's possible.

@ranocha
Copy link
Member

ranocha commented Jul 10, 2022

I would like to emphasize a requirement (that is somewhat hidden in the list above): If people do not use something like sensitivity analysis, they should not need to care about this and can still use everything they have used before. For example, we pass our semidiscretization as p in Trixi.jl, containing numerical parameters, the mesh, caches for mutating operations etc.

@ChrisRackauckas
Copy link
Member Author

Yes important point, added to the list.

@ArnoStrouwen
Copy link
Member

@ToucheSir
Copy link

Note that FlexibleFunctors is now part of Functors.jl.

See also the monster discussion in JuliaGaussianProcesses/ParameterHandling.jl#43. TL;DR this is fiendishly complex to do in general, but it would be amazing to have more standardization around interfaces.

@ChrisRackauckas
Copy link
Member Author

We don't have to be too general, just have an interface which provides everything SciML needs, and update it as SciML needs more.

@briochemc
Copy link

Suggestion: Add https://github.com/paschermayr/ModelWrappers.jl to the guiding designs?

@ChrisRackauckas
Copy link
Member Author

I talked with @YingboMa and the approach we're going to go with is to prefer canonicalization, i.e. the internals to the solvers won't need any fmap or special broadcast because everything will be represented by simple arrays, and then this would mean that we're restructure/destructure at the boundary to the user (f calls and getproperty overloads) to mask this.

@isaacsas
Copy link
Member

Does that have any potential (performance?) impacts on using parameter objects with mixed types (i.e. including a mix of functions, floats, ints, etc in a type-stable manner)? Or would this only be enforced in places where having a fixed type for parameters is needed anyways for the problem being solved to make sense?

@ChrisRackauckas
Copy link
Member Author

It should be more efficient. We plan to have a non-allocating restructure!/destructure!

@lamorton
Copy link

lamorton commented Aug 31, 2022

Edit: Thought this was under ModelingToolkit. Making a separate issue over there instead.

@vpuri3
Copy link
Member

vpuri3 commented Mar 12, 2023

All,

We are redesigning the operator interface to SciML.ai in https://github.com/SciML/SciMLOperators.jl. This issue has come up in discussions on how to pass in parameters to function objects. The problem here, as I see it, is that there are two kinds of parameters we want to pass to functions/operators:

  1. parameters we want to do automatic-differentiation/ adjoint calculation on
  2. operator specific context or struct objects that contain mesh, numerical parameters, discretization information, etc.

The former can in most cases be cast as an AbstractArray, while the latter is almost always a complex data-structure. Currently SciML only permits passing in one parameter via the p positional argument in f(du, u, p, t). Problems arise when one has to do adjoint calculations with p when it is a complex data-structure. Obviously writing restructure, restructure methods for all the data-types in p can become cumbersome.

SciML/SciMLOperators.jl#143 is solving this problem by allowing users pass the two types of parameters separately. The proposed design would allow you to pass array-like AD parameters in p, and function-dependent data-structures using keyword arguments:

    # Test scalar operator which expects keyword argument to update, modeled in the style of a DiffEq W-operator.
    γ = ScalarOperator(0.0; update_func=(args...; dtgamma) -> dtgamma, accepted_kwarg_fields=(:dtgamma,))

    dtgamma = rand()
    @test γ(u,p,t; dtgamma)  dtgamma * u

@ChrisRackauckas
Copy link
Member Author

https://github.com/SciML/SciMLStructures.jl is the solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants