Documentation overhaul and restructuring #3505

Open
isaacsas opened this issue Mar 26, 2025 · 18 comments

Comments

@isaacsas
Member

Per recent discussions we've been told that Catalyst has been using private kwargs in system constructors in some cases. For example, observed was mentioned (though as I previously mentioned, we use it because of the discussion in #1343 for which I thought we had been ok'ed to pass explicit observed equations -- as common in SBML and other bio file formats).

It would be nice to update the API documentation or doc strings within the systems to make clear which kwargs are API and which are intended to be internal. This really isn't clear to me at this point, which makes it challenging to know which ones we can use from Catalyst.

@ChrisRackauckas ChrisRackauckas changed the title clarify in API documentation which system kwargs are public vs. private Documentation overhaul and restructuring Mar 27, 2025
@ChrisRackauckas
Member

We generally just do not document the kwargs that are internal. For ODESystem, for example, we do not document any of the keyword arguments. See https://docs.sciml.ai/ModelingToolkit/dev/systems/ODESystem/#ModelingToolkit.ODESystem and notice that no observed keyword argument is documented there. But we do document internal fields, which is a bit odd.

I think the bigger thing is that the API parts of the MTK documentation just need a complete overhaul. The issue that we had before was that there was no docstring repetitions allowed. This meant that if you used get_eqs docstring on one page, you couldn't use it in another. This was a weird Documenter limitation that led to things like https://docs.sciml.ai/ModelingToolkit/dev/systems/ODESystem/#Composition-and-Accessor-Functions, rather than actually putting docstrings, because again that was a Documenter limitation and the docs just wouldn't build. This also meant that we couldn't put many constructor docstrings in the docs because they would get repeated in a few odd ways, which means all constructors had to be manually written.

Now you can set a canonical location, so in theory all of those weird little "use these functions" things should be changed to the docstrings. And in the docstrings we then fully document the public API. But it would probably be nice to extend the SciMLStyle to have some documentation style for how to handle private API, since in MTK some of this is used to signal to other authors since the compiler can get complicated.

So this is going to be a significant effort to get us onto the right track. I want to pull @asinghvi17 into some of this since he always has good ideas, at least for the design, and it's good to have a newer set of eyes that doesn't know everything already. @vyudu @asinghvi17 @AayushSabharwal let's come up with a plan next week for a documentation structure that will scale better here, and reset the docs to this structure. I think we have a lot of the content already written but I think the structure needs a rethink.

@TorkelE
Member

TorkelE commented Mar 29, 2025

Documentation is always the never ending task, isn't it?

I guess one thing which makes MTK documentation tricky is that, as a package, it essentially serves two roles. It is both a modelling package that users can use to build their models directly, but also an internal representation for packages like Catalyst (and MTK itself).

There is much overlap, but it might make sense to (at some point) have specific documentation aimed towards package developers who might like to use the MTK internal representation (either for building a new modelling package, or for hooking a package that does some form of analysis into the SciML system).

Long-term, I think a goal should be to attract as many non-SciML developers as possible to hook into the SciML intermediate representation; this could potentially amplify MTK's impact manyfold. As part of this it would make sense to point out that this is something the system is designed with in mind, and to offer appropriate support.

(Finally some pure dev docs for MTK itself would also be great, but that is probably something for further into the future)

@Datseris
Contributor

Datseris commented Mar 31, 2025

This was a weird Documenter limitation that led to things like https://docs.sciml.ai/ModelingToolkit/dev/systems/ODESystem/#Composition-and-Accessor-Functions, rather than actually putting docstrings, because again that was a Documenter limitation and the docs just wouldn't build

I just wanna make sure that this is known, that now documenter does allow for docstrings to be repeated: https://documenter.juliadocs.org/stable/man/syntax/#noncanonical-block

@ChrisRackauckas
Member

Yes, the keyword there is "was". We need to redo a lot now that it's fixed.

@AayushSabharwal
Member

AayushSabharwal commented Apr 3, 2025

Maybe I'm biased, but I've always been a fan of Agents.jl's documentation. I'm basing my suggestion below off of it.

  1. Introduction
    • Our current home page works here.
  2. Tutorial
    • A simple straightforward tutorial page which takes a standard example to demonstrate common user workflows. Agents uses the Schelling model to demonstrate building and simulating a model alongside data collection and visualization.
    • This shouldn't go into details, but should cover most of the things that a user might encounter when building a model.
    • At the very least, we need one for a DAE that MTK simulates. It can demonstrate building a model (both normally and with @mtkmodel), simplification, state priority, inspecting the system, building a problem, initialization, solving, solution indexing and plotting. We should also throw callbacks in there somewhere.
    • We could also have one such page for every problem type. ODEProblem, NonlinearProblem (includes least squares and SCC), OptimizationProblem, SDEProblem, DDEProblem, SDDEProblem, BVProblem, DiscreteProblem, ImplicitDiscreteProblem.
    • One for linearization-related stuff too?
  3. Examples
    • A small set of examples that demonstrate more niche functionality not covered in the tutorial
    • Parameter initialization, model transformations (liouville, changing iv, etc.), hierarchical models with connect, callable parameters, FMUs all go here. Basically a lot of what is in the "Basics" section of the current docs.
  4. API
    • More-or-less a dump of all public functions but grouped into neat categories. I'm in favour of putting all our canonical docs in one big page. Being able to Ctrl+F is a blessing. We could, however, have separate pages for model-building-related public API and DSL-building-related public API. The top of the page should also be an index, so it's easy to see all the functions in there and jump to any one.
    • MTK doesn't just have "exported functions", we have interfaces and expectations. Pages in here should document behavior such as initialization semantics, complete, the AbstractSystem interface and keywords for constructor functions to name just a few. Every "system" (referring to a collection of infrastructure that implements a feature) that MTK has and is not fully explainable via docstrings should have a page.
  5. FAQ and Comparison pages
    • We should look into turning some of the FAQ stuff into their own pages, either examples or API interface docs.
  6. Developer docs
    • Internal documentation. No guarantees of public API, just documenting what certain fields mean, internal dataflow, useful internal functions (we end up duplicating stuff sometimes), etc.
    • Every referenced internal function and struct should be put in an @docs block on this page. This makes sure we have internal docstrings and fails our docs build when something is removed, so we know to update the page.
    • This can also include links to issues with topics for discussion. For example, the MTKv10 discussion.
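As a rough sketch of items 4 and 6 above: Documenter.jl provides an `@index` block that generates a linked list of the docstrings on selected pages, and an `@docs` block that splices in the docstrings of the listed names. The page name `api.md` and the binding `ModelingToolkit.some_internal_helper` are hypothetical placeholders, not names taken from the MTK docs; `equations`, `unknowns`, and `parameters` are the accessors discussed in this thread. No comments are placed inside the blocks, since their contents are read as settings or binding names.

At the top of a (hypothetical) `api.md`, an index of everything documented on that page:

```@index
Pages = ["api.md"]
```

Further down the same page, the docstrings themselves:

```@docs
equations
unknowns
parameters
```

On a developer page, a block listing an internal, unexported binding:

```@docs
ModelingToolkit.some_internal_helper
```

If a name listed in an `@docs` block has no docstring, Documenter reports it during the build, which is the failure mode item 6 relies on. Whether that is a hard error or only a warning depends on the `warnonly` setting passed to `makedocs`.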

I'm basically thinking out loud as I write this, so there is definitely scope for improvement. Hopefully it is a decent starting point.

@Datseris
Contributor

Datseris commented Apr 3, 2025

This generalizes well to an extra "integrations" page, which Agents.jl also has. The BifurcationKit.jl and Attractors.jl "Examples" can go there. So can a small example showcasing Catalyst.jl? I don't know how many MTK users automatically know Catalyst as well.

@TorkelE
Member

TorkelE commented Apr 3, 2025

I think one problem is that MTK's docs (which have actually gotten quite good) have been written rather incohesively, at different times in the life of a package that has evolved quite a lot. I think if/when things stabilise it might make sense to sit down and think about how to do it. However, I still think it might be nice to split it into three sections:

  • One for users on how to do various types of modelling,
  • One more advanced on how things work for people who'd like to build things on top of MTK, and
  • One dev doc section for people working with developing MTK.

Then within those one could split things further.

@ChrisRackauckas
Member

That's a very good point. I think @AayushSabharwal 's proposal + the split of the Manual / API pages into those 3 groups is a nice format.

More-or-less a dump of all public functions but grouped into neat categories. I'm in favour of putting all our canonical docs in one big page. Being able to Ctrl+F is a blessing. We could, however, have separate pages for model-building-related public API and DSL-building-related public API. The top of the page should also be an index, so it's easy to see all the functions in there and jump to any one.

Not quite. SciML/SciMLDocs#108 is a good read and has a bit on this. We should have API material included, yes, that is a major issue with the current MTK pages. But, every page with API needs a high level summary that explains what the API pieces are going to mean. Two pages that I would point to are:

Notice that the meat of the information comes from API docs, very complete docstrings, but it's organized with summaries and such to make the page clearly flow. So when I look at for example https://docs.sciml.ai/ModelingToolkit/dev/systems/ODESystem/, the issue here is that it's not a complete API dump.

get_eqs(sys) or equations(sys): The equations that define the ODE.
get_unknowns(sys) or unknowns(sys): The set of unknowns in the ODE.
get_ps(sys) or parameters(sys): The parameters of the ODE.
get_iv(sys): The independent variable of the ODE.
get_u0_p(sys, u0map, parammap) Numeric arrays for the initial condition and parameters given var => value maps.
continuous_events(sys): The set of continuous events in the ODE.
discrete_events(sys): The set of discrete events in the ODE.
alg_equations(sys): The algebraic equations (i.e. that does not contain a differential) that defines the ODE.
get_alg_eqs(sys): The algebraic equations (i.e. that does not contain a differential) that defines the ODE. Only returns equations of the current-level system.
diff_equations(sys): The differential equations (i.e. that contain a differential) that defines the ODE.
get_diff_eqs(sys): The differential equations (i.e. that contain a differential) that defines the ODE. Only returns equations of the current-level system.
has_alg_equations(sys): Returns true if the ODE contains any algebraic equations (i.e. that does not contain a differential).
has_alg_eqs(sys): Returns true if the ODE contains any algebraic equations (i.e. that does not contain a differential). Only considers the current-level system.
has_diff_equations(sys): Returns true if the ODE contains any differential equations (i.e. that does contain a differential).
has_diff_eqs(sys): Returns true if the ODE contains any differential equations (i.e. that does contain a differential). Only considers the current-level system.

This section is just bad 😅 . What we need to do is make all of those just show docstrings that are completely documented according to the style guide (i.e. documenting arguments, etc.) https://github.com/SciML/SciMLStyle?tab=readme-ov-file#documentation .
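For illustration only, here is a minimal sketch of what a "completely documented" accessor docstring could look like. `ToySystem` and `toy_parameters` are hypothetical names invented for this example so that it runs without MTK, and the section headings used are an assumption; the linked style guide is the authority on the exact layout.

```julia
"""
    ToySystem(eqs, ps)

A toy stand-in for a system type, used only to make this example self-contained.
"""
struct ToySystem
    eqs::Vector{String}
    ps::Vector{Symbol}
end

"""
    toy_parameters(sys::ToySystem; sorted = false) -> Vector{Symbol}

Return the parameters of `sys`.

# Arguments

  - `sys`: the system to query.

# Keyword Arguments

  - `sorted`: if `true`, return the names in sorted order. Defaults to `false`.

# Returns

A new vector of parameter names. Mutating it does not change `sys`.
"""
function toy_parameters(sys::ToySystem; sorted = false)
    ps = copy(sys.ps)  # copy so callers cannot mutate the system
    return sorted ? sort!(ps) : ps
end

# Hypothetical usage with made-up equation text and parameter names.
sys = ToySystem(["dx/dt = -k*x + b"], [:k, :b])
toy_parameters(sys; sorted = true)
```

The first sentence after the signature doubles as the one-line summary, so the same docstring can serve both a full API page and a short function index.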

So the issue is that the system pages should be amazing. The ODESystem page should be "show me every function I can call on ODESystem, and organize it into different headers". What has gone wrong with the API pages is that Documenter.jl used to not allow putting the same docstring onto two different pages. This meant that we could not put the docstring of parameters(sys) on both the ODESystem page and the SDESystem page. But with the new canonical location system, that limitation is now lifted. So what we should do is make the System page share all of the functions / actions / etc. you can do on a system, and then just repeat those docstrings in each system type where it's relevant.
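A minimal sketch of how that could look with Documenter's non-canonical `@docs` blocks. Which page holds the canonical copy is an assumption made for the example, not a decision recorded in this thread.

On the shared systems page, the canonical copies (cross references resolve here):

```@docs
equations
unknowns
parameters
```

On the ODESystem page, and likewise on the SDESystem page, the same docstrings repeated without claiming the canonical location:

```@docs; canonical=false
equations
unknowns
parameters
```

With this arrangement, a reference such as ``[`parameters`](@ref)`` should point to the shared page, while readers of a specific system page still see the full docstring in place.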

I think one of the other things that has gone wrong is that https://docs.sciml.ai/ModelingToolkit/dev/basics/AbstractSystem/ is supposed to be a page on "things you can do to all systems", which I don't think comes across well. No one goes to this page. I think any function here should be repeated on the pages for the specific systems; it's okay to duplicate. This page should document the interface and the shared functions, but we shouldn't require someone to understand an AbstractSystem to know that ODESystem can do parameters(sys).

@asinghvi17
Contributor

Speaking generally I would agree with Aayush and Torkel's points: we need to speak different languages for different audiences. MTK can be arcane if not viewed with the proper context, so we need to lead first time users down the garden path to get that context in an engaging way before they lose interest.

When I look at the ODESystem page, one thing I notice is that there are a lot of places where we should insert cross references.

Having docstrings embedded in the page is IMO not immediately a good idea - there is too much for a user to go through. I actually quite like the "invocation and brief description" style of function index that we have there. But we could expand on that in two ways:

  • formalizing the brief description as harvesting the first line of the docstring (this will also help REPL docs)
  • automating it by creating a custom Documenter block that you can just pass a bunch of function signatures to, which will automatically generate the cross references, something like this:
```@functionindex
equations(::ODESystem) || get_eqs(::ODESystem)
get_unknowns(sys) || unknowns(sys)
get_ps(sys) || parameters(sys)
get_iv(sys)
get_u0_p(sys, u0map, parammap) "Numeric arrays for the initial condition and parameters given var => value maps" # override the description
...
```

which would expand to something like

[`get_eqs(sys)`](@ref get_eqs) or [`equations(sys)`](@ref equations): The equations that define the ODE.
get_unknowns(sys) or unknowns(sys): The set of unknowns in the ODE.
get_ps(sys) or parameters(sys): The parameters of the ODE.
get_iv(sys): The independent variable of the ODE.
get_u0_p(sys, u0map, parammap) Numeric arrays for the initial condition and parameters given var => value maps.

(insert all the rest of the refs at your leisure, I didn't want to write all of them out :D)
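To make the "harvest the first line" idea concrete, here is a small sketch in plain Julia. It works on docstring text rather than hooking into Documenter, and both `brief_description` and `index_entry` are hypothetical helper names; a real custom block would need to use Documenter's extension machinery, which is not shown here.

```julia
# Return the first descriptive line of a docstring, skipping blank lines and
# the indented signature block that conventionally opens a Julia docstring.
function brief_description(docstring::AbstractString)
    for line in split(docstring, '\n')
        isempty(strip(line)) && continue      # blank line
        startswith(line, "    ") && continue  # indented signature line
        return String(strip(line))
    end
    return ""  # nothing but signatures and blank lines
end

# Build one index line in the "[`call`](@ref name): description" style.
# `override` mirrors the quoted description in the proposed block syntax.
function index_entry(call, name, docstring; override = nothing)
    desc = override === nothing ? brief_description(docstring) : override
    return "[`$call`](@ref $name): $desc"
end

# Hypothetical docstring text, written out by hand for the example.
doc = """
    get_iv(sys)

The independent variable of the ODE.

Longer explanation that should not appear in the index.
"""

println(index_entry("get_iv(sys)", "get_iv", doc))
```

Keeping the summary on the first line after the signature is what makes this cheap: the index needs no extra metadata, and the same line is what a reader sees first in the REPL help.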

The next stage after that would be to have a short popup along with the docs. But I imagine even this would be a good start.

The next step for that block would be to do something like this, a collapsed block of docs blocks:

Image

but with the "Function" word replaced by the short description of the function. That's probably the best of both worlds: you can view the docstring without a context switch but at the same time have a high level overview.

If there are examples associated with the functions, we could also engineer some way to link those from later on in the page.

@TorkelE
Member

TorkelE commented Apr 4, 2025

I don't think we need to (or should) rush into things, especially while MTK is seeing some amount of flux. But Chris mentioned future workflows in which users might not create specific systems or problems, but just do System(eqs), Problem(sys), and solve(prob). With that in mind, a top-level split like the following might make sense:

  • Model creation, solving, and workflows. Basically for top-level users. Describes various types of models, solving strategies. Also analysis stuff like identifiability analysis, parameter fitting, and bifurcation diagrams could go here. Basically what most users probably would want.
  • More detailed stuff on what is going on. E.g. if old ODESystems still remain, stuff could go here. Mostly for people building packages on top of ModelingToolkit, or people who just need some more in-depth knowledge for carrying out some custom analysis.
  • Dev-docs. Goes through things like internal functions which are required when working with MTK, but not recommended for outside use.

Then, e.g., the first part could have a structure similar to normal package documentation.

Another thing that is a bit confusing is that models can be created using both macro-based and programmatic methods. It feels a bit haphazard which approach is used where, and these could be introduced better.

@Datseris
Contributor

Datseris commented Apr 4, 2025

Another thing that is a bit confusing is that models can be created using both macro-based and programmatic methods. It feels a bit haphazard which approach is used where, and these could be introduced better.

I really second that. Navigating this in the docs was always a struggle for me, both as a user and as a developer. In both roles I always want to rely on the programmatic, and simplest, methods. So for me the DESystem(eqs) was always the way to go. But in the docs this was presented as the "less favoured approach", at least that was my impression. Perhaps it really is worth considering whether both approaches must exist, and whether only having a single approach is possible instead (in which case I argue it should be the simple and programmatic approach)?

@AayushSabharwal
Member

Yeah so @mtkmodel is convenient but also a pain to maintain. The parser is massive, and adding new syntax is difficult. The old approach of creating a function returning an ODESystem is the "assembly language" approach. It can do everything MTK can, but it has pitfalls.
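For readers who have not seen both styles side by side, here is a small sketch of the same one-equation model written each way. It assumes a ModelingToolkit v9-era API (`ODESystem`, `@mtkmodel`, `t_nounits`, `D_nounits`), which may differ in other versions; the model itself and the names `decay` and `Decay` are made up for illustration.

```julia
# Assumes a ModelingToolkit v9-era API; names may differ in other versions.
using ModelingToolkit
using ModelingToolkit: t_nounits as t, D_nounits as D

# Programmatic ("assembly language") style: a function that returns a system.
function decay(; name)
    @parameters k = 1.0
    @variables x(t) = 1.0
    eqs = [D(x) ~ -k * x]
    return ODESystem(eqs, t; name)
end

# Macro style: the same model written with @mtkmodel.
@mtkmodel Decay begin
    @parameters begin
        k = 1.0
    end
    @variables begin
        x(t) = 1.0
    end
    @equations begin
        D(x) ~ -k * x
    end
end

# @named passes the variable name as the `name` keyword in both cases.
@named sys_fn = decay()
@named sys_macro = Decay()

equations(sys_fn), equations(sys_macro)
```

Both calls are meant to produce a system holding the same single equation; the difference is in who does the bookkeeping, the author of the function or the macro's parser.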

@AayushSabharwal
Member

I don't think @mtkmodel is going away any time soon, if only because it would require rewriting the entire standard library

@cstjean
Contributor

cstjean commented Apr 7, 2025

I would also prefer writing my models programmatically with DESystem(eqs), but I went with @mtkmodel because then it's much easier to show the equations to my non-programming colleagues.

@TorkelE
Member

TorkelE commented Apr 7, 2025

I usually use the programmatic approach, but mostly because I haven't bothered learning the macro properly (and last time I read through the MTK docs from start to finish it wasn't that well described, but it might be different now). Long-term, I wouldn't be surprised if it became the standard approach, though.

@Datseris
Contributor

Datseris commented Apr 7, 2025

I would also prefer writing my models programmatically with DESystem(eqs), but I went with @mtkmodel because then it's much easier to show the equations to my non-programming colleagues.

I would think that the best way to show the equations would be with their LaTeX rendered format, in which case the equations vector passed into the DESystem(eqs) already does the trick?
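A short sketch of that idea, assuming Latexify.jl is installed alongside a ModelingToolkit v9-era API and that its `latexify` function accepts a vector of symbolic equations. The single decay equation is a made-up example.

```julia
# Assumes Latexify.jl and a ModelingToolkit v9-era API are available.
using ModelingToolkit, Latexify
using ModelingToolkit: t_nounits as t, D_nounits as D

@parameters k
@variables x(t)

# The same kind of equations vector that would be passed to a system constructor.
eqs = [D(x) ~ -k * x]

# Produce a LaTeX string for the equations; notebooks and docs pages render it.
latexify(eqs)
```

In a rendered environment such as a notebook or a Literate.jl page, the result displays as typeset math, so the equations vector itself can serve as the presentation layer.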

@cstjean
Contributor

cstjean commented Apr 8, 2025

All my components are in one big Literate.jl file, which makes it easy to collaborate since there's only "one language". My non-programming colleagues can even do PRs every once in a while, it's a nice setup.

In any case, @mtkmodel also exists to ease in Modelica users, so I doubt that its existence is up for debate...

ayush2281 added a commit to ayush2281/ModelingToolkit.jl that referenced this issue Apr 8, 2025
…); Added troubleshooting guide for detection process (SciML#515)
@ayush2281

I've submitted a PR that refactors the ODESystem composition and accessor function docs as discussed here: #3505.

8 participants