
Class DenseOutput to get dense output of solution #464


Closed · wants to merge 6 commits

Conversation

@lisawim (Collaborator) commented Jul 28, 2024

As already mentioned in the pySDC v6 discussion, it would be nice to also get the solution at points not computed during the simulation, via interpolation. At least I need that for computing a reference solution of the Van der Pol DAE problem, which has quite a steep slope and requires a very small time step size $\Delta t\leq 10^{-9}$ to resolve it well. However, the computation takes forever.. Thus, using adaptivity to speed up the computation might make sense here!

To realize this, I implemented hooks in LogSolution that log the solution at all nodes together with the nodes themselves, and provided a test for that. After the simulation you obtain

```python
u_dense = get_sorted(stats, type="u_dense", sortby="time")
nodes = get_sorted(stats, type="nodes", sortby="time")
```

and then the class DenseOutput is used:

```python
sol = DenseOutput(nodes, u_dense)
```

Evaluating at the requested times `t_eval` you get

```python
u_eval = [sol(t) for t in t_eval]
```

The class DenseOutput handles evaluating the numerical solution by interpolation at the requested time points. For the interpolation, the qmat package is used.
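To sketch the idea (this is not the PR's exact implementation; it assumes qmat's `LagrangeApproximation` with its `getInterpolationMatrix` method, and the `(time, value)` tuple format returned by `get_sorted`), the evaluation boils down to one Lagrange interpolation per step:

```python
import numpy as np
from qmat.lagrange import LagrangeApproximation

class DenseOutputSketch:
    """Evaluate the collocation solution at arbitrary times by Lagrange interpolation."""

    def __init__(self, nodes, u_dense):
        # nodes/u_dense as returned by get_sorted: one (time, value) pair per step
        self.nodes = [np.asarray(tau) for _, tau in nodes]
        self.u = [np.asarray(u) for _, u in u_dense]

    def __call__(self, t):
        # find the step whose node interval contains t, then interpolate there
        for tau, u in zip(self.nodes, self.u):
            if tau[0] <= t <= tau[-1]:
                P = LagrangeApproximation(points=tau).getInterpolationMatrix([t])  # shape (1, M)
                return (P @ u)[0]
        raise ValueError(f"t={t} is outside the computed time interval")
```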

I was not sure where to store the DenseOutput class in pySDC, so I decided to first put it in the DAE project together with the test.

I assume it is a nice-to-have in pySDC as well, since routines like scipy.integrate.solve_ivp also have that feature. For my issue: since the Van der Pol DAE is a DAE of index $1$, scipy.integrate.solve_ivp can still be used to solve it. However, due to the steep slope the routine fails at some point with the error message Required step size is less than spacing between numbers, so for this very stiff example it no longer seems suitable. Nice to know: I discovered that someone is working on a scipy routine to solve DAEs. I was very happy about that and played around with it. So I put Van der Pol into it and look! I got the same error message.. so sad.. Anyway, using the DenseOutput this should speed things up.
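For reference, the corresponding SciPy feature: passing `dense_output=True` to `solve_ivp` returns a continuous interpolant `sol.sol` that can be evaluated anywhere. A minimal example, using the ODE form of Van der Pol as a stand-in:

```python
import numpy as np
from scipy.integrate import solve_ivp

def vdp(t, y, mu=10.0):
    # Van der Pol oscillator in ODE form
    return [y[1], mu * (1 - y[0] ** 2) * y[1] - y[0]]

sol = solve_ivp(vdp, (0.0, 20.0), [2.0, 0.0], method='Radau', dense_output=True)
u_eval = sol.sol(np.linspace(0.0, 20.0, 1000))  # dense output at arbitrary times
```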

What do you think about that? As usual: Improvements, corrections, comments, etc. are really appreciated! Many thanks in advance! :)

@brownbaerchen (Contributor) left a comment:

Seems a bit more complicated than needed in places. One of my favourite design principles is KISS, which always reminds me that we are all just a bunch of idiots. ;)
To me, KISS applies mainly in refactoring. After I "finish" something to the point that it does what I want, I go back and check how much more it is doing than I need and how much less generic it is than possible. This is best done by unit testing which tests only a single feature rather than the whole thing. This often exposes what input is really needed for a function. If you think it's needed, you can add an integration test afterwards to test that the whole thing is working.
Everybody has their own style of coding. I learnt that writing the unit tests at the same time as I write the functions is what works best for me. As Robert's book recommendation The Pragmatic Programmer points out, the test is the first client of your code. Meaning that as you switch between writing code and test, you switch between being developer and user. Crucially, you should put yourself in the role of a generic user. Don't develop the test for exactly what you need, do it as generically as possible.

```diff
@@ -6,19 +6,25 @@

 class LogSolution(Hooks):
     """
-    Store the solution at the end of each step as "u".
+    Store the solution at the end of each step as "u". It is also possible to get the solution
```
@brownbaerchen:

Please put this in a separate hook. For large simulations, this may be a huge amount of data and possibly a huge amount of communication. This should not be the default behaviour.
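A minimal sketch of such an opt-in hook (the class and stats-type names here are hypothetical; the `add_to_stats` call mirrors the existing LogSolution hook):

```python
class LogSolutionDenseOutput(Hooks):
    """Separate, opt-in hook: log the solution at all collocation nodes."""

    def post_step(self, step, level_number):
        super().post_step(step, level_number)
        L = step.levels[level_number]
        # same node handling as in the PR: drop the left boundary if it is a node
        uTmp = L.u[1:] if L.sweep.coll.left_is_node else L.u
        self.add_to_stats(
            process=step.status.slot,
            time=L.time,
            level=L.level_params.level_index,
            iter=step.status.iter,
            sweep=L.status.sweep,
            type='u_dense',
            value=uTmp,
        )
```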

```python
uTmp = L.u if not L.sweep.coll.left_is_node else L.u[1:]

# Determine the shape of arrays to handle None values
shape = max((me.shape for me in uTmp if me is not None), default=(0,))
```
@brownbaerchen:

This is fishy to me. Why would the entries have different shapes? Why not simply use the shape of L.u[0], which all ranks have?
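I.e., something along the lines of this sketch, where the known shape replaces the `max` scan:

```python
# every rank knows the value at the left interval boundary, so use its shape
shape = L.u[0].shape
uTmp = [me if me is not None else np.zeros(shape) for me in uTmp]
```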

```python
recvBuf = np.empty(size * len(uFlat), dtype='d')

# Use Allgather to collect data from all processes
comm.Allgather(uFlat, recvBuf)
```
@brownbaerchen:

Do we need to gather on all ranks, actually? Why not gather it on rank 0 only?
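A sketch of the rank-0-only variant (using the `comm`, `size`, and `uFlat` from the surrounding code):

```python
# gather on the root only; other ranks pass None as the receive buffer
recvBuf = np.empty((size, *uFlat.shape), dtype=uFlat.dtype) if comm.Get_rank() == 0 else None
comm.Gather(uFlat, recvBuf, root=0)
```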

```python
uFlat = np.concatenate(u)

# Prepare the buffer to receive data from all processes
recvBuf = np.empty(size * len(uFlat), dtype='d')
```
@brownbaerchen:

What about `recvBuf = np.empty_like(uFlat)`? If you specify `'d'` as the datatype, this cannot be used for complex data.

```python
comm.Allgather(uFlat, recvBuf)

# Reshape and combine data from all processes
recvBuf = recvBuf.reshape(size, -1, shape[0])
```
@brownbaerchen:

Do we need to communicate in flattened form? I think mpi4py can handle any numpy array as buffer, no?
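Indeed, mpi4py accepts any contiguous NumPy array as a buffer; allocating the receive buffer from the send buffer's dtype also addresses the complex-data concern above. A sketch:

```python
u_local = np.ascontiguousarray(u)  # e.g. shape (M, N); no manual flattening needed
recvBuf = np.empty((size, *u_local.shape), dtype=u_local.dtype)
comm.Allgather(u_local, recvBuf)   # recvBuf[k] is the (M, N) block from rank k
```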

"""Initialization routine"""

# Extract the stage times and values
self.nodes = [np.array(entry[1]) for entry in nodes]
@brownbaerchen:

Please use numpy arrays here. I guess your 40-day simulation will generate 1e9123832 data points. The difference between numpy arrays and lists will likely be around 14 ages of the universe. (Rough estimate ;))
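I.e. (a sketch, assuming each step has the same number of nodes):

```python
# one (nSteps, M) array instead of a list of per-step arrays
self.nodes = np.array([entry[1] for entry in nodes])
```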

"""

for index, tau in enumerate(self.nodes):
if tau[0] <= t <= tau[-1]:
@brownbaerchen:

You should use masks here for faster computation.
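For example, with the first and last node of each step stored as arrays (call them `tLeft` and `tRight`; names hypothetical), the containing step can be found without a Python loop:

```python
# boolean mask over all steps; argmax picks the first step containing t
mask = (tLeft <= t) & (t <= tRight)
if not mask.any():
    raise ValueError(f"t={t} is outside the computed time interval")
index = np.argmax(mask)
```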

```python
uValuesSub = [subElement[subIndex] for subElement in uValuesComponents]

# Create a LagrangeApproximation object for this dimension
sol = LagrangeApproximation(points=nodes, fValues=uValuesSub)
```
@brownbaerchen:

Do you need this for all components? The interpolation matrix should be the same, no? Why not generate one interpolation matrix and multiply all component values?

@brownbaerchen:

Why do you need to worry about the components anyway? Multiplying the 1xM interpolation matrix with an MxXxYxZ array returns an XxYxZ array, so you can interpolate everything in one operation with no regard as to what kind of mesh this is, no?
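A sketch of that single-matrix variant, contracting only the node axis:

```python
# uNodes has shape (M, ...) with any spatial shape behind the node axis
P = LagrangeApproximation(points=nodes).getInterpolationMatrix([t])  # shape (1, M)
u_t = np.tensordot(P, uNodes, axes=(1, 0))[0]  # result keeps the spatial shape
```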

```python
@pytest.mark.base
@pytest.mark.parametrize("quad_type", ['RADAU-RIGHT', 'LOBATTO'])
@pytest.mark.parametrize("problemType", ['ODE', 'DAE'])
def test_interpolate(quad_type, problemType):
```
@brownbaerchen:

I would prefer you not generate the data using a simulation. Ideally, tests will target as small a unit of code as possible, so the tests should fail only if what they test breaks, not if somebody changed some parameter somewhere else or whatever. Also, it's faster to just generate a dictionary than to run a simulation.
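For instance, the interpolation can be checked against synthetic data it must reproduce exactly, with no simulation involved (a sketch; it assumes DenseOutput accepts lists of `(time, value)` tuples as returned by `get_sorted`, and uses the fact that interpolation through M nodes is exact for polynomials of degree below M):

```python
import numpy as np

def test_interpolate_quadratic():
    poly = np.polynomial.Polynomial([1.0, -2.0, 3.0])  # degree 2, exactly representable
    nodes = [(0.0, np.array([0.1, 0.5, 1.0])), (1.0, np.array([1.1, 1.5, 2.0]))]
    u_dense = [(t, poly(tau)) for t, tau in nodes]

    sol = DenseOutput(nodes, u_dense)
    for t in [0.3, 0.7, 1.2, 1.8]:
        assert np.isclose(sol(t), poly(t))
```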


```python
import numpy as np

nodesRandom = {
```
@brownbaerchen:

Well, it's not really random... This is overselling a bit :D

@tlunet (Member) commented Jul 28, 2024

> Seems a bit more complicated than needed in places. [...] Don't develop the test for exactly what you need, do it as generically as possible.

Totally agree with you ... and I would suggest something similar to your messages @brownbaerchen 😉

@lisawim closed this Jul 28, 2024