Description
Even though we now generate constructor pullbacks, we still don't generate constructor_reverse_forw, which should be used in the forward pass. To make matters worse, we don't warn users that they need to provide a custom one. Constructors are also often implicit (e.g., they arise when passing objects by value), so it may not be obvious at all to the user that a custom derivative is needed. That is exactly what happens in issue #1469.
It's worth mentioning that, when a custom constructor_reverse_forw is not provided, we sometimes use this pattern:
S s {...};
S _d_s(s);
clad::zero_init(_d_s);
However, this approach is only designed to cover iterable containers and purely numerical types like std::complex<double>. Pointer fields are simply zero-initialized (with clad::zero), so there is no guarantee that the approach actually gives a correct answer. Writing an analysis to diagnose this would most likely take more effort than switching to a different system. Also note that this pattern only works when the constructor is used as a VarDecl initializer.
Once we fix the problem generally, we should probably get rid of this hack.
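To make the pointer-field concern concrete, here is a minimal sketch that does not use clad at all. `Buffer`, `zero_init_sketch`, and `make_adjoint_sketch` are hypothetical names invented for illustration; `zero_init_sketch` only mimics the behavior described above (numeric fields set to zero, pointer fields nulled) and is not clad's actual implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type with a pointer field (not taken from clad or the issue).
struct Buffer {
  double* data;      // non-owning pointer to external storage
  std::size_t size;
  double scale;
};

// Hypothetical stand-in for the zero-initialization step described above:
// numeric fields become 0 and pointer fields become null.
void zero_init_sketch(Buffer& b) {
  b.data = nullptr;
  b.size = 0;
  b.scale = 0.0;
}

// Mirrors the "S _d_s(s); zero_init(_d_s);" pattern for this type.
Buffer make_adjoint_sketch(const Buffer& s) {
  Buffer d_s(s);          // copy-construct the adjoint from the primal
  zero_init_sketch(d_s);  // then zero it out
  return d_s;
}
```

Under these assumptions, the adjoint returned by `make_adjoint_sketch` has a null `data` pointer, so there is no storage in which to accumulate derivatives for the pointed-to elements, even though the numeric field `scale` is handled as expected.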
We have the following solutions to this problem:
- Automatic constructor_reverse_forw: would take more effort to implement, but would give us a very general solution. It would look somewhat like this:
class S {
public:
  double m_x, m_y;
  S(double x, double y) {
    m_x = x;
    m_y = y;
  }
};
clad::ValueAndAdjoint<S, S>
S::constructor_reverse_forw(double x, double y, double _d_x, double _d_y) {
  // Raw, uninitialized storage for the object and its adjoint.
  S* _this = (S*)malloc(sizeof(S));
  S* _d_this = (S*)malloc(sizeof(S));
  _this->m_x = x;
  _d_this->m_x = _d_x;
  _this->m_y = y;
  _d_this->m_y = _d_y;
  return {*_this, *_d_this};
}
The malloc arises because we're trying to generate a function that has traits of constructors (raw uninitialized memory allocation). This is the exact same approach that we use with constructor_pullback.
- If a custom reverse_forw is not provided, we emit a pair of constructor calls, one for the object and one for its adjoint. E.g., with copy constructors:
<copy_constructor>(s) ->
<copy_constructor>(s), <copy_constructor>(_d_s)
This would probably cover 95% of the cases and make them look very good, but the remaining 5% would fail silently and could only be solved with custom derivatives.
- Produce a warning/error to force the user to write a custom derivative. This is a very awkward solution and can only be used as a temporary measure.
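As a rough illustration of the second option (the pair of constructors), the sketch below runs the copy constructor once on the primal object and once on its adjoint. `ValueAndAdjointSketch` is a hypothetical stand-in for clad::ValueAndAdjoint so that the example compiles without clad, and `copy_construct_pair` is an invented helper name; neither is part of clad's API. The sketch shows only the well-behaved case; the issue does not enumerate which constructors fall into the failing remainder.

```cpp
#include <cassert>

// Hypothetical stand-in for clad::ValueAndAdjoint (assumption, defined here
// only to keep the sketch self-contained).
template <typename T, typename U>
struct ValueAndAdjointSketch {
  T value;
  U adjoint;
};

struct S {
  double m_x, m_y;
  S(double x, double y) : m_x(x), m_y(y) {}
};

// Hypothetical fallback: apply the same (copy) constructor to the primal
// object and to its adjoint, and return both results together.
template <typename T>
ValueAndAdjointSketch<T, T> copy_construct_pair(const T& s, const T& d_s) {
  return {T(s), T(d_s)};
}
```

For a purely numerical type like `S`, calling `copy_construct_pair(s, _d_s)` yields a copy of `s` as the value and a copy of `_d_s` as the adjoint, which is the `<copy_constructor>(s), <copy_constructor>(_d_s)` shape described in the second option.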