Differentiate constructors in the forward pass properly #1577

@PetroZarytskyi

Even though we generate constructor pullbacks now, we still don't generate constructor_reverse_forw, which should be used in the forward pass. To make matters worse, we don't warn the user that they need to provide a custom one. Also, constructors are often implicit (e.g., they arise when passing objects by value), so it may not be at all obvious to the user that they should write a custom derivative. That's exactly what happens in issue #1469.

It's worth mentioning that, when a custom constructor_reverse_forw is not provided, we sometimes use this pattern:

S s {...};
S _d_s(s);
clad::zero_init(_d_s);

However, this approach is only designed to cover iterable containers and purely numerical types like std::complex<double>. If we deal with pointer fields, we just zero-initialize them (with clad::zero). In other words, there's no guarantee that this approach actually gives us a correct answer. Writing an analysis to diagnose this would most likely take more effort than switching to a different system. Also note that this pattern only works when the constructor is used as a VarDecl initializer.
Once we fix the problem generally, we should probably get rid of this hack.
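
To illustrate why the copy-then-zero pattern is fragile for pointer fields, here is a minimal hand-written sketch. It does not use Clad: `Buffer`, `zero_fields` and `make_adjoint_by_copy_and_zero` are hypothetical names, and `zero_fields` only mimics the behavior described above (numeric fields set to zero, pointer fields nulled).

```cpp
#include <cassert>

// Hypothetical user type with a pointer field; its implicit copy
// constructor performs a shallow copy.
struct Buffer {
  double* data;
  double scale;
};

// Hypothetical stand-in for the zeroing step described above:
// numeric fields become 0, pointer fields become null.
inline void zero_fields(Buffer& b) {
  b.data = nullptr;
  b.scale = 0.0;
}

// Mimics the pattern `S _d_s(s); zero_init(_d_s);` for Buffer.
inline Buffer make_adjoint_by_copy_and_zero(const Buffer& primal) {
  Buffer d(primal);  // shallow copy: d.data aliases primal.data
  zero_fields(d);    // the pointer is nulled, so no adjoint storage exists
  return d;
}
```

In this sketch the adjoint object ends up with no storage behind `data`, so any later code that accumulates into `_d_s.data[i]` would dereference a null pointer.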

We have the following solutions to this problem:

  1. Automatic constructor_reverse_forw: would take more effort to implement, but would give us a very general solution. It would look somewhat like this:
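class S {
  double m_x, m_y;
public:
  S(double x, double y) {
    m_x = x;
    m_y = y;
  }
};
// Sketch of the function we would generate for S(double, double).
clad::ValueAndAdjoint<S, S>
S::constructor_reverse_forw(double x, double y, double _d_x, double _d_y) {
  // Raw, uninitialized storage: no constructor of S runs here.
  S* _this = static_cast<S*>(malloc(sizeof(S)));
  S* _d_this = static_cast<S*>(malloc(sizeof(S)));
  // Replay the constructor body for the primal and the derivative object.
  _this->m_x = x;
  _d_this->m_x = _d_x;
  _this->m_y = y;
  _d_this->m_y = _d_y;
  return {*_this, *_d_this};
}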

The malloc arises because we're trying to generate a function that has traits of constructors (raw uninitialized memory allocation). This is the exact same approach that we use with constructor_pullback.
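
For readers who want to try the idea outside of Clad, the following is a standalone sketch under stated assumptions. `ValueAndAdjointSketch` is a hypothetical stand-in for the pair type, `constructor_reverse_forw_sketch` is a free function written by hand rather than generated, and `S` is declared as a struct so the fields are accessible. Unlike the snippet above, it frees the raw storage after copying the results out.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a (value, adjoint) pair type.
template <typename T, typename U> struct ValueAndAdjointSketch {
  T value;
  U adjoint;
};

struct S {
  double m_x, m_y;
  S(double x, double y) {
    m_x = x;
    m_y = y;
  }
};

// Hand-written analogue of the proposed function: it replays the
// constructor body on raw storage for both the primal and the adjoint.
inline ValueAndAdjointSketch<S, S>
constructor_reverse_forw_sketch(double x, double y, double _d_x, double _d_y) {
  S* _this = static_cast<S*>(std::malloc(sizeof(S)));
  S* _d_this = static_cast<S*>(std::malloc(sizeof(S)));
  if (!_this || !_d_this)
    std::abort();  // allocation failed; nothing sensible to return
  _this->m_x = x;
  _d_this->m_x = _d_x;
  _this->m_y = y;
  _d_this->m_y = _d_y;
  // Copy both objects into the result, then release the raw storage.
  ValueAndAdjointSketch<S, S> result{*_this, *_d_this};
  std::free(_this);
  std::free(_d_this);
  return result;
}
```

The key property is that no constructor of `S` runs on the raw storage; the field assignments are taken from the constructor body and mirrored for the adjoint.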

  2. If reverse_forw is not provided, we use a pair of constructors: one for the primal object and one for its derivative. E.g., with copy constructors:
    <copy_constructor>(s) ->
    <copy_constructor>(s), <copy_constructor>(_d_s)

This would probably cover about 95% of the cases and produce clean generated code for them, but the remaining 5% would fail silently and could only be solved with custom derivatives.
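
As a rough illustration of the constructor-pair idea, here is a hand-written sketch for a type whose copy constructor deep-copies its data. `Samples`, `PrimalAndAdjoint` and `copy_pair` are hypothetical names used only for this example; nothing here is generated by Clad.

```cpp
#include <cassert>
#include <vector>

// Hypothetical value-like type: the implicit copy constructor
// deep-copies the vector.
struct Samples {
  std::vector<double> values;
};

// Hypothetical holder for the two objects produced in the forward pass.
struct PrimalAndAdjoint {
  Samples primal;
  Samples adjoint;
};

// Sketch of handling `Samples t(s);` with a pair of constructor calls:
// the same constructor is applied to the primal and to the adjoint.
inline PrimalAndAdjoint copy_pair(const Samples& s, const Samples& _d_s) {
  return {Samples(s), Samples(_d_s)};
}
```

For a type like this, the two resulting objects own independent storage, so writing to the adjoint copy affects neither the primal nor the original adjoint.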

  3. Produce a warning/error to force the user to write a custom derivative. This is a very awkward solution and can only be used as a temporary measure.
