
Conversation

@jessegrabowski
Member

@jessegrabowski jessegrabowski commented Jan 1, 2026

Description

Allows optimize.minimize and optimize.root to be called with multiple inputs of arbitrary shapes, as in:

    import pytensor.tensor as pt
    from pytensor.tensor.optimize import minimize

    x0, x1, x2 = pt.dvectors("x0", "x1", "x2")
    x3 = pt.dmatrix("x3")
    b0, b1, b2 = pt.dscalars("b0", "b1", "b2")
    b3 = pt.dvector("b3")

    y = pt.dvector("y")

    y_hat = x0 * b0 + x1 * b1 + x2 * b2 + x3 @ b3
    objective = ((y - y_hat) ** 2).sum()

    minimized_x, success = minimize(
        objective,
        [b0, b1, b2, b3],
        jac=True,
        hess=True,
        method="Newton-CG",
        use_vectorized_jac=True,
    )

Internally, pack and unpack are used to convert the problem to 1d and return results in the same shape as the inputs.
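For intuition, here is a minimal NumPy sketch of the packing idea (illustrative only -- this is not the actual pack/unpack implementation, and the function names are placeholders):

    import numpy as np

    def pack(arrays):
        # flatten every input and concatenate into one 1d vector,
        # remembering the original shapes for the way back
        shapes = [a.shape for a in arrays]
        flat = np.concatenate([a.ravel() for a in arrays])
        return flat, shapes

    def unpack(flat, shapes):
        # split the flat vector and restore each original shape
        sizes = [int(np.prod(s)) for s in shapes]
        pieces = np.split(flat, np.cumsum(sizes)[:-1])
        return [p.reshape(s) for p, s in zip(pieces, shapes)]

    b0, b3 = np.array(0.5), np.zeros(3)
    flat, shapes = pack([b0, b3])          # single (4,) vector handed to scipy
    b0_out, b3_out = unpack(flat, shapes)  # results restored to the original shapes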

pack and unpack are also used in the gradients, to simplify the construction of the jacobian with respect to the arguments of the optimization function. We should consider simply using pack/unpack in the jacobian function itself, and adding an option to get back either the unpacked form (what we currently return -- the columns of the jacobian matrix) or the packed form (a single matrix).
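As a toy NumPy sketch of the relationship between the two forms (the shapes below are illustrative only, not taken from this PR):

    import numpy as np

    m, sizes = 5, [1, 1, 1, 3]            # objective/residual dimension and per-input sizes
    J_packed = np.zeros((m, sum(sizes)))  # packed form: one matrix wrt the flat vector

    # unpacked form: one column block per original input
    J_blocks = np.hsplit(J_packed, np.cumsum(sizes)[:-1])
    assert [b.shape for b in J_blocks] == [(5, 1), (5, 1), (5, 1), (5, 3)]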

Tests are failing because of a bug in scan; I'm going to have to beg @ricardoV94 to help me understand how to fix it.

This PR also adds L_op implementations for SplitDims and JoinDims. I found this easier than constantly rewriting the graph to try to remove these ops. Their pullbacks are also SplitDims and JoinDims, so the gradients will eventually be rewritten into Reshape as well; I don't see any harm in that.
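Conceptually, the pullback of a flatten-like reshape is just the reverse reshape; a minimal NumPy sketch of that idea (the actual L_op implementations operate symbolically on JoinDims/SplitDims):

    import numpy as np

    x = np.random.normal(size=(2, 3, 4))
    y = x.reshape(2, 12)          # JoinDims-like: merge the last two axes

    # pullback: a cotangent on y maps back to x's shape via the reverse reshape
    g_y = np.ones_like(y)
    g_x = g_y.reshape(x.shape)    # SplitDims-like: split the merged axis again
    assert g_x.shape == x.shape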

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

@jessegrabowski added the labels bug (Something isn't working), enhancement (New feature or request), feature request, and SciPy compatibility on Jan 1, 2026
@jessegrabowski
Member Author

The scan bug happens here, because shape information is being destroyed somewhere. If I comment out that check, all tests pass.

@jessegrabowski force-pushed the optimize-use-pack branch 2 times, most recently from 2fd5ae0 to be39ef6 on January 1, 2026 04:12
@jessegrabowski
Member Author

jessegrabowski commented Jan 1, 2026

Regarding the notebook error reported in #1586, the notebook runs now, but with rewrite warnings. The specific rewrite has to do with squeeze, but it is arising because of the use of vectorize_graph on the gradients of root.

We potentially ought to rewrite to scalar_minimize in that case, but scipy.minimize handles this case gracefully and we should too.


Copilot AI left a comment


Pull request overview

This PR enhances optimize.minimize and optimize.root to accept multiple input variables of arbitrary shapes, addressing issues #1550, #1465, and #1586. The implementation uses pack and unpack operations to handle multiple variables by flattening them into a single vector for scipy optimization, then reshaping results back to their original forms.

Key Changes

  • Added pack/unpack support to minimize and root for handling multiple variables of different shapes
  • Implemented L_op (gradient) methods for JoinDims and SplitDims ops to support autodiff through pack/unpack operations
  • Refactored implict_optimization_grads to use packed variables internally, simplifying jacobian construction

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

File Description

  • pytensor/tensor/optimize.py: Added _maybe_pack_input_variables_and_rewrite_objective function; updated minimize and root signatures to accept sequences; refactored gradient computation to use packed variables
  • pytensor/tensor/reshape.py: Added L_op implementations for JoinDims and SplitDims; added connection_pattern to SplitDims; improved unpack to handle single-element lists without splitting
  • tests/tensor/test_optimize.py: Added comprehensive tests for multiple minimands, MVN logp gradient regression, multiple root inputs, and vectorized root gradients
  • tests/tensor/test_reshape.py: Added gradient verification tests for join_dims and split_dims; changed rewrite pass from "specialize" to "canonicalize"

@jessegrabowski force-pushed the optimize-use-pack branch 2 times, most recently from 66cf591 to 3e8a3f3 on January 1, 2026 21:07
@ricardoV94
Member

ricardoV94 commented Jan 2, 2026

I'm not sure about moving the complexity of handling multiple inputs into our op helpers. Is it that hard to ask users to use pack/unpack themselves? This also makes the PR harder to review, since you're doing a new feature and a bugfix together, and the changes are by no means trivial.

In doing so, you're also moving quite far away from the scipy API, so it will be less obvious how these functions work.

@jessegrabowski
Member Author

I want this API as the front end. We're already far from the scipy API -- we don't take jac or hess functions, and we don't take args. The user-defined single-vector case is still valid, so this is a strict upgrade with 100% backwards compatibility.
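For reference, a minimal sketch of the single-vector call that remains supported (assuming the same pt/minimize imports as the example in the description; exact defaults may differ):

    x = pt.dvector("x")
    objective = ((x - 1.0) ** 2).sum()
    x_star, success = minimize(objective, x, method="BFGS")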

I was quite careful to keep the commits split by change; is it difficult to review commit by commit? I am willing to split them into separate PRs if you insist.

@ricardoV94
Member

It's not hard to review commit by commit, but I have less trust that it was the bugfix commit that fixed the bug rather than the API-change logic, for instance.

@jessegrabowski
Member Author

Sure, I can split it out then. I can also address #1466 in a single bugfix PR, then circle back to this.

@jessegrabowski
Member Author

On further consideration, the other two bugs are directly addressed by this PR. I split the gradients out, since those are different.

Both #1550 and #1586 report the same bug: root/minimize currently fail when computing gradients with respect to args with ndim > 2. This PR handles that natively by using pack/unpack. Specifically, this line assumes that the return from jacobian is always < 2d, which isn't the case in general.

An intermediate PR would have to ravel all the args and do book-keeping on their shapes, which is exactly what pack/unpack are for. So I don't see anything to split out, except the L_ops, which I already did.
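A toy NumPy sketch of the shape bookkeeping being described (the exact contraction in the gradient code differs; this only illustrates why higher-dimensional args need to be raveled and then reshaped back):

    import numpy as np

    m = 4
    arg = np.zeros((2, 3, 5))              # an argument with ndim > 2
    J = np.zeros((m, *arg.shape))          # jacobian of m residuals wrt arg has ndim 4

    J2d = J.reshape(m, arg.size)           # ravel the arg dimensions for 2d linear algebra
    g = (np.ones(m) @ J2d).reshape(arg.shape)  # reshape the result back to arg's shape
    assert g.shape == arg.shape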

@jessegrabowski
Member Author

I got rid of the eager graph_rewrite, which fixed the scan bug I was hitting.

As a result, I had to implement some logic to filter out non-differentiable args, which was a bug you had previously hit. The disconnected_inputs='raise' case works now, so I adjusted the test.
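For context, a small sketch of one way to filter disconnected args before differentiating, using pytensor.grad's disconnected_inputs argument (this loop is illustrative, not the code added in this PR):

    import pytensor.tensor as pt
    from pytensor import grad
    from pytensor.gradient import DisconnectedInputError

    x, a, b = pt.dscalars("x", "a", "b")
    objective = (x - a) ** 2               # b never enters the objective

    diff_args = []
    for arg in [a, b]:
        try:
            grad(objective, wrt=arg, disconnected_inputs="raise")
            diff_args.append(arg)          # keep only args the objective depends on
        except DisconnectedInputError:
            pass

    assert diff_args == [a]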

@ricardoV94
Member

Sounds like nice progress. If you're just splitting the lop, don't bother, those are simple enough

@jessegrabowski
Member Author

> Sounds like nice progress. If you're just splitting the lop, don't bother, those are simple enough

Too late. You're already tagged too.

@ricardoV94
Member

> Sounds like nice progress. If you're just splitting the lop, don't bother, those are simple enough
>
> Too late. You're already tagged too.

You better not have used reshape

@ricardoV94
Member

ricardoV94 commented Jan 12, 2026

I was too aggressive in the last commit. Also, one of the new tests fails in numba mode; I have to investigate that.

An inline review comment on the new L_op code:

    outputs[1][0] = np.bool_(res.success)

    def L_op(self, inputs, outputs, output_grads):
        # TODO: Handle disconnected inputs

Suggested change (Member): drop the `# TODO: Handle disconnected inputs` line.

Another inline review comment, on this diff:

    def f(x, a, b):
        objective = (x - a * b) ** 2
    def f(x, a, c):

Member: it's paramount we call it c :D

@ricardoV94
Copy link
Member

ricardoV94 commented Jan 13, 2026

Okay, the numba failures seem to be properly fixed by #1846, which I would merge there and then rebase this one on top (and squash everything that's left here?).

@jessegrabowski do you want to review the final state of this PR?

@jessegrabowski
Copy link
Member Author

I still see the blockwise changes here, is that right?



Development

Successfully merging this pull request may close these issues.

Gradient of MinimizeOp fails with certain parameter shapes
