Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array input arguments in nonlinear expressions #2402

Open
odow opened this issue Jan 17, 2024 · 5 comments
Open

Array input arguments in nonlinear expressions #2402

odow opened this issue Jan 17, 2024 · 5 comments
Labels
Project: next-gen nonlinear support Issues relating to nonlinear support Submodule: Nonlinear About the Nonlinear submodule
Milestone

Comments

@odow
Copy link
Member

odow commented Jan 17, 2024

Context: Yesterday I had a chat to @rluce about nonlinear expressions, particularly as they relate to Gurobi's upcoming nonlinear interface. We broadly agree on scalar nonlinear functions, and he had some preliminary ideas for vector-inputs.

cc @ccoffrin and @chriscoey for thoughts.

Our ultimate goal is to support examples like the following:

using JuMP, Ipopt
model = Model(Ipopt.Optimizer)
@variable(model, x[1:2, 1:2])
@objective(model, Max, log(det(X)))
# or perhaps the easier to manage;
@objective(model, Max, log_det(X))
using JuMP, Ipopt
model = Model(Ipopt.Optimizer)
@variable(model, x[1:3] >= 1)
p = 2
@objective(model, Min, norm(x, p))

Another key consumer of this would be https://github.com/jump-dev/MiniZinc.jl.

Changes required in JuMP

  • We'd need to support Array in GenericNonlinearExpr and their mapping to moi_function. This seems pretty easy.

Changes required in MOI

  1. We'd need to support Array in MOI.(Scalar,Vector)NonlinearFunction. This seems pretty easy.
  2. We'd need to support Array in MOI.Nonlinear and be able to compute derivatives, etc.

The tricky part is all in 2.

We'd likely need some sort of Node(NODE_VECTOR, parent, n).

But matrices are a bit more complicated. We'd need to encode (rows, cols). One option would be to store the size as a packed value::Int64. That'd mean that we couldn't have matrices with side dimension greater than typemax(Int32)... but that seems okay.

encode(m::Int64, n::Int64) = encode(Int32(m), Int32(n))
encode(m::Int32, n::Int32) = reinterpret(Int64, (m, n))
decode(x::Int64) = Int64.(reinterpret(Tuple{Int32,Int32}, x))

norm(x) would look something like:

expr = Expression(
    [
        Node(NODE_CALL_MULTIVARIATE, -1, OP_NORM             ),
        Node(NODE_ARRAY,              1, 3 #= encode(3, 0) =#),
        Node(NODE_VARIABLE,           2, 1 #= x1 =#          ),
        Node(NODE_VARIABLE,           2, 2 #= x2 =#          ),
        Node(NODE_VARIABLE,           2, 3 #= x3 =#          ),
    ],
    [],
);

log_det(X) would then look something like:

expr = Expression(
    [
        Node(NODE_CALL_MULTIVARIATE, -1, OP_LOGDET                     ),
        Node(NODE_ARRAY,              1, 8589934594  #= encode(2, 2) =#),
        Node(NODE_VARIABLE,           2, 1 #= x11 =#                   ),
        Node(NODE_VARIABLE,           2, 2 #= x21 =#                   ),
        Node(NODE_VARIABLE,           2, 2 #= x12 =#                   ),
        Node(NODE_VARIABLE,           2, 3 #= x22 =#                   ),
    ],
    [],
);

Once you have the data structure, it seems to me that the AD should follow fairly well.

Output arguments

Still absolutely no idea.

@odow odow added Project: next-gen nonlinear support Issues relating to nonlinear support Submodule: Nonlinear About the Nonlinear submodule labels Jan 17, 2024
@odow
Copy link
Member Author

odow commented Jan 18, 2024

For vector-outputs, we could potentially do something like this:

A = [1 2; 3 4]
x = [MOI.VariableIndex(1), MOI.VariableIndex(2)]

f(x) = sum(A * x)

|  i | node_type    | parent | value  | 
|  1 | SCALAR_CALL  |     -1 | OP_SUM |
|  2 | RESULT_ARRAY |      1 | (2, 0) |
|  3 | RESULT       |      2 | NaN    |
|  4 | RESULT       |      2 | NaN    |
|  5 | VECTOR_CALL  |      2 | OP_MUL |
|  6 | ARRAY        |      5 | (2, 2) |
|  7 | CONSTANT     |      6 | 1      |
|  8 | CONSTANT     |      6 | 2      |
|  9 | CONSTANT     |      6 | 3      |
| 10 | CONSTANT     |      6 | 4      |
| 11 | ARRAY        |      5 | (2, 0) |
| 12 | VARIABLE     |     11 | 1      |
| 13 | VARIABLE     |     11 | 2      |

[1.0, 3.0, 2.0, 4.0]

@rluce, we could have dropped the .values list if you re-interpret the .value field of a NODE_VALUE to be float64 instead of an int64 index into the array of float64...

That'd simplify things to:

|  i | node_type    | parent | value  | 
|  1 | CALL         |     -1 | OP_SUM |
|  2 | RESULT_ARRAY |      1 | (2, 0) |
|  3 | RESULT       |      2 | NaN    |
|  4 | RESULT       |      2 | NaN    |
|  5 | CALL         |      2 | OP_MUL |
|  6 | ARRAY        |      5 | (2, 2) |
|  7 | CONSTANT     |      6 | 1.0    |
|  8 | CONSTANT     |      6 | 2.0    |
|  9 | CONSTANT     |      6 | 3.0    |
| 10 | CONSTANT     |      6 | 4.0    |
| 11 | ARRAY        |      5 | (2, 0) |
| 12 | VARIABLE     |     11 | 1      |
| 13 | VARIABLE     |     11 | 2      |

@mlubin
Copy link
Member

mlubin commented Jan 18, 2024

Gurobi will support array-valued nonlinear expressions and also needs JuMP to compute derivatives? Not sure I understand what the interface is.

@odow
Copy link
Member Author

odow commented Jan 18, 2024

Gurobi will support array-valued nonlinear expressions and also needs JuMP to compute derivatives?

No. They're going to do their own thing, and won't support arrays to start with.

But their interface is going to look very similar to the MOI data structure so we were just riffing on some ideas.

It'll look something like

GRBaddnonlinearexpression(
    grb,
    result_variable::Int,
    n_nodes::Int,
    n_data::Int,
    node_types::Vector{Int},
    node_parents::Vector{Int},
    node_values::Vector{Int},
    data::Vector{Float64},
)

@rluce
Copy link

rluce commented Jan 18, 2024

Correct, no vector in- our output planned for Gurobi, but the concept is very tempting 😄 .

But matrices are a bit more complicated. We'd need to encode (rows, cols). One option would be to store the size as a packed value::Int64

Another option could be to put dimensionality information in the value array, viz.

A = [1 2; 3 4]
x = [MOI.VariableIndex(1), MOI.VariableIndex(2)]

f(x) = sum(A * x)

|  i | node_type    | parent | value  | 
|  1 | SCALAR_CALL  |     -1 | OP_SUM |
|  2 | RESULT_ARRAY |      1 | 9      |
|  3 | RESULT       |      2 | NaN    |
|  4 | RESULT       |      2 | NaN    |
|  5 | VECTOR_CALL  |      2 | OP_MUL |
|  6 | ARRAY        |      5 | 7      |
|  7 | CONSTANT     |      6 | 1      |
|  8 | CONSTANT     |      6 | 2      |
|  9 | CONSTANT     |      6 | 3      |
| 10 | CONSTANT     |      6 | 4      |
| 11 | ARRAY        |      5 | 5      |
| 12 | VARIABLE     |     11 | 1      |
| 13 | VARIABLE     |     11 | 2      |

[1.0, 3.0, 2.0, 4.0, 2.0, 0.0, 2.0, 2.0, 2.0, 0.0]

with the understanding that an ARRAY node will always read 2 consecutive values from the value array. Not pretty, but maybe more consistent with the idea that value is the place to store static data.

@odow
Copy link
Member Author

odow commented Jan 18, 2024

I guess there are multiple options for the size. It probably doesn't really matter which one you go with, so long as it is consistent.

The other option, instead of directly coding RESULT values, is to have a OP_GET_INDEX node which can lookup into the array.

|  i | OP_GET_INDEX       |      2 | 1  |

The benefit of this is that you don't need all the RESULT rows. but the downside is that on the reverse pass you'd need to allocate storage for the array ops, where as the RESULT rows could be filled in-place like normal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Project: next-gen nonlinear support Issues relating to nonlinear support Submodule: Nonlinear About the Nonlinear submodule
Development

No branches or pull requests

3 participants