Skip to content

Commit 024d7f7

Browse files
committed
Implement constrained optimization and improve L-BFGS
This commit introduces support for constrained optimization problems and improves the L-BFGS algorithm. All functions will receive the position to evaluate and return an entire state (x, f(x), grad*, hessian*) instead of treating them separately. This simplifies the interface, while template parameters take care of treating functions according their differentiability. To support constrained problems, a new solver is implemented which can wrap all existing solvers to solver the unconstrained one via penalty method and augmented Lagrangian. It supports now: min f(x) s.t. c1(x) >= 0 c2(x) = 0 Several changes are made to L-BFGS for more robust numerical handling of edge-cases.
1 parent 59975b8 commit 024d7f7

File tree

19 files changed

+748
-237
lines changed

19 files changed

+748
-237
lines changed

BUILD

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,5 +4,6 @@ load("//:generator.bzl", "build_example", "build_test")
44

55

66
build_example("simple")
7+
build_example("constrained_simple")
78
build_test("verify")
89

README.md

Lines changed: 74 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -1,43 +1,75 @@
11
# CppNumericalSolvers (A header-only C++17 optimization library)
22

3-
CppNumericalSolvers stands as a robust and efficient header-only C++17
4-
optimization library, meticulously crafted to address numerical optimization
5-
challenges. This library offers a suite of powerful solvers for optimization
6-
problems, placing a strong emphasis on simplicity, adherence to modern C++
7-
standards, and seamless integration into projects.
3+
CppNumericalSolvers is a header-only C++17 optimization library providing a
4+
suite of solvers for both unconstrained and constrained optimization problems.
5+
The library is designed for efficiency, modern C++ compliance, and easy
6+
integration into existing projects. It is distributed under a permissive
7+
license, making it suitable for commercial use.
88

99
### Example Usage: Minimizing the Rosenbrock Function
1010

11-
Let's delve into a straightforward example that illustrates the ease of
12-
utilizing CppNumericalSolvers for numerical optimization. In this instance, we
13-
will showcase the minimization of the classic Rosenbrock function using the
14-
BFGS solver.
11+
Let minimize of the classic Rosenbrock function using the BFGS solver.
1512

1613
```cpp
17-
using FunctionXd = cppoptlib::function::Function<double>;
14+
/**
15+
* @brief Alias for a 2D function with first-order differentiability in cppoptlib.
16+
*
17+
* This defines a function template that supports differentiation, specialized for
18+
* a 2-dimensional input vector.
19+
*/
20+
using Functiond2_dx = cppoptlib::function::Function<
21+
double, 2, cppoptlib::function::Differentiability::First>;
1822

1923
/**
20-
* @brief Definition of the Rosenbrock function for optimization.
24+
* @brief Implementation of the Rosenbrock function with gradient computation.
25+
*
26+
* This class represents the Rosenbrock function:
27+
*
28+
* f(x) = (1 - x₁)² + 100 * (x₂ - x₁²)²
2129
*
22-
* This class represents the Rosenbrock function, a classic optimization problem.
30+
* It includes both function evaluation and its gradient for optimization algorithms.
31+
* The function has a global minimum at (x₁, x₂) = (1, 1), where f(x) = 0.
32+
*
33+
* @tparam T The scalar type (e.g., double or float).
2334
*/
24-
class Rosenbrock : public FunctionXd {
25-
public:
26-
EIGEN_MAKE_ALIGNED_OPERATOR_NEW
27-
28-
/**
29-
* @brief Computes the value of the Rosenbrock function at a given point.
30-
*
31-
* @param x The input vector.
32-
* @return The value of the Rosenbrock function at the given point.
33-
*/
34-
double operator()(const Eigen::VectorXd &x) const {
35-
const double t1 = (1 - x[0]);
36-
const double t2 = (x[1] - x[0] * x[0]);
37-
return t1 * t1 + 100 * t2 * t2;
38-
}
39-
40-
// Gradient and Hessian implementation can be omitted.
35+
class RosenbrockGradient : public Functiond2_dx {
36+
public:
37+
/// Eigen macro for proper memory alignment.
38+
EIGEN_MAKE_ALIGNED_OPERATOR_NEW
39+
40+
// Import necessary typedefs from the base class.
41+
using typename FunctionX2_dx::state_t;
42+
using typename FunctionX2_dx::scalar_t;
43+
using typename FunctionX2_dx::vector_t;
44+
45+
/**
46+
* @brief Computes the Rosenbrock function value and its gradient at a given point.
47+
*
48+
* @param x A 2D Eigen vector representing the input point.
49+
* @return A state_t object containing the function value, input vector, and gradient.
50+
*/
51+
state_t operator()(const vector_t &x) const override {
52+
state_t state;
53+
54+
// Compute function value: f(x) = (1 - x₁)² + 100 * (x₂ - x₁²)²
55+
const scalar_t t1 = (1 - x[0]); // First term: (1 - x₁)
56+
const scalar_t t2 = (x[1] - x[0] * x[0]); // Second term: (x₂ - x₁²)
57+
58+
state.value = t1 * t1 + 100 * t2 * t2;
59+
state.x = x; // Store the input vector for reference.
60+
61+
// Initialize gradient vector (∇f)
62+
state.gradient = vector_t::Zero(2);
63+
64+
// Compute partial derivatives:
65+
// ∂f/∂x₁ = -2(1 - x₁) + 200(x₂ - x₁²)(-2x₁)
66+
state.gradient[0] = -2 * t1 + 200 * t2 * (-2 * x[0]);
67+
68+
// ∂f/∂x₂ = 200(x₂ - x₁²)
69+
state.gradient[1] = 200 * t2;
70+
71+
return state;
72+
}
4173
};
4274

4375
int main(int argc, char const *argv[]) {
@@ -47,40 +79,35 @@ int main(int argc, char const *argv[]) {
4779
// Initial guess for the solution.
4880
Eigen::VectorXd x(2);
4981
x << -1, 2;
50-
std::cout << "Initial point: " << x << std::endl;
51-
std::cout << "Function value at initial point: " << f(x) << std::endl;
52-
53-
// Check the correctness of the gradient and Hessian (against numerical implementation).
54-
std::cout << "Is Gradient correctly implemented? " << cppoptlib::utils::IsGradientCorrect(f, x) << std::endl;
55-
std::cout << "Is Hessian correctly implemented? " << cppoptlib::utils::IsHessianCorrect(f, x) << std::endl;
82+
std::cout << "Initial point: " << x.transpose() << std::endl;
5683

5784
// Evaluate
58-
auto state = f.Eval(x);
85+
auto state = f(x);
86+
std::cout << "Function value at initial point: " << state.value << std::endl;
5987
std::cout << "Gradient at initial point: " << state.gradient << std::endl;
60-
if (state.hessian) {
61-
std::cout << "Hessian at initial point: " << *(state.hessian) << std::endl;
62-
}
6388

6489
// Minimize the Rosenbrock function using the BFGS solver.
6590
using Solver = cppoptlib::solver::Bfgs<Rosenbrock>;
6691
Solver solver;
67-
auto [solution, solver_state] = solver.Minimize(f, x);
92+
auto [solution, solver_progress] = solver.Minimize(f, x);
6893

6994
// Display the results of the optimization.
7095
std::cout << "Optimal solution: " << solution.x.transpose() << std::endl;
7196
std::cout << "Optimal function value: " << solution.value << std::endl;
72-
std::cout << "Number of iterations: " << solver_state.num_iterations << std::endl;
73-
std::cout << "Solver status: " << solver_state.status << std::endl;
97+
std::cout << "Number of iterations: " << solver_progress.num_iterations << std::endl;
98+
std::cout << "Solver status: " << solver_progress.status << std::endl;
7499

75100
return 0;
76101
}
77102
```
78103
79-
This example demonstrates the usage of the BFGS solver to minimize the
80-
Rosenbrock function. You can easily adapt this code for your specific
81-
optimization problem by defining your objective function and selecting an
82-
appropriate solver from CppNumericalSolvers. Dive into the implementation for
83-
more details on available solvers and advanced usage.
104+
You can easily adapt this code for your specific optimization problem by
105+
defining your objective function and selecting an appropriate solver from
106+
CppNumericalSolvers. Dive into the implementation for more details on available
107+
solvers and advanced usage.
108+
109+
110+
See the examples for a constrained problem.
84111
85112
### References
86113
Lines changed: 208 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,208 @@
1+
// Copyright 2025 Amazon.com, Inc. or its affiliates. All Rights Reserved.
2+
#ifndef INCLUDE_CPPOPTLIB_CONSTRAINED_FUNCTION_H_
3+
#define INCLUDE_CPPOPTLIB_CONSTRAINED_FUNCTION_H_
4+
5+
#include <Eigen/Core>
6+
#include <array>
7+
#include <functional>
8+
#include <string>
9+
#include <utility>
10+
11+
#include "function.h"
12+
namespace cppoptlib::function {
13+
14+
template <class function_t, std::size_t NumConstraints>
15+
struct ConstrainedFunction;
16+
17+
template <class cfunction_t>
18+
struct ConstrainedState : public State<cfunction_t> {
19+
static_assert(cfunction_t::NumConstraints > -1,
20+
"need a constrained function");
21+
22+
using state_t = ConstrainedState<cfunction_t>;
23+
24+
typename cfunction_t::scalar_t value = 0;
25+
typename cfunction_t::vector_t x;
26+
typename cfunction_t::vector_t gradient;
27+
28+
std::array<typename cfunction_t::scalar_t, cfunction_t::NumConstraints>
29+
lagrange_multipliers;
30+
std::array<typename cfunction_t::scalar_t, cfunction_t::NumConstraints>
31+
violations;
32+
typename cfunction_t::scalar_t penalty = 0;
33+
34+
ConstrainedState() {
35+
lagrange_multipliers.fill(typename cfunction_t::scalar_t(0));
36+
violations.fill(typename cfunction_t::scalar_t(0));
37+
penalty = typename cfunction_t::scalar_t(10);
38+
}
39+
40+
ConstrainedState(const state_t &rhs) { CopyState(rhs); } // nolint
41+
42+
ConstrainedState operator=(const state_t &rhs) {
43+
CopyState(rhs);
44+
return *this;
45+
}
46+
47+
void CopyState(const state_t &rhs) {
48+
value = rhs.value;
49+
x = rhs.x.eval();
50+
gradient = rhs.gradient.eval();
51+
penalty = rhs.penalty;
52+
lagrange_multipliers = rhs.lagrange_multipliers;
53+
violations = rhs.violations;
54+
}
55+
56+
State<typename cfunction_t::unconstrained_function_t> AsUnconstrained()
57+
const {
58+
State<typename cfunction_t::unconstrained_function_t> state;
59+
state.value = value;
60+
state.x = x.eval();
61+
state.gradient = gradient.eval();
62+
return state;
63+
}
64+
};
65+
66+
template <class cfunction_t>
67+
class UnconstrainedFunctionAdapter
68+
: public cfunction_t::unconstrained_function_t {
69+
public:
70+
EIGEN_MAKE_ALIGNED_OPERATOR_NEW
71+
72+
UnconstrainedFunctionAdapter(cfunction_t original,
73+
typename cfunction_t::state_t state)
74+
: original(original), state(state) {}
75+
76+
typename cfunction_t::unconstrained_function_t::state_t operator()(
77+
const typename cfunction_t::types::vector_t &x) const override {
78+
const typename cfunction_t::state_t inner =
79+
original(x, state.lagrange_multipliers, state.penalty);
80+
typename cfunction_t::unconstrained_function_t::state_t outer;
81+
outer.value = inner.value;
82+
outer.x = inner.x;
83+
outer.gradient = inner.gradient;
84+
return outer;
85+
}
86+
87+
private:
88+
cfunction_t original;
89+
typename cfunction_t::state_t state;
90+
};
91+
92+
template <class function_t, std::size_t TNumConstraints>
93+
struct ConstrainedFunction {
94+
static constexpr int NumConstraints = TNumConstraints;
95+
96+
using types = typename function_t::types;
97+
using scalar_t = typename function_t::types::scalar_t;
98+
using vector_t = typename function_t::types::vector_t;
99+
using matrix_t = typename function_t::types::matrix_t;
100+
using unconstrained_function_t = function_t;
101+
102+
public:
103+
using state_t =
104+
ConstrainedState<ConstrainedFunction<function_t, NumConstraints>>;
105+
static constexpr Differentiability diff_level = function_t::diff_level;
106+
ConstrainedFunction(
107+
const function_t *objective,
108+
const std::array<const function_t *, TNumConstraints> &constraints)
109+
: objective_(objective), constraints_(constraints) {}
110+
state_t operator()(const typename types::vector_t &x,
111+
std::array<scalar_t, TNumConstraints> lagrange_multipliers,
112+
scalar_t penalty) const {
113+
const typename function_t::state_t objective_state =
114+
objective_->operator()(x);
115+
116+
state_t constrained_state;
117+
constrained_state.x = objective_state.x;
118+
constrained_state.value = objective_state.value;
119+
constrained_state.gradient = objective_state.gradient;
120+
121+
// Sum augmented penalties for hard constraints.
122+
for (std::size_t i = 0; i < TNumConstraints; ++i) {
123+
const typename function_t::state_t constraint_state =
124+
constraints_[i]->operator()(x);
125+
const scalar_t cost = constraint_state.value;
126+
const scalar_t violation = cost;
127+
128+
const scalar_t lambda = lagrange_multipliers[i];
129+
const scalar_t aug_cost =
130+
violation + lambda * violation +
131+
static_cast<scalar_t>(0.5) * penalty * violation * violation;
132+
constrained_state.value += aug_cost;
133+
// Augmented gradient (only active if the constraint is violated).
134+
const scalar_t a = scalar_t(1) + lambda + penalty * violation;
135+
const typename types::vector_t scaled_local_grad =
136+
a * constraint_state.gradient;
137+
typename types::vector_t aug_grad = (cost > scalar_t(0))
138+
? scaled_local_grad
139+
: types::vector_t::Zero(x.size());
140+
constrained_state.gradient += aug_grad;
141+
constrained_state.violations[i] = violation;
142+
}
143+
144+
return constrained_state;
145+
}
146+
147+
const function_t *objective_;
148+
std::array<const function_t *, TNumConstraints> constraints_;
149+
};
150+
151+
template <typename function_t>
152+
class ZeroConstraint : public function_t {
153+
public:
154+
EIGEN_MAKE_ALIGNED_OPERATOR_NEW
155+
156+
// Construct with the original constraint function.
157+
explicit ZeroConstraint() {}
158+
159+
typename function_t::state_t operator()(
160+
const typename function_t::vector_t &x) const override {
161+
// Evaluate the original constraint: c(x)
162+
typename function_t::state_t state = constraint_(x);
163+
// Save the original constraint value.
164+
typename function_t::scalar_t c_val = state.value;
165+
// Transform to squared penalty: f(x) = c(x)^2, with gradient 2*c(x)*c'(x).
166+
state.value = c_val * c_val;
167+
state.gradient = 2 * c_val * state.gradient;
168+
return state;
169+
}
170+
171+
private:
172+
function_t constraint_;
173+
};
174+
175+
template <typename function_t>
176+
class NonNegativeConstraint : public function_t {
177+
public:
178+
EIGEN_MAKE_ALIGNED_OPERATOR_NEW
179+
180+
// Construct with the original constraint function.
181+
explicit NonNegativeConstraint() {}
182+
183+
typename function_t::state_t operator()(
184+
const typename function_t::vector_t &x) const override {
185+
// Evaluate the original constraint: c(x)
186+
typename function_t::state_t state = constraint_(x);
187+
// For inequality constraints, we only penalize when c(x) < 0.
188+
if (state.value >= 1e-7) {
189+
// Constraint satisfied; no penalty.
190+
state.value = 0;
191+
state.gradient.setZero();
192+
} else {
193+
// Constraint violated; penalty = ( - c(x) )^2.
194+
typename function_t::scalar_t violation = -state.value;
195+
state.value = violation * violation;
196+
// Chain rule: gradient = 2*violation * (-c'(x))
197+
state.gradient = 2 * violation * (-state.gradient);
198+
}
199+
return state;
200+
}
201+
202+
private:
203+
function_t constraint_;
204+
};
205+
206+
} // namespace cppoptlib::function
207+
208+
#endif // INCLUDE_CPPOPTLIB_CONSTRAINED_FUNCTION_H_

0 commit comments

Comments
 (0)