cartan

Riemannian Trust Region

A second-order optimizer that models the objective with a quadratic on each tangent space and restricts steps to a trust region ball. Achieves second-order convergence near the minimum.

Algorithm Summary

PropertyValue
Functionminimize_rtr
ConfigRTRConfig
Trait requirementsManifold + Retraction + Connection
ConvergenceSecond-order: near minimum
Inner solverTruncated conjugate gradient (tCG) on tangent space

Algorithm

At each iterate with trust radius , solve the subproblem:

where is the Riemannian Hessian approximated via the connection. The trust radius is updated based on the ratio of actual to predicted reduction.

RTRConfig

pub struct RTRConfig {
    pub max_iters:      usize,  // default: 200
    pub grad_tol:       f64,    // default: 1e-6
    pub delta_init:     f64,    // default: 1.0  - initial trust radius
    pub delta_max:      f64,    // default: 10.0 - max trust radius
    pub rho_accept:     f64,    // default: 0.1  - min ratio to accept step
    pub rho_expand:     f64,    // default: 0.75 - ratio threshold to expand Delta
    pub max_tcg_iters:  usize,  // default: 20   - inner tCG iterations
}

Code Examples

Minimize on S²

use cartan::prelude::*;
use cartan::manifolds::Sphere;
use cartan_optim::{minimize_rtr, RTRConfig};
use nalgebra::SVector;
 
let s2 = Sphere::<3>;
let cost  = |p: &[f64; 3]| -p[0];
let rgrad = |p: &[f64; 3]| {
    s2.project_tangent(p, &SVector::from([-1.0, 0.0_f64, 0.0]))
};
 
let x0 = s2.project_point(&[-0.5_f64, 0.5, 0.707]);
let result = minimize_rtr(&s2, cost, rgrad, x0, &RTRConfig::default());
assert!(result.converged);
// RTR converges in far fewer iterations than RGD near the minimum

Numerical Notes

Hessian approximation: RTR uses the connection (Levi-Civita parallel transport) to approximate the Riemannian Hessian via finite differences of the gradient. This requires Connection but not an explicit Hessian closure.

delta_init: the initial trust radius should be on the order of the expected step size. Too large and the inner tCG wastes iterations; too small and the algorithm behaves like gradient descent initially.

When to use: RTR is most valuable when:

  • High accuracy is needed (grad_tol < 1e-8)
  • Function evaluations are cheap but iterations are expensive
  • The objective is well-approximated by a quadratic near the solution

For rough problems or when Connection is unavailable, prefer RCG.

Applications

  • High-precision pose estimation on SE(3)
  • Fréchet mean computation on SPD(N) to high accuracy
  • Eigenvalue optimization on Grassmann manifolds