# Optimization
cartan-optim provides first- and second-order Riemannian optimization algorithms
that work on any manifold implementing the appropriate traits from cartan-core.
## Algorithm Overview
| Algorithm | Function | Trait requirements | Convergence |
|---|---|---|---|
| Riemannian Gradient Descent | minimize_rgd | Manifold + Retraction | First-order |
| Riemannian Conjugate Gradient | minimize_rcg | + ParallelTransport | First-order (superlinear) |
| Riemannian Trust Region | minimize_rtr | + Connection | Second-order |
| Fréchet Mean (Karcher flow) | frechet_mean | Manifold | First-order |
## When to Use Each
RGD: the simplest option. Use it when you need a quick result or when the manifold only
implements Retraction. Good for non-convex problems where you just want to descend.
RCG: preferred for smooth, well-conditioned problems; it converges superlinearly near the
minimum. Requires ParallelTransport to transport the conjugate direction between iterates.
Choose the CgVariant (Fletcher-Reeves or Polak-Ribière) based on the problem; Polak-Ribière is often the more robust choice in practice.
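A minimal call sketch, assuming an RCGConfig struct with a variant field (those names are assumptions; minimize_rcg and CgVariant are from the table above, and s2, cost, rgrad, x0 are as in the Usage Pattern section below):

```rust
// Sketch only: RCGConfig and its `variant` field are assumed, not confirmed by the crate.
// `s2`, `cost`, `rgrad`, `x0` are defined as in the Usage Pattern example below.
let cfg = RCGConfig {
    variant: CgVariant::PolakRibiere,
    ..RCGConfig::default()
};
let result = minimize_rcg(&s2, cost, rgrad, x0, &cfg);
```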
RTR: second-order convergence; best when you need high accuracy and second-order
information is affordable relative to function and gradient evaluations. Requires
Connection to build the quadratic model.
Fréchet mean: use instead of minimize_rgd when the objective is specifically the sum of squared geodesic distances to a set of points. Karcher flow is more numerically stable for this special case.
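For example, averaging a few points on the sphere might look like the sketch below. The exact argument list of frechet_mean (manifold plus a slice of points) is an assumption, but the OptResult return type matches the description further down.

```rust
// Sketch: frechet_mean's signature (manifold + slice of points) is assumed here.
// `s2` is the Sphere::<3> from the Usage Pattern section below.
let pts: Vec<[f64; 3]> = vec![
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
];
let mean = frechet_mean(&s2, &pts);
println!("Fréchet mean: {:?}", mean.point);
```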
## Usage Pattern
```rust
use cartan::prelude::*;
use cartan::manifolds::Sphere;
use cartan_optim::{minimize_rgd, RGDConfig};
use nalgebra::SVector;

fn main() {
    // Unit sphere S^2 embedded in R^3.
    let s2 = Sphere::<3>;

    // Cost f(p) = -p[0], minimized on the sphere at p = (1, 0, 0).
    let cost = |p: &[f64; 3]| -p[0];

    // Riemannian gradient: project the Euclidean gradient onto the tangent space at p.
    let rgrad = |p: &[f64; 3]| {
        let eg = SVector::<f64, 3>::from([-1.0, 0.0, 0.0]);
        s2.project_tangent(p, &eg)
    };

    let x0 = s2.random_point(&mut rand::rng());
    let result = minimize_rgd(&s2, cost, rgrad, x0, &RGDConfig::default());

    assert!(result.converged);
    println!("min at {:?}, f = {}", result.point, result.value);
}
```

## OptResult
All algorithms return OptResult<M::Point>:
```rust
pub struct OptResult<P> {
    pub point: P,          // final iterate
    pub value: f64,        // f(point)
    pub grad_norm: f64,    // ||grad f(point)||
    pub iterations: usize, // iterations taken
    pub converged: bool,   // true if grad_tol was met before max_iters
}
```
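When converged is false, the usual next step is to adjust the stopping criteria and rerun. The sketch below continues the Usage Pattern example and assumes RGDConfig exposes grad_tol and max_iters fields; those names are inferred from the comment on converged, not confirmed against the crate.

```rust
// Hypothetical sketch: the RGDConfig field names below (grad_tol, max_iters) are assumed,
// inferred from the `converged` comment above. `s2`, `cost`, `rgrad`, `x0` are as above.
let cfg = RGDConfig {
    grad_tol: 1e-10,
    max_iters: 5_000,
    ..RGDConfig::default()
};
let result = minimize_rgd(&s2, cost, rgrad, x0, &cfg);
if !result.converged {
    eprintln!(
        "stopped after {} iterations, ||grad|| = {:e}",
        result.iterations, result.grad_norm
    );
}
```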