publications
(*) denotes equal contribution
2024
- [arXiv] Mirror and Preconditioned Gradient Descent in Wasserstein Space. Clément Bonet, Théo Uscidda, Adam David, and 2 more authors. arXiv:2310.09254, 2024
As the problem of minimizing functionals on the Wasserstein space encompasses many applications in machine learning, several optimization algorithms on ℝ^d have received their analog on the Wasserstein space. We focus here on lifting two explicit algorithms: mirror descent and preconditioned gradient descent. These algorithms were introduced to better capture the geometry of the function to minimize and are provably convergent under appropriate (namely relative) smoothness and convexity conditions. Adapting these notions to the Wasserstein space, we prove convergence guarantees for some Wasserstein-gradient-based discrete-time schemes for new pairings of objective functionals and regularizers. The difficulty here is to carefully select along which curves the functionals should be smooth and convex. We illustrate the advantages of adapting the geometry induced by the regularizer on ill-conditioned optimization tasks, and showcase the improvement of choosing different discrepancies and geometries in a computational biology task of aligning single cells.
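A minimal particle-level sketch of the mirror-descent idea discussed above, under simplifying assumptions not taken from the paper: the functional is a potential energy F(μ) = E_μ[V(X)] with a quadratic, ill-conditioned potential V, and the mirror map is quadratic, in which case the mirror step reduces to a preconditioned gradient update on each particle. The matrices H and A, the step size, and the particle count are illustrative choices only.

```python
import numpy as np

# Particle-level sketch of Wasserstein mirror descent on the potential energy
# functional F(mu) = E_{X~mu}[V(X)], with V(x) = 0.5 * x^T H x ill-conditioned.
# With a quadratic mirror map phi(x) = 0.5 * x^T A x, the mirror step reduces
# to the preconditioned update x <- x - tau * A^{-1} grad V(x).
# H, A, tau, and the particle count are illustrative, not the paper's settings.

rng = np.random.default_rng(0)
d, n = 2, 256
H = np.diag([100.0, 1.0])            # ill-conditioned Hessian of the potential V
A = H                                # mirror map matched to the geometry of V
A_inv = np.linalg.inv(A)

def grad_V(x):
    """Gradient of V(x) = 0.5 * x^T H x, applied row-wise to a batch of particles."""
    return x @ H                     # H is symmetric

particles = rng.normal(size=(n, d))  # samples from the initial measure mu_0
tau = 0.5
for _ in range(50):
    # The Wasserstein gradient of F at mu is grad V; the mirror step preconditions it.
    particles = particles - tau * grad_V(particles) @ A_inv

print("mean squared norm after 50 steps:", np.mean(np.sum(particles**2, axis=1)))
```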
- [SPIGM] Disentangled Representation Learning through Geometry Preservation with the Gromov-Monge Gap. Théo Uscidda*, Luca Eyring*, Karsten Roth, and 3 more authors. In the Structured Probabilistic Inference & Generative Modeling Workshop at the 41st International Conference on Machine Learning, 2024
Learning disentangled representations in an unsupervised manner is a fundamental challenge in machine learning. Solving it may unlock other problems, such as generalization, interpretability, or fairness. While remarkably difficult to solve in general, recent works have shown that disentanglement is provably achievable under additional assumptions that leverage geometrical constraints, such as local isometry. Building on these insights, we propose a novel perspective on disentangled representation learning built on quadratic optimal transport. Specifically, we formulate the problem in the Gromov-Monge setting, which seeks isometric mappings between distributions supported on different spaces. We propose the Gromov-Monge Gap (GMG), a regularizer that quantifies how well an arbitrary push-forward map between two distributions supported on different spaces preserves geometry. We demonstrate the effectiveness of GMG regularization for disentanglement on four standard benchmarks. Moreover, we show that geometry preservation can even encourage unsupervised disentanglement without the standard reconstruction objective, making the underlying model decoder-free and promising a more practically viable and scalable perspective on unsupervised disentanglement.
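As a rough illustration of what "geometry preservation" of a push-forward map means, the toy check below compares pairwise distances before and after applying a map to a batch of points. This is only a crude surrogate: the Gromov-Monge Gap itself is defined through a quadratic (Gromov-Monge) optimal transport problem, which is not reproduced here, and the maps and data are arbitrary examples.

```python
import numpy as np

# Crude illustration of geometry preservation for a push-forward map T:
# compare all pairwise distances before and after mapping a batch of points.
# This is only a surrogate for the Gromov-Monge Gap, which involves a quadratic
# optimal transport problem rather than a direct pairwise comparison.

def pairwise_dists(x):
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def distance_distortion(x, z):
    """Mean squared mismatch between pairwise distances of x and of z = T(x)."""
    return np.mean((pairwise_dists(x) - pairwise_dists(z)) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=(128, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))      # a random orthogonal matrix
print(distance_distortion(x, x @ Q))              # ~0: isometries preserve distances
print(distance_distortion(x, 3.0 * x))            # > 0: scaling distorts distances
```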
- [ICLR] Unbalancedness in Neural Monge Maps Improves Unpaired Domain Translation. Luca Eyring*, Dominik Klein*, Théo Uscidda*, and 4 more authors. In the 12th International Conference on Learning Representations, 2024
In optimal transport (OT), a Monge map is a mapping that transports a source distribution to a target distribution in the most cost-efficient way. Recently, multiple neural estimators for Monge maps have been developed and applied in diverse unpaired domain translation tasks, e.g. in single-cell biology and computer vision. However, the classic OT framework enforces mass conservation, which makes it prone to outliers and limits its applicability in real-world scenarios. This sensitivity can be particularly harmful in OT domain translation tasks, where the relative position of a sample within a distribution is explicitly taken into account. While unbalanced OT tackles this challenge in the discrete setting, its integration into neural Monge map estimators has received limited attention. We propose a theoretically grounded method to incorporate unbalancedness into any Monge map estimator. We improve existing estimators to model cell trajectories over time and to predict cellular responses to perturbations. Moreover, our approach seamlessly integrates with the OT flow matching (OT-FM) framework. While we show that OT-FM performs competitively in image translation, we further improve performance by incorporating unbalancedness (UOT-FM), which better preserves relevant features. We hence establish UOT-FM as a principled method for unpaired image translation.
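To make the role of unbalancedness concrete, here is a toy discrete computation written from the standard unbalanced-Sinkhorn scaling iterations (not the paper's code): relaxing the marginal constraints lets the transport plan discount costly target outliers, and the resulting marginals can then re-weight samples before fitting any Monge map estimator; that fitting step, which is the paper's contribution, is not shown. The cost, ε, ρ, and data below are arbitrary.

```python
import numpy as np

# Toy unbalanced entropic OT via standard Sinkhorn-like scaling iterations.
# The point: with relaxed marginals, mass sent to costly outliers is discounted,
# and the plan's marginals re-weight samples before any Monge map is fitted.
# epsilon, rho, the data, and the iteration count are arbitrary choices.

def unbalanced_sinkhorn(C, a, b, eps=0.5, rho=1.0, n_iter=300):
    K = np.exp(-C / eps)
    u, v = np.ones_like(a), np.ones_like(b)
    power = rho / (rho + eps)                 # softens the usual Sinkhorn update
    for _ in range(n_iter):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]        # transport plan

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))                                   # source samples
y = np.vstack([rng.normal(size=(45, 2)),                       # target samples...
               5.0 + rng.normal(size=(5, 2))])                 # ...plus a few outliers
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)             # squared-Euclidean cost
a = np.full(50, 1 / 50)
b = np.full(50, 1 / 50)

P = unbalanced_sinkhorn(C, a, b)
print("mass received by the 5 outliers:", P.sum(0)[-5:])       # close to zero
print("total transported mass:", P.sum())                      # < 1: some mass is discounted
```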
2023
- [arXiv] Entropic (Gromov) Wasserstein Flow Matching with GENOT. Dominik Klein*, Théo Uscidda*, Fabian Theis, and 1 more author. arXiv:2310.09254, 2023
Optimal transport (OT) theory has reshaped the field of generative modeling: combined with neural networks, recent Neural OT (N-OT) solvers use OT as an inductive bias, to focus on “thrifty” mappings that minimize average displacement costs. This core principle has fueled the successful application of N-OT solvers to high-stakes scientific challenges, notably single-cell genomics. N-OT solvers are, however, increasingly confronted with practical challenges: while most N-OT solvers can handle squared-Euclidean costs, they must be repurposed to handle more general costs; their reliance on deterministic Monge maps as well as mass conservation constraints can easily go awry in the presence of outliers; mapping points across heterogeneous spaces is out of their reach. While each of these challenges has been explored independently, we propose a new framework that can handle, natively, all of these needs. The generative entropic neural OT (GENOT) framework models the conditional distribution π_ε(y|x) of an optimal entropic coupling π_ε, using conditional flow matching. GENOT is generative, and can transport points across spaces, guided by sample-based, unbalanced solutions to the Gromov-Wasserstein problem, that can use any cost. We showcase our approach on both synthetic and single-cell datasets, using GENOT to model cell development, predict cellular responses, and translate between data modalities.
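As a rough sketch of the data side of coupling-guided conditional flow matching: pairs (x, y) are drawn from a discrete coupling P, and the regression targets are straight-line interpolations together with their velocities. The neural velocity field, optimizer, and the Gromov-Wasserstein and unbalanced variants are omitted; the sampler below and its stand-in coupling are hypothetical, not the paper's implementation.

```python
import numpy as np

# Data-side sketch of coupling-guided conditional flow matching: draw pairs
# (x, y) from a discrete coupling P (e.g. an entropic OT plan), then build the
# interpolation z_t and velocity target that a neural field v_theta(t, z, x)
# would be regressed on. The network and training loop are omitted; P below
# is a random stand-in, not an actual entropic coupling.

def sample_cfm_batch(P, x, y, rng, batch_size=64):
    rows = rng.integers(len(x), size=batch_size)          # source indices
    probs = P[rows] / P[rows].sum(axis=1, keepdims=True)  # conditional P(. | x_i)
    cols = np.array([rng.choice(len(y), p=p) for p in probs])
    x_b, y_b = x[rows], y[cols]
    t = rng.uniform(size=(batch_size, 1))
    z0 = rng.normal(size=y_b.shape)        # noise the conditional flow starts from
    z_t = (1 - t) * z0 + t * y_b           # straight-line interpolation
    v_target = y_b - z0                    # what v_theta(t, z_t, x_b) should predict
    return t, z_t, x_b, v_target

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))               # source samples
y = rng.normal(size=(50, 3))               # target samples, possibly in another space
P = rng.random((50, 50)); P /= P.sum()     # stand-in for a coupling matrix
t, z_t, x_b, v_target = sample_cfm_batch(P, x, y, rng)
print(t.shape, z_t.shape, x_b.shape, v_target.shape)
```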
- [ICML] The Monge Gap: A Regularizer to Learn All Transport Maps. Théo Uscidda and Marco Cuturi. In the 40th International Conference on Machine Learning, 2023
Optimal transport (OT) theory has been used in machine learning to study and characterize maps that can efficiently push forward a probability measure onto another. Recent works have drawn inspiration from Brenier’s theorem, which states that when the ground cost is the squared-Euclidean distance, the “best” map to morph a continuous measure in 𝒫(ℝ^d) into another must be the gradient of a convex function. To exploit that result, [Makkuva+2020, Korotin+2020] consider maps T=∇f_θ, where f_θ is an input convex neural network (ICNN), as defined by Amos+2017, and fit θ with SGD using samples. Despite their mathematical elegance, fitting OT maps with ICNNs raises many challenges, due notably to the many constraints imposed on θ; the need to approximate the conjugate of f_θ; or the limitation that they only work for the squared-Euclidean cost. More generally, we question the relevance of using Brenier’s result, which only applies to densities, to constrain the architecture of candidate maps fitted on samples. Motivated by these limitations, we propose a radically different approach to estimating OT maps: given a cost c and a reference measure ρ, we introduce a regularizer, the Monge gap M_ρ^c(T) of a map T. That gap quantifies how far a map T deviates from the ideal properties we expect from a c-OT map. In practice, we drop all architecture requirements for T and simply minimize a distance (e.g., the Sinkhorn divergence) between T♯μ and ν, regularized by M_ρ^c. We study M_ρ^c, and show how our simple pipeline significantly outperforms other baselines in practice.
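To make the definition concrete, here is a toy, sample-based evaluation of a Monge-gap-style quantity: the average displacement cost of a map T minus the OT cost between the reference samples and their images, the latter computed exactly as an assignment problem since the empirical measures are uniform and of equal size. This only illustrates the definition; the paper's training pipeline (a neural map fitted with a Sinkhorn divergence plus this regularizer) is not reproduced, and the cost and maps below are arbitrary examples.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy evaluation of a Monge-gap-style quantity for a map T on samples from rho:
# mean displacement cost of T minus the OT cost between the samples and their
# images. For uniform empirical measures of equal size, that OT cost is an
# assignment problem, solved exactly here. Costs and maps are arbitrary examples.

def monge_gap(x, T, cost=lambda a, b: ((a - b) ** 2).sum(-1)):
    Tx = T(x)
    pointwise = cost(x, Tx).mean()                    # average cost of the pairing (x_i, T(x_i))
    C = cost(x[:, None, :], Tx[None, :, :])           # all-pairs cost matrix
    rows, cols = linear_sum_assignment(C)
    ot_value = C[rows, cols].mean()                   # OT cost between the two point clouds
    return pointwise - ot_value                       # >= 0; ~0 when the pairing is already optimal

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
print(monge_gap(x, lambda z: z + np.array([2.0, 0.0])))  # a translation: gap ~ 0
print(monge_gap(x, lambda z: z[::-1]))                   # shuffling points: gap > 0
```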