2025 Activity report
Project-Team OCKHAM
RNSR: 202324392T
- Research center Inria Lyon Centre
- In partnership with: Ecole normale supérieure de Lyon, Université Claude Bernard (Lyon 1)
- Team name: Optimization, pHysical Knowledge, Algorithms and Models
- In collaboration with: Laboratoire de l'Informatique du Parallélisme (LIP)
Creation of the Project-Team: 2023 March 01
Each year, Inria research teams publish an Activity Report presenting their work and results over the reporting period. These reports follow a common structure, with some optional sections depending on the specific team. They typically begin by outlining the overall objectives and research programme, including the main research themes, goals, and methodological approaches. They also describe the application domains targeted by the team, highlighting the scientific or societal contexts in which their work is situated.
The reports then present the highlights of the year, covering major scientific achievements, software developments, or teaching contributions. When relevant, they include sections on software, platforms, and open data, detailing the tools developed and how they are shared. A substantial part is dedicated to new results, where scientific contributions are described in detail, often with subsections specifying participants and associated keywords.
Finally, the Activity Report addresses funding, contracts, partnerships, and collaborations at various levels, from industrial agreements to international cooperations. It also covers dissemination and teaching activities, such as participation in scientific events, outreach, and supervision. The document concludes with a presentation of scientific production, including major publications and those produced during the year.
Keywords
Computer Science and Digital Science
- A3.5. Social networks
- A3.5.1. Analysis of large graphs
- A5.3.2. Sparse modeling and image representation
- A5.8. Natural language processing
- A5.9. Signal processing
- A5.9.4. Signal processing over graphs
- A5.9.5. Sparsity-aware processing
- A5.9.6. Optimization tools
- A6.3.1. Inverse problems
- A8.2. Optimization
- A8.6. Information theory
- A8.12. Optimal transport
- A9.2.1. Supervised learning
- A9.2.4. Optimization and learning
- A9.2.6. Neural networks
- A9.2.7. Kernel methods
- A9.2.8. Deep learning
- A9.11. Generative AI
Other Research Topics and Application Domains
- B2.6. Biological and medical imaging
- B6.6. Embedded systems
- B7.2.1. Smart vehicles
- B9.5.1. Computer science
- B9.5.2. Mathematics
- B9.5.6. Data science
- B9.10. Privacy
1 Team members, visitors, external collaborators
Research Scientists
- Remi Gribonval [Team leader, INRIA, Senior Researcher, HDR]
- Paulo Goncalves [INRIA, Senior Researcher, HDR]
- Mathurin Massias [INRIA, Researcher]
- Titouan Vayer [INRIA, Researcher]
Faculty Members
- Marion Foare [CPE LYON, Associate Professor]
- Elisa Riccietti [ENS DE LYON, Associate Professor, from Sep 2025]
Post-Doctoral Fellows
- Alice Brenon [ENS DE LYON]
- Etienne Lasalle [ENS DE LYON, Post-Doctoral Fellow, until Feb 2025]
- Guillaume Lauga [ENS DE LYON, Post-Doctoral Fellow, from Feb 2025 until Mar 2025]
- Hugo Lebeau [INRIA, Post-Doctoral Fellow, from Feb 2025]
- Manon Verbockhaven [ENS DE LYON, Post-Doctoral Fellow, from Dec 2025]
PhD Students
- Giuseppe Carrino [ENS DE LYON, from Nov 2025]
- Mael Chaumette [INRIA]
- Edgar Desainte-Mareville [ENS DE LYON]
- Anne Gagneux [UNIV LYON I]
- Arthur Lebeurrier [ENS DE LYON]
- Sibylle Marcotte [ENS PARIS]
- Can Pouliquen [ENS DE LYON]
Technical Staff
- Pascal Carrivain [INRIA, Engineer]
Interns and Apprentices
- Ilias Bouhss [ENS DE LYON, Intern, from Jun 2025 until Aug 2025]
- Giuseppe Carrino [ENS DE LYON, Intern, from Mar 2025 until Jun 2025]
- Chady Essouabri [CNRS, Intern, from May 2025 until Aug 2025]
- Florian Kozikowski [CNRS, Intern, from Mar 2025 until Aug 2025]
- Damien Rouchouse [INRIA, Intern, from Apr 2025 until Sep 2025]
Administrative Assistant
- Emilie Gatignol [INRIA]
Visiting Scientist
- Laurent Jacques [Univ UCLouvain, from Sep 2025]
2 Overall objectives
Building on a culture at the interface of signal modeling, mathematical optimization and statistical machine learning, the global objective of OCKHAM is to develop computationally efficient and mathematically founded methods and models to process high-dimensional data. Our ambition is to develop frugal signal processing and machine learning methods able to exploit structured models, intrinsically associated to resource-efficient implementations, and endowed with solid statistical guarantees.
Challenge 1: Developing frugal methods with robust expressivity.
By frugal approaches we mean algorithms that rely on a controlled use of computing resources, but also methods whose expressivity and flexibility provably rely on the versatile notion of sparsity. This is expected to avoid the current pitfalls of costly over-parameterizations and to robustify the approaches with respect to adversarial examples and overfitting. More specifically, it is essential to contribute to the understanding of methods based on neural networks, in order to improve their performance and, most of all, their efficiency in resource-limited environments.
Challenge 2: Integrating models in learning algorithms.
To make statistical machine learning both more frugal and more interpretable, it is important to develop techniques able to exploit not only high-dimensional data but also models in various forms when available. When some partial knowledge is available about some phenomena related to the processed data, e.g. under the form of a physical model such as a partial differential equation, or as a graph capturing local or non-local correlations, the goal is to use this knowledge as an inspiration to adapt machine learning algorithms. The main challenge is to flexibly articulate a priori knowledge and data-driven information, in order to achieve a controlled extrapolation of predicted phenomena much beyond the particular type of data on which they were observed, and even in applications where training data is scarce.
Challenge 3: Guarantees on interpretability, explainability, and privacy.
The notion of sparsity and its structured avatars –notably via graphs– is known to play a fundamental role in ensuring the identifiability of decompositions in latent spaces, for example for high-dimensional inverse problems in signal processing. The team's ambition is to deploy these ideas to ensure not only frugality but also some level of explainability of decisions and an interpretability of learned parameters, which is an important societal stake for the acceptability of “algorithmic decisions”. Learning in small-dimensional latent spaces is also a way to spare computing resources and, by limiting the public exposure of data, it is expected to enable tunable and quantifiable tradeoffs between the utility of the developed methods and their ability to preserve privacy.
3 Research program
This project is resolutely at the interface of signal modeling, mathematical optimization and statistical machine learning, and concentrates on scientific objectives that are both ambitious –as they are difficult and subject to a strong international competition– and realistic thanks to the richness and complementarity of skills they mobilize in the team.
Sparsity constitutes a backbone for this project, not only as a target to ensure resource-efficiency and privacy, but also as prior knowledge to be exploited to ensure the identifiability of parameters and the interpretability of results. Graphs are its necessary alter ego, to flexibly model and exploit relations between variables, signals, and phenomena, whether these relations are known a priori or to be inferred from data. Lastly, advanced large-scale optimization is a key tool to handle in a statistically controlled and algorithmically efficient way the dynamic and incremental aspects of learning in varying environments.
The scientific activity of the project is articulated around the three axes described below. A common endeavor to these three axes consists in designing structured low-dimensional models, algorithms of bounded complexity to adjust these models to data through learning mechanisms, and a control of the performance of these algorithms to exploit these models on tasks ranging from low-level signal processing to the extraction of high-level information.
3.1 Axis 1: Sparsity for high-dimensional learning.
As now widely documented, the fact that a signal admits a sparse representation in some signal dictionary 66 is an enabling factor not only to address a variety of inverse problems with high-dimensional signals and images, such as denoising, deconvolution, or declipping, but also to speed up or decrease the cost of the acquisition of analog signals in certain scenarios compatible with compressive sensing 68, 60. The flexibility of the models, which can incorporate learned dictionaries 100, as well as structured and/or low-rank variants of the now-classical sparse modeling paradigm 78, has been a key factor of the success of these approaches. Another important factor is the existence of algorithms of bounded complexity with provable performance, often associated with convex regularization and proximal strategies 56, 63, which make it possible to identify latent sparse signal representations from low-dimensional indirect observations.
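As a reminder of the kind of convex formulation and proximal strategy alluded to above, a standard textbook instance is the ℓ1-regularized least-squares problem solved by proximal gradient descent. The notation (observations $y$, measurement operator $A$, sparse unknown $x$, regularization weight $\lambda$, step size $\tau$) is introduced here for illustration and is not taken from the report:

```latex
\hat{x} \in \arg\min_{x \in \mathbb{R}^n} \; \tfrac{1}{2}\,\|y - Ax\|_2^2 + \lambda\,\|x\|_1,
\qquad
x^{k+1} = \operatorname{prox}_{\tau\lambda\|\cdot\|_1}\!\bigl(x^{k} - \tau A^{\top}(Ax^{k} - y)\bigr),
\qquad
\bigl[\operatorname{prox}_{\tau\lambda\|\cdot\|_1}(z)\bigr]_i = \operatorname{sign}(z_i)\,\max\bigl(|z_i| - \tau\lambda,\, 0\bigr).
```

With a step size $0 < \tau \le 1/\|A\|_2^2$, each iteration costs one multiplication by $A$ and one by $A^{\top}$, which is what makes the complexity of such schemes easy to bound.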
While being now well-mastered (and in the core field of expertise of the team), these tools are typically constrained to relatively rigid settings where the unknown is described either as a sparse vector or a low-rank matrix or tensor in high (but finite) dimension. Moreover, the algorithms hardly scale to the dimensions needed to handle inverse problems arising from the discretization of physical models (e.g., for 3D wavefield reconstruction). A major challenge is to establish a comprehensive algorithmic and theoretical toolset to handle continuous notions of sparsity 61, which have been identified as a way to potentially circumvent these bottlenecks. The other main challenge is to extend the sparse modeling paradigm to resource-efficient and interpretable statistical machine learning. The methodological and conceptual output of this axis provides tools for Axes 2 and 3, which in return fuel the questions investigated in this axis.
- 1.1 Versatile and efficient sparse modeling. The goal is to propose flexible and resource-efficient sparse models, possibly leveraging classical notions of dictionaries and structured factorization, but also the notion of sparsity in continuous domains (e.g. for sketched clustering, mixture model estimation, or image super-resolution), low-rank tensor representations, and neural networks with sparse connection patterns.
Besides the empirical validation of these models and of the related algorithms on a diversity of targeted applications, the challenge is to determine conditions under which their success can be mathematically controlled, and to determine the fundamental tradeoffs between the expressivity of these models and their complexity.
- 1.2 Sparse optimization. The main objectives are: a) to define cost functions and regularization penalties that integrate not only the targeted learning tasks, but also a priori knowledge, for example under the form of conservation laws or as relation graphs, cf Axis 2; b) to design efficient and scalable algorithms 67, 80 to optimize these cost functions in a controlled manner in a large-scale setting. To ensure the resource-efficiency of these algorithms, while avoiding pitfalls related to the discretization of high-dimensional problems (aka curse of dimensionality), we investigate the notion of “continuous” sparsity (i.e., with sparse measures), of hierarchies (along the ideas of multilevel methods), and of reduced precision (cf also Axis 3). The nonconvexity and non-smoothness of the problems are key challenges, and the exploitation of proximal algorithms and/or convexifications in the space of Borelian measures are privileged approaches.
- 1.3 Identifiability of latent sparse representations. To provide solid guarantees on the interpretability of sparse models obtained via learning, one needs to ensure the identifiability of the latent variables associated to their parameters. This is particularly important when these parameters bear some meaning due to the underlying physics. Vice-versa, physical knowledge can guide the choice of which latent parameters to estimate. By leveraging the team's know-how obtained in the field of inverse problems, compressive sensing and source separation in signal processing, we aim at establishing theoretical guarantees on the uniqueness (modulo some equivalence classes to be characterized) of the solutions of the considered optimization problems, on their stability in the presence of random or adversarial noise, and on the convergence and stability of the algorithms.
3.2 Axis 2: Learning on graphs and learning of graphs.
Graphs provide synthetic and sparse representations of the interactions between potentially high-dimensional data, whether in terms of proximity, statistical correlation, functional similarity, or simple affinities. One central task in this domain is to infer such discrete structures from the observations, in a way that best accounts for the ties between data without becoming too complex due to spurious relationships. The graphical lasso 69 is among the most popular and successful algorithms to build a sparse representation of the relations between time series (observed at each node) that unveils relevant patterns of the data. Recent works (e.g. 79) strived to emphasize the clustered structure of the data by imposing spectral constraints on the Laplacian of the sought graphs, with the aim of improving the performance of spectral approaches to unsupervised classification. In this direction, several challenges remain, such as the transposition of the framework to graph-based semi-supervised learning 57, where natural models are stochastic block models rather than strictly multi-component graphs (e.g. Gaussian mixture models). As is done in 105, the standard ℓ1-norm penalization term of the graphical lasso could be questioned in this case. On another level, when low-rank (precision) matrices and/or the preservation of privacy are important stakes, one could be inspired by the sketching techniques developed in 74 and 62 to work out a sketched graphical lasso. There exist other situations where the graph is known a priori and does not need to be inferred from the data. This is for instance the case when the data naturally lie on a graph (e.g. social networks or geographical graphs), so that one has to combine this data structure with the attributes (or measures) carried by the nodes or the edges of these graphs.
Graph signal processing (GSP) 978, which underwent methodological developments at a very rapid pace in recent years, is precisely an approach to jointly exploit algebraically these structures and attributes, either by filtering them, by re-organizing them, or by reducing them to principal components. However, as is more and more often the case, data collection processes yield very large data sets with high-dimensional graphs. In contrast to standard digital signal processing, which relies on regular graph structures (cycle graph or Cartesian grid), treating complex structured data in a global form is not an easily scalable task 71. Hence, the notion of distributed GSP 64, 65 has naturally emerged. Yet, very little has been done on graph signals supported on dynamical graphs that undergo vertex/edge edits.
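For reference, the graphical lasso mentioned above is usually stated as a penalized Gaussian maximum-likelihood problem over precision matrices. The formulation below is the common textbook one, with notation introduced here for illustration ($S$ the empirical covariance of the observations, $\Theta$ the sought precision matrix, $\lambda > 0$ the sparsity weight); some variants also penalize the diagonal entries:

```latex
\hat{\Theta} \in \arg\min_{\Theta \succ 0} \; -\log\det\Theta + \operatorname{tr}(S\Theta) + \lambda \sum_{i \neq j} |\Theta_{ij}| .
```

The inferred graph has an edge between nodes $i$ and $j$ whenever $\hat{\Theta}_{ij} \neq 0$, so the ℓ1 penalty directly controls how many edges are kept.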
- 2.1 Learning of graphs. When the graphical structure of the data is not known a priori, one needs to explore how to build it or to infer it. In the case of partially known graphs, this raises several questions in terms of relevance with respect to sparse learning. For example, a challenge is to determine which edges should be kept, whether they should be oriented, and how attributes on the graph could be taken into account (in particular when considering time-series on graphs) to better infer the nature and structure of the un-observed interactions. We strive to adapt known approaches such as the graphical lasso to estimate the covariance under a sparsity constraint (integrating also temporal priors), and investigate diffusion approaches to study the identifiability of the graphs. In connection with Axis 1.2, a particular challenge is to incorporate a priori knowledge coming from physical models that offer concise and interpretable descriptions of the data and their interactions.
- 2.2 Distributed and adaptive learning on graphs. The availability of a known graph structure underlying training data offers many opportunities to develop distributed approaches, and opens perspectives where graph signal processing and machine learning can mutually fertilize each other.
Some classifiers can be formalized as solutions of a constrained optimization problem, and an important objective is then to reduce their global complexity by developing distributed versions of these algorithms. Compared to costly centralized solutions, distributing the operations by restricting them to local node neighborhoods will enable solutions that are both more frugal and more privacy-friendly. In the case of dynamic graphs, the idea is to get inspiration from adaptive processing techniques to make the algorithms able to track the temporal evolution of data, either in terms of structural evolution or of temporal variations of the attributes. This aspect finds a natural continuation in the objectives of Axis 3.
3.3 Axis 3: Dynamic and frugal learning.
With the resurgence of neural network approaches in machine learning, training times of the order of days, weeks, or even months are common. Mainstream research in deep learning applies these approaches to an increasingly large class of problems and follows the general wisdom of improving the models' prediction accuracy by “stacking more layers”, making the approach ever more resource-hungry. The underpinning theory on which resources are needed for a network architecture to achieve a given accuracy is still in its infancy. Efficient scaling of such techniques to massive sample sizes or dimensions in a resource-restricted environment remains a challenge and is a particularly active field of academic and industrial R&D, with recent interest in techniques such as sketching, dimension reduction, and approximate optimization.
A central challenge is to develop novel approximate techniques with reduced computational and memory imprint. For certain unsupervised learning tasks such as PCA, unsupervised clustering, or parametric density estimation, random features (e.g. random Fourier features 95) make it possible to compute aggregated sketches guaranteed to preserve the information needed to learn, and no more: this has led to the compressive learning framework, which is endowed with statistical learning guarantees 74 as well as privacy preservation guarantees 62. A sketch can be seen as an embedding of the empirical probability distribution of the dataset with a particular form of kernel mean embedding 98. Yet, designing random features given a learning task remains something of an art, and a major challenge is to design provably good end-to-end sketching pipelines with controlled complexity for supervised classification, structured matrix factorization, and deep learning.
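To make the notion of a sketch concrete, here is a minimal pure-Python illustration, under our own assumptions, of a sketch built by averaging random Fourier features over a dataset. The function names, the Gaussian frequency distribution, and the toy data are illustrative choices, not the team's implementation:

```python
import cmath
import random


def draw_frequencies(m, dim, scale, rng):
    """Draw m random frequency vectors with i.i.d. Gaussian entries (assumed law)."""
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(m)]


def sketch(points, freqs):
    """Average the random Fourier features exp(i <w, x>) over the whole dataset."""
    n = len(points)
    out = []
    for w in freqs:
        total = 0j
        for x in points:
            total += cmath.exp(1j * sum(wi * xi for wi, xi in zip(w, x)))
        out.append(total / n)
    return out


def merge(sketch_a, n_a, sketch_b, n_b):
    """Sketches are dataset averages, so two sketches merge by a weighted mean."""
    return [(n_a * a + n_b * b) / (n_a + n_b) for a, b in zip(sketch_a, sketch_b)]


rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
freqs = draw_frequencies(m=10, dim=2, scale=1.0, rng=rng)

# The sketch has a fixed size m, whatever the number of data points.
z_full = sketch(data, freqs)
# Sketching two chunks separately and merging gives the same result,
# which is what makes the approach suited to streams and distributed data.
z_merged = merge(sketch(data[:50], freqs), 50, sketch(data[50:], freqs), 150)
```

The sketch size depends only on the number of frequencies, not on the number of samples, and only the aggregated vector needs to be stored or shared once it has been computed.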
Another crucial direction is the use of dynamical learning methods, capable of exploiting wisely multiple representations at different scales of the problem at hand. For instance, many low and mixed-precision variants of gradient-based methods have been recently proposed 103, 102, which are however based on a static reduced precision policy, while a dynamic approach can lead to much improved energy-efficiency. Also, despite their massive success, gradient-based training methods still possess many weaknesses (low convergence rate, dependence on the tuning of the learning parameters, vanishing and exploding gradients) and the use of dynamical information promises to allow for the development of alternative methods, such as second-order or multilevel methods, which are as scalable as first-order methods but with faster convergence guarantees 96, 104.
The overall objective in this axis is to adapt in a controlled manner the information that is extracted from datasets or data streams and to dynamically use such information in learning, in order to optimize the tradeoffs between statistical significance, resource-efficiency, privacy-preservation and integration of a priori knowledge.
- 3.1 Compressive and privacy-preserving learning. The goal is to compress training datasets as soon as possible in the processing workflow, before even starting to learn. In the spirit of compressive sensing, this is desirable not only to ensure the frugal use of resources (memory and computation), but also to preserve privacy by limiting the diffusion of raw datasets and controlling the information that could actually be extracted from the targeted compressed representations, called sketches, obtained by well-chosen nonlinear random projections. We aim to build on a compressive learning framework developed by the team with the viewpoint that sketches provide an embedding of the data distribution, which should preserve some metrics, either associated to the specific learning task or to more generic optimal transport formulations. Besides ensuring the identifiability of the task-specific information from a sketch (cf Axis 1.3), an objective is to efficiently extract this information from a sketch, for example via algorithms related to avatars of continuous sparsity as studied in Axis 1.2. A particular challenge, connected with Axis 2.1 when inferring dynamic graphs from correlation of non-stationary time series, and with Axis 3.2 below, is to dynamically adapt the sketching mechanism to the analyzed data stream.
- 3.2 Sequential sparse learning. Whether aiming at dynamically learning on data streams (cf. Axes 2.1 and 2.2), at integrating a priori physical knowledge when learning, or at ensuring domain adaptation for transfer learning, the objective is to achieve a statistically near-optimal update of a model from a sequence of observations whose content can also dynamically vary. When considering time-series on graphs, to preserve resource-efficiency and increase robustness, the algorithms further need to update the current models by dynamically integrating the data stream.
- 3.3 Dynamic-precision learning. The goal is to propose new optimization algorithms to overcome the cost of solving large scale problems in learning, by dynamically adapting the precision of the data. The main idea is to exploit multiple representations at different scales of the problem at hand. We explore in particular two different directions to build the scales of problems: a) exploiting ideas coming from multilevel optimization to propose dynamical hierarchical approaches exploiting representations of the problem of progressively reduced dimension; b) leveraging the recent advances in hardware and the possibility of representing data at multiple precision levels provided by them. We aim at improving over state-of-the-art training strategies by investigating the design of scalable multilevel and mixed-precision second-order optimization and quantization methods, possibly derivative-free.
4 Application domains
The primary objectives of this project, which is rooted in Signal Processing and Machine Learning methodology, are to develop flexible methods, endowed with solid mathematical foundations and efficient algorithmic implementations, that can be adapted to numerous application domains. We are nevertheless convinced that such methods are best developed in strong and regular connection with concrete applications, which are not only necessary to validate the approaches but also to fuel the methodological investigations with relevant and fruitful ideas. The following application domains are primarily investigated in partnership with research groups with the relevant expertise.
4.1 Frugal AI on embedded devices
There is a strong need to drastically compress signal processing and machine learning models (typically, but not only, deep neural networks) to fit them on embedded devices. For example, on autonomous vehicles, due to strong constraints (reliability, energy consumption, production costs), the memory and computing resources of dedicated high-end image-analysis hardware are two orders of magnitude more limited than what is typically required to run state-of-the-art deep network models in real-time. The research conducted in the OCKHAM project finds direct applications in these areas, including: compressing deep neural networks to obtain low-bandwidth video-codecs that can run on smartphones with limited memory resources; sketched learning and sparse networks for autonomous vehicles; or sketching algorithms tailored to exploit optical processing units for energy efficient large-scale learning.
4.2 Imaging in physics and medicine
Many problems in imaging involve the reconstruction of large scale data from limited and noise-corrupted measurements. In this context, the research conducted in OCKHAM pays special attention to modeling domain knowledge such as physical constraints or prior medical knowledge. This finds applications from physics to medical imaging, including: multiphase flow image characterization; near infrared polarization imaging in circumstellar imaging; compressive sensing for joint segmentation and high-resolution 3D MRI imaging; or graph signal processing for radio astronomy imaging with the Square Kilometer Array (SKA).
4.3 Interactions with computational social sciences
Based on collaborations with the relevant experts, the team also regularly investigates applications in computational social science. For example, modeling infectious disease epidemics requires efficient methods to reduce the complexity of large networked datasets while preserving the ability to feed effective and realistic data-driven models of spreading phenomena. In another area, estimating the vote transfer matrices between two elections is an ill-posed problem that requires the design of adapted regularization schemes together with the associated optimization algorithms.
5 Highlights of the year
The paper “On the closed form of flow matching: generalization does not arise from stochasticity” 1 was accepted as an oral presentation at NeurIPS 2025 (top 0.3% of more than 22000 submissions).
The paper “Transformative or conservative? Conservation laws for ResNets and Transformers” 26 was accepted as an oral presentation at ICML 2025 (top 1% of about 12000 submissions).
The paper “Rapture of the deep: highs and lows of sparsity in a world of depths” 4 has been accepted in the Signal Processing Magazine.
Antoine Gonon, former Ph.D. student of the Ockham team, was awarded an Honorable mention (2nd ex-aequo) of the 2025 Ph.D. award of the Société Savante Francophone en Apprentissage Machine.
6 Latest software developments, platforms, open data
6.1 Latest software developments
6.1.1 skglm
- Keywords: Optimization, Machine learning, Sparsity
- Functional Description:
skglm is a Python package offering fast, scikit-learn-compatible estimators for Generalized Linear Models (GLMs). Its main feature is flexibility: virtually any estimator can be implemented as a combination of a datafit and a penalty.
Thanks to this flexible design, skglm supports many models that are missing from scikit-learn while ensuring high performance. There are several reasons to opt for skglm:
  - Fast solvers able to tackle large datasets, either dense or sparse, with millions of features, up to 100 times faster than scikit-learn
  - A user-friendly API that enables composing custom estimators with any combination of existing datafits and penalties
  - A flexible design that makes it easy to implement new datafits and penalties in a few lines of code
  - Estimators fully compatible with the scikit-learn API, usable as drop-in replacements for its GLM estimators
skglm is integrated into scikit-learn via the scikit-learn-contrib organization.
- URL:
- Publication:
- Contact: Mathurin Massias
- Participant: 2 anonymous participants
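The "datafit plus penalty" design described for skglm can be illustrated with a minimal pure-Python sketch. Everything below is hypothetical: the class names `QuadraticDatafit`, `L1Penalty`, `RidgePenalty` and the function `solve` are ours, not skglm's API, and the solver is a plain proximal gradient loop, a simple stand-in for the package's own faster solvers:

```python
import math


class QuadraticDatafit:
    """Hypothetical datafit: f(w) = ||y - Xw||^2 / (2n)."""

    def __init__(self, X, y):
        self.X, self.y = X, y

    def gradient(self, w):
        n, p = len(self.X), len(w)
        residual = [sum(xij * wj for xij, wj in zip(row, w)) - yi
                    for row, yi in zip(self.X, self.y)]
        return [sum(self.X[i][j] * residual[i] for i in range(n)) / n
                for j in range(p)]

    def lipschitz_bound(self):
        # Squared Frobenius norm / n upper-bounds the gradient's Lipschitz constant.
        return sum(x * x for row in self.X for x in row) / len(self.X)


class L1Penalty:
    """Hypothetical penalty: g(w) = alpha * ||w||_1, with soft-thresholding prox."""

    def __init__(self, alpha):
        self.alpha = alpha

    def prox(self, w, step):
        t = self.alpha * step
        return [math.copysign(max(abs(z) - t, 0.0), z) for z in w]


class RidgePenalty:
    """Hypothetical penalty: g(w) = alpha / 2 * ||w||_2^2, with a scaling prox."""

    def __init__(self, alpha):
        self.alpha = alpha

    def prox(self, w, step):
        return [z / (1.0 + self.alpha * step) for z in w]


def solve(datafit, penalty, n_features, n_iter=200):
    """Generic proximal gradient: works for any (datafit, penalty) pair."""
    step = 1.0 / datafit.lipschitz_bound()
    w = [0.0] * n_features
    for _ in range(n_iter):
        grad = datafit.gradient(w)
        w = penalty.prox([wj - step * gj for wj, gj in zip(w, grad)], step)
    return w


X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
datafit = QuadraticDatafit(X, y)
w_lasso = solve(datafit, L1Penalty(alpha=0.5), n_features=2)
w_ridge = solve(datafit, RidgePenalty(alpha=0.5), n_features=2)
```

The solver never needs to know which model it is fitting: swapping the penalty object turns a Lasso into a ridge regression. On this toy orthogonal design the two runs converge to approximately [2.0, 0.0] and [1.5, 0.25] respectively, which matches the closed-form solutions.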
6.1.2 Benchopt
- Keywords: Benchmarking, Machine learning, Optimization
- Functional Description:
Benchopt is a package that makes comparisons of optimization algorithms simpler, more transparent and more reproducible. It is written in Python but can be used with many programming languages. So far it has been tested with Python, R, Julia and compiled binaries written in C/C++ available via a terminal command. If it can be installed via conda, it should just work!
Benchopt is used through a simple command line, and ultimately running and replicating an optimization benchmark should be as easy as cloning a repo and launching the computation with a single command line. For now, Benchopt features benchmarks for around 10 convex optimization problems and we are working on expanding this to feature more complex optimization problems. We are also developing a website to display the benchmark results easily.
- Release Contributions: https://github.com/benchopt/benchopt/releases/tag/1.5.1
- Publication:
- Contact: Thomas Moreau
- Participant: 4 anonymous participants
6.1.3 lazylinop
- Name: lazylinop
- Keywords: Signal processing, Numerical algorithm, Scientific computing
- Scientific Description:
lazylinop provides an easy way to combine existing operators into more complex operators, with direct access to their adjoints.
- Functional Description:
Lazy evaluation of linear operators applied to vectors or matrices. lazylinop aims at providing an easy way to combine existing operators into more complex operators, with direct access to their adjoints. Thanks to the lazy computation paradigm, lazylinop offers potential performance gains and memory savings.
- Release Contributions:
  - Basic linear operators: Kronecker product, addition, diagonal, block-diagonal, concatenation ...
  - Polynomial of linear operators.
  - Usual signal processing linear operators.
  - Usual image processing linear operators.
  - Butterfly linear operators.
  - Near optimal Butterfly (real values) quantization.
  - Lazylinop operators take as input NumPy/CuPy arrays or torch tensors (via array-api)
Work-In-Progress:
  - Near optimal Butterfly (complex values) quantization.
- URL:
- Contact: Pascal Carrivain
- Participant: 4 anonymous participants
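The lazy evaluation paradigm described for lazylinop can be illustrated by the following minimal sketch. It is not lazylinop's API: the class `LazyOp`, its `adjoint` method and the helper constructors `diag` and `cumsum` are hypothetical names chosen for this example, which is restricted to real-valued operators (so that adjoint and transpose coincide):

```python
from itertools import accumulate


class LazyOp:
    """A linear operator stored as two callables: x -> A x and y -> A^T y."""

    def __init__(self, shape, matvec, rmatvec):
        self.shape = shape
        self._matvec = matvec
        self._rmatvec = rmatvec

    def __matmul__(self, other):
        if isinstance(other, LazyOp):
            if self.shape[1] != other.shape[0]:
                raise ValueError("incompatible shapes")
            # Composition only chains callables: no matrix is ever formed.
            return LazyOp(
                (self.shape[0], other.shape[1]),
                lambda x: self._matvec(other._matvec(x)),
                lambda y: other._rmatvec(self._rmatvec(y)),
            )
        # Applying the operator to a vector triggers the actual computation.
        return self._matvec(list(other))

    def adjoint(self):
        """The adjoint is obtained for free by swapping the two callables."""
        return LazyOp((self.shape[1], self.shape[0]), self._rmatvec, self._matvec)


def diag(d):
    """Diagonal operator; it is its own adjoint for real entries."""
    d = list(d)
    scale = lambda x: [di * xi for di, xi in zip(d, x)]
    return LazyOp((len(d), len(d)), scale, scale)


def cumsum(n):
    """Cumulative sum (lower-triangular matrix of ones) and its adjoint."""
    forward = lambda x: list(accumulate(x))
    backward = lambda y: list(accumulate(reversed(y)))[::-1]
    return LazyOp((n, n), forward, backward)


A = cumsum(3) @ diag([1, 2, 3])   # nothing is computed yet
Ax = A @ [1, 1, 1]                # [1, 3, 6]
Aty = A.adjoint() @ [1, 1, 1]     # [3, 4, 3]
```

In this sketch the composed operator costs O(n) memory and time per application, whereas forming the equivalent dense matrix would cost O(n²), which is the kind of gain the lazy paradigm is meant to offer.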
6.1.4 Celer
- Keywords: Mathematical Optimization, Machine learning, Sparsity
- Functional Description:
celer is a Python package that solves Lasso-like problems and provides estimators that follow the popular scikit-learn API. Thanks to a tailored implementation, celer provides a fast solver that tackles large-scale datasets with millions of features up to 100 times faster than scikit-learn. It handles Lasso, ElasticNet, Group Lasso, Multitask Lasso and Sparse Logistic regression, and comes with:
  - automated parallel cross-validation
  - support of sparse and dense data
  - optional feature centering and normalization
  - unpenalized intercept fitting
celer also provides easy-to-use estimators as it is designed under the scikit-learn API.
- URL:
- Publications:
- Contact: Mathurin Massias
- Participant: 2 anonymous participants
6.1.5 TorchDR
- Keywords: Optimal transportation, Machine learning, Dimensionality reduction, High Dimensional Data
- Scientific Description:
TorchDR is an open-source dimensionality reduction (DR) library using PyTorch. Its goal is to accelerate the development of new DR methods by providing a common simplified framework.
- Functional Description:
TorchDR is an open-source dimensionality reduction (DR) library using PyTorch. Its goal is to accelerate the development of new DR methods by providing a common simplified framework.
- URL:
-
Contact:
Titouan Vayer
-
Participant:
5 anonymous participants
6.1.6 FAuST
-
Keywords:
Matrix calculation, Multilayer sparse factorisation
-
Scientific Description:
FAuST approximates a given dense matrix by a product of sparse matrices, with considerable potential gains in terms of storage and speedup for matrix-vector multiplications.
-
Functional Description:
FAuST is a C++ toolbox designed to decompose a given dense matrix into a product of sparse matrices in order to reduce its computational complexity (both for storage and manipulation).
FAuST includes Matlab and Python wrappers and scripts to reproduce the experimental results of the following papers: - Le Magoarou L. and Gribonval R., "Flexible multi-layer sparse approximations of matrices and applications", Journal of Selected Topics in Signal Processing, 2016. - Le Magoarou L., Gribonval R., Tremblay N. "Approximate fast graph Fourier transforms via multi-layer sparse", IEEE Transactions on Signal and Information Processing over Networks, 2018 - Quoc-Tung Le, Rémi Gribonval. Structured Support Exploration For Multilayer Sparse Matrix Factorization. ICASSP 2021 – IEEE International Conference on Acoustics, Speech and Signal Processing, Jun 2021, Toronto, Ontario, Canada. pp.1-5. - Sibylle Marcotte, Amélie Barbe, Rémi Gribonval, Titouan Vayer, Marc Sebban, et al.. Fast Multiscale Diffusion on Graphs. 2021.
-
Release Contributions:
Faust 1.x contains Matlab routines to reproduce experiments of the PANAMA team on learned fast transforms.
Faust 2.x contains a C++ implementation with preliminary Matlab / Python wrappers.
Faust 3.x includes Python and Matlab wrappers around a C++ core with GPU acceleration, new algorithms.
- URL:
-
Publications:
hal-03212764, hal-01416110, hal-01627434, hal-01167948, hal-01254108, tel-01412558, hal-01156478, hal-01104696, hal-01158057, hal-03132013
-
Contact:
Remi Gribonval
-
Participant:
6 anonymous participants
7 New results
7.1 Integrating Structured Models in Machine Learning and Signal Processing
7.1.1 Physics-informed neural networks
Participants: Elisa Riccietti.
Collaboration with Alena Kopanicakova (IRIT, Toulouse), Stefania Bellavia and Mahsa Yousefi (UNIFI, Florence, Italy).
Physics-informed neural networks (PINNs) are specialized network architectures designed for the solution of partial differential equations (PDEs) that take into account the underlying physics of the problem. We investigated their use both for direct and inverse problems involving PDEs.
In the context of the postdoc of Mahsa Yousefi, we pursued the work started last year on the investigation of their ability to deal with ill-posed inverse problems, focusing especially on parameter identification problems. We have proposed a two-step training strategy that first fits the available noisy observations and later adds the physics information. The strategy is shown to improve the solution of such ill-posed problems.
In collaboration with Alena Kopanicakova, we have proposed a book chapter on scientific machine learning with a focus on the training of physics-informed neural networks, guided by the neural tangent kernel theory to correct the spectral bias.
7.1.2 Differentiable and learning-based methods for structure representation: application to sparse precision matrices
Participants: Can Pouliquen, Paulo Goncalves, Mathurin Massias, Titouan Vayer.
The PhD of Can Pouliquen, defended successfully in December 2025, is devoted to the estimation of structures from signals, such as sparse precision matrices. For the latter problem we have adopted the mathematical framework of the Graphical Lasso, and pursued several directions. We have introduced SpodNet, a new deep neural network architecture for positive definite matrix estimation. In particular, it is the first architecture that can guarantee a simultaneously sparse and symmetric positive definite output. This highly desirable property was so far a missing feature of existing architectures, and has many potential applications in graph learning beyond neurosciences 28. This work was accepted to ICLR 2025. We have also developed a bilevel optimization framework that eases the tuning of individual correlation strengths in the Graphical Lasso penalty 94. Finally, we have proposed a fast and modular benchmark for the Graphical Lasso, together with high quality open source implementations of fast solvers 35.
7.1.3 New penalties and proximal operators
Participants: Anne Gagneux, Remi Gribonval, Mathurin Massias.
Collaboration with Emmanuel Soubies (CNRS, IRIT, Toulouse).
Finishing the internship work of Anne Gagneux, we have studied the properties of sorted non convex penalties. Convex sorted penalties such as SLOPE are known to automatically cluster coefficients associated with correlated variables; non convex penalties on the other hand mitigate the well-known amplitude bias of the L1 norm. Combining non-convexity with automatic grouping is therefore a promising avenue. However, such new penalties raise many technical difficulties (non convexity, non smoothness). We have derived an algorithm based on the Pool Adjacent Violators Algorithm (PAVA) that computes the exact proximal operator of a first kind of sorted penalties (sorted MCP, sorted Log-sum). We have also extended it to compute the proximal operators of the sorted () penalties, which presented more difficulties due to non Lipschitzianity. This work has been submitted to IEEE TSP 44.
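To illustrate the role of PAVA in such proximal computations, here is a minimal sketch restricted to the convex sorted L1 (SLOPE) penalty, whose proximal operator is classically obtained by sorting, shifting, and projecting onto non-increasing sequences. It is not the algorithm of the submitted paper for sorted non convex penalties, and the names `pava_nonincreasing` and `prox_slope` are illustrative.

```python
import numpy as np


def pava_nonincreasing(w):
    """Least-squares projection of w onto non-increasing sequences (PAVA)."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in w:
        blocks.append([float(value), 1])
        # Merge while the last block mean exceeds the previous block mean.
        while len(blocks) > 1 and (
            blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return np.concatenate([np.full(c, s / c) for s, c in blocks])


def prox_slope(v, lambdas):
    """Prox of the sorted L1 norm; lambdas must be non-increasing and >= 0."""
    v = np.asarray(v, dtype=float)
    signs = np.sign(v)
    order = np.argsort(-np.abs(v))        # indices sorting |v| in decreasing order
    shifted = np.abs(v)[order] - lambdas  # may violate monotonicity
    fitted = np.maximum(pava_nonincreasing(shifted), 0.0)
    out = np.empty_like(v)
    out[order] = fitted
    return signs * out


# Two close coefficients are merged into a single cluster.
clustered = prox_slope([3.0, 2.9], np.array([2.0, 0.0]))
# With equal weights the operator reduces to soft-thresholding.
soft = prox_slope([3.0, -1.0, 0.2], np.array([0.5, 0.5, 0.5]))
print(clustered, soft)
```

The first call shows the automatic grouping effect of sorted penalties: the pooling step of PAVA assigns the same magnitude to both coefficients.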
7.1.4 Inverse problems for medical imaging
Participants: Marion Foare.
Collaboration with Luis Enrique Amador Arya (Creatis, Villeurbanne), Hélène Ratiney (Creatis, Villeurbanne), Éric Van Reeth (Creatis, Villeurbanne), and Siemens Healthcare, Saint Denis
It is of particular interest in the field of medical imaging to quickly acquire low-resolution volumes (compromise between acquisition time, SNR and spatial resolution), and enhance their resolution as a post-processing step. In particular, isotropic super-resolution (ISR) techniques consist in reconstructing an isotropic volume from the combination of several anisotropic volumes acquired with different orientations.
In the context of the PhD work of Luis Enrique Amador Araya, we pursued the development of specialized piecewise-smooth variational methods combining data fitting terms with geometric priors (e.g. the Discrete Mumford-Shah model) to build faithful super-resolution images in 3D Magnetic Resonance Imaging (MRI).
In particular, we explored new regularization terms to extend this approach to multi-contrast ISR, that is, to reconstruct isotropic, multi-contrast high-resolution images from multi-contrast anisotropic acquisitions. Preliminary results were accepted for publication at the conference ISBI 2026.
7.1.5 Gromov hyperbolicity for tree representation of relational data
Participants: Titouan Vayer.
Collaboration with Pierre Houedry, Nicolas Courty, Florestan Martin-Baillon, Laetitia Chapel from Université Bretagne Sud.
Trees and the associated shortest-path tree metrics provide a powerful framework for representing hierarchical and combinatorial structures in data. Designing algorithms that can produce a tree from pairwise relationships between data points is an active subject of interest. However, most common approaches are either heuristic and lack guarantees, or perform only moderately well. In 24 we develop a geometrical framework for learning such trees, based on the notion of Gromov hyperbolicity, which encodes to what extent a metric space deviates from a tree structure. We introduce a novel differentiable optimization framework, coined DeltaZero, that solves this problem. Experiments on synthetic and real-world datasets demonstrate that our method consistently achieves state-of-the-art distortion. This work was accepted at NeurIPS 2025.
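As a small illustration of the notion of Gromov hyperbolicity, the sketch below evaluates the four-point condition by brute force on a distance matrix, using Gromov products. It only computes the quantity; it is not the DeltaZero method, and the function name `gromov_hyperbolicity` is illustrative.

```python
import numpy as np


def gromov_hyperbolicity(D):
    """Smallest delta such that the four-point condition holds for metric D."""
    # G[w, x, y] = Gromov product (x|y)_w = (d(x,w) + d(y,w) - d(x,y)) / 2
    G = 0.5 * (D[:, :, None] + D[:, None, :] - D[None, :, :])
    delta = 0.0
    for Gw in G:  # one base point w at a time
        # max over y of min((x|y)_w, (y|z)_w), for every pair (x, z)
        maxmin = np.minimum(Gw[:, :, None], Gw[None, :, :]).max(axis=1)
        delta = max(delta, float((maxmin - Gw).max()))
    return delta


# Shortest-path metric of a path graph (a tree) on 5 nodes.
nodes = np.arange(5)
tree_metric = np.abs(np.subtract.outer(nodes, nodes)).astype(float)

# Shortest-path metric of a cycle on 4 nodes (not a tree).
cycle_metric = np.array(
    [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]], dtype=float
)

delta_tree = gromov_hyperbolicity(tree_metric)
delta_cycle = gromov_hyperbolicity(cycle_metric)
print(delta_tree, delta_cycle)
```

A tree metric yields a value of zero, while the cycle yields a strictly positive value. The brute-force cost grows as the fourth power of the number of points, so this is only suitable for small examples.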
7.1.6 Contrastive pre-training of transformer encoders for SEEG-based seizure onset zone detection
Participants: Paulo Goncalves.
Collaboration with Pierre Borgnat (ENS de Lyon), Julien Jung (Hôpital Neurologique, HCL, CRNL).
Within the context of his Master 2 internship, Zacharie Rodière pursued the work of Gaetan Frusque, a former PhD student in our group 70 on the clinical study of epilepsy. Zacharie developed a transformer encoder for the detection of Seizure Onset Zone (SOZ) from stereo-EEG. It integrates clinically grounded time-frequency features with spatial contrastive pre-training. While prior spatial transformer approaches analyse learned representations, the proposed method uniquely combines: (1) engineered time-frequency representations (TFRs) encoding epileptic spikes and oscillations, and (2) a contrastive objective leveraging anatomical relationships between the electrode contacts that are either inside the SOZ or outside. Attention heads provide interpretable connectivity patterns, bridging data-driven learning with the study of functional connectivity networks. Zacharie presented his preliminary results at the Graph Signal Processing Workshop 2025 37.
7.2 Deep neural networks: theory and algorithms
7.2.1 Mathematics of deep learning: rescaling invariances, generalization bounds, and conservation laws
Participants: Rémi Gribonval, Elisa Riccietti, Sibylle Marcotte, Arthur Lebeurrier, Titouan Vayer.
Collaborations with Nicolas Brisebarre (ARIC team, ENS de Lyon), and with Gabriel Peyré (DMA, ENS, Paris)
Rescaling invariance in ReLU networks. Neural networks with the ReLU activation function are described by weight and bias parameters, and realize a continuous piecewise linear function. Natural scaling and permutation operations on the parameters leave the realization unchanged, leading to equivalence classes of parameters that yield the same realization.
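The invariance can be checked numerically on a toy example. The sketch below assumes a one-hidden-layer ReLU network with arbitrary random parameters (all names and sizes are illustrative): each hidden neuron has its incoming weights and bias multiplied by a positive factor and its outgoing weights divided by the same factor, and the neurons are then permuted.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def shallow_relu(x, W1, b1, W2, b2):
    """One-hidden-layer ReLU network."""
    return W2 @ relu(W1 @ x + b1) + b2


# Arbitrary parameters: 3 inputs, 5 hidden neurons, 2 outputs.
W1, b1 = rng.standard_normal((5, 3)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((2, 5)), rng.standard_normal(2)

# Neuron-wise rescaling by positive factors.
scales = rng.uniform(0.1, 10.0, size=5)
W1_s, b1_s, W2_s = scales[:, None] * W1, scales * b1, W2 / scales[None, :]

# Permutation of the hidden neurons.
perm = rng.permutation(5)
W1_p, b1_p, W2_p = W1_s[perm], b1_s[perm], W2_s[:, perm]

x = rng.standard_normal(3)
out_original = shallow_relu(x, W1, b1, W2, b2)
out_transformed = shallow_relu(x, W1_p, b1_p, W2_p, b2)
print(np.max(np.abs(out_original - out_transformed)))
```

The two outputs agree up to floating-point rounding, because ReLU is positively homogeneous: relu(s z) = s relu(z) for every s > 0.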
Path-embedding and path-norm based generalization bounds. The path-embedding of parameters that we introduced in 99 was invariant to such scalings but limited to strictly layered ReLU architectures. In the context of the PhD of Antoine Gonon 73 (defended on 12/11/2024), we extended it 72 to fully encompass general DAG ReLU networks with biases, skip connections and any operation based on the extraction of order statistics: max pooling, GroupSort etc. The norm of the resulting embedding is called a path-norm, and we established a general toolkit to obtain statistical generalization bounds for such modern neural networks. The resulting bounds are not only the most widely applicable path-norm based ones, but also recover or beat the sharpest known bounds of this type. These extended path-norms further enjoy the usual benefits of path-norms: ease of computation, invariance under the symmetries of the network, and improved sharpness on feed-forward networks compared to the product of operators’ norms, another complexity measure most commonly used. The versatility of the toolkit and its ease of implementation allowed us to challenge the concrete promises of path-norm-based generalization bounds, by numerically evaluating the sharpest known bounds for ResNets on ImageNet. Building on this toolkit, we more recently investigated a rescaling-invariant Lipschitz bound on the mapping from parameter space to function space and illustrated its potential for neural network pruning and quantization 22 in a paper published at ICML 2025.
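For intuition, the following sketch computes the L1 path-norm in the simplest setting only, a layered ReLU network without biases, where it equals the sum over all input-output paths of the product of absolute weights. The general DAG construction with biases, skip connections and pooling is not reproduced here, and the function names are illustrative.

```python
import numpy as np


def l1_path_norm(weights):
    """L1 path-norm of a bias-free layered network: 1^T |W_L| ... |W_1| 1."""
    v = np.ones(weights[0].shape[1])
    for W in weights:
        v = np.abs(W) @ v
    return float(v.sum())


def product_of_spectral_norms(weights):
    """Product of layer-wise operator norms, for comparison."""
    return float(np.prod([np.linalg.norm(W, 2) for W in weights]))


rng = np.random.default_rng(0)
W1 = rng.standard_normal((6, 4))
W2 = rng.standard_normal((5, 6))
W3 = rng.standard_normal((1, 5))

# Rescale the first hidden layer neuron by neuron (same realized function).
d = rng.uniform(0.1, 10.0, size=6)
rescaled = [d[:, None] * W1, W2 / d[None, :], W3]

norm_before = l1_path_norm([W1, W2, W3])
norm_after = l1_path_norm(rescaled)
print(norm_before, norm_after)
print(product_of_spectral_norms([W1, W2, W3]), product_of_spectral_norms(rescaled))
```

The path-norm is unchanged by the rescaling, whereas the product of operator norms generally differs between two parameterizations of the same function.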
Conservation laws. In the thesis of Sibylle Marcotte (defended on 21/11/2025), the above path-embedding also served as a key enabler for the analysis of conservation laws in gradient descent dynamics of ReLU networks 91. Understanding the geometric properties of gradient descent dynamics is indeed a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of the optimization initialization. This "implicit bias" is believed to be responsible for some favorable properties of the trained models and could explain their good generalization properties.
Our initial work on this topic 91 was conducted with a motivation that was threefold. First, we rigorously exposed the definition and basic properties of "conservation laws", which are maximal sets of independent quantities conserved during gradient flows of a given model (e.g. of a ReLU network with a given architecture) with any training data and any loss. Then we explained how to find the exact number of these quantities by performing finite-dimensional algebraic manipulations on the Lie algebra generated by the Jacobian of the model. Finally, we provided algorithms (implemented in SageMath) to: a) compute a family of polynomial laws; b) compute the number of (not necessarily polynomial) conservation laws. We provided showcase examples that we fully work out theoretically. Besides, applying the two algorithms confirmed for a number of ReLU network architectures that all known laws are recovered by the algorithm, and that there are no other laws. Such computational tools paved the way to understanding desirable properties of optimization initialization in large machine learning models.
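A classical example of such a conserved quantity, for a two-layer ReLU network with hidden weights w_j and output weights u_j, is the difference between the squared norm of w_j and the square of u_j. The sketch below is a plain NumPy check on random data with a squared loss (an assumption made for the example), not the SageMath implementation mentioned above. It verifies the identity that makes the time derivative of this quantity vanish along a gradient flow.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 3, 4
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W, u = rng.standard_normal((h, d)), rng.standard_normal(h)


def gradients(W, u, X, y):
    """Gradients of 0.5 * ||relu(X W^T) u - y||^2 with respect to W and u."""
    pre = X @ W.T                    # pre-activations, shape (n, h)
    act = np.maximum(pre, 0.0)
    residual = act @ u - y           # shape (n,)
    grad_u = act.T @ residual        # shape (h,)
    grad_W = ((pre > 0) * residual[:, None] * u[None, :]).T @ X  # shape (h, d)
    return grad_W, grad_u


grad_W, grad_u = gradients(W, u, X, y)

# Under gradient flow, d/dt (||w_j||^2 - u_j^2) = -2 (<w_j, dL/dw_j> - u_j dL/du_j).
inner_w = np.sum(W * grad_W, axis=1)
inner_u = u * grad_u
time_derivative = -2.0 * (inner_w - inner_u)
print(time_derivative)
```

The printed derivative is zero up to rounding for every hidden neuron, regardless of the data, which reflects the rescaling invariance of the network.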
We then studied 92 the notion of conservation law and the corresponding algorithms for optimization flows associated to non-Euclidean geometries and momentum-based dynamics. We characterized "all" conservation laws in this general setting. In stark contrast to the case of gradient flows, we proved that the conservation laws for momentum-based dynamics exhibit temporal dependence. Additionally, we often observed a "conservation loss" when transitioning from gradient flow to momentum dynamics. Specifically, for linear networks, our framework allowed us to identify all momentum conservation laws, which are less numerous than in the gradient flow case except in sufficiently over-parameterized regimes. With ReLU networks, no conservation law remains. This phenomenon also manifests in non-Euclidean metrics, used e.g. for Nonnegative Matrix Factorization (NMF): all conservation laws can be determined in the gradient flow context, yet none persists in the momentum case.
This year, we extended the analysis 26 to extensively cover ResNets and attention layers. For this, we first showed that basic building blocks such as ReLU (or linear) shallow networks, with or without convolution, have easily expressed conservation laws, and no more than the known ones. In the case of a single attention layer, we also completely described all conservation laws, and we showed that residual blocks have the same conservation laws as the same block without skip connection. We then introduce the notion of conservation laws that depend only on a subset of parameters (corresponding e.g. to a pair of consecutive layers, to a residual block, or to an attention layer). We demonstrate that the characterization of such laws can be reduced to the analysis of the corresponding building block in isolation. Finally, we examined how these newly discovered conservation principles, initially established in the continuous gradient flow regime, persist under discrete optimization dynamics, particularly in the context of Stochastic Gradient Descent (SGD).
This year we also investigated the consequences of conservation laws to characterize whether a (path-)lifted representation has an intrinsic training dynamics 45, as a stepping stone to so-called implicit bias analysis. We expressed a so-called intrinsic dynamic property and showed how it is related to the study of conservation laws associated with the lifting function. This led to a simple criterion, based on the inclusion of kernels of linear maps, which yields a necessary condition for this property to hold. Applying our theory to general ReLU networks of arbitrary depth, with the path lifting, we showed that the dynamic is intrinsic for any initialization. In the case of linear networks with a natural lifting defined as the product of weight matrices, so-called balanced initializations were also known to enable such an intrinsic dynamic; we generalized this result to a broader class of relaxed balanced initializations, showing that, in certain configurations, these are the only initializations that ensure the intrinsic dynamic property. Finally, for the linear neural ODE associated with the limit of infinitely deep linear networks, with relaxed balanced initialization, we explicitly expressed the corresponding intrinsic dynamics.
Path-conditioning for faster training. Finally, in the context of the PhD thesis of Arthur Lebeurrier, we are investigating how to leverage the path-lifting framework to better understand the dynamics of neural networks and eventually accelerate the training of the parameters. We plan to submit this work to ICML 2026.
7.2.2 Quantized networks: theory and algorithms
Participants: Rémi Gribonval, Elisa Riccietti, Giuseppe Carrino, Mael Chaumette.
Collaboration with Nicolas Brisebarre (ARIC team, ENS de Lyon), with Silviu Filip and El-Mehdi El arar (IRISA, Rennes), and with Theo Mary (LIP6, Paris)
Motivated by the importance of quantizing networks besides pruning them to achieve sparsity, we studied different aspects related to this topic.
Quantization of neural networks: the multi-linear case. As a first step towards a better understanding of nonlinear quantized networks, we studied the simpler multi-linear case. In particular, we investigated the problem of optimally quantizing low rank matrices by exploiting scaling invariances inherent to the optimization problem. We proposed 76, 77 an optimal solution algorithm with polynomial complexity in the dimension of the problem and exponential complexity in the number of bits. We showed that it provides much more accurate quantizations than the simple round-to-nearest strategy. We also used this algorithm in combination with the hierarchical procedure in 90 to design a heuristic strategy to efficiently quantize the family of butterfly matrices, which very often occur in fast transforms and machine learning applications, for instance to sparsify dense neural networks. Our work may help to improve the compression rate in this context by coupling sparsification and quantization. The corresponding algorithms have been incorporated in the quantization module of the lazylinop library 6.1.3.
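The benefit of exploiting scaling invariance can be seen on a rank-one matrix x y^T, which is unchanged when x is multiplied by a factor and y divided by it, whereas the quantized factors do change. The sketch below is only a brute-force grid search over this factor, not the optimal algorithm of 76, 77; the uniform rounding grid, its spacing `step` and the search range are assumptions made for the example.

```python
import numpy as np


def quantize(v, step):
    """Round each entry to the nearest multiple of `step` (round-to-nearest)."""
    return step * np.round(v / step)


def rank_one_error(x, y, lam, step):
    """Frobenius error after quantizing the rescaled factors (lam * x, y / lam)."""
    xq, yq = quantize(lam * x, step), quantize(y / lam, step)
    return float(np.linalg.norm(np.outer(x, y) - np.outer(xq, yq)))


rng = np.random.default_rng(0)
x, y = rng.standard_normal(8), rng.standard_normal(8)
step = 0.25  # hypothetical grid spacing standing in for a low-bit format

# Plain round-to-nearest corresponds to lam = 1.
rtn_error = rank_one_error(x, y, 1.0, step)

# Brute-force search over rescalings; lam = 1 is included in the candidates.
lambdas = np.concatenate(([1.0], np.geomspace(0.25, 4.0, 400)))
errors = [rank_one_error(x, y, lam, step) for lam in lambdas]
best_lam, best_error = lambdas[int(np.argmin(errors))], min(errors)
print(rtn_error, best_error, best_lam)
```

Since the candidate set contains the unscaled case, the searched error can never exceed the round-to-nearest error; how much is gained depends on the matrix and on the grid.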
In the context of the thesis of Mael Chaumette we extended this approach to complex-valued matrices 30. This extension is important since most of the fast transforms that involve butterfly matrices, such as the Fourier transform, are complex-valued and cannot be quantized by the previously proposed strategy. The extension from the real case was not straightforward: it raised new questions and required new algorithms. A journal version is in preparation, as well as an implementation in the lazylinop library 6.1.3.
Quantization of neural networks: mixed-precision inference. In order to further exploit the benefits of quantization in neural networks and the multiple reduced numerical formats made available by modern computer architectures, we studied the introduction of mixed precision in the inference of neural networks 42. We proposed an analysis of the propagation of the error in the forward pass of neural networks, which suggests a good rule to choose the numerical format of each row of the weight matrices, yielding a mixed-precision procedure that provides the same accuracy as classical inference but with a lower energy consumption.
Quantization of neural networks: mixed-precision training. As a first step towards mixed-precision training of neural networks, in the context of the master internship and of the PhD thesis of Giuseppe Carrino, we have studied the convergence theory of Newton's method in finite precision 38. This analysis makes it possible to understand the impact of the different errors on the convergence and thus to guide the choice of the precision in each step of the method, leading to a mixed-precision algorithm. Further research will deal with an extension to the stochastic case, which would be adapted to the training of neural networks.
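The idea of assigning a different precision to each step of Newton's method can be sketched as follows. The split used here (residual and update in double precision, linear solve in single precision) and the toy nonlinear system are assumptions chosen for illustration; they are not the precision choices derived in 38.

```python
import numpy as np


def F(x):
    """Toy nonlinear system with root (sqrt(2), sqrt(2))."""
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])


def J(x):
    """Jacobian of F."""
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])


def mixed_precision_newton(x0, n_iter=10):
    x = np.asarray(x0, dtype=np.float64)
    for _ in range(n_iter):
        residual = F(x)  # evaluated in float64
        # Linear solve carried out in float32 (the cheaper step).
        step = np.linalg.solve(
            J(x).astype(np.float32), residual.astype(np.float32)
        )
        x = x - step.astype(np.float64)  # update accumulated in float64
    return x


root = mixed_precision_newton([1.0, 2.0])
print(root, np.linalg.norm(F(root)))
```

On this example the iterates reach double-precision accuracy even though every linear system is solved in single precision, because the residual is recomputed accurately at each iteration.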
7.2.3 Sparse regularization, unfolding, and approximation theory
Participants: Marion Foare.
Collaborations with Nelly Pustelnik (Physics lab, ENS de Lyon) and Audrey Repetti (Heriot-Watt University, Edinburgh).
In the PhD work of Hoang Trieu Vy Le, we investigated several unfolding strategies of standard proximal algorithms and their associated accelerated versions in the context of image denoising and deconvolution. The goal was to study the impact of accelerated schemes on learning performance and robustness. Currently, we are studying various unrolling approaches to tackle the joint task of image restoration and edge detection. First, we proposed a two-step procedure mimicking the Blake-Zisserman minimization strategy, and relying on a smoothing Proximal Neural Network, followed by an edge detection layer 86.
On the other hand, we are working on the unrolling procedure of the (non-convex) Mumford-Shah model, which makes it possible to jointly perform image restoration and edge detection using a single model-based proximal neural network. The proposed architecture is significantly lighter than recent learning models designed only for edge detection, both in terms of number of learnable parameters and inference time. This work was published at EUSIPCO 2025 87.
7.2.4 Deep sparsity: from hardness to deformable butterfly algorithms
Participants: Rémi Gribonval, Elisa Riccietti, Pascal Carrivain.
Collaboration with Léon Zheng (Huawei), Quoc-Tung Le (TSE, Toulouse)
Matrix factorization with sparsity constraints plays an important role in many machine learning and signal processing problems such as dictionary learning, data visualization, dimension reduction.
We have investigated this subject in depth in recent years, in the context of the theses of Quoc-Tung Le 85 and Léon Zheng 106.
Building on this series of works on the hardness, tractability, and uniqueness properties of sparse matrix factorizations under various sparsity constraints 108, 89, 90, we prepared this year a tutorial paper 4 for the Signal Processing Magazine (SPM) Special Issue "Mathematics of Deep Learning", in which we give an overview of the role of sparsity in a deep learning context.
This work includes our previous results on the subject.
First of all, it includes the extension of the tractable algorithm for so-called butterfly sparsity patterns (which somehow factorizes a given matrix essentially at the cost of a single matrix-vector multiplication, with exact recovery guarantees) to so-called deformable butterflies. We have studied its performance guarantees beyond the case of matrices admitting an exact factorization 17. The corresponding algorithm has been incorporated in the lazylinop software library 6.1.3.
Second, it also includes our study of how to fully exploit the specific structure of butterfly factors and translate it into practical time gains, published at ICML 2025 23. Specifically, we have studied how to optimize memory access to the matrix elements and we implemented a CUDA kernel to multiply on GPU a dense matrix with a deformable butterfly factor. This is also available in lazylinop 6.1.3. In the paper we benchmark our implementation against existing matrix-vector multiplication algorithms to select the optimal one.
Going beyond the linear case, the paper also includes our results on neural networks. We have indeed shown that the pitfalls that we had identified for certain sparse matrix factorization problems 90 also hold for certain sparse ReLU neural network training problems 88. In particular, there exist settings where the optimization is necessarily unstable, in the sense that minimizing the loss function can only be achieved by letting some coefficients diverge to infinity.
Finally, the paper also includes the heuristics we developed to handle butterfly approximations for matrices under unknown permutations of rows and/or columns 107.
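As a reminder of what a butterfly sparsity pattern looks like, the sketch below writes the Hadamard matrix of size 16 as an exact product of 4 sparse factors with two nonzeros per row. It only illustrates the structure with dense NumPy arrays; it uses neither the factorization algorithm discussed above nor the lazylinop API, and the helper `butterfly_factor` is illustrative.

```python
from functools import reduce

import numpy as np

H2 = np.array([[1.0, 1.0], [1.0, -1.0]])


def butterfly_factor(k, i):
    """i-th butterfly factor of the 2^k x 2^k Hadamard matrix."""
    return np.kron(np.kron(np.eye(2 ** i), H2), np.eye(2 ** (k - 1 - i)))


k = 4
n = 2 ** k
factors = [butterfly_factor(k, i) for i in range(k)]

# Dense Hadamard matrix as a k-fold Kronecker product of H2.
H_dense = reduce(np.kron, [H2] * k)
# The same matrix as a product of k sparse factors.
H_product = reduce(np.matmul, factors)

nnz_per_row = [int((f != 0).sum(axis=1).max()) for f in factors]
total_nnz = sum(int((f != 0).sum()) for f in factors)
print(nnz_per_row, total_nnz, n * n)

# Applying the factors one after the other gives the matrix-vector product.
x = np.arange(n, dtype=float)
y_fast = reduce(lambda v, f: f @ v, reversed(factors), x)
```

The factors hold 2 n log2(n) nonzeros in total (128 here) instead of the 256 entries of the dense matrix, which is the source of the storage and speed gains when the factors are stored in a sparse format.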
7.2.5 Plug and play methods
Participants: Elisa Riccietti, Rémi Gribonval, Mathurin Massias, Anne Gagneux.
Collaboration with Emmanuel Soubies (CNRS, IRIT), Nelly Pustelnik and Julian Tachella (CNRS, ENS Lyon), Nils Laurent (LASPI Roanne)
In imaging tasks, Plug and Play (PnP) methods leverage the strength of pre-trained denoisers, often deep neural networks, by integrating them in optimization schemes, ensuring better reconstructions than classical variational methods.
In the early PhD work of Anne Gagneux, we have investigated the use of neural networks to implement convex functions. Learning convex functions has many applications in imaging (notably in Plug and Play methods) and in optimal transport. In 13 we have studied the expressive power of Input Convex Neural Networks (ICNNs), a special architectural constraint. In particular, we have shown that ICNNs are restrictive, and may require more neurons than unconstrained networks to implement a given convex function.
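For readers unfamiliar with the ICNN constraint, the sketch below builds a small network in the usual way: ReLU activations, unconstrained weights on the input, and nonnegative weights on every connection coming from a previous hidden layer. It then checks midpoint convexity on random pairs of points. Layer sizes and random parameters are arbitrary, and this illustrates the architecture only, not the expressivity results of 13.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_icnn(d_in, widths, rng):
    """Random ICNN parameters; hidden-to-hidden weights are nonnegative."""
    layers, prev = [], None
    for width in widths:
        Wx = rng.standard_normal((width, d_in))  # unconstrained input weights
        b = rng.standard_normal(width)
        Wz = None if prev is None else np.abs(rng.standard_normal((width, prev)))
        layers.append((Wz, Wx, b))
        prev = width
    a = np.abs(rng.standard_normal(prev))  # nonnegative output weights
    return layers, a


def icnn(x, layers, a):
    """Scalar output, convex in x by construction."""
    z = None
    for Wz, Wx, b in layers:
        pre = Wx @ x + b
        if Wz is not None:
            pre = pre + Wz @ z
        z = np.maximum(pre, 0.0)  # ReLU is convex and nondecreasing
    return float(a @ z)


layers, a = make_icnn(d_in=3, widths=[8, 8, 4], rng=rng)

# Midpoint convexity: f((p + q) / 2) <= (f(p) + f(q)) / 2.
pairs = rng.standard_normal((200, 2, 3))
violations = [
    icnn(0.5 * (p + q), layers, a) - 0.5 * (icnn(p, layers, a) + icnn(q, layers, a))
    for p, q in pairs
]
max_violation = max(violations)
print(max_violation)
```

The largest violation is nonpositive up to rounding, since each layer is a nonnegative combination of convex functions composed with a convex nondecreasing activation.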
One of the main pitfalls of PnP methods is their slow rate of convergence and high computational cost. To overcome this, in the context of the postdoc of Nils Laurent, we have studied the use of multilevel schemes in conjunction with plug and play (PnP) methods. Since these methods involve neural networks, the strategy to integrate multilevel schemes is naturally different from the one used so far in classical image denoising problems. We have proposed 18 a multilevel PnP method that leverages images of smaller sizes and lighter denoisers at coarse levels.
7.2.6 Generative models
Participants: Anne Gagneux, Rémi Gribonval, Mathurin Massias.
Collaboration with Quentin Bertrand, Rémi Emonet (INRIA Malice, Université Jean Monnet), Ségolène Martin, Paul Hagemann, Gabriele Steidl (TU Berlin).
Since mid 2024, the team has started to study generative modelling, with an initial focus on diffusion and flow matching methods for image generation. In 6 (work done during a summer internship at TU Berlin), OCKHAM PhD student Anne Gagneux has proposed to use generative models, namely flow matching (FM), in the PnP framework. This is achieved by defining a time-dependent denoiser using a pre-trained FM model. The algorithm alternates between gradient descent steps on the data-fidelity term, reprojections onto the learned FM path, and denoising. On tasks such as denoising, super-resolution, deblurring, and inpainting, the algorithm demonstrates superior results compared to existing PnP algorithms and Flow Matching based state-of-the-art methods. The algorithm has been released publicly on GitHub.
In a collaboration with members of the Inria MALICE team, we have written an introductory blog post on flow matching, which is now considered as one of the reference materials on the topic 51.
In 1, we have shown that perfectly trained flow matching (and diffusion) models admit a closed-form solution, which can only generate points from their training data. We have shown that these models produce new data when they fail to perfectly learn their target, and that failure at small generation times was particularly important. This work was accepted as an oral presentation at NeurIPS 2025 (top 0.3% of submitted papers). We have pursued this research direction in 43, adopting a denoising perspective on the task of generating images: complementary to 6, we show how to build a generative model from a denoiser, and leverage this framework to produce new insights on the generation dynamics of flow matching.
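The closed-form solution can be made concrete in a toy setting. The sketch below assumes the linear interpolation path x_t = (1 - t) x_0 + t x_1 with standard Gaussian x_0 and x_1 drawn uniformly from four hypothetical 2D training points. In that case the optimal velocity field is a softmax-weighted average of the directions towards the training points. This is a minimal illustration under these assumptions, not the experimental code of 1.

```python
import numpy as np


def closed_form_velocity(x, t, data):
    """Optimal FM velocity for x_t = (1 - t) x_0 + t x_1, x_0 ~ N(0, I),
    x_1 uniform on the rows of `data`."""
    # Posterior weights of each training point given x_t = x.
    sq_dist = ((x[:, None, :] - t * data[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dist / (2.0 * (1.0 - t) ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    posterior_mean = weights @ data              # E[x_1 | x_t = x]
    return (posterior_mean - x) / (1.0 - t)


# Four hypothetical training points.
data = np.array([[2.0, 2.0], [2.0, -2.0], [-2.0, 2.0], [-2.0, -2.0]])

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 2))  # samples from the Gaussian source

# Explicit Euler integration of the ODE from t = 0 to t = 1.
n_steps = 200
for k in range(n_steps):
    t = k / n_steps
    x = x + closed_form_velocity(x, t, data) / n_steps

dist_to_train = np.linalg.norm(
    x[:, None, :] - data[None, :, :], axis=-1
).min(axis=1)
print(dist_to_train.max())
```

Every generated sample ends on one of the four training points, which is the memorization behaviour of the exact minimizer; a trained network departs from this field, and this is where new samples come from.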
We are currently pursuing several directions: conditional generation (text-to-image models), links with optimal transport through Schrödinger bridges, discrete flow matching for text generation, and application to molecule discovery.
7.3 Statistical learning, dimension reduction, and privacy preservation
7.3.1 Theoretical foundations of compressive learning: sketches, kernels, and optimal transport
Participants: Hugo Lebeau, Rémi Gribonval, Titouan Vayer.
The compressive learning framework proposes to deal with the large scale of datasets by compressing them into a single vector of generalized random moments, called a sketch, from which the learning task is then performed. In past works we established statistical guarantees on the generalization error of this procedure, first in a general abstract setting illustrated on PCA 2, then for the specific case of compressive k-means and compressive Gaussian Mixture Modeling 75. The overall framework is described in a tutorial paper 3.
Theoretical guarantees in compressive learning fundamentally rely on comparing certain metrics between probability distributions as explored in a previous paper 10. Preliminary works on the relations between sketching and random matrix theory were conducted this year. We began to investigate the sharpness of the existing theoretical guarantees by looking at different metrics between probability distributions, which naturally arise when one tries to bound the excess risk of sketching methods.
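A common choice of generalized random moments is the average of random Fourier features, which is what the sketch below assumes; the Gaussian frequency distribution and all sizes are illustrative. The example also shows why the approach suits large or distributed datasets: sketches of two chunks can be merged by a weighted average, without revisiting the data.

```python
import numpy as np


def sketch(X, Omega):
    """Empirical average of random Fourier features exp(i * Omega^T x)."""
    return np.exp(1j * (X @ Omega)).mean(axis=0)


rng = np.random.default_rng(0)
d, m = 5, 32                           # data dimension, sketch size
Omega = rng.standard_normal((d, m))    # random frequencies, drawn once

X_a = rng.standard_normal((1000, d))
X_b = rng.standard_normal((300, d)) + 2.0

# Sketch of the full dataset.
z_full = sketch(np.vstack([X_a, X_b]), Omega)

# Same sketch obtained by merging the sketches of the two chunks.
n_a, n_b = len(X_a), len(X_b)
z_merged = (n_a * sketch(X_a, Omega) + n_b * sketch(X_b, Omega)) / (n_a + n_b)
print(z_full.shape, np.max(np.abs(z_full - z_merged)))
```

The sketch has a fixed size m whatever the number of samples, and the learning step then operates on this vector only.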
7.3.2 Practical exploration of sketching and methods with limited resources
Participants: Etienne Lasalle, Rémi Gribonval, Titouan Vayer, Paulo Goncalves.
Collaborations with Rémi Vaudaine (previously postdoctoral researcher), Marton Karsai (CEU, Vienna, Austria) and Pierre Borgnat (Physics Lab, ENS de Lyon)
We explored the sketching approach in the context of graph clustering, a key task in graph analysis. Many methods, like spectral clustering, are impractical for large graphs due to computational constraints. To address this, we introduced PASCO in 16, a sketching-based overlay that accelerates clustering algorithms. PASCO involves: 1- generating small, structure-preserving coarse graphs from the input graph, 2- running clustering algorithms in parallel on these graphs to produce partitions, and 3- aligning and merging these partitions using optimal transport. The PASCO framework is based on two key contributions: a novel global algorithm structure designed to enable parallelization and a fast, empirically validated graph coarsening algorithm that preserves structural properties. This work was published in the journal Machine Learning, 2025 and presented at ECML-PKDD 2025.
7.3.3 Dimensionality reduction and optimal transport
Participants: Titouan Vayer, Etienne Lasalle.
Collaborations with Franck Picard (DR CNRS, ENS Lyon), Chady Essouabri (intern, ENS Lyon), Hugues Van Assel (PhD student, ENS Lyon), Cédric Vincent-Cuaz (post-doctoral researcher, EPFL), Rémi Flamary (CMAP, Ecole Polytechnique), Nicolas Courty (IRISA, Université Bretagne Sud), Pascal Frossard (EPFL).
Exploring and analyzing high-dimensional data is a core problem of data science that requires building low-dimensional and interpretable representations of the data through dimensionality reduction (DR). In a series of works we provide new methods and analyses for DR, inspired by optimal transport (OT). A key requirement for dimensionality reduction is to incorporate global dependencies among original and embedded samples while preserving clusters in the embedding space. In a previous work 101, we introduced and explored an innovative nonlinear dimensionality reduction method by utilizing the optimal transport framework and entropic affinities.
Building on these results, we extended our work to generalize dimension reduction, as detailed in 9, accepted at TMLR 2025. Our approach leverages OT, specifically the Gromov-Wasserstein distance (GW), to propose a framework that simultaneously reduces both the dimensionality and the number of points in a dataset, enabling significant data compression. Notably, when the number of points is preserved, we demonstrated strong connections between our method and traditional dimensionality reduction techniques, such as spectral methods and t-SNE. We refer to our framework as "Distributional Dimension Reduction", which can be interpreted as projecting a distribution, and a geometry encoding the relationships among data points in high-dimensional space, into a lower-dimensional space using the GW perspective. Based on these principles, we developed a library for dimensionality reduction in PyTorch 6.1.5. Finally, we investigated the relations between OT and mixture models, and wrote a short tutorial on the subject in 48. These works are at the core of further research on OT and self-supervised learning methods, as explored during the internship of Chady Essouabri in collaboration with Franck Picard.
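For reference, the quantity minimized by the Gromov-Wasserstein distance can be evaluated for a given coupling as in the sketch below. It uses the standard squared-loss GW objective and plain NumPy; it does not solve the minimization over couplings and is unrelated to the TorchDR API. The function name `gw_cost` is illustrative.

```python
import numpy as np


def gw_cost(C1, C2, T):
    """Squared-loss GW objective: sum_ijkl (C1[i,k] - C2[j,l])^2 T[i,j] T[k,l]."""
    diff = C1[:, None, :, None] - C2[None, :, None, :]  # indexed by (i, j, k, l)
    return float(np.einsum("ijkl,ij,kl->", diff ** 2, T, T))


rng = np.random.default_rng(0)
n = 6
X = rng.standard_normal((n, 3))
C = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances

# Second space: the same points in a different order.
perm = rng.permutation(n)
C_perm = C[np.ix_(perm, perm)]

# Coupling matching point perm[j] of the first space to point j of the second.
T_match = np.zeros((n, n))
T_match[perm, np.arange(n)] = 1.0 / n

# Uninformative coupling (product of the uniform marginals).
T_uniform = np.full((n, n), 1.0 / n ** 2)

cost_match = gw_cost(C, C_perm, T_match)
cost_uniform = gw_cost(C, C_perm, T_uniform)
print(cost_match, cost_uniform)
```

The matching coupling has zero cost because GW compares only pairwise relations, so the two spaces do not need to share an ordering or even a dimension.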
7.4 Large-scale convex and nonconvex optimization
7.4.1 Multilevel schemes for image restoration
Participants: Elisa Riccietti, Paulo Gonçalves, Edgar Desainte-Mareville.
Collaboration with Nelly Pustelnik (CNRS, ENS de Lyon), Nils Laurent (ENS de Lyon)
In the context of the Ph.D. work of Guillaume Lauga (defended on 18/12/2024), we studied the combination of multilevel schemes and proximal methods 5, 83, 84, 81, 82. Pushing further in this direction, we studied the link between multilevel and block coordinate methods and their convergence analysis 33. This line of research is also the subject of the PhD thesis of Edgar Desainte-Mareville, which aims to investigate how to unroll such multilevel strategies in order to learn important ingredients such as the transfer operators. To do so, an improved understanding of the link between multilevel and block methods is essential.
7.4.2 Stochastic multilevel schemes
Participants: Elisa Riccietti.
Collaboration with Margherita Porcelli (UNIFI, Firenze, Italy) and Filippo Marini (UNIBO, Bologna, Italy)
Classical deterministic multilevel schemes are limited by the need to regularly evaluate the expensive finest-level objective function, and are unsuited to stochastic problems such as expected risk minimization. We proposed a stochastic extension of the multilevel framework 46 that does not require the finest approximation to coincide with the original objective function throughout the optimization process. This significantly decreases the cost of the multilevel paradigm, for instance in data-fitting problems, where considering all the data at each iteration can be avoided.
7.4.3 Reproducible benchmarking of optimization algorithms
Participants: Mathurin Massias, Florian Kozikowski.
Collaboration with Thomas Moreau (MIND, Inria Saclay), Badr Moufad (Ecole Polytechnique), Nelly Pustelnik (CNRS, ENS de Lyon).
The team continues working on reproducible optimisation benchmarks, with Benchopt 7, a collaborative framework to automate, reproduce and publish benchmarks in machine learning across programming languages and hardware architectures. We continued to publish open source implementations of state-of-the-art solvers on major ML problems, and a detailed comparison of the regimes in which they succeed and fail respectively. In 2025, thanks to the internship of Florian Kozikowski, we implemented new benchmarks (Poisson regression). We are currently planning to develop benchmarks related to generative models.
7.4.4 Algorithms for large scale sparse linear models
Participants: Mathurin Massias.
Collaboration with Quentin Bertrand (INRIA MALICE), Badr Moufad (Ecole Polytechnique)
Building on our earlier works 93 and 59, we continued to develop and implement new state-of-the-art solvers for optimization problems with millions of variables in the context of sparse linear models 58. These solvers are implemented in the skglm package (see Section 6.1.1), which was integrated into the ecosystem of the scikit-learn package. In 2025, the internship work of Florian Kozikowski led to the implementation of new solvers (Poisson, Group Poisson and Gamma regression) as well as a complete rewrite of the documentation.
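As an illustration of the kind of problem addressed here, the following sketch solves the Lasso, the simplest sparse linear model, by cyclic coordinate descent. It is a textbook pure-Python version written for this example and is not skglm's implementation or API: the names soft_threshold and lasso_cd, the data layout (list of rows) and the fixed iteration count are all assumptions.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |.| (soft-thresholding)."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def lasso_cd(X, y, alpha, n_iter=100):
    """Cyclic coordinate descent for
        min_w  1 / (2 n) * ||y - X w||^2 + alpha * ||w||_1
    X is a list of n rows with p features each (illustrative helper).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - X w, since w starts at 0
    col_sq_norms = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq_norms[j] == 0.0:
                continue  # an all-zero feature keeps a zero coefficient
            old = w[j]
            # correlation of feature j with the residual where w_j is removed
            rho = sum(X[i][j] * residual[i] for i in range(n))
            rho += col_sq_norms[j] * old
            # exact minimization over coordinate j
            w[j] = soft_threshold(rho / n, alpha) / (col_sq_norms[j] / n)
            delta = w[j] - old
            if delta != 0.0:
                # keep the residual in sync with the updated coefficient
                for i in range(n):
                    residual[i] -= delta * X[i][j]
    return w


# Orthogonal toy design: the weak second feature is set exactly to zero.
w_hat = lasso_cd([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], alpha=0.5)
```

Each coordinate update costs one pass over a single column, which is why coordinate descent is a common building block for problems with many variables; large-scale solvers typically add working sets and acceleration on top of this basic loop.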
8 Bilateral contracts and grants with industry
8.1 Bilateral grants with industry
-
CIFRE contract with CNES, Paris on "Optimized on-board decision with fast energy-efficient neural networks". This PhD thesis is in collaboration with Stéphane May, engineer at CNES.
Participants: Rémi Gribonval, Titouan Vayer, Arthur Lebeurrier.
Duration: 3 years (2024-2027)
Partners: CNES, Paris; ENS de Lyon
Funding: CNES, Paris; PEPR IA SHARP
Context: ANR Chaire IA AllegroAssai 9.2.2
This thesis aims to develop compact, high-performance neural networks tailored to on-board constraints, enabling optimized decision-making on low-energy platforms. It includes an exploration of parsimony structures suited for deep networks and a comprehensive study of quantization and optimization techniques for neural networks.
-
Funding from Facebook Artificial Intelligence Research, Paris
Participants: Rémi Gribonval.
Duration: 5 years (2021-2025)
Partners: Facebook Artificial Intelligence Research, Paris; ENS de Lyon
Funding: Facebook Artificial Intelligence Research, Paris
Context: Chaire IA AllegroAssai 9.2.2
This funding supports the research conducted in the framework of the Chaire IA AllegroAssai.
9 Partnerships and cooperations
9.1 International research visitors
Laurent JACQUES
-
Status:
researcher
-
Institution of origin:
Université de Louvain
-
Country:
Belgium
-
Dates:
Sept. 1, 2025 till June 30, 2026
-
Context of the visit:
Inria chair from the Collegium of Lyon
-
Mobility program/type of mobility:
sabbatical
9.2 National initiatives
9.2.1 PEPR IA project : SHARP
Participants: Rémi Gribonval [correspondant], Paulo Goncalves, Elisa Riccietti, Marion Foare, Mathurin Massias, Titouan Vayer, Arthur Lebeurrier, Mael Chaumette.
Partnership with LAMSADE (PSL); LIGM (ENPC); GENESIS (Inria London & University College London); IRISA; CEA List; ISIR (Sorbonne Université)
Duration of the project: 2023 - 2029.
The vision of the SHARP proposal is that the resources required to train ML models can be decreased by several orders of magnitude, with negligible performance loss compared to the state of the art. This means significantly reducing the dimensionality of predictors (to reduce inference costs) and of their gradients (to reduce training and bandwidth costs in distributed settings), the amount of data needed to learn (to address data scarce settings up to zero-shot learning, and incremental learning scenarios), and compressing datasets before learning (to reduce storage and compute requirements, and address privacy concerns).
9.2.2 ANR IA Chaire : AllegroAssai
Participants: Rémi Gribonval [correspondant], Paulo Goncalves, Elisa Riccietti, Marion Foare, Mathurin Massias, Léon Zheng, Quoc-Tung Le, Antoine Gonon, Titouan Vayer, Ayoub Belhadji, Clement Lalanne, Can Pouliquen.
Past members: Luc Giffon.
Duration of the project: 2020 - 2025.
AllegroAssai focuses on the design of machine learning techniques endowed both with statistical guarantees (to ensure their performance, fairness, privacy, etc.) and provable resource-efficiency (e.g. in terms of bytes and flops, which impact energy consumption and hardware costs), robustness in adversarial conditions for secure performance, and ability to leverage domain-specific models and expert knowledge. The vision of AllegroAssai is that the versatile notion of sparsity, together with sketching techniques using random features, are key in harnessing these fundamental tradeoffs. The first pillar of the project is to investigate sparsely connected deep networks, to understand the tradeoffs between the approximation capacity of a network architecture (ResNet, U-net, etc.) and its “trainability” with provably-good algorithms. A major endeavor is to design efficient regularizers promoting sparsely connected networks with provable robustness in adversarial settings. The second pillar revolves around the design and analysis of provably-good end-to-end sketching pipelines for versatile and resource-efficient large-scale learning, with controlled complexity driven by the structure of the data and that of the task rather than the dataset size.
9.2.3 ANR DataRedux
Participants: Paulo Goncalves [correspondant], Rémi Gribonval, Marion Foare.
Collaboration with Marton Karsai (former PI, ECU Austria), Pierre Borgnat (ENS de Lyon)
Duration of the project: February 2020 - January 2024 prolonged to March 31, 2026.
DataRedux puts forward an innovative framework to reduce networked data complexity while preserving its richness, by working at intermediate scales (“mesoscales”). Our objective was to contribute to the theoretical understanding and representation of rich and complex networked datasets for use in predictive data-driven models. Our main novelty has been to define network reduction techniques in two particular use cases: one related to the dynamical processes occurring on the networks, and the other related to the clustering of large graphs. Both approaches relied on extracting information and knowledge at different scales in a human-accessible way, by identifying structures in high-resolution, diverse and heterogeneous data.
Our guideline in the DataRedux project was to identify methods for aggregating data at intermediate scales and new types of data representations related to dynamic processes, which preserve the richness of information contained in the original data, while retaining their most relevant models for easy integration into data-based digital models to facilitate decision-making and obtain actionable information.
9.2.4 ANR JCJC EROSION
Participants: Mathurin Massias.
Duration of the project: December 2023 - December 2026.
Collaboration with Emmanuel Soubies (PI of the project, CNRS, IRIT), Paul Escande (CR CNRS, I2M), Cédric Févotte (DR CNRS, IRIT), Henrique Goulart (MdC INP, IRIT) and Joseph Salmon (Prof. Université de Montpellier, IMAG)
The promise of EROSION is to push the frontiers of sparse and low-rank optimization by combining the strengths of exact relaxations and local optimization. More precisely, we propose to move away from appealing convex relaxations, which require overly strong assumptions to ensure equivalence with the original problem. Instead, EROSION will address the following two research objectives. 1: Deriving exact relaxations of regression (= same global minimizers) which, although still non-convex, are more amenable to non-convex local optimization (e.g., fewer local minimizers, wider basins of attraction). 2: Developing new local optimization strategies that exploit the nice properties of such exact relaxations so as to improve both the quality of the local extrema reached and the convergence speed over existing solvers.
In OCKHAM, this collaboration has led to the internship of Anne Gagneux (co-supervised with Emmanuel Soubiès), on the design of new sorted non-convex penalties and the computation of their proximal operators.
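For readers unfamiliar with proximal operators of non-convex penalties, the sketch below contrasts the textbook non-convex case (the l0 penalty, whose proximal operator is hard-thresholding) with its usual convex relaxation (the l1 norm, soft-thresholding). This is only a standard example chosen for illustration: it is not one of the sorted penalties studied in the project, and the function names prox_l0 and prox_l1 are assumptions.

```python
import math


def prox_l0(x, lam):
    """Proximal operator of lam * ||.||_0 (hard-thresholding).

    Each entry solves min_u 0.5 * (u - x_i)^2 + lam * [u != 0]:
    keeping x_i costs lam, setting it to 0 costs 0.5 * x_i^2, so x_i is
    kept when |x_i| > sqrt(2 * lam). At the threshold both choices are
    minimizers; this sketch returns 0.
    """
    thresh = math.sqrt(2.0 * lam)
    return [xi if abs(xi) > thresh else 0.0 for xi in x]


def prox_l1(x, lam):
    """Proximal operator of lam * ||.||_1 (soft-thresholding), for comparison."""
    return [math.copysign(max(abs(xi) - lam, 0.0), xi) for xi in x]


x = [3.0, 0.5, -2.0]
hard = prox_l0(x, 0.5)  # large entries are kept unchanged
soft = prox_l1(x, 0.5)  # large entries are shrunk towards zero
```

The comparison shows the bias that motivates non-convex penalties: soft-thresholding shrinks every surviving coefficient, while hard-thresholding leaves them untouched at the price of a non-convex problem.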
9.2.5 ANR JCJC MEPHISTO
Participants: Elisa Riccietti [correspondant].
Duration of the project: November 2024 - November 2028.
This project focuses on large scale optimization problems in signal processing and imaging. We consider a special class of such problems: those that admit a hierarchical structure. The aim of the project is to develop parsimonious methods for their solution by exploiting such underlying structure. We will focus on four different kinds of hierarchical structures: those arising from the geometry or physics of the problem (such as multiple resolutions in images or discretization of infinite dimensional problems); those that can be built by exploiting the analytical structure of some problems (training of neural networks, data-fitting problems); those that can be built exploiting the intrinsic structure of the algebraic tools involved (matrices, tensors, such as in matrix factorization problems); those that can be built exploiting multiple numerical formats (floating point numbers with reduced number of bits).
The ambition of this project is thus to develop a large family of parsimonious multiresolution, multilevel and multiprecision algorithms that are not only efficient but that can also rely on solid mathematical foundations.
9.2.6 Defi Hive Inria Cupseli
Participants: Elisa Riccietti [correspondant], Remi Gribonval.
Duration of the project: September 2025-September 2028.
The Cupseli challenge aims to demonstrate that it is possible to run complex applications on heterogeneous, distributed, and volatile resources, while achieving good parallel efficiency and preserving both accuracy and confidentiality. It explores algorithmic and system-level solutions to optimize computation, memory, and communication, while ensuring security and fault tolerance. The work is organized around three main axes: Frugality (adapting training and inference to limited and dynamic resources), Security and confidentiality (protecting data and models through encryption, secure enclaves, and defenses against attacks), and Volatility (ensuring robustness and performance despite the unpredictable arrival and departure of resources).
9.2.7 DI2A - Subvention Simone et Cino del Duca, Institut de France.
Participants: Elisa Riccietti, Marion Foare, Paulo Goncalves.
Duration of the project: December 2023 - December 2025.
This project focuses on the physics-informed design of architectures and multiresolution deep learning techniques for large scale image restoration and data analysis for astronomy. By physics-informed design we refer to all the deep learning strategies in which the choice of the architecture, biases and activation functions of neural networks is guided by the underlying physics of data acquisition and/or by the proximal optimization schemes employed for the solution. From an application point of view, the project targets problems in astronomy and specifically the study of circumstellar environments through the instrument SPHERE/IRDIS. We aim to propose innovative reconstruction approaches that are partially supervised or even unsupervised.
9.2.8 GDR ISIS project PROSSIMO
Participants: Mathurin Massias [correspondant], Rémi Gribonval, Anne Gagneux, Emmanuel Soubies.
Duration of the project: September 2023 - September 2025.
Composite optimisation problems are ubiquitous in machine learning, signal, and image processing. Together with the proximal algorithms used to solve them, they have met with great success in applications and have been extensively studied. More recently, so-called 'plug-and-play' (PNP) methods, inspired by proximal algorithms, propose new iterative algorithms in which the application of the proximal operator of the regulariser is replaced by a pre-existing denoiser or a learned operator. Their flexibility, however, complicates their theoretical analysis, because in the general case the operator does not have the useful properties of proximal operators. In the PROSSIMO project, we propose to implement and study PNP operators via neural networks, while guaranteeing that these operators have the same properties as proximal operators. We aim at combining the flexibility of PNP methods with the rigorous theoretical guarantees of model-based methods. In addition to implementing such networks, we propose to study their approximation capacity: what classes of functions can they approximate, and at what speed?
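The plug-and-play idea described above can be sketched in a few lines: a forward-backward iteration in which the proximal step is replaced by a denoiser. The example below is a toy 1D inpainting problem in pure Python. The moving-average denoiser is a simple stand-in chosen for this illustration (the project targets neural network operators), and all function names are assumptions.

```python
def moving_average_denoiser(x):
    """Stand-in denoiser: 3-tap moving average with replicated borders."""
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def pnp_forward_backward(y, mask, denoiser, step=1.0, n_iter=50):
    """Plug-and-play forward-backward iteration for masked observations.

    Data-fit term: f(x) = 0.5 * sum_i mask[i] * (x[i] - y[i])^2,
    where mask[i] is 1 for observed samples and 0 for missing ones.
    """
    x = list(y)
    for _ in range(n_iter):
        # forward step: gradient descent on the data-fit term
        z = [xi - step * m * (xi - yi) for xi, yi, m in zip(x, y, mask)]
        # backward step: the proximal operator is replaced by the denoiser
        x = denoiser(z)
    return x


# Constant signal with one missing sample (index 2).
y = [1.0, 1.0, 0.0, 1.0, 1.0]
mask = [1, 1, 0, 1, 1]
x_hat = pnp_forward_backward(y, mask, moving_average_denoiser)
```

In this toy case the iteration is easy to analyse by hand: the missing sample follows a contraction whose fixed point is the constant signal. For a generic learned denoiser no such argument is available, which is the difficulty the project addresses.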
9.2.9 ANR TSIA BenchArk
Participants: Mathurin Massias [correspondant].
Duration of the project: October 2024 - October 2028.
Collaboration with Thomas Moreau, Gaël Varoquaux (INRIA Saclay) and Joseph Salmon (INRIA Montpellier).
Numerical evaluation of novel methods, a.k.a. benchmarking, is a pillar of the scientific method in machine learning. However, due to practical and statistical obstacles, the reproducibility of published results is currently insufficient: many details can invalidate numerical comparisons, from insufficient uncertainty quantification to improper methodology. In 2022, the Benchopt initiative provided an open source Python package together with a framework to seamlessly run, reuse, share and publish benchmarks in numerical optimization. The BenchArk project aims at bringing Benchopt to the whole machine learning community, making it a new standard in benchmarking by empowering researchers and practitioners with efficient and valid benchmarking methods. Our goal is to ensure reproducibility and consistency in model evaluation. We will federate the machine learning community to develop informative and statistically valid benchmarks, while providing methods to reduce identified hurdles in implementing such practices.
9.2.10 ANR SEIZURE
Participants: Paulo Goncalves [correspondant], Can Pouliquen.
Duration of the project: September 2024 - August 2028
Collaboration with Carole Lartizien (PI of the project, CNRS, Insa de Lyon, CREATIS), Julien Jung (MD-PhD, Hospices Civils de Lyon, CRNL), Pierre Borgnat (CNRS, ENS de Lyon, Physics Lab).
“Seeing the EpileptogenIc Zone through machine Learning on strUctuRal, functional and clinical nEurological data”
This project deals with the multimodal detection and characterisation of epileptic zones in neuroimaging and intracranial EEG (iEEG). Ockham is mainly involved in WP3 (P. Borgnat leader), which aims at analysing the propagation of biomarkers within the brain as an indicator of the dynamic interictal epileptogenic network. A detailed understanding of the brain network and its key hubs provides invaluable insights into surgical outcomes. In a previous PhD work (G. Frusque, 2017-2020) we derived graphical lasso techniques on iEEG data to infer graph time series as relevant connectivity networks. In Seizure, we plan to enrich our previous approaches with deep learning based models, and more specifically with graph recurrent neural networks and neural implicit representations.
10 Dissemination
10.1 Promoting scientific activities
10.1.1 Scientific events: organisation
- Organization of the COLT 2025 conference, (30/06/25 – 04/07/25), Remi Gribonval
- Organization of the GDR IASIS Thematic Day on Flow matching, diffusion and their applications (24/10/25) Mathurin Massias
- Organization of the GDR IASIS Thematic Day on optimal transport and machine learning (17/02/25) Titouan Vayer
- SMAI minisymposium on generative modelling, optimal transport and image restoration, (02/06/25 – 06/06/25) Mathurin Massias
- One-day workshop Sharp and Foundry: On Frugal and Robust Foundations for Machine Learning, ENS Lyon (30/06/25) Remi Gribonval
10.1.2 Scientific events: selection
Member of the conference program committees
- Mathurin Massias – Area Chair for NeurIPS, ICML.
- Titouan Vayer – Member of the GRETSI program committee, area chair for ICML.
-
Rémi Gribonval
- MIA'25 Program Committee;
- Organizer of a Minisymposium on "Mathematical aspects of deep learning", Curves & Surfaces 2026, St-Malo, June 8-12 2026;
- Scientific board of JRAF (Journées de recherche en apprentissage frugal), Grenoble, Nov 26-27 2025;
- Scientific board of a workshop on the Mathematics of AI, Institut de mathématiques de Bordeaux, Nov 4-6 2026
Organization of the weekly "Machine Learning and Signal Processing (MLSP)" seminar (about twenty presentations in 2025) Marion Foare ; Paulo Goncalves ; Remi Gribonval ; Mathurin Massias ; Elisa Riccietti ; Titouan Vayer
10.1.3 Journal
Member of the editorial boards
- Mathurin Massias – Associate Editor for TMLR
- Remi Gribonval – Associate Editor for Constructive Approximation (Springer); founding member of the Editorial Board of Mathematical Foundations of Machine Learning (Springer), Senior Area Editor for the IEEE Signal Processing Magazine
10.1.4 Invited talks
- Elisa Riccietti – Journée AILYS, ENS Lyon, 14/02/2025
- Elisa Riccietti – MIA25 conference, Paris, 13/01/2025-15/01/2025.
- Anne Gagneux – Séminaire Imaging In Paris, 06/05/25
- Mathurin Massias – Séminaire Palaisien, 04/11/25
- Mathurin Massias – Séminaire IMAGINE, 05/11/25
-
Remi Gribonval
- Rencontre nationale du RT Optimisation, INSA Lyon, Nov 26-28 2025
- Workshop “(Blind) inverse problems in imaging: from foundations to applications”, CIRM, Luminy, Sep 29-Oct 3 2025
- Festum Pi Mathematics Conference, Chania, Crete, July 21-25 2025;
- Workshop on Mathematics of Data Science, Centre Lagrange, Paris, May 13-15 2025
- PEPR IA Days, CentraleSupelec, Mar 18th 2025
- as well as invited seminars: DATASHAPE Team, Inria Saclay, Nov 6 2025; MALGA Seminar, Genova, Apr 28th 2025; Journée AILYS at ENS Lyon, Feb 14th 2025; Séminaire "mathématiques de l'IA", IMB, Bordeaux, Jan 30th 2025; Séminaire MMCS de l'ICJ, Lyon 1, Jan 7th 2025
10.1.5 Leadership within the scientific community
-
Remi Gribonval
- Scientific Committee of RT MAIAGES (formerly RT/GDR MIA);
- Comité de Liaison SIGMA-SMAI;
- Board of the GRETSI association;
- Cellule ERC of Inria, mentoring for ERC candidates in computer science and applied mathematics at the national Inria level
- Mathurin Massias – Secretary of the MODE group of SMAI
10.1.6 Scientific expertise
- Remi Gribonval – Scientific Advisory Board of the Acoustics Research Institute of the Austrian Academy of sciences
- Elisa Riccietti – Scientific Board of the Federation Informatique de Lyon (Conseil Scientifique de la FIL)
10.1.7 Research administration
-
Paulo Goncalves
- member of the steering committee for the ShapeMed@Lyon consortium's Data for Health workshop
- Scientific Director of the Inria Centre of Lyon and member of the Inria Evaluation Committee.
10.2 Teaching - Supervision - Juries - Educational and pedagogical outreach
10.2.1 Teaching
- Master:
- Elisa Riccietti – Optimisation (ENS Lyon) and Harnessing inexactness in scientific computing (ENS Lyon)
- Mathurin Massias – Python for datascience (Ecole Polytechnique), Statistics (Ecole Polytechnique), Optimal Transport for Machine and Deep Learning (ENS Lyon), Fundamentals of Machine Learning (ENS Lyon), Generative Models (ENS Lyon)
- Titouan Vayer – Optimal Transport for Machine and Deep Learning (ENS Lyon), Fundamentals of Machine Learning (ENS Lyon)
- Marion Foare – Image and Signal Processing, Inverse problems and optimization (CPE Lyon)
- Paulo Goncalves – Image and Signal Processing (CPE Lyon)
- Remi Gribonval – Inverse problems and high dimension; Mathematical foundations of deep neural networks; Concentration of measure in probability and high-dimensional statistical learning; M2, ENS Lyon
10.2.2 Supervision
All PhD students of the team are co-supervised by at least one team member. In addition, some team members are involved in co-supervisions of students hosted in other labs:
- Elisa Riccietti – co-supervision of the PhD of Filippo Marini with Margherita Porcelli (Università di Bologna) – defence on 16/06/2025
- Remi Gribonval – co-supervision of the PhD of Sibylle Marcotte with Gabriel Peyré since 2022 (Center for Data Science, ENS Paris) – defense on 21/11/2025
- Marion Foare – co-supervision of the PhD of Luis Enrique Amador Arya with Hélène Ratiney and Éric Van Reeth (Creatis, Villeurbanne) and Siemens Healthcare (Saint Denis) since 2023
PhD defenses in Ockham in 2025:
- Can Pouliquen
10.2.3 Juries
Members of the Ockham team participated in the following juries :
- Elisa Riccietti – PhD defence of Iskander Legheraba (Dauphine Université, Paris), CSI of Xavier Pillet (PhD student, Lyon 1 University)
- Mathurin Massias – CSI of Yu-Han Wu (PhD Student, Sorbonne Université)
- Paulo Goncalves – PhD defense of Valerian Mange (U Toulouse), CSI of Andréa Ducos (PhD Student, Lyon 1 University)
- Titouan Vayer – Junjie Yang (07/04/2025, examiner), member of the CSI for Antonin Joly (PhD Student, IRISA), Antoine Monier (PhD Student, IRISA).
- Remi Gribonval – PhD defenses of: Armand Foucault (26/05/25, Université de Toulouse, reviewer); Blaise Delattre (16/2/25, Dauphine PSL, reviewer); Manon Verbockhaven (28/03/25, Université Paris-Saclay, reviewer); Maud Biquard (5/11/25, Université de Toulouse, president); Mimoun Mohamed (31/03/25, Aix-Marseille Université, examiner); Volodimir Mitarchuk (17/01/25, Université Jean Monnet Saint-Étienne, president); Pierre Warion (19/11/25, Aix-Marseille Université, examiner); Romain Verdière (8/12/25, Université Grenoble Alpes, president).
11 Scientific production
11.1 Major publications
- 1 inproceedings: On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity. NeurIPS 2025 - 39th Annual Conference on Neural Information Processing Systems, San Diego (CA), United States, December 2025.
- 2 article: Compressive Statistical Learning with Random Feature Moments. Mathematical Statistics and Learning 3(2), August 2021, 113–164.
- 3 article: Sketching Data Sets for Large-Scale Learning: Keeping only what you need. IEEE Signal Processing Magazine 38(5), September 2021, 12-36.
- 4 article: Rapture of the deep: highs and lows of sparsity in a world of depths. IEEE Signal Processing Magazine, June 2025, 22 p.
- 5 article: IML FISTA: A Multilevel Framework for Inexact and Inertial Forward-Backward. Application to Image Restoration. SIAM Journal on Imaging Sciences, June 2024.
- 6 proceedings: PNP-FLOW: Plug-And-Play Image Restoration with Flow Matching. International Conference on Learning Representations, Singapore, Singapore, April 2025.
- 7 inproceedings: Benchopt: Reproducible, efficient and collaborative optimization benchmarks. NeurIPS 2022 - 36th Conference on Neural Information Processing Systems, New Orleans, United States, November 2022.
- 8 article: Fourier could be a Data Scientist: from Graph Fourier Transform to Signal Processing on Graphs. Comptes Rendus. Physique, September 2019, 474-488.
- 9 article: Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein. Transactions on Machine Learning Research Journal, June 2025.
- 10 article: Controlling Wasserstein Distances by Kernel Norms with Application to Compressive Statistical Learning. Journal of Machine Learning Research 24(149), April 2023, 1-51.
11.2 Publications of the year
International journals
Invited conferences
International peer-reviewed conferences
National peer-reviewed Conferences
Conferences without proceedings
Doctoral dissertations and habilitation theses
Reports & preprints
Other scientific publications
Software
11.3 Cited publications
- 56 book: Convex analysis and monotone operator theory in Hilbert spaces. 408, Springer, 2011.
- 57 article: -PageRank for Semi-Supervised Learning. Applied Network Science 4(57), 2019, 1-20.
- 58 article: Beyond l1: Faster and better sparse models with skglm. Advances in Neural Information Processing Systems 35, 2022, 38950-38965.
- 59 article: Implicit differentiation for fast hyperparameter selection in non-smooth convex learning. Journal of Machine Learning Research 23(1), April 2022, 6680-6722.
- 60 book: Holger Boche, Robert Calderbank, Gitta Kutyniok and Jan Vybiral, eds. Compressed Sensing and its Applications. Series: Applied and Numerical Harmonic Analysis, MATHEON Workshop 2013, ISSN: 2296-5009. Birkhäuser, Cham, 2015. URL: http://books.google.cz/books?id=6KoYCgAAQBAJ&pg=PA340&dq=intitle:Compressed+Sensing+and+its+Applications&hl=&cd=1&source=gbs_api
- 61 articleExact Reconstruction using Beurling Minimal Extrapolation.arXiv.orgarXiv: 1103.4951v2March 2011, URL: http://arxiv.org/abs/1103.4951v2back to text
- 62 articleCompressive Learning with Privacy Guarantees.Information and Inference2021HALback to textback to text
- 63 incollectionProximal splitting methods in signal processing.Fixed-point algorithms for inverse problems in science and engineeringSpringer2011, 185--212back to text
- 64 articleDistributed Adaptive Learning of Graph Signals.IEEE Transaction on Signal Processing65162017back to text
- 65 bookCooperative and Graph Signal Processing: Principle and Applications.Academic Press2018back to text
- 66 bookSparse and Redundant Representations.From Theory to Applications in Signal and Image ProcessingSpringer2010, URL: http://books.google.fr/books?id=d5b6lJI9BvAC&printsec=frontcover&dq=sparse+and+redundant+representations&hl=&cd=1&source=gbs_apiback to text
- 67 articleSemi-Linearized Proximal Alternating Minimization for a Discrete Mumford-Shah Model.IEEE Transactions on Image Processing29October 2019, 2176-2189HALDOIback to text
- 68 bookA Mathematical Introduction to Compressive Sensing.New York, NYSpringer2013, URL: http://link.springer.com/10.1007/978-0-8176-4948-7DOIback to text
- 69 articleSparse inverse covariance estimation with the graphical lasso.Biostatistics932008, 432--441back to text
- 70 phdthesisInférence et décomposition modale de réseaux dynamiques en neurosciences.2020LYSEN0802020, URL: http://www.theses.fr/2020LYSEN080/documentback to text
- 71 articleTranslation on Graphs: An Isometric Shift Operator.IEEE Signal Processing Letters2212December 2015, 2416 - 2420HALDOIback to text
- 72 inproceedingsA path-norm toolkit for modern networks: consequences, promises and challenges.International Conference on Learning RepresentationsErratum: in the published version there was a typo in the definition of the activation matrix in Definition A.3. This is fixed with this new version.Wien, AustriaMay 2024HALback to text
- 73 phdthesisHarnessing symmetries for modern deep learning challenges : a path-lifting perspective.Ecole normale supérieure de lyon - ENS LYONNovember 2024HALback to text
- 74 articleCompressive Statistical Learning with Random Feature Moments.Mathematical Statistics and Learning2021, URL: https://hal.inria.fr/hal-01544609back to textback to text
- 75 articleStatistical Learning Guarantees for Compressive Clustering and Compressive Mixture Modeling.Mathematical Statistics and Learning32This preprint results from a split and profound restructuring and improvements of of https://hal.inria.fr/hal-01544609v2It is a companion paper to https://hal.inria.fr/hal-01544609v3August 2021, 165--257HALDOIback to text
- 76 unpublishedOptimal quantization of rank-one matrices in floating-point arithmetic---with applications to butterfly factorizations.June 2023, working paper or preprintHALback to text
- 77 inproceedingsScaling is all you need: quantization of butterfly matrix products via optimal rank-one quantization.Actes du GRETSI 2023Actes du GRETSI 20232023-1193Grenoble, FranceGRETSI - Groupe de Recherche en Traitement du Signal et des ImagesAugust 2023, 497-500HALback to text
- 78 articleStructured Variable Selection with Sparsity-Inducing Norms.Journal of Machine Learning Research12Publisher: Massachusetts Institute of Technology Press2011, 2777--2824URL: http://hal.inria.fr/inria-00377732back to text
- 79 articleA unified Framework for Structured Graph Learning via Spectral Constraints.Journal of Machine Learning Research212020, 1--60back to text
- 80 inproceedingsCoordinate Descent for SLOPE.Proceedings of The 26th International Conference on Artificial Intelligence and StatisticsValencia, SpainApril 2023HALback to text
- 81 inproceedingsA multilevel framework for accelerating uSARA in radio-interferometric imaging.European Signal Processing Conference (EUSIPCO)Lyon, FranceAugust 2024HALDOIback to text
- 82 article. Méthodes multi-niveaux pour la restauration d'images hyperspectrales. Colloque GRETSI, September 2023.
- 83 inproceedings. Méthodes proximales multi-niveaux pour la restauration d'images. GRETSI'22 - 28ème Colloque Francophone de Traitement du Signal et des Images, Nancy, France, September 2022. HAL.
- 84 inproceedings. Multilevel FISTA for image restoration. IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, Rhodes, Greece, June 2023. HAL, DOI.
- 85 phdthesis. Algorithmic and theoretical aspects of sparse deep neural networks. Ecole normale supérieure de lyon - ENS LYON, December 2023. HAL.
- 86 unpublished. Embedding Blake-Zisserman Regularization in Unfolded Proximal Neural Networks for Enhanced Edge Detection. 2024. HAL.
- 87 unpublished. Unfolded discrete Mumford-Shah functional for joint image denoising and edge detection. 2025. HAL.
- 88 inproceedings. Does a sparse ReLU network training problem always admit an optimum? Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans (Louisiane), United States, December 2023. HAL.
- 89 article. Spurious Valleys, NP-hardness, and Tractability of Sparse Matrix Factorization With Fixed Support. SIAM Journal on Matrix Analysis and Applications, 2022. HAL.
- 90 inproceedings. Fast learning of fast transforms, with guarantees. ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, Singapore, May 2022. This paper is associated with code for reproducible research available at https://hal.inria.fr/hal-03552956. HAL, DOI.
- 91 inproceedings. Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans (Louisiane), United States, December 2023. HAL.
- 92 inproceedings. Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows. Forty-first International Conference on Machine Learning, Vienna, Austria, July 2024. Accepted to ICML 2024. HAL.
- 93 article. Dual Extrapolation for Sparse Generalized Linear Models. Journal of Machine Learning Research 21(234), October 2020, 1-33. HAL.
- 94 inproceedings. Implicit Differentiation for Hyperparameter Tuning the Weighted Graphical Lasso. GRETSI 2023 - XXIXème Colloque Francophone de Traitement du Signal et des Images, Grenoble, France, August 2023, 1-4. HAL.
- 95 inproceedings. Random features for large-scale kernel machines. 2007. Replace implicit mapping of kernel trick by explicit nonlinear mapping from R⌃
- 96 article. Sub-sampled Newton methods. Math. Program. 174, 2019, 293-326. DOI.
- 97 article. The Emerging Field of Signal Processing on Graphs. IEEE Signal Processing Magazine, May 2013, 83--98.
- 98 article. Hilbert Space Embeddings and Metrics on Probability Measures. JMLR 11, 2010, 1517--1561. Theorem 21 relates Wasserstein metric to Kernel metric. URL: http://dblp.org/rec/journals/jmlr/SriperumbudurGFSL10
- 99 article. An Embedding of ReLU Networks and an Analysis of their Identifiability. Constructive Approximation 57, 2023, 853--899. HAL, DOI.
- 100 article. Dictionary Learning. IEEE Signal Processing Magazine 28(2), 27--38. DOI. URL: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5714407
- 101 inproceedings. SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities. Thirty-seventh Annual Conference on Neural Information Processing Systems (NeurIPS), New Orleans, United States, December 2023. NeurIPS 2023 conference paper. HAL.
- 102 inproceedings. E2-train: Training state-of-the-art cnns with over 80% energy savings. Advances in Neural Information Processing Systems, 2019, 5138--5150.
- 103 inproceedings. SWALP: Stochastic weight averaging in low precision training. International Conference on Machine Learning, 2019, 7015--7024.
- 104 article. ADAHESSIAN: An adaptive second order optimizer for machine learning. arXiv preprint arXiv:2006.00719, 2020.
- 105 inproceedings. Nonconvex Sparse Graph Learning under Laplacian Constrained Graphical Model. 34th Conference on Neural Information Processing Systems, 2020.
- 106 phdthesis. Data frugality and computational efficiency in deep learning. Ecole normale supérieure de lyon - ENS LYON, May 2024. HAL.
- 107 inproceedings. Factorisation butterfly par identification algorithmique de blocs de rang un. XXIXème Colloque Francophone de Traitement du Signal et des Images, Grenoble, France, August 2023. HAL.
- 108 article. Efficient Identification of Butterfly Sparse Matrix Factorizations. SIAM Journal on Mathematics of Data Science, 2022. HAL.