OCKHAM

OCKHAM - 2025

2025 Activity report: Project-Team OCKHAM

RNSR: 202324392T

Research center Inria‌ Lyon Centre
In partnership with: Ecole normale supérieure de Lyon, Université Claude Bernard (Lyon 1)
Team name: Optimization, pHysical Knowledge, Algorithms and Models
In collaboration with: Laboratoire de l'Informatique du Parallélisme (LIP)

Creation of the Project-Team:‌ 2023 March 01

Each year, Inria research teams‌ publish an Activity Report presenting their work and‌ results over the reporting period. These reports follow‌ a common structure, with some optional sections depending‌ on the specific team. They typically begin by‌ outlining the overall objectives and research programme, including‌ the main research themes, goals, and methodological approaches.‌ They also describe the application domains targeted by‌ the team, highlighting the scientific or societal contexts‌ in which their work is situated.

The reports‌ then present the highlights of the year, covering‌ major scientific achievements, software developments, or teaching contributions.‌ When relevant, they include sections on software, platforms,‌ and open data, detailing the tools developed and‌ how they are shared. A substantial part is‌ dedicated to new results, where scientific contributions are‌ described in detail, often with subsections specifying participants‌ and associated keywords.

Finally, the Activity Report addresses‌ funding, contracts, partnerships, and collaborations at various levels,‌ from industrial agreements to international cooperations. It also‌ covers dissemination and teaching activities, such as participation‌ in scientific events, outreach, and supervision. The document‌ concludes with a presentation of scientific production, including‌ major publications and those produced during the year.‌

Keywords

Computer Science and Digital Science

A3.5. Social‌ networks
A3.5.1. Analysis of large graphs
A5.3.2. Sparse‌ modeling and image representation
A5.8. Natural language processing‌
A5.9. Signal processing
A5.9.4. Signal processing over graphs‌
A5.9.5. Sparsity-aware processing
A5.9.6. Optimization tools
A6.3.1. Inverse‌ problems
A8.2. Optimization
A8.6. Information theory
A8.12. Optimal‌ transport
A9.2.1. Supervised learning
A9.2.4. Optimization and learning‌
A9.2.6. Neural networks
A9.2.7. Kernel methods
A9.2.8. Deep‌ learning
A9.11. Generative AI

1‌ Team members, visitors, external collaborators

Research Scientists

Remi‌ Gribonval [Team leader, INRIA, Senior‌ Researcher, HDR]
Paulo Goncalves [INRIA‌, Senior Researcher, HDR]
Mathurin Massias‌ [INRIA, Researcher]
Titouan Vayer [‌INRIA, Researcher]

Faculty Members

Marion Foare‌ [CPE LYON, Associate Professor]
Elisa‌ Riccietti [ENS DE LYON, Associate Professor‌, from Sep 2025]

Post-Doctoral Fellows

Alice‌ Brenon [ENS DE LYON]
Etienne Lasalle‌ [ENS DE LYON, Post-Doctoral Fellow,‌ until Feb 2025]
Guillaume Lauga [ENS‌ DE LYON, Post-Doctoral Fellow, from Feb‌ 2025 until Mar 2025]
Hugo Lebeau [‌INRIA, Post-Doctoral Fellow, from Feb 2025‌]
Manon Verbockhaven [ENS DE LYON,‌ Post-Doctoral Fellow, from Dec 2025]

PhD‌ Students

Giuseppe Carrino [ENS DE LYON,‌ from Nov 2025]
Mael Chaumette [INRIA‌]
Edgar Desainte-Mareville [ENS DE LYON]‌
Anne Gagneux [UNIV LYON I]
Arthur‌ Lebeurrier [ENS DE LYON]
Sibylle Marcotte‌ [ENS PARIS]
Can Pouliquen [ENS DE LYON]

Technical‌ Staff

Pascal Carrivain [‌INRIA, Engineer]‌‌

Interns and Apprentices

Ilias Bouhss [ENS DE‌ LYON, Intern,‌ from Jun 2025 until‌‌ Aug 2025]
Giuseppe Carrino [ENS DE‌ LYON, Intern,‌ from Mar 2025 until‌‌ Jun 2025]
Chady Essouabri [CNRS,‌ Intern, from May‌ 2025 until Aug 2025‌‌]
Florian Kozikowski [CNRS, Intern,‌ from Mar 2025 until‌ Aug 2025]
Damien‌‌ Rouchouse [INRIA, Intern, from Apr‌ 2025 until Sep 2025‌]

Administrative Assistant

Emilie‌‌ Gatignol [INRIA]

Visiting Scientist

Laurent Jacques‌ [Univ UCLouvain,‌ from Sep 2025]‌‌

2 Overall objectives

Building on a culture at‌ the interface of signal‌ modeling, mathematical optimization and‌‌ statistical machine learning, the global objective of OCKHAM‌ is to develop computationally‌ efficient and mathematically founded‌‌ methods and models to process high-dimensional data.‌ Our ambition is to‌ develop frugal signal processing‌‌ and machine learning methods able to exploit structured‌ models, intrinsically associated‌ to resource-efficient implementations,‌‌ and endowed with solid statistical guarantees.

Challenge‌ 1: Developing frugal methods‌ with robust expressivity.

Frugal approaches are algorithms that rely on a controlled use of computing resources, but also methods whose expressivity and flexibility provably rely on the versatile notion of sparsity. This is expected to avoid the current pitfalls of costly over-parameterizations and to robustify the approaches with respect to adversarial examples and overfitting. More specifically, it is essential to contribute to the understanding of methods based on neural networks, in order to improve their performance and, most of all, their efficiency in resource-limited environments.

Challenge 2: Integrating models in‌ learning algorithms.

To make‌ statistical machine learning both‌‌ more frugal and more interpretable, it is important‌ to develop techniques able‌ to exploit not only‌‌ high-dimensional data but also models in various forms‌ when available. When some‌ partial knowledge is available‌‌ about some phenomena related to the processed data,‌ e.g. under the form‌ of a physical model‌‌ such as a partial differential equation, or as‌ a graph capturing local‌ or non-local correlations, the‌‌ goal is to use this knowledge as an‌ inspiration to adapt machine‌ learning algorithms. The main‌‌ challenge is to flexibly articulate a priori knowledge‌ and data-driven information, in‌ order to achieve a‌‌ controlled extrapolation of predicted phenomena much beyond the‌ particular type of data‌ on which they were‌‌ observed, and even in applications where training data‌ is scarce.

Challenge 3:‌ Guarantees on interpretability, explainability,‌‌ and privacy.

The notion of sparsity and its‌ structured avatars –notably via‌ graphs– is known to‌‌ play a fundamental role in ensuring the identifiability‌ of decompositions in latent‌ spaces, for example for‌‌ high-dimensional inverse problems in signal processing. The team's‌ ambition is to deploy‌ these ideas to ensure‌‌ not only frugality but also some level of‌ explainability of decisions and‌ an interpretability of learned‌‌ parameters, which is an‌ important societal stake for the acceptability of “algorithmic‌ decisions”. Learning in small-dimensional latent spaces is also‌ a way to spare computing resources and, by‌ limiting the public exposure of data, it is‌ expected to enable tunable and quantifiable tradeoffs between‌ the utility of the developed methods and their‌ ability to preserve privacy.

3 Research program

This‌ project is resolutely at the interface of signal‌ modeling, mathematical optimization and statistical machine learning, and‌ concentrates on scientific objectives that are both ambitious‌ –as they are difficult and subject to a‌ strong international competition– and realistic thanks to the‌ richness and complementarity of skills they mobilize in‌ the team.

Sparsity constitutes a backbone for this‌ project, not only as a target to ensure‌ resource-efficiency and privacy, but also as prior knowledge‌ to be exploited to ensure the identifiability of‌ parameters and the interpretability of results. Graphs are‌ its necessary alter ego, to flexibly model‌ and exploit relations between variables, signals, and phenomena,‌ whether these relations are known a priori or‌ to be inferred from data. Lastly, advanced large-scale‌ optimization is a key tool to handle in‌ a statistically controlled and algorithmically efficient way the‌ dynamic and incremental aspects of learning in varying‌ environments.

The scientific activity of the project is‌ articulated around the three axes described below. A‌ common endeavor to these three axes consists in‌ designing structured low-dimensional models, algorithms of bounded complexity‌ to adjust these models to data through learning‌ mechanisms, and a control of the performance of‌ these algorithms to exploit these models on tasks‌ ranging from low-level signal processing to the extraction‌ of high-level information.

3.1 Axis 1: Sparsity for‌ high-dimensional learning.

As now widely documented, the fact that a signal admits a sparse representation in some signal dictionary 66 is an enabling factor not only to address a variety of inverse problems with high-dimensional signals and images, such as denoising, deconvolution, or declipping, but also to speed up or decrease the cost of the acquisition of analog signals in certain scenarios compatible with compressive sensing 68, 60. The flexibility of the models, which can incorporate learned dictionaries 100, as well as structured and/or low-rank variants of the now-classical sparse modeling paradigm 78, has been a key factor of the success of these approaches. Another important factor is the existence of algorithms of bounded complexity with provable performance, often associated with convex regularization and proximal strategies 56, 63, which make it possible to identify latent sparse signal representations from low-dimensional indirect observations.
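As an illustration of the proximal strategies mentioned above, the following is a minimal sketch of the iterative soft-thresholding algorithm (ISTA) applied to an $\ell_1$-regularized least-squares problem. The problem sizes, the regularization weight `lam` and the function names are assumptions chosen for the example; they are not taken from the team's software.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def ista(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by proximal gradient."""
    # Step size 1 / L, where L = ||A||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                          # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)   # proximal step
    return x


# Toy compressive measurement of a 3-sparse vector (assumed sizes).
rng = np.random.default_rng(0)
n_obs, n_dim, n_nonzero = 40, 100, 3
A = rng.standard_normal((n_obs, n_dim)) / np.sqrt(n_obs)
x_true = np.zeros(n_dim)
support = rng.choice(n_dim, size=n_nonzero, replace=False)
x_true[support] = [2.0, -3.0, 1.5]
y = A @ x_true

x_hat = ista(A, y, lam=0.01, n_iter=2000)
```

Each iteration costs two matrix-vector products, which is the kind of bounded per-iteration complexity referred to in the text.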

While being now well-mastered (and in the core‌ field of expertise of the team), these tools‌ are typically constrained to relatively rigid settings where‌ the unknown is described either as a sparse‌ vector or a low-rank matrix or tensor in‌ high (but finite) dimension. Moreover, the algorithms hardly‌ scale to the dimensions needed to handle inverse‌ problems arising from the discretization of physical models (e.g., for 3D wavefield‌ reconstruction). A major challenge‌ is to establish a‌‌ comprehensive algorithmic and theoretical toolset to handle continuous‌ notions of sparsity 61‌, which have been‌‌ identified as a way to potentially circumvent these‌ bottlenecks. The other main‌ challenge is to extend‌‌ the sparse modeling paradigm to resource-efficient and interpretable‌ statistical machine learning. The‌ methodological and conceptual output‌‌ of this axis provides tools for Axes 2‌ and 3, which in‌ return fuel the questions‌‌ investigated in this axis.

1.1 Versatile and efficient‌ sparse modeling. The goal‌ is to propose flexible‌‌ and resource-efficient sparse models, possibly leveraging classical notions‌ of dictionaries and structured‌ factorization, but also the‌‌ notion of sparsity in continuous domains (e.g. for‌ sketched clustering, mixture model‌ estimation, or image super-resolution),‌‌ low-rank tensor representations, and neural networks with sparse‌ connection patterns.

Besides the‌ empirical validation of these‌‌ models and of the related algorithms on a‌ diversity of targeted applications,‌ the challenge is to‌‌ determine conditions under which their success can be‌ mathematically controlled, and to‌ determine the fundamental tradeoffs‌‌ between the expressivity of these models and their‌ complexity.
1.2 Sparse optimization. The main objectives are: a) to define cost functions and regularization penalties that integrate not only the targeted learning tasks, but also a priori knowledge, for example in the form of conservation laws or as relation graphs, cf. Axis 2; b) to design efficient and scalable algorithms 67, 80 to optimize these cost functions in a controlled manner in a large-scale setting. To ensure the resource-efficiency of these algorithms, while avoiding pitfalls related to the discretization of high-dimensional problems (a.k.a. the curse of dimensionality), we investigate the notion of “continuous” sparsity (i.e., with sparse measures), of hierarchies (along the ideas of multilevel methods), and of reduced precision (cf. also Axis 3). The nonconvexity and non-smoothness of the problems are key challenges, and the exploitation of proximal algorithms and/or convexifications in the space of Borel measures are preferred approaches.
1.3 Identifiability of latent sparse representations. To provide solid guarantees on the interpretability of sparse models obtained via learning, one needs to ensure the identifiability of the latent variables associated with their parameters. This is particularly important when these parameters bear some meaning due to the underlying physics. Conversely, physical knowledge can guide the choice of which latent parameters to estimate. By leveraging the team's know-how in the fields of inverse problems, compressive sensing and source separation in signal processing, we aim at establishing theoretical guarantees on the uniqueness (modulo some equivalence classes to be characterized) of the solutions of the considered optimization problems, on their stability in the presence of random or adversarial noise, and on the convergence and stability of the algorithms.

3.2 Axis 2: Learning on‌ graphs and learning of‌ graphs.

Graphs provide synthetic and sparse representations of the interactions between potentially high-dimensional data, whether in terms of proximity, statistical correlation, functional similarity, or simple affinities. One central task in this domain is to infer such discrete structures from the observations, in a way that best accounts for the ties between data, without becoming too complex due to spurious relationships. The graphical lasso 69 is among the most popular and successful algorithms to build a sparse representation of the relations between time series (observed at each node) that unveils relevant patterns of the data. Recent works (e.g. 79) strived to emphasize the clustered structure of the data by imposing spectral constraints on the Laplacian of the sought graphs, with the aim of improving the performance of spectral approaches to unsupervised classification. In this direction, several challenges remain, such as the transposition of the framework to graph-based semi-supervised learning 57, where natural models are stochastic block models rather than strictly multi-component graphs (e.g. Gaussian mixture models). As is done in 105, the standard $\ell_{1}$-norm penalization term of the graphical lasso could be questioned in this case. On another level, when low-rank (precision) matrices and/or the preservation of privacy are important stakes, one could be inspired by the sketching techniques developed in 74 and 62 to work out a sketched graphical lasso. There exist other situations where the graph is known a priori and does not need to be inferred from the data. This is for instance the case when the data naturally lie on a graph (e.g. social networks or geographical graphs), so that one has to combine this data structure with the attributes (or measures) carried by the nodes or the edges of these graphs.
Graph signal processing (GSP) 978, which underwent methodological developments at a very rapid pace in recent years, is precisely an approach to jointly exploit algebraically these structures and attributes, either by filtering them, by re-organizing them, or by reducing them to principal components. However, as tends to be more and more the case, data collection processes yield very large data sets with high-dimensional graphs. In contrast to standard digital signal processing, which relies on regular graph structures (cycle graph or Cartesian grid), treating complex structured data in a global form is not an easily scalable task 71. Hence, the notion of distributed GSP 64, 65 has naturally emerged. Yet, very little has been done on graph signals supported on dynamical graphs that undergo vertex and edge edits.
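To make the idea of filtering a graph signal concrete, here is a minimal sketch of an ideal low-pass graph filter on a cycle graph, the regular structure underlying classical digital signal processing. The graph size, the noise level and the `cutoff` value are assumptions chosen for the example. The full eigendecomposition used here is exactly the global operation that does not scale to large graphs.

```python
import numpy as np

# Cycle graph on n nodes (assumed toy size).
n = 16
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

L = np.diag(A.sum(axis=1)) - A      # combinatorial graph Laplacian
eigvals, U = np.linalg.eigh(L)      # graph frequencies and graph Fourier basis

# A smooth signal on the nodes, observed with additive noise.
rng = np.random.default_rng(0)
smooth = np.cos(2 * np.pi * np.arange(n) / n)
signal = smooth + 0.3 * rng.standard_normal(n)

# Ideal low-pass filter: keep only graph Fourier coefficients whose
# frequency (Laplacian eigenvalue) is below an assumed cutoff.
cutoff = 1.0
coeffs = U.T @ signal                            # graph Fourier transform
filtered = U @ (coeffs * (eigvals <= cutoff))    # inverse transform of kept coefficients
```

On the cycle graph the Laplacian eigenvectors are discrete sinusoids, so this filter reduces to ordinary Fourier low-pass filtering; on an irregular graph the same three lines apply unchanged.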

2.1 Learning of graphs.‌ When the graphical structure of the data is‌ not known a priori, one needs to explore‌ how to build it or to infer it.‌ In the case of partially known graphs, this‌ raises several questions in terms of relevance with‌ respect to sparse learning. For example, a challenge‌ is to determine which edges should be kept,‌ whether they should be oriented, and how attributes‌ on the graph could be taken into account (in particular when considering‌ time-series on graphs) to‌ better infer the nature‌‌ and structure of the un-observed interactions. We strive‌ to adapt known approaches‌ such as the graphical‌‌ lasso to estimate the covariance under a sparsity‌ constraint (integrating also temporal‌ priors), and investigate diffusion‌‌ approaches to study the identifiability of the graphs.‌ In connection with Axis‌ 1.2, a particular challenge‌‌ is to incorporate a priori knowledge coming from‌ physical models that offer‌ concise and interpretable descriptions‌‌ of the data and their interactions.
2.2 Distributed and adaptive learning on graphs. The availability of a known graph structure underlying training data offers many opportunities to develop distributed approaches, opening perspectives where graph signal processing and machine learning can cross-fertilize.

Some classifiers can be‌‌ formalized as solutions of a constrained optimization problem,‌ and an important objective‌ is then to reduce‌‌ their global complexity by developing distributed versions of‌ these algorithms. Compared to‌ costly centralized solutions, distributing‌‌ the operations by restricting them to local node‌ neighborhoods will enable solutions‌ that are both more‌‌ frugal and more privacy-friendly. In the case of‌ dynamic graphs, the idea‌ is to get inspiration‌‌ from adaptive processing techniques to make the algorithms‌ able to track the‌ temporal evolution of data,‌‌ either in terms of structural evolution or of‌ temporal variations of the‌ attributes. This aspect finds‌‌ a natural continuation in the objectives of Axis‌ 3.
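The principle of restricting operations to local node neighborhoods can be illustrated with the simplest distributed computation, average consensus. The sketch below is a toy example under assumed values (a 6-node ring, a fixed step size); it is not one of the team's algorithms.

```python
import numpy as np

# Hypothetical 6-node ring network: each node only talks to its two neighbours.
n = 6
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

values = np.array([4.0, 8.0, 15.0, 16.0, 23.0, 42.0])  # one local value per node
target = values.mean()                                  # what a central server would compute

step = 0.3  # chosen below 1 / max_degree, which is sufficient for convergence here
x = values.copy()
for _ in range(200):
    # Synchronous update: every node moves towards its neighbours' current
    # values, using only information available in its neighbourhood.
    x = np.array(
        [x[i] + step * sum(x[j] - x[i] for j in neighbors[i]) for i in range(n)]
    )
```

After enough rounds every node holds (approximately) the global average, although no node ever accessed the whole dataset. In matrix form the update is multiplication by the identity minus `step` times the graph Laplacian.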

3.3 Axis 3:‌ Dynamic and frugal learning.‌‌

With the resurgence of neural network approaches in machine learning, training times of the order of days, weeks, or even months are common. Mainstream research in deep learning applies it to an increasingly large class of problems and follows the general wisdom of improving the models' prediction accuracy by “stacking more layers”, making the approach ever more resource-hungry. The underpinning theory on which resources are needed for a network architecture to achieve a given accuracy is still in its infancy. Efficient scaling of such techniques to massive sample sizes or dimensions in a resource-restricted environment remains a challenge and is a particularly active field of academic and industrial R&D, with recent interest in techniques such as sketching, dimension reduction, and approximate optimization.

A central challenge is to develop novel approximate techniques with reduced computational and memory imprint. For certain unsupervised learning tasks such as PCA, unsupervised clustering, or parametric density estimation, random features (e.g. random Fourier features 95) make it possible to compute aggregated sketches guaranteed to preserve the information needed to learn, and no more: this has led to the compressive learning framework, which is endowed with statistical learning guarantees 74 as well as privacy preservation guarantees 62. A sketch can be seen as an embedding of the empirical probability distribution of the dataset with a particular form of kernel mean embedding 98. Yet, designing random features given a learning task remains something of an art, and a major challenge is to design provably good end-to-end sketching pipelines with controlled complexity for supervised classification, structured matrix factorization, and deep learning.
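As a rough illustration of what such a sketch looks like, the snippet below averages random Fourier features over a dataset. The Gaussian frequency distribution, the sketch size and the helper name `sketch` are assumptions made for the example; choosing them well for a given task is precisely the open design question raised above.

```python
import numpy as np


def sketch(X, W):
    """Average of the random Fourier features exp(i * W^T x) over the rows of X."""
    return np.exp(1j * X @ W).mean(axis=0)


rng = np.random.default_rng(0)
dim, sketch_size = 2, 50
W = rng.standard_normal((dim, sketch_size))   # random frequencies (assumed Gaussian)

# 10,000 points are summarized by sketch_size complex numbers.
X = rng.standard_normal((10_000, dim)) + np.array([3.0, -1.0])
z = sketch(X, W)

# The sketch is linear in the empirical distribution, so sketches of
# separate chunks can be merged without revisiting the raw data.
z_a, z_b = sketch(X[:4_000], W), sketch(X[4_000:], W)
z_merged = (4_000 * z_a + 6_000 * z_b) / 10_000
```

The merge property is what makes sketching suited to streams and distributed data: only the fixed-size vector `z` needs to be stored or exchanged.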

Another crucial direction is the use‌ of dynamical learning methods, capable of exploiting wisely‌ multiple representations at different scales of the problem‌ at hand. For instance, many low and mixed-precision‌ variants of gradient-based methods have been recently proposed‌ 103, 102, which are however based‌ on a static reduced precision policy, while a‌ dynamic approach can lead to much improved energy-efficiency.‌ Also, despite their massive success, gradient-based training methods‌ still possess many weaknesses (low convergence rate, dependence‌ on the tuning of the learning parameters, vanishing‌ and exploding gradients) and the use of dynamical‌ information promises to allow for the development of‌ alternative methods, such as second-order or multilevel methods,‌ which are as scalable as first-order methods but‌ with faster convergence guarantees 96, 104.‌

The overall objective in this axis is to‌ adapt in a controlled manner the information that‌ is extracted from datasets or data streams and‌ to dynamically use such information in learning, in‌ order to optimize the tradeoffs between statistical significance,‌ resource-efficiency, privacy-preservation and integration of a priori knowledge.‌

3.1 Compressive and privacy-preserving learning. The goal is to compress training datasets as early as possible in the processing workflow, before even starting to learn. In the spirit of compressive sensing, this is desirable not only to ensure the frugal use of resources (memory and computation), but also to preserve privacy by limiting the diffusion of raw datasets and controlling the information that could actually be extracted from the targeted compressed representations, called sketches, obtained by well-chosen nonlinear random projections. We aim to build on a compressive learning framework developed by the team with the viewpoint that sketches provide an embedding of the data distribution, which should preserve some metrics, either associated with the specific learning task or with more generic optimal transport formulations. Besides ensuring the identifiability of the task-specific information from a sketch (cf. Axis 1.3), an objective is to efficiently extract this information from a sketch, for example via algorithms related to avatars of continuous sparsity as studied in Axis 1.2. A particular challenge, connected with Axis 2.1 when inferring dynamic graphs from correlations of non-stationary time series, and with Axis 3.2 below, is to dynamically adapt the sketching mechanism to the analyzed data stream.
3.2‌ Sequential sparse learning. Whether aiming at dynamically learning‌ on data streams (cf. Axes 2.1 and 2.2),‌ at integrating a priori physical knowledge when learning,‌ or at ensuring domain adaptation for transfer learning,‌ the objective is to achieve a statistically near-optimal‌ update of a model from a sequence of‌ observations whose content can also dynamically vary. When‌ considering time-series on graphs, to preserve resource-efficiency and‌ increase robustness, the algorithms further need to update‌ the current models by dynamically integrating the data‌ stream.
3.3 Dynamic-precision learning. The goal is to propose new optimization algorithms‌ to overcome the cost‌ of solving large scale‌‌ problems in learning, by dynamically adapting the precision‌ of the data. The‌ main idea is to‌‌ exploit multiple representations at different scales of the‌ problem at hand. We‌ explore in particular two‌‌ different directions to build the scales of problems:‌ a) exploiting ideas coming‌ from multilevel optimization to‌‌ propose dynamical hierarchical approaches exploiting representations of the‌ problem of progressively reduced‌ dimension; b) leveraging the‌‌ recent advances in hardware and the possibility of‌ representing data at multiple‌ precision levels provided by‌‌ them. We aim at improving over state-of-the-art training‌ strategies by investigating the‌ design of scalable multilevel‌‌ and mixed-precision second-order optimization and quantization methods, possibly‌ derivative-free.

4 Application domains‌

The primary objectives of‌‌ this project, which is rooted in Signal Processing‌ and Machine Learning methodology,‌ are to develop flexible‌‌ methods, endowed with solid mathematical foundations and efficient‌ algorithmic implementations, that can‌ be adapted to numerous‌‌ application domains. We are nevertheless convinced that such‌ methods are best developed‌ in strong and regular‌‌ connection with concrete applications, which are not only‌ necessary to validate the‌ approaches but also to‌‌ fuel the methodological investigations with relevant and fruitful‌ ideas. The following application‌ domains are primarily investigated‌‌ in partnership with research groups with the relevant‌ expertise.

4.1 Frugal AI‌ on embedded devices

There‌‌ is a strong need to drastically compress signal‌ processing and machine learning‌ models (typically, but not‌‌ only, deep neural networks) to fit them on‌ embedded devices. For example,‌ on autonomous vehicles, due‌‌ to strong constraints (reliability, energy consumption, production costs),‌ the memory and computing‌ resources of dedicated high-end‌‌ image-analysis hardware are two orders of magnitude more‌ limited than what is‌ typically required to run‌‌ state-of-the-art deep network models in real-time. The research‌ conducted in the OCKHAM‌ project finds direct applications‌‌ in these areas, including: compressing deep neural networks‌ to obtain low-bandwidth video-codecs‌ that can run on‌‌ smartphones with limited memory resources; sketched learning and‌ sparse networks for autonomous‌ vehicles; or sketching algorithms‌‌ tailored to exploit optical processing units for energy‌ efficient large-scale learning.

4.2‌ Imaging in physics and‌‌ medicine

Many problems in imaging involve the reconstruction of large-scale data from limited and noise-corrupted measurements. In this context, the research conducted in OCKHAM pays special attention to modeling domain knowledge such as physical constraints or prior medical knowledge. This finds applications from physics to medical imaging, including: multiphase flow image characterization; near-infrared polarization imaging in circumstellar imaging; compressive sensing for joint segmentation and high-resolution 3D MRI imaging; or graph signal processing for radio astronomy imaging with the Square Kilometer Array (SKA).

4.3 Interactions with‌ computational social sciences

Based on collaborations with the relevant experts, the team also regularly investigates applications in computational social science. For example, modeling infectious disease epidemics requires efficient methods to reduce the complexity of large networked datasets while preserving the ability to feed effective and realistic data-driven models of spreading phenomena. In another area, estimating the vote transfer matrices between two elections is an ill-posed problem that requires the design of adapted regularization schemes together with the associated optimization algorithms.

5 Highlights of the‌ year

The paper “On the closed form of flow matching: generalization does not arise from stochasticity” 1 was accepted as an oral presentation at NeurIPS 2025 (top 0.3% of more than 22000 submissions).

The paper “Transformative or conservative? Conservation laws for ResNets and Transformers” 26 was accepted as an oral presentation at ICML 2025 (top 1% of about 12000 submissions).

The paper “Rapture of‌ the deep: highs and lows of sparsity in‌ a world of depths” 4 has been accepted‌ in the Signal Processing Magazine.

Antoine Gonon, former Ph.D. student of the Ockham team, was awarded an honorable mention (2nd ex aequo) of the 2025 Ph.D. award of the Société Savante Francophone en Apprentissage Machine.

6 Latest software developments, platforms,‌ open data

6.1 Latest software developments

6.1.1 skglm‌

Keywords:
Optimization, Machine learning, Sparsity
Functional Description:

skglm‌ is a Python package that offers fast estimators‌ for Generalized Linear Models (GLMs) that are compatible‌ with scikit-learn. It is highly flexible and supports‌ a wide range of GLMs. Its main feature‌ is flexibility: you can implement virtually any estimator‌ as a combination of datafit and penalty.

Thanks to this flexible design, skglm supports many models that are missing from scikit-learn while ensuring high performance. There are several reasons to opt for skglm:

- Support for many fast solvers able to tackle large datasets, either dense or sparse, with millions of features, up to 100 times faster than scikit-learn
- User-friendly API that enables composing custom estimators with any combination of existing datafits and penalties
- Flexible design that makes it simple and easy to implement new datafits and penalties, a matter of a few lines of code
- Estimators fully compatible with the scikit-learn API and drop-in replacements of its GLM estimators

skglm is‌ integrated into scikit-learn via the scikit-learn-contrib organization.
URL:‌
https://contrib.scikit-learn.org/skglm/
Publication:
hal-03819082
Contact:
Mathurin Massias
Participant:
2‌ anonymous participants
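The "datafit plus penalty" decomposition described in the skglm entry above can be illustrated with a small NumPy-only sketch. All class and function names below (`QuadraticDatafit`, `L1Penalty`, `L2SquaredPenalty`, `solve`) are hypothetical and written for this example; they are not skglm's API, and the plain proximal gradient loop stands in for the much faster solvers the package provides.

```python
import numpy as np


class QuadraticDatafit:
    """Hypothetical datafit: f(w) = ||X w - y||^2 / (2 n)."""

    def value(self, X, y, w):
        return np.sum((X @ w - y) ** 2) / (2 * len(y))

    def gradient(self, X, y, w):
        return X.T @ (X @ w - y) / len(y)

    def lipschitz(self, X, y):
        return np.linalg.norm(X, 2) ** 2 / len(y)


class L1Penalty:
    """Hypothetical penalty: g(w) = alpha * ||w||_1."""

    def __init__(self, alpha):
        self.alpha = alpha

    def value(self, w):
        return self.alpha * np.abs(w).sum()

    def prox(self, w, step):
        return np.sign(w) * np.maximum(np.abs(w) - step * self.alpha, 0.0)


class L2SquaredPenalty:
    """Hypothetical penalty: g(w) = alpha / 2 * ||w||^2."""

    def __init__(self, alpha):
        self.alpha = alpha

    def value(self, w):
        return 0.5 * self.alpha * np.sum(w ** 2)

    def prox(self, w, step):
        return w / (1.0 + step * self.alpha)


def solve(X, y, datafit, penalty, n_iter=500):
    """Generic proximal gradient loop, valid for any datafit/penalty pair."""
    w = np.zeros(X.shape[1])
    step = 1.0 / datafit.lipschitz(X, y)
    for _ in range(n_iter):
        w = penalty.prox(w - step * datafit.gradient(X, y, w), step)
    return w


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
y = X[:, 0] - 2.0 * X[:, 1]

# Swapping the penalty changes the estimator; the solver code is untouched.
w_l1 = solve(X, y, QuadraticDatafit(), L1Penalty(alpha=0.1))
w_ridge = solve(X, y, QuadraticDatafit(), L2SquaredPenalty(alpha=0.1))
```

The solver only needs a gradient from the datafit and a proximal operator from the penalty, so a new model amounts to writing one small class.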

6.1.2 Benchopt

Keywords:
Benchmarking, Machine learning,‌ Optimization
Functional Description:

BenchOpt is a package to simplify the comparison of optimization algorithms and to make it more transparent and more reproducible. It is written in Python but can be used with many programming languages. So far it has been tested with Python, R, Julia and compiled binaries written in C/C++ available via a terminal command. If it can be installed via conda, it should just work!

BenchOpt is used through a simple command line and, ultimately, running and replicating an optimization benchmark should be as easy as cloning a repo and launching the computation with a single command line. For now, BenchOpt features benchmarks for around 10 convex optimization problems and we are working on expanding this to feature more complex optimization problems. We are also developing a website to display the benchmark results easily.
Release Contributions:‌
https://github.com/benchopt/benchopt/releases/tag/1.5.1
Publication:
hal-03830604
Contact:‌‌
Thomas Moreau
Participant:
4 anonymous participants

6.1.3 lazylinop‌

Name:
lazylinop
Keywords:
Signal‌ processing, Numerical algorithm, Scientific‌‌ computing
Scientific Description:
lazylinop is an easy way to combine existing operators into more complex operators with direct access to their adjoints.
Functional Description:
Lazy evaluation of linear operators applied to vectors or matrices. lazylinop aims at providing an easy way to combine existing operators into more complex operators with direct access to their adjoints. Thanks to the lazy computation paradigm, lazylinop offers potential performance gains and memory savings.
Release Contributions:

- Basic linear operators: Kronecker product, addition, diagonal, block-diagonal, concatenation ...
- Polynomial of linear operators.
- Usual signal processing linear operators.
- Usual image processing linear operators.
- Butterfly linear operators.
- Near optimal Butterfly (real values) quantization.
- Lazylinop operators take as input NumPy/CuPy arrays or torch tensors (via array-api)

Work-In-Progress:
- Near optimal Butterfly (complex values) quantization.
URL:
https://lazylinop.inria.fr
Contact:
Pascal Carrivain‌
Participant:
4 anonymous participants‌
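The lazy evaluation paradigm described in the lazylinop entry above can be sketched in a few lines of NumPy. The `LazyOp` class below is a hypothetical toy written for this example and restricted to vectors; it does not reproduce lazylinop's API.

```python
import numpy as np


class LazyOp:
    """Toy lazy linear operator defined by its forward and adjoint actions."""

    def __init__(self, shape, matvec, rmatvec):
        self.shape = shape
        self._matvec = matvec      # x -> A x
        self._rmatvec = rmatvec    # y -> A^H y

    def __matmul__(self, other):
        if isinstance(other, LazyOp):
            # Composition is only recorded: no matrix product is ever formed.
            return LazyOp(
                (self.shape[0], other.shape[1]),
                lambda x: self._matvec(other._matvec(x)),
                lambda y: other._rmatvec(self._rmatvec(y)),
            )
        return self._matvec(np.asarray(other))   # actual evaluation on a vector

    @property
    def H(self):
        """Adjoint operator, obtained by swapping the two actions."""
        return LazyOp(self.shape[::-1], self._rmatvec, self._matvec)


n = 8
d = np.arange(1.0, n + 1)
diag = LazyOp((n, n), lambda x: d * x, lambda y: d * y)   # real diagonal, self-adjoint
fft = LazyOp(
    (n, n),
    lambda x: np.fft.fft(x, norm="ortho"),    # unitary DFT
    lambda y: np.fft.ifft(y, norm="ortho"),   # its adjoint is the inverse
)

op = fft @ diag                  # nothing is computed yet
x = np.arange(n, dtype=float)
y = op @ x                       # evaluation happens here
back = op.H @ y                  # adjoint applied without any extra code
```

Neither the DFT matrix nor the product `fft @ diag` is stored, which is where the memory savings come from, and the adjoint of the composition is available for free.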

6.1.4 Celer

Keywords:
Mathematical‌‌ Optimization, Machine learning, Sparsity
Functional Description:

celer is a Python package that solves Lasso-like problems and provides estimators that follow the popular scikit-learn API. Thanks to a tailored implementation, celer provides a fast solver that tackles large-scale datasets with millions of features up to 100 times faster than scikit-learn. It handles Lasso, ElasticNet, Group Lasso, Multitask Lasso and Sparse Logistic regression, and comes with:

- automated parallel cross-validation
- support of sparse and dense data
- optional feature centering and normalization
- unpenalized intercept fitting

celer also provides easy-to-use estimators as it is designed under the scikit-learn API.
URL:
http://mathurinm.github.io/celer
Publications:
hal-02263500, hal-01833398‌
Contact:
Mathurin Massias
Participant:‌
2 anonymous participants
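
To make the "Lasso-like problems" above concrete, here is a minimal sketch of plain cyclic coordinate descent for the Lasso, written with the scikit-learn scaling of the objective. This is an assumption-laden illustration of the optimization problem only: it is not celer's solver, and the function names `soft_threshold` and `lasso_cd` are hypothetical.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |.|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise 1/(2n) * ||y - X w||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)                      # equals y - X w for w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            w_new = soft_threshold(rho, alpha) / col_sq[j]
            delta = w_new - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = w_new
    return w


# Orthogonal design: the solution is the soft-thresholded least-squares fit.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.05, 2.95, -2.95, -3.05]              # 3 * column 0 + 0.05 * column 1
w = lasso_cd(X, y, alpha=0.1)
```

On this toy design the weak second feature is set exactly to zero while the first coefficient is shrunk by `alpha`, which is the sparsity-inducing behaviour that Lasso solvers exploit.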

6.1.5‌‌ TorchDR

Keywords:
Optimal transportation, Machine learning, Dimensionality reduction,‌ High Dimensional Data
Scientific‌ Description:
TorchDR is an‌‌ open-source dimensionality reduction (DR) library using PyTorch. Its‌ goal is to accelerate‌ the development of new‌‌ DR methods by providing a common simplified framework.‌
Functional Description:
TorchDR is‌ an open-source dimensionality reduction‌‌ (DR) library using PyTorch. Its goal is to‌ accelerate the development of‌ new DR methods by‌‌ providing a common simplified framework.
URL:
https://torchdr.github.io/
Contact:‌
Titouan Vayer
Participant:
5‌ anonymous participants

6.1.6 FAuST‌‌

Keywords:
Matrix calculation, Multilayer sparse factorisation
Scientific Description:‌
FAuST makes it possible to approximate a given dense matrix by a product of sparse matrices, with considerable potential gains in terms of storage and speedup for matrix-vector multiplications.
Functional Description:

FAuST is a C++ toolbox designed to decompose a given dense matrix into a product of sparse matrices in order to reduce its computational complexity (both for storage and manipulation).

FAuST includes Matlab and Python wrappers and scripts to reproduce the experimental results of the following papers:
- Le Magoarou L. and Gribonval R., "Flexible multi-layer sparse approximations of matrices and applications", Journal of Selected Topics in Signal Processing, 2016.
- Le Magoarou L., Gribonval R., Tremblay N., "Approximate fast graph Fourier transforms via multi-layer sparse approximations", IEEE Transactions on Signal and Information Processing over Networks, 2018.
- Quoc-Tung Le, Rémi Gribonval. Structured Support Exploration For Multilayer Sparse Matrix Factorization. ICASSP 2021 – IEEE International Conference on Acoustics, Speech and Signal Processing, Jun 2021, Toronto, Ontario, Canada. pp.1-5.
- Sibylle Marcotte, Amélie Barbe, Rémi Gribonval, Titouan Vayer, Marc Sebban, et al. Fast Multiscale Diffusion on Graphs. 2021.
Release Contributions:

Faust 1.x contains‌ Matlab routines to reproduce experiments of the PANAMA‌ team on learned fast transforms.

Faust 2.x contains‌ a C++ implementation with preliminary Matlab / Python‌ wrappers.

Faust 3.x includes Python and Matlab wrappers‌ around a C++ core with GPU acceleration, new‌ algorithms.
URL:
https://faust.inria.fr/
Publications:
hal-03212764, hal-01416110,‌ hal-01627434, hal-01167948, hal-01254108, tel-01412558,‌ hal-01156478, hal-01104696, hal-01158057, hal-03132013
Contact:‌
Remi Gribonval
Participant:
6 anonymous participants
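
The storage and speed gains of a sparse product can be seen on a textbook case where the factorization is exact: the Hadamard matrix of size n (a power of two) is a product of log2(n) sparse "butterfly" factors with two nonzeros per row. The sketch below does not use the FAuST API, and FAuST targets the more general problem of finding approximate factorizations; the function names are hypothetical.

```python
def hadamard_dense(n):
    """Dense n x n Hadamard matrix (Sylvester construction), n a power of two."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def hadamard_fast(x):
    """Apply the same matrix as a product of log2(n) sparse butterfly factors."""
    x = list(x)
    n, h, ops = len(x), 1, 0
    while h < n:
        # One sparse factor: every output mixes exactly two inputs.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
                ops += 2
        h *= 2
    return x, ops


n = 8
x = list(range(1, n + 1))
dense_result = [sum(h * v for h, v in zip(row, x)) for row in hadamard_dense(n)]
fast_result, ops = hadamard_fast(x)
```

For n = 8 the factorized product uses n log2(n) = 24 additions or subtractions, against n^2 = 64 multiply-adds for the dense matrix-vector product, and the gap widens as n grows.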

7 New‌ results

7.1 Integrating Structured Models in Machine Learning‌ and Signal Processing

7.1.1 Physics-informed neural networks

Participants:‌ Elisa Riccietti.

Collaboration with Alena Kopanicakova (IRIT,‌ Toulouse), Stefania Bellavia and Mahsa Yousefi (UNIFI, Florence,‌ Italy).

Physics-informed neural networks (PINNs) are specialized network‌ architectures designed for the solution of partial differential‌ equations (PDEs) that take into account the underlying‌ physics of the problem. We investigated their use‌ both for direct and inverse problems involving PDEs.‌

In the context of the postdoc of Mahsa‌ Yousefi, we pursued the work started last year‌ on the investigation of their ability to deal‌ with ill-posed inverse problems, focusing especially on parameter‌ identification problems. We have proposed a two-step training‌ strategy that first fits the available noisy observations‌ and later adds the physics information. The strategy‌ is shown to improve the solution of such‌ ill-posed problems.

In collaboration with Alena Kopanicakova, we‌ have proposed a book chapter on scientific machine‌ learning with a focus on the training of‌ physics-informed neural networks, guided by the neural tangent‌ kernel theory to correct the spectral bias.

7.1.2‌ Differentiable and learning-based methods for structure representation: application‌ to sparse precision matrices

Participants: Can Pouliquen,‌ Paulo Goncalves, Mathurin Massias, Titouan Vayer‌.

The PhD of Can Pouliquen, defended successfully in December 2025, is devoted to the estimation of structures from signals, such as sparse precision matrices. For the latter problem we have adopted the mathematical framework of the Graphical Lasso, and pursued several directions. We have introduced SpodNet, a new deep neural network architecture for positive definite matrix estimation. In particular, it is the first architecture that can guarantee a simultaneously sparse and symmetric positive definite output. This highly desirable property was so far a missing feature of existing architectures, and has many potential applications in graph learning beyond neurosciences 28. This work was accepted to ICLR 2025. We have also developed a bilevel optimization framework that eases the tuning of individual correlation strengths in the Graphical Lasso penalty 94. Finally, we have proposed a fast and modular benchmark for the Graphical Lasso, together with high-quality open source implementations of fast solvers 35.
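
For reference, the Graphical Lasso estimator is commonly written as follows, where $S$ denotes the empirical covariance matrix and $\lambda > 0$ the regularization strength (notation assumed here, not taken from the cited papers):

$$
\hat{\Theta} \in \operatorname*{arg\,min}_{\Theta \succ 0} \; -\log\det\Theta + \operatorname{tr}(S\Theta) + \lambda \sum_{i \neq j} |\Theta_{ij}|
$$

A weighted variant replaces the single $\lambda$ by entrywise weights $\Lambda_{ij} \geq 0$, giving the penalty $\sum_{i \neq j} \Lambda_{ij} |\Theta_{ij}|$. Tuning such weights by hand is impractical, which is the kind of setting where a bilevel formulation is useful; whether this matches the exact parameterization used in 94 is an assumption.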

7.1.3 New penalties and‌ proximal operators

Participants: Anne‌‌ Gagneux, Remi Gribonval, Mathurin Massias.‌

Collaboration with Emmanuel Soubies‌ (CNRS, IRIT, Toulouse).

Finishing the internship work of Anne Gagneux, we have studied the properties of sorted nonconvex penalties. Convex sorted penalties such as SLOPE are known to automatically cluster coefficients associated with correlated variables; nonconvex penalties, on the other hand, mitigate the well-known amplitude bias of the L1 norm. Combining nonconvexity with automatic grouping is therefore a promising avenue. However, such new penalties raise many technical difficulties (nonconvexity, nonsmoothness). We have derived an algorithm based on the Pool Adjacent Violators Algorithm (PAVA) that computes the exact proximal operator of a first kind of sorted penalties (sorted MCP, sorted Log-sum). We have also extended it to compute the proximal operators of the sorted $\ell_{q}$ ($0 < q < 1$) penalties, which presented more difficulties due to the lack of Lipschitz continuity. This work has been submitted to IEEE TSP 44.
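
As background for the PAVA-based algorithm mentioned above, here is a minimal sketch of the classical Pool Adjacent Violators Algorithm in its standard use, namely least-squares isotonic regression with unit weights. It is only the building block: the proximal operators of sorted nonconvex penalties studied in 44 require additional steps that are not reproduced here, and the function name `pava` is hypothetical.

```python
def pava(y):
    """Least-squares projection of y onto non-decreasing sequences (unit weights)."""
    blocks = []                      # each block stores [sum, count]
    for v in y:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean
        # (means are compared by cross-multiplication to avoid divisions).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)      # every entry of a block takes the block mean
    return out


fitted = pava([1, 3, 2, 4, 3, 5])
```

Each violating pair is pooled into a block carrying its average, which is why the output is piecewise constant: the same pooling mechanism is what produces clusters of equal coefficients with sorted penalties.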

7.1.4 Inverse problems for medical imaging

Participants: Marion‌ Foare.

Collaboration with Luis Enrique Amador Araya (Creatis, Villeurbanne), Hélène Ratiney (Creatis, Villeurbanne), Éric Van Reeth (Creatis, Villeurbanne), and Siemens Healthcare, Saint Denis.

It is of particular interest in the field‌ of medical imaging to‌ quickly acquire low-resolution volumes‌‌ (compromise between acquisition time, SNR and spatial resolution),‌ and enhance their resolution‌ as a post-processing step.‌‌ In particular, isotropic super-resolution (ISR) techniques consist in‌ reconstructing an isotropic volume‌ from the combination of‌‌ several anisotropic volumes acquired with different orientations.

In the context of the PhD work of Luis Enrique Amador Araya, we pursued the development of specialized piecewise-smooth variational methods combining data fitting terms with geometric priors (e.g. the Discrete Mumford-Shah model) to build faithful super-resolution images in 3D Magnetic Resonance Imaging (MRI).

In particular, we explored new regularization terms to extend this approach to multi-contrast ISR, that is, to reconstruct isotropic, multi-contrast, high-resolution images from multi-contrast anisotropic acquisitions. Preliminary results were accepted for publication at the conference ISBI 2026.

7.1.5 Gromov‌ hyperbolicity for tree representation‌‌ of relational data

Participants: Titouan Vayer.

Collaboration‌ with Pierre Houedry, Nicolas‌ Courty, Florestan Martin-Baillon, Laetitia‌‌ Chapel from Université Bretagne Sud.

Trees and the associated shortest-path tree metrics provide a powerful framework for representing hierarchical and combinatorial structures in data. Designing algorithms that can produce a tree from pairwise relationships between data points is an active subject of interest. However, most common approaches are either heuristic and lack guarantees, or perform moderately well. In 24 we develop a geometrical framework for learning such trees, based on the notion of Gromov hyperbolicity, which encodes the extent to which a metric space deviates from a tree structure. We introduce a novel differentiable optimization framework, coined DeltaZero, that solves this problem. Experiments on synthetic and real-world datasets demonstrate that our method consistently achieves state-of-the-art distortion. This work was accepted at NeurIPS 2025.
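
The notion of Gromov hyperbolicity used above can be computed directly on a small finite metric space via the four-point condition. The sketch below uses one common convention (half the gap between the two largest of the three pairwise-sum combinations; other conventions differ by a constant factor) and brute-force enumeration of quadruples. It is not the DeltaZero method, and `gromov_delta` is a hypothetical name.

```python
from itertools import combinations


def gromov_delta(D):
    """Four-point Gromov hyperbolicity of a finite metric given by its distance matrix."""
    delta = 0.0
    for x, y, z, w in combinations(range(len(D)), 4):
        sums = sorted([D[x][y] + D[z][w], D[x][z] + D[y][w], D[x][w] + D[y][z]])
        # In a tree metric the two largest sums are always equal.
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta


# Leaves a, b hang from node u; leaves c, d hang from node v; all edges have length 1.
tree = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
# Shortest-path metric of a 4-cycle with unit edges.
cycle = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
delta_tree = gromov_delta(tree)
delta_cycle = gromov_delta(cycle)
```

The tree metric gives a hyperbolicity of zero while the cycle does not, which illustrates why driving this quantity towards zero is a natural objective when fitting a tree to data.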

7.1.6 Contrastive‌ pre-training of transformer encoders for SEEG-based seizure onset‌ zone detection

Participants: Paulo Goncalves.

Collaboration with‌ Pierre Borgnat (ENS de Lyon), Julien Jung (Hôpital‌ Neurologique, HCL, CRNL).

Within the context of his‌ Master 2 internship, Zacharie Rodière pursued the work‌ of Gaetan Frusque, a former PhD student in‌ our group 70 on the clinical study of‌ epilepsy. Zacharie developed a transformer encoder for the‌ detection of Seizure Onset Zone (SOZ) from stereo-EEG.‌ It integrates clinically grounded time-frequency features with spatial‌ contrastive pre-training. While prior spatial transformer approaches analyse‌ learned representations, the proposed method uniquely combines: (1)‌ engineered time-frequency representations (TFRs) encoding epileptic spikes and‌ oscillations, and (2) a contrastive objective leveraging anatomical‌ relationships between the electrode contacts that are either‌ inside the SOZ or outside. Attention heads provide‌ interpretable connectivity patterns, bridging data-driven learning with the‌ study of functional connectivity networks. Zacharie presented his‌ preliminary results at the Graph Signal Processing Workshop‌ 2025 37.

7.2 Deep neural networks: theory and algorithms

7.2.1 Mathematics of deep learning:‌ rescaling invariances, generalization bounds, and conservation laws

Participants:‌ Rémi Gribonval, Elisa Riccietti, Sibylle Marcotte‌, Arthur Lebeurrier, Titouan Vayer.

Collaborations‌ with Nicolas Brisebarre (ARIC team, ENS de Lyon),‌ and with Gabriel Peyré (DMA, ENS, Paris)

Rescaling invariance in ReLU networks. Neural networks with the ReLU activation function are described by weight and bias parameters, and realize a continuous piecewise linear function. Natural rescaling and permutation operations on the parameters leave the realization unchanged, leading to equivalence classes of parameters that yield the same realization.
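
The rescaling invariance can be checked numerically on a tiny one-hidden-layer network: multiplying the incoming weights and bias of a hidden neuron by a positive factor and dividing its outgoing weights by the same factor leaves the realized function unchanged, because ReLU is positively homogeneous. The weights and helper names below are illustrative assumptions.

```python
def relu(v):
    return [max(0.0, t) for t in v]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def network(W1, b1, W2, b2, x):
    """One-hidden-layer ReLU network x -> W2 relu(W1 x + b1) + b2."""
    hidden = relu([a + b for a, b in zip(matvec(W1, x), b1)])
    return [a + b for a, b in zip(matvec(W2, hidden), b2)]


def rescale_neuron(W1, b1, W2, k, lam):
    """Multiply neuron k's incoming weights and bias by lam > 0, outgoing weights by 1/lam."""
    W1s = [[w * lam for w in row] if i == k else list(row) for i, row in enumerate(W1)]
    b1s = [b * lam if i == k else b for i, b in enumerate(b1)]
    W2s = [[w / lam if j == k else w for j, w in enumerate(row)] for row in W2]
    return W1s, b1s, W2s


W1, b1 = [[1.0, -2.0], [0.5, 1.5]], [0.5, -1.0]
W2, b2 = [[2.0, -1.0]], [0.25]
W1s, b1s, W2s = rescale_neuron(W1, b1, W2, k=0, lam=4.0)

inputs = [[1.0, 2.0], [-0.5, 0.3], [2.0, -1.0]]
gaps = [
    abs(network(W1, b1, W2, b2, x)[0] - network(W1s, b1s, W2s, b2, x)[0])
    for x in inputs
]
```

The two parameter sets differ but all entries of `gaps` vanish (up to rounding), so both belong to the same equivalence class.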

Path-embedding and path-norm based generalization bounds. The‌ path-embedding of parameters that we introduced in 99‌ was invariant to such scalings but limited to‌ strictly layered ReLU architectures. In the context of‌ the PhD of Antoine Gonon 73 (defended on‌ 12/11/2024), we extended it 72 to fully encompass‌ general DAG ReLU networks with biases, skip connections‌ and any operation based on the extraction of‌ order statistics: max pooling, GroupSort etc. The norm‌ of the resulting embedding is called a path-norm,‌ and we established a general toolkit to obtain‌ statistical generalization bounds for such modern neural networks.‌ The resulting bounds are not only the most‌ widely applicable path-norm based ones, but also recover‌ or beat the sharpest known bounds of this‌ type. These extended path-norms further enjoy the usual‌ benefits of path-norms: ease of computation, invariance under‌ the symmetries of the network, and improved sharpness‌ on feed-forward networks compared to the product of‌ operators’ norms, another complexity measure most commonly used.‌ The versatility of the toolkit and its ease‌ of implementation allowed us to challenge the concrete‌ promises of path-norm-based generalization bounds, by numerically evaluating‌ the sharpest known bounds for ResNets on ImageNet.‌ Building on this toolkit, we more recently investigated a rescaling-invariant Lipschitz bound‌ on the mapping from‌ parameter space to function‌‌ space and illustrated its potential for neural network‌ pruning and quantization 22‌ in a paper published‌‌ at ICML 2025.
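
As a small illustration of the invariance property claimed above, the sketch below computes the L1 path-norm of a bias-free layered network (the sum over all input-output paths of the product of absolute weights) and compares it with a product of per-layer operator norms before and after rescaling one hidden neuron. Biases, skip connections and pooling, which the extended path-norms handle, are omitted for brevity; the choice of the operator norm induced by the l-infinity norm and all names are assumptions.

```python
def path_norm_l1(weights):
    """Sum over all input-output paths of the product of |weights| along the path."""
    v = [1.0] * len(weights[0][0])           # one entry per input unit
    for W in weights:
        v = [sum(abs(w) * vi for w, vi in zip(row, v)) for row in W]
    return sum(v)


def product_of_norms(weights):
    """Product over layers of the operator norm induced by the l-infinity norm."""
    prod = 1.0
    for W in weights:
        prod *= max(sum(abs(w) for w in row) for row in W)
    return prod


W1 = [[1.0, -2.0], [0.5, 1.5]]
W2 = [[2.0, -1.0]]
# Same function: hidden neuron 0 scaled by 4 on the way in and by 1/4 on the way out.
W1s = [[4.0, -8.0], [0.5, 1.5]]
W2s = [[0.5, -1.0]]

pn_before, pn_after = path_norm_l1([W1, W2]), path_norm_l1([W1s, W2s])
prod_before, prod_after = product_of_norms([W1, W2]), product_of_norms([W1s, W2s])
```

The path-norm is identical for the two equivalent parameterizations, whereas the product of layer norms doubles: a bound built on the latter therefore depends on an arbitrary choice of representative.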

Conservation laws. In the thesis‌ of Sibylle Marcotte (defended‌ on 21/11/2025), the above‌‌ path-embedding also served as a key enabler for‌ the analysis of conservation‌ laws in gradient descent‌‌ dynamics of ReLU networks 91. Understanding the‌ geometric properties of gradient‌ descent dynamics is indeed‌‌ a key ingredient in deciphering the recent success‌ of very large machine‌ learning models. A striking‌‌ observation is that trained over-parameterized models retain some‌ properties of the optimization‌ initialization. This "implicit bias"‌‌ is believed to be responsible for some favorable‌ properties of the trained‌ models and could explain‌‌ their good generalization properties.

Our initial work on this topic 91 was conducted with a threefold motivation. First, we rigorously exposed the definition and basic properties of "conservation laws", which are maximal sets of independent quantities conserved during gradient flows of a given model (e.g. of a ReLU network with a given architecture) with any training data and any loss. Then we explained how to find the exact number of these quantities by performing finite-dimensional algebraic manipulations on the Lie algebra generated by the Jacobian of the model. Finally, we provided algorithms (implemented in SageMath) to: a) compute a family of polynomial laws; b) compute the number of (not necessarily polynomial) conservation laws. We provided showcase examples that we fully worked out theoretically. Besides, applying the two algorithms confirmed for a number of ReLU network architectures that all known laws are recovered by the algorithm, and that there are no other laws. Such computational tools paved the way to understanding desirable properties of optimization initialization in large machine learning models.
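
A conservation law can be observed on the simplest possible example, assumed here for illustration: a two-layer scalar linear network $f = u v$ trained on a single target $y$ with the squared loss. Along the gradient flow, $\frac{d}{dt}(u^2 - v^2) = 0$, so $u^2 - v^2$ keeps its initial value whatever the target. The sketch below uses gradient descent with a small step as a stand-in for the continuous flow, so the quantity is only conserved up to a small discretization drift.

```python
def train(u, v, y, lr=1e-3, steps=5000):
    """Gradient descent on 0.5 * (u * v - y)^2 for a two-layer scalar linear network."""
    for _ in range(steps):
        r = u * v - y
        u, v = u - lr * r * v, v - lr * r * u      # simultaneous update of both weights
    return u, v


u0, v0, y = 1.5, 0.5, 2.0
u1, v1 = train(u0, v0, y)

conserved_before = u0 ** 2 - v0 ** 2
conserved_after = u1 ** 2 - v1 ** 2
fit_error = abs(u1 * v1 - y)
```

The loss is driven to nearly zero while `conserved_after` stays close to `conserved_before`: the trained parameters remember a property of the initialization, which is the mechanism behind the "implicit bias" discussed above.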

We then studied 92 the notion of conservation law and the corresponding algorithms for optimization flows associated with non-Euclidean geometries and momentum-based dynamics. We characterized "all" conservation laws in this general setting. In stark contrast to the case of gradient flows, we proved that the conservation laws for momentum-based dynamics exhibit temporal dependence. Additionally, we often observed a "conservation loss" when transitioning from gradient flow to momentum dynamics. Specifically, for linear networks, our framework allowed us to identify all momentum conservation laws, which are less numerous than in the gradient flow case except in sufficiently over-parameterized regimes. With ReLU networks, no conservation law remains. This phenomenon also manifests in non-Euclidean metrics, used e.g. for Nonnegative Matrix Factorization (NMF): all conservation laws can be determined in the gradient flow context, yet none persists in the momentum case.

This year, we extended the analysis 26 to extensively cover ResNets and attention layers. For this, we first showed that basic building blocks such as ReLU (or linear) shallow networks, with or without convolution, have easily expressed conservation laws, and no more than the known ones. In the case of a single attention layer, we also completely described all conservation laws, and we showed that residual blocks have the same conservation laws as the same block without skip connection. We then introduced the notion of conservation laws that depend only on a subset of parameters (corresponding e.g. to a pair of consecutive layers, to a residual block, or to an attention layer). We demonstrated that the characterization of such laws can be reduced to the analysis of the corresponding building block in isolation. Finally, we examined how these newly discovered conservation principles, initially established in the continuous gradient flow regime, persist under discrete optimization dynamics, particularly in the context of Stochastic Gradient Descent (SGD).

This year we investigated the consequences of conservation laws to characterize whether a (path-)lifted representation has intrinsic training dynamics 45, as a stepping stone to so-called implicit bias analysis. We expressed a so-called intrinsic dynamic property and showed how it is related to the study of conservation laws associated with the lifting function. This led to a simple criterion based on the inclusion of kernels of linear maps which yields a necessary condition for this property to hold. Applying our theory to general ReLU networks of arbitrary depth, with the path lifting, we showed that the dynamic is intrinsic for any initialization. In the case of linear networks with a natural lifting defined as the product of weight matrices, so-called balanced initializations were also known to enable such an intrinsic dynamic; we generalized this result to a broader class of relaxed balanced initializations, showing that, in certain configurations, these are the only initializations that ensure the intrinsic dynamic property. Finally, for the linear neural ODE associated with the limit of infinitely deep linear networks, with relaxed balanced initialization, we explicitly expressed the corresponding intrinsic dynamics.

Path-conditioning for faster training. Finally, in the context of the PhD thesis of Arthur Lebeurrier, we are investigating how to leverage the path-lifting framework to better understand the dynamics of neural networks and eventually to accelerate the training of the parameters. We plan to submit this work to ICML 2026.

7.2.2 Quantized networks: theory and algorithms

Participants:‌ Rémi Gribonval, Elisa Riccietti, Giuseppe Carrino‌, Mael Chaumette.

Collaboration with Nicolas Brisebarre‌ (ARIC team, ENS de Lyon), with Silviu Filip‌ and El-Mehdi El arar (IRISA, Rennes), and with‌ Theo Mary (LIP6, Paris)

Motivated by the importance‌ of quantizing networks besides pruning them to achieve‌ sparsity, we studied different aspects related to this‌ topic.

Quantization of neural networks: the multi-linear case. As a first step towards a better understanding of nonlinear quantized networks, we studied the simpler multi-linear case. In particular, we investigated the problem of optimally quantizing low-rank matrices by exploiting scaling invariances inherent to the optimization problem. We proposed 76, 77 an optimal algorithm with polynomial complexity in the dimension of the problem and exponential complexity in the number of bits. We showed that it provides much more accurate quantizations than the simple round-to-nearest strategy. We used this algorithm in combination with the hierarchical procedure in 90 to design a heuristic strategy to efficiently quantize the family of butterfly matrices, which very often occur in fast transforms and machine learning applications, for instance to sparsify dense neural networks. Our work may help to improve the compression rate in this context by coupling sparsification and quantization. The corresponding algorithms have been incorporated in the quantization module of the lazylinop library 6.1.3.
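
The role of scaling invariance in quantization can be illustrated on a rank-one matrix $x y^\top$: the pairs $(\lambda x, y / \lambda)$ all represent the same matrix, but their quantized versions differ. The sketch below is a naive brute-force search over a grid of scalings, not the optimal algorithm of 76, 77; the toy floating-point format (a `t`-bit significand with unbounded exponent), the grid and all names are assumptions.

```python
import math


def quantize(v, t):
    """Round v to the nearest float with a t-bit significand (exponent range unbounded)."""
    if v == 0.0:
        return 0.0
    m, e = math.frexp(v)                 # v = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(round(m * 2 ** t), e - t)


def rank_one_error(x, y, xq, yq):
    """Squared Frobenius error between x y^T and xq yq^T."""
    return sum((a * b - aq * bq) ** 2 for a, aq in zip(x, xq) for b, bq in zip(y, yq))


def quantize_rank_one(x, y, t, lam):
    """Error obtained by quantizing the equivalent pair (lam * x, y / lam)."""
    xq = [quantize(lam * a, t) for a in x]
    yq = [quantize(b / lam, t) for b in y]
    return rank_one_error(x, y, xq, yq)


x = [0.37, -1.21, 0.88]
y = [1.93, 0.41, -0.77]
t = 3

rtn = quantize_rank_one(x, y, t, 1.0)                 # plain round-to-nearest
# Scaling by a power of two only shifts exponents, so lam is searched in [1, 2).
lams = [1.0 + k / 256 for k in range(256)]
best = min(quantize_rank_one(x, y, t, lam) for lam in lams)
```

Since the grid contains the unscaled pair, the best scaled quantization is never worse than round-to-nearest; how much it improves depends on the data and on the number of bits.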

In the context of the thesis of Mael Chaumette we extended this approach to complex-valued matrices 30. This extension is important since most of the fast transforms that involve butterfly matrices, such as the Fourier transform, are complex-valued and cannot be quantized by the previously proposed strategy. The extension from the real case was not straightforward: it raised new questions and required new algorithms. A journal version is in preparation as well as an implementation in the lazylinop library 6.1.3.

Quantization of neural networks: mixed-precision inference. In order to further exploit the benefits of quantization in neural networks and the multiple reduced numerical formats made available by modern computer architectures, we studied the introduction of mixed precision in the inference of neural networks 42. We proposed an analysis of the propagation of the error in the forward pass of neural networks, which suggests a good rule to choose the numerical format of each row of the weight matrices, yielding a mixed-precision procedure that provides the same accuracy as classical inference but with a lower energy consumption.

Quantization of neural networks: mixed-precision training. As a first step towards mixed-precision training of neural networks, in the context of the master internship and of the PhD thesis of Giuseppe Carrino, we have studied the convergence theory of Newton's method in finite precision 38. This analysis helps to understand the impact of the different errors on the convergence and thus to guide the choice of the precision in each step of the method, leading to a mixed-precision algorithm. Further research will deal with an extension to the stochastic case, which would be adapted to the training of neural networks.

7.2.3 Sparse regularization, unfolding, and approximation‌ theory

Participants: Marion Foare‌.

Collaborations with Nelly‌‌ Pustelnik (Physics lab, ENS de Lyon) and Audrey‌ Repetti (Heriot-Watt University, Edinburgh).‌

In the PhD work of Hoang Trieu Vy Le, we investigated several unfolding strategies of standard proximal algorithms and their associated accelerated versions in the context of image denoising and deconvolution. The goal was to study the impact of accelerated schemes on learning performance and robustness. Currently, we are studying various unrolling approaches to tackle the joint task of image restoration and edge detection. First, we proposed a two-step procedure mimicking the Blake-Zisserman minimization strategy, and relying on a smoothing Proximal Neural Network, followed by an edge detection layer (86).

On the other hand, we are working on the unrolling procedure of the (non-convex) Mumford-Shah model, which makes it possible to jointly perform image restoration and edge detection using a single model-based proximal neural network. The proposed architecture is significantly lighter than recent learning models designed only for edge detection, both in terms of number of learnable parameters and inference time. This work was published in Eusipco 2025 87.

7.2.4‌ Deep sparsity: from hardness to deformable butterfly algorithms‌

Participants: Rémi Gribonval, Elisa Riccietti, Pascal Carrivain.

Collaboration with Leon Zheng (Huawei), Quoc-Tung‌ Le (TSE, Toulouse)

Matrix factorization with sparsity constraints‌ plays an important role in many machine learning‌ and signal processing problems such as dictionary learning,‌ data visualization, dimension reduction.

We have deeply investigated‌ this subject in the last years in the‌ context of the thesis of Quoc-Tung Le 85‌ and Léon Zheng 106.

Building on this‌ series of work on the hardness, tractability, and‌ uniqueness properties of sparse matrix factorizations under various‌ sparsity constraints 108, 89, 90,‌ we prepared this year a tutorial paper 4‌ for the signal processing magazine (SPM) Special Issue‌ ”Mathematics of Deep Learning”, in which we propose‌ an overview on the role of sparsity in‌ a deep learning context.

This work includes our‌ previous results on the subject.

First of all, it includes the extension of the tractable algorithm for so-called butterfly sparsity patterns (which somehow factorizes a given matrix essentially at the cost of a single matrix-vector multiplication, with exact recovery guarantees) to so-called deformable butterflies. We have studied its performance guarantees beyond the case of matrices admitting an exact factorization 17. The corresponding algorithm has been incorporated in the lazylinop software library 6.1.3.

Second, it also includes our study of how to fully exploit the specific structure of butterfly factors and translate it into practical time gains, published at ICML 2025 23. Specifically, we have studied how to optimize memory access to the matrix elements and we implemented a CUDA kernel to multiply on GPU a dense matrix with a deformable butterfly factor. This is also available in lazylinop 6.1.3. In the paper we benchmark our implementation against existing matrix-vector multiplication algorithms to select the optimal one.

Going beyond the linear case, the paper also includes our results on neural networks. We have indeed shown that the pitfalls that we had identified for certain sparse matrix factorization problems 90 also hold for certain sparse ReLU neural network training problems 88. In particular, there exist settings where the optimization is necessarily unstable, in the sense that minimizing the loss function can only be achieved by letting some coefficients diverge to infinity.

Finally, the paper also includes the heuristics we developed to handle butterfly approximations for matrices under unknown permutations of rows and/or columns 107.

7.2.5 Plug and play methods

Participants: Elisa‌ Riccietti, Rémi Gribonval‌, Mathurin Massias,‌‌ Anne Gagneux.

Collaboration with Emmanuel Soubies (CNRS,‌ IRIT), Nelly Pustelnik and‌ Julian Tachella (CNRS, ENS‌‌ Lyon), Nils Laurent (LASPI Roanne)

In imaging tasks,‌ Plug and Play (PnP)‌ methods leverage the strength‌‌ of pre-trained denoisers, often deep neural networks, by‌ integrating them in optimization‌ schemes, ensuring better reconstructions‌‌ than classical variational methods.

In the early PhD‌ work of Anne Gagneux,‌ we have investigated the‌‌ use of neural networks to implement convex functions.‌ Learning convex functions has‌ many applications in imaging‌‌ (notably in Plug and Play methods) and in‌ optimal transport. In 13‌ we have studied the‌‌ expressive power of Input Convex Neural Networks (ICNNs),‌ a special architectural constraint.‌ In particular, we have‌‌ shown that ICNNs are restrictive, and may require‌ more neurons than unconstrained‌ networks to implement a‌‌ given convex function.

One of the main pitfalls‌ of PnP methods is‌ their slow rate of‌‌ convergence and high computational cost. To overcome this,‌ in the context of‌ the postdoc of Nils‌‌ Laurent, we have studied the use of multilevel‌ schemes in conjunction with‌ plug and play (PnP)‌‌ methods. Since these methods involve neural networks, the‌ strategy to integrate multilevel‌ schemes is naturally different‌‌ from the one used so far in classical‌ image denoising problems. We‌ have proposed 18 a‌‌ multilevel PnP method that leverages images of smaller‌ sizes and lighter denoisers‌ at coarse levels.

7.2.6‌‌ Generative models

Participants: Anne Gagneux, Rémi Gribonval‌, Mathurin Massias.‌

Collaboration with Quentin Bertrand,‌‌ Rémi Emonet (INRIA Malice, Université Jean Monnet), Ségolène‌ Martin, Paul Hagemann, Gabriele‌ Steidl (TU Berlin).

Since mid 2024, the team has started to study generative modelling, with an initial focus on diffusion and flow matching methods for image generation. In 6 (work done during a summer internship at TU Berlin), OCKHAM PhD student Anne Gagneux has proposed to use generative models, namely flow matching (FM), in the PnP framework. This is achieved by defining a time-dependent denoiser using a pre-trained FM model. The algorithm alternates between gradient descent steps on the data-fidelity term, reprojections onto the learned FM path, and denoising. On tasks such as denoising, super-resolution, deblurring, and inpainting, the algorithm demonstrates superior results compared to existing PnP algorithms and Flow Matching based state-of-the-art methods. The algorithm has been released publicly on GitHub.

In a collaboration with members of the Inria MALICE team, we have written an introductory blog post on flow matching, which is now considered as one of the reference materials on the topic 51.

In 1, we‌ have shown that perfectly‌‌ trained flow matching (and diffusion) models admit a‌ closed-form solution, which can‌ only generate points from‌‌ their training data. We‌ have shown that these models produce new data‌ when they fail to perfectly learn their target,‌ and that failure at small generation times was‌ particularly important. This work was accepted as an‌ oral presentation at NeurIPS 2025 (top 0.3% of‌ submitted papers). We have pursued this research direction‌ in 43, adopting a denoising perspective on‌ the task of generating images: complementary to 6‌, we show how to build a generative‌ model from a denoiser, and leverage this framework‌ to produce new insights on the generation dynamics‌ of flow matching.
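
The memorization behaviour of a perfectly trained model can be reproduced on a one-dimensional toy problem. The sketch assumes the common linear interpolation path $x_t = (1 - t) x_0 + t x_1$ with $x_0 \sim \mathcal{N}(0, 1)$ and $x_1$ uniform on the training set; under this assumption the optimal velocity field is $(\mathbb{E}[x_1 \mid x_t = x] - x) / (1 - t)$, where the conditional expectation is a softmax-weighted average of the training points. Function names, data and step count are illustrative choices.

```python
import math


def posterior_mean(x, t, data):
    """E[x1 | x_t = x] for x_t = (1 - t) * x0 + t * x1, x0 ~ N(0, 1), x1 uniform on data."""
    sigma2 = (1.0 - t) ** 2
    logits = [-(x - t * xi) ** 2 / (2.0 * sigma2) for xi in data]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]       # numerically stable softmax
    total = sum(weights)
    return sum(w * xi for w, xi in zip(weights, data)) / total


def generate(x0, data, n_steps=1000):
    """Euler integration of dx/dt = (E[x1 | x_t = x] - x) / (1 - t) from t = 0 to t = 1."""
    x = x0
    for k in range(n_steps):
        t = k / n_steps
        target = posterior_mean(x, t, data)
        # dt / (1 - t) = 1 / (n_steps - k), so the last step lands on the target.
        x += (target - x) / (n_steps - k)
    return x


data = [-2.0, 1.0, 3.0]
starts = [-1.5, 0.1, 0.9, 2.2]
samples = [generate(x0, data) for x0 in starts]
distances = [min(abs(s - d) for d in data) for s in samples]
```

Every generated sample coincides with one of the training points up to rounding: with the exact closed-form field, the model cannot produce anything new, which is consistent with the observation that novelty arises from imperfect learning of the target.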

We are currently pursuing several directions: conditional generation (text-to-image models), links with optimal transport through Schrödinger bridges, discrete flow matching for text generation, and application to molecule discovery.

7.3‌ Statistical learning, dimension reduction, and privacy preservation

7.3.1‌ Theoretical foundations of compressive learning: sketches, kernels, and‌ optimal transport

Participants: Hugo Lebeau, Rémi Gribonval‌, Titouan Vayer.

The compressive learning framework‌ proposes to deal with the large scale of‌ datasets by compressing them into a single vector‌ of generalized random moments, called a sketch,‌ from which the learning task is then performed.‌ In past works we established statistical guarantees on‌ the generalization error of this procedure, first in‌ a general abstract setting illustrated on PCA 2‌, then for the specific case of compressive‌ $k$ -means and compressive Gaussian Mixture Modeling 75‌. The overall framework is described in a‌ tutorial paper 3.
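
A sketch in the sense above can be illustrated with random Fourier features, one standard choice of generalized random moments (assumed here; the frequency distribution, sizes and function names are illustrative). The dataset is summarized by the empirical average of $\exp(i \langle \omega_j, x \rangle)$ over a fixed set of random frequencies. Because the sketch is an average, sketches of separate data chunks can be merged without revisiting the data.

```python
import cmath
import random


def sketch(points, freqs):
    """Empirical average of the random Fourier features exp(i * <omega, x>)."""
    n = len(points)
    return [
        sum(cmath.exp(1j * sum(w * xi for w, xi in zip(omega, x))) for x in points) / n
        for omega in freqs
    ]


def merge(sketch_a, n_a, sketch_b, n_b):
    """Sketch of the union of two datasets, computed from their sketches alone."""
    return [(n_a * a + n_b * b) / (n_a + n_b) for a, b in zip(sketch_a, sketch_b)]


rng = random.Random(0)
freqs = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(8)]      # m = 8 frequencies
part_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
part_b = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(30)]

full = sketch(part_a + part_b, freqs)
merged = merge(sketch(part_a, freqs), 50, sketch(part_b, freqs), 30)
```

The 80 two-dimensional points are compressed into 8 complex numbers whatever the dataset size, and `merged` matches `full` up to rounding, which is what makes the approach suited to streaming and distributed data.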

Theoretical guarantees in compressive learning fundamentally rely on comparing certain metrics between probability distributions as explored in a previous paper 10. Preliminary works on the relations between sketching and random matrix theory were conducted this year. We began to investigate the sharpness of the existing theoretical guarantees by looking at different metrics between probability distributions, which naturally arise when one tries to bound the excess risk of sketching methods.

7.3.2 Practical exploration of sketching and‌ methods with limited resources

Participants: Etienne Lasalle, Rémi Gribonval, Titouan Vayer, Paulo Goncalves.

Collaborations with Rémi Vaudaine (previously postdoctoral researcher), Marton Karsai (CEU, Vienna, Austria) and Pierre Borgnat (Physics Lab, ENS de Lyon)

We explored the sketching‌ approach in the context of graph clustering, a‌ key task in graph analysis. Many methods, like‌ spectral clustering, are impractical for large graphs due‌ to computational constraints. To address this, we introduced‌ PASCO in 16, a sketching-based overlay that‌ accelerates clustering algorithms. PASCO involves: 1- generating small,‌ structure-preserving coarse graphs from the input graph, 2-‌ running clustering algorithms in parallel on these graphs‌ to produce partitions, and 3- aligning and merging‌ these partitions using optimal transport. The PASCO framework‌ is based on two key contributions: a novel‌ global algorithm structure designed to enable parallelization and‌ a fast, empirically validated graph coarsening algorithm that‌ preserves structural properties. This work was published in‌ the journal Machine Learning, 2025 and presented at‌ ECML-PKDD 2025.

7.3.3 Dimensionality reduction and optimal transport

Participants: Titouan Vayer,‌ Etienne Lasalle.

Collaborations‌ with Franck Picard (DR‌‌ CNRS, ENS Lyon), Chady Essouabri (intern, ENS Lyon),‌ Hugues Van Assel (PhD‌ student, ENS Lyon), Cédric‌‌ Vincent-Cuaz (post-doctoral researcher, EPFL), Rémi Flamary (CMAP, Ecole‌ Polytechnique), Nicolas Courty (IRISA,‌ Université Bretagne Sud), Pascal‌‌ Frossard (EPFL).

Exploring and analyzing high-dimensional data is a core problem of data science that requires building low-dimensional and interpretable representations of the data through dimensionality reduction (DR). In a series of works, we provide new methods and analyses for DR, inspired by optimal transport (OT). A key requirement for dimensionality reduction is to incorporate global dependencies among original and embedded samples while preserving clusters in the embedding space. In a previous work 101, we introduced and explored an innovative nonlinear dimensionality reduction method by utilizing the optimal transport framework and entropic affinities.

Building on these results, we extended our work to generalize dimension reduction, as detailed in 9, accepted at TMLR 2025. Our approach leverages OT, specifically the Gromov-Wasserstein distance (GW), to propose a framework that simultaneously reduces both the dimensionality and the number of points in a dataset, enabling significant data compression. Notably, when the number of points is preserved, we demonstrated strong connections between our method and traditional dimensionality reduction techniques, such as spectral methods and t-SNE. We refer to our framework as "Distributional Dimension Reduction", which can be interpreted as projecting a distribution, together with a geometry encoding the relationships among data points in high-dimensional space, into a lower-dimensional space using the GW perspective. Based on these principles, we developed a PyTorch library for dimensionality reduction (see Section 6.1.5). Finally, we investigated the relations between OT and mixture models, and wrote a short tutorial on the subject in 48. These works are at the core of further research on OT and self-supervised learning methods, as explored during the internship of Chady Essouabri in collaboration with Franck Picard.
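For reference, the standard discrete Gromov-Wasserstein problem on which this framework builds can be written as follows. The notation ($C_X$, $C_Z$, $h_X$, $h_Z$, $L$, $N$, $n$) is chosen here for illustration and is an assumption; the exact formulation used by the authors is given in 9.

$$
\mathrm{GW}_L(C_X, C_Z, h_X, h_Z) \;=\; \min_{T \in \mathcal{U}(h_X, h_Z)} \; \sum_{i,k=1}^{N} \sum_{j,l=1}^{n} L\big([C_X]_{ik}, [C_Z]_{jl}\big)\, T_{ij}\, T_{kl},
\qquad
\mathcal{U}(h_X, h_Z) = \big\{ T \in \mathbb{R}_{+}^{N \times n} : T \mathbf{1}_n = h_X,\; T^{\top} \mathbf{1}_N = h_Z \big\}.
$$

Here $C_X \in \mathbb{R}^{N \times N}$ and $C_Z \in \mathbb{R}^{n \times n}$ encode pairwise relationships among the $N$ input points and the $n$ embedded points, $h_X$ and $h_Z$ are probability weights on these points, and $L$ is a loss comparing pairwise relationships. Under this reading, taking $n < N$ corresponds to reducing the number of points, while $n = N$ is the setting in which the number of points is preserved.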

7.4 Large-scale convex and nonconvex optimization

7.4.1‌ Multilevel schemes for image‌ restoration

Participants: Elisa Riccietti‌‌, Paulo Gonçalves, Edgar Desainte-Mareville.

Collaboration‌ with Nelly Pustelnik (CNRS,‌ ENS de Lyon), Nils‌‌ Laurent (ENS de Lyon)

In the context of‌ the Ph.D. work of‌ Guillaume Lauga (defended on‌‌ the 18/12/2024), we studied the combination of multilevel‌ schemes and proximal methods‌ 5, 83,‌‌ 84, 81, 82. Pushing further‌ in this direction, we‌ studied the link between‌‌ multilevel and block coordinate methods and their convergence‌ analysis 33. This‌ line of research is‌‌ also the object of the PhD thesis of‌ Edgar Desainte-Mareville. Its aim‌ is to investigate how‌‌ to unroll such multilevel strategies in order to‌ learn important ingredients such‌ as the transfer operators.‌‌ In order to do that, an improved understanding‌ of the link between‌ multilevel and block methods‌‌ is essential.

7.4.2 Stochastic multilevel schemes

Participants: Elisa‌ Riccietti.

Collaboration with‌ Margherita Porcelli (UNIFI, Firenze,‌‌ Italy) and Filippo Marini‌ (UNIBO, Bologna, Italy)

Classical deterministic multilevel schemes are limited by the need to regularly handle the expensive high-level objective function and are unsuited to solving stochastic problems such as expected risk minimization. We proposed a stochastic extension of the multilevel framework 46 that does not require the finest approximation to coincide with the original objective function throughout the optimization process. This significantly decreases the cost of the multilevel paradigm, for instance in data-fitting problems, where considering all the data at each iteration can be avoided.

7.4.3 Reproducible benchmarking of optimization algorithms

Participants:‌ Mathurin Massias, Florian Kozikowski.

Collaboration with‌ Thomas Moreau (MIND, Inria Saclay), Badr Moufad (Ecole‌ Polytechnique), Nelly Pustelnik (CNRS, ENS de Lyon).

The‌ team continues working on reproducible optimisation benchmarks, with‌ Benchopt 7, a collaborative framework to automate,‌ reproduce and publish benchmarks in machine learning across‌ programming languages and hardware architectures. We continued to‌ publish open source implementations of state-of-the-art solvers on‌ major ML problems, and a detailed comparison of‌ the regimes in which they succeed and fail‌ respectively. In 2025, thanks to the internship of‌ Florian Kozikowski, we implemented new benchmarks (Poisson regression).‌ We are currently planning to develop benchmarks related‌ to generative models.

7.4.4 Algorithms for large scale‌ sparse linear models

Participants: Mathurin Massias.

Collaboration‌ with Quentin Bertrand (INRIA MALICE), Badr Moufad (Ecole‌ Polytechnique)

Based on our seminal works in 93 and 59, we continued to develop and implement new state-of-the-art solvers for optimization problems with millions of variables in the context of sparse linear models 58. These solvers are implemented in the skglm package (see Section 6.1.1), which was integrated into the scikit-learn ecosystem. In 2025, the internship work of Florian Kozikowski made it possible to implement new solvers (Poisson, Group Poisson and Gamma regression) and to completely rewrite the documentation.

8 Bilateral contracts and grants‌ with industry

8.1 Bilateral grants with industry

CIFRE‌ contract with CNES, Paris on "Optimized on-board decision‌ with fast energy-efficient neural networks". This PhD thesis‌ is in collaboration with Stéphane May, engineer at‌ CNES.

Participants: Rémi Gribonval, Titouan Vayer,‌ Arthur Lebeurrier.

Duration: 3 years (2024-2027)

Partners:‌ CNES, Paris; ENS de Lyon

Funding: CNES, Paris;‌ PEPR IA SHARP

Context: ANR Chaire IA AllegroAssai‌ 9.2.2

This thesis aims to develop compact, high-performance‌ neural networks tailored to on-board constraints, enabling optimized‌ decision-making on low-energy platforms. It includes an exploration‌ of parsimony structures suited for deep networks and‌ a comprehensive study of quantization and optimization techniques‌ for neural networks.

Funding from Facebook Artificial Intelligence‌ Research, Paris

Participants: Rémi Gribonval.

Duration: 5‌ years (2021-2025)

Partners: Facebook Artificial Intelligence Research, Paris;‌ ENS de Lyon

Funding: Facebook Artificial Intelligence Research,‌ Paris

Context: Chaire IA AllegroAssai 9.2.2

This funding supports the research conducted in the framework of the Chaire IA AllegroAssai.

9 Partnerships and cooperations‌

9.1 International research visitors

Laurent JACQUES

Status:
researcher
Institution of origin:
Université‌ de Louvain
Country:
Belgium‌
Dates:
Sept. 1, 2025‌‌ till June 30, 2026
Context of the visit:‌
Inria chair from the‌ Collegium of Lyon
Mobility‌‌ program/type of mobility:
sabbatical

9.2 National initiatives

9.2.1‌ PEPR IA project :‌ SHARP

Participants: Rémi Gribonval [correspondant], Paulo Goncalves, Elisa Riccietti, Marion Foare, Mathurin Massias, Titouan Vayer, Arthur Lebeurrier, Mael Chaumette.

Partnership‌ with LAMSADE (PSL); LIGM‌ (ENPC); GENESIS (Inria London‌‌ & University College London); IRISA; CEA List; ISIR‌ (Sorbonne Université)

Duration of‌ the project: 2023 -‌‌ 2029.

The vision of the SHARP proposal‌ is that the resources‌ required to train ML‌‌ models can be decreased by several orders of‌ magnitude, with negligible performance‌ loss compared to the‌‌ state of the art. This means significantly reducing‌ the dimensionality of predictors‌ (to reduce inference costs)‌‌ and of their gradients (to reduce training and‌ bandwidth costs in distributed‌ settings), the amount of‌‌ data needed to learn (to address data scarce‌ settings up to zero-shot‌ learning, and incremental learning‌‌ scenarios), and compressing datasets before learning (to reduce‌ storage and compute requirements,‌ and address privacy concerns).‌‌

9.2.2 ANR IA Chaire : AllegroAssai

Participants: Rémi Gribonval [correspondant], Paulo Goncalves, Elisa Riccietti, Marion Foare, Mathurin Massias, Léon Zheng, Quoc-Tung Le, Antoine Gonon, Titouan Vayer, Ayoub Belhadji, Clement Lalanne, Can Pouliquen.

Past members: Luc Giffon.‌‌

Duration of the project: 2020 - 2025.‌

AllegroAssai focuses on the‌ design of machine learning‌‌ techniques endowed both with statistical guarantees (to ensure‌ their performance, fairness, privacy,‌ etc.) and provable resource-efficiency‌‌ (e.g. in terms of bytes and flops, which‌ impact energy consumption and‌ hardware costs), robustness in‌‌ adversarial conditions for secure performance, and ability to‌ leverage domain-specific models and‌ expert knowledge. The vision‌‌ of AllegroAssai is that the versatile notion of‌ sparsity, together with sketching‌ techniques using random features,‌‌ are key in harnessing these fundamental tradeoffs. The‌ first pillar of the‌ project is to investigate‌‌ sparsely connected deep networks, to understand the tradeoffs‌ between the approximation capacity‌ of a network architecture‌‌ (ResNet, U-net, etc.) and its “trainability” with provably-good‌ algorithms. A major endeavor‌ is to design efficient‌‌ regularizers promoting sparsely connected networks with provable robustness‌ in adversarial settings. The‌ second pillar revolves around‌‌ the design and analysis of provably-good end-to-end sketching‌ pipelines for versatile and‌ resource-efficient large-scale learning, with‌‌ controlled complexity driven by the structure of the‌ data and that of‌ the task rather than‌‌ the dataset size.

9.2.3 ANR DataRedux

Participants: Paulo‌ Goncalves [correspondant], Rémi‌ Gribonval, Marion Foare‌‌.

Collaboration with Marton Karsai (former PI, CEU, Vienna, Austria), Pierre Borgnat (ENS de Lyon)

Duration of the project: February 2020 - January 2024, extended to March 31, 2026.

DataRedux puts forward an innovative framework to reduce networked data complexity while preserving its richness, by working at intermediate scales (“mesoscales”). Our objective was to contribute to the theoretical understanding and representation of rich and complex networked datasets for use in predictive data-driven models. Our main novelty has been to define network reduction techniques in two particular use cases: one related to the dynamical processes occurring on the networks, and the other related to the clustering of large graphs. Both approaches relied on extracting information and knowledge at different scales in a human-accessible way, by extracting structures from high-resolution, diverse and heterogeneous data.

Our guideline in the DataRedux project‌ was to identify methods for aggregating data at‌ intermediate scales and new types of data representations‌ related to dynamic processes, which preserve the richness‌ of information contained in the original data, while‌ retaining their most relevant models for easy integration‌ into data-based digital models to facilitate decision-making and‌ obtain actionable information.

9.2.4 ANR JCJC EROSION

Participants:‌ Mathurin Massias.

Duration of the project: December‌ 2023 - December 2026.

Collaboration with Emmanuel‌ Soubies (PI of the project, CNRS, IRIT), Paul‌ Escande (CR CNRS, I2M), Cédric Févotte (DR CNRS,‌ IRIT), Henrique Goulart (MdC INP, IRIT) and Joseph‌ Salmon (Prof. Université de Montpellier, IMAG)

The promise of EROSION is to push the frontiers of sparse and low-rank optimization by combining the strengths of exact relaxations and local optimization. More precisely, we propose to move away from the appealing convex relaxation, which requires overly strong assumptions to ensure equivalence with the original problem. Instead, EROSION will address the following two research objectives. 1: Deriving exact relaxations of $ℓ_{0}$ regression (= same global minimizers) which, although still non-convex, are more amenable to non-convex local optimization (e.g., fewer local minimizers, wider basins of attraction). 2: Developing new local optimization strategies that exploit the nice properties of such exact relaxations so as to improve both the quality of the local extrema reached and the convergence speed over existing solvers.

In OCKHAM, this collaboration has led to the internship of Anne Gagneux (co-supervised with Emmanuel Soubies), on the design of new sorted non-convex penalties and the computation of their proximal operators.
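As background on proximal operators of sparsity penalties, the following minimal sketch shows two textbook cases: hard thresholding, which is a proximal operator of the plain $ℓ_{0}$ penalty, and soft thresholding, which is the proximal operator of its convex $ℓ_{1}$ relaxation. These are not the sorted non-convex penalties studied in the internship, and the function names `prox_l0` and `prox_l1` are chosen here for illustration.

```python
import numpy as np


def prox_l0(x, lam):
    """A minimizer of 0.5 * ||z - x||^2 + lam * ||z||_0 (hard thresholding).

    Each entry is kept when |x_i| > sqrt(2 * lam) and set to zero otherwise.
    """
    threshold = np.sqrt(2.0 * lam)
    return np.where(np.abs(x) > threshold, x, 0.0)


def prox_l1(x, lam):
    """Minimizer of 0.5 * ||z - x||^2 + lam * ||z||_1 (soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


x = np.array([-3.0, 0.5, 1.2, -0.1])
z_hard = prox_l0(x, lam=0.5)  # threshold sqrt(2 * 0.5) = 1
z_soft = prox_l1(x, lam=0.5)
print(z_hard)  # entries with |x_i| <= 1 are zeroed, the others are unchanged
print(z_soft)  # every surviving entry is shrunk towards zero by 0.5
```

The comparison illustrates one practical difference between the two penalties: soft thresholding shrinks every retained coefficient, while hard thresholding leaves large coefficients untouched.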

9.2.5 ANR JCJC MEPHISTO

Participants: Elisa Riccietti [correspondant]‌.

Duration of the project: November 2024 -‌ November 2028.

This project focuses on large scale optimization problems in signal processing and imaging. We consider a special class of such problems: those that admit a hierarchical structure. The aim of the project is to develop parsimonious methods for their solution by exploiting such underlying structure. We will focus on four different kinds of hierarchical structures: those arising from the geometry or physics of the problem (such as multiple resolutions in images or discretization of infinite dimensional problems); those that can be built by exploiting the analytical structure of some problems (training of neural networks, data-fitting problems); those that can be built by exploiting the intrinsic structure of the algebraic tools involved (matrices and tensors, such as in matrix factorization problems); and those that can be built by exploiting multiple numerical formats (floating point numbers with a reduced number of bits).

The‌ ambition of this project‌ is thus to develop‌‌ a large family of parsimonious multiresolution, multilevel and‌ multiprecision algorithms that are‌ not only efficient but‌‌ that can also rely on solid mathematical foundations.‌

9.2.6 Defi Hive Inria‌ Cupseli

Participants: Elisa Riccietti‌‌ [correspondant], Remi Gribonval.

Duration of the‌ project: September 2025-September 2028‌.

The Cupseli challenge‌‌ aims to demonstrate that it is possible to‌ run complex applications on‌ heterogeneous, distributed, and volatile‌‌ resources, while achieving good parallel efficiency and preserving‌ both accuracy and confidentiality.‌ It explores algorithmic and‌‌ system-level solutions to optimize computation, memory, and communication,‌ while ensuring security and‌ fault tolerance. The work‌‌ is organized around three main axes: Frugality (adapting‌ training and inference to‌ limited and dynamic resources),‌‌ Security and confidentiality (protecting data and models through‌ encryption, secure enclaves, and‌ defenses against attacks), and‌‌ Volatility (ensuring robustness and performance despite the unpredictable‌ arrival and departure of‌ resources).

9.2.7 DI2A -‌‌ Subvention Simone et Cino del Duca, Institut de‌ France.

Participants: Elisa Riccietti‌, Marion Foare,‌‌ Paulo Goncalves.

Duration of the project: December‌ 2023 - December 2025‌.

This project focuses on the physics-informed design of architectures and multiresolution deep learning techniques for large scale image restoration and data analysis for astronomy. By physics-informed design we refer to all the deep learning strategies in which the choice of the architecture, biases and activation functions of neural networks is guided by the underlying physics of data acquisition and/or by the proximal optimization schemes employed for the solution. From an application point of view, the project targets problems in astronomy and specifically the study of circumstellar environments through the instrument SPHERE/IRDIS. We aim to propose innovative reconstruction approaches that are partially supervised or even unsupervised.

9.2.8‌ GDR ISIS project PROSSIMO‌

Participants: Mathurin Massias [correspondant]‌‌, Rémi Gribonval, Anne Gagneux, Emmanuel‌ Soubies.

Duration of‌ the project: September 2023‌‌ - September 2025.

Composite optimisation problems are ubiquitous in machine learning, signal, and image processing. Together with the proximal algorithms used to solve them, they have met with great success in applications and have been extensively studied. More recently, so-called 'plug-and-play' (PNP) methods, inspired by proximal algorithms, propose new iterative algorithms in which the application of the proximal operator of the regulariser is replaced by a pre-existing denoiser or a learned operator. Their flexibility, however, complicates their theoretical analysis, because in the general case the operator does not have the interesting properties of proximal operators. In the PROSSIMO project, we propose to implement and study PNP operators via neural networks, while guaranteeing that these operators have the same properties as proximal operators. We aim at combining the flexibility of PNP methods with the rigorous theoretical guarantees of model-based methods. In addition to implementing such networks, we propose to study their approximation capacity: what classes of functions can they approximate, and at what speed?
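The plug-and-play idea can be illustrated on a least-squares data term with a minimal sketch. It assumes a forward-backward iteration in which the proximal step is an arbitrary callable `denoiser`; the function names and the toy problem are chosen here for illustration and are not the PROSSIMO implementation. The example plugs in soft thresholding, which is a genuine proximal operator, so the iteration then reduces to the classical ISTA algorithm; a learned denoiser would be passed in the same way.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, used here as a stand-in denoiser."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def pnp_forward_backward(A, y, denoiser, step, n_iter=500):
    """Iterate x <- denoiser(x - step * A^T (A x - y)).

    The gradient step handles the data term 0.5 * ||A x - y||^2; the denoiser
    replaces the proximal operator of the regulariser.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = denoiser(x - step * grad)
    return x


# Toy problem (illustrative): recover a sparse vector from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = np.zeros(10)
x_true[[1, 6]] = [2.0, -1.5]
y = A @ x_true

# Step size 1 / L, where L = ||A||_2^2 is the Lipschitz constant of the gradient.
step = 1.0 / np.linalg.norm(A, 2) ** 2
x_hat = pnp_forward_backward(A, y, lambda v: soft_threshold(v, 0.01 * step), step)
print(np.linalg.norm(x_hat - x_true))
```

Because the denoiser is only a function argument, nothing in the loop guarantees convergence for an arbitrary choice; this is the gap that motivates constraining learned operators to behave like proximal operators.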

9.2.9‌ ANR TSIA BenchArk

Participants: Mathurin Massias [correspondant].‌

Duration of the project: October 2024 - October‌ 2028.

Collaboration with Thomas Moreau, Gaël Varoquaux‌ (INRIA Saclay) and Joseph Salmon (INRIA Montpellier).

Numerical‌ evaluation of novel methods, a.k.a. benchmarking, is a‌ pillar of the scientific method in machine learning.‌ However, due to practical and statistical obstacles, the‌ reproducibility of published results is currently insufficient: many‌ details can invalidate numerical comparisons, from insufficient uncertainty‌ quantification to improper methodology. In 2022, the Benchopt‌ initiative provided an open source Python package together‌ with a framework to seamlessly run, reuse, share‌ and publish benchmarks in numerical optimization. The BenchArk‌ project aims at bringing Benchopt to the whole‌ machine learning community, making it a new standard‌ in benchmarking by empowering researchers and practitioners with‌ efficient and valid benchmarking methods. Our goal is‌ to ensure reproducibility and consistency in model evaluation.‌ We will federate the machine learning community to‌ develop informative and statistically valid benchmarks, while providing‌ methods to reduce identified hurdles in implementing such‌ practices.

9.2.10 ANR SEIZURE

Participants: Paulo Goncalves [correspondant]‌, Can Pouliquen.

Duration of the project:‌ September 2024 - August 2028

Collaboration with Carole‌ Lartizien (PI of the project, CNRS, Insa de‌ Lyon, CREATIS), Julien Jung (MD-PhD, Hospices Civils de‌ Lyon, CRNL), Pierre Borgnat (CNRS, ENS de Lyon,‌ Physics Lab).

“Seeing the EpileptogenIc Zone through machine‌ Learning on strUctuRal, functional and clinical nEurological data”‌

This project deals with the multimodal detection and the characterisation of epileptic zones in neuroimaging and intracranial EEG (iEEG). Ockham is mainly involved in WP3 (P. Borgnat leader) that aims at analysing the propagation of biomarkers within the brain as an indicator of the dynamic interictal epileptogenic network. A detailed understanding of the brain network and its key hubs provides invaluable insights into surgical outcomes. In a previous PhD work (G. Frusque, 2017-2020) we derived graphical lasso techniques on iEEG data to infer graph time series, as relevant connectivity networks. In Seizure, we plan to enrich our previous approaches with deep learning based models and more specifically with graph recurrent neural networks and neural implicit representations.

10 Dissemination

10.1 Promoting‌ scientific activities

10.1.1 Scientific events: organisation

Organization of‌ the COLT 2025 conference, (30/06/25 – 04/07/25),‌ Remi Gribonval
Organization of the GDR IASIS Thematic‌ Day on Flow matching, diffusion and their applications‌ (24/10/25) Mathurin Massias
Organization of the GDR IASIS‌ Thematic Day on optimal transport and machine learning‌ (17/02/25) Titouan Vayer
SMAI minisymposium on generative modelling,‌ optimal transport and image restoration, (02/06/25 – 06/06/25)‌ Mathurin Massias
One-day workshop Sharp and Foundry: On‌ Frugal and Robust Foundations for Machine Learning,‌ ENS Lyon (30/06/25) Remi Gribonval

10.1.2 Scientific events:‌ selection

Member of the conference program committees

Mathurin‌ Massias – Area Chair for NeurIPS, ICML.
Titouan Vayer – Member of the GRETSI program committee, Area Chair for ICML.
Rémi Gribonval
- MIA'25 Program‌ Committee;
- Organizer of‌‌ a Minisymposium on "Mathematical aspects of deep learning",‌ Curves & Surfaces 2026‌, St-Malo, June 8-12‌‌ 2026;
- Scientific board of JRAF (Journées de recherche‌ en apprentissage frugal), Grenoble,‌ Nov 26-27 2025;
- Scientific‌‌ board of a workshop on the Mathematics of‌ AI, Institut de‌ mathématiques de Bordeaux, Nov‌‌ 4-6 2026

Organization of the weekly "Machine Learning‌ and Signal Processing (MLSP)"‌ seminar (about twenty presentations‌‌ in 2025) Marion Foare ; Paulo Goncalves ;‌ Remi Gribonval ; Mathurin‌ Massias ; Elisa Riccietti‌‌ ; Titouan Vayer

10.1.3 Journal

Member of the‌ editorial boards

Mathurin Massias‌ – Associate Editor for‌‌ TMLR
Remi Gribonval – Associate Editor for Constructive‌ Approximation (Springer); founding member‌ of the Editorial Board‌‌ of Mathematical Foundations of Machine Learning (Springer), Senior‌ Area Editor for the‌ IEEE Signal Processing Magazine‌‌

10.1.4 Invited talks

Elisa Riccietti – Journée AILYS,‌ ENS Lyon, 14/02/2025
Elisa‌ Riccietti – MIA25 conference,‌‌ Paris, 13/01/2025-15/01/2025.
Anne Gagneux – Séminaire Imaging In‌ Paris, 06/05/25
Mathurin Massias‌ – Séminaire Palaisien, 04/11/25‌‌
Mathurin Massias – Séminaire IMAGINE, 05/11/25
Remi Gribonval‌
- Rencontre nationale du RT‌ Optimisation, INSA Lyon,‌‌ Nov 26-28 2025
- Workshop “(Blind) inverse problems in‌ imaging: from foundations to‌ applications”, CIRM, Luminy,‌‌ Sep 29-Oct 3 2025
- Festum Pi Mathematics Conference‌, Chania, Crete, July‌ 21-25 2025;
- Workshop on Mathematics of Data Science, Centre Lagrange, Paris, May 13-15 2025
- PEPR IA‌ Days, CentraleSupelec, Mar‌‌ 18th 2025
- as well as invited seminars: DATASHAPE Team, Inria Saclay, Nov 6 2025; MALGA Seminar, Genova, Apr 28th 2025; Journée AILYS at ENS Lyon, Feb 14th 2025; Séminaire "mathématiques de l'IA", IMB, Bordeaux, Jan 30th 2025; Séminaire MMCS de l'ICJ, Lyon 1, Jan 7th 2025

10.1.5 Leadership‌ within the scientific community‌

Remi Gribonval
- Scientific Committee‌‌ of RT MAIAGES (formerly RT/GDR MIA);
- Comité de‌ Liaison SIGMA-SMAI;
- Board‌ of the GRETSI association;‌‌
- Cellule ERC of Inria, mentoring for ERC candidates‌ in computer science and‌ applied mathematics at the‌‌ national Inria level
Mathurin Massias – Secretary of‌ the MODE group of‌ SMAI

10.1.6 Scientific expertise‌‌

Remi Gribonval – Scientific Advisory Board of the‌ Acoustics Research Institute of‌ the Austrian Academy of‌‌ sciences
Elisa Riccietti – Scientific Board of the‌ Federation Informatique de Lyon‌ (Conseil Scientifique de‌‌ la FIL)

10.1.7 Research administration

Paulo Goncalves‌
- member of the steering committee for the ShapeMed@Lyon consortium's Data for Health workshop
- Scientific Director of‌ the Inria Centre of‌ Lyon and member of‌‌ the Inria Evaluation Committee.

10.2 Teaching - Supervision‌ - Juries - Educational‌ and pedagogical outreach

10.2.1‌‌ Teaching

Master:
- Elisa Riccietti – Optimisation (ENS Lyon)‌ and Harnessing inexactness in‌ scientific computing (ENS Lyon)‌‌
- Mathurin Massias – Python for datascience (Ecole Polytechnique), Statistics (Ecole Polytechnique), Optimal Transport for Machine and Deep Learning (ENS Lyon), Fundamentals of Machine Learning (ENS Lyon), Generative Models (ENS Lyon)
- Titouan Vayer –‌‌ Optimal Transport for Machine‌ and Deep Learning (ENS Lyon), Fundamentals of Machine‌ Learning (ENS Lyon)
- Marion Foare – Image and‌ Signal Processing, Inverse problems and optimization (CPE Lyon)‌
- Paulo Goncalves – Image and Signal Processing (CPE‌ Lyon)
- Remi Gribonval – Inverse problems and high‌ dimension; Mathematical foundations of deep neural networks; Concentration‌ of measure in probability and high-dimensional statistical learning;‌ M2, ENS Lyon

10.2.2 Supervision

All PhD students‌ of the team are co-supervised by at least‌ one team member. In addition, some team members‌ are involved in co-supervisions of students hosted in‌ other labs:

Elisa Riccietti – co-supervision of the‌ PhD of Filippo Marini with Margherita Porcelli (Università‌ di Bologna) – defence on 16/06/2025
Remi Gribonval‌ – co-supervision of the PhD of Sibylle Marcotte‌ with Gabriel Peyré since 2022 (Center for Data‌ Science, ENS Paris) – defense on 21/11/2025
Marion‌ Foare – co-supervision of the PhD of Luis‌ Enrique Amador Arya with Hélène Ratiney and Éric‌ Van Reeth (Creatis, Villeurbanne) and Siemens Healthcare (Saint‌ Denis) since 2023

PhD defenses in Ockham in‌ 2025:

Can Pouliquen

10.2.3 Juries

Members of the‌ Ockham team participated in the following juries :‌

Elisa Riccietti – PhD defence of Iskander Legheraba‌ (Dauphine Université, Paris), CSI of Xavier Pillet (PhD‌ student, Lyon 1 University)
Mathurin Massias – CSI‌ of Yu-Han Wu (PhD Student, Sorbonne Université)
Paulo‌ Goncalves – PhD defense of Valerian Mange (U‌ Toulouse), CSI of Andréa Ducos (PhD Student, Lyon‌ 1 University)
Titouan Vayer – PhD defense of Junjie Yang (07/04/2025, examiner); member of the CSI for Antonin Joly (PhD Student, IRISA) and Antoine Monier (PhD Student, IRISA).
Remi Gribonval – PhD defenses of: Armand Foucault‌ (26/05/25, Université de Toulouse, reviewer); Blaise Delattre (16/2/25,‌ Dauphine PSL, reviewer); Manon Verbockhaven (28/03/25, Université Paris-Saclay,‌ reviewer); Maud Biquard (5/11/25, Université de Toulouse, president);‌ Mimoun Mohamed (31/03/25, Aix-Marseille Université, examiner); Volodimir Mitarchuk‌ (17/01/25, Université Jean Monnet Saint-Étienne, president); Pierre Warion‌ (19/11/25, Aix-Marseille Université, examiner); Romain Verdière (8/12/25, Université‌ Grenoble Alpes, president).

11 Scientific production

11.1 Major‌ publications

1 Quentin Bertrand, Anne Gagneux, Mathurin Massias and Rémi Emonet. On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity. NeurIPS 2025 - 39th Annual Conference on Neural Information Processing Systems, San Diego (CA), United States, December 2025.
2 Rémi Gribonval, Gilles Blanchard, Nicolas Keriven and Yann Traonmilin. Compressive Statistical Learning with Random Feature Moments. Mathematical Statistics and Learning 3(2), August 2021, 113–164.
3 Rémi Gribonval, Antoine Chatalic, Nicolas Keriven, Vincent Schellekens, Laurent Jacques and Philip Schniter. Sketching Data Sets for Large-Scale Learning: Keeping only what you need. IEEE Signal Processing Magazine 38(5), September 2021, 12–36.
4 Rémi Gribonval, Elisa Riccietti, Quoc-Tung Le and Léon Zheng. Rapture of the deep: highs and lows of sparsity in a world of depths. IEEE Signal Processing Magazine, June 2025, 22 p.
5 Guillaume Lauga, Elisa Riccietti, Nelly Pustelnik and Paulo Gonçalves. IML FISTA: A Multilevel Framework for Inexact and Inertial Forward-Backward. Application to Image Restoration. SIAM Journal on Imaging Sciences, June 2024.
6 PNP-FLOW: Plug-And-Play Image Restoration with Flow Matching. International Conference on Learning Representations, Singapore, Singapore, April 2025.
7 Thomas Moreau, Mathurin Massias, Alexandre Gramfort, Pierre Ablin, Pierre-Antoine Bannier, Benjamin Charlier, Mathieu Dagréou, Tom Dupré La Tour, Ghislain Durif, Cassio F. Dantas, Quentin Klopfenstein, Johan Larsson, En Lai, Tanguy Lefort, Benoit Malézieux, Badr Moufad, Binh T Nguyen, Alain Rakotomamonjy, Zaccharie Ramzi, Joseph Salmon and Samuel Vaiter. Benchopt: Reproducible, efficient and collaborative optimization benchmarks. NeurIPS 2022 - 36th Conference on Neural Information Processing Systems, New Orleans, United States, November 2022.
8 Benjamin Ricaud, Pierre Borgnat, Nicolas Tremblay, Paulo Gonçalves and Pierre Vandergheynst. Fourier could be a Data Scientist: from Graph Fourier Transform to Signal Processing on Graphs. Comptes Rendus. Physique, September 2019, 474–488.
9 Hugues Van Assel, Cédric Vincent-Cuaz, Nicolas Courty, Rémi Flamary, Pascal Frossard and Titouan Vayer. Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein. Transactions on Machine Learning Research Journal, June 2025.
10 Titouan Vayer and Rémi Gribonval. Controlling Wasserstein Distances by Kernel Norms with Application to Compressive Statistical Learning. Journal of Machine Learning Research 24(149), April 2023, 1–51.

11.2‌ Publications of the year‌

International journals

11 Ayoub Belhadji, Rémi Bardenet and Pierre Chainais. Signal reconstruction using determinantal sampling. Applied and Computational Harmonic Analysis, 2025.
12 Alice Brenon. Les effets de la brièveté dans les entrées encyclopédiques. Syntaxe et sémantique, 2025. In press.
13 Anne Gagneux, Mathurin Massias, Emmanuel Soubies and Rémi Gribonval. Convexity in ReLU Neural Networks: beyond ICNNs? Journal of Mathematical Imaging and Vision 67(4), June 2025, 40.
14 Rémi Gribonval, Elisa Riccietti, Quoc-Tung Le and Léon Zheng. Rapture of the deep: highs and lows of sparsity in a world of depths. IEEE Signal Processing Magazine, June 2025, 22 p.
15 Clara Lage, Nelly Pustelnik, Jean-Michel Arbona, Benjamin Audit and Rémi Gribonval. Identifying a piecewise affine signal from its nonlinear observation - application to DNA replication analysis. IEEE Transactions on Signal Processing 73, 2025, 1278–1292.
16 Etienne Lasalle, Rémi Vaudaine, Titouan Vayer, Pierre Borgnat, Rémi Gribonval, Paulo Gonçalves and Màrton Karsai. PASCO (PArallel Structured COarsening): an overlay to speed up graph clustering algorithms. Machine Learning 114, August 2025, 212.
17 Quoc-Tung Le, Léon Zheng, Elisa Riccietti and Rémi Gribonval. Butterfly factorization with error guarantees. SIAM Journal on Matrix Analysis and Applications 46(4), July 2025.
18 Julián Tachella, Matthieu Terris, Samuel Hurault, Andrew Wang, Leo Davy, Jérémy Scanvic, Victor Sechaud, Romain Vo, Thomas Moreau, Thomas Davies, Dongdong Chen, Nils Laurent, Brayan Monroy, Jonathan Dong, Zhiyuan Hu, Minh-Hai Nguyen, Florian Sarron, Pierre Weiss, Paul Escande, Mathurin Massias, Thibaut Modrzyk, Brett Levac, Tobías I Liaudat, Maxime Song, Johannes Hertrich, Sebastian Neumayer and Georg Schramm. DeepInverse: A Python package for solving imaging inverse problems with deep learning. Journal of Open Source Software 10(115), November 2025, 8923.
19 Hugues Van Assel, Cédric Vincent-Cuaz, Nicolas Courty, Rémi Flamary, Pascal Frossard and Titouan Vayer. Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein. Transactions on Machine Learning Research Journal, June 2025.

Invited conferences‌

20 Rémi Gribonval and Elisa Riccietti. Une brève histoire de la parcimonie : du traitement de signal à l'apprentissage profond. GRETSI 2025 – XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, August 2025.

International peer-reviewed conferences

21 inproceedings Quentin Bertrand, Anne Gagneux, Mathurin Massias and Rémi Emonet. On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity. NeurIPS 2025 - 39th Annual Conference on Neural Information Processing Systems, San Diego (CA), United States, December 2025. HAL
22 inproceedings Antoine Gonon, Nicolas Brisebarre, Elisa Riccietti and Rémi Gribonval. A Rescaling-Invariant Lipschitz Bound Based on Path-Metrics for Modern ReLU Network Parameterizations. Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), Vancouver (BC), Canada, July 2025. HAL
23 inproceedings Antoine Gonon, Léon Zheng, Pascal Carrivain and Quoc-Tung Le. Fast Inference with Kronecker-Sparse Matrices. Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), Vancouver (BC), Canada, July 2025. HAL
24 inproceedings Pierre Houedry, Nicolas Courty, Florestan Martin-Baillon, Laetitia Chapel and Titouan Vayer. Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity. NeurIPS 2025 - 39th Annual Conference on Neural Information Processing Systems, San Diego (California), United States, December 2025. HAL
25 inproceedings Hoang Trieu Vy Le, Marion Foare, Audrey Repetti and Nelly Pustelnik. Unfolded discrete Mumford-Shah functional for joint image denoising and edge detection. EUSIPCO 2025 - 32nd European Signal Processing Conference, Palermo, Italy, September 2025. HAL
26 inproceedings Sibylle Marcotte, Rémi Gribonval and Gabriel Peyré. Transformative or Conservative? Conservation laws for ResNets and Transformers. 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada, July 2025. HAL
27 inproceedings Ségolène Martin, Anne Gagneux, Paul Hagemann and Gabriele Steidl. PNP-FLOW: Plug-And-Play Image Restoration with Flow Matching. International Conference on Learning Representations 2025, Singapore, Singapore, April 2025. HAL
28 inproceedings Can Pouliquen, Mathurin Massias and Titouan Vayer. Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure. International Conference on Learning Representations (ICLR 2025), Singapore, Singapore, January 2025. HAL

National peer-reviewed Conferences

29 inproceedings Valérie Castin and Rémi Gribonval. Opening the Black Box: Reverse-Engineering of Sparse Neural Networks. GRETSI 2025 - XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, 2025. HAL
30 inproceedings Maël Chaumette, Rémi Gribonval and Elisa Riccietti. CROQuant: Complex Rank-One Quantization Algorithm. GRETSI 2025 - XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, August 2025. HAL
31 inproceedings Anne Gagneux, Mathurin Massias, Emmanuel Soubies and Rémi Gribonval. How to improve expressivity of convex ReLU neural networks? GRETSI 2025 - XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, 2025. HAL
32 inproceedings Guillaume Lauga, Maël Chaumette, Edgar Desainte-Maréville, Étienne Lasalle and Arthur Lebeurrier. A multilevel approach to accelerate the training of Transformers. GRETSI'25, XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, August 2025. HAL
33 inproceedings Guillaume Lauga, Elisa Riccietti, Luis Briceño-Arias, Nelly Pustelnik and Paulo Gonçalves. Une équivalence entre algorithmes multiniveaux et algorithmes de descente par blocs. Colloque GRETSI'25, XXXe Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, August 2025. HAL
34 inproceedings Nils Laurent, Elisa Riccietti, Julian Tachella and Nelly Pustelnik. Algorithme multiniveau hybride pour la restauration d'images. GRETSI 2025 - XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, August 2025. HAL
35 inproceedings Can Pouliquen, Paulo Gonçalves, Titouan Vayer and Mathurin Massias. En quête de précision : Un benchmark open-source et un solveur polyvalent pour le Graphical Lasso. GRETSI 2025 - XXXème Colloque Francophone de Traitement du Signal et des Images, Strasbourg, France, 2025, 1-3. HAL

Conferences without proceedings

36 inproceedings Alice Brenon and Denis Vigier. Propositions pour une approche quantitative-qualitative des discours traitant des métiers dans l’Encyclopédie de Diderot et d’Alembert: Proposals for a quantitative-qualitative approach to discourse on trades in Diderot's and d'Alembert's Encyclopedie. Journées internationales de Linguistique de Corpus (JLC 2025), Lyon (ENS LSH), France, October 2025. HAL
37 inproceedings Zacharie Rodière, Pierre Borgnat, Paulo Gonçalves and Julien Jung. Spatial contrastive pre-training of transformer encoders for SEEG-based seizure onset zone detection. GSP 2025 - Workshop on Graph Signal Processing, Montréal (Québec), Canada, May 2025, 1-3. HAL

Doctoral dissertations‌ and habilitation theses

38 thesis Giuseppe Carrino. Frugality in second-order optimization: floating-point approximations for Newton's method. Bologna University, October 2025. HAL
39 thesis Can Pouliquen. Differentiable and learning-based methods for structure representation. Ecole Normale Supérieure de Lyon, December 2025. HAL

Reports & preprints

40 misc Luis Briceño-Arias, Paulo Gonçalves, Guillaume Lauga, Nelly Pustelnik and Elisa Riccietti. A flexible block-coordinate forward-backward algorithm for non-smooth and non-convex optimization. October 2025. HAL
41 misc Yohann de Castro, Rémi Gribonval and Nicolas Jouvin. Effective regions and kernels in continuous sparse regularisation, with application to sketched mixtures. July 2025. HAL
42 misc El-Mehdi El Arar, Silviu-Ioan Filip, Theo Mary and Elisa Riccietti. Mixed precision accumulation for neural network inference guided by componentwise forward error analysis. 2025. HAL
43 misc Anne Gagneux, Ségolène Martin, Rémi Gribonval and Mathurin Massias. The Generation Phases of Flow Matching: a Denoising Perspective. January 2026. HAL
44 misc Anne Gagneux, Mathurin Massias and Emmanuel Soubies. Proximal Operators of Sorted Nonconvex Penalties. June 2025. HAL
45 misc Sibylle Marcotte, Gabriel Peyré and Rémi Gribonval. Intrinsic training dynamics of deep neural networks. August 2025. HAL
46 misc Filippo Marini, Margherita Porcelli and Elisa Riccietti. A multilevel stochastic regularized first-order method with application to finite sum minimization. January 2025. HAL
47 misc Damien Rouchouse, Antoine Gonon, Rémi Gribonval and Benjamin Guedj. Non-Vacuous Generalization Bounds: Can Rescaling Invariances Help? September 2025. HAL
48 misc Titouan Vayer and Etienne Lasalle. A note on the relations between mixture models, maximum-likelihood and entropic optimal transport. January 2025. HAL

Other scientific‌‌ publications

49 inproceedings Luis Amador, Marion Foare, Olivier Beuf, Hélène Ratiney and Eric Van Reeth. MULTI-CONTRAST SUPER-RESOLUTION FOR MRI WITH A VARIATIONAL APPROACH: DISENTANGLING THE CHALLENGES. European Society for Magnetic Resonance in Medicine and Biology (ESMRMB), Marseille, France, October 2025. HAL
50 inproceedings Luis Amador, Marion Foare, Olivier Beuf, Hélène Ratiney and Eric Van Reeth. SUPER-RESOLUTION ISOTROPE VARIATIONNELLE POUR L'IRM. Septième congrès de la Société Française de Résonance Magnétique en Biologie et Médecine (SFRMBM), Saint Malo, France, March 2025. HAL
51 misc Anne Gagneux, Ségolène Martin, Rémi Emonet, Quentin Bertrand and Mathurin Massias. A Visual Dive into Conditional Flow Matching. April 2025. HAL

Software

52 software Valérie Castin and Rémi Gribonval. Code for reproducible research: "Opening the Black Box: Reverse-Engineering of Sparse Neural Networks". July 2025. License: BSD 3-Clause "New" or "Revised" License. HAL, Software Heritage, VCS
53 software Maël Chaumette, Rémi Gribonval and Elisa Riccietti. Code for reproducible research - CROQuant: Complex Rank-One Quantization Algorithm. July 2025. License: BSD 3-Clause License. HAL, Software Heritage
54 software Antoine Gonon, Nicolas Brisebarre, Elisa Riccietti and Rémi Gribonval. Code for reproducible research - A rescaling-invariant Lipschitz bound using path-metrics. Version 0.0.1, June 2025. License: BSD 3-Clause "New" or "Revised" License. HAL, Software Heritage, VCS
55 software Etienne Lasalle, Rémi Vaudaine and Titouan Vayer. PASCO. May 2025. License: BSD 3-Clause "New" or "Revised" License. HAL, Software Heritage, VCS

11.3‌‌ Cited publications

56 book H. H. Bauschke, P. L. Combettes and others. Convex analysis and monotone operator theory in Hilbert spaces. 408, Springer, 2011.
57 article Esteban Bautista, Patrice Abry and Paulo Gonçalves. $L^\gamma$-PageRank for Semi-Supervised Learning. Applied Network Science 4(57), 2019, 1-20. HAL, DOI
58 article Quentin Bertrand, Quentin Klopfenstein, Pierre-Antoine Bannier, Gauthier Gidel and Mathurin Massias. Beyond l1: Faster and better sparse models with skglm. Advances in Neural Information Processing Systems 35, 2022, 38950-38965.
59 article Quentin Bertrand, Quentin Klopfenstein, Mathurin Massias, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort and Joseph Salmon. Implicit differentiation for fast hyperparameter selection in non-smooth convex learning. Journal of Machine Learning Research 23(1), April 2022, 6680-6722. HAL
60 book Holger Boche, Robert Calderbank, Gitta Kutyniok and Jan Vybiral, eds. Compressed Sensing and its Applications. Series: Applied and Numerical Harmonic Analysis. MATHEON Workshop 2013. ISSN: 2296-5009. Cham: Birkhäuser, 2015. DOI. URL: http://books.google.cz/books?id=6KoYCgAAQBAJ&pg=PA340&dq=intitle:Compressed+Sensing+and+its+Applications&hl=&cd=1&source=gbs_api
61 article Yohann de Castro and Fabrice Gamboa. Exact Reconstruction using Beurling Minimal Extrapolation. arXiv.org, arXiv: 1103.4951v2, March 2011. URL: http://arxiv.org/abs/1103.4951v2
62 article Antoine Chatalic, Vincent Schellekens, Florimond Houssiau, Yves-Alexandre De Montjoye, Laurent Jacques and Rémi Gribonval. Compressive Learning with Privacy Guarantees. Information and Inference, 2021. HAL
63 incollection P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signal processing. Fixed-point algorithms for inverse problems in science and engineering. Springer, 2011, 185-212.
64 article Paolo Di Lorenzo, Paolo Banelli, Sergio Barbarossa and Stefania Sardellitti. Distributed Adaptive Learning of Graph Signals. IEEE Transactions on Signal Processing 65(16), 2017.
65 book P. M. Djuric and C. Richard. Cooperative and Graph Signal Processing: Principles and Applications. Academic Press, 2018.
66 book Michael Elad. Sparse and Redundant Representations. From Theory to Applications in Signal and Image Processing. Springer, 2010. URL: http://books.google.fr/books?id=d5b6lJI9BvAC&printsec=frontcover&dq=sparse+and+redundant+representations&hl=&cd=1&source=gbs_api
67 article Marion Foare, Nelly Pustelnik and Laurent Condat. Semi-Linearized Proximal Alternating Minimization for a Discrete Mumford-Shah Model. IEEE Transactions on Image Processing 29, October 2019, 2176-2189. HAL, DOI
68 book Simon Foucart and Holger Rauhut. A Mathematical Introduction to Compressive Sensing. New York, NY: Springer, 2013. DOI. URL: http://link.springer.com/10.1007/978-0-8176-4948-7
69 article J. Friedman, T. Hastie and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 2008, 432-441.
70 phdthesis Gaëtan Frusque. Inférence et décomposition modale de réseaux dynamiques en neurosciences. 2020LYSEN080, 2020. URL: http://www.theses.fr/2020LYSEN080/document
71 article Benjamin Girault, Paulo Gonçalves and Eric Fleury. Translation on Graphs: An Isometric Shift Operator. IEEE Signal Processing Letters 22(12), December 2015, 2416-2420. HAL, DOI
72 inproceedings Antoine Gonon, Nicolas Brisebarre, Elisa Riccietti and Rémi Gribonval. A path-norm toolkit for modern networks: consequences, promises and challenges. International Conference on Learning Representations, Vienna, Austria, May 2024. Erratum: in the published version there was a typo in the definition of the activation matrix in Definition A.3. This is fixed with this new version. HAL
73 phdthesis Antoine Gonon. Harnessing symmetries for modern deep learning challenges: a path-lifting perspective. Ecole normale supérieure de lyon - ENS LYON, November 2024. HAL
74 article Rémi Gribonval, Gilles Blanchard, Nicolas Keriven and Yann Traonmilin. Compressive Statistical Learning with Random Feature Moments. Mathematical Statistics and Learning, 2021. URL: https://hal.inria.fr/hal-01544609
75 article Rémi Gribonval, Gilles Blanchard, Nicolas Keriven and Yann Traonmilin. Statistical Learning Guarantees for Compressive Clustering and Compressive Mixture Modeling. Mathematical Statistics and Learning 3(2), August 2021, 165-257. This preprint results from a split, profound restructuring and improvement of an earlier preprint (https://hal.inria.fr/hal-01544609v2) and is a companion paper to another (https://hal.inria.fr/hal-01544609v3). HAL, DOI
76 unpublished Rémi Gribonval, Theo Mary and Elisa Riccietti. Optimal quantization of rank-one matrices in floating-point arithmetic---with applications to butterfly factorizations. June 2023, working paper or preprint. HAL
77 inproceedings Rémi Gribonval, Theo Mary and Elisa Riccietti. Scaling is all you need: quantization of butterfly matrix products via optimal rank-one quantization. Actes du GRETSI 2023, 2023-1193, Grenoble, France. GRETSI - Groupe de Recherche en Traitement du Signal et des Images, August 2023, 497-500. HAL
78 article Rodolphe Jenatton, Jean-Yves Audibert and Francis Bach. Structured Variable Selection with Sparsity-Inducing Norms. Journal of Machine Learning Research 12, 2011, 2777-2824. Publisher: Massachusetts Institute of Technology Press. URL: http://hal.inria.fr/inria-00377732
79 article Sandeep Kumar, Jiaxi Ying, José Vinícius de M. Cardoso and Daniel Palomar. A unified Framework for Structured Graph Learning via Spectral Constraints. Journal of Machine Learning Research 21, 2020, 1-60.
80 inproceedings Johan Larsson, Quentin Klopfenstein, Mathurin Massias and Jonas Wallin. Coordinate Descent for SLOPE. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, Valencia, Spain, April 2023. HAL
81 inproceedings Guillaume Lauga, Audrey Repetti, Elisa Riccietti, Nelly Pustelnik, Paulo Gonçalves and Yves Wiaux. A multilevel framework for accelerating uSARA in radio-interferometric imaging. European Signal Processing Conference (EUSIPCO), Lyon, France, August 2024. HAL, DOI
82 article Guillaume Lauga, Elisa Riccietti, Nelly Pustelnik and Paulo Gonçalves. Méthodes multi-niveaux pour la restauration d'images hyperspectrales. Colloque GRETSI, September 2023.
83 inproceedings Guillaume Lauga, Elisa Riccietti, Nelly Pustelnik and Paulo Gonçalves. Méthodes proximales multi-niveaux pour la restauration d'images. GRETSI'22 - 28ème Colloque Francophone de Traitement du Signal et des Images, Nancy, France, September 2022. HAL
84 inproceedings Guillaume Lauga, Elisa Riccietti, Nelly Pustelnik and Paulo Gonçalves. Multilevel FISTA for image restoration. IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, Rhodes, Greece, June 2023. HAL, DOI
85 phdthesis Quoc-Tung Le. Algorithmic and theoretical aspects of sparse deep neural networks. Ecole normale supérieure de lyon - ENS LYON, December 2023. HAL
86 unpublished Hoang Trieu Vy Le, Marion Foare, Audrey Repetti and Nelly Pustelnik. Embedding Blake-Zisserman Regularization in Unfolded Proximal Neural Networks for Enhanced Edge Detection. 2024. HAL
87 unpublished Hoang Trieu Vy Le, Marion Foare, Audrey Repetti and Nelly Pustelnik. Unfolded discrete Mumford-Shah functional for joint image denoising and edge detection. 2025. HAL
88 inproceedings Quoc-Tung Le, Elisa Riccietti and Rémi Gribonval. Does a sparse ReLU network training problem always admit an optimum? Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans (Louisiana), United States, December 2023. HAL
89 article Quoc-Tung Le, Elisa Riccietti and Rémi Gribonval. Spurious Valleys, NP-hardness, and Tractability of Sparse Matrix Factorization With Fixed Support. SIAM Journal on Matrix Analysis and Applications, 2022. HAL
90 inproceedings Quoc-Tung Le, Léon Zheng, Elisa Riccietti and Rémi Gribonval. Fast learning of fast transforms, with guarantees. ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, Singapore, May 2022. This paper is associated with code for reproducible research (https://hal.inria.fr/hal-03552956). HAL, DOI
91 inproceedings Sibylle Marcotte, Rémi Gribonval and Gabriel Peyré. Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans (Louisiana), United States, December 2023. HAL
92 inproceedings Sibylle Marcotte, Rémi Gribonval and Gabriel Peyré. Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows. Forty-first International Conference on Machine Learning. Accepted to ICML 2024. Vienna, Austria, July 2024. HAL
93 article Mathurin Massias, Samuel Vaiter, Alexandre Gramfort and Joseph Salmon. Dual Extrapolation for Sparse Generalized Linear Models. Journal of Machine Learning Research 21(234), October 2020, 1-33. HAL
94 inproceedings Can Pouliquen, Paulo Gonçalves, Mathurin Massias and Titouan Vayer. Implicit Differentiation for Hyperparameter Tuning the Weighted Graphical Lasso. GRETSI 2023 - XXIXème Colloque Francophone de Traitement du Signal et des Images, Grenoble, France, August 2023, 1-4. HAL
95 inproceedings A. Rahimi and Benjamin Recht. Random features for large-scale kernel machines. Replace implicit mapping of kernel trick by explicit nonlinear mapping from R⌃. 2007.
96 article F. Roosta-Khorasani and M.W. Mahoney. Sub-sampled Newton methods. Math. Program. 174, 2019, 293-326. DOI
97 article David Shuman, Sunil Narang, Pascal Frossard, Antonio Ortega and Pierre Vandergheynst. The Emerging Field of Signal Processing on Graphs. IEEE Signal Processing Magazine, May 2013, 83-98.
98 article Bharath K Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard Schölkopf and Gert R G Lanckriet. Hilbert Space Embeddings and Metrics on Probability Measures. JMLR 11, 2010, 1517-1561. Theorem 21 relates Wasserstein metric to Kernel metric. URL: http://dblp.org/rec/journals/jmlr/SriperumbudurGFSL10
99 article Pierre Stock and Rémi Gribonval. An Embedding of ReLU Networks and an Analysis of their Identifiability. Constructive Approximation 57, 2023, 853-899. HAL, DOI
100 article Ivana Tosic and Pascal Frossard. Dictionary Learning. IEEE Signal Processing Magazine 28(2), 27-38. DOI. URL: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5714407
101 inproceedings Hugues Van Assel, Titouan Vayer, Rémi Flamary and Nicolas Courty. SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities. Thirty-seventh Annual Conference on Neural Information Processing Systems (NeurIPS), New Orleans, United States, December 2023. NeurIPS 2023 conference paper. HAL
102 inproceedings Yue Wang, Ziyu Jiang, Xiaohan Chen, Pengfei Xu, Yang Zhao, Yingyan Lin and Zhangyang Wang. E2-train: Training state-of-the-art CNNs with over 80% energy savings. Advances in Neural Information Processing Systems, 2019, 5138-5150.
103 inproceedings Guandao Yang, Tianyi Zhang, Polina Kirichenko, Junwen Bai, Andrew Gordon Wilson and Chris De Sa. SWALP: Stochastic weight averaging in low precision training. International Conference on Machine Learning, 2019, 7015-7024.
104 article Zhewei Yao, Amir Gholami, Sheng Shen, Kurt Keutzer and Michael W Mahoney. ADAHESSIAN: An adaptive second order optimizer for machine learning. arXiv preprint arXiv:2006.00719, 2020.
105 inproceedings Jiaxi Ying, José Vinícius de M. Cardoso and Daniel Palomar. Nonconvex Sparse Graph Learning under Laplacian Constrained Graphical Model. 34th Conference on Neural Information Processing Systems, 2020.
106 phdthesis Léon Zheng. Data frugality and computational efficiency in deep learning. Ecole normale supérieure de lyon - ENS LYON, May 2024. HAL
107 inproceedings Léon Zheng, Gilles Puy, Elisa Riccietti, Patrick Pérez and Rémi Gribonval. Factorisation butterfly par identification algorithmique de blocs de rang un. XXIXème Colloque Francophone de Traitement du Signal et des Images, Grenoble, France, August 2023. HAL
108 article Léon Zheng, Elisa Riccietti and Rémi Gribonval. Efficient Identification of Butterfly Sparse Matrix Factorizations. SIAM Journal on Mathematics of Data Science, 2022. HAL
