The Parietal team focuses on mathematical methods for modeling and statistical inference based on neuroimaging data, with a particular interest in machine learning techniques and applications of human functional imaging. This general theme splits into four research axes:
Modeling for neuroimaging population studies,
Encoding and decoding models for cognitive imaging,
Statistical and machine learning methods for large-scale data,
Compressed-sensing for MRI.
Parietal is also strongly involved in open-source software development in scientific Python (machine learning) and for neuroimaging applications.
Many problems in neuroimaging can be framed as forward and inverse problems. For instance, brain population imaging is concerned with the inverse problem that consists in predicting individual information (behavior, phenotype) from neuroimaging data, while the corresponding forward problem boils down to explaining neuroimaging data with the behavioral variables. Solving these problems entails the definition of two terms: a loss that quantifies the goodness of fit of the solution (does the model explain the data well enough?), and a regularization scheme that represents a prior on the expected solution of the problem. These priors can be used to enforce some properties on the solutions, such as sparsity, smoothness or being piecewise constant.
Let us detail the model used in a typical inverse problem. Let $Y \in \mathbb{R}^{n}$ denote the behavioral variable to predict and $X \in \mathbb{R}^{n \times p}$ the matrix of neuroimaging data, where the vector contained in each of the $n$ rows holds the $p$ voxel values of one image. The linear predictive model reads

$$Y = X w + \epsilon,$$

where $w \in \mathbb{R}^{p}$ is the weight vector to estimate and $\epsilon$ an additive noise term. The estimation is carried out by penalized regression,

$$\hat{w} = \operatorname*{argmin}_{w} \; \|Y - X w\|^2 + \lambda \, \Omega(w),$$

with $\lambda > 0$ the regularization parameter. When $\Omega(w) = \|w\|_1$, the solution is sparse (lasso). Total Variation regularization (see Fig. ) is obtained for $\Omega(w) = \|\nabla w\|_1$, which favors piecewise-constant solutions. Smooth lasso is obtained with $\Omega(w) = \|w\|_1 + \|\nabla w\|_2^2$, which combines sparsity and spatial smoothness.
Note that, while the qualitative aspect of the solutions are very different, the predictive power of these models is often very close.
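To make the role of the penalty concrete, here is a minimal sketch on synthetic data, using scikit-learn's Lasso (l1 penalty, promoting sparsity) and Ridge (l2 penalty, promoting small, smooth weights) as stand-ins for the priors discussed above; the data dimensions and penalty values are illustrative only, not the team's actual settings.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
n_samples, n_voxels = 100, 500
# Sparse ground-truth weight map: only a few voxels carry signal
w_true = np.zeros(n_voxels)
w_true[:10] = 1.0
X = rng.randn(n_samples, n_voxels)
y = X @ w_true + 0.1 * rng.randn(n_samples)

# l1 penalty (lasso) promotes sparse solutions;
# l2 penalty (ridge) keeps all weights small but nonzero
lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)
n_nonzero = int(np.sum(lasso.coef_ != 0))
```

As the text notes, the weight maps produced by the two priors look very different, while their predictive accuracy is often comparable.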
The performance of the predictive model can simply be evaluated as the amount of variance in the target variable explained on left-out data. This framework is easily extended by considering:
Grouped penalization, where the penalization explicitly includes a prior clustering of the features, i.e. voxel-related signals, into given groups. This amounts to enforcing structured priors on the solution.
Combined penalizations, i.e. a mixture of simple and group-wise penalizations, that allow some variability to fit the data in different populations of subjects, while keeping some common constraints.
Logistic and hinge regression, where a non-linearity is applied to the linear model so that it yields a probability of classification in a binary classification problem.
Robustness to between-subject variability, to avoid the learned model overly reflecting a few outlying observations of the training set. Note that noise and deviating assumptions can be present in both the imaging data and the behavioral targets.
Multi-task learning: if several target variables are thought to be related, it can be useful to constrain the corresponding estimated parameter vectors to be similar, e.g. to share a common support. For instance, when one of the variables is noisy or only scarcely observed, its estimation can then borrow strength from the related tasks.
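The multi-task idea in the last item can be sketched with scikit-learn's MultiTaskLasso, whose group (l21) penalty selects the same features for all related target variables; the synthetic data and penalty value below are illustrative assumptions of the sketch.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.RandomState(42)
n_samples, n_features, n_tasks = 80, 200, 5
# The tasks share the same sparse support of 10 features
support = rng.choice(n_features, 10, replace=False)
W_true = np.zeros((n_features, n_tasks))
W_true[support] = rng.randn(10, n_tasks)
X = rng.randn(n_samples, n_features)
Y = X @ W_true + 0.1 * rng.randn(n_samples, n_tasks)

# The l21 penalty makes rows of the coefficient matrix jointly
# zero or jointly active: the same features are kept for all tasks
mtl = MultiTaskLasso(alpha=1.0).fit(X, Y)
active_rows = int(np.sum(np.any(mtl.coef_ != 0, axis=0)))
```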
Multivariate decompositions provide a way to model complex data such as brain activation images: for instance, one might be interested in extracting an atlas of brain regions from a given dataset, such as regions exhibiting similar activity during a protocol, across multiple protocols, or even in the absence of protocol (during resting-state). These data can often be factorized into spatial-temporal components, and thus can be estimated through regularized Principal Components Analysis (PCA) algorithms, which share some common steps with regularized regression.
Let $X \in \mathbb{R}^{n \times p}$ denote the data matrix, where each of the $n$ rows is a $p$-voxel image. A multivariate decomposition seeks a factorization

$$X \approx U V,$$

where $U \in \mathbb{R}^{n \times k}$ contains the loadings of the $k$ components in each image and $V \in \mathbb{R}^{k \times p}$ contains the $k$ spatial components, estimated for instance as

$$(\hat{U}, \hat{V}) = \operatorname*{argmin}_{U, V} \; \|X - U V\|^2 + \lambda \, \Omega(V).$$

The problem is not jointly convex in all the variables, but each penalization given in Eq () yields a convex problem on $V$ for fixed $U$ (and conversely), so that alternate minimization can be used.
Ultimately, the main limitation of these algorithms is the cost due to the memory requirements: holding datasets with large dimension and large number of samples (as in recent neuroimaging cohorts) leads to inefficient computation. To solve this issue, online methods are particularly attractive.
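As a sketch of such a memory-bounded approach, scikit-learn's MiniBatchDictionaryLearning performs the alternate estimation with online (mini-batch) updates, so that only a small batch of samples needs to be held in memory at any time; the toy data and parameter values below are illustrative.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.RandomState(0)
# Toy "images": sparse combinations of 8 latent spatial components
n_samples, n_voxels, n_components = 500, 100, 8
D_true = rng.randn(n_components, n_voxels)
codes = rng.randn(n_samples, n_components) * (rng.rand(n_samples, n_components) < 0.3)
X = codes @ D_true + 0.01 * rng.randn(n_samples, n_voxels)

# Online updates: the dictionary is refined one mini-batch at a time
dico = MiniBatchDictionaryLearning(n_components=n_components, alpha=0.5,
                                   batch_size=20, random_state=0)
code = dico.fit(X).transform(X)
```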
Another important estimation problem stems from the general issue of learning the relationship between sets of variables, in particular their covariance. Covariance learning is essential to model the dependence of these variables when they are used in a multivariate model, for instance to study potential interactions among them and with other variables. Covariance learning is necessary to model latent interactions in high-dimensional observation spaces, e.g. when considering multiple contrasts or functional connectivity data.
The difficulties are two-fold: on the one hand, there is a shortage of data to learn a good covariance model from an individual subject, and on the other hand, subject-to-subject variability poses a serious challenge to the use of multi-subject data. While the covariance structure may vary from population to population, or depending on the input data (activation versus spontaneous activity), assuming some shared structure across problems, such as their sparsity pattern, is important in order to obtain correct estimates from noisy data. Some of the most important models are:
Sparse Gaussian graphical models, as they express meaningful conditional independence relationships between regions, and improve conditioning and avoid overfitting.
Decomposable models, as they enjoy good computational properties and enable intuitive interpretations of the network structure. Whether or not they can faithfully represent brain networks is still an open question.
PCA-based regularization of covariance, which is powerful when modes of variation matter more than conditional independence relationships.
Adequate model selection procedures are necessary to achieve the right level of sparsity or regularization in covariance estimation; the natural evaluation metric here is the out-of-sample likelihood of the associated Gaussian model. Another essential remaining issue is to develop an adequate statistical framework to test differences between covariance models in different populations. To do so, we consider different means of parametrizing covariance distributions and how these parametrizations impact the test of statistical differences across individuals.
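For illustration, scikit-learn's GraphicalLassoCV estimates a sparse Gaussian graphical model and selects the regularization level by exactly the criterion mentioned above, the out-of-sample likelihood of the associated Gaussian model. The toy signals and the coupling between regions below are assumptions of the sketch.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.RandomState(0)
# Toy "region time series": mostly independent signals, with one
# conditional dependence injected between regions 0 and 1
n_samples, n_regions = 200, 10
X = rng.randn(n_samples, n_regions)
X[:, 1] += 0.8 * X[:, 0]

# Cross-validated sparse inverse covariance: the amount of sparsity
# is chosen by out-of-sample Gaussian likelihood
model = GraphicalLassoCV().fit(X)
precision = model.precision_
```

Nonzero off-diagonal entries of the estimated precision matrix encode the conditional dependence structure between regions.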
The brain is a highly structured organ, with both functional specialization and a complex network organization. While most of this knowledge historically comes from lesion studies and animal electrophysiological recordings, the development of non-invasive imaging modalities, such as fMRI, has made it possible to routinely study high-level cognition in humans since the early 90's. This has opened major questions on the interplay between mind and brain, such as: How is the function of cortical territories constrained by anatomy (connectivity)? How to assess the specificity of brain regions? How can one reliably characterize inter-subject differences?
Functional connectivity is defined as the interaction structure that underlies brain function. Since the beginning of fMRI, it has been observed that remote regions sustain high correlation in their spontaneous activity, i.e. in the absence of a driving task. This means that the signals observed during resting-state define a signature of the connectivity of brain regions. The main interest of resting-state fMRI is that it provides easy-to-acquire functional markers that have recently been proved to be very powerful for population studies.
While fMRI has been very useful in defining the function of regions at the mm scale, magnetoencephalography (MEG) provides the other piece of the puzzle, namely the temporal dynamics of brain activity at the ms scale. MEG is also non-invasive. It makes it possible to keep track of the precise schedule of mental operations and their interactions. It also opens the way toward a study of the rhythmic activity of the brain. On the other hand, the localization of brain activity with MEG entails the solution of a hard inverse problem.
Human neuroimaging targets two major goals: i) the study of neural responses involved in sensory, motor or cognitive functions, in relation to models from cognitive psychology, i.e. the identification of neurophysiological and neuroanatomical correlates of cognition; ii) the identification of markers in brain structure and function of neurological or psychiatric diseases. Both goals have to deal with a tension between the search for higher spatial resolution and the importance of inferring brain features with population-level validity: observations are contaminated with high variability within cohorts, which blurs the information at the population level and ultimately limits the spatial resolution of these observations. Importantly, the signal-to-noise ratio (SNR) of the data remains limited, in part because of these very resolution improvements.
Demian Wassermann obtained an ERC Starting Grant, Neurolang: "Accelerating Neuroscience Research by Unifying Knowledge Representation and Analysis Through a Domain Specific Language".
Besides, Alexandre Gramfort joined Parietal just after the start of his ERC grant entitled SLAB: "Signal processing and Learning Applied to Brain data".
Functional Description: Mayavi is the most widely used scientific 3D visualization software in Python. Mayavi can be used as a visualization tool, through an interactive command line, or as a library. It is distributed under Linux through Ubuntu, Debian, Fedora and Mandriva, as well as in the PythonXY and EPD Python scientific distributions. Mayavi is used by several software platforms, such as PDE solvers (fipy, sfepy), molecule visualization tools and brain connectivity analysis tools (connectomeViewer).
Contact: Gaël Varoquaux
Keywords: Visualization - DWI - Health - Segmentation - Medical imaging
Scientific Description: MedInria aims at creating an easily extensible platform for the distribution of research algorithms developed at Inria for medical image processing. This project has been funded by the D2T (ADT MedInria-NT) in 2010 and renewed in 2012. A fast-track ADT was awarded in 2017 to transition the software core to more recent dependencies and to study the possibility of creating a consortium. The Visages team leads this Inria national project and participates in the development of the common core architecture and features of the software, as well as in the development of specific plugins for the team's algorithms.
Functional Description: MedInria is a free software platform dedicated to medical data visualization and processing.
Participants: Maxime Sermesant, Olivier Commowick and Théodore Papadopoulo
Partners: HARVARD Medical School - IHU - LIRYC - NIH
Contact: Olivier Commowick
URL: http://
NeuroImaging with scikit-learn
Keywords: Health - Neuroimaging - Medical imaging
Functional Description: NiLearn is the neuroimaging library that adapts the concepts and tools of scikit-learn to neuroimaging problems. As a pure Python library, it depends on scikit-learn and nibabel, the main Python library for neuroimaging I/O. It is an open-source project, available under the BSD license. The two key components of NiLearn are i) the analysis of functional connectivity (spatial decompositions and covariance learning) and ii) the most common tools for multivariate pattern analysis. A great deal of effort has been put into the efficiency of the procedures, both in terms of memory cost and computation time.
Participants: Alexandre Abraham, Alexandre Gramfort, Bertrand Thirion, Elvis Dohmatob, Fabian Pedregosa Izquierdo, Gaël Varoquaux, Loïc Estève, Michael Eickenberg and Virgile Fritsch
Contact: Bertrand Thirion
Keywords: Health - Brain - MRI - Neurosciences - Statistical analysis - FMRI - Medical imaging
Functional Description: As part of fMRI data analysis, PyHRF provides a set of tools for addressing the two main issues involved in intra-subject fMRI data analysis: (i) the localization of cerebral regions that elicit evoked activity and (ii) the estimation of the activation dynamics, also referred to as the recovery of the Hemodynamic Response Function (HRF). To tackle these two problems, PyHRF implements the Joint Detection-Estimation framework (JDE), which recovers parcel-level HRFs and embeds an adaptive spatio-temporal regularization scheme of activation maps.
Participants: Aina Frau Pascual, Christine Bakhous, Florence Forbes, Jaime Eduardo Arias Almeida, Laurent Risser, Lotfi Chaari, Philippe Ciuciu, Solveig Badillo, Thomas Perret and Thomas Vincent
Partners: CEA - NeuroSpin
Contact: Florence Forbes
URL: http://
Keywords: Regression - Clustering - Learning - Classification - Medical imaging
Scientific Description: Scikit-learn is a Python module integrating classic machine learning algorithms in the tightly-knit scientific Python world. It aims to provide simple and efficient solutions to learning problems, accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
Functional Description: Scikit-learn can be used as middleware for prediction tasks. For example, many web startups adapt Scikit-learn to predict the buying behavior of users, provide product recommendations, and detect trends or abusive behavior (fraud, spam). Scikit-learn is used to extract the structure of complex data (text, images) and to classify such data with techniques relevant to the state of the art.
Easy to use, efficient and accessible to non-experts in data science, Scikit-learn is an increasingly popular machine learning library in Python. In a data exploration step, users can enter a few lines in an interactive (but non-graphical) interface and immediately see the results of their request. Scikit-learn is a prediction engine. Scikit-learn is developed in open source, and available under the BSD license.
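As a sketch of the typical workflow described above (a few lines from raw data to an evaluated predictor), using one of scikit-learn's built-in toy datasets:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Load a toy dataset and hold out a test set
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A few lines suffice to go from data to an evaluated predictor
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```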
Participants: Alexandre Gramfort, Bertrand Thirion, Fabian Pedregosa Izquierdo, Gaël Varoquaux, Loïc Estève, Michael Eickenberg and Olivier Grisel
Partners: CEA - Logilab - Nuxeo - Saint Gobain - Tinyclues - Telecom Paris
Contact: Olivier Grisel
Massive Online Dictionary Learning
Keywords: Pattern discovery - Machine learning
Functional Description: Matrix factorization library, usable on very large datasets, with optional sparse and positive factors.
Participants: Arthur Mensch, Gaël Varoquaux, Bertrand Thirion and Julien Mairal
Contact: Arthur Mensch
Publications: Subsampled online matrix factorization with convergence guarantees - Stochastic Subsampling for Factorizing Huge Matrices
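MODL itself is a dedicated library; as a rough sketch of the underlying model (matrix factorization with positive, optionally sparse factors), scikit-learn's NMF can stand in on toy data. This is not the MODL API, only an illustration of the kind of factorization it computes at scale.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
# Non-negative data matrix to factorize as X ~= W @ H
X = np.abs(rng.randn(100, 40))

# Factorization with positive factors W (loadings) and H (components)
model = NMF(n_components=5, init="random", random_state=0, max_iter=500)
W = model.fit_transform(X)
H = model.components_
reconstruction_error = np.linalg.norm(X - W @ H)
```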
MNE-Python
Keywords: Neurosciences - EEG - MEG - Signal processing - Machine learning
Functional Description: Open-source Python software for exploring, visualizing, and analyzing human neurophysiological data: MEG, EEG, sEEG, ECoG, and more.
Contact: Alexandre Gramfort
To probe individual variations in brain organization, population imaging relates features of brain images to rich descriptions of the subjects such as genetic information or behavioral and clinical assessments. Capturing common trends across these measurements is important: they jointly characterize the disease status of patient groups. In particular, mapping imaging features to behavioral scores with predictive models opens the way toward more precise diagnosis. Here we propose to jointly predict all the dimensions (behavioral scores) that make up the individual profiles, using so-called multi-output models. This approach often boosts prediction accuracy by capturing latent shared information across scores. We demonstrate the efficiency of multi-output models on two independent resting-state fMRI datasets targeting different brain disorders (Alzheimer's Disease and schizophrenia). Furthermore, the model with joint prediction generalizes much better to a new cohort: a model learned on one study is more accurately transferred to an independent one. Finally, we show how multi-output models can easily be extended to multi-modal settings, combining heterogeneous data sources for a better overall accuracy.
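A minimal sketch of the joint-prediction idea on synthetic data: the behavioral scores are predicted through a shared low-dimensional representation (here a single principal component of the score matrix), which is one simple way to couple the outputs. This is only an illustration; the actual multi-output models of the study are more elaborate.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
# Toy imaging features and three behavioral scores sharing a latent factor
n_subjects, n_features = 120, 50
X = rng.randn(n_subjects, n_features)
latent = X[:, :5].sum(axis=1)
Y = np.column_stack([latent + 0.5 * rng.randn(n_subjects) for _ in range(3)])

# Predict a shared low-dimensional representation of the scores, then map
# back to individual scores: the outputs are coupled through the latent space
pca = PCA(n_components=1).fit(Y)
z = pca.transform(Y)
model = Ridge(alpha=1.0).fit(X, z)
Y_pred = pca.inverse_transform(model.predict(X))
```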
Brain functional connectivity, obtained from functional Magnetic Resonance Imaging at rest (r-fMRI), reflects inter-subject variations in behavior and characterizes neuropathologies. It is captured by the covariance matrix between time series of remote brain regions. With noisy and short time series as in r-fMRI, covariance estimation calls for penalization, and shrinkage approaches are popular. Here we introduce a new covariance estimator based on a non-isotropic shrinkage that integrates prior knowledge of the covariance distribution over a large population. The estimator performs shrinkage tailored to the Riemannian geometry of symmetric positive definite matrices, coupled with a probabilistic modeling of the subject and population covariance distributions. Experiments on a large-scale dataset show that such estimators resolve intra- and inter-subject functional connectivities better than existing covariance estimates. We also demonstrate that the estimator improves the relationship across subjects between their functional-connectivity measures and their behavioral assessments. More information can be found in Fig. in .
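The estimator above is non-isotropic and population-informed; as a baseline illustration of why shrinkage matters at all with short, noisy time series, here is the standard isotropic Ledoit-Wolf shrinkage from scikit-learn on synthetic data (an assumed stand-in, not the paper's estimator):

```python
import numpy as np
from sklearn.covariance import LedoitWolf, EmpiricalCovariance

rng = np.random.RandomState(0)
# Short "time series" from many regions: n_samples close to n_features,
# so the empirical covariance is badly conditioned
n_samples, n_regions = 60, 40
X = rng.randn(n_samples, n_regions)

# Shrinkage pulls the noisy empirical covariance toward a well-conditioned
# target; Ledoit-Wolf picks the shrinkage amount in closed form
emp = EmpiricalCovariance().fit(X)
lw = LedoitWolf().fit(X)
cond_emp = np.linalg.cond(emp.covariance_)
cond_lw = np.linalg.cond(lw.covariance_)
```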
Brain decoding relates behavior to brain activity through predictive models. These are also used to identify brain regions involved in the cognitive operations related to the observed behavior. Training such multivariate models is a high-dimensional statistical problem that calls for suitable priors. State-of-the-art priors, e.g. small total variation, enforce spatial structure on the maps to stabilize them and improve prediction. However, they come with a hefty computational cost. We build upon very fast dimension reduction with spatial structure and model ensembling to achieve decoders that are fast on large datasets and increase the stability of the predictions and the maps. Our approach, fast regularized ensemble of models (FReM), includes an implicit spatial regularization by using a voxel grouping with a fast clustering algorithm. In addition, it aggregates different estimators obtained across splits of a cross-validation loop, each time keeping the best possible model. Experiments on a large number of brain imaging datasets show that our combination of voxel clustering and model ensembling improves decoding maps stability and reduces the variance of prediction accuracy. Importantly, our method requires fewer samples than state-of-the-art methods to achieve a given level of prediction accuracy. Finally, FReM is highly parallelizable, and has a lower computation cost than other spatially-regularized methods.
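A simplified sketch of the two ingredients on synthetic data, using scikit-learn stand-ins: fast feature clustering (FeatureAgglomeration) as implicit spatial regularization, and aggregation over resampled fits (bagging). Note that the actual FReM procedure keeps the best model per cross-validation split rather than plain bagging.

```python
from sklearn.cluster import FeatureAgglomeration
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier
from sklearn.datasets import make_classification

# High-dimensional toy decoding problem
X, y = make_classification(n_samples=200, n_features=500,
                           n_informative=20, random_state=0)

# Step 1: fast feature clustering reduces dimension with spatial-like grouping
# Step 2: bagging aggregates models fit on different resamplings of the data
base = make_pipeline(FeatureAgglomeration(n_clusters=50),
                     LogisticRegression(max_iter=1000))
clf = BaggingClassifier(base, n_estimators=10, random_state=0).fit(X, y)
```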
Most current functional Magnetic Resonance Imaging (fMRI) decoding analyses rely on statistical summaries of the data resulting from a deconvolution approach: each stimulation event is associated with a brain response. This standard approach leads to simple learning procedures, yet it is ill-suited for decoding events with short inter-stimulus intervals. In order to overcome this issue, we propose a novel framework that separates the spatial and temporal components of the prediction by decoding the fMRI time-series continuously, i.e. scan-by-scan. The stimulation events can then be identified through a deconvolution of the reconstructed time series. We show that this model performs as well as or better than standard approaches across several datasets, most notably in regimes with small inter-stimuli intervals (3 to 5s), while also offering predictions that are highly interpretable in the time domain. This opens the way toward analyzing datasets not normally thought of as suitable for decoding and makes it possible to run decoding on studies with reduced scan time.
Structured sparsity penalization has recently improved statistical models applied to high-dimensional data in various domains. As an extension to medical imaging, the present work incorporates priors on network hierarchies of brain regions into logistic-regression to distinguish neural activity effects. These priors bridge two separately studied levels of brain architecture: functional segregation into regions and functional integration by networks. Hierarchical region-network priors are shown to better classify and recover 18 psychological tasks than other sparse estimators. Varying the relative importance of region and network structure within the hierarchical tree penalty captured complementary aspects of the neural activity patterns. Local and global priors of neurobiological knowledge are thus demonstrated to offer advantages in generalization performance, sample complexity, and domain interpretability.
Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establishing their validity and usefulness is cross-validation, i.e. testing prediction on unseen data. Here, we raise awareness on the error bars of cross-validation, which are often underestimated. Simple experiments show that the sample sizes of many neuroimaging studies inherently lead to large error bars.
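A minimal experiment in that spirit: repeat the train/test split many times and look at the spread of the cross-validated score on synthetic data of a size typical of neuroimaging studies (the dataset and model below are illustrative).

```python
from sklearn.model_selection import cross_val_score, ShuffleSplit
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Small-sample setting typical of neuroimaging studies
X, y = make_classification(n_samples=60, n_features=100, random_state=0)

# Many random train/test splits expose the variability of the score
cv = ShuffleSplit(n_splits=50, test_size=0.2, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
# The spread across splits is the error bar that is often ignored
spread = scores.std()
```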
We present an automated algorithm for unified rejection and repair of bad trials in magnetoencephalography (MEG) and electroencephalography (EEG) signals. Our method capitalizes on cross-validation in conjunction with a robust evaluation metric to estimate the optimal peak-to-peak threshold, a quantity commonly used for identifying bad trials in M/EEG. This approach is then extended to a more sophisticated algorithm which estimates this threshold for each sensor, yielding trial-wise bad sensors. Depending on the number of bad sensors, the trial is then repaired by interpolation or excluded from subsequent analysis. All steps of the algorithm are fully automated, hence the name Autoreject. In order to assess the practical significance of the algorithm, we conducted extensive validation and comparisons with state-of-the-art methods on four public datasets containing MEG and EEG recordings from more than 200 subjects. The comparisons include purely qualitative efforts as well as quantitative benchmarking against human-supervised and semi-automated preprocessing pipelines. The algorithm allowed us to automate the preprocessing of MEG data from the Human Connectome Project (HCP), going up to the computation of the evoked responses. The automated nature of our method minimizes the burden of human inspection, hence supporting the scalability and reliability demanded by data analysis in modern neuroscience.
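The quantity at the core of the method can be sketched in a few lines of NumPy on toy epochs; here the threshold is hard-coded for illustration, whereas Autoreject learns it by cross-validation.

```python
import numpy as np

rng = np.random.RandomState(0)
# Toy epochs: 100 trials x 20 sensors x 50 time points, a few corrupted
epochs = rng.randn(100, 20, 50)
epochs[[3, 17, 42]] *= 20.0          # simulate high-amplitude bad trials

# Peak-to-peak amplitude per trial and sensor; a trial whose worst sensor
# exceeds the threshold is marked bad
ptp = epochs.max(axis=2) - epochs.min(axis=2)   # shape (trials, sensors)
threshold = 15.0                                 # assumed; normally learned
bad_trials = np.where(ptp.max(axis=1) > threshold)[0]
```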
Cognitive neuroscience is enjoying a rapid increase in extensive public brain-imaging datasets. This opens the door to large-scale statistical models. Finding a unified perspective for all available data calls for scalable and automated solutions to an old challenge: how to aggregate heterogeneous information on brain function into a universal cognitive system that relates mental operations/cognitive processes/psychological tasks to brain networks? We cast this challenge in a machine-learning approach to predict conditions from statistical brain maps across different studies. For this, we leverage multi-task learning and multi-scale dimension reduction to learn low-dimensional representations of brain images that carry cognitive information and can be robustly associated with psychological stimuli. Our multi-dataset classification model achieves the best prediction performance on several large reference datasets, compared to models without cognition-aware low-dimensional representations; it brings a substantial performance boost to the analysis of small datasets, and can be introspected to identify universal template cognitive concepts.
We have presented for the first time the implementation of
non-Cartesian trajectories on a 7T scanner for 2D anatomical
imaging. The proposed SPARKLING curves (Segmented Projection Algorithm
for Random K-space sampLING) are a new type of non-Cartesian segmented
sampling trajectories which allow fast and efficient coverage of the
k-space according to a chosen variable density. To demonstrate
their potential, a high-resolution (0.4
We consider the problem of projecting a probability measure
where
The Wendelin project was granted on December 3rd, 2014. It was selected by the Programme d'Investissements d'Avenir (PIA), which supports "cloud computing and Big Data". It gives visibility to and fosters the French big data technology sector, and in particular the scikit-learn library, the "NEO" NoSQL database and the decentralized "SlapOS" cloud, three open-source software projects supported by the Systematic competitiveness cluster (pôle de compétitivité).
Scikit-learn is a worldwide reference library for machine learning. Gaël Varoquaux, Olivier Grisel and Alexandre Gramfort have been major players in the design of the library, and Scikit-learn has since been supported by the growing scientific Python community. It is currently used by major internet companies as well as dynamic start-ups, including Google, Airbnb, Spotify, Evernote, AWeber and TinyClues; it wins more than half of the data science "Kaggle" competitions. Scikit-learn makes it possible to predict future outcomes given training data, and thus to optimize company decisions. Almost 1 million euros will be invested to improve the algorithmic core of scikit-learn through the Wendelin project, thanks to the Inria, ENS and Institut Mines-Télécom teams. In particular, scikit-learn will be extended to ease online prediction and to include recent stochastic gradient algorithms.
NEO is a native NoSQL database for the Python language. It was initially designed by Nexedi and is currently used and embedded in the main software of company information systems. More than one million euros will be invested in NEO, so that within 10 years scikit-learn can process, out of core, datasets of up to 1 exabyte.
Université Paris 13 and the Institut Mines-Télécom will extend the SlapOS distributed mesh cloud to deploy Wendelin in Big Data as a Service (BDaaS) mode, to achieve interoperability between the Grid5000 and Teralab infrastructures, and to extend the cloud toward smart sensor systems.
The combination of scikit-learn, NEO and SlapOS will improve the
predictive maintenance of industrial plants with two major use cases:
connected windmills (GDF SUEZ, Woelfel) and customer satisfaction in
car sale systems (MMC Rus). In both cases it is about non-personal,
yet profitable big data.
The Wendelin project actually demonstrates that Big data can improve
infrastructure and everyday-life equipment without intrusive data
collection. For more information, please see http://
The project partners are:
Nexedi (leader)
GDF SUEZ
Abilian
2ndQuadrant
Institut Mines Télécom
Inria
Université Paris 13
This is a collaborative project with Jean-Luc Starck (CEA) funded by the DRF-impulsion CEA program.
Compressed Sensing (CS) is a recent mathematical theory that allows the perfect recovery of signals or images from compressive acquisition scenarios. This approach has been popularized in MRI over the last decade, as well as in astrophysics (notably in radio-astronomy). So far, both of these fields have developed skills in CS separately. The aim of the COSMIC project is to foster collaborations between CEA experts in MRI (Parietal team within NeuroSpin) and in astrophysics (CosmoStat lab within the Astrophysics Department). These interactions will allow us to share expertise in order to improve image quality, both in MRI and in radio-astronomy (thanks to the interferometry principle). In the latter field, given the data delivered by radio-telescopes, the goal consists in extracting high temporal resolution information in order to study fast transient events.
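A minimal sketch of the recovery principle behind CS, assuming a toy sparse signal and a random measurement operator: the iterative soft-thresholding algorithm (ISTA) solves the l1-regularized least-squares problem and recovers the signal from far fewer measurements than unknowns. Real MRI reconstruction uses structured (Fourier) sampling operators and wavelet sparsity instead; every value below is illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
# Sparse signal measured through a random (compressive) operator
n_measurements, n_signal = 80, 200
x_true = np.zeros(n_signal)
x_true[rng.choice(n_signal, 10, replace=False)] = rng.randn(10)
A = rng.randn(n_measurements, n_signal) / np.sqrt(n_measurements)
y = A @ x_true

# ISTA: gradient step on the data fit, soft-thresholding for the l1 prior
L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
lam = 0.01
x = np.zeros(n_signal)
for _ in range(1000):
    x = x - (A.T @ (A @ x - y)) / L
    x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0)
recovery_error = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```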
This is a collaborative project with Lenka Zdeborová, Theoretical Physics Institute (CEA) funded by the DRF-impulsion CEA program.
In many scientific fields, data acquisition devices have benefited from hardware improvements that increase the resolution of the observed phenomena, leading to ever larger datasets. While the dimensionality has increased, the number of samples available is often limited, due to physical or financial limits. This is a problem when these data are processed with estimators that have a large sample complexity, such as multivariate statistical models. In that case it is very useful to rely on structured priors, so that the results reflect the state of knowledge on the phenomena of interest. The study of human brain activity through high-field MRI belongs among these problems, with up to
We are missing fast estimators for multivariate models with structured priors that furthermore provide statistical control on the solution. Approximate message passing (AMP) methods are designed to work optimally in the low-sample-complexity regime; they accommodate a rather generic class of priors and come with an estimate of statistical significance. They are therefore well suited for our purposes.
We want to join forces to design a new generation of inverse problem solvers that can take into account the complex structure of brain images and provide guarantees in the low-sample-complexity regime. To this end, we will first adapt AMP to the brain mapping setting, using standard sparsity priors (e.g. Gauss-Bernoulli) on the model. We will then consider more complex structured priors that control the variation of the learned image patterns in space. Crucial gains are expected from the use of the EM algorithm for parameter setting, which comes naturally with AMP. We will also examine the estimators provided by AMP for statistical significance. BrainAMP will design a reference inference toolbox released as a generic open-source library. We expect a 3- to 10-fold improvement in CPU time, which will benefit large-scale brain mapping investigations.
This is a Digiteo project (2014-2017).
Mapping brain functional connectivity from functional Magnetic Resonance Imaging (MRI) data has become a very active field of research. However, analysis tools are limited and many important tasks, such as the empirical definition of brain networks, remain difficult due to the lack of a good framework for the statistical modeling of these networks. We propose to develop population models of anatomical and functional connectivity data to improve the alignment of subjects' brain structures of interest while inferring an average template of these structures. Based on this essential contribution, we will design new statistical inference procedures to compare the functional connections between conditions or populations and improve the sensitivity of connectivity analysis performed on noisy data. Finally, we will test and validate the methods on multiple datasets and distribute them to the brain imaging community.
This is a Digicosme project (2016-2019) and a collaboration with Fabian Suchanek (Télécom ParisTech).
Understanding how cognition emerges from the billions of neurons that constitute the human brain is a major open problem in science, one that could bridge natural science (biology) to the humanities (psychology). Psychology studies performed on humans with functional Magnetic Resonance Imaging (fMRI) can be used to probe the full repertoire of high-level cognitive functions. While analyzing the resulting image data for a given experiment is a relatively well-mastered process, the challenges in comparing data across multiple datasets pose serious limitations to the field. Indeed, such comparisons require pooling together brain images acquired under different settings and assessing the effect of different experimental conditions that correspond to psychological effects studied by neuroscientists.
Such meta-analyses are now becoming possible thanks to the development of public data resources such as OpenfMRI, http://
The purpose of this project is to learn a semantic structure in cognitive terms from their occurrence in brain activation. This structure will simplify the massive multi-label statistical-learning problems that arise in brain mapping, by providing compact representations of cognitive concepts while capturing the imprecision in the definition of these concepts.
This is a Digicosme project (2017-2020) and a collaboration with Joseph Salmon (Télécom ParisTech).
The HiDimStat project aims at handling uncertainty in the challenging context of high-dimensional regression problems. Though sparse models have been popularized in the last twenty years in contexts where many features can explain a phenomenon, it remains a burning issue to attribute confidence to the predictive models that they produce. Such a question is hard both from the statistical modeling point of view and from a computational perspective. Indeed, in practical settings, the number of features at stake (possibly up to several million in high-resolution brain imaging) limits the application of current methods and requires new algorithms to achieve computational efficiency. We plan to leverage recent developments in sparse convex solvers as well as more efficient reformulations of testing and confidence interval estimates to provide several communities with practical software handling uncertainty quantification. Specific validation experiments will be performed in the field of brain imaging.
This is a Digicosme project (2017-2020) and a collaboration with Joseph Salmon (Télécom ParisTech) and Lenka Zdeborova (CEA, IPhT).
In many scientific fields, data acquisition devices have benefited from hardware improvements that increase the resolution of the observed phenomena, leading to ever larger datasets. While the dimensionality has increased, the number of samples available is often limited, due to physical or financial limits. This is a problem when these data are processed with estimators that have a large sample complexity, such as multivariate statistical models. In that case it is very useful to rely on structured priors, so that the results reflect the state of knowledge on the phenomena of interest. The study of human brain activity through neuroimaging belongs among these problems, with up to
CDS2 is a "Strategic Research Initiative" of the Paris-Saclay
University Idex http://
The scale-free concept formalizes the intuition that, in many systems, the analysis of temporal dynamics cannot be grounded on specific and characteristic time scales. The scale-free paradigm has permitted the relevant analysis of numerous applications, very different in nature, ranging from natural phenomena (hydrodynamic turbulence, geophysics, body rhythms, brain activity,...) to human activities (Internet traffic, population, finance, art,...).
Yet, most successes of scale-free analysis were obtained in contexts where data are univariate, homogeneous along time (a single stationary time series), and well-characterized by simple-shape local singularities. For such situations, scale-free dynamics translate into global or local power laws, which significantly eases practical analyses. Numerous recent real-world applications (macroscopic spontaneous brain dynamics, the central application in this project, being one paradigm example), however, naturally entail large multivariate data (many signals), whose properties vary along time (non-stationarity) and across components (non-homogeneity), with potentially complex temporal dynamics, thus intricate local singular behaviors.
These three issues call into question the intuitive and founding identification of scale-free dynamics with power laws, and thus complicate multivariate scale-free and multifractal analyses, precluding the use of univariate methodologies. This explains why the concept of scale-free dynamics is barely used, and with limited success, in such settings, and highlights the overriding need for a systematic methodological study of multivariate scale-free and multifractal dynamics. The Core Theme of MULTIFRACS consists in laying the theoretical foundations of a practical, robust statistical signal-processing framework for multivariate non-homogeneous scale-free and multifractal analyses, suited to varied types of rich singularities, as well as in performing accurate analyses of scale-free dynamics in spontaneous and task-related macroscopic brain activity, to assess their nature, functional roles and relevance, and their relation to behavioral performance in a timing-estimation task, using multimodal functional imaging techniques.
This overarching objective is organized into four challenges:
Multivariate scale-free and multifractal analysis,
Second generation of local singularity indices,
Scale-free dynamics, non-stationarity and non-homogeneity,
Multivariate scale-free temporal dynamics analysis in macroscopic brain activity.
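In the simplest univariate, stationary setting that these challenges generalize, scale-free dynamics reduce to a power-law spectrum whose exponent can be read off a log-log fit. The following sketch, on synthetic data with hypothetical settings, illustrates that baseline:

```python
import numpy as np
from scipy.signal import welch

# A random walk (cumulated white noise) has a power-law spectrum P(f) ~ f^-2.
rng = np.random.RandomState(42)
x = np.cumsum(rng.randn(2**14))

# Welch estimate of the power spectral density, then a log-log linear fit
# over a mid-frequency band where the power law holds.
f, pxx = welch(x, nperseg=2**10)
band = (f > 0.01) & (f < 0.1)
slope = np.polyfit(np.log10(f[band]), np.log10(pxx[band]), 1)[0]
# slope is close to -2 for a random walk
```

This global power-law fit is exactly what breaks down for the multivariate, non-stationary, non-homogeneous data the project addresses, hence the need for multifractal and multivariate extensions.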
Context: The NiConnect project (2012-2017) arises from an increasing need for medical imaging tools to diagnose brain pathologies efficiently, such as neurodegenerative and psychiatric diseases or lesions related to stroke. Brain imaging provides a non-invasive and widespread probe of various features of brain organization, which are then used to make an accurate diagnosis, assess brain rehabilitation, or make a prognosis on the chances of recovery of a patient. Among the different measures extracted from brain imaging, functional connectivity is particularly attractive, as it readily probes the integrity of brain networks, which are considered to provide the most complete view of brain functional organization.
Challenges: To turn methods research into popular tools widely usable by non-specialists, the NiConnect project puts specific emphasis on producing high-quality open-source software. NiConnect addresses the many data-analysis tasks that extract relevant information from resting-state fMRI datasets. Specifically, the scientific difficulties are i) conducting proper validation of the models and tools, and ii) providing statistically controlled information to neuroscientists or medical doctors. More importantly, these procedures should be robust enough to perform analyses on limited-quality data, as acquiring data on diseased populations is challenging and artifacts can hardly be controlled in clinical settings.
Outcome of the project: In the scope of computer science and statistics, NiConnect pushes forward algorithms and statistical models for brain functional connectivity. In particular, we are investigating structured and multi-task graphical models to learn high-dimensional multi-subject brain connectivity models, as well as spatially-informed sparse decompositions for segmenting structures from brain imaging. With regard to neuroimaging methods development, NiConnect provides systematic comparisons and evaluations of connectivity biomarkers, together with a software library embedding the best-performing state-of-the-art approaches. Finally, with regard to medical applications, the NiConnect project also plays a support role in ongoing medical studies and clinical trials on neurodegenerative diseases.
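To give a concrete flavor of the graphical-model side, a sparse inverse covariance (precision) matrix, whose nonzero entries encode direct connections between regions, can be estimated with the graphical lasso. This is a minimal sketch using scikit-learn's GraphicalLassoCV on synthetic signals with a known chain-structured precision; it is an illustration, not the project's actual multi-subject estimators:

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

# Ground-truth sparse precision matrix: a chain of 10 "regions".
n_regions, n_samples = 10, 500
prec = np.eye(n_regions)
for i in range(n_regions - 1):
    prec[i, i + 1] = prec[i + 1, i] = 0.4  # only adjacent regions interact

# Sample signals from the corresponding Gaussian graphical model.
rng = np.random.RandomState(0)
cov = np.linalg.inv(prec)
X = rng.multivariate_normal(np.zeros(n_regions), cov, size=n_samples)

# Graphical lasso with cross-validated regularization strength.
model = GraphicalLassoCV().fit(X)
est_prec = model.precision_
```

The l1 penalty drives the entries corresponding to absent edges towards zero, so the estimated precision recovers the chain structure; structured and multi-task variants extend this idea across subjects.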
Consortium
Parietal Inria research team: applied mathematics and computer science to model the brain from MRI
LIF INSERM research team: medical image data analysis and modeling for clinical applications
CATI center: medical image processing center for large scale brain imaging studies
Henri-Mondor hospital neurosurgery and neuroradiology: clinical teams conducting research on treatments for neurodegenerative diseases, in particular Huntington and Parkinson diseases
Logilab: consulting in scientific computing
Title: The Human Brain Project
Program: FP7
Duration: October 2013 - September 2016
Coordinator: EPFL
Partners:
Inria contact: Olivier Faugeras
Understanding the human brain is one of the greatest challenges facing 21st century science. If we can rise to the challenge, we can gain profound insights into what makes us human, develop new treatments for brain diseases and build revolutionary new computing technologies. Today, for the first time, modern ICT has brought these goals within sight. The goal of the Human Brain Project, part of the FET Flagship Programme, is to translate this vision into reality, using ICT as a catalyst for a global collaborative effort to understand the human brain and its diseases and ultimately to emulate its computational capabilities. The Human Brain Project will last ten years and will consist of a ramp-up phase (from month 1 to month 36) and subsequent operational phases. This Grant Agreement covers the ramp-up phase. During this phase the strategic goals of the project will be to design, develop and deploy the first versions of six ICT platforms dedicated to Neuroinformatics, Brain Simulation, High Performance Computing, Medical Informatics, Neuromorphic Computing and Neurorobotics, and create a user community of research groups from within and outside the HBP, set up a European Institute for Theoretical Neuroscience, complete a set of pilot projects providing a first demonstration of the scientific value of the platforms and the Institute, develop the scientific and technological capabilities required by future versions of the platforms, implement a policy of Responsible Innovation, and a programme of transdisciplinary education, and develop a framework for collaboration that links the partners under strong scientific leadership and professional project management, providing a coherent European approach and ensuring effective alignment of regional, national and European research and programmes. The project work plan is organized in the form of thirteen subprojects, each dedicated to a specific area of activity. 
A significant part of the budget will be used for competitive calls to complement the collective skills of the Consortium with additional expertise.
Title: Machine learning for meta-analysis of functional neuroimaging data
International Partner (Institution - Laboratory - Researcher):
Stanford (United States) - Department of Psychology - Russ Poldrack
Start year: 2015
See also: https://
Neuroimaging produces huge amounts of complex data that are used to better understand the relations between brain structure and function. Observing that the neuroimaging community is still largely missing appropriate tools to store and organize the knowledge related to the data, the Parietal team and Poldrack's lab have decided to join forces to set up a framework for functional brain-image meta-analysis, i.e. a framework in which several datasets can be jointly analyzed in order to accumulate information on the functional specialization of brain regions. MetaMRI will build upon Poldrack's lab's expertise in handling, sharing and analyzing multi-protocol data and Parietal's recent developments of machine-learning libraries to develop a new generation of meta-analytic tools.
Title: Characterizing Large-scale Brain Networks Using Novel Computational Methods for dMRI and fMRI-based Connectivity
International Partner (Institution - Laboratory - Researcher):
Stanford (United States) - Stanford Cognitive and Systems Neuroscience Laboratory - Vinod Menon
Start year: 2016
See also: http://
In the past two decades, brain imaging of neurotypical individuals and clinical populations has primarily focused on the localization of function and structures in the brain, revealing activation in specific brain regions during the performance of cognitive tasks through modalities such as functional MRI. In parallel, technologies to identify white-matter structures have been developed using diffusion MRI. More recently, interest has shifted towards developing a deeper understanding of the brain's intrinsic architecture and its influence on cognitive and affective information processing, using resting-state fMRI and diffusion MRI to build the functional and structural networks of the human brain.
The human brain is a complex patchwork of interconnected regions, and graph-theoretical approaches have become increasingly useful for understanding how functionally connected systems engender, and constrain, cognitive functions. The functional nodes of the human brain and their structural inter-connectivity, collectively the "connectome", are, however, poorly understood. Critically, there is a dearth of computational methods for reliably identifying functional nodes of the brain and their structural inter-connectivity in vivo, despite an abundance of high-quality data from the Human Connectome Project (HCP). Devising and validating methods for investigating the human connectome has therefore taken on added significance.
The first major goal of this project is to develop and validate sophisticated computational and mathematical tools for identifying functional nodes at the whole-brain level and measuring structural and functional connectivity between them, using state-of-the-art human brain imaging techniques and open-source HCP data. To this end, we will first develop and validate novel computational tools for (1) identifying stable functional nodes of the human brain using resting-state functional MRI and (2) measuring structural connectivity between functional nodes of the brain using multi-shell high-angular diffusion MRI. Owing to the complementarity of the two imaging techniques, fMRI and dMRI, our novel computational methods, and the synergy between the two laboratories of this associate team, we will be able to reveal in unprecedented detail the structural and functional connectivity of the human brain.
The second major goal of this project is to use our newly developed computational tools to characterize normal structural and functional brain networks in neurotypical adults.
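The graph-theoretical summaries mentioned above start from a thresholded connectivity matrix. A minimal, purely illustrative sketch (random data, arbitrary threshold, hypothetical number of regions):

```python
import numpy as np

# Hypothetical resting-state signals: 200 time points for 8 regions.
rng = np.random.RandomState(1)
ts = rng.randn(200, 8)

# Region-by-region functional connectivity (Pearson correlation).
corr = np.corrcoef(ts.T)

# Binarize with an arbitrary threshold and drop self-loops to get
# an adjacency matrix, from which basic graph metrics follow.
adj = np.abs(corr) > 0.2
np.fill_diagonal(adj, False)
degree = adj.sum(axis=1)  # node degree of each region
```

Real pipelines replace each step, node definition, connectivity estimation, and thresholding, with the validated tools this project develops, but the overall structure of the analysis is the same.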
Parietal has welcomed François Meyer, Univ. Colorado at Boulder, for a six-month visit (Jan-June 2017), funded by a D'Alembert fellowship of Paris-Saclay University. François's project is to assess novel statistical models of functional connectivity based on the generalized resistivity model he has developed within a graph-theoretical framework.
has spent two months in Boston (April-May) with the MEG Core lab, Athinoula A. Martinos Center (MGH/Harvard-MIT) working on functional connectivity methods and population analysis for MEG.
has spent 3 months in Japan (Sept-Dec) with NTT, working on dynamic time warping problems with Mathieu Blondel.
has spent two months with Poldracklab at Stanford, as part of the MetaMRI associated team. He has worked on the statistical relationships between neuroscientific concepts (whether anatomical or cognitive) and brain activation loci.
MICCAI, NIPS, IPMI, ICML, CCN
NIPS, IPMI, OHBM, PRNI
Gaël Varoquaux, Editor, Neuroimage
Alexandre Gramfort, Editor, Frontiers in Brain Imaging Methods, Journal of Machine Learning Research (JMLR)
Bertrand Thirion, Editor, Frontiers in Brain Imaging Methods
Nature Methods, JMLR, PLOS Comp Bio, NeuroImage, IEEE TBME, Annals of Applied Statistics, Biological Psychiatry, MedIA
JMLR, PLOS Comp Bio, NeuroImage, IEEE TBME, IEEE TMI, IEEE TSP, MedIA, Brain Topography, NIPS, ICML, ICLR.
PLOS One, NeuroImage, HBM, IEEE TMI, IEEE TSP, IEEE SP letters, MedIA, J of Neuroscience, JCBFM, ISBI, ICASSP, EUSIPCO.
PLOS Biology, PLOS Computational Biology, Nature Scientific Reports, Neuroimage, Neuroimage Clinical, Human Brain Mapping, Journal of Machine Learning Research, Brain Topography, Brain Connectivity, Journal of Alzheimer's Disease, PLOS ONE, Frontiers in Neuroscience, Psychiatry and Clinical Neurosciences, Sensors
Nature communications, Neuroimage, Medical Image Analysis, IEEE TMI, PNAS, PLOS Comp Bio.
Gaël Varoquaux, Keynote, BDEC Workshop 2017, Wuxi, China
Gaël Varoquaux, Keynote, scipy 2017, Austin TX
Gaël Varoquaux, Keynote, Swiss Python summit 2017, Zurich, Switzerland
Gaël Varoquaux, Keynote, NIPS learning with limited labels workshop, Long Beach, CA
Philippe Ciuciu, Seminar in the neuroscience Department, NYU, School of Medicine, May 2017, NYC.
Philippe Ciuciu, Laufer Center Seminar Series, May 2017, Stony Brook Univ, NY.
Philippe Ciuciu, Dedale workshop on Dictionary learning on manifolds, Sep. 2017, Nice, France
Philippe Ciuciu, Colloquium at Perform centre, Dec 2017, Montreal, Canada
Philippe Ciuciu, Seminar at Ecole Polytechnique Montreal, Dec 2017, Montreal, Canada
Alexandre Gramfort, 2017 EU-US Frontiers of Engineering Symposium, UC Davis, CA
Alexandre Gramfort, Computational Challenges in Image Processing, Turing Institute, Cambridge Univ., UK
Alexandre Gramfort, Machine learning for functional brain imaging Symposium and Workshop, Karolinska Institute, Sweden
Denis A. Engemann, Lessons learned from high-dimensional statistics and M/EEG: From automated preprocessing to detection of consciousness, University of Naples, Italy, March 2017
Denis A. Engemann, Learning from Oscillations – New Vistas for Translational Neuroimaging, MIT, Cambridge MA, April 2017
Denis A. Engemann, Building better biomarkers using M/EEG and statistical learning, Espoo, Finland, June 2017
Bertrand Thirion, ARSEP, February 2017, Paris
Bertrand Thirion, AI for Medical Imaging symposium, June 2017, Paris
Bertrand Thirion, Causality and big data workshop, October 2017, Saclay
Bertrand Thirion, Machine learning workshop, Télécom SudParis, October 2017, Evry
Bertrand Thirion, ICube seminar, Strasbourg, July 2017
Bertrand Thirion, Laboratoire de Mathématiques d'Orsay, December 2017
Philippe Ciuciu, Member of the ANR evaluation committee (CES 45) "Mathématique, informatique, automatique, traitement du signal"
Philippe Ciuciu, Member of the IEEE BISP committee for the International Symposium on Biomedical Imaging
Philippe Ciuciu, Member of the Biomedical Imaging & Signal Analytics special area team (SAT) in Eurasip
Bertrand Thirion, Member of the prospective group of ITMO Neurosciences
Gaël Varoquaux, Member of the Inria Saclay CDT, "Commission de développement technologique"
Gaël Varoquaux, Member of the Inria Saclay CSD, "Commission de suivi doctoral"
Gaël Varoquaux, Member of the Inria Saclay cluster committee
Alexandre Gramfort, Member of the "comité de pilotage" DataIA
Alexandre Gramfort, Member of the "comité de pilotage" DIM RFSI
Philippe Ciuciu, Member of the "Inria Saclay scientific committee"
Philippe Ciuciu, Member of the CEA/DRF Impulsion evaluation committee
Philippe Ciuciu, Member of the "jury du prix de thèse de la société EEA et du GDR ISIS"
Bertrand Thirion, Deputy scientific director of Inria Saclay research center
Bertrand Thirion, Leader of the Datasense research axis of the Digicosme Labex
Bertrand Thirion, Member of the steering committee of the Dataia Convergence Institute
Bertrand Thirion, Member of the steering committee of the Computer Science Department of Paris-Saclay University.
Master: Machine learning with scikit-learn, 2h, M1, ENSAE, France
Master: Brain functional connectivity, 8h, M2, Télécom ParisTech, France
Doctorate: Machine learning for brain imaging with nilearn, 16h, Université de Montréal, Canada
Doctorate: Winter school, Computational Brain Connectivity Mapping, 1h30, Juan-les-Pins
Doctorate: Machine learning for brain imaging, 1h, IPAM, UCLA, Los Angeles, USA
Doctorate: Scipy: numerical algorithms in Python, 1h30, Euroscipy 2017, Erlangen, Germany
Doctorate: Estimation of brain functional connectomes, 30 min, OHBM 2017, Vancouver, Canada
Doctorate: Software engineering for reproducible science, Open Science summer school 2017, EPFL, Lausanne, Switzerland
Master 2: "Functional MRI: From data acquisition to analysis", 3h, Univ. Paris V René Descartes & Télécom ParisTech, Master of Biomedical Engineering
Master 2: "fMRI data analysis", 3h, Univ. Paris-Saclay, Master of Medical Physics
Master 2: "Functional Brain Imaging with EEG and MEG", 6h, Univ. Paris V René Descartes & Télécom ParisTech, Master of Biomedical Engineering
Master 2 : “Optimization for Data Science”, 20h, Univ. Paris-Saclay, Master of Mathematics / Data Science
Master 2 : “Data Camp”, 18h, Univ. Paris-Saclay, Master of Mathematics / Data Science
Winter school, Computational Brain Connectivity Mapping, Juan-les-Pins, 1h30
Scikit-learn tutorial, Scipy Conf., Austin, USA, 6h
MNE tutorial, Univ. Libre de Bruxelles, Belgique, 6h
MNE-Python workshop, University of Bari, June 2017, 14h
Master : Functional Neuroimaging and brain-computer interfaces, 12h, MVA, ENS Paris-Saclay, France
Doctorat : OHBM course 2017, Pattern Recognition for brain Imaging, Vancouver, Canada
Master: Resting-State Functional Magnetic Resonance Imaging, 3h, Paris V, France.
PhD in progress : Patricio Cerda, Statistical methods for analysis across datasets, started 01/09/2016, supervisors: Balazs Kegl (CNRS, LAL), Gaël Varoquaux
PhD in progress: Jérôme Dockès, Mining text and brain activity to learn the semantics of cognitive science, started 01/09/2016, supervisors: Fabian Suchanek (Télécom ParisTech), Gaël Varoquaux
PhD in progress : Arthur Mensch, Learning representations of fMRI, started 01/09/2015, supervisors: Bertrand Thirion, Gaël Varoquaux, Julien Mairal (Inria Grenoble)
PhD: Elvis Dohmatob, Improving functional connectivity through the use of deformable models in the estimation of spatial decompositions of brain images, defended Sept 2017, supervisors: Bertrand Thirion, Gaël Varoquaux
PhD in progress: Carole Lazarus (3rd year, Director)
PhD in progress: Loubna El Gueddari (2nd year, Director)
PhD in progress: Hamza Cherkaoui (1st year, Director)
PhD in progress: Sylvain Lannuzel (1st year, co-director)
PhD defended: Albert Thomas
PhD defended: Romain Laby
PhD in progress: Yousra Bekhti
PhD in progress: Mainak Jas
PhD in progress: Tom Dupré La Tour
PhD in progress: Stanislas Chambon
PhD in progress: Pierre Ablin
PhD in progress: Mathurin Massias
PhD in progress: Hicham Janati
PhD in progress: Mainak Jas (co-supervision on 1 main project)
PhD in progress: Sami Aboud (co-supervision on 1 main project)
PhD in progress: Kamalakar Reddy (co-supervision)
PhD defended: Elvis Dohmatob (co-supervision with G. Varoquaux)
PhD in progress: Arthur Mensch (co-supervision with G. Varoquaux and J. Mairal)
PhD in progress: Jérome Dockès (co-supervision with G. Varoquaux and F. Suchanek)
PhD in progress: Kamalakar Reddy (co-supervision with G. Varoquaux and D. Engemann)
PhD in progress: Jérôme-Alexis Chevalier (co-supervision with J. Salmon)
12/15: Reviewer of Lucie Thiebaut-Lonjaret PhD Thesis, Univ. Aix-Marseille, France.
04/23: Reviewer of Yosra Marnissi PhD Thesis, Univ. Paris-Est, France.
external reviewer, PhD defence of Christian Dansereau, Université de Montréal.
12/19: Examiner of Thomas Moreau, Univ. Paris Saclay / ENS Paris-Saclay
12/13: Reviewer of Etienne Combrisson PhD Thesis, Univ. Lyon 1, France.
11/10: Reviewer of Lankinen Kaisu PhD Thesis, Univ. Aalto, Finland.
04/26: Reviewer of Seyed Mostafa Kia PhD Thesis, Trento FBK Univ., Italy.
18/01: Reviewer of Jonathan Vacher PhD thesis, Univ. Paris VI
30/05: Examiner of Brahim Bougacha PhD thesis, Univ. Nice
Presentation on "Democratizing AI", Station F (Dec 2017).
Presentation on open-source machine-learning software at the annual meeting of the free-software working group (Oct 2017)
Presentation on scikit-learn at the "Paris ML meetup" (Dec 2017)
Presentation on scikit-learn at the "Paris PyData meetup" (March 2017)
Presentation on "Artificial intelligence in the service of society" at UNESCO, as part of the days of the Engineers and Scientists of France (IESF) (Oct 2017).
Presentation on artificial intelligence to the board of directors of the Institut National du Cancer (INCA) (Nov 2017).