The Parietal team focuses on mathematical methods for modeling and statistical inference based on neuroimaging data, with a particular interest in machine learning techniques and applications to human functional imaging. This general theme splits into four research axes.
Parietal is also strongly involved in open-source software development in scientific Python (machine learning) and for neuroimaging applications.
Many problems in neuroimaging can be framed as forward and inverse
problems.
For instance, brain population imaging is concerned with the
inverse problem that consists in predicting individual
information (behavior, phenotype) from neuroimaging data, while the
corresponding forward problem boils down to explaining
neuroimaging data with the behavioral variables.
Solving these problems entails the definition of two terms: a loss
that quantifies the goodness of fit of the solution (does the model
explain the data well enough?), and a regularization scheme that
represents a prior on the expected solution of the problem.
These priors can be used to enforce some properties on the solutions,
such as sparsity, smoothness or being piece-wise constant.
Let us detail the model used in a typical inverse problem: let $\mathbf{y} \in \mathbb{R}^n$ be the vector that contains the target values (e.g. a behavioral score) observed for $n$ samples, and $\mathbf{X} \in \mathbb{R}^{n \times p}$ the matrix of corresponding neuroimaging data ($p$ voxels or features). The linear model reads $\mathbf{y} = \mathbf{X}\mathbf{w} + \boldsymbol{\varepsilon}$, where $\mathbf{w} \in \mathbb{R}^p$ is the parameter vector to estimate and $\boldsymbol{\varepsilon}$ an additive noise term. The estimate is obtained as
$$\hat{\mathbf{w}} = \operatorname*{arg\,min}_{\mathbf{w}} \; \mathcal{L}(\mathbf{y}, \mathbf{X}\mathbf{w}) + \lambda\, \Omega(\mathbf{w}), \qquad (1)$$
with $\mathcal{L}$ the data-fit loss, $\Omega$ the regularizer encoding the prior (e.g. the $\ell_1$ norm for sparsity, total variation for piece-wise constant maps), and $\lambda > 0$ the regularization parameter.
Note that, while the qualitative aspects of the solutions are very different, the predictive power of these models is often very close.
The performance of the predictive model can simply be evaluated as the amount of variance of the target variable explained on left-out data.
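As a concrete illustration, the following scikit-learn sketch (with synthetic data and hypothetical parameter values) fits a sparsity-regularized linear model of this kind and evaluates it by the variance explained on left-out data:

    # Minimal sketch: penalized regression with an l1 (sparsity) prior, evaluated
    # by cross-validated explained variance; data are synthetic placeholders.
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import cross_val_score

    rng = np.random.RandomState(0)
    n_samples, n_voxels = 100, 1000          # imaging data: few samples, many features
    X = rng.randn(n_samples, n_voxels)       # stand-in for neuroimaging features
    w_true = np.zeros(n_voxels)
    w_true[:10] = 1.0                        # only a few voxels carry information
    y = X @ w_true + rng.randn(n_samples)    # behavioral target with noise

    model = Lasso(alpha=0.1)                 # l1 regularization enforces sparsity
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print("explained variance on left-out data:", scores.mean())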
This framework is easily extended by considering structured or joint estimation problems.
Multi-task learning: if several target variables are thought to be related, it might be useful to constrain the corresponding estimated parameter vectors to share some structure, such as a common sparsity pattern. For instance, when one of the variables is only scarcely observed, its estimation then benefits from the information carried by the related tasks; a sketch of this setting is given below.
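A minimal sketch of this multi-task setting, assuming a group-sparsity penalty as the shared structure (scikit-learn's MultiTaskLasso; data and parameters are synthetic):

    # Two related targets estimated jointly: the penalty couples their supports.
    import numpy as np
    from sklearn.linear_model import Lasso, MultiTaskLasso

    rng = np.random.RandomState(0)
    X = rng.randn(50, 200)
    W_true = np.zeros((200, 2))
    W_true[:5, :] = rng.randn(5, 2)              # the two tasks share the same support
    Y = X @ W_true + 0.1 * rng.randn(50, 2)

    joint = MultiTaskLasso(alpha=0.1).fit(X, Y)  # shared sparsity pattern across tasks
    single = Lasso(alpha=0.1).fit(X, Y[:, 0])    # single-task baseline for comparison
    print("features kept jointly:", int(np.sum(np.any(joint.coef_ != 0, axis=0))))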
Multivariate decompositions provide a way to model complex
data such as brain activation images: for instance, one might be
interested in extracting an atlas of brain regions from a given
dataset, such as regions exhibiting similar activity during a
protocol, across multiple protocols, or even in the absence of
protocol (during resting-state).
These data can often be factorized
into spatial-temporal components, and thus can be estimated through
regularized Principal Components Analysis (PCA) algorithms,
which share some common steps with regularized regression.
Let $\mathbf{Y} \in \mathbb{R}^{n \times p}$ denote the data matrix gathering $n$ images (or time points) with $p$ voxels. It is factorized into $k$ spatio-temporal components by solving
$$\min_{\mathbf{U}, \mathbf{V}} \; \|\mathbf{Y} - \mathbf{U}\mathbf{V}^\top\|_F^2 + \lambda\, \Omega(\mathbf{V}), \qquad (2)$$
where $\mathbf{U} \in \mathbb{R}^{n \times k}$ contains the loadings (temporal components), $\mathbf{V} \in \mathbb{R}^{p \times k}$ the spatial components, and $\Omega$ is one of the regularizers discussed above ($\ell_1$ norm, total variation, ...) applied to the spatial maps.
The problem is not jointly convex in all the variables, but for each of the penalizations $\Omega$ considered in Eq (2) it is convex in $\mathbf{V}$ when $\mathbf{U}$ is fixed (and conversely), so that it can be solved by alternating minimization.
Ultimately, the main limitation of these algorithms is their memory cost: holding datasets with large dimension and a large number of samples (as in recent neuroimaging cohorts) in memory leads to inefficient computation. To solve this issue, online methods are particularly attractive [1].
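A sketch of such an online approach, using scikit-learn's mini-batch dictionary learning as one possible instance (sizes and penalties are illustrative, not tuned):

    # Streaming estimation of sparse components: only small mini-batches are held
    # in memory, instead of the full (samples x voxels) matrix.
    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.RandomState(0)
    Y = rng.randn(10_000, 500)               # many samples x many voxels (synthetic)

    estimator = MiniBatchDictionaryLearning(
        n_components=20,                     # number of spatial components
        alpha=1.0,                           # l1 penalty on the codes
        batch_size=256,                      # mini-batch size kept in memory
        random_state=0,
    )
    estimator.fit(Y)
    components = estimator.components_       # (20, 500) component maps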
Another important estimation problem stems from the general issue of learning the relationship between sets of variables, in particular their covariance. Covariance learning is essential to model the dependence of these variables when they are used in a multivariate model, for instance to study potential interactions among them and with other variables. Covariance learning is necessary to model latent interactions in high-dimensional observation spaces, e.g. when considering multiple contrasts or functional connectivity data.
The difficulties are two-fold: on the one hand, there is a shortage of data to learn a good covariance model from an individual subject, and on the other hand, subject-to-subject variability poses a serious challenge to the use of multi-subject data. While the covariance structure may vary from population to population, or depending on the input data (activation versus spontaneous activity), assuming some shared structure across problems, such as their sparsity pattern, is important in order to obtain correct estimates from noisy data. Some of the most important models are:
Adequate model selection procedures are necessary to achieve the right level of sparsity or regularization in covariance estimation; the natural evaluation metric here is the out-of-sample likelihood of the associated Gaussian model. Another essential remaining issue is to develop an adequate statistical framework to test differences between covariance models in different populations. To do so, we consider different means of parametrizing covariance distributions and how these parametrizations impact the test of statistical differences across individuals.
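As an illustration of this model-selection principle, the sketch below uses scikit-learn's GraphicalLassoCV, which selects the sparsity level of the inverse covariance by cross-validated Gaussian likelihood (data are synthetic placeholders for regional time series):

    import numpy as np
    from sklearn.covariance import GraphicalLassoCV

    rng = np.random.RandomState(0)
    X = rng.randn(200, 30)                   # e.g. time series from 30 brain regions

    model = GraphicalLassoCV(cv=5).fit(X)    # regularization chosen by held-out likelihood
    precision = model.precision_             # sparse inverse covariance (conditional links)
    print("selected regularization:", model.alpha_)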
The brain is a highly structured organ, with both functional specialization and a complex network organization. While most of the knowledge historically comes from lesion studies and animal electrophysiological recordings, the development of non-invasive imaging modalities, such as fMRI, has made it possible to routinely study high-level cognition in humans since the early 1990s. This has opened major questions on the interplay between mind and brain, such as: How is the function of cortical territories constrained by anatomy (connectivity)? How to assess the specificity of brain regions? How can one reliably characterize inter-subject differences?
Functional connectivity is defined as the interaction structure that underlies brain function. Since the beginning of fMRI, it has been observed that remote regions sustain highly correlated spontaneous activity, i.e. in the absence of a driving task. This means that the signals observed during resting state define a signature of the connectivity of brain regions. The main interest of resting-state fMRI is that it provides easy-to-acquire functional markers that have recently proved very powerful for population studies.
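In its simplest form, such a resting-state connectivity signature is the correlation matrix of region-averaged time series, as in this minimal sketch (synthetic signals stand in for fMRI time courses):

    import numpy as np

    rng = np.random.RandomState(0)
    n_timepoints, n_regions = 200, 10
    signals = rng.randn(n_timepoints, n_regions)       # one column per brain region

    connectivity = np.corrcoef(signals, rowvar=False)  # (n_regions, n_regions) signature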
While fMRI has been very useful in defining the function of regions at the mm scale, magnetoencephalography (MEG) provides the other piece of the puzzle, namely the temporal dynamics of brain activity, at the ms scale. MEG is also non-invasive. It makes it possible to keep track of the precise schedule of mental operations and their interactions. It also opens the way toward a study of the rhythmic activity of the brain. On the other hand, the localization of brain activity with MEG entails the solution of a hard inverse problem.
Human neuroimaging targets two major goals:
i) the study of neural responses involved in sensory, motor or
cognitive functions, in relation to models from cognitive psychology,
i.e. the identification of neurophysiological and neuroanatomical correlates of cognition;
ii) the identification of markers in brain structure and
function of neurological or psychiatric diseases.
Both goals have to deal with a tension between
Importantly, the signal-to-noise ratio (SNR) of the data remains limited, due both to resolution improvements [2] and to between-subject variability.
Altogether, these factors have led to the realization that the results of neuroimaging studies were statistically weak, i.e. plagued with low power and leading to unreliable inference [60], particularly so because of the typically small number of subjects included in brain imaging studies (20 to 30, though this number tends to increase [61]): this is at the core of the neuroimaging reproducibility crisis.
This crisis is deeply related to a second issue, namely that only a few neuroimaging datasets are publicly available, making it impossible to
re-assess a posteriori the information conveyed by the data.
Fortunately, the situation is improving, led by projects such as
NeuroVault or
OpenfMRI. A framework for integrating such
datasets is however still missing.
Parietal has a long tradition of software development.
Scikit-learn is an increasingly popular machine learning library in Python: easy to use, efficient, and accessible to non-data-science experts. In a data exploration step, the user can enter a few lines in an interactive (but non-graphical) interface and immediately see the results of the request. Scikit-learn is a prediction engine that can also be used as a middleware for prediction tasks: for example, many web startups adapt scikit-learn to predict the buying behavior of users, provide product recommendations, or detect trends and abusive behavior (fraud, spam). Scikit-learn is used to extract the structure of complex data (text, images) and to classify such data with state-of-the-art techniques. Scikit-learn is developed in open source and available under the BSD license.
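The typical "few lines" usage looks as follows, here on a bundled example dataset (any estimator follows the same fit/predict interface):

    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier().fit(X_train, y_train)   # fit on training data
    print("test accuracy:", clf.score(X_test, y_test))     # evaluate on held-out data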
The PySAP (Python Sparse data Analysis Package, https://github.com/CEA-COSMIC/pysap) open-source image processing software package has been developed over the last three years as a collaboration between the compressed sensing group of the Inria-CEA Parietal team led by Philippe Ciuciu and the CosmoStat team (CEA/IRFU) led by Jean-Luc Starck. It has been developed for the COmpressed Sensing for Magnetic resonance Imaging and Cosmology (COSMIC) project. This package provides a set of flexible tools that can be applied to a variety of compressed sensing and image reconstruction problems in various research domains. In particular, PySAP offers fast wavelet transforms and a range of integrated optimisation algorithms. It also offers a variety of plugins for specific application domains: on top of the PySAP-MRI and PySAP-astro plugins, several complementary modules are now in development for electron tomography and electron microscopy for CEA colleagues. In October 2019, PySAP was released on PyPI (https://pypi.org/project/python-pySAP/, currently version 0.0.3) and on conda (https://anaconda.org/agrigis/python-pysap).
The PySAP-MRI plugin has been advertised through a dedicated abstract accepted at the next ISMRM workshop on Data Sampling & Image Reconstruction in late January 2020. It will be presented during a power-pitch session together with a hands-on demo session using Jupyter notebooks.
Parietal is involved in the Neurospin platform.
Inter-individual variability in the functional organization of the brain presents a major obstacle to identifying generalizable neural coding principles. Functional alignment—a class of methods that matches subjects’ neural signals based on their functional similarity—is a promising strategy for addressing this variability. To date, however, a range of functional alignment methods have been proposed and their relative performance is still unclear. In this work, we benchmark five functional alignment methods for inter-subject decoding on four publicly available datasets. Specifically, we consider three existing methods: piecewise Procrustes, searchlight Procrustes, and piecewise Optimal Transport. We also introduce and benchmark two new extensions of functional alignment methods: piecewise Shared Response Modelling (SRM), and intra-subject alignment. We find that functional alignment generally improves inter-subject decoding accuracy though the best performing method depends on the research context. Specifically, SRM and Optimal Transport perform well at both the region-of-interest level of analysis as well as at the whole-brain scale when aggregated through a piecewise scheme. We also benchmark the computational efficiency of each of the surveyed methods, providing insight into their usability and scalability. Taking inter-subject decoding accuracy as a quantification of inter-subject similarity, our results support the use of functional alignment to improve inter-subject comparisons in the face of variable structure-function organization. We provide open implementations of all methods used.
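As a hedged sketch of the core operation behind piecewise Procrustes alignment (not the full benchmarked pipeline), the following code finds, within one parcel, the orthogonal transform mapping one subject's voxel patterns onto another's, using SciPy's closed-form solution on synthetic data:

    import numpy as np
    from scipy.linalg import orthogonal_procrustes

    rng = np.random.RandomState(0)
    n_samples, n_voxels = 50, 120             # stimuli x voxels within one parcel
    source = rng.randn(n_samples, n_voxels)   # subject A responses
    target = rng.randn(n_samples, n_voxels)   # subject B responses

    R, _ = orthogonal_procrustes(source, target)   # orthogonal alignment matrix
    aligned = source @ R                           # subject A mapped to subject B's space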
Cognitive brain imaging is accumulating datasets about the neural substrate of many different mental processes. Yet, most studies are based on few subjects and have low statistical power. Analyzing data across studies could bring more statistical power; yet the current brain-imaging analytic framework cannot be used at scale as it requires casting all cognitive tasks in a unified theoretical framework. We introduce a new methodology to analyze brain responses across tasks without a joint model of the psychological processes. The method boosts statistical power in small studies with specific cognitive focus by analyzing them jointly with large studies that probe less focal mental processes. Our approach improves decoding performance for 80% of 35 widely-different functional-imaging studies. It finds commonalities across tasks in a data-driven way, via common brain representations that predict mental processes. These are brain networks tuned to psychological manipulations. They outline interpretable and plausible brain structures. The extracted networks have been made available; they can be readily reused in new neuro-imaging studies. We provide a multi-study decoding tool to adapt to new data.
Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG), where labeling can be costly in terms of specialized expertise and human processing time. Consequently, deep learning architectures designed to learn on EEG data have yielded relatively shallow models and performances at best similar to those of traditional feature-based approaches. However, in most situations, unlabeled data is available in abundance. By extracting information from this unlabeled data, it might be possible to reach competitive performance with deep neural networks despite limited access to labels.
We investigated self-supervised learning (SSL), a promising technique for discovering structure in unlabeled data, to learn representations of EEG signals. Specifically, we explored two tasks based on temporal context prediction as well as contrastive predictive coding on two clinically-relevant problems: EEG-based sleep staging and pathology detection. We conducted experiments on two large public datasets with thousands of recordings and performed baseline comparisons with purely supervised and hand-engineered approaches.
Linear classifiers trained on SSL-learned features consistently outperformed purely supervised deep neural networks in low-labeled data regimes while reaching competitive performance when all labels were available. Additionally, the embeddings learned with each method revealed clear latent structures related to physiological and clinical phenomena, such as age effects.
We demonstrate the benefit of SSL approaches on EEG data. Our results suggest that self-supervision may pave the way to a wider use of deep learning models on EEG data.
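As a toy illustration of one such pretext task, the sketch below implements a "relative positioning" labeling scheme on a synthetic recording: pairs of windows are labeled by whether they are close in time, providing supervision without annotations (window lengths and thresholds are illustrative, not those of the study):

    import numpy as np

    rng = np.random.RandomState(0)
    eeg = rng.randn(2, 3600 * 100)               # 2 channels, 1 hour at 100 Hz (synthetic)
    win, tau_pos, tau_neg = 30 * 100, 60 * 100, 15 * 60 * 100

    def sample_pair():
        """Return two EEG windows and a label: 1 if close in time, 0 if far apart."""
        t_max = eeg.shape[1] - win
        t1 = rng.randint(0, t_max)
        if rng.rand() < 0.5:                     # positive pair: within tau_pos samples
            t2 = int(np.clip(t1 + rng.randint(-tau_pos, tau_pos), 0, t_max))
            label = 1
        else:                                    # negative pair: at least tau_neg apart
            t2 = t1
            while abs(t2 - t1) < tau_neg:
                t2 = rng.randint(0, t_max)
            label = 0
        return eeg[:, t1:t1 + win], eeg[:, t2:t2 + win], label

    x1, x2, y = sample_pair()                    # such pairs feed a siamese encoder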
How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical grounding. Here we show that for almost all imputation functions, an impute-then-regress procedure with a powerful learner is Bayes optimal. This result holds for all missing-values mechanisms, in contrast with the classic statistical results that require missing-at-random settings to use imputation in probabilistic modeling. Moreover, it implies that perfect conditional imputation is not needed for good prediction asymptotically. In fact, we show that on perfectly imputed data the best regression function will generally be discontinuous, which makes it hard to learn. Crafting instead the imputation so as to leave the regression function unchanged simply shifts the problem to learning discontinuous imputations. Rather, we suggest that it is easier to learn imputation and regression jointly. We propose such a procedure, adapting NeuMiss, a neural network capturing the conditional links across observed and unobserved variables whatever the missing-value pattern. Experiments with a finite number of samples confirm that joint imputation and regression through NeuMiss is better than various two-step procedures.
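For reference, the impute-then-regress baseline discussed above can be written as a simple scikit-learn pipeline (the joint NeuMiss approach itself is not reproduced here; data with missing values are synthetic):

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    rng = np.random.RandomState(0)
    X = rng.randn(500, 10)
    y = X[:, 0] + rng.randn(500)
    X[rng.rand(500, 10) < 0.3] = np.nan          # 30% of entries go missing

    model = make_pipeline(IterativeImputer(), HistGradientBoostingRegressor())
    print("R2 on left-out data:", cross_val_score(model, X, y, cv=5).mean())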
Inferring the parameters of a stochastic model based on experimental observations is central to the scientific method. A particularly challenging setting is when the model is strongly indeterminate, i.e. when distinct sets of parameters yield identical observations. This arises in many practical situations, such as when inferring the distance and power of a radio source (is the source close and weak or far and strong?) or when estimating the amplifier gain and underlying brain activity of an electrophysiological experiment. In this work, we present hierarchical neural posterior estimation (HNPE), a novel method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters. Our method extends recent developments in simulation-based inference (SBI) based on normalizing flows to Bayesian hierarchical models. We validate our proposal quantitatively on a motivating example amenable to analytical solutions and then apply it to invert a well-known non-linear model from computational neuroscience.
We consider shared response modeling, a multi-view learning problem where one wants to identify common components from multiple datasets or views. We introduce Shared Independent Component Analysis (ShICA) that models each view as a linear transform of shared independent components contaminated by additive Gaussian noise. We show that this model is identifiable if the components are either non-Gaussian or have enough diversity in noise variances. We then show that in some cases multi-set canonical correlation analysis can recover the correct unmixing matrices, but that even a small amount of sampling noise makes Multiset CCA fail. To solve this problem, we propose to use joint diagonalization after Multiset CCA, leading to a new approach called ShICA-J. We show via simulations that ShICA-J leads to improved results while being very fast to fit. While ShICA-J is based on second-order statistics, we further propose to leverage non-Gaussianity of the components using a maximum-likelihood method, ShICA-ML, that is both more accurate and more costly. Further, ShICA comes with a principled method for shared components estimation. Finally, we provide empirical evidence on fMRI and MEG datasets that ShICA yields more accurate estimation of the components than alternatives.
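As a simplified two-view illustration of shared-response estimation (ShICA itself adds independence and noise modeling, and handles more than two views), canonical correlation analysis recovers shared components from synthetic data:

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.RandomState(0)
    shared = rng.laplace(size=(300, 3))                          # non-Gaussian shared components
    view1 = shared @ rng.randn(3, 20) + 0.1 * rng.randn(300, 20)
    view2 = shared @ rng.randn(3, 25) + 0.1 * rng.randn(300, 25)

    cca = CCA(n_components=3).fit(view1, view2)
    s1, s2 = cca.transform(view1, view2)         # per-view estimates of the shared components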
The activations of language transformers like GPT-2 have been shown to linearly map onto brain activity during speech comprehension. However, the nature of these activations remains largely unknown and presumably conflate distinct linguistic classes. Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four combinatorial classes: lexical, compositional, syntactic, and semantic representations. We then introduce a statistical method to decompose, through the lens of GPT-2's activations, the brain activity of 345 subjects recorded with functional magnetic resonance imaging (fMRI) during the listening of 4.6 hours of narrated text. The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices. Second, contrary to previous claims, syntax and semantics are not associated with separated modules, but, instead, appear to share a common and distributed neural substrate. Overall, this study introduces a versatile framework to isolate, in the brain activity, the distributed representations of linguistic constructs.
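The linear mapping underlying such analyses can be sketched as a ridge encoding model from model activations to voxel responses, evaluated by correlation on held-out samples (arrays below are synthetic placeholders, not GPT-2 activations or real fMRI):

    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.model_selection import train_test_split

    rng = np.random.RandomState(0)
    activations = rng.randn(1000, 768)            # one row per time sample (e.g. per word)
    bold = rng.randn(1000, 50)                    # voxel responses for the same samples

    A_tr, A_te, B_tr, B_te = train_test_split(activations, bold, random_state=0)
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(A_tr, B_tr)
    pred = model.predict(A_te)
    scores = [np.corrcoef(pred[:, v], B_te[:, v])[0, 1] for v in range(bold.shape[1])]
    print("mean voxel-wise correlation:", np.mean(scores))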
Effective characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in diffusion MRI (dMRI). Solving the problem of relating the dMRI signal with cytoarchitectural characteristics calls for the definition of a mathematical model that describes brain tissue via a handful of physiologically-relevant parameters and an algorithm for inverting the model. To address this issue, we propose a new forward model, specifically a new system of equations, requiring six relatively sparse b-shells. These requirements are a drastic reduction of those used in current proposals to estimate grey matter cytoarchitecture. We then apply current tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model. As opposed to other approaches from the literature, our LFI-based algorithm yields not only an estimation of the parameter vector that best describes a given observed data point, but also a full posterior distribution over the parameter space. This enables a richer description of the model inversion results providing indicators such as confidence intervals for the estimations, and better understanding of the parameter regions where the model may present indeterminacies. We approximate the posterior distribution using deep neural density estimators, known as normalizing flows, and fit them using a set of repeated simulations from the forward model. We validate our approach on simulations using dmipy and then apply the whole pipeline to the HCP MGH dataset.
Accelerating MRI scans is one of the principal outstanding problems in the MRI research community. Towards this goal, we hosted the second fastMRI competition targeted towards reconstructing MR images with subsampled k-space data. We provided participants with data from 7,299 clinical brain scans (de-identified via a HIPAA-compliant procedure by NYU Langone Health), holding back the fully-sampled data from 894 of these scans for challenge evaluation purposes. In contrast to the 2019 challenge, we focused our radiologist evaluations on pathological assessment in brain images. We also debuted a new Transfer track that required participants to submit models evaluated on MRI scanners from outside the training set. We received 19 submissions from eight different groups. Results showed one team scoring best in both SSIM scores and qualitative radiologist evaluations. We also performed analysis on alternative metrics to mitigate the effects of background noise and collected feedback from the participants to inform future challenges. Lastly, we identify common failure modes across the submissions, highlighting areas of need for future research in the MRI reconstruction community.
The Cython+ grant, funded by Bpifrance and the Région Île-de-France, unites Inria, Télécom ParisTech, Nexedi, and Abilian to improve parallel computing in Python.
Neuroscience is at an inflection point. The 150-year-old cortical specialization paradigm, in which cortical brain areas have distinct sets of functions, is experiencing unprecedented momentum, with over 1000 articles published every year. However, this paradigm is reaching its limits. Recent studies show that current approaches to atlasing brain areas, like relative location, cellular population type, or connectivity, are not enough on their own to characterize a cortical area and its function unequivocally. This hinders the reproducibility and advancement of neuroscience.
Neuroscience is thus in dire need of a universal standard to specify neuroanatomy and function: a novel formal language allowing neuroscientists to simultaneously specify tissue characteristics, relative location, known function and connectional topology for the unequivocal identification of a given brain region.
The vision of NeuroLang is that a unified formal language for neuroanatomy will boost our understanding of the brain. By defining brain regions, networks, and cognitive tasks through a set of formal criteria, researchers will be able to synthesize and integrate data within and across diverse studies. NeuroLang will accelerate the development of neuroscience by providing a way to evaluate anatomical specificity, test current theories, and develop new hypotheses.
NeuroLang will lead to a new generation of computational tools for neuroscience research. In doing so, we will shed new light on neurological research and possibly on disease treatment and palliative care. Our project complements current developments in large multimodal studies across different databases. This project will bring the power of Domain Specific Languages to neuroscience research, driving the field towards a new paradigm articulating classical neuroanatomy with current statistical and machine-learning-based approaches.
The Human Brain Project (HBP) is one of the three FET (Future and Emerging Technology) Flagship projects. Started in 2013, it is one of the largest research projects in the world. More than 500 scientists and engineers at more than 140 universities, teaching hospitals, and research centres across Europe come together to address one of the most challenging research targets – the human brain.
To tame brain complexity, the project is building a research infrastructure to help advance neuroscience, medicine, computing and brain-inspired technologies - EBRAINS. The HBP is developing EBRAINS to create lasting research platforms that benefit the wider community.
The HBP provides a framework where teams of researchers and technologists work together to scale up ambitious ideas from the lab, explore the different aspects of brain organisation, and understand the mechanisms behind cognition, learning, or plasticity.
Scientists in the HBP conduct targeted experimental studies and develop theories and models to shed light on the human connectome, addressing mechanisms that underlie information processing, from the molecule to cellular signaling and large-scale networks.
The project teams transfer the acquired knowledge to make an impact in health and innovation: Insights from basic research are translated into medical applications, to prepare the ground for new diagnoses and therapies. Discoveries about learning and brain plasticity mechanisms are used to inspire technologic progress, e.g., in artificial intelligence. In addition, the project studies the ethical and societal implications of the advancement of neuroscience and related fields.
In its final phase (April 2020 – March 2023) the HBP’s focus is to advance three core scientific areas – brain networks, their role in consciousness, and artificial neural nets – while further expanding EBRAINS.
Currently transitioning into a sustainable infrastructure, EBRAINS will remain available to the scientific community, as a lasting contribution of the HBP to global scientific progress.
While mild traumatic brain injury (mTBI) has become the focus of many neuroimaging studies, the understanding of mTBI, particularly in patients who evince no radiological evidence of injury and yet experience clinical and cognitive symptoms, has remained a complex challenge. Sophisticated imaging tools are needed to delineate the kind of subtle brain injury that is extant in these patients, as existing tools are often ill-suited for the diagnosis of mTBI. For example, conventional magnetic resonance imaging (MRI) studies have focused on seeking a spatially consistent pattern of abnormal signal using statistical analyses that compare average differences between groups, i.e., separating mTBI from healthy controls. While these methods are successful in many diseases, they are not as useful in mTBI, where brain injuries are spatially heterogeneous.
The goal of this proposal is to develop a robust framework to perform subject-specific neuroimaging analyses of Diffusion MRI (dMRI), as this modality has shown excellent sensitivity to brain injuries and can locate subtle brain abnormalities that are not detected using routine clinical neuroradiological readings. New algorithms will be developed to create Individualized Brain Abnormality (IBA) maps that will have a number of clinical and research applications. In this proposal, this technology will be used to analyze a previously acquired dataset from the INTRuST Clinical Consortium, a multi-center effort to study subjects with Post-Traumatic Stress Disorder (PTSD) and mTBI. Neuroimaging abnormality measures will be linked to clinical and neuropsychological assessments. This technique will allow us to tease apart neuroimaging differences between PTSD and mTBI and to establish baseline relationships between neuroimaging markers and clinical and cognitive measures.
Machine learning has inspired new markets and applications by extracting new insights from complex and noisy data. However, to perform such analyses, the most costly step is often to prepare the data. It entails correcting errors and inconsistencies as well as transforming the data into a single matrix-shaped table that comprises all interesting descriptors for all observations to study. Indeed, the data often result from merging multiple sources of information with different conventions. Different data tables may come without names on the columns, with missing data, or with input errors such as typos. As a result, the data cannot be automatically shaped into a matrix for statistical analysis.
This proposal aims to drastically reduce the cost of data preparation by integrating it directly into the statistical analysis. Our key insight is that machine learning itself deals well with noise and errors. Hence, we aim to develop the methodology to do statistical analysis directly on the original dirty data. For this, the operations currently done to clean data before the analysis must be adapted to a statistical framework that captures errors and inconsistencies. Our research agenda is inspired from the data-integration state of the art in database research combined with statistical modeling and regularization from machine learning.
Data integration and cleaning are traditionally performed in databases by finding fuzzy matches or overlaps and applying transformation rules and joins. To incorporate them in the statistical analysis, and thus propagate uncertainties, we want to revisit those logical and set operations with statistical-learning tools. A challenge is to turn the entities present in the data into representations well suited for statistical learning that are robust to potential errors but do not wash out uncertainty.
Prior art developed in databases is mostly based on first-order logic and sets. Our project strives to capture errors in the input entries. Hence we formulate operations in terms of similarities. We address typing entries, deduplication (finding different forms of the same entity), building joins across dirty tables, and correcting errors and missing data.
Our goal is that these steps should be generic enough to directly digest dirty data without user-defined rules. Indeed, they never try to build a fully clean view of the data, which is very hard, but rather carry the errors and ambiguities of the data into the statistical analysis.
The methods developed will be empirically evaluated on a variety of datasets, including the French public-data repository, datagouv. The consortium comprises a company specialized in data integration, Data Publica, that guides business strategies by cross-analyzing public data with market-specific data.
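As a toy illustration of this similarity-based view of deduplication, character n-gram TF-IDF vectors give a soft match between different written forms of the same entity (entries below are made up):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    entries = ["Inria Saclay", "INRIA-Saclay", "Inria Sacaly", "CEA NeuroSpin"]
    vectors = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)).fit_transform(entries)
    similarity = cosine_similarity(vectors)   # high off-diagonal values flag likely duplicates
    print(similarity.round(2))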
In many scientific applications, increasingly-large datasets are being acquired to describe more accurately biological or physical phenomena. While the dimensionality of the resulting measures has increased, the number of samples available is often limited, due to physical or financial limits. This results in impressive amounts of complex data observed in small batches of samples.
A question that arises is then: what features in the data are really informative about some outcome of interest? This amounts to inferring the relationships between these variables and the outcome, conditionally on all other variables. Providing statistical guarantees on these associations is needed in many fields of data science, where competing models require rigorous statistical assessment. Yet reaching such guarantees is very hard.
FAST-BIG aims at developing theoretical results and practical estimation procedures that render statistical inference feasible in such hard cases. We will develop the corresponding software and assess novel inference schemes on two applications: genomics and brain imaging.
The scale-free concept formalizes the intuition that, in many systems, the analysis of temporal dynamics cannot be grounded on specific and characteristic time scales. The scale-free paradigm has permitted the relevant analysis of numerous applications, very different in nature, ranging from natural phenomena (hydrodynamic turbulence, geophysics, body rhythms, brain activity,...) to human activities (Internet traffic, population, finance, art,...).
Yet, most successes of scale-free analysis were obtained in contexts where data are univariate, homogeneous along time (a single stationary time series), and well-characterized by simple-shape local singularities. For such situations, scale-free dynamics translate into global or local power laws, which significantly eases practical analyses. Numerous recent real-world applications (macroscopic spontaneous brain dynamics, the central application in this project, being one paradigm example), however, naturally entail large multivariate data (many signals), whose properties vary along time (non-stationarity) and across components (non-homogeneity), with potentially complex temporal dynamics, thus intricate local singular behaviors.
These three issues call into question the intuitive and founding identification of scale-free dynamics with power laws, and thus make multivariate scale-free and multifractal analyses difficult, precluding the use of univariate methodologies. This explains why the concept of scale-free dynamics is barely used, and with limited success, in such settings, and highlights the overriding need for a systematic methodological study of multivariate scale-free and multifractal dynamics. The Core Theme of MULTIFRACS consists in laying the theoretical foundations of a practical, robust statistical signal processing framework for multivariate, non-homogeneous scale-free and multifractal analyses, suited to varied types of rich singularities, as well as in performing accurate analyses of scale-free dynamics in spontaneous and task-related macroscopic brain activity, to assess their nature, functional roles and relevance, and their relations to behavioral performance in a timing-estimation task, using multimodal functional imaging techniques.
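As a reminder of the univariate case that these analyses generalize, scale-free dynamics show up as a power-law spectrum, whose exponent can be read off a log-log fit of a spectral estimate; the sketch below does so on a synthetic 1/f-type signal:

    import numpy as np
    from scipy.signal import welch

    rng = np.random.RandomState(0)
    signal = np.cumsum(rng.randn(2 ** 14))       # crude 1/f^2-like process (random walk)

    freqs, psd = welch(signal, nperseg=1024)
    mask = freqs > 0                             # drop the zero frequency before the log
    slope, _ = np.polyfit(np.log(freqs[mask]), np.log(psd[mask]), 1)
    print("estimated scaling exponent:", -slope)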
This overarching objective is organized into four challenges.
The project finally started in 2021. A postdoc, Tiziana Cattai, has been identified and will be hired in spring 2022.
The DARLING project aims to propose new adaptive, distributed, and collaborative learning methods on large dynamic graphs, in order to extract structured information from the data flows generated and/or transiting at the nodes of these graphs. In order to obtain performance guarantees, these methods will be systematically accompanied by an in-depth study based on random matrix theory. This powerful tool, never exploited so far in this context although perfectly suited for inference on random graphs, will thereby provide avenues for improvement. Finally, in addition to their evaluation on public data sets, the methods will be compared with each other using two advanced imaging techniques in which two of the partners are involved: radio astronomy with the giant SKA instrument (Obs. Côte d'Azur) and magnetoencephalographic brain imaging (Inria Parietal at NeuroSpin, CEA Saclay). These involve the processing of time series on graphs while operating at extreme observation scales.
The project will be starting in 2021 with a post-doc or PhD student to be hired probably in fall 2021 or 2022.
VLFMRI aims at developing a very low-field magnetic resonance imaging (MRI) system as an alternative to conventional high-field MRI for continuous imaging of premature newborns, to detect hemorrhages or ischemia. This system is based on a combination of a new generation of magnetic sensors relying on spin electronics, optimized MR acquisition sequences (based on the SPARKLING patent, Inria-CEA Parietal team at NeuroSpin), and an open system compatible with an incubator, which will make it possible to achieve an image resolution of 1 mm.
The project accepted by ANR in 2019 started in 2020 with an engineer hired in 2020. This project is in collaboration with the MEG groups at CEA NeuroSpin and the Brain and Spine Institute (ICM) in Paris.
The neuroimaging community recently started an international effort to standardize the sharing of data recorded with magnetoencephalography (MEG) and with electroencephalography (EEG). This format, known as the Brain Imaging Data Structure (BIDS), now needs wider adoption, notably in the French neuroimaging community, along with the development of dedicated software tools that operate seamlessly on BIDS-formatted datasets. The meegBIDS.fr project has three aims: 1) accelerate research cycles by allowing analysis software tools to work with BIDS-formatted data, 2) simplify data sharing with high quality standards thanks to automated validation tools, 3) train French neuroscientists to leverage existing public BIDS MEG/EEG datasets and to share their own data with little effort.
The project accepted by ANR in 2020 started in 2021 with a PhD student. An engineer should be hired in 2022 to lead the software engineering developments. This project is in collaboration with the University of Freiburg in Germany and the RIKEN AIP in Japan.
Worldwide, people are living longer than ever before in history. Today, most people can expect to live into their sixties and beyond. Ageing societies, however, bring social, economic, and healthcare challenges. Japan (#1), France (#3) and Germany (#4) belong to the top five countries worldwide with the highest economic old-age dependency ratio of people aged 65 years and over. Particularly detrimental health conditions in older age include depression and dementia. Today, around 50 million people globally suffer from dementia and there are nearly 10 million new cases every year. According to the WHO, there is a new case of dementia every 3 seconds globally. Mastering the challenges associated with aging societies in general, and those associated with age-related brain disorders in particular, is therefore of outstanding global importance, especially for the three countries involved in the present trilateral call: Japan, France, and Germany. Therefore, the aim of the present project is to leverage the potential of artificial intelligence (AI) approaches to foster healthy aging. To this aim we will study objective machine-learning-driven biomarkers to evaluate cognitive interventions as well as support personalized therapies. We will develop novel, dedicated machine learning (ML) methods and adapt them to the special signal types that can be recorded from the human brain. We will make our methods publicly available in an open-source reference software package, focusing on unsupervised learning, data augmentation, domain adaptation, and interpretable machine learning models. Our main scientific aim is to optimize the decodable information about the current functional state of the brain, to identify biomarkers of the risk for cognitive impairments and different forms of dementia, and to use these improved methods to guide AI-facilitated cognitive training. These joint efforts between Japan, France and Germany will be accompanied by a focus on ethical and societal aspects of AI in the context of aging, paired with participatory, transnational outreach activities, to foster the dialog between our scientific community and the general public.
The project accepted in 2020 by ANR in the "Chaire IA" call started in 2021 with the recruitment of an engineer, 1 PhD and one post-doc.
The general objectives of BrAIN is to develop ML algorithms that can learn with weak or no supervision on neural time series. It will require contributions to self-supervised learning, domain adaptation and data augmentation techniques, exploiting the known underlying physical mechanisms that govern the data generating process of neurophysiological signals.
The project accepted in 2020 by ANR in the "Chaire IA" call will be starting in 2021 with an engineer, 1 PhD and a starting position to be hired in 2021.
Cognitive science describes mental operations, and functional brain imaging provides a unique window into the brain systems that support these operations. A growing body of neuroimaging research has provided significant insight into the relations between psychological functions and brain activity. However, the aggregation of cognitive neuroscience results to obtain a systematic mapping between structure and function faces the roadblock that cognitive concepts are ill-defined and may not map cleanly onto the computational architecture of the brain.
To tackle this challenge, we propose to leverage rapidly increasing data sources: text and brain locations described in neuroscientific publications, brain images and their annotations taken from public data repositories, and several reference datasets. Our aim here is to develop multi-modal machine learning techniques to bridge these data sources.
The project accepted in 2020 by ANR in the "Chaire IA" call will be starting in 2021 with an engineer, 2 PhDs and a post-doc to be hired in 2021.
The goal of LearnI is to develop machine-learning across multiple sources of relational data, with numerical and symbolic entries. LearnI will address the core challenge of joining and aggregating across tables where the information is represented with different symbols. For this, LearnI will develop methods to embed the discrete elements in vector spaces and perform data assembly across tables with these vectorial representations.
Gael Varoquaux was a member of the organizing committee of the autoDS workshop at ECML.
Bertrand Thirion organized a workshop on modern statistical methods at the OHBM 2021 conference.
Bertrand Thirion has given the following talks:
Alexandre Gramfort has given the following talks:
Philippe Ciuciu has given the following talks:
Thomas Moreau has given the following talks:
Gael Varoquaux has given the following talks:
Bertrand Thirion has been part of a panel reviewing CEA-DRT activities in AI in November 2021.
Gael Varoquaux has been part of the Global Partnership on AI.
Philippe Ciuciu has published an article in the March 2021 issue of the Contact SKA magazine, entitled “When the brain meets the stars: Knowledge made visible to the naked eye” (pp. 25-26).
After the seminal publication about SPARKLING on the Dr Imago website in 2019 [3], Philippe Ciuciu has written a new article for this online journal, dedicated to medical doctors, about deep learning for MRI (see details here: la-recherche-en-astrophysique-faconne-les-algorithmes-dimagerie-de-demain/).
Gaël Varoquaux, Olivier Grisel, Guillaume Lemaitre, and Loic Esteve have created and run the scikit-learn MOOC (10 000 enrolled, 1 000 finishers).
Bertrand Thirion has given a talk at the Semaine de la Science, NeuroSpin, on March 18th, entitled "Le décodage de l'activité du cerveau" (decoding brain activity).