- A5.3. Image processing and analysis
- A5.4. Computer vision
- A5.4.1. Object recognition
- A5.4.5. Object tracking and motion analysis
- A5.4.6. Object localization
- A5.6. Virtual reality, augmented reality
- A5.10.2. Perception
- B2.6. Biological and medical imaging
- B5.9. Industrial maintenance
- B9.5.3. Physics
1 Team members, visitors, external collaborators
- Marie-Odile Berger [Team leader, Inria, Senior Researcher, HDR]
- Erwan Kerrien [Inria, Researcher, HDR]
- Fabien Pierre [Univ de Lorraine, Associate Professor]
- Gilles Simon [Univ de Lorraine, Associate Professor, HDR]
- Frédéric Sur [Univ de Lorraine, Professor, HDR]
- Pierre-Frédéric Villard [Univ de Lorraine, Associate Professor]
- Brigitte Wrobel-Dautcourt [Univ de Lorraine, Associate Professor]
- Youssef Assis [Univ de Lorraine, from Nov 2020]
- Abdelkarim Elassam [Inria, from Oct 2020]
- Nariman Khaledian [Inria, from Oct 2020]
- Daryna Panicheva [Univ de Lorraine]
- Matthieu Zins [Inria]
- Romain Boisseau [Inria, Engineer, from Sep 2020]
- Vincent Gaudilliere [Inria, Engineer]
- Thomas Mangin [Univ de Lorraine, Engineer, until Mar 2020]
Interns and Apprentices
- Amandine Caut [Univ de Lorraine, from Feb 2020 until Jul 2020]
- Liang Liao [Centre hospitalier universitaire de Nancy, until Jun 2020]
- Hugo Steiger [Univ de Lorraine, until Sep 2020]
- Isabelle Blanchard [Inria]
- Virginie Priester [CNRS]
2 Overall objectives
2.1 Augmented Reality
The basic concept of Augmented Reality (AR) is to place information correctly registered with the environment into the user's perception. What makes AR stand out is that this new technology offers the potential for big changes in many application fields such as industrial maintenance, creative technologies, image guided medical gestures, entertainment...
Augmented reality technologies have made major advances in recent years, both in terms of capability, mobile development and integration into current mobile devices. Most applications are dedicated to multimedia and entertainment, games, lifestyle and healthcare and use rough localization information provided by the sensors of the mobile phones. Cutting-edge augmented reality applications which take place in complex environments and require high accuracy in augmentation are less prevalent. There are indeed still technological barriers that prevent applications from reaching the robustness and the accuracy required by such applications.
The aim of the MAGRIT project is to develop vision-based methods which allow significant progress of AR technologies in terms of ease of implementation, reliability and robustness. An expected consequence is the widening of the current application field of AR.
The team is active in both medical and classical applications of augmented reality for which accurate integration of the virtual objects within the scene is essential. Key requirements of AR systems are the availability of registration techniques, both rigid and elastic, that allow the virtual objects to be correctly aligned with the environment, as well as means to build 3D models which are appropriate for pose computation and for handling interactions between the virtual objects and the real scene. Considering the common needs for tracking, navigation, advanced modeling and visualization technologies in both medical and industrial applications, the team focuses on three main objectives: matching, localization and modeling. Methods are developed with a view to meeting the expected robustness and accuracy over time and to providing the user with a realistic perception of the augmented scene, while satisfying the real-time constraints of these procedures.
In the last decade, Deep Learning approaches have made it possible to achieve unprecedented performance not only on object recognition but also on a broad range of computer vision problems. One of the objectives of the MAGRIT team is to integrate machine learning techniques into the geometric problems under consideration in order to improve the robustness of the designed solutions.
3 Research program
3.1 Matching and 3D tracking
One of the most basic problems currently limiting AR applications is the registration problem. The objects in the real and virtual worlds must be properly aligned with respect to each other, or the illusion that the two worlds coexist will be compromised.
As a large number of potential AR applications are interactive, real-time pose computation is required. Although the registration problem has received a lot of attention in the computer vision community, real-time registration is still far from solved, especially for unstructured environments. Ideally, an AR system should work in all environments, without the need to prepare the scene ahead of time, and independently of variations in experimental conditions (lighting, weather conditions, etc.).
For several years, the MAGRIT project has been aiming at developing on-line and marker-less methods for camera pose computation. The main difficulty with on-line tracking is to ensure robustness of the process over time. For off-line processes, robustness is achieved by using spatial and temporal coherence of the considered sequence through move-matching techniques. To get robust open-loop systems, we have investigated various methods, ranging from statistical methods to the use of hybrid camera/sensor systems. Many of these methods are dedicated to piecewise-planar scenes and combine the advantage of move-matching methods and model-based methods. In order to reduce statistical fluctuations in viewpoint computation, which lead to unpleasant jittering or sliding effects, we have also developed model selection techniques which allow us to noticeably improve the visual impression and to reduce drift over time. Another line of research which has been considered in the team to improve the reliability and the robustness of pose algorithms is to combine the camera with another form of sensor in order to compensate for the shortcomings of each technology.
The success of pose computation over time largely depends on the quality of the matching at the initialization stage. Indeed, the current image may be very different from the appearances described in the model both on the geometrical and the photometric sides. Research is thus conducted in the team on the use of probabilistic methods to establish robust correspondences of features. The use of a contrario methods has been investigated to achieve this aim 8. We especially addressed the complex case of matching in scenes with repeated patterns which are common in urban scenes. We are also investigating the problem of matching images taken from very different viewpoints which is central for the re-localization issue in AR. Within the context of a scene model acquired with structure-from-motion techniques, we are currently investigating the use of viewpoint simulation in order to allow successful pose computation even if the considered image is far from the positions used to build the model 5.
Over time, the issue of tracking deformable objects has gained importance in the team. This topic is mainly addressed in the context of medical applications through the design of bio-mechanical models guided by visual features 3. We have successfully investigated the use of such models in laparoscopy, with a vascularized model of the liver and with a hyper-elastic model for tongue tracking in ultrasound images. However, these results have been obtained so far in relatively controlled environments, with non-pathological cases. When clinical routine applications are to be considered, many parameters and considerations need to be taken into account. Among the problems that need to be addressed are more realistic model representations, the specification of the range of physical parameters and the need to enforce the robustness of the tracking with respect to outliers, which are common in the interventional context.
3.2 Image-based Modeling
Modeling the scene is a fundamental issue in AR for many reasons. First, pose computation algorithms often use a model of the scene or at least some 3D knowledge on the scene. Second, effective AR systems require a model of the scene to support interactions between the virtual and the real objects such as occlusions, lighting reflections, contacts... in real-time. Unlike pose computation which has to be performed in a sequential way, scene modeling can be considered as an off-line or an on-line problem depending on the requirements of the targeted application. Interactive in-situ modeling techniques have thus been developed with the aim to enable the user to define what is relevant at the time the model is being built during the application. On the other hand, we also proposed off-line multimodal techniques, mainly dedicated to AR medical applications, with the aim of obtaining realistic and possibly dynamic models of organs suitable for real-time simulation 4.
In-situ modeling allows a user to directly build a 3D model of his/her surrounding environment and verify the geometry against the physical world in real-time. This is of particular interest when using AR in unprepared environments or building scenes that either have an ephemeral existence (e.g., a film set) or cannot be accessed frequently (e.g., a nuclear power plant). We have especially investigated two systems, one based on the image content only and the other based on multiple data coming from different sensors (camera, inertial measurement unit, laser rangefinder). Both systems use the camera-mouse principle 31 (i.e., interactions are performed by aiming at the scene through a video camera) and both systems have been designed to acquire polygonal textured models, which are particularly useful for camera tracking and object insertion in AR.
Multimodal modeling for real-time simulation
With respect to classical AR applications, AR in a medical context differs in the nature and the size of the data which are available: a large amount of multimodal data is acquired on the patient or possibly on the operating room through sensing technologies or various image acquisitions 28. The challenge is to analyze these data, to extract interesting features, to fuse and to visualize this information in a proper way. Within the MAGRIT team, we address several key problems related to medical augmented environments. Being able to acquire multimodal data which are temporally synchronized and spatially registered is the first difficulty we face when considering medical AR. Another key requirement of AR medical systems is the availability of 3D (+t) models of the organ/patient built from images, to be overlaid onto the users' view of the environment.
Methods for multimodal modeling are strongly dependent on the imaging modalities and the organ specificities. We thus only address a restricted number of medical applications (interventional neuroradiology, laparoscopic surgery) for which we have strong expertise and close relationships with motivated clinicians. In these applications, our aim is to produce realistic models and then realistic simulations of the patient to be used for the training of surgeons or the re-education of patients.
One of our main applications is about neuroradiology. For the last 25 years, we have been working in close collaboration with the neuroradiology laboratory (CHRU-University Hospital of Nancy) and GE Healthcare. As several imaging modalities are now available in an intraoperative context (2D and 3D angiography, MRI, ...), our aim is to develop a multimodal framework to assist therapeutic decision and treatment.
We have mainly been interested in the effective use of a multimodal framework in the treatment of arteriovenous malformations (AVM) and aneurysms in the context of interventional neuroradiology. The goal of interventional gestures is to guide endoscopic tools towards the pathology with the aim to perform embolization of the AVM or to fill the aneurysmal cavity by placing coils. We have proposed and developed multimodality and augmented reality tools which make various image modalities (2D and 3D angiography, fluoroscopic images, MRI, ...) cooperate in order to assist physicians in clinical routine. One of the successes of this collaboration is the implementation of the concept of augmented fluoroscopy, which helps the surgeon to guide endoscopic tools towards the pathology. Lately, in cooperation with the team MIMESIS, we have proposed new methods for implicit modeling of the vasculature with the aim of obtaining near real-time simulation of the coil deployment in the aneurysm 4. These works open the way towards near real-time patient-based simulations of interventional gestures both for training and for planning.
3.3 Parameter estimation
Many problems in computer vision or image analysis can be formulated in terms of parameter estimation from image-based measurements. This is the case of many problems addressed in the team such as pose computation or image-guided estimation of 3D deformable models. Often traditional robust techniques which take into account the covariance on the measurements are sufficient to achieve reliable parameter estimation. However, depending on their number, their spatial distribution and the uncertainty on these measurements, some problems are very sensitive to noise and there is a considerable interest in considering how parameter estimation could be improved if additional information on the noise were available. Another common problem in our field of research is the need to estimate constitutive parameters of the models, such as (bio)-mechanical parameters for instance. Direct measurement methods are destructive, and elaborating image-based methods is thus highly desirable. Besides designing appropriate estimation algorithms, a fundamental question is to understand what group of parameters under study can be reliably estimated from a given experimental setup.
This line of research is relatively new in the team. One of the challenges is to improve image-based parameter estimation techniques by taking into account sensor noise and specific image formation models. In a collaboration with the Pascal Institute (Clermont-Ferrand), metrological performance enhancement for experimental solid mechanics has been addressed through the development of dedicated signal processing methods 7. In the medical field, specific methods based on an adaptive evolutionary optimization strategy have been designed for estimating respiratory parameters 9. In the context of designing realistic simulators for neuroradiology, we are now considering how parameters involved in the simulation could be adapted to fit real images 16.
4 Application domains
4.1 Augmented reality
We have significant experience in AR, which has allowed good progress in building usable, reliable and robust AR systems. Our contributions cover the entire AR process: matching, pose initialization, 3D tracking, in-situ modeling, and handling interactions between real and virtual objects.
4.2 Medical Imaging
For more than 20 years, we have been working in close collaboration with University Hospital of Nancy and GE Healthcare in interventional neuroradiology. Our common aim is to develop a multimodal framework to assist therapeutic decisions and interventional gestures. The contributions of the team focus on the development of AR tools for neuro-navigation as well as the development of simulation tools for training or planning. Laparoscopic surgery is another field of interest with the development of methods for tracking deformable organs based on bio-mechanical models. Some of these projects are developed in collaboration with the MIMESIS project team.
4.3 Experimental mechanics
In experimental solid mechanics, an important problem is to characterize properties of specimens subject to mechanical constraints, which makes it necessary to measure tiny strains. Contactless measurement techniques have emerged in the last few years and are spreading quickly. They are mainly based on images of the surface of the specimen on which a regular grid or a random speckle has been deposited. We have been engaged since June 2012 in a transdisciplinary collaboration with Institut Pascal (Clermont-Ferrand Université). The aim is to characterize the metrological performance of these techniques, which is limited by, e.g., sensor noise, and to improve them with several dedicated image processing tools.
5 Highlights of the year
This year has seen the maturation of the concept of objects as landmarks for pose computation 13, 24, 21 as well as the first and successful use of deep learning based methods for measuring displacement and strain fields in photomechanics.
6 New results
6.1 Matching and localization
Participants: Marie-Odile Berger, Romain Boisseau, Abdelkarim Elassam, Vincent Gaudilliere, Gilles Simon, Matthieu Zins.
Localization from objects
Recent years have seen the emergence of very effective ConvNet-based object detectors that have reconfigured the computer vision landscape. As a consequence, new approaches that propose object-based reasoning to solve traditional problems, such as camera pose estimation, have appeared. In particular, these methods have shown that modelling 3D objects by ellipsoids and 2D detections by ellipses offers a convenient way to link 2D and 3D data. Following the work that was initiated on that subject last year 13, we have proposed a novel object-based pose estimation algorithm that does not require any sensor other than an RGB camera. Our method operates from at least two object detections, and is based on a new paradigm that enables us to decrease the Degrees of Freedom (DoF) of the pose estimation problem from six to three, while two simplifying yet realistic assumptions reduce the remaining DoF to only one. An exhaustive search is performed over the single remaining unknown parameter to recover the full camera pose. Robust algorithms designed to deal with any number of objects as well as a refinement step are introduced. The effectiveness of the method has been assessed on the challenging T-LESS and Freiburg datasets 13, 18. This work on object-based localization from 2D/3D ellipse-ellipsoid correspondences has been the subject of Vincent Gaudillière's PhD thesis 24.
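The link between a 3D ellipsoid and its image ellipse can be illustrated with the standard dual-quadric projection from projective geometry, C* ~ P Q* Pᵀ. The sketch below is only an illustration of that relation, not the pose estimation algorithm described above; the function names and the toy camera are assumptions made for the example.

```python
import numpy as np

def ellipsoid_dual_quadric(center, semi_axes, rotation=None):
    """Return the 4x4 dual quadric Q* of an ellipsoid (hypothetical helper)."""
    R = np.eye(3) if rotation is None else np.asarray(rotation, dtype=float)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = center
    # Dual quadric of an axis-aligned ellipsoid centred at the origin.
    D = np.diag(np.append(np.square(semi_axes), -1.0))
    return T @ D @ T.T

def project_ellipsoid(Q_dual, P):
    """Project Q* with a 3x4 camera matrix P; return ellipse centre and semi-axes."""
    C_dual = P @ Q_dual @ P.T
    C_dual = C_dual / -C_dual[2, 2]          # normalise so that C*[2, 2] = -1
    center = -C_dual[:2, 2]
    M = C_dual[:2, :2] + np.outer(center, center)
    semi_axes = np.sqrt(np.linalg.eigvalsh(M))[::-1]  # largest axis first
    return center, semi_axes

# Toy setup (assumed values): unit sphere 5 units in front of a camera
# with focal length 500 pixels and principal point at the origin.
K = np.array([[500.0, 0.0, 0.0], [0.0, 500.0, 0.0], [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
Q = ellipsoid_dual_quadric([0.0, 0.0, 5.0], [1.0, 1.0, 1.0])
center, axes = project_ellipsoid(Q, P)
```

In this toy configuration the projected outline is a circle centred on the principal point, which makes the result easy to check by hand against the tangent-cone geometry.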
3D-Aware Ellipse Prediction for Object-Based localization
Though promising, the approaches described above use the ellipses fitted to the detection bounding boxes as an approximation of the imaged objects. This may result in large inaccuracies in the computed pose. Within the framework of Matthieu Zins's PhD thesis, we go one step further and propose a learning-based method which detects improved elliptic approximations of objects that are coherent with the 3D ellipsoid in terms of perspective projection. Experiments show that the accuracy of the computed pose significantly increases thanks to our method and that the pose is more robust to the variability of the boundaries of the detection boxes 21. This is achieved with very little effort in terms of training data acquisition: a few hundred calibrated images of which only three need manual object annotation.
An Inria technological transfer action (ATT) on the subject of object based localization started in January 2020 with the aim to produce a demonstrator for industrial maintenance in complex environments. Vincent Gaudillière was recruited in this context on a one-year contract, as a continuation of his PhD. A library written in C, called EllCV, was first developed.
This library integrates two state-of-the-art object-based computer vision applications: 3D scene model reconstruction and camera pose estimation. A client-server-service model has also been set up, with the help of Benjamin Dexheimer, a SED development engineer, to enable remote access to these applications. Services are launched by the server in the form of a dedicated Docker container running on a host machine. This architecture allows partners to use our technology without having to compile code on a specific architecture, which makes it easier to distribute while protecting ownership of the code. Access to these services has already been granted to an American company, as well as to our DFKI partners in the MOVEON project.
Vanishing point computation
Abdelkarim Elassam joined the Magrit team in October 2020 as a PhD student funded by the Inria/DFKI MOVEON project. The ambition of this project is to design semantic SLAM techniques operating over long sequences in real-life scenarios (poor lighting, dynamic scenes, non-flat surfaces, etc.). In addition to the 3D points used in classical SLAM approaches, we plan to exploit semantic primitives such as planes, objects, horizon line and vanishing points. Abdelkarim started by dealing with the horizon line. His first investigations confirmed that the offset and slope of that line can be robustly obtained by CNN classification. He is currently studying whether computing vanishing points along a horizon line can also be expressed as a (multiclass) classification problem. In parallel, with our German partners, he is studying how the estimation of the horizon and vanishing points can be merged with that of planar surfaces using a multitask CNN so that the resolution of each task benefits the resolution of the other two.
Camera pose data collection in Industrial Environments
Collecting correlated scene images and camera poses is an essential step towards learning absolute camera pose regression models. While the acquisition of such data in living environments is relatively easy by following regular roads and paths, it is still a challenging task in constrained industrial environments. This is because industrial objects have varied sizes and inspections are usually carried out with non-constant motions. As a result, regression models are more sensitive to scene images with respect to viewpoints and distances. Motivated by this, we designed a simple but efficient camera pose data collection method, WatchPose, to improve the generalization and robustness of camera pose regression models 17. Specifically, WatchPose tracks nested markers and visualizes viewpoints in an Augmented Reality-based manner to properly guide users to collect training data from broader camera-object distances and more diverse views around the objects. Experiments show that WatchPose can effectively improve the accuracy of existing camera pose regression models compared to the traditional data acquisition method. We also introduce a new dataset, Industrial10 1, which can be used to train and/or assess a pose computation algorithm in an industrial environment.
6.2 Handling non-rigid deformations
Participants: Marie-Odile Berger, Jaime Garcia Guevara, Erwan Kerrien, Nariman Khaledian, Daryna Panicheva, Raffaella Trivisonne, Pierre-Frédéric Villard.
Compliance-based non rigid registration
Within Jaime Guevara's PhD thesis, we have investigated non-rigid registration methods which exploit the matching of the vascular trees and are able to cope with large deformations of the organ. This year, we have developed a matching method which is entirely based on the mechanical properties of the organ. We thus avoid the tedious parameter tuning required by many methods and instead use parameters whose values are known or can be measured. Our method makes use of an advanced biomechanical model which handles heterogeneities and anisotropy due to vasculature. The main originality of the method lies in the definition of a novel metric for generating improved graph-matching hypotheses, based on the notion of compliance, the inverse of stiffness. This method reduces the computation time by first predicting the most plausible matching hypotheses on a mechanical basis, and reduces the sensitivity to the search space parameters. These contributions improve the registration quality and meet intra-operative timing constraints. Experiments have been conducted on ten realistic synthetic datasets and two real porcine datasets which were automatically segmented. This work has been published in the journal Annals of Biomedical Engineering 12.
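To make the notion of compliance concrete, the following minimal sketch builds the stiffness matrix of a one-dimensional chain of springs fixed at one end and inverts it. It only illustrates the relation between stiffness and compliance; it is not the biomechanical model or the matching metric of the method above, and the function name and spring values are assumptions.

```python
import numpy as np

def chain_stiffness(spring_constants):
    """Stiffness matrix K of a spring chain whose first node is fixed (toy model)."""
    n = len(spring_constants)
    K = np.zeros((n, n))
    for m, k in enumerate(spring_constants):
        # Spring m links free node m to the previous node (or to the fixed wall if m == 0).
        K[m, m] += k
        if m > 0:
            K[m - 1, m - 1] += k
            K[m - 1, m] -= k
            K[m, m - 1] -= k
    return K

K = chain_stiffness([2.0, 4.0, 4.0])
C = np.linalg.inv(K)   # compliance: displacement at node i per unit force at node j

# Displacements produced by a unit force applied at the last node.
u = C @ np.array([0.0, 0.0, 1.0])
```

Each entry C[i, j] is the displacement at node i caused by a unit force at node j, so a compliant (soft) region moves more under the same load than a stiff one.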
Individual-specific heart valve modeling
Mitral valve computational models are widely studied in the literature. They can be used for preoperative planning or anatomical understanding. Manual extraction of the valve geometry on medical images is tedious and requires special training while automatic segmentation is still an open problem. In previous works, we have proposed a fully-automatic pipeline to extract the valve chordae architecture in order to use it for simulating the mitral valve in the closed state. This requires that the representation of the mitral valve be compatible with the mechanical requirements of the biomechanical model while still being close to the segmented chordae. This year, we especially focussed on optimizing this architecture with respect to an objective function based on mechanical, anatomical and image-based considerations. We have also proposed a method for validating the segmentation based on a graph-based metric. We used this metric on five micro-CT scans and obtained an 87.5% accuracy rate. This work was carried out in the context of Daryna Panicheva's PhD thesis.
Image-based biomechanical simulation of the diaphragm during mechanical ventilation
When intensive care patients are subjected to mechanical ventilation, the ventilator causes damage to the muscles that govern the normal breathing, leading to Ventilator Induced Diaphragmatic Dysfunction (VIDD). The INVIVE project aims at studying the mechanics of respiration through numerical simulation in order to learn more about the onset of VIDD. In previous works, we proposed to use a meshfree RBF method 22. We worked this year on applying it in the 2D case by implementing the resolution with two kinds of boundary conditions: only displacements (Dirichlet case) and only forces (Robin case). Experiments were conducted on the diaphragm and compared with results from a basic FEM solver. We validated that our RBF method is accurate when a linear elastic problem is solved.
3D catheter navigation from monocular images
In interventional radiology, the 3D shape of the micro-tool (guidewire, micro-catheter or micro-coil) can be very difficult, if not impossible to infer from fluoroscopy images. We consider this question as a single view 3D curve reconstruction problem. Our aim is to assess whether, and under which conditions, a sophisticated physics-based model can be effective to compensate for the incomplete data in this ill-posed problem.
We follow a non-rigid shape-from-motion approach based on Bayesian filtering to realize the fusion of image data with a physical model implemented through simulation. The focus of this year was on validation. Thomas Mangin was hired on a 1-year engineer contract (until March 2020) to design and develop an experimental platform to acquire ground truth 3D shapes of a catheter navigating within a translucent silicone phantom. This work was finalized this year by evaluating the accuracy of the platform and assessing the uncertainty along the acquisition chain with Monte Carlo simulations: catheter shapes can be reconstructed within 0.5 mm accuracy. This platform was used to validate the algorithms for monocular reconstruction developed during Raffaella Trivisonne's PhD work, and further aspects of her work were also experimentally validated (fidelity of the simulation under different parameterizations). The results were published in 16. Raffaella Trivisonne defended her PhD thesis in October 25. Our work on the subject will be funded by the PreSPIN ANR project in the next few years.
6.3 Image processing
Participants: Marie-Odile Berger, Fabien Pierre, Gilles Simon, Frédéric Sur.
In computational photomechanics, two main methods are available for estimating displacement and strain fields on the surface of a material specimen subjected to a mechanical test, namely digital image correlation (DIC) and localized spectrum analysis (LSA). With both methods, a contrasted pattern marks the surface of the specimen: either a random speckle pattern for DIC or a regular pattern for LSA, this latter method being based on Fourier analysis. It is a challenging problem since strains are tiny quantities giving deformations often not visible to the naked eye. The recent outcomes of our collaboration with Institut Pascal (Université Clermont-Auvergne) focus on three areas.
First, we have investigated the optimization of the pattern marking the specimen, which is the topic of several recent papers. The checkerboard is the optimal pattern in terms of sensor noise propagation when the signal is correctly sampled (see numerical assessments in 15 and verification in a real experimental setting in 29), but its periodicity causes convergence issues with DIC. We have assessed the metrological performance of three methods designed to extract displacement fields from deformed specimens marked with checkerboards 14.
Second, we have characterized the results of DIC, which is probably the most used method in photomechanics. We have proposed new metrological indicators to compare several incarnations of DIC on the same ground 10. We have also proposed a careful analysis of DIC in 26. The displacement estimated by DIC is shown to be impaired by bias and uncertainty terms related to sensor noise, the interpolation scheme needed to reach subpixel accuracy, image gradient distribution, as well as the difference between the hypothesized parametric transformation and the true displacement. To this end, we have generalized results from the literature on stereo-imaging, we have reexamined the so-called fattening effect, and we have presented in a unified work several results from the DIC community.
Third, we have proposed a convolutional neural network to estimate displacement fields in photomechanics. Displacement fields can be regarded as a particular case of the classic concept of optical flow. However, CNNs had not been used so far to perform such measurements. Our paper 27 explains how a CNN called StrainNet can be developed to reach this goal, and how specific ground truth datasets are elaborated. The main result is that StrainNet successfully performs such measurements, and that it achieves competitive results in terms of metrological performance and computation time. The conclusion is that CNNs like StrainNet offer a viable alternative to DIC, especially for real-time applications. Software is publicly available 2.
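The Fourier-based principle behind LSA, mentioned at the beginning of this section, can be illustrated in one dimension: for a regular pattern of known pitch, the local phase of a windowed Fourier coefficient shifts proportionally to the displacement. The sketch below is a simplified illustration under assumed conventions (Gaussian window, pattern translated by +u pixels, hypothetical function names), not the team's implementation.

```python
import numpy as np

def local_phase(signal, pitch, sigma):
    """Phase of the Gaussian-windowed Fourier coefficient at frequency 1/pitch."""
    x = np.arange(signal.size)
    # Demodulate at the pattern frequency, then low-pass with a Gaussian window.
    carrier = signal * np.exp(-2j * np.pi * x / pitch)
    half = int(4 * sigma)
    t = np.arange(-half, half + 1)
    window = np.exp(-t**2 / (2.0 * sigma**2))
    window /= window.sum()
    return np.angle(np.convolve(carrier, window, mode="same"))

def lsa_displacement(reference, deformed, pitch, sigma=None):
    """Per-pixel displacement (in pixels) from the phase difference of two 1D patterns."""
    sigma = pitch if sigma is None else sigma
    dphi = local_phase(deformed, pitch, sigma) - local_phase(reference, pitch, sigma)
    dphi = np.angle(np.exp(1j * dphi))        # wrap to (-pi, pi]
    return -pitch * dphi / (2.0 * np.pi)

# Synthetic check (assumed values): pitch of 8 pixels, rigid shift of 0.3 pixel.
x = np.arange(256)
pitch, shift = 8.0, 0.3
reference = np.cos(2 * np.pi * x / pitch)
deformed = np.cos(2 * np.pi * (x - shift) / pitch)
u = lsa_displacement(reference, deformed, pitch)
```

Values near the borders of the signal are affected by the zero padding of the convolution, so only the interior of `u` should be compared with the imposed shift. Displacements larger than half a pitch would require phase unwrapping, which this sketch does not attempt.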
Variational methods for image processing
In recent years, image and video colorization has been considered from many points of view. The technique consists in adding a color component to a gray-scale image. This operation needs additional priors, which can be given by manual intervention of the user, taken from an example image, or extracted from a large dataset of color images. A very large variety of approaches has been used to solve this problem, such as PDE models, non-local methods, variational frameworks, learning approaches, etc. In 23, we provide a general overview of state-of-the-art approaches with a focus on a few representative methods. Moreover, some recent techniques from the different types of priors (manual, exemplar-based, dataset-based) are explained and compared. The organization of the chapter aims at describing the evolution of the techniques in relation to each other. A focus on some efficient strategies is proposed for each kind of methodology.
Generic document image dewarping
The goal of document image dewarping is broadly to rectify a document page so that it appears in a frontal-flat view for an OCR algorithm. It is still a challenging issue especially when documents are captured with one camera in an uncontrolled environment. A page or open book can most often be traced by a straight line moving parallel to a fixed straight line (considered as the vertical of the book), and intersecting a fixed horizontal curve, which is a particular case of cylindrical surface. Under this assumption, the shortening and curvature effects can be solved jointly following a computation of vanishing points along a horizon line. This problem is therefore a good field of investigation for the calculation of vanishing points on curved surfaces, such as pages of books but also, e.g., buildings of cylindrical shape (in the generic sense defined above). And indeed, our method is generic enough to apply to both types of problems 20. Unlike previous ones, it does not need to segment text or images but relies only on straight line segment detection. A discrete one-dimensional locus of vanishing points is determined based on a binary-tree descent search, using a probabilistic criterion to decide when to stop subdivisions. The camera's focal length is obtained from the zenith and the horizon line, which allows us to reconstruct a 3D model of the document (or the building) that just needs to be unfolded to get the final dewarped image. Regarding document dewarping, our method is state-of-the-art on the only publicly available dataset. Good results have also been obtained on buildings and we will further investigate what this method can bring to urban AR.
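The step that recovers the focal length from the zenith and the horizon line can be sketched with the classical orthogonality constraint between vanishing points. Assuming square pixels, zero skew and a known principal point p (these are assumptions of the sketch, as is the function name), any point v on the horizon satisfies (v - p)·(z - p) + f² = 0, where z is the zenith.

```python
import numpy as np

def focal_from_zenith_and_horizon(zenith, horizon, principal_point):
    """Focal length in pixels from the zenith (x, y) and the horizon a*x + b*y + c = 0.

    Assumes square pixels, zero skew and a known principal point.
    """
    a, b, c = horizon
    p = np.asarray(principal_point, dtype=float)
    z = np.asarray(zenith, dtype=float)
    n = np.array([a, b], dtype=float)
    # Foot of the perpendicular dropped from the principal point onto the horizon.
    foot = p - (a * p[0] + b * p[1] + c) / (n @ n) * n
    f_squared = -np.dot(foot - p, z - p)
    if f_squared <= 0:
        raise ValueError("zenith and horizon must lie on opposite sides of the principal point")
    return float(np.sqrt(f_squared))

# Synthetic check (assumed values): camera pitched by 0.3 rad, f = 800 px.
f_true, pp, tilt = 800.0, (320.0, 240.0), 0.3
zenith = (pp[0], pp[1] + f_true / np.tan(tilt))
horizon = (0.0, 1.0, -(pp[1] - f_true * np.tan(tilt)))
f_est = focal_from_zenith_and_horizon(zenith, horizon, pp)
```

When the zenith and horizon are estimated from noisy line segments, the two are not exactly consistent with a single principal point, so in practice this closed form would typically serve as an initial estimate.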
Detection of tumors in fluorescence imaging
Fluorescence imaging is a molecular imaging modality of growing interest in oncological surgery for its ability to visually isolate tumors with the indocyanine green (ICG) biomarker, which reacts to near-infrared light. A preliminary study was conducted with the Cancer Institute of Lorraine (Institut de cancérologie de Lorraine - ICL) to assess the potential of fluorescence imaging in the challenging case of previously irradiated head and neck tumors. A piece of software was developed, based on our PoLAR library, to allow for a statistical comparison of tumor versus healthy tissues based on manual contouring 11.
6.4 Applications of Deep Learning to image processing
Participants: Youssef Assis, Erwan Kerrien, Fabien Pierre.
Detection of intracranial aneurysms using deep learning
Youssef Assis joined the team in November 2020 as a PhD student co-funded by Région Lorraine and Nancy University Hospital (CHRU), resulting from the ANR call for PhD grants in Artificial Intelligence. Intracranial aneurysms are small bulges at the surface of the blood vessels with a diameter of typically 6 mm or below. The diagnosis of unruptured brain aneurysms is difficult and time consuming. Current deep learning-based detection solutions suffer from low specificity. This is especially true for tiny aneurysms below 3 mm. The challenge we are addressing at the start of this work is to build a deep neural network that is able to detect even very small aneurysms in Magnetic Resonance Angiography (MRA) 3D images. Liang Liao, an interventional neuroradiologist within the department of diagnostic and therapeutic interventional neuroradiology at CHRU Nancy, joined the team during his Master's internship in biomedical engineering. A database of 111 patients with unruptured aneurysms was collected and annotated, and preliminary tests were made with a patch-based 3D U-Net approach. The challenge is to tackle the scarcity of aneurysms within the whole vasculature. Our current investigations follow two paths: developing a dedicated data augmentation procedure and investigating various loss functions to compensate for sample imbalance in the training database.
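To illustrate why the choice of loss function matters when aneurysm voxels are so scarce, here is a minimal sketch of the soft Dice loss, one commonly used overlap-based loss for imbalanced segmentation. The report does not state which losses are being investigated, so this choice is an assumption, as are the function name, the smoothing constant `eps` and the toy patch size.

```python
def soft_dice_loss(probabilities, targets, eps=1e-6):
    """Soft Dice loss between predicted foreground probabilities and binary labels.

    Both arguments are flat sequences of equal length (one value per voxel).
    """
    if len(probabilities) != len(targets):
        raise ValueError("inputs must have the same length")
    intersection = sum(p * t for p, t in zip(probabilities, targets))
    total = sum(probabilities) + sum(targets)
    dice = (2.0 * intersection + eps) / (total + eps)
    return 1.0 - dice


# A hypothetical 1000-voxel patch in which only 5 voxels belong to an aneurysm.
targets = [1.0] * 5 + [0.0] * 995
all_background = [0.0] * 1000  # a prediction that misses the aneurysm entirely
perfect = list(targets)

loss_missed = soft_dice_loss(all_background, targets)
loss_perfect = soft_dice_loss(perfect, targets)
# Voxel-wise accuracy of the prediction that misses the aneurysm.
accuracy_missed = sum(1 for p, t in zip(all_background, targets) if p == t) / 1000
```

In this toy case the prediction that misses the aneurysm still labels 99.5% of the voxels correctly, while its Dice loss is close to its maximum of 1. In a real training pipeline the same quantity would be computed on tensors by the deep learning framework rather than on Python lists.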
Restoration of old movies
In collaboration with the Cinémathèque de Toulouse, we have developed an automatic restoration process for old movies that consists of two steps. The first step is to detect the defects present in an image. The second step consists in filling the areas thus detected by video inpainting. Taking into account the temporal information contained in the images adjacent to the degraded image is an essential aspect of both these steps. The detection of defects such as spots or dust is performed by means of machine learning with deep neural networks 19. In particular, a U-Net receiving three successive images as input can detect temporal inconsistencies characteristic of defects. The output of this network is compared to a defect mask created from the restored version of the central image obtained with specialized software operated by an expert from the Cinémathèque de Toulouse. Finally, the filling of the damaged areas is carried out by alternating the reconstruction of the structure and the reconstruction of the texture of the image to be restored, both of which search for an optimum using variational methods 30.
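The data preparation implied by the detection step can be sketched as follows. This is only an illustration under stated assumptions: the report does not say how the defect mask is derived from the restored image, so the thresholded absolute difference used here, the threshold value and both function names are hypothetical.

```python
def make_triplets(frames):
    """Group a sequence of frames into (previous, current, next) triplets.

    The first and last frames have no two-sided temporal context and are skipped.
    """
    return [(frames[i - 1], frames[i], frames[i + 1]) for i in range(1, len(frames) - 1)]


def defect_mask(degraded, restored, threshold=0.1):
    """Binary mask of the pixels where the degraded and restored frames differ.

    Frames are 2D lists of gray levels in [0, 1]; the threshold is a hypothetical value.
    """
    return [
        [1 if abs(d - r) > threshold else 0 for d, r in zip(row_d, row_r)]
        for row_d, row_r in zip(degraded, restored)
    ]


# Toy sequence of four 2x3 frames; the third frame carries a bright spot.
clean = [[0.2, 0.2, 0.2], [0.2, 0.2, 0.2]]
spotted = [[0.2, 0.9, 0.2], [0.2, 0.2, 0.2]]
frames = [clean, clean, spotted, clean]

triplets = make_triplets(frames)    # network inputs: three successive frames each
mask = defect_mask(spotted, clean)  # training target for the central frame
```

Each triplet would be stacked as a three-channel input to the network, with the mask of its central frame as the target.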
7 Partnerships and cooperations
7.1 International initiatives
7.1.1 Inria International Labs
- Title: CURATIVE: CompUteR-based simulAtion Tool for mItral Valve rEpair
- Duration: 2020 - 2022
- Coordinator: Pierre-Frédéric Villard
- Participants: Marie-Odile Berger, Nariman Khaledian, Daryna Panicheva
- Harvard Biorobotics Lab (HBL), Harvard University (United States)
- Inria contact: Pierre-Frédéric Villard
- Summary: The mitral valve of the heart ensures the one-way flow of oxygenated blood from the left atrium to the left ventricle. However, many pathologies damage the valve anatomy, producing undesired backflow, or regurgitation, decreasing cardiac efficiency and potentially leading to heart failure if left untreated. Such cases could be treated by surgical repair of the valve. However, it is technically difficult and the outcomes are highly dependent upon the experience of the surgeon. One of the main difficulties of valve repair is that valve tissues must be surgically altered during open heart surgery such that the valve opens and closes effectively after the heart is closed and blood flow is restored. In order to do this successfully, the surgeon must essentially mentally predict the displacement and deformation of anatomically and biomechanically complex valve leaflets and supporting structures. Even if patient-based mitral valve models have been recently used for scientific understanding of its complex physiology, the patient geometry is manually segmented on medical images. This task is long and cumbersome except if the valve has been artificially isolated in vitro. There is a lack in the literature about the variety of metrics in both anatomy and biomechanics of the valve. In order to study mitral valve behavior or to prepare models for planning, it is necessary to develop methods to extract the valve components i) on real clinical data, ii) with minor user input and iii) that are mechanically valid.
7.1.2 Inria international partners
Informal international partners
- Pierre-Frédéric Villard is currently working in the INVIVE project (http://www.it.uu.se/research/scientific_computing/project/rbf/biomech) funded by the Swedish Research Council and realized within a collaboration with Uppsala University and Karolinska Institute. Within this project, he is the co-supervisor of Igor Tominec's PhD thesis (with Elisabeth Larsson (Uppsala University) as the main advisor).
- Fabien Pierre is currently working with Gabriele Steidl (Technische Universität Kaiserslautern, Germany) on the subject of convolution on Riemannian manifolds for color images. The goal of this collaboration is the design of a CNN to process images whose values lie on a manifold.
7.2 European initiatives
7.2.1 Collaborations with major European organizations
Participants: M.-O. Berger, R. Boisseau, A. Elassam, G. Simon
This 3-year project is a collaboration with DFKI Kaiserslautern. The aim of the MOVEON project is to push forward the state of the art in vision-based, spatio-temporal scene understanding by merging novel machine-learning approaches with geometrical reasoning. Deep-learning-based recognition and understanding of high-level concepts such as vanishing points or large object classes will serve as unitary building blocks for a spatio-temporal localization and environment reconstruction that will use geometric reasoning as underlying support. This research will lead to a novel generation of visual positioning systems that go beyond classical localization and mapping, which focuses currently only on point cloud reconstruction. In contrast, our aim is to allow for 6DoF positioning and global scene understanding in wild and dynamic environments (e.g. crowded streets) that scales up nicely with the size of the environment, and that can be used persistently over time by reusing consistent maps.
7.3 National initiatives
7.3.1 ANR JCJC ICaRes
Participant: F. Sur
This 3-year project (2019-2022), headed by B. Blaysat (Université Clermont-Auvergne), is supported by the Agence Nationale de la Recherche. It addresses residual stresses, which are introduced in the bulk of materials during processing or manufacturing. Since unintended residual stresses often initiate early failure, it is of utmost importance to correctly measure them. The goal of the ICaRes project is to improve the performance of residual stress estimation through the so-called virtual digital image correlation (DIC), which will be developed. The basic idea of virtual DIC is to mark the specimen with virtual images coming from a controlled continuous image model, instead of the standard random pattern. Virtual DIC is expected to outperform standard DIC by, first, matching real images of the materials with the virtual images, then running DIC on the virtual images on which strain fields are estimated, ultimately giving residual stresses.
7.3.2 RAPID EVORA project
Participants: M.-O. Berger, V. Gaudillière, G. Simon.
This 4-year project (2016-2020) is supported by DGA/DGE and led by the SBS-Interactive company. The objective is to develop a prototype for location and object recognition in large-scale industrial environments (factories, ships...), with the aim of enriching the operator's field of view with digital information and media. The main issues concern the size of the environment, the nature of the objects (often non-textured, highly specular...) and the presence of repeated patterns. This project supported V. Gaudillière's PhD thesis.
7.3.3 ANR PRC PreSPIN
Participants: E. Kerrien, P-F. Villard.
This 4-year project (2020-2024) is coordinated by E. Kerrien and has been supported by the Agence Nationale de la Recherche since November 2020. It aims at improving the planning phase in the therapeutic management of cerebral ischemic strokes thanks to predictive simulation of both the therapeutic interventional gesture and post-interventional images. Besides the Magrit team, the partners are CReSTIC (N. Passat, S. Salmon; Reims), Creatis (C. Frindel, O. Merveille; Lyon) and CIC-IT/CHRU Nancy (R. Anxionnat, M. Beaumont; Nancy). The consortium is set to address the challenges of geometrical and topological modeling of the full brain vasculature; physics-based simulation of interventional devices; simulation of MRI perfusion images; and clinical validation.
7.4 Regional initiatives
7.4.1 IRMGE project: (2018-2020)
Participants: M.-O. Berger, E. Kerrien, T. Mangin.
The project Imagerie et Robotique Médicale Grand Est (IRMGE) started in January 2018. Clinical and interventional imaging is a major public health issue. Teams from the Grand-Est region involved in medical imaging have thus proposed a research project to broaden and strengthen cooperation. The three axes of the project are about optical imaging, nuclear imaging and medical image processing. The Magrit team is especially involved in the third axis, with the aim of improving interventional procedures.
8.1 Promoting scientific activities
8.1.1 Scientific events: selection
- Marie-Odile Berger was a reviewer for ISMAR (International Symposium for Mixed and Augmented Reality), IPCAI (International Conference on Information Processing in Computer-Assisted Interventions), ICPR (International conference on Pattern Recognition), AE-CAI (Workshop on Augmented Environments for Computer Assisted Interventions), RFIAP 2020 (French conference on pattern recognition)
- Pierre-Frédéric Villard was a reviewer for MICCAI (Medical Image Computing and Computer Assisted Intervention), the Eurographics Workshop on Visual Computing for Biology and Medicine and the International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing
- Gilles Simon and Frédéric Sur were reviewers for RFIAP
- Erwan Kerrien was a reviewer for MICCAI, MIDL (Medical Imaging and Deep Learning), IPCAI, and AE-CAI.
Reviewer - reviewing activities
- Marie-Odile Berger and Pierre-Frédéric Villard were reviewers for the International Journal of Computer Assisted Radiology and Surgery
- Frédéric Sur was a reviewer for the following journals: Measurement, Signal Processing: Image Communication, The Visual Computer, Transactions on Medical Imaging, Experimental Mechanics
- Gilles Simon was a reviewer for Pattern Recognition
- Erwan Kerrien was a reviewer for IEEE Transactions on Medical Imaging and the International Journal for Numerical Methods in Biomedical Engineering.
- Fabien Pierre was a reviewer for the Journal of Mathematical Imaging and Vision (Springer) and for Computer Vision and Image Understanding (Elsevier).
8.1.3 Invited talks
- Marie-Odile Berger and Pierre-Frédéric Villard gave a talk entitled "Modélisation fonctionnelle de la valve mitrale" at the IADI seminar (CHRU Nancy)
- Pierre-Frédéric Villard gave a talk entitled "Segmentation with Active Contours" at the Harvard Biorobotics Lab (Cambridge, USA)
- Erwan Kerrien gave a talk entitled "Augmented Interventional Neuroradiology" at the IADI seminar (CHRU Nancy).
8.1.4 Leadership within the scientific community
Marie-Odile Berger is the president of the Association française pour la reconnaissance et l’interprétation des formes (AFRIF).
8.1.5 Scientific expertise
- Marie-Odile Berger was a reviewer for two CIFRE grants
- Erwan Kerrien was a reviewer for one CIFRE grant.
8.1.6 Research administration
- Marie-Odile Berger was a member of the hiring committee for a professor position at Université de Bourgogne and for an assistant professor position at Université de Tours
- Frédéric Sur was a member of the hiring committee for a professor position at Politecnico di Milano.
8.2 Teaching - Supervision - Juries
The academic members of the MAGRIT team actively teach at Université de Lorraine, each with around 200 teaching hours per year in computer science, some of which are in the field of image processing. Inria researchers have occasional teaching activities in computer vision and shape recognition, mainly in the computer science Master of Nancy and in several Engineering Schools near Nancy (ENSMN Nancy, SUPELEC Metz, TELECOM Nancy, ENSG). Our goal is to attract Master students with good skills in applied mathematics towards the field of computer vision.
The complete list of courses given by staff members is detailed below:
- M.-O. Berger
- Master: Shape recognition, 24 h, Université de Lorraine.
- Master: Introduction to image processing, 12 h, Mines Nancy.
- Master: Image processing for Geosciences, ENSG, 12h.
- E. Kerrien
- Master: Introduction to image processing, 15 h, Mines Nancy.
- Licence: Basics of computer science, 71h, IUT Saint-Dié-des-Vosges.
- Fabien Pierre
- Master: Introduction to machine learning, 14h, Mines Nancy.
- Master: Computer vision and image processing, 12h, Polytech Nancy.
- Licence: Introduction to image processing, 30h, IUT Saint-Dié des Vosges.
- Licence: Basics of computer science, 87h, IUT Saint-Dié des Vosges
- Licence: Scientific culture and information processing, 69h, IUT Saint-Dié des Vosges
- Licence: Object-oriented and event-driven programming, 35h, IUT Saint-Dié des Vosges
- Licence: Introduction to artificial intelligence, 18h, IUT Saint-Dié des Vosges
- G. Simon
- Master: Augmented reality, 24 h, Télécom-Nancy.
- Master: Augmented reality, 3 h, SUPELEC Metz.
- Master: Augmented reality, 24h, M2 Informatique FST
- Master: Visual data modeling, 12h, M1 Informatique FST
- Master: Image processing and computer vision, 12h, M1 informatique, FST
- Licence pro: 3D modeling and integration, 40h FST - CESS d'Epinal
- Licence: Programming methodology, L1 informatique, 48h FST
- F. Sur
- Master: Introduction to machine learning, 60 h, Mines Nancy
- Master: Time series analysis, 30h, Mines Nancy
- Introduction to signal processing, 20h, IUT Charlemagne
- P.-F. Villard
- Master: Augmented and Virtual Reality, 16h, M2 Cognitive Sciences and Applications, Institut des Sciences du Digital, Université de Lorraine
- Licence: Computer Graphics with webGL, 30h, IUT Saint-Dié des Vosges.
- Licence: Game design with Unity3D, 15h, IUT Saint-Dié des Vosges.
- Licence: Virtual and Augmented Reality in Industrial Maintenance, 2h, Faculty of Science and Technology, Université de Lorraine
- Licence: Web programming, 20h, IUT Saint-Dié des Vosges.
- Licence: Graphical user interface programming, 30h, IUT Saint-Dié des Vosges.
- Licence: Object-oriented programming, 20h, IUT Saint-Dié des Vosges.
- Licence: UML modeling, 16h, IUT Saint-Dié des Vosges.
- Licence: Security and life privacy with internet, 2h, IUT Saint-Dié des Vosges.
- Licence: Parallel programming, 18h, IUT Saint-Dié des Vosges.
- Licence: Initiation to machine learning, 24h, IUT Saint-Dié des Vosges.
- Brigitte Wrobel-Dautcourt
- Master: design of information systems, 30h, Télécom
- Master: java design and development project, 27h, Télécom 2A
- Licence: Basics of object-oriented programming, 44h, FST
- Licence: Graphic interfaces, 22h, FST
- Licence: Integrated project, 30 h, FST
- Licence: System, 24h, FST
- Licence: Compiling codes, 16h, FST
- PhD: Raffaella Trivisonne, Image-guided real-time simulation using stochastic filtering, Erwan Kerrien, Stéphane Cotin (MIMESIS). Defended in October 2020.
- PhD: Vincent Gaudillière, Reconnaissance de lieux et d'objets pour la réalité augmentée en milieux complexes, Marie-Odile Berger, Gilles Simon. Defended in June 2020.
- PhD in progress: Daryna Panicheva, Image-based biomechanical simulation of mitral valve closure, October 2017, Marie-Odile Berger, Pierre-Frédéric Villard.
- PhD in progress: Matthieu Zins, Localization in a world of objects, October 2019, Marie-Odile Berger, Gilles Simon.
- PhD in progress: Abdelkarim Elassam, Robust visual localization using high level features, October 2020, Marie-Odile Berger, Gilles Simon.
- PhD in progress: Nariman Khaledian, Toward a functional model of the mitral valve, October 2020, Marie-Odile Berger, Pierre-Frédéric Villard.
- PhD in progress: Youssef Assis, Deep learning for the automated detection of brain aneurysms, November 2020, Erwan Kerrien, René Anxionnat (CHRU Nancy).
- Marie-Odile Berger was an examiner for the PhD of Hela Hadri (Univ. Savoie Mont-Blanc), Sunit Sivasankaran (Univ. Lorraine), Tan Binh Phan (Univ. Lorraine)
- Frédéric Sur was an examiner for the PhD of Seyfeddine Boukhtache (Univ. Clermont-Auvergne)
- Gilles Simon was a reviewer for the PhD of Agniva Sengupta (Univ. Rennes 1)
- Fabien Pierre was an examiner for the PhD of Arthur Renaudeau (Toulouse).
8.3.1 Internal or external Inria responsibilities
Erwan Kerrien is Chargé de Mission for scientific mediation at Inria Nancy-Grand Est, and thereby is part of the Inria scientific mediation network. As such, he is a member of the steering committee of "la Maison pour la Science de Lorraine" ("Houses for Science" project, see http://
8.3.2 Articles and contents
- Erwan Kerrien participates in MOOCFOLIO, a PIA3-funded project (https://www.fun-mooc.fr/news/pia-3-le-projet-moocfolio-est-laureat/). The objective is to create a MOOC to help students choose their undergraduate studies to pursue after high school. Erwan is part of a working group to create a module about studies and professions related to digital usages and sciences, where he is co-responsible (with Anne Boyer from Loria) for the Artificial Intelligence topic. The MOOC was opened in November 2020 (see https://www.fun-mooc.fr/courses/course-v1:lorraine+30012+session01/about).
- Pierre-Frédéric Villard is involved with the secondary school of Champigneulles (France) as a "Collège Pilote" of the "La Main à la pâte" foundation. He gave a seminar on augmented and virtual reality to the pupils and helped the teacher prepare activities with augmented and virtual reality technologies. Finally, he is supervising Université de Lorraine students to produce teaching applications with augmented and virtual reality technologies that will be used in secondary school classes. These projects replaced the internships that the students usually do in companies and that were cancelled due to the COVID-19 crisis.
- Erwan Kerrien is an associate researcher at a MATh.en.JEANS workshop (https://www.mathenjeans.fr) within Loritz high school in Nancy. He also gave an account of his long-standing experience in a short movie that was shot during "La semaine des maths" (Maths week; see https://pedagogie.ac-nancy-metz.fr/loritz-chercheur/).
- In the context of the "Chiche!" initiative, Erwan Kerrien led a workshop for high school students to reflect on AI technologies for health, on the grounds of a debate game developed by the association "L'arbre des connaissances" (see https://
9 Scientific production
9.1 Major publications
- 1 article. Biomechanics-based graph matching for augmented CT-CBCT. International Journal of Computer Assisted Radiology and Surgery 13(6), April 2018, 805-813.
- 2 inproceedings. Camera Relocalization with Ellipsoidal Abstraction of Objects. ISMAR 2019 - 18th IEEE International Symposium on Mixed and Augmented Reality, Beijing, China. IEEE, October 2019, 19-29.
- 3 article. Impact of Soft Tissue Heterogeneity on Augmented Reality for Liver Surgery. IEEE Transactions on Visualization and Computer Graphics 21(5), 2015, 584-597.
- 4 article. Blood vessel modeling for interactive simulation of interventional neuroradiology procedures. Medical Image Analysis 35, January 2017, 685-698.
- 5 inproceedings. Viewpoint simulation for camera pose estimation from an unstructured scene model. International Conference on Robotics and Automation, Seattle, United States, May 2015.
- 6 inproceedings. A-Contrario Horizon-First Vanishing Point Detection Using Second-Order Grouping Laws. ECCV 2018 - European Conference on Computer Vision, Munich, Germany, September 2018.
- 7 article. Measuring the Noise of Digital Imaging Sensors by Stacking Raw Images Affected by Vibrations and Illumination Flickering. SIAM J. on Imaging Sciences 8(1), March 2015, 611-643.
- 8 article. An A Contrario Model for Matching Interest Points under Geometric and Photometric Constraints. SIAM Journal on Imaging Sciences 6(4), 2013, 1956-1978. URL: http://hal.inria.fr/hal-00876215
- 9 article. Tuning of patient specific deformable models using an adaptive evolutionary optimization strategy. IEEE Transactions on Biomedical Engineering 59(10), October 2012, 2942-2949.
9.2 Publications of the year
International peer-reviewed conferences
Scientific book chapters
Doctoral dissertations and habilitation theses
Reports & preprints
9.3 Cited publications
- 27 article. When Deep Learning Meets Digital Image Correlation. Optics and Lasers in Engineering 136, 2021, 106308.
- 28 article. Statistical Modeling and Recognition of Surgical Workflow. Medical Image Analysis, 2011. URL: http://hal.inria.fr/inria-00526493/en
- 29 article. Influence of the sampling density on the noise level in displacement and strain maps obtained by processing periodic patterns. Measurement 173, 2021, 108570.
- 30 inproceedings. Alternate Structural-Textural Video Inpainting for Spot Defects Correction in Movies (regular paper). International Conference on Scale Space and Variational Methods in Computer Vision (SSVM 2019), Hofgeismar, Germany, 30/06/2019-04/07/2019. https://link.springer.com, Springer, 7 2019 (electronic medium). URL: https://www.irit.fr/~Jean-Denis.Durou/PUBLICATIONS/ssvm_2019_2.pdf
- 31 inproceedings. In-Situ 3D Sketching Using a Video Camera as an Interaction and Tracking Device. 31st Annual Conference of the European Association for Computer Graphics - Eurographics 2010, Norrköping, Sweden, May 2010.