Section: Research Program

Scientific methodology

In this section we briefly describe the scientific methods we use to achieve our research goals.

Adaptive image processing

An impressive range of techniques has been developed in the fields of image processing, computer vision and computer graphics to manipulate and interpret image content for a variety of applications. So far, only a few of these techniques have been applied in the context of vision aid systems, and even fewer have been carefully evaluated with patients. However, it is worth noting a recent surge of interest from the artificial vision community in low vision applications (see, e.g., the special issue on Assistive Computer Vision and Robotics, "Assistive Solutions for Mobility, Communication and HMI", of Computer Vision and Image Understanding (August 2016), or the International Workshop on Assistive Computer Vision and Robotics (ECCV 2016 satellite workshop)). We investigate which techniques could be of real benefit for vision aid systems, how to combine them, and how to adapt them to patient needs, so that patients can not only "see" an image but also understand it more efficiently.

Some techniques have already been explored. Among the first, enhancing image content (equalization, gamma correction, tone mapping, edge enhancement, image decomposition, cartoonization) is a natural type of processing to apply. Some methods have already been tested with low vision patients [38], [54], [55] or even as a pre-processing step in retinal prosthesis systems [37]. For some visual impairments, it can be useful to consider methods that help patients focus on the most relevant information, using techniques such as scene retargeting [59], seam carving [40], [39], saliency-based enhancements [71], [82] or 3D-based enhancements when available [64]. The body of work on image understanding could also be extremely useful to help patients navigate natural cluttered environments, both in low vision conditions and with prosthetic vision [58]. 3D information, obtained from stereo head systems or RGB-D cameras, also brings useful information about the environment [62], and integrated systems combining different areas of expertise are appearing [46].
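As an illustration of the simplest enhancement operations listed above, the following is a minimal sketch of gamma correction and histogram equalization for a grayscale image stored as a NumPy array with values in [0, 1]. The function names and the choice of 256 bins are assumptions made for this example; it is not the processing used in any of the cited studies.

```python
import numpy as np


def gamma_correct(img, gamma):
    """Apply the power law out = in**gamma to an image with values in [0, 1].

    gamma < 1 brightens dark regions, gamma > 1 darkens them.
    """
    img = np.clip(np.asarray(img, dtype=float), 0.0, 1.0)
    return img ** gamma


def equalize_histogram(img, n_bins=256):
    """Spread the grey levels of an image in [0, 1] over the full range.

    Each pixel is mapped through the empirical cumulative distribution
    of grey levels, which stretches the contrast of narrow histograms.
    """
    img = np.clip(np.asarray(img, dtype=float), 0.0, 1.0)
    hist, bin_edges = np.histogram(img, bins=n_bins, range=(0.0, 1.0))
    cdf = np.cumsum(hist).astype(float)
    cdf /= cdf[-1]  # normalize so that the last value is 1
    # Interpolate the CDF at each pixel value (upper bin edges as abscissae).
    equalized = np.interp(img.ravel(), bin_edges[1:], cdf)
    return equalized.reshape(img.shape)
```

For a low-contrast image such as `np.linspace(0.2, 0.4, 100).reshape(10, 10)`, `equalize_histogram` returns an image whose grey levels cover most of [0, 1]. In a vision aid setting, `gamma` and `n_bins` would be the kind of parameters a patient might adjust interactively.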

Our goal is to make the most of state-of-the-art computer vision methods, in combination with virtual and augmented reality devices (Sec. 3.2.2), to provide patients with vision aid systems that adapt to their impairment and whose processing parameters they can change easily and intuitively.

Virtual and augmented reality

Our goal is to develop vision-aid systems using virtual and augmented reality [87]. There is a rich continuum of devices between virtual reality (which is a priori simpler to use, since there is no mobility problem and the environment is well defined) and augmented reality (where information has to be superimposed in real time on top of the real environment to enrich it). Between these two extremes, new hybrid see-through systems are available or under development, such as light glasses where additional information can be locally displayed at the center or in a corner (e.g., Google Glass or improved versions of it). We invest in these technologies, which enable new kinds of interaction with visual content that could be very powerful when adapted to low vision patients who want to use their remaining sight. We investigate how low vision patients could benefit from this technology in their daily life activities [47]. (Note that wearing such headsets may not be easily accepted by patients who do not want to advertise their disability. More generally, this raises the question of how users come to accept and use a technology. This question is debated in the Technology Acceptance Model (TAM), which postulates that two specific perceptions about a technology determine one's behavioral intention to use it: perceived ease of use and perceived usefulness (see, e.g., [50]).)

We focus on three activities: reading, watching movies and navigating in the real world (indoor and outdoor). In these three scenarios, this technology should offer crucial advantages for people with low vision. For reading, it could help them solve the page navigation problem or overcome the limitations of magnification encountered with standard CCTVs. When watching a movie, the possibility of exploring a pre-processed visual scene presented over a very wide visual angle can help patients follow the storyline more easily, which raises interesting questions about creating content specifically for virtual reality headsets. Finally, in real-world scenarios, augmented reality offers promising perspectives to enrich the scene with highly visible visual cues that facilitate navigation for low vision patients. Of course, the choice of adaptive image processing techniques (see Sec. 3.2.1) will be crucial, and this will be the added value of our work.

Another important aspect of this work that will progressively need attention is ergonomics, which will have to take into account the other potential functional limitations of these patients in addition to low vision (e.g., limitations in mobility, hearing, or agility).

Biophysical modeling

Modeling in neuroscience has to cope with several competing objectives: on the one hand, describing the biological realm as closely as possible and, on the other hand, providing tractable equations, at least at the descriptive level (simulation, qualitative description) and, when possible, at the mathematical level (i.e., affording a rigorous description). These objectives are rarely achieved simultaneously, and most of the time one has to make compromises. In the Biovision team we adopt the point of view of physicists: we try to capture the phenomenological description of a biophysical mechanism, removing irrelevant details, and to obtain a qualitative description of the equations' behaviour, at least at the level of numerical simulation, and, when possible, analytic results. We do not focus on mathematical proofs, insisting instead on the quality of the model in predicting and, if possible, proposing new experiments. This requires constant interaction with neuroscientists so as to keep the model on track and to be warned of overly crude approximations, while still trying to construct equations from canonical principles [4], [33], [22].

Methods from theoretical physics

Biophysical models mainly consist of differential equations (ODEs or PDEs) or integro-differential equations (neural fields). We study them using dynamical systems and bifurcation theory, as well as techniques coming from nonlinear physics (amplitude equations, stability analysis, Lyapunov spectrum, correlation analysis, multi-scale methods).
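To make the idea of stability analysis concrete, here is a minimal sketch on a toy two-unit rate model, dx/dt = -x + tanh(g W x). The model, the coupling matrix `W` and the gain parameter `g` are hypothetical choices made for this illustration and are not taken from the team's work. The Jacobian at the fixed point x = 0 is estimated by central differences, and the sign of the largest real part of its eigenvalues tells whether the fixed point is linearly stable.

```python
import numpy as np

# Hypothetical coupling matrix; its eigenvalues are 1 + i and 1 - i.
W = np.array([[1.0, -1.0],
              [1.0, 1.0]])


def rate_model(x, gain):
    """Right-hand side of the toy model dx/dt = -x + tanh(gain * W x)."""
    return -x + np.tanh(gain * (W @ x))


def numerical_jacobian(f, x, eps=1e-6):
    """Estimate the Jacobian of f at x by central finite differences."""
    x = np.asarray(x, dtype=float)
    n = x.size
    jac = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        jac[:, j] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return jac


def leading_growth_rate(gain):
    """Largest real part of the Jacobian eigenvalues at the fixed point x = 0.

    A negative value means the fixed point is linearly stable.
    """
    jac = numerical_jacobian(lambda x: rate_model(x, gain), np.zeros(2))
    return float(np.max(np.linalg.eigvals(jac).real))
```

For this toy model the Jacobian at the origin is -I + g W, with eigenvalues (g - 1) ± i g. The origin is therefore stable for g < 1 and loses stability through a pair of complex eigenvalues at g = 1, which is the kind of transition that bifurcation theory characterizes.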

For the study of large-scale populations (e.g., when studying population coding) we use methods coming from statistical physics. This branch of physics gave birth to mean-field methods as well as statistical methods for large population analysis. We use both of them. Mean-field methods will be applied to large-scale activity in the retina and in the cortex [7], [11], [15].

For the study of retinal population coding we use the so-called Gibbs distribution, initially introduced by Boltzmann and Gibbs. This concept includes, but is not limited to, maximum entropy models [60] used by numerous authors in the context of the retina (see, e.g., [73], [75], [57], [56], [78]). These papers were restricted to a statistical description with neither memory nor causality: correlations between successive times are not considered. A paradigmatic example of this is the Ising model, used to describe retinal activity in, e.g., [73], [75]. However, maximum entropy extends to spatio-temporal correlations, as we have shown in, e.g., [13], [5].
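As a concrete instance of a memoryless maximum entropy model, the sketch below builds the Gibbs distribution of a pairwise (Ising-like) model over the spike patterns of a few neurons in a single time bin, by exhaustive enumeration. The binary convention s_i in {0, 1}, the parameter names `h` and `J`, and the function name are assumptions made for this example; enumeration is only feasible for small populations, since the number of patterns grows as 2^N.

```python
import itertools

import numpy as np


def ising_distribution(h, J):
    """Gibbs distribution P(s) proportional to
    exp(sum_i h_i s_i + sum_{i<j} J_ij s_i s_j), with s_i in {0, 1}.

    h : length-N array of single-neuron biases.
    J : N x N array of pairwise couplings (only the upper triangle is used).
    Returns (states, probs): all 2**N patterns and their probabilities.
    """
    h = np.asarray(h, dtype=float)
    J = np.triu(np.asarray(J, dtype=float), k=1)
    states = np.array(list(itertools.product([0, 1], repeat=h.size)),
                      dtype=float)
    # Log of the unnormalized weight of each pattern.
    log_weight = states @ h + np.einsum("si,ij,sj->s", states, J, states)
    weights = np.exp(log_weight - log_weight.max())  # avoid overflow
    return states, weights / weights.sum()
```

With all couplings set to zero the neurons are independent, and the firing probability of neuron i reduces to 1 / (1 + exp(-h_i)), which can be checked with `probs @ states`. Because the model describes one time bin at a time, it carries no information about correlations between successive times, which is the limitation noted above.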

More generally, while maximum entropy models rely heavily on the questionable assumption of stationarity, the concept of Gibbs distribution does not need this hypothesis. Besides, it makes it possible to handle models with large memory, and it provides a framework to model anticipation [16]. It also includes existing models used to explain retinal statistics, such as the Generalized Linear Model (GLM) [44].