The research activities of the Lagadic team concern visual servoing and active vision. Visual servoing consists in using the information provided by a vision sensor to control the movements of a dynamic system. This system can be real, in the context of robotics, or virtual, in the context of computer animation or augmented reality. This research topic lies at the intersection of robotics, automatic control, and computer vision. These fields have been the subject of fruitful research for many years and are particularly interesting for their very broad scientific and application spectrum. Within this spectrum, we focus on the interaction between visual perception and action. This topic is significant because it provides an alternative to the traditional Perception-Decision-Action cycle. It is indeed possible to link perception and action more closely by directly integrating the measurements provided by a vision sensor into closed-loop control laws.
Visual servoing and its related themes form the central scientific topic of the Lagadic group. More generally, our objective is to design strategies for coupling perception and action from images, for applications in robotics, computer vision, virtual reality and augmented reality.
This objective is significant, first of all because of the variety and the large number of potential applications to which our work can lead. Secondly, it is also significant because of the scientific issues associated with these problems, namely the modeling of visual features that optimally represent the interaction between action and perception, the handling of complex environments, and the specification of high-level tasks. We also work on new problems raised by imaging systems such as omnidirectional vision sensors or echographic probes. Finally, we are interested in revisiting traditional problems in computer vision (3D localization, structure and motion) through the visual servoing approach.
The Vimanco project, carried out for ESA in collaboration with the Trasys company in Brussels and KUL in Leuven, ended this year. It was devoted to studying the feasibility of vision-based manipulation tasks performed by the Eurobot outside the International Space Station. The work carried out by our group in this project received the best application paper award at the IROS'2007 conference.
N. Mansard's Ph.D. thesis was selected as one of the best French theses considered by the GdR MACS (“Groupe de Recherche Modélisation, Analyse et Conduite des Systèmes dynamiques”). It was also nominated as one of the best theses considered by ASTI (“Fédération des Associations Françaises des Sciences et Technologies de l'Information”). Finally, N. Mansard received the “Prix Bretagne Jeune Chercheur” awarded by the Brittany council. His thesis was about task sequencing for robotics applications.
Basically, visual servoing techniques consist in using the data provided by one or several cameras in order to control the motions of a dynamic system. Such systems are usually robot arms or mobile robots, but can also be virtual robots, or even a virtual camera. A large variety of positioning tasks, or mobile target tracking tasks, can be implemented by controlling from one to all of the degrees of freedom of the system. Whatever the sensor configuration, which can vary from one on-board camera on the robot end-effector to several free-standing cameras, a set of visual features $\mathbf{s}$ has to be selected as well as possible from the available image measurements, so as to control the desired degrees of freedom. A control law also has to be designed so that these visual features reach a desired value $\mathbf{s}^*$, defining a correct realization of the task. A desired trajectory can also be tracked. The control principle is thus to regulate the error vector $\mathbf{s} - \mathbf{s}^*$ to zero. With a vision sensor providing 2D measurements, potential visual features are numerous, since 2D data (coordinates of feature points in the image, moments, ...) as well as 3D data provided by a localization algorithm exploiting the extracted 2D features can be considered. It is also possible to combine 2D and 3D visual features to take advantage of each approach while avoiding their respective drawbacks.
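More precisely, a set $\mathbf{s}$ of $k$ visual features can be taken into account in a visual servoing scheme if it can be written:

\[ \mathbf{s} = \mathbf{s}(\mathbf{x}(\mathbf{p}(t)), \mathbf{a}) \]

where $\mathbf{p}(t)$ describes the pose at the instant $t$ between the camera frame and the target frame, $\mathbf{x}$ the image measurements, and $\mathbf{a}$ a set of parameters encoding potential additional knowledge, if available (such as, for instance, a coarse approximation of the camera calibration parameters, or the 3D model of the target in some cases).

The time variation of $\mathbf{s}$ can be linked to the relative instantaneous velocity $\mathbf{v}$ between the camera and the scene:

\[ \dot{\mathbf{s}} = \frac{\partial \mathbf{s}}{\partial \mathbf{p}}\,\dot{\mathbf{p}} = \mathbf{L_s}\,\mathbf{v} \]

where $\mathbf{L_s}$ is the interaction matrix related to $\mathbf{s}$. This interaction matrix plays an essential role. Indeed, if we consider for instance an eye-in-hand system and the camera velocity $\mathbf{v}_c$ as the input of the robot controller, we obtain, when the control law is designed to try to obtain an exponential decoupled decrease of the error:

\[ \mathbf{v}_c = -\lambda\,\widehat{\mathbf{L_s}}^{+}(\mathbf{s} - \mathbf{s}^*) - \widehat{\mathbf{L_s}}^{+}\,\widehat{\frac{\partial \mathbf{s}}{\partial t}} \]

where $\lambda$ is a proportional gain that has to be tuned to minimize the time-to-convergence, $\widehat{\mathbf{L_s}}^{+}$ is the pseudo-inverse of a model or an approximation of the interaction matrix, and $\widehat{\frac{\partial \mathbf{s}}{\partial t}}$ an estimate of the feature variation due to the target velocity.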
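As an illustration, the proportional control law can be sketched for the classical case of image point features. The following Python sketch is not taken from ViSP: it assumes normalized image coordinates, known (or coarsely estimated) depths `Z`, a motionless target, and an interaction matrix of full column rank, so that the pseudo-inverse can be obtained from the normal equations. All function names are hypothetical.

```python
def interaction_matrix_point(x, y, Z):
    """2x6 interaction matrix of a normalized image point (x, y) at depth Z."""
    return [
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ]


def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def control_law(points, desired, depths, gain=0.5):
    """Camera velocity v_c = -gain * L^+ (s - s*), target assumed motionless."""
    L, e = [], []
    for (x, y), (xd, yd), Z in zip(points, desired, depths):
        L.extend(interaction_matrix_point(x, y, Z))  # stack one 2x6 block per point
        e.extend([x - xd, y - yd])                   # error s - s*
    # Assumption: L has full column rank, so L^+ e = (L^T L)^-1 L^T e.
    n = 6
    LtL = [[sum(row[i] * row[j] for row in L) for j in range(n)] for i in range(n)]
    Lte = [sum(row[i] * ei for row, ei in zip(L, e)) for i in range(n)]
    return [-gain * z for z in solve(LtL, Lte)]
```

With four non-aligned points the stacked matrix generally has rank six, so the six components (three translational, three rotational) of the camera velocity are all constrained.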
The selected visual features and the corresponding interaction matrix determine the behavior of the system in terms of stability, robustness with respect to noise or to calibration errors, robot 3D trajectory, etc. Usually, the interaction matrix is composed of highly nonlinear terms and does not present any decoupling properties. This is generally the case when $\mathbf{s}$ is directly chosen as $\mathbf{x}$. In some cases, this may lead to inadequate robot trajectories or even motions that are impossible to realize, local minima, task singularities, etc. It is thus extremely important to “cook” adequate visual features for each robot task or application, the ideal case (very difficult to obtain) being when the corresponding interaction matrix is constant, leading to a simple linear control system. In short, visual servoing is basically a nonlinear control problem. Our Holy Grail is to transform it into a linear control problem.
Furthermore, embedding visual servoing in the task function approach makes it possible to solve efficiently the redundancy problems that appear when the visual task does not constrain all the degrees of freedom of the system. It is then possible to realize simultaneously the visual task and secondary tasks such as visual inspection, or the avoidance of joint limits or singularities. This formalism can also be used for task sequencing purposes.
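A minimal sketch of this redundancy formalism, assuming for simplicity a main task with a single-row Jacobian `J` (so that its pseudo-inverse has a closed form). The secondary motion `z` is projected onto the null space of the main task so that it does not disturb it. The function name and arguments are hypothetical and the code is pure Python.

```python
def redundant_control(J, e, z, gain=1.0):
    """Main task plus a secondary motion projected onto the null space of J.

    J: single-row task Jacobian (list of n floats)
    e: scalar error of the main task
    z: desired secondary velocity (list of n floats)
    Returns v = -gain * J^+ e + (I - J^+ J) z.
    """
    jj = sum(a * a for a in J)                # J J^T, a scalar for a single row
    J_pinv = [a / jj for a in J]              # J^+ = J^T (J J^T)^-1
    Jz = sum(a * b for a, b in zip(J, z))     # component of z that affects the task
    return [-gain * J_pinv[i] * e + z[i] - J_pinv[i] * Jz for i in range(len(J))]
```

Whatever the secondary velocity `z`, the product of `J` with the returned velocity equals `-gain * e`, which is what guarantees that the main task keeps its prescribed decrease.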
The design of object tracking algorithms for image sequences is an important issue for research and applications related to visual servoing, and more generally for robot vision. Robust extraction and real-time spatio-temporal tracking of visual cues is indeed one of the keys to the success of a visual servoing task. To consider visual servoing within large-scale applications, it is mandatory to handle natural scenes without any fiducial markers, containing complex objects under various illumination conditions. While fiducial markers may still be useful to validate theoretical aspects of visual servoing in modeling and control, non-cooperative objects have to be considered to address realistic applications.
Most of the available tracking methods can be divided into two main classes: feature-based and model-based. The former approach focuses on tracking 2D features such as geometrical primitives (points, segments, circles, ...), object contours, regions of interest, etc. The latter explicitly uses a model of the tracked objects. This can be either a 3D model or a 2D template of the object. This second class of methods usually provides a more robust solution. Indeed, the main advantage of model-based methods is that knowledge about the scene improves tracking robustness and performance, by making it possible to predict hidden movements of the object, detect partial occlusions, and reduce the effects of outliers. The challenge is to build algorithms that are fast and robust enough to meet the requirements of our applications. Therefore, even if we still consider 2D feature tracking in some cases, our research mainly focuses on real-time 3D model-based tracking, since these approaches are very accurate, robust, and well adapted to any class of visual servoing scheme. Furthermore, they also meet the requirements of other classes of application, such as augmented reality.
The natural applications of our research are obviously in robotics. In the past, we mainly worked on grasping and tool manipulation, on underwater robotics (image stabilization and the positioning of uninstrumented robot arms), on the agro-industry (positioning of a vision sensor in order to improve the quality control of food products), and on video surveillance (control of the movements of a pan-tilt camera to track mobile natural objects). More recently, we addressed the field of mobile robotics through the activities undertaken around the CyCab vehicle: detection and tracking of mobile objects (pedestrians, other vehicles), and control of the movements of the vehicle by visual servoing.
In fact, the research we undertake in the Lagadic group can apply to all fields of robotics involving a vision sensor. It is indeed conceived to be independent of the robot system considered (and the robot and the vision sensor can even be virtual for some applications).
Currently, we are interested in using visual servoing for the control of robot arms in space, and of underactuated flying robots, such as miniature helicopters and aircraft.
In collaboration with the Visages team, we also address the field of medical robotics. The applications currently under consideration revolve around new ways of assisting the clinician during a medical examination: visual servoing on echographic images, active perception for the optimal generation of 3D echographic images, etc.
Robotics is not the only possible application field of our research. In the past, we collaborated with the Siames project to apply visual servoing techniques to computer animation. The aim can be either to control the movement of virtual humanoids according to their pseudo-perception, or to control the viewpoint from which an animation is rendered. In both cases, potential applications are in the field of virtual reality, for example for video games or virtual cinematography.
Applications also exist in computer vision and augmented reality. The aim is then to carry out a virtual visual servoing for the 3D localization of a tool with respect to the vision sensor, or for the estimation of its 3D motion. This field of application is very promising, because it is growing rapidly for the production of special effects in the multimedia field and for the design and inspection of manufactured objects in industry.
Lastly, our work in visual servoing and active perception can be related to work carried out in cognitive science, in particular in the field of psychovision (for example the study of eye motion in the animal and human visual systems, the study of the representation of perception, or the study of the links between action and perception).
Visual servoing is a very active research area in vision-based robotics. A software environment that allows fast prototyping of visual servoing tasks is therefore of prime interest. The main difficulty is that visual servoing usually requires specific hardware (the robot and, most of the time, dedicated image framegrabbers). As a consequence, the resulting applications are often not portable and cannot easily be adapted to other environments. Today's software design makes it possible to propose elementary components that can be combined to build portable high-level applications. Furthermore, the increasing speed of microprocessors allows the development of real-time image processing algorithms on a standard workstation. We have developed a library of canonical vision-based tasks for eye-in-hand and eye-to-hand visual servoing that contains the most classical linkages used in practice. The ViSP software environment features all the following capabilities: independence with respect to the hardware, simplicity, extensibility, portability. Moreover, ViSP provides a large library of elementary positioning tasks with respect to various basic visual features (points, lines, circles, spheres, cylinders, ...) that can be combined together, and an image processing library that allows the tracking of visual cues (dot, segment, ellipse, spline, ...). Simulation capabilities are also available. ViSP and its full functionalities are described in .
This year was devoted to improving the quality of the software and of its documentation. Daily builds that compile and test ViSP on different operating systems (Linux, OSX, Windows) and with different compilers were used to ensure the stability of the software. Moreover, new functionalities and examples were introduced, such as the Kanade-Lucas-Tomasi feature tracker, histogram computation, perspective camera calibration tools, etc.
Two new versions were released during the first half of the year, and a further version was released at the end of the year. It is available from
http://
ViSP open source code has been downloaded and used in research labs in China, Hungary, India, Italy, Japan, Korea, Lebanon, Portugal, Spain, USA and France.
The Marker software implements an algorithm that computes camera pose and camera calibration using fiducial markers. Parameter estimation is handled using virtual visual servoing. The principle consists in considering pose computation and calibration as a dual problem of visual servoing. This method presents many advantages: accuracy similar to that of the usual non-linear minimization methods, simplicity, and effectiveness. A licence for this software was granted to the Total Immersion company.
Markerless is an upgrade of the Marker software with additional features developed within the SORA Riam Project. It allows the computation of camera pose with no fiducial marker.
A real-time, robust and efficient 3D model-based tracking algorithm for a monocular vision system has been developed. Tracking objects in the scene requires computing the pose between the camera and the objects. Non-linear pose computation is formulated by means of a virtual visual servoing approach. In this context, the derivation of point-to-curve interaction matrices is given for different features including lines, circles, cylinders and spheres. A local moving-edge tracker is used in order to provide a real-time estimation of the displacements normal to the object contours. A method is proposed for combining local position uncertainty and global pose uncertainty in an efficient and accurate way by propagating uncertainty. Robustness is obtained by integrating an M-estimator into the visual control law via an iteratively re-weighted least squares implementation. More recently, we also considered the case of non-rigid objects. The proposed method has been validated on several complex image sequences, including outdoor environments. Applications for this tracker are in the fields of robotics, visual servoing, and augmented reality.
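To illustrate how an M-estimator can be combined with iteratively re-weighted least squares, the sketch below applies the idea to a deliberately simple problem, a robust line fit, rather than to the pose computation of the tracker. The choice of Tukey's biweight function, the tuning constant 4.6851 and the scale estimate based on the median of absolute residuals are assumptions made for this example, and all names are hypothetical.

```python
from statistics import median


def tukey_weight(r, scale, c=4.6851):
    """Tukey biweight: weight in [0, 1], exactly 0 for gross outliers."""
    u = r / (c * scale)
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a * x + b."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    b = (sy - a * sx) / sw
    return a, b


def irls_line(xs, ys, iterations=10):
    """Alternate between a weighted fit and a weight update from the residuals."""
    ws = [1.0] * len(xs)
    for _ in range(iterations):
        a, b = weighted_line_fit(xs, ys, ws)
        res = [y - (a * x + b) for x, y in zip(xs, ys)]
        # Robust scale: normalized median of the absolute residuals.
        scale = max(1.4826 * median(abs(r) for r in res), 1e-12)
        ws = [tukey_weight(r, scale) for r in res]
    return a, b


# Points close to y = 2x + 1, with three gross outliers.
xs = list(range(20))
ys = [2.0 * x + 1.0 + 0.01 * (-1) ** x for x in xs]
for i in (3, 10, 17):
    ys[i] += 30.0
slope, intercept = irls_line(xs, ys)
```

In a tracker, the same weights would multiply the rows of the error vector and of the interaction matrix at each iteration of the minimization, so that spurious edge measurements lose their influence on the estimated pose.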
We use several experimental platforms to validate our research work in visual servoing and in active vision. More precisely, we have two robotic systems built by Afma Robots in the 1990s. The first one is a gantry robot with six degrees of freedom, the other one is a cylindrical robot with four degrees of freedom. These robots are equipped with cameras mounted on their end effector. Depending on the application, the camera is either a classical CCD camera associated with an Imaging Technology framegrabber, or a Marlin firewire camera. A PC running Linux communicates with the robot using an SBS Technologies bus adapter. This equipment requires specific hardware, but also software maintenance and new developments in order to evolve. Training and assistance for users and the presentation of demonstrations also form part of the daily activities.
At the beginning of the year, all the drivers (Imaging Technology framegrabber, firewire camera, SBS Technologies bus adapter), developments and demonstrations around these two platforms were migrated from a Fedora 1 to a Fedora 5 Linux environment on a new Intel Core 2 Duo 3 GHz computer, to provide an up-to-date and more powerful system.
A new ring light system was also installed around the CCD camera to make it less sensitive to external lighting conditions.
Since these platforms are quite old, we are planning to upgrade the low-level VME controller and the electronics associated with the motors.
To validate our research in the medical robotics field, we have used since 2004 a six degrees of freedom arm designed by the Sinters company. This robot is equipped with an ultrasound (US) probe.
The high-level computer hosting the image processing and the visual servoing control law was migrated from a Fedora 1 to a Fedora 6 Linux environment on a new Intel Xeon BiPro 3 GHz PC, to provide an up-to-date and more powerful system. Existing software and demonstrations were updated to be compatible with this new environment.
The low-level QNX part of the controller was modified to provide optimized network communication as well as optimized force-torque control of the robot.
As described in Section , a new demonstration using a visual servoing scheme based on moments extracted from the US probe image was developed. It makes it possible to position a 2D US probe automatically in order to reach a desired B-scan image of an object of interest.
The CyCab is a small four-wheel-drive autonomous electric car dedicated to vision-based mobile robotic applications. A pan-tilt head (Biclops PTM) equipped with a firewire Marlin camera with a field of view of about 70 degrees is mounted on the front bumper. The CyCab is equipped with two computers connected through an internal network: a PC dedicated to the low-level control of the actuators, and a laptop connected to the camera and dedicated to high-level visual servoing applications. The vision-based navigation scheme used to follow a visual path autonomously in outdoor urban environments using only monocular vision was improved (see Section ). Moreover, an image-based visual servoing scheme for path following with non-holonomic mobile robots was developed on the CyCab (see Section ).
This study is directly related to the search for optimal visual features, as described in Section . We consider a spherical projection model. A strong motivation for using this model is its simplicity compared to the complex equations corresponding to an omnidirectional vision sensor. Two spherical targets have been used: a simple sphere, and a sphere marked with a tangent vector to a point on its surface. For each of these targets, a new minimal set of features has been designed. These new sets are decoupled and almost linearly linked to the sensor velocities. They can be computed on any central catadioptric system. In addition, these new sets provide adequate trajectories either in the image or in Cartesian space. For each newly proposed set of features, a classical control method has been analytically proved to be globally stable with respect to modeling errors.
For the simple sphere, the new set has been validated using a perspective camera and a paracatadioptric sensor. This sensor consists of a parabolic mirror coupled with an orthographic camera. For this type of system, straight line trajectories in the image plane are not always suitable because of the dead angle in the center of the field of view. This is why a specific set has been designed for paracatadioptric cameras. Finally, the effects of camera calibration errors have been analysed for perspective and paracatadioptric sensors, and confirmed through simulation and experimental results.
As for the second target, the new set produces a better camera trajectory than a previously proposed set. The stability with respect to modeling errors has been validated experimentally using a non-spherical decoration balloon, which is topologically equivalent to a sphere. Future work will be devoted to other primitives such as ellipses and circles.
One of the main problems in visual servoing is to robustly extract and track the image measurements that are used to build the visual features involved in the control scheme. This may lead to complex and time-consuming image processing, and may increase the effect of image noise. To cope with this problem, we propose to use photometric features directly as the input of the control scheme. More precisely, we directly use the luminance of all the pixels in the image. We have shown that the classical control laws fail in this case. Therefore, we have turned the visual servoing problem into an optimization problem, leading to a new control law derived from the particular shape of the cost function. It is based first on a gradient approach and then on a Levenberg-Marquardt-like approach. Experimental results have validated this control law in the case of positioning tasks. The positioning error is very low. Additional advantages are that our approach is not sensitive to partial occlusions or to coarse approximations of the depths required to compute the interaction matrix. Finally, even if the modeling has been performed in the Lambertian case, experiments on a non-Lambertian object have shown that very low positioning errors can be reached.
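The text does not state the control law explicitly. As a hedged illustration only, a Levenberg-Marquardt-like step applied to the cost $\|\mathbf{I}(\mathbf{r}) - \mathbf{I}^*\|^2$ typically takes the following form, where $\mathbf{I}(\mathbf{r})$ is the vector of pixel luminances at the current pose, $\mathbf{I}^*$ the one at the desired pose, $\mathbf{L_I}$ the interaction matrix of the luminance, and $\mu$ a damping factor. These symbols are assumptions introduced for this sketch.

```latex
\[
  \mathbf{v}_c = -\lambda \left( \mathbf{H} + \mu\,\mathrm{diag}(\mathbf{H}) \right)^{-1}
                 \mathbf{L_I}^{\top} \left( \mathbf{I}(\mathbf{r}) - \mathbf{I}^* \right),
  \qquad \mathbf{H} = \mathbf{L_I}^{\top}\mathbf{L_I}
\]
```

A large $\mu$ makes the step behave like a gradient descent, which matches the first phase described above, while a small $\mu$ brings it close to a Gauss-Newton step near the minimum.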
Moreover, from the Phong and Warn illumination models, the interaction matrix has been derived in some interesting cases (directional lighting, lighting source on the optical axis) when considering non-Lambertian objects. In that case, simulation results have shown that tracking tasks can be achieved.
This study is devoted to the design of new control schemes to be used in visual servoing.
The first control law that has been developed rigorously follows the task function approach. It has been demonstrated to be pseudo-globally asymptotically stable, in the sense that it is globally asymptotically stable in the task space, but not necessarily in the configuration space. In practice, we have been able to show that this control law is attracted by local minima.
Another new control scheme has also been developed. It is based on a linear combination of the interaction matrices computed at the current camera pose and at the desired pose. By selecting the parameter that sets the weight of each matrix, it is possible to adapt the behavior of the control law. We have exhibited some configurations where all the classical control schemes fail in a local minimum, while a particular value of the behavior parameter allows the system to converge. Some new singular configurations have also been exhibited for some classical control laws.
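The combination of the two interaction matrices can be sketched as follows. This is a minimal illustration assuming a convex combination weighted by a scalar `beta` in [0, 1], with `beta = 1` using only the matrix at the current pose and `beta = 0` only the matrix at the desired pose. The exact parameterization used in the scheme is not given here, and the names are hypothetical.

```python
def blended_interaction_matrix(L_current, L_desired, beta):
    """Return beta * L_current + (1 - beta) * L_desired, element by element."""
    return [
        [beta * lc + (1.0 - beta) * ld for lc, ld in zip(row_c, row_d)]
        for row_c, row_d in zip(L_current, L_desired)
    ]


def interaction_matrix_point(x, y, Z):
    """2x6 interaction matrix of a normalized image point (x, y) at depth Z."""
    return [
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ]


# Example: one image point, currently at (0.2, 0.1) and desired at (0, 0).
L_cur = interaction_matrix_point(0.2, 0.1, 1.0)
L_des = interaction_matrix_point(0.0, 0.0, 1.0)
L_mix = blended_interaction_matrix(L_cur, L_des, 0.5)
```

The blended matrix would then replace the interaction matrix estimate in the usual proportional control law, and `beta` plays the role of the behavior parameter.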
When the number of features used as input of the control law varies, it generally induces a discontinuity in the velocity sent to the robot controller when the control law is based on the classical pseudo-inverse. To deal with this problem, we have developed a new inversion operator. This operator is equal to the pseudo-inverse in the continuous cases, and ensures continuity everywhere. The control scheme obtained using this new operator has been applied to ensure continuity when some visual features leave the camera field of view during the realization of a visual task. Experiments have been carried out to demonstrate the interest and the validity of this approach.
This study aims at developing visual servoing techniques for the control of the motions of aircraft. For fixed-wing aircraft, the application considered is automatic landing. After modeling decoupled visual features based on the measurements that can be extracted from the image of the runway (typically, its border and central lines), we have proposed a visual servoing scheme to align an airplane with respect to a runway. The control scheme has been built by using a linearized model of the airplane dynamics provided by Dassault Aviation. We have then applied this control scheme to a complete automatic landing. A desired trajectory that takes the airplane dynamics into account has been designed. Coupling this trajectory with the control law allows the airplane to join its desired path. The airplane is then controlled to follow the glide path, perform the flare manoeuvre, and finally touch down. Simulation results have been obtained with a quite realistic flight simulator (provided by Dassault Aviation), which is based on a nonlinear airplane dynamic model. These results have shown that the airplane is able to land automatically by visual servoing.
As for helicopters, we have designed and analysed several control schemes based on the centroid of a target expressed in a spherical coordinate system, for positioning and stabilization tasks. The goal was to find a combination of visual feature and control scheme such that the sensitivity to any translational motion is the same. We have tested the most promising of the proposed control laws on the X4-flyer developed at CEA-List, which carries a miniature camera with wireless video transmission. We have also compared the results with a classical visual servoing scheme using perspective zeroth and first order moments. In practice and as expected, three control schemes demonstrated excellent performance: the perspective image moments control design, as well as two of the control laws using spherical image moments.
This study started in October 2007. It is carried out within the ANR Scuav project (see Section ). We are interested in fusing the data provided by several sensors directly in the control law, instead of estimating the state vector. Adaptive control schemes will also be designed to estimate online the intrinsic and extrinsic parameters of each sensor, so that all the data are compatible.
A visual servoing scheme enabling nonholonomic mobile robots with a fixed pinhole camera to reach and follow a continuous path on the ground has been developed. In the path following task, some suitable path error function, indicating the position of the robot with respect to the path, must be zeroed. The path error function is determined by a relationship called path following constraint. The main contribution of this work is that our scheme requires only a small set of visible path features, along with a coarse camera model, and that it guarantees convergence even when the initial error is large. Two versions of the path follower have been implemented: position-based and image-based. For both versions, a Lyapunov-based stability analysis has been carried out, and the performance has been experimentally validated on our CyCab.
In this research, visual path following in outdoor urban environments using only monocular vision has been investigated. The framework that has been developed is based on a first learning and mapping step, followed by a navigation step.
The first step involves teaching the robot the environment in which it will have to navigate. The map is represented by a set of reference images together with the 3D and 2D coordinates of point feature landmarks extracted from the reference images. During the teaching, reference images with corresponding interest points are automatically selected, and the local 3D geometry of the points shared between neighboring reference image pairs is reconstructed.
The second step in the framework involves navigation. Here the robot is first localized in the map by matching the current image from the camera to a reference image specified by the user. Matched features are then tracked as the robot moves along the path. The tracked features are used for local 3D reconstruction which enables feature reprojection and consistency checking. Further, the tracked features are used to control the robot motion using image-based visual servoing to follow qualitatively the reference path.
We have carried out numerous experiments using the CyCab in different environments and lighting conditions. We have conducted successful autonomous navigation experiments with vehicle velocities up to 1.8 m/s, and paths up to 740 m.
The goal of this research is to develop new visual servoing techniques based on ultrasound images in order to control directly the motion of medical robots. The main problem is to control both in-plane and out-of-plane motions of an ultrasound probe using only the 2D B-mode image provided by the ultrasound transducer. For that, it is necessary to model the complete interaction between the probe and the object. We have developed a new visual servoing method to control the motion of a 2D ultrasound probe held by a medical robot in order to reach a desired B-scan image of an object of interest. In this approach, combinations of image moments extracted from the currently observed object cross-section are used as visual features. We derived the analytical form of the corresponding interaction matrix using a local approximation of the object surface shape. Simulations performed with a static ultrasound volume containing an egg-shaped object, and in-vitro experiments using a robotized ultrasound probe interacting with a rabbit heart immersed in water, showed the validity of this new approach and its robustness with respect to modeling and measurement errors.
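As a hedged illustration of what moment-based features of a cross-section look like, the sketch below computes the area, the centroid and the main orientation of a segmented section from a binary mask. The actual combinations of moments used as visual features are not detailed here, so this choice, the mask representation and the function name are assumptions.

```python
import math


def cross_section_moments(mask):
    """Area, centroid and main orientation of a binary cross-section.

    mask: list of rows, each a list of 0/1 values (1 = pixel inside the section).
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = float(len(pts))                      # zeroth-order moment m00
    if area == 0.0:
        raise ValueError("empty cross-section")
    xg = sum(x for x, _ in pts) / area          # centroid from first-order moments
    yg = sum(y for _, y in pts) / area
    mu20 = sum((x - xg) ** 2 for x, _ in pts)   # centered second-order moments
    mu02 = sum((y - yg) ** 2 for _, y in pts)
    mu11 = sum((x - xg) * (y - yg) for x, y in pts)
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return {"area": area, "xg": xg, "yg": yg, "angle": angle}
```

For a section elongated along the image x-axis, the returned angle is zero. The area and the centroid vary with both in-plane and out-of-plane probe motions, which is why a model of the object surface is needed to write the interaction matrix.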
This study was started by Alexandre Krupa during his sabbatical in the Computer-Integrated Surgical Systems and Technology Engineering Research Center at the Johns Hopkins University in Baltimore. This work deals with the use of the speckle information contained in ultrasound images to control the displacement of a robotized ultrasound probe with a visual servoing control scheme. A new visual servoing method that is able to stabilize a moving area of soft tissue within an ultrasound B-mode imaging plane has been developed. The problem was decoupled into out-of-plane motion and in-plane motion. For the former, an original method based on the speckle was developed. For the latter, an image region tracker was used to provide the in-plane motion. The method was first validated in simulation by controlling a virtual probe interacting with a static ultrasound volume acquired from a medical phantom. The approach was then demonstrated for translational motions combining a translation along the image X-axis (in-plane) and along the elevation Z-axis (out-of-plane), in an experimental setup consisting of an ultrasound speckle phantom, a robot simulating tissue motion, and a robot controlling the motions of the ultrasound probe. This approach was also implemented and validated on the Lagadic medical robotic platform.
In order to consider the non-rigid motion of soft tissue in future work, we also developed a simulator that generates realistic ultrasound images observed by a virtual probe interacting with a dynamic ultrasound volume. This simulator is able to apply volume deformations due to the physiological motions of the patient.
This study is devoted to object grasping using a manipulator within a multi-camera visual servoing scheme. The goal of this project, carried out in cooperation with CEA/List (see Section ), is to allow disabled persons to grasp an object with the help of a robot arm mounted on their wheelchair. This task should be achieved with a minimum of a priori information about the environment and the object considered, and with very few interactions with the user.
A method based on visual servoing and on the epipolar geometry of a multi-view system has been proposed to automatically find and focus on the object of interest. This year we have focused on the accurate localization of the object and on the estimation of its rough shape. Within an active vision process, the motion of the camera is automatically controlled to optimize the estimation of the object structure, modeled by a quadric. This provides the grasping module with the location, the orientation and the shape of the object. Experiments have been carried out on the Afma 6 robot. Currently, we are implementing the scheme on a robot arm available at CEA-List.
A fully automated tracking system must automatically initialize the camera pose and re-initialize it whenever the objects of interest are lost. For real-time applications, this process needs to be done as quickly as possible. We approach this problem by using image matching techniques. This takes advantage of the fact that, for pose estimation purposes, several training images may be available. Image matching makes it possible to determine the displacement between the current view and the reference ones.
Matching two images consists of three steps. We first propose a very simple but efficient method to detect keypoints in grey-scale images. It provides stable points that mostly represent physical corners of objects in the scene. Each keypoint is then assigned one or more canonical orientations, which are the local maxima of a histogram of gradient orientations. A descriptor based on gradient magnitude is then computed on a local region centered at the keypoint, in its canonical orientation. To reduce the dimension of the feature space, a PCA technique is used to build an eigenspace. The search for the nearest neighbor in the eigenspace is carried out using an approximate nearest neighbor search technique. Verification based on the knowledge of a 3D model of the scene and on multi-view geometry is then considered.
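The dimension reduction and nearest neighbor search described above can be illustrated with the following minimal sketch. It is not the implementation used in this work: the function names and the synthetic descriptors are hypothetical, and an exact brute-force search stands in for the approximate nearest neighbor technique mentioned in the text.

```python
import numpy as np

def build_eigenspace(descriptors, n_components):
    """Fit a PCA eigenspace to row-wise descriptors; return (mean, basis)."""
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(descriptors, mean, basis):
    """Project descriptors onto the eigenspace."""
    return (descriptors - mean) @ basis.T

def nearest_neighbors(query, reference):
    """Exact brute-force search (stand-in for an approximate search)."""
    indices = []
    for q in query:
        distances = np.linalg.norm(reference - q, axis=1)
        indices.append(int(np.argmin(distances)))
    return indices

# Synthetic data standing in for real keypoint descriptors (assumption).
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 64))
query = reference[[5, 17, 42]] + 0.01 * rng.normal(size=(3, 64))

mean, basis = build_eigenspace(reference, n_components=16)
ref_low = project(reference, mean, basis)
query_low = project(query, mean, basis)
indices = nearest_neighbors(query_low, ref_low)
```

In a real pipeline the candidate matches returned by such a search would still have to pass the geometric verification step described above.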
The proposed matching method has been applied to augmented reality applications and to visual servoing. It runs at 25 Hz and gives satisfactory results.
A real-time, robust and efficient 3D model-based tracking algorithm for a monocular vision system has been developed over the last five years. Tracking objects in the scene requires computing the pose between the camera and the object. Non-linear pose computation is formulated by means of a virtual visual servoing scheme. We have extended this algorithm to handle input from multiple cameras with small or large baseline. Considering the link between the two cameras, visual information from the two images is used in the same minimization process to compute the global position of the system. This year, we studied an algorithm able to detect tracking failures based on Hinkley tests.
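To give an idea of how a Hinkley test can flag a tracking failure, the sketch below applies one common formulation (often called the Page-Hinkley test) to a scalar signal. Which signal is actually monitored by the tracker is not stated here, so the residual sequence, the function name and the two tuning parameters (`drift`, `threshold`) are assumptions made for illustration only.

```python
def hinkley_upward_jump(signal, drift=0.05, threshold=1.0):
    """Return the index at which an upward jump of the mean is detected,
    or None if no jump is detected.

    drift: tolerated deviation above the running mean (assumed tuning value).
    threshold: alarm level on the cumulative deviation (assumed tuning value).
    """
    running_mean = 0.0
    cumulative = 0.0
    minimum = 0.0
    for k, x in enumerate(signal):
        # Incremental update of the mean of the samples seen so far.
        running_mean += (x - running_mean) / (k + 1)
        # Cumulative deviation from the mean, minus the tolerated drift.
        cumulative += x - running_mean - drift
        minimum = min(minimum, cumulative)
        # An alarm is raised when the cumulative sum rises far above its minimum.
        if cumulative - minimum > threshold:
            return k
    return None

# Hypothetical residuals: stable tracking, then a sudden degradation at index 20.
residuals = [0.1] * 20 + [1.0] * 10
detected = hinkley_upward_jump(residuals)
```

The test reacts a few samples after the change, with a delay that depends on `drift` and `threshold`; a symmetric test would be needed to detect downward jumps.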
We have extended our contour-based 3D model-based tracking to take texture into account. This approach fuses a 3D model-based approach based on edge tracking and a temporal matching relying on texture analysis into a single non-linear objective function. Indeed, estimating both pose and camera displacement introduces an implicit spatio-temporal constraint that the simple 3D model-based tracker was not able to consider. Furthermore, fusing measurements based on texture and edges improves the robustness of the tracking.
This year we have extended our previous work to avoid relying on a texture model, which has proved difficult to obtain and which makes the algorithm prone to failure when large lighting variations occur. We have therefore studied other hybrid approaches: on the one hand, combining contour information with optical flow obtained using a state-of-the-art optical flow estimator; on the other hand, combining contour information with the information provided by the KLT tracking algorithm. The latter method has proved to be very efficient and very robust. It has been tested on both indoor and outdoor images for visual servoing and augmented reality applications.
When no information on the observed object is available, approaches like the one proposed in Section cannot be used. Therefore, we have derived a tracking algorithm to cope with free 2D contours. It is based on the well-known snake algorithm, but considers a parametric curve instead of points, as is usually done. More precisely, the parameters we use are based on a Fourier expansion of a polar description of the contour. We have mainly focused on increasing the robustness of the algorithm with respect to a coarse initialization of the contour. This has been done by using a regularization approach in which the regularization terms are related to the area inside the curve. Additional constraints have also been introduced to ensure better convergence over image sequences (mainly on the directions of the normals at the points of the curve relative to the spatial gradients at these points). This approach has been validated for ultrasound image-based visual servoing (see Section ) and has also been used to describe the objects considered in Section .
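A polar contour description with a Fourier expansion can be sketched as follows, assuming the generic truncated series rho(theta) = a_0 + sum_k (a_k cos(k theta) + b_k sin(k theta)) around a known center. The exact parameterization and regularization used in this work are not detailed here, so the function names, the least-squares fit and the test contour are illustrative assumptions. The area formula follows from integrating rho^2 / 2 over one turn.

```python
import numpy as np

def design_matrix(theta, n_harmonics):
    """Columns: 1, cos(theta), sin(theta), ..., cos(K theta), sin(K theta)."""
    cols = [np.ones_like(theta)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(k * theta))
        cols.append(np.sin(k * theta))
    return np.stack(cols, axis=1)

def fit_polar_fourier(points, center, n_harmonics):
    """Least-squares fit of the Fourier coefficients to contour points."""
    d = points - center
    theta = np.arctan2(d[:, 1], d[:, 0])
    rho = np.hypot(d[:, 0], d[:, 1])
    coeffs, *_ = np.linalg.lstsq(design_matrix(theta, n_harmonics), rho, rcond=None)
    return coeffs

def contour_points(coeffs, center, theta):
    """Rebuild contour points from the coefficients."""
    n_harmonics = (len(coeffs) - 1) // 2
    rho = design_matrix(theta, n_harmonics) @ coeffs
    return center + np.stack([rho * np.cos(theta), rho * np.sin(theta)], axis=1)

def enclosed_area(coeffs):
    """Area inside the curve: pi * a_0^2 + (pi / 2) * sum(a_k^2 + b_k^2)."""
    return np.pi * coeffs[0] ** 2 + 0.5 * np.pi * np.sum(coeffs[1:] ** 2)

# Hypothetical test contour: a circle of radius 2 centered at (1, 1).
center = np.array([1.0, 1.0])
theta = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
circle = center + 2.0 * np.stack([np.cos(theta), np.sin(theta)], axis=1)

coeffs = fit_polar_fourier(circle, center, n_harmonics=3)
rebuilt = contour_points(coeffs, center, theta)
area = enclosed_area(coeffs)
```

With such a description the contour is governed by a handful of coefficients rather than by individual points, and quantities such as the enclosed area are available in closed form.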
This study focuses on real-time augmented reality for mobile devices. It is related to the France Telecom contract presented in Section . The goal of this project is to enable augmented reality on mobile devices such as GSM phones or PDAs used by pedestrians in urban environments. With a camera and other external sensors, the absolute pose of the camera has to be computed in real time in order to display geolocalized information to the end user in an explicit way.
This year we have proposed a method for camera pose tracking that uses partial knowledge about the scene. The method is based on monocular vision Simultaneous Localization And Mapping (SLAM). In contrast to existing SLAM implementations, this approach uses previously known information about the environment (a map of walls) and takes advantage of the various available databases and blueprints to constrain the problem. The method assumes that the tracked image patches belong to known planes (which may be subject to some uncertainty) and that the SLAM map can be represented by associations of camera and planes. We show that this method gives good results on a real sequence with complex motion.
This study started in July 2007. The goal of this work is to propose tracking algorithms that are suitable for the control of small helicopters (X4 flyers). For the first task considered, a rigid link between the target and the drone has to be maintained. This requires, as a first step, estimating four degrees of freedom (DOF) of the target (position, orientation and size). The preliminary work done this year focused on mean-shift based tracking algorithms coupled with particle filtering in order to handle fast target motion and occlusions. Future work will be devoted to the estimation of the 6 DOF target motion.
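The basic mean-shift iteration underlying such trackers can be sketched as follows. This is a deliberately reduced illustration, not the algorithm developed in this study: it estimates position only (no orientation or size), it omits the particle filter, and it assumes that a per-pixel weight map (for instance a colour histogram back-projection) is already available. All names and values are hypothetical.

```python
import numpy as np

def mean_shift(weights, center, half_size, n_iter=20, tol=0.5):
    """Move a square window towards the local centroid of a weight map.

    weights: 2D array of non-negative per-pixel target likelihoods.
    center: initial (row, col) position of the window.
    half_size: half the side length of the window, in pixels.
    """
    n_rows, n_cols = weights.shape
    cy, cx = center
    for _ in range(n_iter):
        # Clip the window to the image borders.
        y0 = max(int(round(cy)) - half_size, 0)
        y1 = min(int(round(cy)) + half_size + 1, n_rows)
        x0 = max(int(round(cx)) - half_size, 0)
        x1 = min(int(round(cx)) + half_size + 1, n_cols)
        window = weights[y0:y1, x0:x1]
        total = window.sum()
        if total <= 0:
            break  # No support for the target inside the window.
        ys, xs = np.mgrid[y0:y1, x0:x1]
        new_cy = (ys * window).sum() / total
        new_cx = (xs * window).sum() / total
        shift = float(np.hypot(new_cy - cy, new_cx - cx))
        cy, cx = new_cy, new_cx
        if shift < tol:
            break  # Converged: the window barely moved.
    return cy, cx

# Hypothetical weight map: a Gaussian blob centered at row 30, column 40.
grid_y, grid_x = np.mgrid[0:60, 0:80]
weights = np.exp(-((grid_y - 30) ** 2 + (grid_x - 40) ** 2) / (2 * 4.0 ** 2))

cy, cx = mean_shift(weights, center=(25.0, 35.0), half_size=10)
```

Because the iteration only climbs towards a nearby mode, it can lose a target that moves further than the window size between two frames or that is occluded, which is the kind of situation the coupling with particle filtering is meant to handle.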
This work is carried out through a Ph.D. funded by DGA, which began in October 2007. We are interested in rigid object tracking in unstructured environments. Images will be acquired by catadioptric or fisheye cameras. The tracking methods developed will be applied to the control of vehicles in outdoor areas, for positioning with respect to static or moving targets. The proposed algorithms will have to be robust and fast.
no. Inria 1263, duration: 18 months.
This project was about the automatic initialization of the tracking process using matching algorithms. This problem has been extended to the development of a tracking-by-matching method (see Section ). This contract finished in March 2007.
no. Inria 46142206, duration: 36 months.
This contract supports the Cifre convention between France Telecom R&D and Irisa regarding Fabien Servant's Ph.D. (see Section ). The goal of the Ph.D. is to enable augmented reality on mobile devices such as GSM phones or PDAs used by pedestrians in urban environments. Using a camera and external sensors, the goal of this study is to compute the absolute pose of the camera in order to display geolocalized information to the end user in an explicit way.
no. Inria 1457, duration: 36 months.
This contract started in November 2005. It is also supported by the Brittany Council (see Section ) through a grant to Claire Dune for her Ph.D. (“krog” means grasping in the Breton language). The goal of this project is to allow disabled persons to grasp an object with the help of a robotic arm mounted on a wheelchair. This task should be achieved with a minimum of a priori information regarding the environment, the considered object, etc.
no. Inria 1286, duration: 36 months.
This contract started in November 2005 (“krog” means grasping in the Breton language). It is also supported by the CEA (see Section ). The goal of this project is to allow disabled persons to grasp an object with the help of a robotic arm mounted on a wheelchair. This task should be achieved with a minimum of a priori information regarding the environment, the considered object, etc.
no. Inria 558, duration: 36 months.
This large project, headed by Inria Sophia Antipolis, ended in May 2007 and was concerned with the navigation of mobile vehicles in urban environments. Within this project, our work consisted in designing autonomous vision-based navigation techniques using an image database of the environment (see Section ).
duration: 30 months.
This large project is carried out for the DGA through a consortium led by Thales Optronics. We work in close collaboration with the EPI ARobAS at Sophia Antipolis, sharing an engineer, Melaine Gautier, who is in charge of the work to be carried out. This project is about the development of tracking algorithms and the control of non-holonomic vehicles. Within this project, our work consists in developing 2D image-based tracking algorithms for complex outdoor scenes.
duration: 36 months.
This project, led by Tarek Hamel from I3S, started in June 2007. It is carried out in collaboration with I3S, the EPI ARobAS at Inria Sophia Antipolis, Heudiasyc in Compiègne, the CEA-List and the Bertin company. It is devoted to the sensor-based control of small helicopters for various applications (stabilization, landing, target tracking, etc.).
no. Inria 1862, duration: 24 months.
In September 2005, we began a project for the European Space Agency. It is carried out in collaboration with the Trasys company (Brussels), Galileo Avionica (Milano) and KUL (Leuven). Its aim is to develop a demonstrator of a robot arm in a space environment that is able to grasp objects by visual servoing. The robot considered is the ESA Eurobot, which should be on the International Space Station in 2008. Our task in this project is to provide algorithms for object tracking and vision-based control. The work described in Section has been carried out within this project.
no. Inria 1832, duration: 36 months.
This FP6 project started in September 2006. It is managed by Dassault Aviation. It is concerned with the automatic take-off and landing of fixed-wing aircraft and helicopters. In this project, we lead the work package devoted to visual tracking and visual servoing.
This international collaboration between France and Australia is supported by CNRS. It is about visual servo-control of unmanned aerial vehicles. It started in fall 2005 for three years. It brings together Rob Mahony (Australian National University, Canberra), Peter Corke and Jonathan Roberts (CSIRO, Brisbane), Tarek Hamel (I3S, Sophia Antipolis), Vincent Moreau (CEA-List, Paris) and our group.
Prof. Seth Hutchinson from the Beckman Institute at the University of Illinois at Urbana-Champaign (UIUC) visited our group for two months, in September and October 2007.
Prof. Farrokh Sharifi from Ryerson University in Toronto spent six months in our group, from July to December 2007.
F. Chaumette was a member of the Evaluation Committee of the 2007 ANR Psirob call devoted to robotics.
F. Chaumette was a member of the CNRS expert committee in charge of the Lasmea evaluation.
F. Chaumette is a member of the Scientific Council of the GdR on Robotics.
E. Marchand is a member of the Scientific Council of the University of Rennes 1.
E. Marchand and F. Spindler evaluated projects for the 2007 ANR Psirob call.
E. Marchand is a member of the Administration Council of the Center for Computer Resources of the University of Rennes 1.
F. Spindler is a member of the engineers reviewing committee of the French institute for agronomy research (Inra).
F. Chaumette is a member of the Specialist Committee of IFSIC. He is also the Head of the CUMI at Irisa (Commission des Utilisateurs des Moyens Informatiques).
Editorial boards of journals
F. Chaumette is Associate Editor of the Int. Journal of Optomechatronics, published by Taylor and Francis. Together with Prof. Farrokh Sharifi, he is in charge of a special issue of this journal devoted to visual servoing.
Technical program committees of conferences
F. Chaumette: ICRA'2007, CVPR'2007, RSS'2007, IROS'2007, RFIA'2008, ICRA'2008.
E. Marchand: Orasis'2007, JNRR'07, RSS'07, ACIVS'07, Coresa'07, RFIA'08.
Ph.D. and HdR jury
F. Chaumette: Mauro Maya-Mendes (Inria Sophia-Antipolis, reviewer), David Folio (Laas, reviewer), Christophe Doignon (HdR, LSIIT, reviewer), Etienne Mouragnon (Lasmea, reviewer), Tej Dallej (Lasmea, reviewer), Pascal Vasseur (HdR, Crea, president).
E. Marchand: Javier-Flavio Vigueras (Ph.D., Loria, reviewer), Madjid Maidi (Univ. Evry, reviewer).
Master M2RI of Computer Science, Ifsic, University of Rennes 1 (E. Marchand): 3D Computer vision.
Master SIBM (Signals and Images in Biology and Medicine), University of Rennes 1, Brest and Angers (A. Krupa): medical robotics for medical students.
Diic INC, Ifsic, University of Rennes 1 (E. Marchand, F. Chaumette: 3D vision, visual servoing; E. Marchand, F. Spindler: programming tools for image processing).
Insa Rennes, Electrical Engineering Dpt (F. Spindler, A. Dame: computer vision).
Graduate student interns: L. Bulteau (ENS Ulm), R. Brenguier (ENS Cachan), F. Giraud (ENS Cachan-Ker Lann), J. Charreyron (IFSIC Rennes), L. Merzouk (Univ. Paris 5).
F. Dionnet and E. Marchand received the best application paper award at IROS 2007 for their paper.
N. Mansard's Ph.D. was recognized as one of the best French theses by the GdR MACS (“Groupe de Recherche Modélisation, Analyse et Conduite des Systèmes dynamiques”). It was also nominated as one of the best theses considered by ASTI (“Fédération des Associations Françaises des Sciences et Technologies de l'Information”). Finally, N. Mansard received the “Prix Bretagne Jeune Chercheur” award from the Brittany Council.