Section: New Results
Dynamic and Robust Object Tracking in a Single Camera View
Participants: Duc-Phu Chau, Julien Badie, François Brémond, Monique Thonnat.
Keywords: Object tracking, online parameter tuning, controller, self-adaptation and machine learning
Object tracking quality usually depends on the video scene conditions (e.g. illumination, density of objects, object occlusion level). To overcome this dependence, we present a new control approach that adapts the object tracking process to variations in the scene conditions. The proposed approach is composed of two tasks.
The objective of the first task is to select a suitable tracker for each mobile object, choosing between a Kanade-Lucas-Tomasi (KLT) feature tracker and a discriminative appearance-based tracker. The KLT feature tracker is first used to decide whether an object is correctly detected; for badly detected objects, KLT feature tracking is applied to correct the detection. A decision task based on a Dynamic Bayesian Network (DBN) then selects the better of the discriminative appearance-based and KLT trackers.
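As an illustration of the detection check, the sketch below shows how KLT feature points can vote on whether a detected bounding box is consistent between two frames. It assumes an OpenCV-style KLT implementation, 8-bit grayscale frames and axis-aligned (x, y, w, h) boxes; the function name, the 0.6 ratio threshold and the voting rule are illustrative only, and the DBN-based selection step is not shown.

    # Minimal sketch: KLT points from the previous detection vote on whether the
    # current detection is consistent (illustrative threshold, not the actual one).
    import cv2
    import numpy as np

    def detection_supported_by_klt(prev_gray, curr_gray, prev_box, curr_box, min_ratio=0.6):
        # prev_gray, curr_gray: 8-bit grayscale frames; boxes are (x, y, w, h).
        x, y, w, h = prev_box
        mask = np.zeros_like(prev_gray)
        mask[y:y + h, x:x + w] = 255               # sample corners inside the detection
        corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
                                          qualityLevel=0.01, minDistance=3, mask=mask)
        if corners is None:
            return False
        moved, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, corners, None)
        cx, cy, cw, ch = curr_box
        inside = [cx <= px <= cx + cw and cy <= py <= cy + ch
                  for (px, py), ok in zip(moved.reshape(-1, 2), status.ravel()) if ok]
        # The detection is considered correct if enough tracked points fall inside it.
        return len(inside) > 0 and sum(inside) / len(corners) >= min_ratio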
The objective of the second task is to tune the tracker parameters online to cope with variations of the tracking context. The tracking context (or simply context) of a video sequence is defined as a set of six features: the density of mobile objects, their occlusion level, their contrast with regard to the surrounding background, their contrast variance, their 2D area and their 2D area variance. Each contextual feature is represented by a code-book model. In an offline phase, training video sequences are classified by clustering their contextual features, and each context cluster is associated with satisfactory tracking parameters. In the online control phase, once a context change is detected, the tracking parameters are tuned to the values learned for the new context. This work has been published in [29], [35].
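A minimal sketch of the online control step is given below. It assumes the learned clusters are available as cluster centres over the six contextual features and that the tracker exposes a hypothetical set_parameters() method; the code-book models and the clustering algorithm of the offline phase are not reproduced, so this is only a simplified illustration of the nearest-cluster lookup and parameter tuning.

    # Minimal sketch of the online control step (nearest learned context cluster,
    # then parameter tuning on a context change); interfaces are hypothetical.
    import numpy as np

    def control_step(tracker, current_context, learned_contexts, learned_parameters,
                     previous_cluster):
        # learned_contexts: (n_clusters, 6) array of cluster centres over the six
        # contextual features; learned_parameters: one satisfactory parameter set
        # per cluster, obtained in the offline learning phase.
        distances = np.linalg.norm(learned_contexts - np.asarray(current_context), axis=1)
        best = int(np.argmin(distances))
        if best != previous_cluster:               # a context change is detected
            tracker.set_parameters(learned_parameters[best])   # tune the tracker online
        return best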
We have tested the proposed approach on several public datasets such as Caviar and PETS. Figure 16 illustrates the results of the object detection correction using the KLT feature tracker.
Figure 17 illustrates the tracking output for a Caviar video (left image) and for a PETS video (right image). The experimental results show that our method achieves the best performance compared to several recent state-of-the-art trackers.
Table 1 presents the tracking results for 20 videos from the Caviar dataset, where MT, PT and ML denote the percentages of mostly tracked, partially tracked and mostly lost trajectories, respectively. The proposed approach obtains the best MT value (i.e. the highest percentage of mostly tracked trajectories) compared to several recent state-of-the-art trackers; a sketch of how these categories are computed is given after Table 1.
Method | MT (%) | PT (%) | ML (%) |
Zhang et al., CVPR 2008 [89] | 85.7 | 10.7 | 3.6 |
Li et al., CVPR 2009 [71] | 84.6 | 14.0 | 1.4 |
Kuo et al., CVPR 2010 [69] | 84.6 | 14.7 | 0.7 |
Proposed approach | 86.4 | 10.6 | 3.0 |
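The MT/PT/ML percentages can be computed as sketched below from the fraction of each ground-truth trajectory that is successfully tracked; the 80% and 20% thresholds are the values conventionally used in the literature and are stated here only as an assumption.

    # Sketch of the MT/PT/ML computation (conventional 80%/20% thresholds assumed).
    def trajectory_categories(tracked_ratios, mt_threshold=0.8, ml_threshold=0.2):
        # tracked_ratios: for each ground-truth trajectory, the fraction of its
        # length that is successfully tracked.
        n = len(tracked_ratios)
        mt = sum(r >= mt_threshold for r in tracked_ratios)   # mostly tracked
        ml = sum(r <= ml_threshold for r in tracked_ratios)   # mostly lost
        pt = n - mt - ml                                      # partially tracked
        return 100.0 * mt / n, 100.0 * pt / n, 100.0 * ml / n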
Table 2 presents the tracking results of the proposed approach and of three recent approaches [56], [82], [67] on a PETS video. The proposed approach obtains the best values for both the MOTA (Multiple Object Tracking Accuracy) and MOTP (Multiple Object Tracking Precision) metrics. The authors of [56], [82], [67] do not report results for the MT, PT and ML metrics. A sketch of the MOTA and MOTP computation is given after Table 2.
Method | MOTA | MOTP | MT (%) | PT (%) | ML (%) |
Berclaz et al., PAMI 2011 [56] | 0.80 | 0.58 | - | - | - |
Shitrit et al., ICCV 2011 [82] | 0.81 | 0.58 | - | - | - |
Henriques et al., ICCV 2011 [67] | 0.85 | 0.69 | - | - | - |
Proposed approach | 0.86 | 0.72 | 71.43 | 19.05 | 9.52 |
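Both metrics follow the CLEAR MOT definitions: MOTA penalises misses, false positives and identity switches relative to the number of ground-truth objects, while MOTP measures the localisation quality of the matched hypothesis/ground-truth pairs. The sketch below assumes MOTP is reported as a mean bounding-box overlap, consistent with the 0-1 range of Table 2; it is an illustration of the standard formulas, not of the evaluation code actually used.

    # Sketch of the CLEAR MOT metrics from per-frame counts (assumed conventions).
    def clear_mot(per_frame_counts, matched_overlaps):
        # per_frame_counts: (misses, false_positives, id_switches, ground_truths)
        # per frame; matched_overlaps: overlap of every matched pair over the video.
        misses = sum(c[0] for c in per_frame_counts)
        false_pos = sum(c[1] for c in per_frame_counts)
        switches = sum(c[2] for c in per_frame_counts)
        gt = sum(c[3] for c in per_frame_counts)
        mota = 1.0 - (misses + false_pos + switches) / gt
        motp = sum(matched_overlaps) / len(matched_overlaps)
        return mota, motp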