Project Team Pulsar
Section: New Results

Group interaction and group tracking for video-surveillance in underground railway stations

Participants : Sofia Zaidenberg, Bernard Boulay, Carolina Garate, Duc-Phu Chau, Etienne Corvée, François Brémond.

One goal of the European project VANAHEIM is the tracking of groups of people. Based on frame-to-frame mobile object tracking, we try to detect which mobiles form a group and to follow the group through its lifetime. We define a group of people as two or more people who are close to each other and have similar trajectories (speed and direction). The dynamics of a group can be more or less erratic: people may join or split from the group, and one or more members can disappear temporarily (through occlusion or by leaving the field of view) yet reappear and still be part of the group. The motion detector which detects and labels mobile objects may also fail (misdetections or wrong labels). Analyzing trajectories over a temporal window allows handling this instability more robustly. We use the event-description language described in [50] to define events in terms of basic group properties such as size, type of trajectory, or number and density of people, and to recognize behaviors such as violence or vandalism (alarming events) or a queue at the vending machine (non-alarming events).

Two approaches to this problem have been implemented. The first takes as input the frame-to-frame tracking results of individual mobiles and gathers them into groups based on their trajectories through the temporal window. Each group has a coherence coefficient: a weighted sum of three quantities characterizing the group, namely its density (the average distance between mobiles), the similarity of the mobiles' speeds, and the similarity of their motion directions. Updating a group consists in re-calculating its coherence with new mobiles from the current frame: a mobile is added to the group only if doing so does not push the coherence under a defined threshold. A pre-selection keeps only the mobiles that are close enough to the group's center of gravity. After the update step, all mobiles that have not been assigned to a group are analyzed to form new groups if possible.
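The coherence coefficient described above can be sketched as follows. The weights, the mappings of each quantity into [0, 1], and the function names are our own illustrative assumptions; the actual platform implementation may differ.

```python
import numpy as np

def group_coherence(positions, speeds, directions,
                    w_density=0.4, w_speed=0.3, w_direction=0.3):
    """Coherence of a candidate group: weighted sum of density,
    speed similarity and direction similarity, each mapped into
    [0, 1] (higher = more coherent).  Weights are illustrative."""
    positions = np.asarray(positions, dtype=float)    # (n, 2) ground-plane points
    speeds = np.asarray(speeds, dtype=float)          # (n,) scalar speeds
    directions = np.asarray(directions, dtype=float)  # (n,) headings in radians

    n = len(positions)
    # Density: average pairwise distance, mapped so that 1 = very close.
    dists = [np.linalg.norm(positions[i] - positions[j])
             for i in range(n) for j in range(i + 1, n)]
    density = 1.0 / (1.0 + np.mean(dists))

    # Speed similarity: low spread of speeds scores close to 1.
    speed_sim = 1.0 / (1.0 + np.std(speeds))

    # Direction similarity: length of the mean unit heading vector
    # (1 when everyone moves the same way, near 0 for opposite headings).
    vecs = np.column_stack([np.cos(directions), np.sin(directions)])
    dir_sim = np.linalg.norm(vecs.mean(axis=0))

    return w_density * density + w_speed * speed_sim + w_direction * dir_sim
```

A tight pair walking side by side in the same direction scores high, while two distant people moving in opposite directions at different speeds scores low; in the update step, a new mobile would only be accepted if the resulting coherence stays above the defined threshold.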

A first improvement was made by integrating the LBP-based people detector described in [36]. This makes the algorithm more robust to false detections such as train doors closing. On the other hand, it also introduces false negatives since, among other things, people are only detected when fully visible in the image. The group tracking algorithm has been tested both with the original, background subtraction-based mobile object detection (noted S hereafter) and with the LBP-based people detection (noted LBP hereafter, and HD in Table 4).

For evaluating the detection, we used 3 annotated sequences: Sequence 1 is a short sequence of 128 frames with a single ground truth object (one group), Sequence 2 has 1373 frames and 9 ground truth objects, and Sequence 3 is 17992 frames long with 25 annotated ground truth objects. Detection and tracking results are shown in Table 4.

Table 4. Segmentation (S) and Human Detector (HD) Results

                          Sequence 1      Sequence 2      Sequence 3
                          S       HD      S       HD      S       HD
True Positives (TP)       72      67      1395    1079    5635    3679
False Positives (FP)      0       0       11      111     1213    642
False Negatives (FN)      6       11      269     585     3686    5642
Precision (global)        1       1       0.99    0.90    0.82    0.85
Sensitivity (global)      0.92    0.84    0.83    0.65    0.60    0.40
Tracking confusion        1       1       1       0.99    0.92    0.96
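The global precision and sensitivity rows follow directly from the TP/FP/FN counts. The sketch below recomputes the Sequence 3 segmentation (S) column; the function names are ours, not from the VANAHEIM platform.

```python
def precision(tp, fp):
    """Fraction of detections that match a ground truth object."""
    return tp / (tp + fp) if tp + fp else 1.0

def sensitivity(tp, fn):
    """Fraction of ground truth objects that were detected (recall)."""
    return tp / (tp + fn) if tp + fn else 1.0

# Sequence 3, segmentation (S) column of Table 4:
p = round(precision(5635, 1213), 2)    # 0.82
s = round(sensitivity(5635, 3686), 2)  # 0.60
```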

The whole algorithm chain has been integrated into the common VANAHEIM platform and sent to partners for pre-integration.

We also used videos from the ViCoMo project, recorded at Eindhoven Airport, to test our approach. No formal evaluation has been done yet on these sequences due to the lack of ground truth. Nevertheless, these videos contain several acted scenes that were successfully recognized: groups merging, splitting, and entering a forbidden zone.

This work has been published in [50] .

In parallel, a new approach is being developed that makes use of the long-term tracker described in [35]. This tracker provides more robust individual trajectories to the group tracker, with fewer confusions when people cross each other. We apply the Mean Shift clustering algorithm to the trajectories of people through a sliding time window (e.g. 10 frames). If a target is lost in one or several frames, we interpolate its positions. The clustering brings together mobiles with similar trajectories, which matches our definition of a group. At each frame, clusters are computed and then matched against the groups existing in the previous frame, thereby tracking groups.

Looking backwards (within the window) along the trajectory of a mobile, we may find a mobile on that trajectory that belongs to a group; if so, that group is called the probable group of the current mobile. Each trajectory cluster is associated with the group that is the probable group of most mobiles in the cluster. Several clusters may be associated with the same group. This cluster association makes the algorithm robust to cases where one or several mobiles temporarily separate from the group. If the separation lasts longer than the time window, the probable group of these mobiles will be empty and a split will be detected.
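The clustering step can be sketched with scikit-learn's Mean Shift implementation, treating each mobile's windowed trajectory as one flattened feature vector. The bandwidth value and the data layout are illustrative assumptions, not the project's actual code.

```python
import numpy as np
from sklearn.cluster import MeanShift

def cluster_trajectories(tracks, bandwidth=2.0):
    """Cluster mobiles by trajectory over a sliding window.

    tracks: dict mapping a mobile id to a (T, 2) sequence of positions
    over the window (missing frames already interpolated).  Each
    trajectory is flattened into one feature vector, so mobiles that
    stay close and move the same way fall into the same Mean Shift
    cluster.  The bandwidth is an illustrative placeholder; it would
    need tuning to the scene's ground-plane coordinates.
    """
    ids = sorted(tracks)
    X = np.stack([np.asarray(tracks[i], dtype=float).ravel() for i in ids])
    labels = MeanShift(bandwidth=bandwidth).fit_predict(X)
    clusters = {}
    for mobile_id, label in zip(ids, labels):
        clusters.setdefault(label, []).append(mobile_id)
    return list(clusters.values())
```

For example, two people walking side by side end up in one cluster while a third person far away forms a cluster of their own; each resulting cluster would then be matched to an existing group via the probable-group vote described above.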

Additionally, we work on improving people detection by combining both methods, background subtraction-based and LBP-based. We compare overlapping mobiles from both methods and keep the better one based on their respective confidence values and sizes. If a target was detected by only one of the two methods, we keep it provided its confidence is high enough. If a mobile from the background subtraction method is big enough to cover several LBP-detected people (the LBP-based detector outputs targets with the size of a human, whereas background subtraction can detect a bigger mobile with the size of a GROUP_OF_PEOPLE), we attach the LBP people as sub-mobiles of the group mobile so that no information is lost. This method is a work in progress and no evaluation has been done yet.
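A minimal sketch of this combination logic, under assumed data structures (bounding boxes with confidence values) and placeholder thresholds; the attachment of LBP people as sub-mobiles of a GROUP_OF_PEOPLE blob is left out of the sketch.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def fuse_detections(bg_mobiles, lbp_people, overlap_thr=0.5, min_conf=0.6):
    """Combine background-subtraction mobiles with LBP people detections.

    Each detection is a dict with 'box' and 'confidence'.  An overlapping
    pair keeps the more confident detection; a target seen by only one
    detector survives only if its confidence is high enough.  Both
    thresholds are illustrative placeholders.
    """
    fused, matched = [], set()
    for m in bg_mobiles:
        best = max(lbp_people, default=None,
                   key=lambda p: iou(m['box'], p['box']))
        if best is not None and iou(m['box'], best['box']) >= overlap_thr:
            fused.append(m if m['confidence'] >= best['confidence'] else best)
            matched.add(id(best))
        elif m['confidence'] >= min_conf:
            fused.append(m)
    fused += [p for p in lbp_people
              if id(p) not in matched and p['confidence'] >= min_conf]
    return fused
```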

Figure 20 shows two examples of group and event detection.

Figure 20. Example of detected groups and events: a group getting off the train (top) and a group having a lively behavior (bottom).