Collaborative Tracking of Objects in EPTZ Cameras


Collaborative Tracking of Objects in EPTZ Cameras
Faisal Bashir, Fatih Porikli
Mitsubishi Electric Research Laboratories, Cambridge, MA 02139

Overview
• Problem Statement and Background
  – High-resolution wide area coverage
  – Object detection
  – Multi-kernel mean-shift tracking
• Electronic Pan-Tilt-Zoom (EPTZ) solution
  – System-camera interaction
  – EPTZ tracking algorithm
• Results

Problem Statement and Approach
• Problem
  – Given a high-definition (1280x720) video sequence of a sports field, detect and track players, generating high-resolution imagery of the desired targets (players)
• Approach
  – Modern HD cameras deliver a low-resolution thumbnail (LRT) of the wide area as well as a high-resolution cropped (HRC) image of the target region
  – We use these two time-stamped images to collaboratively detect and track players

Challenges of HD Sports Tracking
• Wide area coverage at high resolution
• Automatic detection of objects of interest
• Tracking of detected objects

Wide Area Coverage at High Resolution
• Traditional solutions
  – Scale existing narrow-FOV systems using HD cameras
    • In our application, this approach would require 16 static HD cameras
  – Master-slave architecture using fixed and PTZ cameras
    • Multiple independent cameras need to be carefully calibrated to a joint world coordinate system

Wide Area Coverage at High Resolution
• Our approach
  – Scalable architecture that can be extended to higher-resolution imagery
  – A single camera provides both wide area coverage and high-resolution imagery of the target
  – No calibration is needed, because the relative homography between the low-resolution thumbnail and the high-resolution cropped image is trivially known
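To illustrate why the homography between the thumbnail and the cropped image needs no calibration, here is a minimal sketch. It assumes the LRT is a uniform downscale of the full HD frame and the HRC is an axis-aligned crop of the HD frame at native resolution. The LRT size (320x180), the crop origin, and the function names are hypothetical and not taken from the slides.

```python
import numpy as np

def lrt_to_hrc_homography(hd_size, lrt_size, crop_origin):
    """3x3 homography mapping LRT pixel coordinates to HRC pixel coordinates.

    Assumption: LRT = full HD frame scaled down, HRC = crop of the HD frame
    whose top-left corner sits at crop_origin (in HD pixels).
    """
    sx = hd_size[0] / lrt_size[0]  # horizontal scale LRT -> HD
    sy = hd_size[1] / lrt_size[1]  # vertical scale LRT -> HD
    x0, y0 = crop_origin
    return np.array([[sx, 0.0, -x0],
                     [0.0, sy, -y0],
                     [0.0, 0.0, 1.0]])

def apply_homography(H, point):
    """Map one (x, y) point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return (float(x / w), float(y / w))

# Hypothetical sizes: 1280x720 HD frame, 320x180 thumbnail, crop at (400, 200).
H = lrt_to_hrc_homography((1280, 720), (320, 180), (400, 200))
print(apply_homography(H, (150, 90)))  # (200.0, 160.0)
```

Under these assumptions the mapping is only a scale plus a translation, both known from the camera's own crop and resize settings.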

Object Detection
• Build a statistical model-based representation of the background
  – Model should be robust to variations in background appearance
  – Background should be easy to update, and foreground objects easy to detect
• Earlier approaches
  – Mixture of Gaussians (MoG) with expectation-maximization for parameter learning [Stauffer & Grimson ’99]
  – Kernel density estimation for non-parametric background modeling [Elgammal et al. ’00]

Object Detection
• Recursive Bayesian learning for parameter estimation in MoG-based background representation [Porikli & Tuzel ’05]
  – Recent history of each pixel is maintained by k layers (3D multivariate Gaussians)
  – Estimate the probability distribution of mean and variance for each layer
• Background model is maintained at LRT size for low computational cost, then projected to HD size for efficient background subtraction
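The idea of keeping the background model at LRT size and projecting it to HD size can be sketched as follows. This is a deliberately simplified stand-in: it uses a single running Gaussian per pixel, not the layered recursive Bayesian model of [Porikli & Tuzel ’05]. The class name, the learning rate, the thresholds, the 4x scale factor, and the nearest-neighbour projection are all assumptions made for illustration.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel running mean/variance kept at LRT resolution (simplified model)."""

    def __init__(self, first_lrt, alpha=0.05, init_var=25.0, min_var=4.0):
        self.mean = first_lrt.astype(float)
        self.var = np.full(first_lrt.shape, init_var, dtype=float)
        self.alpha = alpha      # learning rate (assumed value)
        self.min_var = min_var  # variance floor to avoid over-sensitivity

    def update(self, lrt):
        """Blend a new LRT frame into the mean and variance."""
        diff = lrt.astype(float) - self.mean
        self.mean += self.alpha * diff
        self.var = np.maximum(
            (1 - self.alpha) * self.var + self.alpha * diff ** 2, self.min_var)

def project_to_hd(lrt_array, scale):
    """Nearest-neighbour upsampling of an LRT-size array to HD size."""
    return np.repeat(np.repeat(lrt_array, scale, axis=0), scale, axis=1)

def foreground_mask(frame, mean, var, k=2.5):
    """Mark pixels deviating more than k standard deviations in any channel."""
    d2 = (frame.astype(float) - mean) ** 2
    return np.any(d2 > (k ** 2) * var, axis=-1)

# Synthetic data: flat grey 320x180 thumbnail, 1280x720 HD frame with one bright patch.
lrt = np.full((180, 320, 3), 100, dtype=np.uint8)
model = RunningGaussianBackground(lrt)
model.update(lrt)

hd_mean = project_to_hd(model.mean, 4)
hd_var = project_to_hd(model.var, 4)
hd_frame = np.full((720, 1280, 3), 100, dtype=np.uint8)
hd_frame[200:240, 400:420] = 200  # 40x20 synthetic "player"
mask = foreground_mask(hd_frame, hd_mean, hd_var)
print(hd_mean.shape, int(mask.sum()))  # (720, 1280, 3) 800
```

The update cost scales with the LRT pixel count, which is 16 times smaller than the HD frame in this example; only the subtraction touches HD-size data.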

Object Tracking
• Multi-kernel mean-shift tracking with background modeling in HRC view
  – Tracking is done in the HRC view because objects in the LRT view are too small for reliable tracking
  – The object model (RGB histogram) is updated better in the HRC view because more object data is available
  – Multi-kernel tracking performs better because rapidly moving objects do not spend much time in the small FOV of the HRC
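As a rough illustration of histogram-based mean-shift tracking, the sketch below follows the single-kernel formulation of Comaniciu et al. (Kernel-Based Object Tracking) with an Epanechnikov kernel. It is not the multi-kernel variant named on the slide and omits the background modeling. The bin count, window size, function names, and synthetic image are assumptions.

```python
import numpy as np

BINS = 8  # histogram bins per RGB channel (assumed)

def bin_index(colors):
    """Map uint8 RGB rows (N, 3) to flat histogram bin indices."""
    q = colors.astype(int) // (256 // BINS)
    return (q[:, 0] * BINS + q[:, 1]) * BINS + q[:, 2]

def window(image, center, half):
    """Pixel coords (x, y), colors, and squared normalized radius around center."""
    cx, cy = int(round(center[0])), int(round(center[1]))
    h, w = image.shape[:2]
    x0, x1 = max(cx - half, 0), min(cx + half + 1, w)
    y0, y1 = max(cy - half, 0), min(cy + half + 1, h)
    ys, xs = np.mgrid[y0:y1, x0:x1]
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    colors = image[y0:y1, x0:x1].reshape(-1, 3)
    r2 = np.sum(((coords - np.asarray(center, float)) / half) ** 2, axis=1)
    return coords, colors, r2

def histogram(image, center, half):
    """Kernel-weighted, normalized RGB histogram (Epanechnikov profile)."""
    _, colors, r2 = window(image, center, half)
    k = np.maximum(1.0 - r2, 0.0)
    hist = np.bincount(bin_index(colors), weights=k, minlength=BINS ** 3)
    return hist / hist.sum()

def bhattacharyya(p, q):
    """Similarity between two normalized histograms (1.0 means identical)."""
    return float(np.sum(np.sqrt(p * q)))

def mean_shift(image, target_hist, start, half, iters=50, eps=0.05):
    """Move the window toward the region whose histogram matches target_hist."""
    center = np.asarray(start, float)
    for _ in range(iters):
        coords, colors, r2 = window(image, center, half)
        cand = histogram(image, center, half)
        idx = bin_index(colors)
        w = np.sqrt(target_hist[idx] / np.maximum(cand[idx], 1e-12))
        w = w * (r2 < 1.0)  # Epanechnikov: constant weight inside the disc
        if w.sum() == 0:
            break  # nothing in the window resembles the target
        new_center = (coords * w[:, None]).sum(axis=0) / w.sum()
        shift = np.linalg.norm(new_center - center)
        center = new_center
        if shift < eps:
            break
    return center

# Synthetic HRC-like image: red 11x11 "player" centered at (x=60, y=50).
image = np.zeros((100, 100, 3), dtype=np.uint8)
image[45:56, 55:66] = (255, 0, 0)
target = histogram(image, (60, 50), 5)
start = (55.0, 47.0)
end = mean_shift(image, target, start, 5)
score_start = bhattacharyya(target, histogram(image, start, 5))
score_end = bhattacharyya(target, histogram(image, end, 5))
print(end, score_start, score_end)
```

A multi-kernel tracker would evaluate several such kernels per frame; the single kernel here only shows the basic weight and centroid update.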

System-Camera Interaction
[Figure: block diagram. Camera: sequence of HD frames. LRT → Process: BG Generation, Player Init. LRT → Process: BG Update. HRC → Process: Track, Object Model Update, Background Update.]
The figure shows the original HD image in the camera and the two images, LRT and HRC, that are transferred to the system for processing.

HD Tracking Algorithm
[Flow chart. Boxes: Get Next LRT Image; Generate/Update Background Model (LRT Only); Get Next LRT and HRC Images; Generate/Update Background; Track in HRC (Mean-Shift); Update Object Model using LRT Mask and HRC pixels. Decisions (YES/NO branches): User-Input; Player Exit/Lost.]
Flow chart of the algorithm that processes LRT and HRC images from the camera. It also highlights the role of semi-automatic player initialization and collaborative tracking.
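One plausible reading of the flow chart's boxes and decisions is a two-mode loop: an LRT-only mode that maintains the background until the user selects a player, and a tracking mode that uses both LRT and HRC images until the player exits or is lost. The sketch below encodes only that reading. The exact edges of the original chart are not recoverable from the transcript, so the transitions, the mode names, and the action strings are assumptions.

```python
from enum import Enum

class Mode(Enum):
    LRT_ONLY = "lrt_only"   # background maintenance, waiting for user input
    TRACKING = "tracking"   # collaborative LRT + HRC processing

def step(mode, user_selected_player, player_lost):
    """Return (next_mode, actions) for one frame under the assumed reading."""
    if mode is Mode.LRT_ONLY:
        actions = ["get_next_lrt", "update_background_model_lrt"]
        if user_selected_player:
            # Semi-automatic initialization: the user picks the player to follow.
            actions.append("init_player")
            return Mode.TRACKING, actions
        return Mode.LRT_ONLY, actions

    actions = ["get_next_lrt_and_hrc", "update_background", "track_in_hrc_mean_shift"]
    if player_lost:
        # Player exit/lost: fall back to LRT-only processing.
        return Mode.LRT_ONLY, actions
    actions.append("update_object_model_from_lrt_mask_and_hrc_pixels")
    return Mode.TRACKING, actions

# Walk through a short hypothetical session.
mode = Mode.LRT_ONLY
for selected, lost in [(False, False), (True, False), (False, False), (False, True)]:
    mode, actions = step(mode, selected, lost)
    print(mode.name, actions)
```

Keeping the per-frame decision in a pure function makes the assumed transitions easy to adjust if the original chart differs.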

HD Detection and Tracking Scenario
(a) HD background maintained from the low-resolution background and individual high-resolution images.
(b) Low-resolution thumbnail (LRT) image of the whole FOV. Note the very small object sizes.
(c) EPTZ high-resolution cropped (HRC) image of a detected player (pitcher) being tracked.

Results
[Figure: tracking result frames labeled 49, 180, 250, 320, 325, 355, 390, 438, 443, 458, 480, 542]
Collaborative HD player tracking in HRC view on a baseball sequence (1280x720).

Results
[Figure: panels (a), (b), (c); result frames labeled 835, 903, 917, 1051]
Collaborative HD human tracking in an outdoor environment.
(a) HD background image maintained through the low-resolution background and individual high-resolution cropped images.
(b) Low-resolution thumbnail (LRT) image of the whole FOV.
(c) EPTZ tracking result images in high resolution.

Results
Table 1. Average per-frame processing times for tracking in our collaborative solution compared with HD-only tracking.

                     Collaborative LRT+HRC    HD Only
Background Update    50 ms                    800 ms
Tracking             20 ms                    25 ms
Miscellaneous        5 ms                     10 ms
Total                75 ms                    835 ms
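The totals in Table 1 and the frame rates they imply can be checked with a few lines of arithmetic. The per-stage timings are copied from the table; the derived frame rates and speedup are computed here and do not appear on the slide.

```python
# Per-frame stage timings in milliseconds, copied from Table 1.
timings_ms = {
    "Collaborative LRT+HRC": {"background_update": 50, "tracking": 20, "miscellaneous": 5},
    "HD Only": {"background_update": 800, "tracking": 25, "miscellaneous": 10},
}

# Sum the stages for each configuration.
totals = {name: sum(stages.values()) for name, stages in timings_ms.items()}
speedup = totals["HD Only"] / totals["Collaborative LRT+HRC"]

for name, total in totals.items():
    print(f"{name}: {total} ms per frame, {1000 / total:.1f} frames/s")
print(f"Speedup: {speedup:.1f}x")
# Collaborative LRT+HRC: 75 ms per frame, 13.3 frames/s
# HD Only: 835 ms per frame, 1.2 frames/s
# Speedup: 11.1x
```

Almost all of the difference comes from the background update, which drops from 800 ms to 50 ms when the model is kept at LRT size.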

References
• C. J. Needham and R. D. Boyle, Tracking Multiple Sports Players Through Occlusion, Congestion and Scale, British Machine Vision Conference, Manchester, UK, 2001.
• F. Rafi, S. M. Khan, K. Shafiq and M. Shah, Autonomous Target Following by Unmanned Aerial Vehicles, SPIE Defence and Security Symposium 2006, Orlando FL.
• F. Bashir, W. Qu, A. Khokhar and D. Schonfeld, HMM-based Motion Recognition System using Segmented PCA, IEEE International Conference on Image Processing, Genoa, Italy, 2005.
• W. Qu, D. Schonfeld and M. Mohamed, Distributed Bayesian Multiple Target Tracking in Crowded Environments using Multiple Collaborative Cameras, EURASIP J. Applied Signal Processing, 2007 (In print).
• S. M. Khan and M. Shah, A Multiview Approach to Tracking People in Crowded Scenes using a Planar Homography Constraint, 9th European Conference on Computer Vision (ECCV 2006), Graz, Austria, 2006.
• X. Zhou, R. T. Collins, T. Kanade and P. Metes, A Master-Slave System to Acquire Biometric Imagery of Humans at Distance, ACM International Workshop on Video Surveillance, Nov. 2003.
• J. Migdal, T. Izo and C. Stauffer, Moving Object Segmentation using Super-Resolution Background Models, Workshop on Omnidirectional Vision, Camera Networks and Non-Classical Cameras, Oct. 2005.
• C. Stauffer and E. Grimson, Adaptive Background Mixture Models for Real-Time Tracking, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Fort Collins, CO, Vol. II, 1999, pp. 246-252.
• A. Elgammal, D. Harwood and L. Davis, Non-parametric Model for Background Subtraction, in Proc. European Conference on Computer Vision, Dublin, Ireland, Vol. II, 2000, pp. 751-767.
• F. Porikli and O. Tuzel, Bayesian Background Modeling for Foreground Detection, ACM Int’l Workshop on Video Surveillance and Sensor Networks (VSSN), Nov. 2005, pp. 55-58.
• D. Comaniciu, V. Ramesh and P. Meer, Kernel-Based Object Tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 25, No. 5, pp. 564-575, 2003.

Thank You!
