Collaborative Tracking of Objects in EPTZ Cameras
Faisal Bashir, Fatih Porikli
Mitsubishi Electric Research Laboratories, Cambridge, MA 02139
Overview
• Problem Statement and Background
  – High-resolution wide area coverage
  – Object detection
  – Multi-kernel mean-shift tracking
• Electronic Pan-Tilt-Zoom solution
  – System-camera interaction
  – EPTZ tracking algorithm
• Results
Mitsubishi Electric Research Labs
Problem Statement and Approach
• Problem
  – Given a High Definition (1280x720) video sequence of a sports field, detect and track players, generating high-resolution imagery of the desired targets (players)
• Approach
  – Modern HD cameras deliver a low resolution thumbnail (LRT) of the wide area as well as a high resolution cropped (HRC) image of the target region
  – We use these two time-stamped images to collaboratively detect and track players
Challenges of HD Sports Tracking
• Wide area coverage at high resolution
• Automatic detection of objects of interest
• Tracking of detected objects
Wide Area Coverage at High Resolution
• Traditional solutions
  – Scale existing narrow FOV systems using HD cameras
    • In our application, this approach would need 16 static HD cameras!
  – Master-slave architecture using fixed and PTZ cameras
    • Multiple independent cameras need to be carefully calibrated to a joint world coordinate system
Wide Area Coverage at High Resolution
• Our approach
  – Scalable architecture that can be extended to higher resolution imagery
  – A single camera provides both wide area coverage and high resolution imagery of the target
  – No calibration is needed, because the relative homography between the low resolution thumbnail and the high resolution cropped image is trivially known
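The "trivially known" relation between the two views can be sketched as a scale followed by a translation. The sketch below assumes the LRT is a uniform downscale of the full HD frame and the HRC is an axis-aligned, full-resolution crop whose top-left corner in HD coordinates is known. The 320x180 thumbnail size, the function names, and the crop origin are illustrative assumptions, not values from the slides.

```python
HD_SIZE = (1280, 720)   # HD frame size given in the problem statement
LRT_SIZE = (320, 180)   # assumed thumbnail size (hypothetical)


def lrt_to_hd(x, y, lrt_size=LRT_SIZE, hd_size=HD_SIZE):
    """Map a point from LRT pixel coordinates to HD pixel coordinates."""
    sx = hd_size[0] / lrt_size[0]
    sy = hd_size[1] / lrt_size[1]
    return x * sx, y * sy


def hd_to_hrc(x, y, crop_origin):
    """Map an HD point into an HRC crop whose top-left HD corner is crop_origin."""
    return x - crop_origin[0], y - crop_origin[1]


def lrt_to_hrc(x, y, crop_origin, lrt_size=LRT_SIZE, hd_size=HD_SIZE):
    """Compose the two maps: a scale followed by a translation."""
    hx, hy = lrt_to_hd(x, y, lrt_size, hd_size)
    return hd_to_hrc(hx, hy, crop_origin)


if __name__ == "__main__":
    # A detection at LRT pixel (100, 50); the crop starts at HD pixel (300, 150).
    print(lrt_to_hrc(100, 50, crop_origin=(300, 150)))  # (100.0, 50.0)
```

Under these assumptions a foreground mask computed in the LRT can be carried into the HRC with no camera calibration step.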
Object Detection
• Build a statistical model-based representation of the background
  – Model should be robust to variations in background appearance
  – Easy to update background and detect foreground objects
• Earlier approaches
  – Mixture of Gaussians (MoG) with expectation-maximization for parameter learning [Stauffer & Grimson '99]
  – Kernel density estimation for non-parametric background modeling [Elgammal et al. '00]
Object Detection
• Recursive Bayesian learning for parameter estimation in MoG-based background representation [Porikli & Tuzel '05]
  – Recent history of each pixel is maintained by k layers (3D multivariate Gaussians)
  – Estimate the probability distribution of the mean and variance for each layer
• Background model maintained at LRT size for low computational cost; projected to HD size for efficient background subtraction
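The idea of keeping the model at LRT size and projecting it to HD size can be illustrated with a deliberately simplified stand-in. The sketch below uses a single running Gaussian per grayscale pixel, not the layered recursive Bayesian model of [Porikli & Tuzel '05]; the learning rate `alpha`, the threshold `k`, and all function names are assumptions made for illustration.

```python
def update_background(mean, var, frame, alpha=0.05):
    """Blend a new LRT frame into per-pixel running mean and variance (in place)."""
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            diff = value - mean[r][c]
            mean[r][c] += alpha * diff
            var[r][c] = (1 - alpha) * var[r][c] + alpha * diff * diff


def project_to_hd(lrt_grid, scale):
    """Nearest-neighbour upsampling of an LRT-sized grid by an integer factor."""
    return [[lrt_grid[r // scale][c // scale]
             for c in range(len(lrt_grid[0]) * scale)]
            for r in range(len(lrt_grid) * scale)]


def foreground_mask(hd_pixels, hd_mean, hd_var, k=2.5):
    """Mark pixels more than k standard deviations from the background mean."""
    return [[(p - m) ** 2 > k * k * v
             for p, m, v in zip(prow, mrow, vrow)]
            for prow, mrow, vrow in zip(hd_pixels, hd_mean, hd_var)]


if __name__ == "__main__":
    mean = [[100.0, 100.0], [100.0, 100.0]]   # 2x2 LRT-sized model
    var = [[4.0, 4.0], [4.0, 4.0]]
    update_background(mean, var, [[102, 100], [100, 100]])
    hd_mean, hd_var = project_to_hd(mean, 2), project_to_hd(var, 2)
    hd_frame = [[100] * 4 for _ in range(4)]
    hd_frame[1][1] = 200                      # one bright foreground pixel
    mask = foreground_mask(hd_frame, hd_mean, hd_var)
    print(sum(sum(row) for row in mask))      # number of foreground pixels
```

The update touches only LRT-sized arrays, which is where the cost saving comes from; only the comparison runs at the higher resolution.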
Object Tracking
• Multi-kernel mean-shift tracking with background modeling in HRC view
  – Tracking is done in HRC view because objects in LRT view are too small for reliable tracking
  – Better object model (RGB histogram) update in HRC view because more object data is available
  – Multi-kernel performs better because rapidly moving objects don't spend much time in the small FOV of the HRC
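As a rough illustration of the tracking step, the sketch below performs one single-kernel mean-shift update with an RGB histogram object model, in the spirit of kernel-based tracking [Comaniciu et al. '03]. It is not the multi-kernel tracker of the slides: it uses one flat rectangular window instead of a kernel profile, and the bin count, function names, and toy image are assumptions.

```python
from math import sqrt

BINS = 4  # bins per colour channel (assumed)


def bin_index(pixel):
    """Quantise an (r, g, b) pixel with 0-255 channels into one histogram bin."""
    r, g, b = (ch * BINS // 256 for ch in pixel)
    return (r * BINS + g) * BINS + b


def histogram(image, window):
    """Normalised RGB histogram of the pixels inside window = (x0, y0, w, h)."""
    x0, y0, w, h = window
    hist = [0.0] * BINS ** 3
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            hist[bin_index(image[y][x])] += 1.0
    total = sum(hist)
    return [v / total for v in hist]


def mean_shift_step(image, window, target_hist):
    """Move the window to the centroid of weights sqrt(target / candidate)."""
    x0, y0, w, h = window
    cand = histogram(image, window)
    sum_w = sum_x = sum_y = 0.0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            u = bin_index(image[y][x])
            weight = sqrt(target_hist[u] / cand[u])  # cand[u] > 0 inside window
            sum_w += weight
            sum_x += weight * x
            sum_y += weight * y
    if sum_w == 0.0:
        return window  # no target colours visible: leave the window in place
    nx = int(sum_x / sum_w - (w - 1) / 2 + 0.5)
    ny = int(sum_y / sum_w - (h - 1) / 2 + 0.5)
    nx = min(max(nx, 0), len(image[0]) - w)  # keep the window inside the image
    ny = min(max(ny, 0), len(image) - h)
    return (nx, ny, w, h)


if __name__ == "__main__":
    # 10x10 black image with a 3x3 red object at x, y in 5..7.
    image = [[(255, 0, 0) if 5 <= x <= 7 and 5 <= y <= 7 else (0, 0, 0)
              for x in range(10)] for y in range(10)]
    target = histogram(image, (5, 5, 3, 3))
    print(mean_shift_step(image, (3, 3, 5, 5), target))  # (4, 4, 5, 5)
```

In practice the step is repeated until the window stops moving. A multi-kernel variant would evaluate several such windows, which this sketch does not attempt.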
System-Camera Interaction
[Figure: camera-side HD frames, each with LRT and HRC outputs, and system-side processing blocks. LRT processing: BG generation and player init, later BG update. HRC processing: track, object model update, background update.]
The figure shows the original HD image in the camera as well as the two images, LRT and HRC, transferred to the system for processing.
HD Tracking Algorithm
[Flow chart. Boxes: Get Next LRT Image; Generate/Update Background Model (LRT Only); User-Input (YES/NO); Get Next LRT and HRC Images; Generate/Update Background; Track in HRC (Mean-Shift); Player Exit/Lost (YES/NO); Update Object Model using LRT Mask and HRC pixels; User-Input (YES/NO).]
Flow chart of the algorithm to process LRT and HRC images from the camera. It also highlights the role of semi-automatic player initialization and collaborative tracking.
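The arrows of the flow chart did not survive extraction, so the loop below is only one plausible reading of its boxes: process LRT images alone until the user selects a player, then process LRT and HRC images together until the player exits or is lost. Every callable passed in is a hypothetical stand-in for a box in the chart, and the returned log exists only to make the control flow visible.

```python
def run_eptz_loop(frames, user_selects, track, update_background, update_model):
    """Drive a two-mode loop: LRT only until a player is chosen, then LRT+HRC.

    frames yields (lrt, hrc) pairs; hrc is ignored while no player is tracked.
    All callables are hypothetical stand-ins for the flow chart boxes.
    """
    target = None
    log = []
    for lrt, hrc in frames:
        update_background(lrt)            # Generate/Update Background Model
        if target is None:
            target = user_selects(lrt)    # User-Input: a target, or None
            log.append("init" if target is not None else "idle")
            continue
        target = track(hrc, target)       # Track in HRC (mean-shift)
        if target is None:                # Player Exit/Lost
            log.append("lost")
            continue
        update_model(lrt, hrc, target)    # Object model: LRT mask + HRC pixels
        log.append("track")
    return log


if __name__ == "__main__":
    frames = [(i, i) for i in range(5)]   # toy frames identified by index
    log = run_eptz_loop(
        frames,
        user_selects=lambda lrt: "player" if lrt == 1 else None,
        track=lambda hrc, target: None if hrc == 3 else target,
        update_background=lambda lrt: None,
        update_model=lambda lrt, hrc, target: None,
    )
    print(log)  # ['idle', 'init', 'track', 'lost', 'idle']
```

Returning to the LRT-only mode after a loss matches the semi-automatic initialization mentioned in the caption, but that transition is an interpretation rather than something the extracted chart states.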
HD Detection and Tracking Scenario
[Figure, panels (a)-(c).] (a) HD background maintained from the low resolution background and individual high resolution images. (b) Low resolution thumbnail (LRT) image of the whole FOV. Note the very small object sizes. (c) EPTZ high resolution cropped (HRC) image of a detected player (pitcher) being tracked.
Results
[Figure: tracking results at frames 49, 180, 250, 320, 325, 355, 390, 438, 443, 458, 480 and 542.]
Collaborative HD player tracking in HRC view on a baseball sequence (1280x720).
Results
[Figure, panels (a)-(c); tracking results at frames 835, 903, 917 and 1051.]
Collaborative HD human tracking in an outdoor environment. (a) HD background image maintained through the low resolution background and individual high resolution cropped images. (b) Low resolution thumbnail (LRT) image of the whole FOV. (c) EPTZ tracking result images in high resolution.
Results
Table 1. Average per-frame processing times for tracking in our collaborative solution compared with HD-only tracking.

                     Collaborative LRT+HRC    HD Only
Background Update    50 ms                    800 ms
Tracking             20 ms                    25 ms
Miscellaneous        5 ms                     10 ms
Total                75 ms                    835 ms
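A short calculation makes the comparison in Table 1 concrete. The input times are copied from the table; the speedup, frame rates, and background share are derived here by plain arithmetic and are not figures quoted in the slides.

```python
# Per-frame times in milliseconds, copied from Table 1.
collaborative = {"background_update": 50, "tracking": 20, "miscellaneous": 5}
hd_only = {"background_update": 800, "tracking": 25, "miscellaneous": 10}

total_collab = sum(collaborative.values())   # 75 ms, matches the table
total_hd = sum(hd_only.values())             # 835 ms, matches the table

speedup = total_hd / total_collab            # about 11.1x
fps_collab = 1000 / total_collab             # about 13.3 frames per second
fps_hd = 1000 / total_hd                     # about 1.2 frames per second
bg_share_hd = hd_only["background_update"] / total_hd  # about 0.96

print(f"speedup {speedup:.1f}x, {fps_collab:.1f} fps vs {fps_hd:.1f} fps")
```

Almost all of the HD-only cost is background update, which is the step the collaborative scheme moves to LRT size.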
References
• C. J. Needham and R. D. Boyle, Tracking Multiple Sports Players Through Occlusion, Congestion and Scale, British Machine Vision Conference, Manchester, UK, 2001.
• F. Rafi, S. M. Khan, K. Shafiq and M. Shah, Autonomous Target Following by Unmanned Aerial Vehicles, SPIE Defence and Security Symposium, Orlando, FL, 2006.
• F. Bashir, W. Qu, A. Khokhar and D. Schonfeld, HMM-based Motion Recognition System using Segmented PCA, IEEE International Conference on Image Processing, Genoa, Italy, 2005.
• W. Qu, D. Schonfeld and M. Mohamed, Distributed Bayesian Multiple Target Tracking in Crowded Environments using Multiple Collaborative Cameras, EURASIP J. Applied Signal Processing, 2007 (in print).
• S. M. Khan and M. Shah, A Multiview Approach to Tracking People in Crowded Scenes using a Planar Homography Constraint, 9th European Conference on Computer Vision (ECCV), Graz, Austria, 2006.
• X. Zhou, R. T. Collins, T. Kanade and P. Metes, A Master-Slave System to Acquire Biometric Imagery of Humans at Distance, ACM International Workshop on Video Surveillance, Nov. 2003.
• J. Migdal, T. Izo and C. Stauffer, Moving Object Segmentation using Super-Resolution Background Models, Workshop on Omnidirectional Vision, Camera Networks and Non-Classical Cameras, Oct. 2005.
• C. Stauffer and E. Grimson, Adaptive Background Mixture Models for Real-Time Tracking, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Fort Collins, CO, Vol. II, 1999, pp. 246-252.
• A. Elgammal, D. Harwood and L. Davis, Non-parametric Model for Background Subtraction, in Proc. European Conference on Computer Vision, Dublin, Ireland, Vol. II, 2000, pp. 751-767.
• F. Porikli and O. Tuzel, Bayesian Background Modeling for Foreground Detection, ACM Int'l Workshop on Video Surveillance and Sensor Networks (VSSN), Nov. 2005, pp. 55-58.
• D. Comaniciu, V. Ramesh and P. Meer, Kernel-Based Object Tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 25, No. 5, pp. 564-575, 2003.
Thank You!
[email protected] [email protected]