Non-Camera Sensors, Tracking and Fusion Algorithms for Autonomous Driving
Michael James Principal Investigator Toyota Research Institute, North America
Plan
• 50 min total
• Intro, system architecture and considerations - 5 min
• Sensors (camera, radar, lidar) - 5 min
• General picture (detection, association, filtering) - 5 min
• Processing before Tracking - 5 min
• Filtering - 10 min
• Data association - 10 min
• Practical issues - 10 min
Generalized System Architecture
Tracker Outputs
• What is expected from tracking
  – A list of tracked targets, their locations and motion estimates
  – Can include probability distributions, often difficult to characterize well
Other Considerations
• Tracking algorithms also depend on
  – Ego-motion
  – Calibration
  – Maps and other priors
Sensors
Typical Sensor Suite
Cameras
• High information density
  – A challenge to process in real-time
• Projective data
  – No direct 3D information
  – Must be computed
• Passive sensing
  – Works with external light sources
Non-camera sensors
• Active Sensors
  – Emit EM, capture some returns
  – Basics:
    • Low energy (lower frequency) tends to behave like waves
    • Higher energy (higher frequency) tends to behave like particles
  – Ambient EM can exist
    • Relatively little for Lidar and Radar
    • Short emission times and directed antenna helps
  – Water absorption / reflectance
    • Many Lidar frequencies affected
      – Rain, snow, fog
    • Radar frequencies generally unaffected
  – Emission levels limited for safety
• Positioning
  – Typically high-quality GPS with integrated IMU
  – Tied to maps, tied to tracking
  – Fine-tuned using localization
  – For today, assumed sufficient
• Ranges
  – Visible light
    • 400-700 nm
  – Lidar
    • 800-1600 nm (~190-375 THz)
  – Automotive radar
    • 4 mm - 1 cm (29-77 GHz)
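The wavelength and frequency figures above are related by f = c / λ. A minimal sketch of that conversion follows, assuming free-space propagation; the function name is illustrative and not from the talk.

```python
# Convert a free-space wavelength to frequency using f = c / wavelength.
SPEED_OF_LIGHT_M_S = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_to_frequency_hz(wavelength_m: float) -> float:
    """Return the frequency in Hz for a free-space wavelength given in metres."""
    if wavelength_m <= 0.0:
        raise ValueError("wavelength must be positive")
    return SPEED_OF_LIGHT_M_S / wavelength_m


# Example inputs taken from the ranges listed on the slide.
lidar_short_thz = wavelength_to_frequency_hz(800e-9) / 1e12   # 800 nm lidar
lidar_long_thz = wavelength_to_frequency_hz(1600e-9) / 1e12   # 1600 nm lidar
radar_long_ghz = wavelength_to_frequency_hz(0.01) / 1e9       # 1 cm radar
```

With these inputs, 800 nm comes out near 375 THz, 1600 nm near 187 THz, and 1 cm near 30 GHz.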
Lidar Principles
• Operation (per point)
  – Laser emission
  – Light scatters on target
  – Photons collected by directed photodiode
  – Accumulate energy over time
  – Repeat many times, find peaks
• Conceptually, a beam of photons
  – Makes a point cloud
• Output characterization
  – Highly precise in distance measurement
  – Resolution lower than most cameras
  – Sensitive to rain/snow/fog
  – Get reflection (measurement) from most objects
  – Distance limits due to emission limits
* Toyota (TCRDL) developed Lidar; Matsubara et al.
(Automotive) Radar Principles
• Scans horizontally
  – Measurement
    • Horizontal angle
    • Distance
    • Velocity (Doppler)
  – Conceptually a wave
  – Requires reflective material
    • Metallic
• On-board signal processing required
• Output characterization
  – Less precise in distance and angle than Lidar
  – Has velocity
  – Longer range than Lidar
  – Robust to rain, snow, fog
  – Not every obstacle reflects sufficiently
  – Signal processing on-chip (often including tracking)
    • Sometimes a negative
Delphi sensor, Video by AutonomousStuff
Sensor Processing
• Usually processed before tracking starts
  – Lidar:
    • Obstacle / ground discrimination
    • Segmentation
    • Filter based on map information
  – Radar
    • Processing within radar unit
    • Incorporates many assumptions
• Important because tracking needs accurate models of sensor performance, noise, failure modes, etc.
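As a rough illustration of lidar obstacle / ground discrimination, the sketch below splits points by height. It assumes a flat ground plane at a known height in a z-up vehicle frame, which is a strong simplification; the function name and the 0.2 m tolerance are hypothetical, and practical systems typically fit the ground or take it from a map.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, z pointing up


def split_ground_obstacles(
    points: Sequence[Point],
    ground_z: float = 0.0,
    tolerance: float = 0.2,
) -> Tuple[List[Point], List[Point]]:
    """Label points at or below ground_z + tolerance as ground, the rest as obstacles.

    Assumes flat ground at height ground_z (illustrative only).
    """
    ground: List[Point] = []
    obstacles: List[Point] = []
    for point in points:
        if point[2] <= ground_z + tolerance:
            ground.append(point)
        else:
            obstacles.append(point)
    return ground, obstacles
```

Only the obstacle points would then be passed on to segmentation and tracking.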
Remainder of Talk
• Visual tracking
  – Briefly
• Traditional tracking
  – Filtering
  – Data association
• Sensor fusion
• Practical issues
• Summary
Visual Tracking and Relationship to Non-Camera Sensors
• Operate in visual plane
  – 2D projection of 3D data
  – Ego-centric
    • Even if ego-motion is estimated, still track in ego-vehicle coordinates
• Appearance characteristics can be helpful
• How to map to 3D?
  – Sensor fusion (well-calibrated system)
    • Use projective geometry and 3D info from other sensors
  – Incorporate assumptions
    • Ground plane
      – From map
      – Estimated
    • Scale assumptions
      – Size of cars, size of lanes, etc.
Lidar, Radar, and Multi-modal Tracking
Typical Architecture
[Figure: tracker block diagram. Blocks: Sensor Processing, Data Association, Predictions, Filter Update and Prediction, Output Handling]
Filtering
• Goal: To estimate “state” of a track over time
• Commonly take a Bayesian approach
• Basic assumptions
  – Single track, single measurement per time step
  – Track in “real-world” coordinates
  – Often “point-based”
• Methods
  – Kalman Filter (KF)
  – Particle Filter (PF)
Bayesian Filtering
• Measurement
• State
• Desired distribution
• Given
Bayesian Filtering
• Previous Estimate
• Prediction
• Measurement Update
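The predict / update recursion can be written out in commonly used textbook notation. The symbols here are an assumption and not necessarily those on the original slides: $x_t$ is the state, $z_t$ the measurement, $p(x_t \mid x_{t-1})$ the motion model and $p(z_t \mid x_t)$ the measurement model.

```latex
% Previous estimate: p(x_{t-1} \mid z_{1:t-1})

% Prediction (apply the motion model and marginalize the previous state):
p(x_t \mid z_{1:t-1}) = \int p(x_t \mid x_{t-1}) \, p(x_{t-1} \mid z_{1:t-1}) \, dx_{t-1}

% Measurement update (Bayes' rule with the measurement model):
p(x_t \mid z_{1:t}) = \frac{p(z_t \mid x_t) \, p(x_t \mid z_{1:t-1})}{p(z_t \mid z_{1:t-1})}
```

The denominator is a normalizing constant that does not depend on $x_t$.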
Kalman Filter
• Assumptions
  – State is Gaussian vector
  – Linear state update
  – Linear measurement (observation)
• Special case of Bayesian Filtering
  – Special form of equations (analytically derived)
  – Computationally lightweight
Kalman Filter in Pictures
[Figure panels: Predict, Update]
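The predict and update steps can be sketched for a small case. The example below assumes a 1D constant-velocity target with a position-only measurement, and writes the 2x2 matrix algebra out by hand to stay dependency-free. The model, the diagonal process noise and all names are illustrative assumptions, not the talk's implementation.

```python
from typing import Tuple

State = Tuple[float, float]       # (position, velocity)
Cov = Tuple[float, float, float]  # (P00, P01, P11) of a symmetric 2x2 covariance


def kf_predict(x: State, P: Cov, dt: float, q: float) -> Tuple[State, Cov]:
    """Predict with F = [[1, dt], [0, 1]]: x' = F x, P' = F P F^T + Q.

    Q is taken as diag(q*dt, q*dt), a simple illustrative choice.
    """
    pos, vel = x
    p00, p01, p11 = P
    new_x = (pos + dt * vel, vel)
    new_P = (
        p00 + 2.0 * dt * p01 + dt * dt * p11 + q * dt,
        p01 + dt * p11,
        p11 + q * dt,
    )
    return new_x, new_P


def kf_update(x: State, P: Cov, z: float, r: float) -> Tuple[State, Cov]:
    """Update with a position measurement z = H x + noise, H = [1, 0], variance r."""
    pos, vel = x
    p00, p01, p11 = P
    innovation = z - pos           # measurement minus predicted measurement
    s = p00 + r                    # innovation variance S = H P H^T + R
    k0, k1 = p00 / s, p01 / s      # Kalman gain K = P H^T / S
    new_x = (pos + k0 * innovation, vel + k1 * innovation)
    # P = (I - K H) P, written out element by element
    new_P = ((1.0 - k0) * p00, (1.0 - k0) * p01, p11 - k1 * p01)
    return new_x, new_P
```

In use, each time step calls `kf_predict` and then `kf_update` with the associated measurement; velocity is never measured directly but is inferred through the position/velocity covariance term.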
Particle Filter
• Assumptions on distribution
  – None
• Special case of Bayesian Filtering
  – Distribution approximated by finite set of discrete particles
  – Must be able to sample from update
  – Evaluate measurement probability
  – Must keep the approximation healthy
  – Can be computationally expensive
Particle Filter (SIR) In Pictures
[Figure panels: Predict, Update; multi-mode distribution]
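A sampling-importance-resampling (SIR) step can be sketched as below for a 1D state. The random-walk motion model, the Gaussian measurement likelihood, and resampling on every step are assumptions made for brevity; the function name is illustrative.

```python
import math
import random
from typing import List


def sir_step(
    particles: List[float],
    z: float,
    motion_std: float,
    meas_std: float,
    rng: random.Random,
) -> List[float]:
    """One SIR iteration: predict, weight by the measurement, resample."""
    # Predict: sample each particle from the (assumed random-walk) motion model.
    predicted = [x + rng.gauss(0.0, motion_std) for x in particles]

    # Update: weight each particle by the (assumed Gaussian) measurement likelihood.
    weights = [math.exp(-0.5 * ((z - x) / meas_std) ** 2) for x in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)

    # Resample: draw with replacement in proportion to weight, which keeps
    # the particle set concentrated where the posterior has mass.
    return rng.choices(predicted, weights=weights, k=len(predicted))
```

Because the particle set carries no parametric form, it can represent the multi-mode case as well; the state estimate is typically read off as the particle mean or the densest cluster.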
PF Example: 6-DOF Model Based Tracking
• PF with optimized predict (proposal) step
  – Reduces number of particles
  – Reduces computation
6-DOF Model Based Tracking via Object Coordinate Regression; Krull, Michel, Brachmann, Gumhold, Ihrke, Rother
Extensions
• Filters and state estimation
  – KF and variations
    • EKF / UKF (Extended / Unscented)
      – Approximate non-linear update/measurement functions
    • RBPF (Rao-Blackwellized PF)
      – PF for some parts of distribution, analytic probability distribution for others
  – Interacting Multiple Models (IMM)
    • Update includes a motion model, what if the target is maneuvering?
    • Combine multiple motion models
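To make the EKF idea concrete, the sketch below shows only the linearization step for a non-linear measurement: a radar-like (range, bearing) reading of a Cartesian position. The measurement model and function names are assumptions for illustration. An EKF would use this Jacobian in place of the linear measurement matrix of the Kalman filter.

```python
import math
from typing import Tuple


def measure_range_bearing(px: float, py: float) -> Tuple[float, float]:
    """Non-linear measurement function h(x): Cartesian position to (range, bearing)."""
    return math.hypot(px, py), math.atan2(py, px)


def range_bearing_jacobian(
    px: float, py: float
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Jacobian of h with respect to (px, py), evaluated at the predicted position."""
    r2 = px * px + py * py
    if r2 == 0.0:
        raise ValueError("Jacobian is undefined at the sensor origin")
    r = math.sqrt(r2)
    return (
        (px / r, py / r),      # d(range)/d(px),   d(range)/d(py)
        (-py / r2, px / r2),   # d(bearing)/d(px), d(bearing)/d(py)
    )
```

The UKF avoids the explicit Jacobian by propagating a small set of sigma points through the same function.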
Data Association
• Data association
  – Goal: Assign measurements to targets
  – Considerations
    • Hard and soft associations
    • Sensor / perception noise and ability to discriminate unique objects
      – False measurements
      – Noisy measurements
      – Missing measurements
      – An appearance-based tracker with relatively unique objects requires a much different approach than sparse lidar points with poor clustering
    • Gating
    • Computation speed
• Topics to cover (traditional methods)
  – Nearest Neighbor association (NN)
  – Probabilistic Data Association (PDA)
  – Joint Probabilistic Data Association (JPDA)
  – Multi-hypothesis Tracking (MHT)
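Gating is commonly done with a Mahalanobis-distance test against the predicted measurement, and the sketch below shows that variant. It assumes 2D measurements and a Gaussian innovation with covariance S; the names and the 99% threshold are illustrative choices, not taken from the talk.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Mat2 = Tuple[Tuple[float, float], Tuple[float, float]]

# 99% chi-square quantile for 2 degrees of freedom: -2 * ln(0.01) ~= 9.21
GATE_99_2D = 9.21


def mahalanobis_sq(innovation: Vec2, S: Mat2) -> float:
    """Squared Mahalanobis distance v^T S^-1 v for a 2x2 innovation covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    vx, vy = innovation
    # S^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (vx * (d * vx - b * vy) + vy * (-c * vx + a * vy)) / det


def gate(
    predicted: Vec2,
    S: Mat2,
    measurements: Sequence[Vec2],
    threshold: float = GATE_99_2D,
) -> List[Vec2]:
    """Keep only measurements whose squared Mahalanobis distance is within the gate."""
    return [
        z
        for z in measurements
        if mahalanobis_sq((z[0] - predicted[0], z[1] - predicted[1]), S) <= threshold
    ]
```

Gating cuts computation by discarding implausible track/measurement pairs before any of the association methods listed above is run.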
Nearest Neighbor
• Single, local, hard-association
  – Basic idea: Given a single track and set of measurements, find and use the single best measurement
    • Filtering prediction step gives state distribution
    • Measurement function to evaluate each measurement
      – Integrate over state dist.
      – Or some nice (e.g. Gaussian) form
      – False alarms
    • Perform filtering update
• Multi-target Extension
  – Global NN: Single, unique hard-association (GNN)
  – Idea: Given all tracks and set of measurements, find most likely global association
    • Follow first two steps above, then solve a global assignment problem
    • Faster algorithms for large problems (e.g. Munkres)
• Pros
  – Easy to code
  – Fast
• Cons
  – Difficult to recover from mistake (not robust)
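The difference between local NN and global NN can be sketched on a small cost matrix. Here `cost[t][m]` is an assumed association cost between track `t` and measurement `m` (for example a squared Mahalanobis distance). The global version below uses brute-force enumeration, which is only workable for tiny problems; it stands in for an assignment solver such as Munkres and ignores missed detections and false alarms.

```python
from itertools import permutations
from typing import List, Sequence


def nearest_neighbor(costs_for_track: Sequence[float]) -> int:
    """Local NN: index of the cheapest measurement for one track."""
    return min(range(len(costs_for_track)), key=costs_for_track.__getitem__)


def global_nearest_neighbor(cost: Sequence[Sequence[float]]) -> List[int]:
    """Global NN: unique track-to-measurement assignment with minimum total cost.

    Brute force over all assignments (factorial time), for illustration only.
    """
    n_tracks = len(cost)
    n_meas = len(cost[0]) if n_tracks else 0
    if n_meas < n_tracks:
        raise ValueError("sketch assumes at least as many measurements as tracks")
    best = min(
        permutations(range(n_meas), n_tracks),
        key=lambda assignment: sum(cost[t][m] for t, m in enumerate(assignment)),
    )
    return list(best)


# Two tracks competing for measurement 0.
example_cost = [[1.0, 2.0], [1.5, 10.0]]
local_choice = [nearest_neighbor(row) for row in example_cost]  # both pick 0
global_choice = global_nearest_neighbor(example_cost)           # unique assignment
```

In this example both tracks locally prefer measurement 0, while the global solution gives measurement 1 to track 0 and measurement 0 to track 1. For larger problems an assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would replace the enumeration.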
[Joint] Probabilistic Data Association
• Soft decisions, probabilistic assignments
  – Pro: easy to represent single output (computation straightforward)
  – Con: mixes all assignments together
• PDA (single-target case)
  – Weighted sum of probability densities
  – Defines filter update
    • Special form for Kalman Filter
• JPDA (multi-target)
  – Separate by gating
  – Compute probability of each assignment hypothesis
    • E.g. probability of assignment 2 in table
  – Use hypothesis probabilities to compute assignment probabilities
    • E.g. probability of measurement 3 to target 1 is based on hypotheses 1 and 2
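For the single-target PDA case, the soft association weights can be sketched as follows. This uses one common parametric formulation with scalar measurements. The detection probability, gate probability and clutter density are assumed values, and the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_pdf(v: float, var: float) -> float:
    """Zero-mean Gaussian density with variance var, evaluated at v."""
    return math.exp(-0.5 * v * v / var) / math.sqrt(2.0 * math.pi * var)


def pda_weights(
    innovations: Sequence[float],
    s: float,
    p_detect: float = 0.9,
    p_gate: float = 0.99,
    clutter_density: float = 0.1,
) -> Tuple[float, List[float]]:
    """Return (beta_0, [beta_i]) for gated scalar innovations with variance s.

    beta_0 is the probability that none of the measurements came from the target.
    """
    likelihoods = [p_detect * gaussian_pdf(v, s) / clutter_density for v in innovations]
    miss = 1.0 - p_detect * p_gate
    norm = miss + sum(likelihoods)
    return miss / norm, [lik / norm for lik in likelihoods]


def combined_innovation(innovations: Sequence[float], betas: Sequence[float]) -> float:
    """Probability-weighted innovation used in place of a single hard-assigned one."""
    return sum(b * v for b, v in zip(betas, innovations))
```

In the Kalman-filter form, the state update uses the combined innovation, and the covariance update gains an additional term for the spread of the individual innovations (not shown here). JPDA computes the same kind of weights per target by summing the probabilities of the joint hypotheses that contain each assignment.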
Multi-Hypothesis Tracking
• Multiple hard decisions
  – Multiple universes, keep likely ones around
    • Same predict and update equations
  – Probabilistically, most sound
  – Many variations
    • Hypothesis-based
      – N-best solutions: there are efficient algorithms to compute this (reference 92?)
    • Track-based
      – Many hypotheses use same measurement for same track
      – Store tracks and use to populate hypotheses
  – Issues
    • What is planner to do?
    • Computational cost
Practical Issues
• Initialization, splitting, and deletion
• Incorporating additional information
  – Map priors
  – Motion priors
  – Measurement constraints (occlusion reasoning)
Sensor Fusion
• Large and growing literature
• Track then fuse, or fuse into single tracks
  – Why track then fuse?
    • Well established tracking methodology within each sensor modality
    • Simpler models can be used
    • Easier to discard bad data from one sensor
      – More robust to noise and data association errors
  – Why fuse into single tracks?
    • Data from multiple sensors provides different information
      – e.g. radar velocity data to initialize motion
    • Easier to model noise and uncertainty at sensor level than at track level
      – Less ad-hoc
      – Average case typically performs better, but often less robust
    • Generative model
      – 3D structure as means to map between different sensors
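As a small illustration of the "track then fuse" option, two per-sensor estimates of the same scalar quantity can be combined by inverse-variance weighting. This sketch assumes the two estimates have independent Gaussian errors, which often does not hold for real tracks; the function name is hypothetical.

```python
from typing import Tuple


def fuse_estimates(x1: float, var1: float, x2: float, var2: float) -> Tuple[float, float]:
    """Fuse two independent scalar estimates by inverse-variance weighting.

    Returns (fused_value, fused_variance). Independence of the two errors
    is an assumption of this sketch.
    """
    if var1 <= 0.0 or var2 <= 0.0:
        raise ValueError("variances must be positive")
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused_var = 1.0 / (w1 + w2)
    fused_x = fused_var * (w1 * x1 + w2 * x2)
    return fused_x, fused_var
```

For example, a lidar track reporting 10.0 m with variance 1.0 and a radar track reporting 12.0 m with variance 4.0 fuse to a value closer to the lidar estimate, with a variance smaller than either input.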
Promising New Methods 1
• Point Cloud Matching
  – ICP and extensions
  – Focused Distribution Estimation *
    • Divide and evaluate cells based on point cloud matching
    • Use result to estimate transform
* Robust Real-Time Tracking Combining 3D Shape, Motion, and Color; Held, Levinson, Thrun, Savarese
Promising New Methods 2
• Optimization based
  – KinectFusion
  – DART
    • 3D rigid-body, articulated model
    • Gradient-based matching using signed distance functions
    • Converge to local optima
* Dense Articulated Real-time Tracking; Schmidt, Newcombe, Fox
Summary
• Sensors have unique characteristics
  – Should be incorporated into tracking models
• Bayesian formulation of predict, update is at the heart of many different algorithms
  – A wide variety of representations and methods within this framework
  – New ideas in proposal (6 DOF work) and update (DART work) give real-time implementations