The YouTube-8M Kaggle Competition: Challenges and Methods
Haosheng Zou*, Kun Xu*, Jialian Li, Jun Zhu
Presented by: Yinpeng Dong
All from Tsinghua University
2017.7.26
Contents
■ Introduction & Definition
■ Challenges
■ Our Methods & Results
■ Other Methods
◻ 5M (or 6M) training videos, 225 frames / video, 1024 (+128) dimension features / frame.
◻ Disk I/O in each mini-batch.
◻ Validation takes several (~10) hours.
■ Downsample; smaller validation set; …
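To make the scale concrete, the figures above can be multiplied out. The following is a rough back-of-the-envelope sketch, not a measurement: it uses the 5M figure, and the storage size of one byte per feature value is an assumption that the slides do not state. The 1-in-5 downsampling rate is the one given on the "Our Methods, High-Level" slide.

```python
# Back-of-the-envelope scale estimate from the numbers on the slide.
videos = 5_000_000            # "5M (or 6M) training videos"
frames_per_video = 225        # "225 frames / video"
dims_per_frame = 1024 + 128   # "1024 (+128) dimension features / frame"

values_per_video = frames_per_video * dims_per_frame
values_total = videos * values_per_video

# ASSUMPTION: one byte per stored feature value (not stated in the slides).
BYTES_PER_VALUE = 1
terabytes = values_total * BYTES_PER_VALUE / 1e12

# Keeping 1 frame in every 5 cuts the sequence length proportionally.
downsampled_frames = frames_per_video // 5

print(f"feature values per video: {values_per_video:,}")
print(f"feature values in total:  {values_total:,}")
print(f"approx. size at 1 byte/value: {terabytes:.2f} TB")
print(f"frames per video after 1-in-5 downsampling: {downsampled_frames}")
```

Under these assumptions every epoch has to stream on the order of a trillion feature values from disk, which is why downsampling and a smaller validation set are attractive.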
2. Noisy Labels:
◻ Rule-based annotated labels, not crowdsourcing
◻ 14.5% recall w.r.t. crowdsourcing, positive→negative
◻ Negative dominates; learning the annotation system
■ Ensemble; more randomness; …
Challenges (cont.)
3. Lack of Supervision:
◻ No information about each frame.
◻ Only video-level supervision for the whole model.
■ Attention; auto-encoders; …
4. Temporal Dependencies:
◻ The frame features haven't yet taken temporal dependencies into account.
◻ Humans can still understand videos at 1 fps.
■ RNNs; clustering-based models (e.g. VLAD); …
Our Methods, High-Level
■ Random cropping: Take 1 frame every 5 frames
◻ Rougher temporal dependencies
◻ Only the start index is randomized
■ Multi-Crop Ensemble:
◻ One model, varying the start index
◻ Uniformly averaging
■ Early Stopping:
◻ Fix 5 epochs of training at most
◻ Train directly on training and validation sets.
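The random cropping and multi-crop ensemble described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: `toy_predict` is a hypothetical stand-in for the trained model, and the feature dimension is shrunk to 8 to keep the example small.

```python
import numpy as np

STRIDE = 5  # "Take 1 frame every 5 frames"


def random_crop(frames, rng, stride=STRIDE):
    """Training-time crop: only the start index is random."""
    start = int(rng.integers(0, stride))  # start index in [0, stride)
    return frames[start::stride]


def multi_crop_predict(predict_fn, frames, stride=STRIDE):
    """Test-time ensemble: one model, every start index, uniform average."""
    preds = [predict_fn(frames[start::stride]) for start in range(stride)]
    return np.mean(preds, axis=0)


def toy_predict(crop):
    # Hypothetical stand-in for the trained model: a sigmoid of the
    # mean frame feature, giving one score per feature dimension.
    return 1.0 / (1.0 + np.exp(-crop.mean(axis=0)))


rng = np.random.default_rng(0)
video = rng.normal(size=(225, 8))  # 225 frames; 8 dims instead of 1024 + 128

crop = random_crop(video, rng)                   # what one training step sees
scores = multi_crop_predict(toy_predict, video)  # averaged test-time scores
```

With 225 frames and a stride of 5, every crop has 45 frames regardless of the start index, so the five crops partition the video and the ensemble sees each frame exactly once.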
Our Methods, Model
■ Prototype: stacked LSTM (1024-1024) + LR / 2MoE
■ Layer Normalization
■ Late Fusion
Our Methods (cont.)
■ Attention
■ Bidirectional LSTM
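The slide names attention without detail, so the following is only a generic sketch of attention pooling over per-frame recurrent states, not the architecture the team used. The scoring vector `w`, the shapes, and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_pool(hidden, w):
    """Collapse per-frame states into one video-level vector.

    hidden: (T, D) array of per-frame states (e.g. LSTM outputs).
    w:      (D,) scoring vector; hypothetical, learned in a real model.
    """
    scores = hidden @ w        # one relevance score per frame, shape (T,)
    weights = softmax(scores)  # non-negative, sums to 1
    return weights @ hidden, weights  # weighted sum (D,), weights (T,)


rng = np.random.default_rng(0)
hidden = rng.normal(size=(45, 16))  # 45 frames, 16-dim states (toy sizes)
w = rng.normal(size=16)

video_vec, weights = attention_pool(hidden, w)
```

Because the weights are learned from video-level labels only, this kind of pooling is one way to let the model decide which frames matter when no per-frame supervision is available.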
Our Results
Other Methods
■ Separating Tasks
◻ Different frame understanding block, thus different video descriptor for each meta-task
◻ 25 verticals as meta-tasks, too slow (15 examples / s)
■ Loss Manipulation
◻ Ignore negative labels when predicted confidence < 0.15
■ Unsupervised Representation Learning
◻ Using visual to reconstruct both visual and audio features
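The loss-manipulation rule above can be read as masking part of a per-label binary cross-entropy. The sketch below follows the slide's wording literally (drop negative-label terms whose predicted confidence is below 0.15). Averaging over the remaining terms is an assumption, as are the function name and the toy inputs.

```python
import numpy as np

THRESHOLD = 0.15  # from the slide: "predicted confidence < 0.15"


def masked_bce(probs, labels, threshold=THRESHOLD, eps=1e-7):
    """Binary cross-entropy that skips low-confidence negative labels.

    probs:  predicted probabilities in [0, 1].
    labels: 0/1 ground-truth labels of the same shape.
    Returns (loss, keep), where keep marks the terms that were counted.
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    bce = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    ignored = (labels == 0) & (probs < threshold)
    keep = ~ignored
    # ASSUMPTION: average over the kept terms only.
    loss = float((bce * keep).sum() / max(int(keep.sum()), 1))
    return loss, keep


probs = np.array([0.90, 0.05, 0.40, 0.10])
labels = np.array([1.0, 0.0, 0.0, 1.0])

loss, keep = masked_bce(probs, labels)
```

In this toy input only the second entry (a negative label predicted at 0.05) is masked; the positive label predicted at 0.10 still contributes, because the rule applies to negative labels only.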