Building high-level features using large scale unsupervised learning

Quoc V. Le

Stanford University and Google

Joint work with: Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg Corrado, Jeff Dean, Andrew Y. Ng

Hierarchy of feature representations

Face detectors

Face parts (combination of edges)

Edges

Pixels

Lee et al., 2009. Sparse DBNs.

Faces  

Random images from the Internet

Key results

Face detector

Human body detector

Cat detector

Algorithm  

[Figure: three stacked sparse autoencoders above the input image]

Each RICA layer = 1 filtering layer + pooling layer + local contrast normalization layer

See Le et al., NIPS 2011 and Le et al., CVPR 2011 for applications to action recognition, object recognition, and biomedical imaging

Very large model -> Cannot fit in a single machine -> Model parallelism, Data parallelism
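The layer structure above (filtering, then pooling, then local contrast normalization) can be sketched in NumPy. This is a toy illustration under stated assumptions, not the model from the talk: the filters are random rather than learned, the same filters are applied at every position, pooling is assumed to be L2 pooling over non-overlapping windows, and the normalization uses a small box neighborhood. The function names `filter_layer`, `l2_pool`, `box_mean`, `contrast_normalize`, and `rica_like_layer` are hypothetical.

```python
import numpy as np

def filter_layer(image, filters):
    """Valid 2-D correlation of one image with a bank of k x k filters."""
    k = filters.shape[1]
    h, w = image.shape
    out = np.empty((filters.shape[0], h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = image[i:i + k, j:j + k]
            # One response per filter at this location.
            out[:, i, j] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))
    return out

def l2_pool(maps, size=2, eps=1e-8):
    """Assumed L2 pooling: sqrt of summed squares over size x size windows."""
    n, h, w = maps.shape
    h, w = h - h % size, w - w % size  # crop to a multiple of the window
    blocks = maps[:, :h, :w].reshape(n, h // size, size, w // size, size)
    return np.sqrt((blocks ** 2).sum(axis=(2, 4)) + eps)

def box_mean(maps, radius):
    """Mean over a (2*radius+1)^2 neighborhood, using edge padding."""
    n, h, w = maps.shape
    padded = np.pad(maps, ((0, 0), (radius, radius), (radius, radius)), mode="edge")
    total = np.zeros_like(maps)
    for di in range(2 * radius + 1):
        for dj in range(2 * radius + 1):
            total += padded[:, di:di + h, dj:dj + w]
    return total / (2 * radius + 1) ** 2

def contrast_normalize(maps, radius=1, eps=1e-2):
    """Subtract the local mean, then divide by the local standard deviation."""
    centered = maps - box_mean(maps, radius)
    local_std = np.sqrt(box_mean(centered ** 2, radius))
    return centered / np.maximum(local_std, eps)

def rica_like_layer(image, filters):
    """Filtering -> pooling -> local contrast normalization."""
    return contrast_normalize(l2_pool(filter_layer(image, filters)))

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))    # stand-in for an input image
filters = rng.standard_normal((4, 5, 5)) # random, not learned, filters
features = rica_like_layer(image, filters)
print(features.shape)  # (4, 6, 6)
```

Stacking would feed `features` into another layer of the same kind; that step is omitted here because it needs filters that span several input maps.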

Local receptive field networks

[Figure: Image at the bottom, Features at the top, split across Machine #1, Machine #2, Machine #3, and Machine #4]

Le et al., Tiled Convolutional Neural Networks. NIPS 2010

Asynchronous Parallel SGDs

Parameter server
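A minimal sketch of the idea, assuming a toy linear-regression problem and using Python threads as stand-ins for machines. It is not the system used in the talk. Each worker repeatedly pulls the current parameters, computes a gradient on its own data shard, and pushes the update without waiting for the other workers. The names `ParameterServer`, `pull`, `push`, and `worker` are hypothetical.

```python
import threading
import numpy as np

class ParameterServer:
    """Holds the shared parameters; workers pull copies and push gradients."""

    def __init__(self, dim, lr):
        self.params = np.zeros(dim)
        self.lr = lr
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return self.params.copy()

    def push(self, grad):
        # Applied whenever a gradient arrives; it may be based on stale params.
        with self._lock:
            self.params -= self.lr * grad

def worker(server, X, y, steps, batch, seed):
    """Run SGD on one data shard, talking only to the parameter server."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        w = server.pull()
        idx = rng.integers(0, len(X), size=batch)
        err = X[idx] @ w - y[idx]
        grad = X[idx].T @ err / batch  # gradient of 0.5 * mean squared error
        server.push(grad)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])            # synthetic ground truth
X = rng.standard_normal((4000, 3))
y = X @ true_w
shards = np.array_split(np.arange(len(X)), 4)  # one data shard per worker

server = ParameterServer(dim=3, lr=0.05)
threads = [
    threading.Thread(target=worker, args=(server, X[s], y[s], 500, 32, i))
    for i, s in enumerate(shards)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(np.round(server.params, 2))
```

No worker waits for another, so a pushed gradient can be computed from parameters that have since changed. On this small, well-conditioned problem the staleness does not prevent convergence.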

Training  

[Figure: three stacked sparse autoencoders above the input image]

Dataset: 10 million 200x200 unlabeled images from YouTube/Web

Train on 1000 machines (16000 cores) for 1 week

1.15 billion parameters
- 100x larger than previously reported
- Small compared to visual cortex

Top stimuli from the test set

Optimal stimulus via optimization

Face detector
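One way to compute an optimal stimulus is projected gradient ascent on the input: repeatedly step in the direction that increases the neuron's response, then rescale the input to a fixed norm. The sketch below assumes a unit L2-norm constraint (the slide does not state the constraint) and uses a hypothetical toy neuron, a single softplus unit, in place of a trained network. The gradient is taken by finite differences so that `optimal_stimulus` works with any scalar-valued function.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    """Central-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

def optimal_stimulus(f, dim, steps=200, lr=0.5, seed=0):
    """Projected gradient ascent: maximize f(x) subject to ||x||_2 = 1."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    x /= np.linalg.norm(x)
    for _ in range(steps):
        x = x + lr * numerical_gradient(f, x)
        x /= np.linalg.norm(x)  # project back onto the unit sphere
    return x

# Hypothetical toy "neuron": softplus of a fixed linear filter w.
w = np.random.default_rng(1).standard_normal(16)

def neuron(x):
    return np.logaddexp(0.0, w @ x)  # softplus(w . x)

x_star = optimal_stimulus(neuron, dim=16)
cosine = float(x_star @ (w / np.linalg.norm(w)))
print(cosine)
```

For this toy neuron the response increases with `w @ x`, so the maximizer on the unit sphere is `w / ||w||`, and the printed cosine similarity serves as a check on the procedure.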

Face detector

Human body detector

Cat detector

[Figure: histogram of feature values for faces and for random distractors; x-axis: feature value, y-axis: frequency]
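One way to read a histogram of feature values for faces and random distractors is to ask how well a single threshold on the feature separates the two groups. The sketch below uses synthetic feature values, not data from the talk, and the helper name `best_threshold` is hypothetical.

```python
import numpy as np

def best_threshold(pos, neg):
    """Return the threshold on a scalar feature with the highest accuracy.

    Values above the threshold are predicted positive (e.g. face).
    """
    values = np.sort(np.concatenate([pos, neg]))
    candidates = (values[:-1] + values[1:]) / 2  # midpoints between sorted values
    best_t, best_acc = None, -1.0
    for t in candidates:
        correct = np.sum(pos > t) + np.sum(neg <= t)
        acc = correct / (len(pos) + len(neg))
        if acc > best_acc:
            best_t, best_acc = float(t), float(acc)
    return best_t, best_acc

# Synthetic stand-ins for the neuron's response on the two groups.
rng = np.random.default_rng(0)
faces = rng.normal(loc=1.0, scale=0.5, size=500)
distractors = rng.normal(loc=-1.0, scale=0.5, size=500)

# Histograms over shared bins, as in a frequency vs. feature value plot.
lo, hi = min(faces.min(), distractors.min()), max(faces.max(), distractors.max())
face_counts, edges = np.histogram(faces, bins=20, range=(lo, hi))
distractor_counts, _ = np.histogram(distractors, bins=20, range=(lo, hi))

threshold, accuracy = best_threshold(faces, distractors)
print(threshold, accuracy)
```

The less the two histograms overlap, the closer the best single-threshold accuracy gets to 1.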

Invariance properties

[Figure: feature response plotted against horizontal shifts (0 to 20 pixels), vertical shifts (0 to 20 pixels), 3D rotation angle (0° to 90°), and scale factor (0.4x to 1.6x)]

ImageNet classification

20,000 categories, 16,000,000 images

Hand-engineered features (SIFT, HOG, LBP), Spatial pyramid, SparseCoding/Compression, Kernel SVMs

20,000 is a lot of categories…

…
smoothhound, smoothhound shark, Mustelus mustelus
American smooth dogfish, Mustelus canis
Florida smoothhound, Mustelus norrisi
whitetip shark, reef whitetip shark, Triaenodon obseus
Atlantic spiny dogfish, Squalus acanthias
Pacific spiny dogfish, Squalus suckleyi
hammerhead, hammerhead shark
smooth hammerhead, Sphyrna zygaena
smalleye hammerhead, Sphyrna tudes
shovelhead, bonnethead, bonnet shark, Sphyrna tiburo
angel shark, angelfish, Squatina squatina, monkfish
electric ray, crampfish, numbfish, torpedo
smalltooth sawfish, Pristis pectinatus
guitarfish
roughtail stingray, Dasyatis centroura
butterfly ray
eagle ray
spotted eagle ray, spotted ray, Aetobatus narinari
cownose ray, cow-nosed ray, Rhinoptera bonasus
manta, manta ray, devilfish
Atlantic manta, Manta birostris
devil ray, Mobula hypostoma
grey skate, gray skate, Raja batis
little skate, Raja erinacea
…

Stingray

Manta ray

Random guess: 0.005%

State-of-the-art (Weston, Bengio '11): 9.5%

Feature learning from raw pixels: ?

Random guess: 0.005%

State-of-the-art (Weston, Bengio '11): 9.5%

Feature learning from raw pixels: 15.8%

ImageNet 2009 (10k categories): best published result: 17% (Sanchez & Perronnin '11), our method: 19%

Using only 1000 categories, our method > 50%
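The random-guess figure follows directly from the number of categories, and the size of the improvement can be read off the two accuracies. The short calculation below uses only the numbers on these slides; the variable names are illustrative.

```python
# Chance accuracy when guessing uniformly among 20,000 categories.
num_categories = 20_000
random_guess = 1 / num_categories
print(f"{random_guess:.3%}")   # 0.005%

# Relative improvement of 15.8% over the previous 9.5%.
previous, ours = 9.5, 15.8
relative_gain = (ours - previous) / previous
print(f"{relative_gain:.1%}")  # 66.3%
```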

Feature  1  

Feature  2  

Feature  3  

Feature  4  

Feature  5  

Feature  6  

Feature  7    

Feature  8  

Feature  9  

Feature  10  

Feature  11    

Feature  12  

Feature  13  

Conclusions
•  RICA learns invariant features
•  Face neuron with totally unlabeled data, with enough training and data
•  State-of-the-art performances on
   –  Action Recognition
   –  Cancer image classification
   –  ImageNet

[Figure panels: Cancer classification; Feature visualization; Action recognition; Face neuron; ImageNet (random guess 0.005%, best published result 9.5%, our method 15.8%)]

Joint work with

Kai Chen, Greg Corrado, Jeff Dean, Matthieu Devin, Rajat Monga, Andrew Ng, Marc’Aurelio Ranzato, Paul Tucker, Ke Yang

Additional Thanks:

Samy Bengio, Zhenghao Chen, Tom Dean, Pangwei Koh, Mark Mao, Jiquan Ngiam, Patrick Nguyen, Andrew Saxe, Mark Segal, Jon Shlens, Vincent Vanhoucke, Xiaoyun Wu, Peng Xe, Serena Yeung, Will Zou

References
•  Q.V. Le, M.A. Ranzato, R. Monga, M. Devin, G. Corrado, K. Chen, J. Dean, A.Y. Ng. Building high-level features using large-scale unsupervised learning. ICML, 2012.
•  Q.V. Le, J. Ngiam, Z. Chen, D. Chia, P. Koh, A.Y. Ng. Tiled Convolutional Neural Networks. NIPS, 2010.
•  Q.V. Le, W.Y. Zou, S.Y. Yeung, A.Y. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. CVPR, 2011.
•  Q.V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, A.Y. Ng. On optimization methods for deep learning. ICML, 2011.
•  Q.V. Le, A. Karpenko, J. Ngiam, A.Y. Ng. ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning. NIPS, 2011.
•  Q.V. Le, J. Han, J. Gray, P. Spellman, A. Borowsky, B. Parvin. Learning Invariant Features for Tumor Signatures. ISBI, 2012.
•  I.J. Goodfellow, Q.V. Le, A.M. Saxe, H. Lee, A.Y. Ng. Measuring invariances in deep networks. NIPS, 2009.

http://ai.stanford.edu/~quocle
