Examples with importance weights

Sometimes some examples are more important than others. Importance weights show up in boosting, differing train/test distributions, active learning, etc. John can reduce everything to importance weighted binary classification.

Principle: having an example with importance weight h should be equivalent to having the example h times in the dataset.

Learning with importance weights

[Figure, built up over several slides: a number line marking the current prediction $w_t^\top x$ and the label $y$. A standard gradient step moves the prediction by $-\eta(\nabla\ell)^\top x$ to $w_{t+1}^\top x$. Scaling that step by an importance weight of 6, $-6\eta(\nabla\ell)^\top x$, leaves the position of $w_{t+1}^\top x$ in question ("??"). The final frame labels the change in prediction between $w_t^\top x$ and $w_{t+1}^\top x$ as $s(h)\|x\|^2$.]
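The contrast in the figure can be checked numerically. The sketch below is an illustration rather than anything from the slides: it assumes the squared loss scaled as 0.5 * (p - y)^2, and the helper names (`dot`, `sgd_step`) and the numbers are hypothetical. It compares multiplying one gradient step by h = 6 with presenting the same example 6 times.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def sgd_step(w, x, y, eta, scale=1.0):
    """One gradient step on 0.5 * (w.x - y)**2, with the gradient multiplied by `scale`."""
    p = dot(w, x)
    return [wi - scale * eta * (p - y) * xi for wi, xi in zip(w, x)]

w0 = [0.0, 0.0]
x = [1.0, 1.0]   # ||x||^2 = 2
y = 1.0
eta = 0.2
h = 6

# Naive importance weighting: multiply the gradient by h.
w_scaled = sgd_step(w0, x, y, eta, scale=h)

# The principle: the example appears h times in the dataset.
w_repeat = list(w0)
for _ in range(h):
    w_repeat = sgd_step(w_repeat, x, y, eta)

p_scaled = dot(w_scaled, x)   # about 2.4: overshoots the label y = 1
p_repeat = dot(w_repeat, x)   # about 0.953: approaches y without passing it
print(p_scaled, p_repeat)
```

With these numbers the scaled step jumps well past the label, while the repeated steps shrink the residual by a factor of 0.6 each time and never cross it.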

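What is s(·)?

Losses for linear models: $\ell(w^\top x, y)$, with gradient

$$\nabla_w \ell = \frac{\partial \ell(p, y)}{\partial p}\, x.$$

The update must be given by

$$w_{t+1} = w_t - s(h)\, x.$$

$s(h)$ must satisfy

$$s(h + \epsilon) = s(h) + \epsilon\, \eta \left.\frac{\partial \ell(p, y)}{\partial p}\right|_{p = (w_t - s(h)x)^\top x}$$

$$s'(h) = \eta \left.\frac{\partial \ell(p, y)}{\partial p}\right|_{p = (w_t - s(h)x)^\top x}$$

Finally, $s(0) = 0$.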
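The differential equation for s(h) can be integrated numerically as a sanity check. This is a minimal sketch under stated assumptions: it uses plain Euler steps, the function names are hypothetical, and the loss is the squared loss scaled as 0.5 * (y - p)^2, for which solving the equation by hand gives s(h) = (p - y) / (x^T x) * (1 - exp(-h * eta * x^T x)).

```python
import math

def s_numeric(dloss_dp, p0, y, xx, eta, h, steps=20000):
    """Euler integration of s'(h) = eta * dl/dp at p = p0 - s * xx, starting from s(0) = 0.

    p0 is the current prediction w_t^T x and xx is x^T x.
    """
    s = 0.0
    dh = h / steps
    for _ in range(steps):
        p = p0 - s * xx                  # prediction of (w_t - s x)
        s += dh * eta * dloss_dp(p, y)
    return s

def dsquared(p, y):
    """Derivative with respect to p of 0.5 * (y - p)**2 (assumed scaling)."""
    return p - y

p0, y, xx, eta, h = 0.0, 1.0, 2.0, 0.2, 6.0

s_ode = s_numeric(dsquared, p0, y, xx, eta, h)
s_closed = (p0 - y) / xx * (1.0 - math.exp(-h * eta * xx))
p_new = p0 - s_closed * xx               # prediction after the update

print(s_ode, s_closed, p_new)
```

The two values of s agree closely, and the updated prediction stays on the same side of the label as the starting prediction.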

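Many loss functions

In every entry, $p = w_t^\top x$ is the current prediction.

Squared
  Loss: $\ell(p, y) = (y - p)^2$
  Update: $s(h) = \frac{p - y}{x^\top x}\left(1 - e^{-h\eta x^\top x}\right)$

Logistic
  Loss: $\ell(p, y) = \log(1 + e^{-yp})$
  Update: $s(h) = \frac{W\!\left(e^{h\eta x^\top x + yp + e^{yp}}\right) - h\eta x^\top x - e^{yp}}{y\, x^\top x}$ for $y \in \{-1, 1\}$

Exponential
  Loss: $\ell(p, y) = e^{-yp}$
  Update: $s(h) = \frac{py - \log\left(h\eta x^\top x + e^{py}\right)}{x^\top x\, y}$ for $y \in \{-1, 1\}$

Logarithmic
  Loss: $\ell(p, y) = y \log\frac{y}{p} + (1 - y)\log\frac{1 - y}{1 - p}$
  Update if $y = 0$: $s(h) = \frac{p - 1 + \sqrt{(p - 1)^2 + 2h\eta x^\top x}}{x^\top x}$
  Update if $y = 1$: $s(h) = \frac{p - \sqrt{p^2 + 2h\eta x^\top x}}{x^\top x}$

Hellinger
  Loss: $\ell(p, y) = (\sqrt{p} - \sqrt{y})^2 + (\sqrt{1 - p} - \sqrt{1 - y})^2$
  Update if $y = 0$: $s(h) = \frac{p - 1 + \frac{1}{4}\left(12h\eta x^\top x + 8(1 - p)^{3/2}\right)^{2/3}}{x^\top x}$
  Update if $y = 1$: $s(h) = \frac{p - \frac{1}{4}\left(12h\eta x^\top x + 8p^{3/2}\right)^{2/3}}{x^\top x}$

Hinge
  Loss: $\ell(p, y) = \max(0, 1 - yp)$
  Update: $s(h) = -y \min\left(h\eta, \frac{1 - yp}{x^\top x}\right)$ for $y \in \{-1, 1\}$

τ-Quantile
  Loss: $\ell(p, y) = \tau(y - p)$ if $y > p$; $(1 - \tau)(p - y)$ if $y \le p$
  Update if $y > p$: $s(h) = -\tau \min\left(h\eta, \frac{y - p}{\tau x^\top x}\right)$
  Update if $y \le p$: $s(h) = (1 - \tau)\min\left(h\eta, \frac{p - y}{(1 - \tau) x^\top x}\right)$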
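One way to probe these closed forms is the principle stated earlier: an update with importance weight h1 followed by an update with weight h2 should match a single update with weight h1 + h2. The sketch below checks this for the squared, exponential, and hinge entries. The function names and test values are hypothetical, and the hinge version assumes y * p <= 1 (an example already past the margin is not handled).

```python
import math

def s_squared(p, y, xx, eta, h):
    """Squared-loss update; xx is x^T x."""
    return (p - y) / xx * (1.0 - math.exp(-h * eta * xx))

def s_exponential(p, y, xx, eta, h):
    """Exponential-loss update; y must be -1 or +1."""
    return (p * y - math.log(h * eta * xx + math.exp(p * y))) / (xx * y)

def s_hinge(p, y, xx, eta, h):
    """Hinge-loss update; y must be -1 or +1, and this sketch assumes y * p <= 1."""
    return -y * min(h * eta, (1.0 - y * p) / xx)

def two_stage(s_fn, p, y, xx, eta, h1, h2):
    """Total s after an update with weight h1 followed by one with weight h2."""
    s1 = s_fn(p, y, xx, eta, h1)
    p1 = p - s1 * xx              # prediction after the first update
    s2 = s_fn(p1, y, xx, eta, h2)
    return s1 + s2

p, y, xx, eta = 0.0, 1.0, 2.0, 0.1
gaps = {}
for name, s_fn in [("squared", s_squared),
                   ("exponential", s_exponential),
                   ("hinge", s_hinge)]:
    one_shot = s_fn(p, y, xx, eta, 5.0)
    split = two_stage(s_fn, p, y, xx, eta, 2.0, 3.0)
    gaps[name] = abs(one_shot - split)
    print(name, one_shot, split)

# With a very large weight the hinge update stops at the margin y * p = 1.
p_margin = p - s_hinge(p, y, xx, eta, 50.0) * xx
```

The gaps are zero up to floating-point rounding, which is the behaviour the principle asks for.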

Robust results for unweighted problems

[Figure: four panels, each with axes labeled "standard" and "importance aware". Panels: astro - logistic loss; spam - quantile loss; rcv1 - squared loss; webspam - hinge loss.]

And now something completely different

Adaptive, individual learning rates in VW. It's really GD separately on each coordinate $i$ with

$$\eta_{t,i} = \frac{1}{\sqrt{\sum_{s=1}^{t} \left(\frac{\partial \ell(w_s^\top x_s, y_s)}{\partial w_{s,i}}\right)^2}}$$

Coordinate-wise scaling of the data is less of an issue. This can be stated formally (Duchi, Hazan, and Singer / McMahan and Streeter, COLT 2010).
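The per-coordinate rule can be sketched in a few lines. This is an illustration of the formula above, not VW's implementation: the function name `adaptive_gd`, the dense list representation, and the choice of loss (squared, scaled as 0.5 * (p - y)^2) are all assumptions.

```python
import math

def adaptive_gd(examples, dim, dloss_dp):
    """Gradient descent with a separate learning rate per coordinate.

    The rate for coordinate i at step t is 1 / sqrt(sum of squared partial
    derivatives with respect to w_i seen so far, including step t).
    """
    w = [0.0] * dim
    g2 = [0.0] * dim                      # running sum of squared gradients
    for x, y in examples:
        p = sum(wi * xi for wi, xi in zip(w, x))
        d = dloss_dp(p, y)
        for i, xi in enumerate(x):
            g = d * xi                    # partial derivative w.r.t. w_i
            if g == 0.0:
                continue                  # untouched coordinate: no update
            g2[i] += g * g
            w[i] -= g / math.sqrt(g2[i])
    return w, g2

def dsquared(p, y):
    """Derivative with respect to p of 0.5 * (p - y)**2 (assumed loss)."""
    return p - y

# One example whose two features differ in scale by a factor of 10.
w, g2 = adaptive_gd([([1.0, 10.0], 1.0)], dim=2, dloss_dp=dsquared)
print(w, g2)
```

On the first example each touched coordinate moves by exactly one unit regardless of how large its feature value is, which gives a feel for why coordinate-wise scaling matters less than with a single global rate.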

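Some tricks involved

Store the sum of squared gradients w.r.t. $w_i$ near $w_i$.

    #include <stdint.h>
    #include <string.h>

    /* Fast approximate 1/sqrt(x) for x > 0. */
    float InvSqrt(float x) {
      float xhalf = 0.5f * x;
      int32_t i;
      memcpy(&i, &x, sizeof i);        /* read the float's bits as an integer */
      i = 0x5f3759d5 - (i >> 1);       /* initial guess for Newton's method */
      memcpy(&x, &i, sizeof x);        /* convert the bits back to a float */
      x = x * (1.5f - xhalf * x * x);  /* one round of Newton's method */
      return x;
    }

The special SSE rsqrt instruction is a little better.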
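To get a feel for how accurate this bit trick is, the same steps can be emulated in Python. This is a sketch for illustration only: `inv_sqrt` is a hypothetical name, the `struct` module stands in for the pointer casts, and the arithmetic runs in double precision rather than single, so the errors are indicative rather than exact.

```python
import math
import struct

def inv_sqrt(x):
    """Approximate 1/sqrt(x) for positive x, mirroring the C routine step by step."""
    xhalf = 0.5 * x
    i = struct.unpack("<i", struct.pack("<f", x))[0]   # float bits as a 32-bit int
    i = 0x5f3759d5 - (i >> 1)                          # initial guess
    x = struct.unpack("<f", struct.pack("<i", i))[0]   # int bits back to a float
    return x * (1.5 - xhalf * x * x)                   # one Newton iteration

samples = [0.01, 1.0, 2.0, 100.0, 12345.678]
rel_errors = [abs(inv_sqrt(v) * math.sqrt(v) - 1.0) for v in samples]
for v, e in zip(samples, rel_errors):
    print(v, inv_sqrt(v), e)
```

A single Newton iteration already brings the relative error down to a fraction of a percent, which is adequate for scaling a learning rate.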

Experiments

Raw data

    ./vw --adaptive -b 24 --compressed -d tmp/spam_train.gz
    average loss = 0.02878

    ./vw -b 24 --compressed -d tmp/spam_train.gz -l 100
    average loss = 0.03267

TFIDF scaled data

    ./vw --adaptive -b 24 --compressed -d tmp/rcv1_train.gz -l 1
    average loss = 0.04079

    ./vw -b 24 --compressed -d tmp/rcv1_train.gz -l 256
    average loss = 0.04465
