What?

- vw can build polynomial decision surfaces: x ↦ 2 · 1[x1 x2 − 3 x1 x4 x5 + 7 x2 x4 x9 ≥ 0] − 1.
- Just add --stage_poly to your command line.
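A minimal invocation might look like the following sketch (train.vw is a placeholder filename for a dataset in vw's input format, not a file from the slides):

```shell
# Enable stagewise polynomial feature expansion on a dataset
# (train.vw is a hypothetical placeholder filename).
vw --stage_poly train.vw
```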

Why?

- Polynomial features help on many problems (e.g., see the Kaggle forums).
- vw already had (manual) polynomials of low degree: vw -q ff (..), vw --cubic fff (..).
- --stage_poly vs. --ksvm: different bias.
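Written out as full commands, the manual expansions above look like this (train.vw is a placeholder dataset filename; f stands for whatever namespace your data uses):

```shell
# Quadratic: cross namespace f with itself.
vw -q ff train.vw

# Cubic: three-way crosses of namespace f.
vw --cubic fff train.vw
```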

I

Does it work? Theory. Loss minimization guarantee.

Does it work? Theory. Loss minimization guarantee. Practice. . Relative error vs time tradeoff 1.0

linear quadratic cubic apple(0.125) apple(0.25) apple(0.5) apple(0.75) apple(1.0)

relative error

0.8 0.6 0.4 0.2 0.0 −0.2

0

10

1

10 relative time

2

10

Does it work? Theory. Loss minimization guarantee. Practice. . Relative error vs time tradeoff 1.0

linear quadratic cubic apple(0.125) apple(0.25) apple(0.5) apple(0.75) apple(1.0)

relative error

0.8 0.6 0.4 0.2 0.0 −0.2

0

10

1

10 relative time

2

10

More Info

NIPS 2014: Scalable Nonlinear Learning with Adaptive Polynomial Expansions. Alekh Agarwal, Alina Beygelzimer, Daniel Hsu, John Langford, Matus Telgarsky.

Usage

--stage_poly --sched_exponent arg1 --batch_sz arg2 --batch_sz_no_doubling
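Put together, a hedged example (the flag values and the train.vw filename are illustrative choices, not recommendations from the slides):

```shell
# Stagewise polynomial expansion with an explicit schedule. Roughly:
# --sched_exponent shapes how the supported feature set grows,
# --batch_sz sets how many examples pass between expansions, and
# --batch_sz_no_doubling keeps that batch size fixed instead of doubling.
vw --stage_poly --sched_exponent 0.25 --batch_sz 1000 --batch_sz_no_doubling train.vw
```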

Hacking

- Different types of features.
- Different support search.

More info @ github wiki & NIPS paper.
