Budget Optimization for Online Advertising Campaigns with Carryover Effects
Nikolay Archak, New York University, New York
Vahab Mirrokni, Google Research, New York
S. Muthukrishnan, Google Research, New York
Outline
Motivation
Model and Problem Formulation
Constrained MDP
Improved Greedy Algorithm
User Conversion Attribution
"Half the money I spend on advertising is wasted; the trouble is, I don’t know which half." (John Wanamaker)
Online advertising helps with measurements like click-through rate (CTR) and conversion rate (CR).
But CTR and CR do not capture some important aspects of ad effectiveness.
Beyond Last Click
Time      Search query                  Event
15:58:29  cheap Disney world vacation   impression
16:00:53  air fares                     impression
20:50:04  priceline com                 impression
20:50:08  priceline com                 click
20:57:24                                conversion
Attribute the conversion to the last event?
The last search might be triggered by previous ad impressions.
Go beyond the last click?
Data: Number of Searches Before Conversion
With 66% probability the user will perform one more search before conversion. With 9.6% probability the user will perform at least 10 more searches before conversion.
Support in Prior Research
1. Theoretical claims in the marketing literature (Keller, 1996).
2. A randomized experiment performed by Yahoo! and a major retailer (Lewis, Reiley, 2008): the campaign had a substantial impact even on users who merely viewed the ads.
3. comScore study (2008): an incremental lift of 27% in online sales, plus lift in other important online behaviors (brand site visitation, trademark searches).
4. Graph-based (Markov) models (Archak, Mirrokni, Muthukrishnan, WWW 2010).
Graph-based (Markov) Models
Archak, Mirrokni, Muthukrishnan, WWW 2010
Graph-based (Markov) models fit the data better.
Compute Adfactors using AdGraphs, and show their effectiveness.
Construct a graph for the set of transactions (event sequences):
  Nodes (states): events such as ad impression, ad click, search keyword, etc., plus special nodes such as a conversion node.
  Edges: between nodes representing consecutive events in the input.
  Edge weights: the frequency of pairs of events.
  Edge weights → transition probabilities between states in the Markov model.
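The graph construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_transition_probabilities`, the event labels, and the toy paths are hypothetical and do not come from the talk or the WWW 2010 paper.

```python
from collections import Counter, defaultdict


def build_transition_probabilities(sequences):
    """Turn event sequences into Markov transition probabilities.

    Edge weight = how often `dst` directly follows `src` in the input;
    each node's outgoing weights are then normalized to sum to 1.
    """
    edge_counts = defaultdict(Counter)
    for events in sequences:
        # Consecutive events in a sequence define one edge each.
        for src, dst in zip(events, events[1:]):
            edge_counts[src][dst] += 1

    probabilities = {}
    for src, counter in edge_counts.items():
        total = sum(counter.values())
        probabilities[src] = {dst: n / total for dst, n in counter.items()}
    return probabilities


# Hypothetical toy input; real paths would come from search and ad logs.
paths = [
    ["start", "generic_search", "retailer_search", "conversion"],
    ["start", "generic_search", "null"],
    ["start", "retailer_search", "conversion"],
    ["start", "generic_search", "generic_search", "null"],
]
P = build_transition_probabilities(paths)
print(P["start"])
```

In this toy input, three of the four paths move from `start` to `generic_search`, so that edge gets probability 0.75.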
Example: Markov Model and Budget Allocation
[Figure: Markov chain over the states Start, Null, Generic Search, Retailer Search, and Conversion, with a transition probability on each edge. A second version of the figure labels each edge with a pair of probabilities (for example 0.8/0.4), one per advertising action.]
Advertising actions may change transition probabilities, e.g. "advertise vs. not advertise" in each state may change the Markov model.
Budget Optimization Problem
Given: an MDP model with state space X and advertising levels A.
  In each state x ∈ X, the advertiser can take an advertising action a ∈ A.
  Each state x upon action a has a cost c(x, a), and each pair of states x and x′, upon action a in x, has a transition probability $P_{xax'}$.
Advertising policy: given a history of states and a time step t, determine an advertising action.
Each advertising policy incurs some total cost and results in some probability of conversion.
Goal: maximize the probability of conversion.
Constraint: for a budget V, total cost ≤ V.
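To make the two quantities in the problem statement concrete, the sketch below evaluates one fixed stationary policy on a toy MDP and returns its expected total cost and its conversion probability. Everything here is an assumption made for illustration: the two-state model, the action names `ad` and `no_ad`, the unit costs, and the function `evaluate_policy` are hypothetical, and the simple fixed-point iteration assumes every path eventually reaches an absorbing state.

```python
def evaluate_policy(states, policy, cost, trans, start, sweeps=10_000, tol=1e-12):
    """Return (expected total cost, conversion probability) from `start`.

    trans[(x, a)] maps next states to probabilities. The absorbing outcomes
    "conversion" and "null" are not listed in `states`.
    """
    spend = {x: 0.0 for x in states}
    conv = {x: 0.0 for x in states}
    for _ in range(sweeps):
        delta = 0.0
        for x in states:
            a = policy[x]
            nxt = trans[(x, a)]
            # Cost paid now plus expected cost from the next state onward.
            new_spend = cost[(x, a)] + sum(p * spend.get(y, 0.0) for y, p in nxt.items())
            # Probability of eventually reaching "conversion".
            new_conv = sum(
                p * (1.0 if y == "conversion" else conv.get(y, 0.0))
                for y, p in nxt.items()
            )
            delta = max(delta, abs(new_spend - spend[x]), abs(new_conv - conv[x]))
            spend[x], conv[x] = new_spend, new_conv
        if delta < tol:
            break
    return spend[start], conv[start]


# Hypothetical toy model: advertising raises the chance of moving forward.
STATES = ["start", "search"]
TRANS = {
    ("start", "no_ad"): {"search": 0.5, "null": 0.5},
    ("start", "ad"): {"search": 0.8, "null": 0.2},
    ("search", "no_ad"): {"conversion": 0.1, "null": 0.9},
    ("search", "ad"): {"conversion": 0.4, "null": 0.6},
}
COST = {(x, a): (1.0 if a == "ad" else 0.0) for (x, a) in TRANS}

always_ad = {"start": "ad", "search": "ad"}
never_ad = {"start": "no_ad", "search": "no_ad"}
cost_ad, conv_ad = evaluate_policy(STATES, always_ad, COST, TRANS, "start")
cost_no, conv_no = evaluate_policy(STATES, never_ad, COST, TRANS, "start")
print(cost_ad, conv_ad, cost_no, conv_no)
```

In this toy model, always advertising costs more in expectation and converts more often than never advertising. The budget V decides which trade-off is affordable.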
This paper: Budget Optimization with Positive Carryover Effects
An LP for the optimal advertising policy.
  Apply classical results from constrained MDPs to this setting.
An improved greedy algorithm in settings with positive carryover effects.
  Def: More advertising never hurts.
  Proof: Monotonicity and structural properties of the dual value function.
  Advantage: Simple algorithm that can be parallelized with MapReduce and has better running time.
Simulation validation.
  Compare the improved greedy algorithm, the LP algorithm, and the baseline greedy algorithm.
  The improved greedy algorithm performs almost the same as the LP (without assumptions).
  Both have a 5-10% improvement over the baseline greedy algorithm.
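Constrained MDP
The optimal policy is a Markov policy.
Markov policies ⇔ Occupancy measures ⇔ Stationary policies: using conservation flow linear equations.
\[
\begin{aligned}
\text{[P2]}\qquad \max_{\rho}\quad & \sum_{x \in X'} \sum_{a \in A} r(x,a)\,\rho(x,a) \\
\text{s.t.}\quad & \sum_{x \in X'} \sum_{a \in A} d(x,a)\,\rho(x,a) \le V \\
& \sum_{y \in X'} \sum_{a \in A} \rho(y,a)\,\bigl(\delta_x(y) - P_{yax}\bigr) = \beta(x) \quad \forall x \in X' \\
& \rho(x,a) \ge 0 \quad \forall x \in X',\ a \in A.
\end{aligned}
\]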
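Constrained MDP: Primal and Dual LP
Primal:
\[
\begin{aligned}
\text{[P2]}\qquad \max_{\rho}\quad & \sum_{x \in X'} \sum_{a \in A} r(x,a)\,\rho(x,a) \\
\text{s.t.}\quad & \sum_{x \in X'} \sum_{a \in A} d(x,a)\,\rho(x,a) \le V \\
& \sum_{y \in X'} \sum_{a \in A} \rho(y,a)\,\bigl(\delta_x(y) - P_{yax}\bigr) = \beta(x) \quad \forall x \in X' \\
& \rho(x,a) \ge 0 \quad \forall x \in X',\ a \in A.
\end{aligned}
\]
Dual:
\[
\begin{aligned}
\text{[P3]}\qquad \min_{\pi,\lambda}\quad & \sum_{x \in X'} \beta(x)\,\pi(x) + \lambda V \\
\text{s.t.}\quad & \lambda \ge 0 \\
& \pi(x) \ge r(x,a) - \lambda\, d(x,a) + \sum_{y \in X'} P_{xay}\,\pi(y) \quad \forall x \in X',\ a \in A.
\end{aligned}
\]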
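Constrained MDP: Modified Dual
\[
\begin{aligned}
\text{[P3]}\qquad \min_{\pi,\lambda}\quad & \sum_{x \in X'} \beta(x)\,\pi(x) + \lambda V \\
\text{s.t.}\quad & \lambda \ge 0 \\
& \pi(x) \ge r(x,a) - \lambda\, d(x,a) + \sum_{y \in X'} P_{xay}\,\pi(y) \quad \forall x \in X',\ a \in A.
\end{aligned}
\]
Fixing $\lambda$ and writing $r_\lambda(x,a) = r(x,a) - \lambda\, d(x,a)$ gives:
\[
\begin{aligned}
\text{[P3($\lambda$)]}\qquad \min_{\pi_\lambda}\quad & \sum_{x \in X'} \beta(x)\,\pi_\lambda(x) \\
\text{s.t.}\quad & \pi_\lambda(x) \ge r_\lambda(x,a) + \sum_{y \in X'} P_{xay}\,\pi_\lambda(y) \quad \forall x \in X',\ a \in A.
\end{aligned}
\]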
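Positive Carryover Effects
\[
\begin{aligned}
\text{[P3($\lambda$)]}\qquad \min_{\pi_\lambda}\quad & \sum_{x \in X'} \beta(x)\,\pi_\lambda(x) \\
\text{s.t.}\quad & \pi_\lambda(x) \ge r_\lambda(x,a) + \sum_{y \in X'} P_{xay}\,\pi_\lambda(y) \quad \forall x \in X',\ a \in A.
\end{aligned}
\]
Assumption: More advertising never hurts, ...
$f_\beta(\lambda)$ = optimum value of P3($\lambda$).
Lemma (Structure of Dual Value Function): $f_\beta(\lambda)$ is a piecewise linear continuous function. Moreover, the slope of $f_\beta$ at any particular $\lambda$ is equal to $-\beta^{T}(I - P_\lambda)^{-1} d_\lambda$ ...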
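The lemma can be checked numerically on a small example. The sketch below solves P3(λ) by value iteration, assuming that the LP optimum coincides with the Bellman fixed point for the penalized reward r(x, a) − λ d(x, a), which is the standard situation when every path is eventually absorbed. It then compares a finite-difference slope of f_β with minus the expected cost of the greedy policy. The toy MDP, the use of the one-step conversion probability as the reward r, and the name `solve_p3_lambda` are hypothetical choices made for illustration.

```python
def solve_p3_lambda(states, actions, r, d, P, beta, lam, sweeps=10_000, tol=1e-12):
    """Return (f_beta(lam), slope, greedy policy) for a fixed multiplier lam."""

    def backup(x, a, v, payoff):
        # One-step payoff plus expected value of the next (non-absorbing) state.
        return payoff[(x, a)] + sum(p * v.get(y, 0.0) for y, p in P[(x, a)].items())

    def fixed_point(update):
        v = {x: 0.0 for x in states}
        for _ in range(sweeps):
            delta = 0.0
            for x in states:
                new = update(x, v)
                delta = max(delta, abs(new - v[x]))
                v[x] = new
            if delta < tol:
                break
        return v

    r_lam = {key: r[key] - lam * d[key] for key in r}  # penalized reward
    pi = fixed_point(lambda x, v: max(backup(x, a, v, r_lam) for a in actions))
    policy = {x: max(actions, key=lambda a: backup(x, a, pi, r_lam)) for x in states}
    # Expected total cost of the greedy policy, state by state.
    spend = fixed_point(lambda x, v: backup(x, policy[x], v, d))
    f = sum(beta[x] * pi[x] for x in states)
    slope = -sum(beta[x] * spend[x] for x in states)
    return f, slope, policy


# Hypothetical toy MDP; absorbing outcomes are simply omitted from STATES.
STATES, ACTIONS = ["start", "search"], ["no_ad", "ad"]
P = {
    ("start", "no_ad"): {"search": 0.5},
    ("start", "ad"): {"search": 0.8},
    ("search", "no_ad"): {},
    ("search", "ad"): {},
}
r = {("start", "no_ad"): 0.0, ("start", "ad"): 0.0,
     ("search", "no_ad"): 0.1, ("search", "ad"): 0.4}
d = {key: (1.0 if key[1] == "ad" else 0.0) for key in P}
beta = {"start": 1.0, "search": 0.0}

f1, slope1, policy1 = solve_p3_lambda(STATES, ACTIONS, r, d, P, beta, 0.05)
f2, _, _ = solve_p3_lambda(STATES, ACTIONS, r, d, P, beta, 0.06)
print(f1, slope1, (f2 - f1) / 0.01)
```

Both multipliers lie on the same linear piece here, so the finite difference and the cost-based slope agree.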
Greedy Algorithm
Algorithm: Iteratively and greedily, find a sequence of $\lambda_i$'s for which we solve P3($\lambda$).
Lemma: At most |X| × |A| $\lambda_i$'s are relevant.
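The slide does not spell out how the λ_i's are found, so the following is not the authors' algorithm. It is a generic illustration of why only a few multipliers matter: if each candidate policy is summarized by a hypothetical pair (reward R, cost C), the dual value is the upper envelope of the lines R − λC, and walking that envelope from λ = 0 visits one breakpoint per change of policy. The policy names and numbers are made up for the example.

```python
def envelope_breakpoints(policies):
    """Walk the upper envelope of the lines R - lam * C, starting at lam = 0.

    policies: dict mapping a policy name to (reward, cost).
    Returns a list of (lam, name): `name` maximizes R - lam * C from `lam`
    until the next breakpoint.
    """
    # At lam = 0 the best line has the highest reward (cheaper one on ties).
    current = max(policies, key=lambda n: (policies[n][0], -policies[n][1]))
    walk = [(0.0, current)]
    while True:
        r_i, c_i = policies[current]
        # Only strictly cheaper policies can overtake as lam grows.
        crossings = [
            ((r_i - r_j) / (c_i - c_j), c_j, name)
            for name, (r_j, c_j) in policies.items()
            if c_j < c_i
        ]
        if not crossings:
            return walk
        # Earliest crossing wins; on ties prefer the cheapest policy.
        lam, _, current = min(crossings)
        walk.append((lam, current))


# Hypothetical (conversion probability, expected cost) pairs.
candidates = {
    "always_ad": (0.32, 1.8),
    "ad_on_search_only": (0.20, 0.5),
    "ad_on_start_only": (0.08, 1.0),
    "never_ad": (0.05, 0.0),
}
walk = envelope_breakpoints(candidates)
for lam, name in walk:
    print(round(lam, 4), name, candidates[name])
```

For a budget V, the first policy along the walk whose cost is at most V marks where the slope of the dual objective changes sign. In this example `ad_on_start_only` never appears, because it is dominated for every λ.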
Experimental Setup
Input: paths to conversions, i.e. time-stamped sequences of search clicks leading to conversions.
Transition probabilities?
  "advertise": frequency of consecutive events with a short time gap.
  "not advertise": frequency of consecutive events with a large time gap.
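A minimal sketch of this estimation rule follows. The cutoff between a "short" and a "large" gap is not given on the slide, so `gap_threshold` is a hypothetical parameter, as are the function name `estimate_by_gap`, the event labels, and the toy paths.

```python
from collections import Counter, defaultdict


def estimate_by_gap(paths, gap_threshold):
    """Estimate one transition table per action from time-stamped paths.

    paths: list of [(timestamp_in_seconds, event), ...].
    A consecutive pair counts toward "advertise" when its time gap is at most
    `gap_threshold` seconds, and toward "not_advertise" otherwise.
    """
    counts = {
        "advertise": defaultdict(Counter),
        "not_advertise": defaultdict(Counter),
    }
    for path in paths:
        for (t0, src), (t1, dst) in zip(path, path[1:]):
            action = "advertise" if t1 - t0 <= gap_threshold else "not_advertise"
            counts[action][src][dst] += 1

    tables = {}
    for action, by_src in counts.items():
        tables[action] = {
            src: {dst: n / sum(c.values()) for dst, n in c.items()}
            for src, c in by_src.items()
        }
    return tables


# Hypothetical toy paths, all ending in a conversion as in the input above.
toy_paths = [
    [(0, "generic"), (30, "retailer"), (60, "conversion")],
    [(0, "generic"), (4000, "retailer"), (4030, "conversion")],
    [(0, "generic"), (20, "generic"), (5000, "retailer"), (5010, "conversion")],
]
tables = estimate_by_gap(toy_paths, gap_threshold=600)
print(tables["advertise"]["generic"])
print(tables["not_advertise"]["generic"])
```

The estimates depend directly on the threshold, so it is worth checking how they change when the cutoff is varied.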
Experimental Evaluation
Summary
Budget optimization as a constrained MDP, leading to an LP formulation.
An improved greedy algorithm in settings with positive carryover effects.
Simulation validation.
Future experiments: Markov model on conversion paths and a sample of non-conversion paths.
Thank You!