Overlapping Experiment Infrastructure: More, Better, Faster
Diane Tang, Ashish Agarwal, Mike Meyer, Deirdre O'Brien

Why run experiments?
  - Experiments: live traffic = incoming search queries
  - Experiments vs. experiment groups
  - Gather data on the impact of changes: how do users behave differently, if at all?
  - Test everything!
  - Data-driven decisions:
      - UI
      - Algorithms, e.g. CTR prediction
          - How many passes over the data
          - Date range
          - Different machine learning algorithms

Why run so many experiments?
Goal: maintain innovation while growing
  - More:
      - More simultaneous experiments
      - More variety in the types of experiments supported
  - Better:
      - Valid experiments
      - Robust experiment design
  - Faster:
      - Easy and quick experiment set-up
      - Experimental data available quickly and automatically
      - Quick iteration

Why is running so many experiments hard?
Infinite traffic, right? Wrong!
  - High variability of metrics
      - English vs. Swahili
      - "flowers" vs. "who said 'if i had the time, this letter would be shorter'"
  - Changes with a low trigger rate, e.g., weather information
  - Consequence: experiments need a lot of traffic to get statistically significant results in a reasonable timeframe
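To make the traffic consequence concrete, here is a rough sizing sketch. It uses the standard two-sample normal approximation, which is a textbook formula and not something stated on the slides. The function names, the default significance level and power, and the assumption that analysis is restricted to requests that actually trigger the change are all illustrative.

```python
import math
from statistics import NormalDist


def required_per_arm(sigma, delta, alpha=0.05, power=0.8):
    """Approximate requests per arm needed to detect an absolute shift
    `delta` in a metric whose standard deviation is `sigma`.

    Standard two-sided, two-sample normal approximation (an assumption,
    not the sizing method used by the authors).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


def required_diverted(sigma, delta, trigger_rate, **kwargs):
    """Requests to divert per arm when only a fraction `trigger_rate`
    of diverted requests actually exercise the change, assuming the
    analysis looks only at triggered requests."""
    return math.ceil(required_per_arm(sigma, delta, **kwargs) / trigger_rate)


if __name__ == "__main__":
    # Doubling the metric's variability roughly quadruples the traffic needed.
    print(required_per_arm(sigma=1.0, delta=0.1))
    print(required_per_arm(sigma=2.0, delta=0.1))
    # A 1% trigger rate multiplies the diverted traffic by about 100.
    print(required_diverted(sigma=1.0, delta=0.1, trigger_rate=0.01))
```

Both effects named on the slide show up directly: required traffic grows with the square of the metric's standard deviation and inversely with the trigger rate.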

Basic Experiment Definitions
  - Incoming search query request R has:
      - Cookie C
      - Conditions T: query language, user country, browser, etc.
  - System has parameters
      - E.g., top ad background color, Google Suggest on or off
      - Default value
  - Experiment:
      - Diversion: is a request in the experiment?
          - Conditions
          - Unit of diversion: cookie vs. traffic
      - Experiment parameter values
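The definitions above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the bucket count, the SHA-256 cookie hash, the parameter names in `DEFAULTS`, and the `Experiment` class are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field

NUM_BUCKETS = 1000  # assumed bucket count, not from the slides

# Hypothetical system parameters with their default values.
DEFAULTS = {"top_ad_background": "yellow", "suggest_enabled": True}


def bucket(cookie, salt=""):
    """Map a cookie to a stable bucket (cookie as the unit of diversion)."""
    digest = hashlib.sha256(f"{salt}:{cookie}".encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


@dataclass
class Experiment:
    name: str
    buckets: range                                   # diversion: which cookies
    conditions: dict = field(default_factory=dict)   # e.g. {"country": "BR"}
    params: dict = field(default_factory=dict)       # parameter overrides

    def matches(self, cookie, request_conditions):
        """Is this request in the experiment?"""
        if bucket(cookie) not in self.buckets:
            return False
        return all(request_conditions.get(k) == v
                   for k, v in self.conditions.items())


def resolve_params(cookie, request_conditions, experiments):
    """Start from defaults, then apply the first matching experiment."""
    params = dict(DEFAULTS)
    for exp in experiments:
        if exp.matches(cookie, request_conditions):
            params.update(exp.params)
            break  # at most one experiment per request in this sketch
    return params


if __name__ == "__main__":
    exp = Experiment("blue_ads_brazil", range(0, NUM_BUCKETS),
                     {"country": "BR"}, {"top_ad_background": "blue"})
    print(resolve_params("cookie-1", {"country": "BR"}, [exp]))
    print(resolve_params("cookie-1", {"country": "JP"}, [exp]))
```

Hashing the cookie keeps a given user in the same experiment across requests; diverting on raw traffic instead would replace the hash with a per-request random draw.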

Extreme 1: Single Layer
  - Our experiment infrastructure prior to 2007
  - Every request in at most one experiment
  - Straightforward, but insufficiently scalable
      - Variability
      - Low trigger rate

Scaling the Single Layer
  - Use incoming traffic more effectively by understanding which conditions are disjoint from other conditions
      - e.g., Brazil vs. Japan (country)
      - Other examples: language, browser
  - Increases scalability, but is more complex and causes more fragmentation
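A small sketch of the disjointness idea, assuming conditions are represented as simple key/value dictionaries (a hypothetical representation, not one given on the slides). Two experiments whose conditions can never both match the same request can reuse the same cookie buckets.

```python
def conditions_disjoint(a, b):
    """True if no single request can satisfy both condition sets.

    Assumes each condition pins one key to one exact value, so two sets
    are disjoint when they pin the same key to different values.
    """
    return any(key in b and b[key] != value for key, value in a.items())


if __name__ == "__main__":
    brazil = {"country": "BR"}
    japan = {"country": "JP"}
    portuguese = {"language": "pt"}

    # Brazil vs. Japan: no request is in both, so buckets can be shared.
    print(conditions_disjoint(brazil, japan))
    # Country vs. language: a request can match both, so they must not share.
    print(conditions_disjoint(brazil, portuguese))
```

The fragmentation cost mentioned on the slide follows from this: every extra condition dimension splits the available traffic into smaller pools.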

Extreme 2: Multi-factorial Experiment Design
  - Vary each parameter independently
  - Issues:
      - Must serve valid pages only (e.g., never blue text on a blue background)
      - Constantly changing system
          - Adding / removing parameters
          - Different experiments use different sets of parameters
          - Can't design once and be done with it

Layers: Multiplies number of experiments
  - Partition parameters into sets --> layers
  - Experiments can only modify parameters associated with their layer
  - Each layer is independent of every other layer
  - Controls and experiments must be in the same layer
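One way to picture layers is as independent diversions over the same cookies, each owning its own parameter set. The sketch below is illustrative only: the layer ids, parameter names, bucket ranges, and the choice of salting the cookie hash with the layer id are assumptions, not details from the slides.

```python
import hashlib

NUM_BUCKETS = 1000  # assumed bucket count

# Hypothetical configuration: each layer owns a disjoint parameter set,
# and its control and experiment live side by side in that layer.
LAYERS = [
    {"id": "ui", "params": {"top_ad_background"}, "experiments": [
        {"name": "ui_control", "buckets": range(0, 100), "overrides": {}},
        {"name": "ui_blue", "buckets": range(100, 200),
         "overrides": {"top_ad_background": "blue"}},
    ]},
    {"id": "ads", "params": {"ad_threshold"}, "experiments": [
        {"name": "ads_control", "buckets": range(0, 100), "overrides": {}},
        {"name": "ads_strict", "buckets": range(100, 200),
         "overrides": {"ad_threshold": 0.8}},
    ]},
]


def layer_bucket(cookie, layer_id):
    """Salt the hash with the layer id so layers divert independently."""
    digest = hashlib.sha256(f"{layer_id}:{cookie}".encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


def validate_layers(layers):
    """Check that parameters are partitioned and experiments stay in-layer."""
    seen = set()
    for layer in layers:
        if seen & layer["params"]:
            raise ValueError(f"parameter shared across layers: {layer['id']}")
        seen |= layer["params"]
        for exp in layer["experiments"]:
            if set(exp["overrides"]) - layer["params"]:
                raise ValueError(f"{exp['name']} modifies a foreign parameter")


def assign(cookie, layers):
    """Return at most one experiment per layer for this cookie."""
    chosen = {}
    for layer in layers:
        b = layer_bucket(cookie, layer["id"])
        for exp in layer["experiments"]:
            if b in exp["buckets"]:
                chosen[layer["id"]] = exp["name"]
                break
    return chosen


if __name__ == "__main__":
    validate_layers(LAYERS)
    print(assign("cookie-42", LAYERS))
```

Because every layer sees all of the traffic, adding a layer adds experiment capacity without shrinking the traffic available to the others.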

Domains: Nesting to increase flexibility
  - Domains: contain layers
  - Layers: contain domains and experiments
  - Nesting: allows for different partitioning of parameters
  - Trade-off: less efficient use of space due to fragmentation
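The containment rules above (domains contain layers; layers contain domains and experiments) suggest a recursive structure. The following is a rough sketch of one such structure; the class names, bucket ranges, and example tree are hypothetical and only meant to show how a request might be walked through nested domains.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Union

NUM_BUCKETS = 1000  # assumed bucket count


def bucket(cookie, salt):
    digest = hashlib.sha256(f"{salt}:{cookie}".encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


@dataclass
class Experiment:
    name: str
    buckets: range


@dataclass
class Domain:
    name: str
    buckets: range  # slice of the parent layer's traffic
    layers: List["Layer"] = field(default_factory=list)


@dataclass
class Layer:
    name: str
    children: List[Union[Domain, Experiment]] = field(default_factory=list)


def experiments_for(cookie, domain):
    """Walk the tree: in each layer, follow the one child owning the bucket."""
    found = []
    for layer in domain.layers:
        b = bucket(cookie, layer.name)
        for child in layer.children:
            if b not in child.buckets:
                continue
            if isinstance(child, Domain):
                found.extend(experiments_for(cookie, child))
            else:
                found.append(child.name)
            break
    return found


# Hypothetical tree: 30% of traffic in a single-layer domain,
# 70% in a domain with two overlapping layers.
ROOT = Domain("root", range(0, NUM_BUCKETS), [
    Layer("split", [
        Domain("non_overlapping", range(0, 300), [
            Layer("single", [Experiment("exp_a", range(0, 500)),
                             Experiment("ctrl_a", range(500, 1000))]),
        ]),
        Domain("overlapping", range(300, 1000), [
            Layer("ui", [Experiment("exp_ui", range(0, 500))]),
            Layer("ads", [Experiment("exp_ads", range(0, 500))]),
        ]),
    ]),
])

if __name__ == "__main__":
    print(experiments_for("cookie-7", ROOT))
```

The fragmentation trade-off is visible in the example: traffic assigned to the single-layer domain is unavailable to experiments in the overlapping one.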

Nesting: another example

Nesting: one last example

Merging Experiment Parameters
  - Can we relax the constraint of associating each parameter with only one layer?
  - Consequence: a request could be in two experiments, each modifying the same parameter
  - How to merge parameter values?
      - Well-defined composition function, e.g., multiplication
      - Well-understood parameter
  - Example: threshold t with base value V
      - Layer 1: experiment with multiplier 1.5, control: 1.0
      - Layer 2: experiment with multiplier 2.0, control: 1.0
      - 4 possibilities for t:
          V * 1.0 * 1.0
          V * 1.5 * 1.0
          V * 1.0 * 2.0
          V * 1.5 * 2.0
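The threshold example can be enumerated directly. In this sketch the multipliers come from the slide, while the base value of 10.0 and the function and variable names are placeholders chosen for illustration.

```python
from itertools import product
from math import prod

BASE_VALUE = 10.0  # hypothetical base value V

# Multipliers from the slide: each layer has a control and an experiment.
LAYER_1 = {"control": 1.0, "experiment": 1.5}
LAYER_2 = {"control": 1.0, "experiment": 2.0}


def merged_threshold(base, multipliers):
    """Compose per-layer multipliers by multiplication."""
    return base * prod(multipliers)


# One entry per (layer 1 arm, layer 2 arm) combination.
COMBINATIONS = {
    (arm_1, arm_2): merged_threshold(BASE_VALUE,
                                     [LAYER_1[arm_1], LAYER_2[arm_2]])
    for arm_1, arm_2 in product(LAYER_1, LAYER_2)
}

if __name__ == "__main__":
    for arms, value in COMBINATIONS.items():
        print(arms, value)
```

Multiplication is commutative and has 1.0 as its identity, so the merged value does not depend on layer order and a control arm leaves the other layer's effect untouched. That is what makes it a well-defined composition function here.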

More: Results

Conclusions
  - Overlapping experiment infrastructure delivers scalability & flexibility
      - Conditions
      - Layers
      - Domains
      - Mergeable parameters
  - More than infrastructure is needed, though:
      - Tools: experiment design (sizing, finding cookies, experiment config), analysis
      - Education
      - Culture

Questions?
