The Many Ways Your Monitoring Is Lying To You
Sebastian Kirsch
SRECon16 Europe, Dublin, July 11-13, 2016

The Map Is Not The Territory



Time-Series Based Monitoring

[Diagram: server 1 (requests: 8), server 2 (requests: 14), server 3 (requests: 5), server 4 (requests: 16), each in its own environment; monitoring; aggregator; other servers; other data sources; dashboard.]

requests:
  server 1:        …   8  12  16
  server 2:        …  14  18  22
  server 3:        …   5   9  13
  server 4:        …  16  20  24
  sum(requests):   …  43  59  75
  rate(requests):  …  16  16  16
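The aggregation in the table above can be reproduced with a short Python sketch. The counter values are copied from the slide; the helper names `sum_series` and `rate` are illustrative only, and the rate is assumed to be the plain difference between consecutive samples of the summed counter.

```python
# Counter samples per server, copied from the slide.
# Assumption: samples are evenly spaced, so "rate" is the per-step increase.
requests = {
    "server 1": [8, 12, 16],
    "server 2": [14, 18, 22],
    "server 3": [5, 9, 13],
    "server 4": [16, 20, 24],
}


def sum_series(series_by_task):
    """Add up the samples of all tasks, one total per time step."""
    return [sum(samples) for samples in zip(*series_by_task.values())]


def rate(series):
    """Increase between consecutive samples of a counter."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


total = sum_series(requests)
print(total)        # [43, 59, 75]
print(rate(total))  # [16, 16]
```

Three samples yield two rate values; the slide shows a third because it also has the elided earlier sample.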

Lies of Omission

job:request:rate = sum(rate(task:requests:total))

             requests (counter)    rate
  server 1:  0  1  2  3  4         1  1  1  1
  server 2:  0  1  2  3  4         1  1  1  1
  server 3:  0  1  2  3  4         1  1  1  1
  server 4:  0  1  2  3  4         1  1  1  1
  sum:                             4  4  4  4

Lies of Omission

job:request:rate = sum(rate(task:requests:total))

             requests (counter)    rate
  server 1:  0  1  X  3  4         1  X  2  1
  server 2:  0  1  2  3  4         1  1  1  1
  server 3:  0  1  2  3  4         1  1  1  1
  server 4:  0  1  2  3  4         1  1  1  1
  sum:                             4  3  5  4

(X marks a missing sample.)
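The dip and spike in the summed rate (4 3 5 4) can be reproduced with the following Python sketch. It assumes that a rate is computed as the increase since the last sample actually seen, and that the sum silently skips tasks with no rate at a given step. These are assumptions chosen because they match the numbers on the slide, not a description of any particular monitoring system. The function names are illustrative.

```python
def task_rate(samples):
    """Per-step increase of a counter; None marks a missing sample.

    Assumes the first sample is present.
    """
    rates = []
    last_seen = samples[0]
    for sample in samples[1:]:
        if sample is None:
            rates.append(None)                # no data point, so no rate
        else:
            rates.append(sample - last_seen)  # increase since last seen sample
            last_seen = sample
    return rates


def sum_skipping_missing(rate_series):
    """Sum the rates of all tasks per step, ignoring missing values."""
    return [sum(r for r in step if r is not None) for step in zip(*rate_series)]


complete = [0, 1, 2, 3, 4]
with_gap = [0, 1, None, 3, 4]

healthy = sum_skipping_missing([task_rate(complete)] * 4)
lossy = sum_skipping_missing([task_rate(with_gap)] + [task_rate(complete)] * 3)

print(healthy)  # [4, 4, 4, 4]
print(lossy)    # [4, 3, 5, 4]
```

The true request rate never changed; the aggregate only appears to dip and then spike because one sample from one task was lost.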


Lies of Granularity


Lies of Perspective

[Diagram: monitoring and server. Counter values shown: requests: 0, server-errors: 0; requests: 1, server-errors: 1.]

[Diagram: client, client, server, and monitoring. Counter values shown: requests: 0, errors: 0; requests: 4, errors: 0; requests: 4 10, errors: 0 3.]
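One situation in which client-side and server-side counters disagree is when some requests never reach the server. The Python sketch below illustrates that case only; the scenario, the outcome labels, and the numbers are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    requests: int = 0
    errors: int = 0


def simulate(outcomes):
    """Count requests and errors as seen by the client and by the server.

    outcomes holds one entry per client request: "ok", "server_error",
    or "unreachable" (hypothetical labels for this sketch).
    """
    client, server = Counters(), Counters()
    for outcome in outcomes:
        client.requests += 1
        if outcome == "unreachable":
            client.errors += 1   # the request never arrives at the server
            continue
        server.requests += 1
        if outcome == "server_error":
            server.errors += 1   # both sides observe this failure
            client.errors += 1
    return client, server


client, server = simulate(["ok"] * 6 + ["server_error"] + ["unreachable"] * 3)
print(client)  # Counters(requests=10, errors=4)
print(server)  # Counters(requests=7, errors=1)
```

Monitoring that scrapes only the server would report an error ratio of 1 in 7, while the clients experienced 4 failures in 10 requests.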

Lying through Alignment

job:memory:mean = sum(task:memory) / job:num_tasks_up
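A rule like this divides one time series by another, so the result is only meaningful if both operands describe the same instant. The Python sketch below shows what can happen when they do not. The task names, memory values, and timestamps are hypothetical, and the specific misalignment shown (a stale task count) is an assumption for illustration.

```python
def mean_memory(memory_by_task, num_tasks_up):
    """Mean memory per task: sum(task:memory) / job:num_tasks_up."""
    return sum(memory_by_task.values()) / num_tasks_up


# Hypothetical samples: four tasks were up at t0, two are up at t1.
memory_at_t1 = {"task-0": 100, "task-1": 100}  # MB, sampled at t1
tasks_up_at_t0 = 4
tasks_up_at_t1 = 2

aligned = mean_memory(memory_at_t1, tasks_up_at_t1)     # both operands from t1
misaligned = mean_memory(memory_at_t1, tasks_up_at_t0)  # stale denominator

print(aligned)     # 100.0
print(misaligned)  # 50.0
```

No task's memory use changed, yet the misaligned mean reports half the real value.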


Lies of Presentation


Lying through Selection

top(5, task:errors:rate)
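A top-k query shows only the k largest series at each evaluation, so a task can vanish from the result without its own value changing. The Python sketch below uses hypothetical error rates, and its `top` helper is an illustrative stand-in for the expression above, not the query language's actual implementation.

```python
def top(k, values_by_task):
    """Return the k tasks with the largest values."""
    ranked = sorted(values_by_task.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical error rates at two evaluation times.
errors_t0 = {"task-0": 9, "task-1": 8, "task-2": 7, "task-3": 6,
             "task-4": 5, "task-5": 4, "task-6": 4, "task-7": 4}
errors_t1 = dict(errors_t0, **{"task-5": 10})  # only task-5 changes

shown_t0 = top(5, errors_t0)
shown_t1 = top(5, errors_t1)

print(sorted(shown_t0))  # ['task-0', 'task-1', 'task-2', 'task-3', 'task-4']
print(sorted(shown_t1))  # ['task-0', 'task-1', 'task-2', 'task-3', 'task-5']

# Share of all errors at t0 that the top-5 view never displays.
hidden_share = 1 - sum(shown_t0.values()) / sum(errors_t0.values())
print(round(hidden_share, 2))  # 0.26
```

On a graph, task-4 appears to stop producing errors at t1, although its rate is unchanged; it was merely displaced by task-5.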


Lies, Damn Lies, and Misuse of Statistics

[Figures: successive slides annotated with 95th percentile ≈ 130ms, 130ms, 150ms, 180ms, and 250ms.]
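One way a reported percentile can move while the underlying latencies stay the same is estimation from histogram buckets. The Python sketch below illustrates that effect only. The latency data and bucket boundaries are hypothetical, linear interpolation within a bucket is an assumed estimation method, and the extracted text does not say whether the slides illustrated this particular effect.

```python
import math
from bisect import bisect_left


def exact_percentile(samples, percent):
    """Nearest-rank percentile computed from the raw samples."""
    ordered = sorted(samples)
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[rank - 1]


def bucketed_percentile(samples, upper_bounds, percent):
    """Estimate a percentile from bucket counts by linear interpolation.

    Bucket i covers (previous bound, upper_bounds[i]]; the first bucket
    starts at 0. Assumes every sample is <= the last bound.
    """
    counts = [0] * len(upper_bounds)
    for sample in samples:
        counts[bisect_left(upper_bounds, sample)] += 1

    target = percent * len(samples) / 100
    seen, lower = 0, 0
    for upper, count in zip(upper_bounds, counts):
        if count and seen + count >= target:
            return lower + (upper - lower) * (target - seen) / count
        seen += count
        lower = upper
    return upper_bounds[-1]


latencies_ms = list(range(50, 150))  # hypothetical: 100 samples, 50..149 ms

exact = exact_percentile(latencies_ms, 95)
fine = bucketed_percentile(latencies_ms, [50, 100, 150], 95)
coarse = bucketed_percentile(latencies_ms, [100, 200], 95)

print(exact)             # 144
print(round(fine, 1))    # 144.9
print(round(coarse, 1))  # 189.8
```

The same 100 samples give a 95th percentile of about 145 ms or about 190 ms depending only on where the bucket boundaries sit.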

Summary

Acknowledgments

Alexander Jolk, Etienne Pierre, Jules Anderson, Gráinne Sheerin, Jukka Laurila, Mike Han, Pawel Stradomski, Ralf Wildenhues

Q&A

