The Failure of Poisson Modeling John Blesswin

Outline
• Introduction
• Traces data
• TCP connection interarrivals
• TELNET packet interarrivals
• Fully modeling TELNET originator traffic
• FTPDATA connection arrivals
• Large-scale correlations and possible connections to self-similarity
• Implications

1. Introduction (1)
• In many studies of both local-area and wide-area network traffic, the distribution of packet interarrivals clearly differs from exponential.
  – [JR86, G90, FL91, DJCME92]

• For self-similar traffic, there is no natural length for a “burst”; traffic bursts appear on a wide range of time scales.
• Poisson processes are valid only for modeling the arrival of user sessions
  – TELNET connections, FTP control connections
• WAN packet arrival processes appear to be better modeled using self-similar processes

1. Introduction (2)
• This paper shows that, in some cases, commonly used Poisson models seriously underestimate the burstiness of TCP traffic over a wide range of time scales (time scales >= 0.1 sec).
• Using the empirical TCPlib distribution for TELNET packet interarrivals instead results in a packet arrival process significantly burstier than Poisson arrivals.
• For small machine-generated bulk transfers such as SMTP (email) and NNTP (network news), connection arrivals are not well modeled as Poisson.

1. Introduction (3)
• For large bulk transfers, the FTPDATA traffic structure is quite different from what Poisson models suggest.
  – The number of FTPDATA bytes in each burst has a very heavy upper tail
  – A small fraction of the largest bursts carries almost all of the FTPDATA bytes.
• Poisson arrival processes are quite limited in their burstiness, especially when multiplexed to a high degree.
• Wide-area traffic is much burstier than Poisson models predict over many time scales.

[Figure: Autocorrelation function. Autocorrelation coefficient (from -1 to +1) versus lag k (from 0 to 100), for a typical long-range dependent process and a typical short-range dependent process.]

2. Traces used

Packet drop rate <= 5*10^-6

Packet drop rate <= 0.00025

3. TCP connection interarrivals
• DEC1-3 24-hour pattern
  – Over one-hour intervals, arrivals for all protocols are well modeled by a Poisson process
  – Over ten-minute intervals, only FTP session and TELNET session arrivals are statistically consistent with Poisson arrivals.
  – The arrivals of NNTP, FTPDATA, and WWW connections are not Poisson processes.

Appendix A Methodology for testing for Poisson arrivals
• Poisson arrivals have two key characteristics:
  – Interarrival times are exponentially distributed and independent

• Using the Anderson-Darling test (A^2)
  – An empirical distribution test
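The exponential half of this check can be sketched in Python. This is a minimal illustration, not the procedure used in the paper: the function name `anderson_darling_exponential`, the sample sizes, and the seed are assumptions, the mean is estimated from the sample, and the independence test is not covered. A small A^2 is consistent with exponential interarrivals; a large one is not.

```python
import math
import random


def anderson_darling_exponential(interarrivals):
    """A^2 statistic against an exponential fit with the sample mean.

    Assumes every interarrival is strictly positive.
    """
    xs = sorted(interarrivals)
    n = len(xs)
    mean = sum(xs) / n
    total = 0.0
    for i, x in enumerate(xs, start=1):
        log_cdf = math.log(-math.expm1(-x / mean))  # ln F(x_(i))
        log_sf = -xs[n - i] / mean                  # ln(1 - F(x_(n+1-i)))
        total += (2 * i - 1) * (log_cdf + log_sf)
    return -n - total / n


rng = random.Random(7)  # arbitrary seed for repeatability
exp_sample = [rng.expovariate(1 / 1.1) for _ in range(2000)]
heavy_sample = [rng.paretovariate(0.9) for _ in range(2000)]

a2_exp = anderson_darling_exponential(exp_sample)
a2_heavy = anderson_darling_exponential(heavy_sample)
print(f"A^2 exponential sample: {a2_exp:.2f}")
print(f"A^2 heavy-tailed sample: {a2_heavy:.2f}")
```

The heavy-tailed sample should produce a far larger statistic than the exponential one, which is the kind of gap the test is meant to expose.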

4. TELNET packet interarrivals
• Packets sent by the TELNET responder will usually include both echoes of the user’s keystrokes and larger bursts of bulk transfer consisting of output generated by the user’s remote commands.
• Unlike the exponential distribution, the empirical distribution of TELNET packet interarrival times is heavy-tailed.

[Figure: curves labeled "Geometric mean" and "Arithmetic mean"]

• With an exponential model, the share of shorter interarrivals will be overestimated
• The share of longer interarrivals will be underestimated
• Exponential distribution models predict a full 25% of the interarrivals to be less than 8 msec, and 2% to be longer than 1 sec
• In the actual data, under 2% were less than 8 msec, and over 15% were more than 1 sec

• The main body of the observed interarrival distribution fits a Pareto distribution very well
  – Shape parameter β ≈ 0.9 to 0.95

Appendix B Pareto distributions
• Shape parameter
• Location parameter
• Also known as the power-law distribution, the double-exponential distribution, and the hyperbolic distribution
• Used to model distributions of incomes exceeding a minimum value, and sizes of asteroids, islands, cities and extinction events

Pareto distribution
• a: location parameter
• β: shape parameter
• β <= 2 gives infinite variance; β <= 1 gives infinite mean
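For reference, the Pareto distribution with location a and shape β has CDF F(x) = 1 - (a/x)^β for x >= a. The sketch below shows that CDF, inverse-transform sampling, and the mean, which is finite only for β > 1. The function names are illustrative choices, not from the slides.

```python
import random


def pareto_cdf(x, a, beta):
    """F(x) = 1 - (a / x)^beta for x >= a, and 0 below the location a."""
    return 0.0 if x < a else 1.0 - (a / x) ** beta


def pareto_sample(a, beta, rng):
    """Inverse-transform sampling: solve F(x) = 1 - u for x."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return a / u ** (1.0 / beta)


def pareto_mean(a, beta):
    """Mean a*beta/(beta - 1); infinite when beta <= 1."""
    return a * beta / (beta - 1.0) if beta > 1.0 else float("inf")


rng = random.Random(2)  # arbitrary seed
samples = [pareto_sample(1.0, 1.2, rng) for _ in range(5)]
print(samples)
print(pareto_mean(1.0, 1.2), pareto_mean(1.0, 0.95))
```

With β near 0.9 to 0.95, as fitted to the TELNET interarrivals, the theoretical mean is infinite, so sample means never settle down as more data arrives.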

Pareto distribution
• Definition of a heavy-tailed distribution [formula not recovered]

Pareto distribution in NS2

# Create a random number generator and seed it
set rng [new RNG]
$rng seed 2
puts "Testing Pareto Distribution"
# Pareto random variable with mean 10.0 and shape 1.2, driven by $rng
set r1 [new RandomVariable/Pareto]
$r1 use-rng $rng
$r1 set avg_ 10.0
$r1 set shape_ 1.2
# Print three samples
for {set i 1} {$i <= 3} {incr i} {
    puts [$r1 value]
}

[Figure: comparison of packet arrival patterns. Annotations: "More clustered"; "The same mean 1.1 seconds for both".]

Multiplexing packet arrival processes
• 10-minute simulations with 100 active TELNET connections
• All connections were active for the entire duration of the simulation.
• Tcplib
  – Mean 92, variance of 240

• Exponential
  – Mean 92, variance of 97
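The multiplexing experiment can be approximated with a short simulation. This is a sketch under stated assumptions: the Tcplib empirical distribution is not available here, so a Pareto distribution with shape 1.2 and mean 1.1 seconds stands in for it, and the 1-second counting bins, the seed, and the name `multiplexed_counts` are illustrative. The numbers will not reproduce the slide's figures; only the qualitative contrast is the point.

```python
import random
import statistics


def multiplexed_counts(n_sources, duration_s, draw_interarrival, bin_s=1.0):
    """Packets per bin when n_sources independent sources share one link."""
    counts = [0] * int(duration_s / bin_s)
    for _ in range(n_sources):
        t = draw_interarrival()
        while t < duration_s:
            counts[min(int(t / bin_s), len(counts) - 1)] += 1
            t += draw_interarrival()
    return counts


MEAN_S = 1.1                            # mean interarrival, as on the slide
SHAPE = 1.2                             # assumed stand-in shape (not Tcplib)
SCALE = MEAN_S * (SHAPE - 1.0) / SHAPE  # gives the Pareto the same mean

rng = random.Random(42)  # arbitrary seed
exp_counts = multiplexed_counts(100, 600.0, lambda: rng.expovariate(1.0 / MEAN_S))
heavy_counts = multiplexed_counts(100, 600.0, lambda: SCALE * rng.paretovariate(SHAPE))

for name, counts in (("exponential", exp_counts), ("heavy-tailed", heavy_counts)):
    print(name, statistics.mean(counts), statistics.pvariance(counts))
```

For the exponential sources the variance of the per-second counts stays close to the mean, as expected for a Poisson process, while the heavy-tailed stand-in yields a noticeably larger variance.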

[Figure: Comparisons of actual and exponential TELNET packet interarrival times. Axis label: aggregation size.]

5. Fully modeling TELNET originator traffic
• TELNET connection arrivals are well modeled as a Poisson process
• TELNET packet interarrival times can be modeled by Tcplib
• The connection size in bytes has been modeled by a log-normal distribution [P94a]
• Together these give a complete model of TELNET originator traffic
  – Parameterized only by the connection arrival rate
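The three ingredients above can be combined into one generator. The sketch below is an illustration only: a Pareto distribution stands in for the Tcplib interarrivals, every packet is assumed to carry one byte, and the defaults (`pareto_scale`, `pareto_shape`, `log2_mean`, `log2_sd`) are placeholder values, not the fitted values from [P94a]. The only parameter a caller must supply is the connection arrival rate.

```python
import math
import random


def telnet_originator_packets(rate_per_s, duration_s, rng,
                              pareto_scale=0.1, pareto_shape=0.95,
                              log2_mean=math.log2(100), log2_sd=math.log2(3.5)):
    """Sorted packet timestamps for multiplexed TELNET originator traffic.

    Connections arrive as a Poisson process; each has a log-normal size
    (placeholder parameters) and heavy-tailed packet interarrivals
    (Pareto stand-in for Tcplib).
    """
    times = []
    start = rng.expovariate(rate_per_s)
    while start < duration_s:
        # Connection size in bytes; one byte per packet is an assumption.
        size = max(1, round(2 ** rng.gauss(log2_mean, log2_sd)))
        pkt = start
        for _ in range(size):
            if pkt >= duration_s:
                break
            times.append(pkt)
            pkt += pareto_scale * rng.paretovariate(pareto_shape)
        start += rng.expovariate(rate_per_s)
    return sorted(times)


rng = random.Random(5)  # arbitrary seed
packets = telnet_originator_packets(rate_per_s=0.05, duration_s=3600.0, rng=rng)
print(len(packets), "packets in one simulated hour")
```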

Appendix E. Log-normal distributions

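Log-normal distribution
• When the observed data are skewed to the right, a log-normal distribution is often a suitable model. For example, the distribution of national income is usually right-skewed: there are fewer high-income people and more low-income people.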

Appendix E. M/G/∞ and log-normal distribution
• If the lifetime distribution F is a Pareto distribution, then the count process from the M/G/∞ model is asymptotically self-similar
• If the lifetimes have a log-normal distribution, the count process from the M/G/∞ model is not long-range dependent

Log-normal distributions
• The Pareto, log-normal, and Weibull distributions are all defined as long-tailed.

6. FTPDATA connection arrivals
• FTPDATA connections within a session are clustered in bursts
  – Burst size in bytes is quite heavy-tailed
  – Half of the FTP traffic volume comes from the largest 0.5% of the FTPDATA bursts.
  – These bursts completely dominate FTP traffic dynamics

• The FTPDATA packet arrival process for an FTPDATA connection is largely determined by network factors – Available bandwidth, congestion, TCP congestion control
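To see how a heavy tail concentrates volume in a few bursts, consider an idealized case in which burst sizes are exactly Pareto with shape β > 1 (an assumption made for this illustration, not a claim about the measured data). The largest fraction q of bursts then carries a share q^(1 - 1/β) of all bytes. The function name `top_share` is illustrative.

```python
def top_share(q, beta):
    """Share of total volume carried by the largest fraction q of items.

    Valid for an exact Pareto distribution with shape beta > 1.
    """
    if beta <= 1.0:
        raise ValueError("the mean is infinite for beta <= 1")
    return q ** (1.0 - 1.0 / beta)


# Under this idealization, a shape near 1.15 puts about half of the bytes
# in the largest 0.5% of bursts.
print(top_share(0.005, 1.15))
# A lighter tail (beta = 3) concentrates far less in the same top 0.5%.
print(top_share(0.005, 3.0))
```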

FTPDATA • FTPDATA packet interarrivals are far from exponential[DJCME92]

[Figure: annotations "Better approximated using log-normal", "2%", and "(bursts, connections) 0.5%"]

• The distribution of the number of connections per burst is well modeled as a Pareto distribution

7. Large-scale correlations and possible connections to self-similarity

• $\sum_k r(k) = \infty$ (long-range dependence)
• For models with only short-range dependence, H is almost always 0.5
• For self-similar processes, 0.5 < H < 1.0
• This discrepancy is called the Hurst effect, and H is called the Hurst parameter
• A single parameter to characterize self-similar processes
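One common way to estimate H from a series is the aggregated-variance (variance-time) method: average the series over blocks of size m, and fit the slope of log variance against log m, which behaves like 2H - 2. The sketch below is a minimal version of that technique, not necessarily the estimator used in the paper; the block sizes, the seed, and the name `aggregated_variance_hurst` are assumptions.

```python
import math
import random


def aggregated_variance_hurst(series, levels=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Estimate H from the slope of log Var(block means) versus log m."""
    points = []
    for m in levels:
        blocks = len(series) // m
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(blocks)]
        mu = sum(means) / blocks
        var = sum((x - mu) ** 2 for x in means) / blocks
        points.append((math.log(m), math.log(var)))
    # Ordinary least-squares slope through the (log m, log variance) points.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return 1.0 + slope / 2.0  # Var ~ m^(2H - 2)


rng = random.Random(1)  # arbitrary seed
white_noise = [rng.gauss(0.0, 1.0) for _ in range(1 << 14)]
h_white = aggregated_variance_hurst(white_noise)
print(f"Estimated H for independent noise: {h_white:.2f}")
```

Independent noise should give an estimate near 0.5; a long-range dependent series would give a value noticeably above it.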

7. Producing self-similar traffic (1)
• There are several methods for producing self-similar traffic
  – Multiplexing ON/OFF sources, with a fixed rate in the ON periods and heavy-tailed ON/OFF period lengths
  – M/G/∞
• Xt is the number of customers in the system at time t
• Count process {Xt}t=0,1,2…
• Multiplexing constant-rate connections that have Poisson connection arrivals and a heavy-tailed distribution for connection lifetimes
• This results in self-similar traffic
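The M/G/∞ count process can be sketched as follows. The helper names, the arrival rate, the Pareto shape of 1.4, and the horizon are illustrative assumptions; the system starts empty, so the early part of the series is a warm-up rather than a stationary sample.

```python
import math
import random


def mg_infinity_counts(arrival_times, lifetimes, horizon):
    """X_t for t = 0..horizon-1: customers with arrival <= t < arrival + lifetime."""
    diff = [0] * (horizon + 1)
    for start, life in zip(arrival_times, lifetimes):
        first = math.ceil(start)        # first integer time the customer is present
        last = math.ceil(start + life)  # first integer time after departure
        if first >= horizon:
            continue
        diff[first] += 1
        diff[min(last, horizon)] -= 1
    counts, running = [], 0
    for t in range(horizon):
        running += diff[t]
        counts.append(running)
    return counts


def simulate_mg_infinity(rate, shape, scale, horizon, rng):
    """Poisson arrivals with Pareto (heavy-tailed) lifetimes."""
    arrivals, lifetimes = [], []
    t = rng.expovariate(rate)
    while t < horizon:
        arrivals.append(t)
        lifetimes.append(scale * rng.paretovariate(shape))
        t += rng.expovariate(rate)
    return mg_infinity_counts(arrivals, lifetimes, horizon)


small = mg_infinity_counts([0.5, 1.2], [2.0, 0.5], horizon=4)
print(small)  # hand-checkable case: [0, 1, 1, 0]

rng = random.Random(3)  # arbitrary seed
x = simulate_mg_infinity(rate=5.0, shape=1.4, scale=1.0, horizon=1000, rng=rng)
print(x[:20])
```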

Producing self-similar traffic (2)
• Using i.i.d. Pareto interarrivals with β ≈ 1

Relating the methods to traffic models-TELNET
• On smaller time scales
  – i.i.d. Pareto

• On large time scales
  – M/G/∞

Relating the methods to traffic models-FTP
• FTP traffic fits in some respects the M/G/∞ model of Poisson arrivals with heavy-tailed lifetimes.
• FTP sessions have Poisson arrivals.
• Multiplexed FTP traffic differs from the M/G/∞ model of self-similar traffic with constant-rate connections
  – TCP congestion control

• Modify M/G/∞ -> M/G/k
  – Limited capacity

Large-scale correlations in general wide-area traffic

Fractional Gaussian process

• Fractional Gaussian noise (FGN) [22]
  – Gaussian process with mean $\mu$, variance $\sigma^2$, and
  – Autocorrelation function $r(k) = \frac{1}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$, $k > 0$
  – Exactly second-order self-similar with 0.5 < H < 1
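The standard FGN autocorrelation function, r(k) = ½(|k+1|^{2H} − 2|k|^{2H} + |k−1|^{2H}), is easy to evaluate directly. The sketch below (function name illustrative) shows that it vanishes for H = 0.5 and decays only like H(2H − 1)k^{2H−2} for larger H, which is why the correlations are not summable.

```python
def fgn_autocorrelation(k, hurst):
    """r(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H))."""
    k = abs(k)
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2.0 * k ** e + abs(k - 1) ** e)


for lag in (1, 10, 100):
    exact = fgn_autocorrelation(lag, 0.9)
    approx = 0.9 * (2 * 0.9 - 1) * lag ** (2 * 0.9 - 2)  # large-lag power law
    print(lag, round(exact, 4), round(approx, 4))

# With H = 0.5 the process is uncorrelated at every nonzero lag.
print(fgn_autocorrelation(1, 0.5), fgn_autocorrelation(10, 0.5))
```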
8. Implications (1)
• Modeling TCP traffic using Poisson or other models that do not accurately reflect the long-range dependence in actual traffic has a cost:
  – Such models underestimate the delay and maximum queue size

• Linear increases in buffer size do not result in large decreases in packet drop rates
• A slight increase in the number of active connections can result in a large increase in the packet loss rate
• In reality, “traffic spikes” ride on longer-term “ripples”.
  – Detect the low-frequency congestion

8. Implications (2)
• For FTP, a wide-area link might have only one or two such bursts an hour, but they dominate that hour’s FTP traffic
• The results suggest that anyone interested in accurate modeling of wide-area traffic should begin by studying self-similarity.
