Energy-Proportional Networked Systems
Dejan Kostić
EPFL, Switzerland
Networked Systems Laboratory

Our Mission
• Make distributed systems more reliable and easier to develop and manage
• Build networked systems that mimic the energy proportionality of biological systems

Networking Energy in ICT
• Datacenter network: 20% of total server energy consumption (3 TWh in the US in 2006)
• Access network: tens of TWh/year by 2015 for broadband equipment
• Core network: several TWh/year for major telcos (Telefonica 4.5 TWh, Verizon 9.9 TWh)

Causes of networking energy consumption
• Network redundancy
– Achieving high availability
• Bandwidth overprovisioning
– Tolerate traffic variations (address lack of QoS)

Energy-(un)proportionality
[Figure: power (% of peak) vs. utilization (%). Curves for existing networking hardware (core router, home gateway) are compared with ideal energy-proportionality; typical utilization levels are marked.]
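The gap between existing hardware and the ideal line can be illustrated with a simple linear power model. This is only a sketch: the model (a fixed idle draw plus a load-dependent part) and the idle fraction of 0.8 are assumptions chosen for illustration, not measurements from the talk.

```python
def power_fraction(utilization: float, idle_fraction: float) -> float:
    """Power draw as a fraction of peak under a linear idle-plus-dynamic model.

    utilization:   offered load as a fraction of capacity, in [0, 1]
    idle_fraction: share of peak power drawn at zero load, in [0, 1]
                   (0.0 corresponds to ideal energy-proportionality)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be in [0, 1]")
    return idle_fraction + (1.0 - idle_fraction) * utilization


# Hypothetical device that draws 80% of its peak power when idle.
HYPOTHETICAL_IDLE_FRACTION = 0.8

for u in (0.0, 0.1, 0.5, 1.0):
    device = power_fraction(u, HYPOTHETICAL_IDLE_FRACTION)
    ideal = power_fraction(u, 0.0)
    print(f"utilization {u:4.0%}: device {device:4.0%} of peak, ideal {ideal:4.0%}")
```

Under these assumed numbers, a device at 10% utilization still draws most of its peak power, whereas an ideal energy-proportional device would draw 10%.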

Networking energy outlook
• More demands will result in further increases
– Video streaming, cloud computing
• CMOS reaching a plateau in power-efficiency
– Cooling costs of new equipment will increase
  • 1 MW for latest Cisco platform, CRS-1
• Power is not limitless
– 60 Amps per rack
Rate of traffic increase > rate at which underlying technologies improve their energy efficiency

Datacenter network (yesterday’s tree)

Datacenter network (today’s fat tree)

Reduced or no cooling

Threats to Internet’s growth
[Figure: datacenter, datacenter network, core network, and access network, annotated with two threats: power delivery/cooling problems and excessive energy consumption.]

Goal: Energy-Proportional Networked Systems
[Figure: power (% of peak) vs. utilization (%), showing the goal curve alongside ideal energy-proportionality.]

Make all devices energy-proportional?
… but it is hard:
• CMOS energy-efficiency limits
• Performance penalties
• Always-on components
[Figure: power (% of peak) vs. compute load (%) for a “highly efficient” Google server, broken down into CPU, DRAM, disk, and other, compared with ideal energy-proportionality.]

Network-wide energy-proportionality
Sleeping saves energy
Dynamically match resources to the demand → make the network energy-proportional

A simple optimization problem?
• Computational intensity
• Maintaining SLOs
• Avoiding oscillations
• Responsiveness to traffic variations
• Ease of deployment
[Figure: power (% of peak) vs. utilization (%), showing the goal curve alongside ideal energy-proportionality.]

Overview
[Figure: the threats map (power delivery/cooling problems, excessive energy consumption) across datacenter, datacenter network, core network, and access network, annotated with the two systems presented in this talk: REsPoNse and BH2.]

Access dominates energy consumption
• Backbone/Metro/Transport: 20-30%
• Access: 70-80%

A typical DSL access network
[Figure: the access network consists of a user part (gateways) and an ISP part. Cable bundles lead to the central office, where DSL Access Multiplexers (DSLAMs) connect onward to the metro and core networks.]

WHY DOES THE ACCESS CONSUME SO MUCH?


#1: Huge number of devices
Individually, they do not consume a lot
But collectively …
• 2 orders of magnitude more gateways than DSLAMs
• 1 order of magnitude more DSLAMs than metro devices
• 2 orders of magnitude more DSLAMs than backbone devices

#2: High per-bit energy consumption
At full load, access devices consume 2-3 orders of magnitude more energy per bit than metro/backbone devices

#3: Utilization < 10%
[Figure: daily utilization of 10K access links in a commercial ADSL provider; average utilization (%) of uplink and downlink vs. time (h).]

Sleeping saves energy
Sleep-on-Idle (SoI): devices enter sleep mode upon periods of inactivity

SoI fails in access networks
An ADSL line needs 1 minute to wake up … but cannot enjoy a minute’s sleep
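Why Sleep-on-Idle struggles here can be shown with a small calculation. The sketch below assumes an idle timeout of 60 seconds (set equal to the 1-minute wake-up time so that short gaps are not worth sleeping through) and two made-up arrival patterns; none of these values come from the measured traces.

```python
from typing import Iterable


def sleep_seconds(arrivals: Iterable[float], horizon: float, idle_timeout: float) -> float:
    """Total time a Sleep-on-Idle device is asleep within [0, horizon].

    The device falls asleep once `idle_timeout` seconds pass without traffic
    and stays asleep until the next packet arrives. Time 0 counts as activity.
    """
    asleep = 0.0
    last_activity = 0.0
    for t in sorted(arrivals):
        if t > horizon:
            break
        gap = t - last_activity
        if gap > idle_timeout:
            asleep += gap - idle_timeout
        last_activity = t
    # Idle period between the last packet and the end of the observation window.
    tail = horizon - last_activity
    if tail > idle_timeout:
        asleep += tail - idle_timeout
    return asleep


HOUR = 3600.0
# Hypothetical light but continuous traffic: one packet every 20 seconds.
chatty = [20.0 * i for i in range(181)]
# Hypothetical bursty traffic: activity only at the start and at the half hour.
bursty = [0.0, 1800.0]

print(sleep_seconds(chatty, HOUR, idle_timeout=60.0))  # no gap exceeds the timeout
print(sleep_seconds(bursty, HOUR, idle_timeout=60.0))  # long gaps allow sleeping
```

With the continuous pattern the line never sleeps even though it carries almost no traffic, which is the situation the slide describes.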

What if we can put 80% of gateways to sleep?
[Figure labels: 100 W, 15 W, 1 W per modem]
Save big fraction at the user side
ISPs … not so much

Line cards very unlikely to sleep by SoI
[Figure: a DSLAM with line cards; each port is connected to a modem that is either on or off.]
Static assignment of lines to DSLAM ports is a problem

OUR APPROACH [SIGCOMM ‘11]
⟹ Greening the user part: aggregation
⟹ Greening the ISP part: line switching

Greening the user part – Aggregation
Broadband Hitch-Hiking (BH2)
Threshold-based heuristic algorithm: direct traffic to neighbor gateways during light traffic conditions
On average 5-6 WiFi networks overlap in typical urban settings

Broadband Hitch-Hiking (BH2)
• Load on home gateway is low → direct light traffic to a neighbor gateway and let the home gateway sleep
• Load on neighbor gateway is high → look for another neighbor gateway or go back to the home gateway
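The threshold rules can be sketched as a single decision function. This is an illustrative reading of the slide, not the published BH2 algorithm: the function name, the two thresholds, the load units, and the tie-breaking rule (pick the least-loaded neighbor) are all assumptions.

```python
from typing import Dict

HOME = "home"  # hypothetical label for the user's own gateway


def choose_gateway(current: str, own_load: float,
                   neighbor_loads: Dict[str, float],
                   low: float, high: float) -> str:
    """Pick the gateway a user's traffic should use next.

    current:        gateway in use now (HOME or a neighbor's name)
    own_load:       the user's own traffic, as a fraction of link capacity
    neighbor_loads: load on each reachable neighbor gateway, same units
    low, high:      assumed thresholds for "light" own traffic and for an
                    overloaded neighbor
    """
    # Hitch-hiking is only considered while the user's own traffic is light.
    if own_load >= low:
        return HOME
    usable = {name: load for name, load in neighbor_loads.items() if load < high}
    if current != HOME and current in usable:
        return current  # the current neighbor still has room, so stay put
    if usable:
        # Move to (or switch to) the least-loaded neighbor below the threshold.
        return min(usable, key=usable.get)
    return HOME  # no suitable neighbor, fall back to the home gateway


loads = {"n1": 0.7, "n2": 0.1}
print(choose_gateway(HOME, 0.02, loads, low=0.05, high=0.5))  # light traffic
print(choose_gateway("n1", 0.02, loads, low=0.05, high=0.5))  # n1 overloaded
print(choose_gateway("n2", 0.20, loads, low=0.05, high=0.5))  # own traffic heavy
```

Whenever the function returns a neighbor, the home gateway carries no traffic and becomes a candidate for sleeping.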

Greening the ISP part – Line switching
[Figure: a 40-way switch between the subscriber lines and the DSLAM line cards.]
Full switching maximizes savings … but cost quickly grows with the number of ways

Small 4-way switches are enough
[Figure: 4-way switches between the subscriber lines and the DSLAM line cards.]
Each k-switch packs active lines to the top → put line cards to sleep
Simple micro-electro-mechanical switches with near-zero power consumption
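The effect of packing can be sketched by counting awake line cards with and without k-switches. The wiring model here is an assumption made for illustration: a batch of k line cards, with one k-switch per port position that connects k subscriber lines to that port on each of the k cards and maps its active lines to the lowest-numbered cards.

```python
from typing import List


def awake_cards_static(active: List[List[bool]]) -> int:
    """Awake line cards when every line is hard-wired to one card.

    active[p][i] is True if the line wired to port p of card i is active.
    A card must stay awake as soon as any one of its lines is active.
    """
    if not active:
        return 0
    k = len(active[0])
    return sum(any(row[i] for row in active) for i in range(k))


def awake_cards_kswitch(active: List[List[bool]]) -> int:
    """Awake line cards when each row of k lines sits behind a k-switch.

    Each switch packs its active lines onto the lowest-numbered cards, so
    card j is needed only if some switch has more than j active lines.
    """
    return max((sum(row) for row in active), default=0)


# Hypothetical batch: 4 line cards (columns), 3 port positions (rows).
batch = [
    [True, False, False, False],
    [False, False, True, False],
    [False, True, False, True],
]
print(awake_cards_static(batch), "cards awake with static assignment")
print(awake_cards_kswitch(batch), "cards awake with 4-way switches")
```

In this made-up batch only four of twelve lines are active, yet static wiring keeps every card awake, while packing lets half of the cards sleep.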

How much energy can we save? [trace-based simulation]
[Figure: energy savings vs. no-sleep (%) over time (h) for Optimal, BH2 + k-switch, and SoI.]
BH2 + k-switch saves 66%
Optimal savings are 80%

A performance bonus: reduced crosstalk
[Figure: average speedup (%) vs. number of inactive lines; 62 Mbps; loop lengths 50-600 m.]
Powering off lines makes the remaining … go faster due to reduced crosstalk!

Overview
[Figure: the threats map (power delivery/cooling problems, excessive energy consumption) across datacenter, datacenter network, core network, and access network, annotated with REsPoNse and BH2.]

Routing table computation
• Goal: match network resources to traffic
• Routing that minimizes energy consumption
– Multi-commodity flow problem, but with additional constraints for energy objective: links + routers (switches) on/off
– Problem is computationally intensive
– Heuristics take 5-15 minutes for small topologies
When traffic demand changes, optimal routing changes!

How often is recomputation needed?
[Geant2 - European academic network, 15-day trace]
[Figure: recomputation rate (per hour) at trace granularity (upper bound) and traffic demands (Gbps) vs. time, May 28 to Jun 9.]
Routing table recomputed 3-4 times per hour! (state-of-the-art)

Issues with recomputation
1) Recomputation wastes energy or causes congestion
2) Oscillations
3) Complexity
[Figure: traffic volume vs. time, marking where recomputation causes congestion and where it causes energy waste.]

Can we precompute routing tables?
[Figure: fraction of time an energy-optimal routing configuration is used (Geant2 trace).]
One routing table used 60% of the time
Too many routing configurations

Insight
[Figure: CDF of optimal paths included vs. number of alternative paths (1 to 5), for Geant and FatTree.]
Just a few precomputed paths offer near-optimal energy savings

REsPoNse (Responsive Energy-Proportional Networks)
• Always-on paths provide a routing that can carry low to medium amounts of traffic at the lowest energy consumption
• On-demand paths start carrying traffic when the load is beyond the capacity offered by the always-on paths
• Failover paths are designed to minimize the impact of single failures
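How the three kinds of precomputed paths might be used at run time can be sketched as follows. This is a simplified illustration for a single origin-destination pair, not the REsPoNse implementation: the `Path` type, the function name, the capacities, and the rule that the failover path becomes eligible only after a failure on one of the other two paths are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

Link = Tuple[str, str]


@dataclass(frozen=True)
class Path:
    links: Tuple[Link, ...]
    capacity_gbps: float


def place_traffic(demand_gbps: float, always_on: Path, on_demand: Path,
                  failover: Path,
                  failed_links: FrozenSet[Link] = frozenset()) -> Dict[str, float]:
    """Split one demand over precomputed paths, filling always-on first."""

    def alive(path: Path) -> bool:
        return not any(link in failed_links for link in path.links)

    tiers = [("always_on", always_on), ("on_demand", on_demand)]
    # Assumed policy: failover is only eligible once a regular path has failed.
    if not all(alive(path) for _, path in tiers):
        tiers.append(("failover", failover))

    placement = {"always_on": 0.0, "on_demand": 0.0, "failover": 0.0}
    remaining = demand_gbps
    for name, path in tiers:
        if remaining <= 0:
            break
        if not alive(path):
            continue
        share = min(remaining, path.capacity_gbps)
        placement[name] = share
        remaining -= share
    placement["unserved"] = max(remaining, 0.0)
    return placement


# Hypothetical paths between nodes A and C.
always_on = Path((("A", "B"), ("B", "C")), capacity_gbps=2.0)
on_demand = Path((("A", "D"), ("D", "C")), capacity_gbps=3.0)
failover = Path((("A", "E"), ("E", "C")), capacity_gbps=2.0)

print(place_traffic(1.5, always_on, on_demand, failover))
print(place_traffic(4.0, always_on, on_demand, failover))
print(place_traffic(4.0, always_on, on_demand, failover,
                    failed_links=frozenset({("A", "B")})))
```

At low demand only the always-on path carries traffic, so elements used solely by the other paths can sleep; no routing table is recomputed when demand rises or a link fails.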

REsPoNse Overview
Service-Level Objectives
Online components:
• Energy-aware Traffic Engineering (EATe) [e-Energy ‘10]
• Runtime Traffic Measurement [COMSNETS ‘09]
Offline components:
• Energy-proportional Routing Table Computation [CoNEXT ‘11]
• Traffic Estimation

REsPoNse routing example
[Figure: example topology with nodes A, B, C, D, E, F, G, H, J, K.]

REsPoNse in action
[Figure: traffic volume vs. time with online adaptation, contrasted with recomputation causing congestion and recomputation causing energy waste.]

REsPoNse benefits
• Energy savings match state-of-the-art
• Quick, stable adaptation to traffic changes
• Deployable
• Power/cooling provisioning for common case

Responsiveness/Energy-Proportionality (Geant)
• Replayed a 15-day trace
• 2 power models
– Today
– Future (static power is significantly reduced)
[Figure: power (% of original network) over time, May 28 to Jun 9, for OSPF, REsPoNse, and REsPoNse (alternative HW model), shown with traffic demands (Gbps).]
REsPoNse saves 30% - 45% by adding only 1 carefully precomputed routing table

Responsiveness/stability
10 Click open-source routers in a diamond topology (16 ms per-hop latency)
[Figure: rate (Mbps) on the upper, middle, and lower paths vs. time elapsed (s), marking when EATe starts running and when a link fails.]
EATe quickly and in a stable manner shifts traffic as needed (either to save energy or to avoid failed links)

Map cloud workloads to minimal resources for power and cooling
[Figure: resources allocated over time for three workloads of differing volume/type.]
State-of-the-art performance tuning takes several minutes, every time the workload changes
DejaVu [ASPLOS ’12]: 10-15 seconds to adapt

Conclusions
• Energy can substantially limit growth of networked systems
• BH2 (Aggregation) + switching saves 66% of access energy
• Turning DSL modems off increases performance
• REsPoNse: hybrid approach in backbone and data centers
• Enables provisioning power/cooling for the common case

Thanks!
BH2: Marco Canini, Eduard Goma, Alberto Lopez, Nikolaos Laoutaris, Pablo Rodriguez, Rade Stanojević, Pablo Yague
REsPoNse: Nedeljko Vasić, Dejan Novaković, Satyam Shekhar, Prateek Bhurat, Marco Canini
DejaVu: Nedeljko Vasić, Dejan Novaković, Svetozar Miucin, Ricardo Bianchini
