Kay Sripanidkulchai, Sambit Sahu, Yaoping Ruan, Anees Shaikh, and Chitra Dorai IBM T.J. Watson Research Center

Are Clouds Ready for Large Distributed Applications?

© 2009 IBM Corporation

Outline

• What are users expecting from the cloud?
  – Establish a base-line for requirements
• Is the cloud meeting user requirements?
  – Service deployment
  – Service availability
  – Service problem resolution
• Where are opportunities?


LADIS 2009


Enterprise vs. individual customers have different requirements

Typical Enterprise Application Architecture
• ITIL System Management Eco-system
• Security and Network Components
• Scalable/High-Availability/DR Architectures
• Enterprise-Class Application Building Blocks (3-Tiered + Messaging + etc.)
• Enterprise-Class Hardware

Typical Small/Individual Application Architecture
• ? ? ?
• Application Building Blocks (3-Tiered)
• Commodity Hardware

We study three primary requirements
• How to deploy large-scale distributed services on the cloud,
• How to deliver high-availability services using clouds, and
• What to do when there are problems with services running on the cloud.
• For others, see [AFG et al. 08], [WSRV09]

Are there sufficient building blocks available to enterprise users to quickly deploy their services on the cloud?

[Figure: share of available images by type (Base OS, Middleware, Application) for AMI and VMWare, March 23, 2009; axis from 0% to 100%. Data labels in the chart: 4, 21, 26, 530, 92, 552.]

Base OS and middleware images dominate the landscape. Where are the complex applications? Where are the multi-tier distributed applications with multiple images?


Towards supporting deployment of large-scale distributed applications…

• Service composition to support complex applications beyond single VMs.
  – Express relationships among these VMs denoting the dependencies at configuration time and at run time,
  – Compose complex deployments from single, already-built VMs, and
  – Instantiate the deployment based on the above-stated dependencies.
  Current status: Already headed this way with third-party services such as 3Tera and RightScale, but will eventually need a common standard.

• Transformation of existing enterprise service deployment into a cloud-based deployment
  – Discovery of application configuration and dependencies of the enterprise services to be migrated to the cloud
  – Determine the amount of infrastructure resources needed on the cloud and map application components to the resources
  – Support for provisioning the service and migrating to the cloud easily and quickly, without incurring service downtime. Can we do this live?
  Current status: Discovery techniques and dependency graphs have been explored in other contexts such as problem determination. The rest is open.
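The service composition steps above (express dependencies among VMs, then instantiate in dependency order) can be illustrated with a minimal sketch. The VM names and the dependency map below are hypothetical, chosen only to resemble a three-tier application with messaging; this is not how 3Tera or RightScale implement composition. It uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical deployment: each VM maps to the set of VMs it depends on.
DEPENDENCIES = {
    "database": set(),
    "message-queue": set(),
    "app-server": {"database", "message-queue"},
    "web-server": {"app-server"},
    "load-balancer": {"web-server"},
}


def startup_order(dependencies):
    """Return VM names in an order where every VM follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


def startup_waves(dependencies):
    """Group VMs into waves; VMs in the same wave can be started in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all VMs whose dependencies are up
        waves.append(ready)
        sorter.done(*ready)  # mark them started so dependents become ready
    return waves


if __name__ == "__main__":
    print(startup_order(DEPENDENCIES))
    for number, wave in enumerate(startup_waves(DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Grouping into waves is one possible design choice: it keeps the ordering guarantee while letting independent VMs (here the database and the message queue) boot concurrently.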


There are gaps in service availability requirements for enterprise users

[Figure: availability (%) in 2007 and 2008 for www.tobaksfakta.org, search.yahoo.com, www.amazon.com, www.cnn.com, www.ebay.com, www.navyfcu.org, www.walmart.com, www.matematikersamfundet.org.se, www.karlsborg.se, and onkelborg.com.]

• Enterprise: 99.987% (~1 hour downtime/year)
• Individual/Small: 99.368% (~55 hours downtime/year)
• State-of-the-art cloud SLA at 99.95%, or ~4 hours downtime/year
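The downtime figures quoted on this slide follow directly from the availability percentages. A minimal sketch of the conversion, assuming a 365-day year (the function name is illustrative, not from the talk):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Convert an availability percentage into hours of downtime per year."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for label, availability in [
        ("Enterprise", 99.987),
        ("Cloud SLA", 99.95),
        ("Individual/Small", 99.368),
    ]:
        hours = downtime_hours_per_year(availability)
        print(f"{label}: {availability}% -> {hours:.1f} hours/year")
```

This gives roughly 1.1, 4.4, and 55.4 hours per year, consistent with the ~1, ~4, and ~55 hours on the slide.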


Bridging the gap in service availability requirements

• Implementing scaling architectures in the cloud
  – Templates and rules that use system conditions to automatically select the appropriate architectural solution
  – Commoditize the expertise so that it can be reused by different cloud users
  Current status: components such as content delivery networks, load balancing, and automatic scaling (elasticity) are available, but best practices for how to use these components have not been established. Can the cloud just automatically do this for me?

• Extending availability beyond one cloud
  – API or framework to commoditize the construction of high-availability services delivered across multiple clouds
  Current status: few service providers; too early, but already concerned about lock-in

• Using the latest and greatest virtualization capabilities
  – Live migration to avoid downtime
  Current status: non-existent inside one cloud and across clouds. Who gets to decide when/why to migrate? The user or the cloud provider?
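To see why extending a service beyond one cloud could help close the availability gap, the following back-of-the-envelope sketch computes the availability of a service that stays up as long as at least one cloud is up. It rests on two strong assumptions that are not from the talk: cloud failures are independent, and failover between clouds is instantaneous and always succeeds. Real deployments would likely see less benefit.

```python
def combined_availability(availabilities_percent):
    """Availability (%) of a service that is up when at least one cloud is up.

    Assumes independent failures and instantaneous failover, both of which
    are simplifying assumptions made for this sketch.
    """
    if not availabilities_percent:
        raise ValueError("at least one cloud is required")
    all_down = 1.0  # probability that every cloud is down at the same time
    for availability in availabilities_percent:
        if not 0.0 <= availability <= 100.0:
            raise ValueError("availability must be between 0 and 100")
        all_down *= 1.0 - availability / 100.0
    return (1.0 - all_down) * 100.0


if __name__ == "__main__":
    one_cloud = combined_availability([99.95])
    two_clouds = combined_availability([99.95, 99.95])
    print(f"one cloud:  {one_cloud:.6f}%")
    print(f"two clouds: {two_clouds:.6f}%")
    print("meets 99.987% enterprise level:", two_clouds >= 99.987)
```

Under these idealized assumptions, two clouds at the 99.95% SLA level would exceed the 99.987% enterprise figure, which is one way to read the motivation for a multi-cloud API or framework.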


Best practice in service problem resolution faces scaling challenges

[Figure: breakdown of Amazon EC2 Forum posts, April 1-7, 2009, by category (Feature Request, HowTo/Info, Problem, Cloud Error, User Error, Unknown). Percentage labels in the chart: 10%, 56%, 25%, 11%, 64%.]

Observations
• Top problems: Instance, EBS, Security
• The same symptom presented to the user has many underlying root causes
• Resolution process is highly manual and ad hoc; manual information sharing is error-prone and not scalable
• Users do not know what is happening in the underlying infrastructure, and the cloud provider does not know what is happening in the users' applications

Where to go next
• Define an API for information sharing between users and providers that addresses privacy concerns
  • Is a minimum of a binary “your problem” vs. “my problem” query sufficient?
  • Can all of a user’s instances be managed together?
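As a way to make the binary “your problem” vs. “my problem” question concrete, here is a minimal sketch of what such a query might look like. Every name in it (`FaultDomain`, `InstanceHealth`, `attribute_fault`, `attribute_faults`) is hypothetical; no existing cloud provider API is being described. The provider keeps its internal health record private and returns only the attribution.

```python
from dataclasses import dataclass
from enum import Enum


class FaultDomain(Enum):
    # Labels are phrased from the provider's point of view.
    PROVIDER = "my problem"    # fault lies in the cloud infrastructure
    USER = "your problem"      # infrastructure looks healthy
    UNKNOWN = "unknown"        # provider has no health data for the instance


@dataclass(frozen=True)
class InstanceHealth:
    """Hypothetical provider-side record; never returned to the user."""
    instance_id: str
    host_ok: bool
    storage_ok: bool
    network_ok: bool


def attribute_fault(instance_id, provider_records):
    """Answer the binary query for one instance without exposing internals."""
    record = provider_records.get(instance_id)
    if record is None:
        return FaultDomain.UNKNOWN
    if record.host_ok and record.storage_ok and record.network_ok:
        return FaultDomain.USER
    return FaultDomain.PROVIDER


def attribute_faults(instance_ids, provider_records):
    """Answer the query for all of a user's instances in one call."""
    return {iid: attribute_fault(iid, provider_records) for iid in instance_ids}


if __name__ == "__main__":
    records = {
        "i-001": InstanceHealth("i-001", True, True, True),
        "i-002": InstanceHealth("i-002", True, False, True),
    }
    for iid, domain in attribute_faults(["i-001", "i-002", "i-003"], records).items():
        print(f"{iid}: {domain.value}")
```

Returning only an enum value is one possible answer to the privacy concern: the user learns where to look without seeing details of the provider's infrastructure, and the batch call touches on the question of managing all of a user's instances together.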


Summary

• Explored three requirements from the perspective of cloud users
  – Compared individual/small users vs. enterprise users
  – Established a base-line using publicly available data

• Service deployment
  – Current practice focuses on monolithic systems, with some initial support for more complex distributed applications underway.
  – Future work to support large-scale distributed architectures is needed.

• Service availability
  – SLAs are in place and high enough to meet individuals’ needs.
  – Future work to increase availability is crucial to attract enterprise users and would also benefit individual users.

• Problem resolution
  – Current manual process faces scaling challenges.
  – To scale up, future work is needed to reduce the load on cloud support staff, such as giving cloud users enough visibility into the cloud infrastructure to independently identify the root cause of problems.
