Unbundling Transaction Services in the Cloud
D. Lomet, A. Fekete, G. Weikum, M. Zwilling
CIDR 2009
Presented by Vinod Venkataraman
CS 395T – Spring 2010

Introduction
• Traditional DBMS transactional storage managers have:
    - Lock manager for concurrency control
    - Recovery log manager
    - Buffers for database I/Os
    - Disk access methods
• These components are tightly coupled for high performance
• This paper: separate transactional services (TC) from data services (DC)

Motivations
• Cloud computing: separated components allow for greater flexibility
• Improving parallelism in multicore architectures
• Building application-specific data management engines
• Leveraging the processing power of I/O device controllers
• Simplifying extensibility: addition of new data types

Problems?

Application Perspective
• Sample application: Web 2.0 photo sharing application
• Data types
    - Images: large persistent storage space
    - User accounts, ownerships, access rights, comments, groups, friendships: high update rates
• Traditional DBMS-style cloud can work
    - Indexing?
• Simpler storage service can work
    - Concurrency control, recovery?
• Unbundling can satisfy the above requirements, and is "simpler to implement"

Architecture

[Figure: architecture diagram, reproduced from paper]

Transactional Component (TC)
• Unaware of data page structure
• Provides transactional locking for isolation
• Provides transaction atomicity
    - Commit after DC performs all the required logical operations
    - Abort after forcing DC to roll back operations
• Provides transaction logging: undo and redo

Data Component (DC)
• Organizes, caches, searches, updates data
• Solely responsible for mapping records to disk pages
• Provides data atomicity
    - Maintains indexes and storage structures
• Provides cache management
• Provides idempotence on requested operations

TC – DC Interactions
• Many interactions are not made stable immediately
    - Extensive use of caching
• Unique request IDs (LSNs)
• Resend requests
• The combination of the above and idempotence in the DC ensures exactly-once execution, even if requests are sent multiple times: fault tolerance
• Contract termination
    - Corresponds to checkpointing
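
The interplay of unique request IDs, resends, and DC idempotence can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `DataComponent` class, its `increment` operation, and the `send_with_retries` helper are hypothetical names, and the set of applied LSNs is kept in memory only.

```python
from dataclasses import dataclass, field


@dataclass
class DataComponent:
    """Toy DC (hypothetical): executes each request at most once, keyed by LSN."""
    records: dict = field(default_factory=dict)
    applied: set = field(default_factory=set)  # LSNs of requests already executed

    def increment(self, lsn: int, key: str, delta: int) -> int:
        # Idempotence: a repeated LSN is acknowledged but not re-executed.
        if lsn not in self.applied:
            self.records[key] = self.records.get(key, 0) + delta
            self.applied.add(lsn)
        return self.records[key]


def send_with_retries(dc: DataComponent, lsn: int, key: str, delta: int,
                      attempts: int = 3) -> int:
    """Hypothetical TC-side resend loop: every attempt reuses the same LSN."""
    result = None
    for _ in range(attempts):
        # Pretend the earlier replies were lost, so the TC resends.
        result = dc.increment(lsn, key, delta)
    return result


dc = DataComponent()
print(send_with_retries(dc, lsn=101, key="likes", delta=1))  # 1
print(send_with_retries(dc, lsn=102, key="likes", delta=1))  # 2
```

Each request is sent three times, yet the counter advances once per LSN, which is the exactly-once effect the slide describes.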

Challenges – Concurrency Control
• Locking individual records, named by record identifiers: easy
• Locking ranges of records: harder
    - Traditional DBMS: key range locking
    - Unbundled: locking must be done before sending requests to the DC
• Solutions
    - Fetch ahead protocol
    - Explicit range locks
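
One way to picture explicit range locks is a TC-side lock table over a fixed partitioning of the key space, so that a range can be locked before any request reaches the DC. The sketch below rests on assumptions that are not taken from the slides: static partition boundaries, exclusive locks only, and the hypothetical names `RangeLockTable`, `lock_range`, and `release_all`.

```python
import bisect


class RangeLockTable:
    """Toy TC-side lock table over a fixed partition of the key space.

    Assumptions for illustration: boundaries never change, locks are exclusive.
    """

    def __init__(self, boundaries):
        self.boundaries = sorted(boundaries)  # [100, 200] gives 3 partitions
        self.owner = {}                       # partition index -> transaction id

    def partition_of(self, key):
        return bisect.bisect_right(self.boundaries, key)

    def lock_range(self, txn, lo, hi):
        """Lock every partition overlapping [lo, hi]; all-or-nothing."""
        needed = range(self.partition_of(lo), self.partition_of(hi) + 1)
        if any(self.owner.get(p, txn) != txn for p in needed):
            return False                      # conflict: caller waits or aborts
        for p in needed:
            self.owner[p] = txn
        return True

    def release_all(self, txn):
        self.owner = {p: t for p, t in self.owner.items() if t != txn}


locks = RangeLockTable([100, 200])
print(locks.lock_range("T1", 120, 180))  # True: T1 scans keys 120..180
print(locks.lock_range("T2", 150, 150))  # False: T2's insert of key 150 is blocked
locks.release_all("T1")
print(locks.lock_range("T2", 150, 150))  # True
```

Because the lock covers the whole partition rather than the records that currently exist, an insert into the scanned range conflicts even though the TC never sees page boundaries.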

Challenges - Recovery
• TC knows nothing about pages, therefore TC log records cannot contain page identifiers
    - Logical redo needed
• Out-of-order executions
    - TC may assign LSNs before DC determines order of operations
• DC may perform internal "system transactions" upon recovery
    - DC must ensure that redo operations still execute correctly despite possible reordering

Out-of-Order Executions
• Current technique for traditional systems
    - The operation's LSN is compared with the LSN stored in the page acted upon: Operation LSN <= Page LSN
    - Logical log records are produced during the critical section in which a page is modified, so the page carries the LSN of its latest update (Page LSN)
    - If the test is true, redo is skipped
    - Otherwise, the operation is re-executed and the page is updated with its LSN
• Not suitable for an unbundled system
    - Consider operation Oj with LSNj that executes before operation Oi with LSNi, where LSNi < LSNj
    - If the page is stabilized, it will contain Page LSN = LSNj
    - The test wrongly indicates that Oi has already been performed on the page
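
The failure case can be replayed with concrete numbers. The snippet below is only an illustration with made-up LSN values (10 for Oi, 20 for Oj); `traditional_needs_redo` is a hypothetical helper name for the classic page-LSN test.

```python
def traditional_needs_redo(op_lsn: int, page_lsn: int) -> bool:
    """Classic test: redo only if the operation is newer than the page."""
    return op_lsn > page_lsn


# Out-of-order arrival at the DC: Oj (LSN 20) is applied, Oi (LSN 10) is not yet.
page_lsn = 0
applied = []

page_lsn = max(page_lsn, 20)  # Oj applied; page stabilized with Page LSN = 20
applied.append(20)

# After a crash, redo considers Oi: the test says "skip" although Oi never ran.
print(traditional_needs_redo(10, page_lsn))  # False
print(10 in applied)                         # False
```

A single Page LSN can only say "everything up to here is included", which is false as soon as operations reach the page out of LSN order.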

Out-of-Order Executions
• New technique
    - Abstract LSN (abLSN): accurately captures all operations executed and included in page state
    - Also indicates which operations are NOT included on the page
• abLSN = <LSNlw, {LSNin}>
    - LSNlw: has a value such that no operation with LSN <= LSNlw needs to be re-executed
    - {LSNin}: set of LSNs of operations greater than LSNlw whose effects are also included on the page
• New test: LSNi <= LSNlw or LSNi in {LSNin}
• abLSN can be kept in cache until the page is made stable on disk
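
The abLSN test translates directly into a small data structure. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, and `advance_low_water` assumes the caller already knows that no operation at or below the new low-water mark is still outstanding (how that is learned is not covered on the slide).

```python
from dataclasses import dataclass, field


@dataclass
class AbstractLSN:
    lw: int = 0                                 # LSNlw: all LSNs <= lw are on the page
    included: set = field(default_factory=set)  # {LSNin}: applied LSNs above lw

    def contains(self, lsn: int) -> bool:
        """New redo test: True means the operation must NOT be re-executed."""
        return lsn <= self.lw or lsn in self.included

    def record(self, lsn: int) -> None:
        """Note that the operation with this LSN has been applied to the page."""
        if lsn > self.lw:
            self.included.add(lsn)

    def advance_low_water(self, new_lw: int) -> None:
        """Hypothetical: caller guarantees nothing <= new_lw is outstanding."""
        self.lw = max(self.lw, new_lw)
        self.included = {l for l in self.included if l > self.lw}


ab = AbstractLSN(lw=5)
ab.record(20)              # Oj arrives first
print(ab.contains(10))     # False: Oi still needs redo
print(ab.contains(20))     # True: Oj is on the page
ab.record(10)              # Oi arrives later
ab.advance_low_water(20)   # the set can now be collapsed
print(ab.lw, ab.included)  # 20 set()
```

Unlike a single Page LSN, the pair distinguishes "Oj is included" from "everything before Oj is included", at the cost of a set that grows until the low-water mark advances.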

System Transactions
• Changes to the internal data structures of the DC, such as B-Tree page splits
• Current technique
    - System transactions have to be redone in original execution order
    - Undo: first system transactions, then user-level transactions
• New technique
    - DC tracks structural modifications using its own LSNs (dLSNs)
    - DC indexes must be made well-formed by completing redo and undo of system transactions from the DC-log, prior to the TC executing its redo recovery

Partial Failures
• DC failure
    - DC loses in-cache state, reverts to state on secondary store
    - TC is notified, and resends operations from the last checkpoint
• TC failure
    - TC has to reset the state of the DC to an earlier state
    - DC cache may contain pages which reflect the effects of TC operations that have been lost
    - These must be reversed before the TC resends operations from its stable log to be re-applied in the DC

Multiple TCs per DC
• DC requirements
    - Multiple abstract LSNs per page
    - TC failure recovery
        - Simply asking the failed TC to resend its operations will not work
        - The paper expects "most pages to have updates from a single TC" and optimizes for this case
• Data sharing among TCs
    - Non-versioned data
        - Read-only: easy to implement
        - Dirty reads: uncommitted data may be read, modified
    - Versioned data: read committed access, write on new uncommitted version

Case Study – Movie Information
• 4 transaction workloads
    - W1: obtain all reviews for a particular movie
    - W2: add a movie review written by a user
    - W3: update profile information for a user
    - W4: obtain all reviews written by a particular user
• 4 tables to support these workloads:
    - Movies (primary key MId): contains general information about each movie. Supports W1
    - Reviews (primary key MId, UId): contains movie reviews written by users. Updated by W2 to support W1
    - Users (primary key UId): contains profile information about users. Updated by W3
    - MyReviews (primary key UId, MId): contains a copy of reviews written by a particular user. Updated by W2 to support W4. Effectively this table is an index in the physical schema, since it contains redundant data from the Reviews table
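
The role of MyReviews as a redundant, index-like copy is easy to see with in-memory stand-ins for the two tables. This is only a sketch: the dictionaries and function names are hypothetical, and locking, logging, and the placement of tables on DCs are deliberately left out.

```python
reviews = {}     # (MId, UId) -> review text; serves W1
my_reviews = {}  # (UId, MId) -> review text; serves W4


def add_review(mid: int, uid: int, text: str) -> None:
    """W2: write the review to both tables (transaction boundaries omitted)."""
    reviews[(mid, uid)] = text
    my_reviews[(uid, mid)] = text


def reviews_for_movie(mid: int):
    """W1: all reviews for one movie, read from Reviews."""
    return sorted((u, t) for (m, u), t in reviews.items() if m == mid)


def reviews_by_user(uid: int):
    """W4: all reviews by one user, read from MyReviews."""
    return sorted((m, t) for (u, m), t in my_reviews.items() if u == uid)


add_review(mid=1, uid=7, text="great")
add_review(mid=2, uid=7, text="dull")
add_review(mid=1, uid=9, text="fine")
print(reviews_for_movie(1))  # [(7, 'great'), (9, 'fine')]
print(reviews_by_user(7))    # [(1, 'great'), (2, 'dull')]
```

W2 is the only workload that touches two tables, which is why keeping Reviews and MyReviews consistent is the interesting transactional case in this study.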

Case Study – Movie Information

[Figure: reproduced from paper]

Conclusion
• Paper suggests a significant paradigm shift in transaction managers, concurrency control and recovery
• Separation of TC and DC may result in easier design considerations for application developers
• Does not provide an implementation, or even partial implementations, of the concepts discussed

Discussion
• How will the performance of an unbundled system compare with a traditional DBMS whose transaction manager is tightly coupled with the internal data structures?
• Are the claimed "benefits" of this system for cloud and multicore architectures justified?
