Unbundling Transaction Services in the Cloud
D. Lomet, A. Fekete, G. Weikum, M. Zwilling (CIDR 2009)
Presented by Vinod Venkataraman, CS 395T, Spring 2010
Introduction
- Traditional DBMS transactional storage managers have:
  - Lock manager for concurrency control
  - Recovery log manager
  - Buffers for database I/Os
  - Disk access methods
- Tightly coupled for high performance
- This paper: separate transactional services (TC) from data services (DC)
Motivations
- Cloud computing: separated components allow for greater flexibility
- Improving parallelism in multicore architectures
- Building application-specific data management engines
- Leveraging the processing power of I/O device controllers
- Simplifying extensibility: addition of new data types

Problems?
Application Perspective
- Sample application: a Web 2.0 photo-sharing application
- Data types:
  - Images: large persistent storage space
  - User accounts, ownerships, access rights, comments, groups, friendships: high update rates
- A traditional DBMS-style cloud can work... but what about indexing?
- A simpler storage service can work... but what about concurrency control and recovery?
- Unbundling can satisfy the above requirements and is "simpler to implement"
Architecture
(Figure reproduced from the paper)
Transactional Component (TC)
- Unaware of data page structure
- Provides transactional locking for isolation
- Provides transaction atomicity:
  - Commit after the DC performs all the required logical operations
  - Abort after forcing the DC to roll back operations
- Provides transaction logging (undo and redo)
Data Component (DC)
- Organizes, caches, searches, and updates data
- Solely responsible for mapping records to disk pages
- Provides data atomicity
- Maintains indexes and storage structures
- Provides cache management
- Provides idempotence for requested operations
TC-DC Interactions
- Many interactions are not made stable immediately
  - Extensive use of caching
- Unique request IDs (LSNs)
- Resend requests
- The combination of the above, plus idempotence in the DC, ensures exactly-once execution even if requests are sent multiple times (fault tolerance)
- Contract termination corresponds to checkpointing
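A minimal sketch of this exactly-once contract, with hypothetical class and method names (not from the paper): the TC tags each request with a unique LSN, and the DC tracks which LSNs it has already applied, so a resend after a suspected failure is a harmless no-op.

```python
# Toy model of the TC-DC contract: unique LSNs + DC-side idempotence
# make resent requests safe. All names here are illustrative.

class DataComponent:
    """Toy DC: applies logical operations idempotently, keyed by LSN."""

    def __init__(self):
        self.records = {}          # record id -> value (stand-in for pages)
        self.applied_lsns = set()  # LSNs whose effects are already included

    def apply(self, lsn, key, value):
        # Idempotence: a resent request with an already-applied LSN is a no-op.
        if lsn in self.applied_lsns:
            return "duplicate-ignored"
        self.records[key] = value
        self.applied_lsns.add(lsn)
        return "applied"


class TransactionalComponent:
    """Toy TC: assigns LSNs and may resend requests after a timeout."""

    def __init__(self, dc):
        self.dc = dc
        self.next_lsn = 1

    def update(self, key, value):
        lsn = self.next_lsn
        self.next_lsn += 1
        # Simulate an uncertain network: send twice. DC idempotence
        # guarantees the effect still happens exactly once.
        first = self.dc.apply(lsn, key, value)
        second = self.dc.apply(lsn, key, value)  # resend
        return first, second


tc = TransactionalComponent(DataComponent())
print(tc.update("user:1", "alice"))  # ('applied', 'duplicate-ignored')
```

The same mechanism is what makes recovery-time replay safe: the TC can blindly resend everything after a checkpoint, and only the operations the DC actually lost take effect.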
Challenges: Concurrency Control
- Locking individual records, named by record identifiers: easy
- Locking ranges of records: harder
  - Traditional DBMS: key-range locking
  - Unbundled: locking must be done before sending requests to the DC
- Solutions:
  - Fetch-ahead protocol
  - Explicit range locks
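A toy illustration of the explicit-range-lock idea: since the TC cannot see the DC's key layout, it locks a whole key interval itself before sending the range request. The class and conflict rule below are invented for illustration; a real lock manager would also handle lock modes, waiting, and deadlock detection.

```python
# Sketch of TC-side explicit range locking (illustrative, not the
# paper's protocol): a range request may be sent to the DC only after
# a covering range lock has been granted.

class RangeLockManager:
    def __init__(self):
        self.locks = []  # list of (lo, hi, txn) held range locks on [lo, hi)

    def lock_range(self, lo, hi, txn):
        # Conflict if any other transaction holds an overlapping range.
        for l, h, t in self.locks:
            if t != txn and lo < h and l < hi:
                return False  # would block in a real lock manager
        self.locks.append((lo, hi, txn))
        return True


mgr = RangeLockManager()
assert mgr.lock_range("a", "m", txn=1)       # T1 locks keys in [a, m)
assert not mgr.lock_range("k", "z", txn=2)   # T2 overlaps on [k, m): denied
assert mgr.lock_range("m", "z", txn=2)       # disjoint range: granted
# Only after the lock is granted does the TC send the range scan to the DC.
```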
Challenges: Recovery
- The TC knows nothing about pages, so TC log records cannot contain page identifiers
  - Logical redo needed
- Out-of-order executions:
  - TC may assign LSNs before the DC determines the order of operations
  - DC may perform internal "system transactions" upon recovery
  - DC must ensure that redo operations still execute correctly despite possible reordering
Out-of-Order Executions: Current Technique
- In traditional systems, an operation's LSN is compared with the LSN stored in the page acted upon
  - Logical log records are produced during the critical section in which a page is modified, and therefore carry the page LSN
  - If Operation LSN <= Page LSN, redo is prohibited
  - Otherwise, the operation is re-executed and the page is stamped with the operation's LSN
- Not suitable for an unbundled system:
  - Consider operation Oj with LSNj that executes before operation Oi with LSNi, where LSNi < LSNj
  - If the page is then stabilized, it will contain Page LSN = LSNj
  - The test wrongly concludes that Oi has already been applied to the page
Out-of-Order Executions: New Technique
- Abstract LSN (abLSN): accurately captures all operations executed and included in the page state
  - Also indicates which operations are NOT included on the page
- LSNlw: chosen such that no operation with LSN <= LSNlw needs to be re-executed
- {LSNin}: set of LSNs greater than LSNlw whose effects are also included on the page
- New test: skip redo if LSNi <= LSNlw or LSNi is in {LSNin}
- The abLSN can be kept in the cache until the page is made stable on disk
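The abLSN redo test can be sketched as follows. Representing an abLSN as a (LSNlw, {LSNin}) pair is an assumption of this sketch; the scenario mirrors the out-of-order example above, where a later operation reached the page before an earlier one.

```python
# Sketch of the abLSN redo test: an operation's effect is in the page
# iff its LSN is at or below the low-water mark, or explicitly listed.

def needs_redo(op_lsn, ab_lsn):
    lsn_lw, lsn_in = ab_lsn
    # Skip redo iff the operation's effect is already in the page state.
    return not (op_lsn <= lsn_lw or op_lsn in lsn_in)


# Page state after out-of-order execution: O7 was applied, O5 was not.
ab_lsn = (4, {7})

# The traditional test "op_lsn <= page_lsn" with page LSN = 7 would
# wrongly skip the redo of O5; the abLSN test gets all three cases right:
assert needs_redo(5, ab_lsn) is True    # O5 missing: must be redone
assert needs_redo(7, ab_lsn) is False   # O7 already included: skip
assert needs_redo(3, ab_lsn) is False   # below the low-water mark: skip
```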
System Transactions
- Changes to the internal data structures of the DC, such as B-tree page splits
- Current technique:
  - System transactions must be redone in original execution order
  - Undo: first system transactions, then user-level transactions
- New technique:
  - DC maintains structural modification information via dLSNs
  - DC indexes must be well-formed, by completing redo and undo of system transactions from the DC log, before the TC executes its redo recovery
Partial Failures
- DC failure:
  - DC loses its in-cache state and reverts to the state on secondary storage
  - TC is notified and resends operations from the last checkpoint
- TC failure:
  - TC has to reset the state of the DC to an earlier state
  - DC cache may contain pages that reflect the effects of TC operations that have been lost
  - These must be reversed before the TC resends operations from its stable log to be re-applied at the DC
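DC-failure recovery can be sketched as a replay of the TC's stable log against the durable state, with idempotence filtering out operations whose effects already reached secondary storage. The function and record format below are illustrative, not from the paper.

```python
# Sketch of DC-failure recovery: revert to durable state, then replay
# the TC's stable log from the last checkpoint; idempotence (tracking
# applied LSNs) makes redundant resends harmless. Names are illustrative.

def recover_dc(durable_state, tc_log, checkpoint_lsn):
    """Rebuild DC state from secondary storage plus the TC's stable log."""
    state = dict(durable_state)              # state on secondary store
    applied = set(state.pop("_lsns", set())) # LSNs already reflected on disk
    for lsn, key, value in tc_log:
        if lsn <= checkpoint_lsn:
            continue  # before the checkpoint: guaranteed already stable
        if lsn in applied:
            continue  # idempotence: effect already present in durable state
        state[key] = value
        applied.add(lsn)
    return state


durable = {"x": "old", "_lsns": {1}}
log = [(1, "x", "old"), (2, "x", "new"), (3, "y", "v")]
assert recover_dc(durable, log, checkpoint_lsn=1) == {"x": "new", "y": "v"}
```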
Multiple TCs per DC
- DC requirements:
  - Multiple abstract LSNs per page
- TC failure recovery:
  - Simply asking the failed TC to resend its operations will not work
  - The paper "expects most pages to have updates from a single TC and optimizes for this case"
- Data sharing among TCs:
  - Non-versioned data:
    - Read-only: easy to implement
    - Dirty reads: uncommitted data may be read and modified
  - Versioned data: read-committed access; writes create a new uncommitted version
Case Study: Movie Information
- 4 transaction workloads:
  - W1: obtain all reviews for a particular movie
  - W2: add a movie review written by a user
  - W3: update profile information for a user
  - W4: obtain all reviews written by a particular user
- 4 tables to support these workloads:
  - Movies (primary key MId): general information about each movie; supports W1
  - Reviews (primary key MId, UId): movie reviews written by users; updated by W2 to support W1
  - Users (primary key UId): profile information about users; updated by W3
  - MyReviews (primary key UId, MId): a copy of reviews written by a particular user; updated by W2 to support W4. Effectively this table is an index in the physical schema, since it contains redundant data from the Reviews table
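A toy Python model of this schema makes the MyReviews redundancy concrete: W2 must write both tables in one transaction, because MyReviews duplicates Reviews under a user-first key to make W4 a direct lookup. All identifiers below are illustrative.

```python
# Toy model of the case-study schema: MyReviews is a redundant copy of
# Reviews keyed by (UId, MId), i.e. a materialized index for workload W4.

movies = {}      # MId -> movie info
reviews = {}     # (MId, UId) -> review text; supports W1
users = {}       # UId -> profile info
my_reviews = {}  # (UId, MId) -> review text; supports W4


def w2_add_review(mid, uid, text):
    # Both writes belong to a single transaction: the redundancy in
    # MyReviews must stay consistent with Reviews.
    reviews[(mid, uid)] = text
    my_reviews[(uid, mid)] = text


def w1_reviews_for_movie(mid):
    return [t for (m, u), t in reviews.items() if m == mid]


def w4_reviews_by_user(uid):
    return [t for (u, m), t in my_reviews.items() if u == uid]


w2_add_review("m1", "u1", "great")
w2_add_review("m1", "u2", "meh")
assert w1_reviews_for_movie("m1") == ["great", "meh"]
assert w4_reviews_by_user("u1") == ["great"]
```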
Case Study: Movie Information
(Figure reproduced from the paper)
Conclusion
- The paper suggests a significant paradigm shift in transaction managers, concurrency control, and recovery
- Separation of TC and DC may simplify design decisions for application developers
- Does not provide an implementation, or even partial implementations, of the concepts discussed
Discussion
- How will the performance of an unbundled system compare with a traditional DBMS whose transaction manager is tightly coupled with the internal data structures?
- Are the claimed "benefits" of this system for cloud and multicore architectures justified?