Reducing OLTP Instruction Misses with Thread Migration
Islam Atta, Pınar Tözün, Anastasia Ailamaki, Andreas Moshovos
University of Toronto; École Polytechnique Fédérale de Lausanne
OLTP on an Intel Xeon5660
Shore-MT, hyper-threading disabled
[Figure: two bar charts for TPC-C and TPC-E. One shows Instructions per Cycle (higher is better); the other shows the Breakdown of Core Stalls (%), split into Resource (includes data) and Instructions.]
IPC < 1 on a 4-issue machine
70-80% of stalls are instruction stalls
2
OLTP L1 Instruction Cache Misses
Trace simulation, 4-way L1-I cache, Shore-MT
[Figure: L1-I misses per k-instruction (lower is better) versus cache size (16 KB to 1024 KB) for TPC-C and TPC-E; one cache size is annotated "Most common today!".]
~512KB is enough for OLTP instruction footprint
3
Reducing Instruction Stalls at the hardware level
• Larger L1-I cache size: higher access latency
• Different replacement policies: does not really affect OLTP workloads
• Advanced prefetching: has too much space overhead (40KB per core)

Alternative: Thread Migration
• Enables usage of aggregate L1-I capacity
  – Large cache size without increased latency
• Can exploit instruction commonality
  – Localizes common transaction instructions
• Dynamic hardware solution
  – More general purpose
5
Transactions Running Parallel
[Figure: threads T1, T2, and T3 of a transaction, each split into instruction parts that can fit into L1-I.]
Common instructions among concurrent threads
6
Scheduling Threads
[Figure: timelines of threads T1, T2, and T3 scheduled on cores 1 and 2 under Traditional scheduling and under TMi, showing the L1-I contents over time and a running Total Misses count for each scheme.]
TMi
[Figure: Transaction A (threads T1, T2) and Transaction B (threads T3, T4) scheduled over time on cores 0 and 1, each core with its L1-I.]
• Group threads
• Wait till L1-I is almost full
  – Count misses
  – Record last N misses
  – Misses > threshold => Migrate
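The trigger described in these bullets can be sketched in a few lines of Python. This is only one reading of the slide, not the authors' hardware design: it assumes the per-thread miss counter itself serves as the "almost full" signal, and the names `MissMonitor`, `record_miss`, and `reset` are hypothetical. The default values (256 and 6) are the ones listed on the Experimental Setup slide.

```python
from collections import deque


class MissMonitor:
    """Illustrative per-thread L1-I miss tracker (hypothetical, not the TMi hardware)."""

    def __init__(self, miss_threshold=256, history_len=6):
        self.miss_threshold = miss_threshold
        # Bounded queue: keeps only the last N missed block addresses.
        self.recent_misses = deque(maxlen=history_len)
        self.miss_count = 0

    def record_miss(self, block_addr):
        """Count one L1-I miss; return True once misses exceed the threshold."""
        self.miss_count += 1
        self.recent_misses.append(block_addr)
        return self.miss_count > self.miss_threshold

    def reset(self):
        """Assumed behaviour: start counting afresh after a migration."""
        self.miss_count = 0
        self.recent_misses.clear()
```

A simulator would call `record_miss` on every instruction-cache miss of the running thread and, when it returns True, use `recent_misses` to pick a destination core before calling `reset`.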
8
TMi
[Figure: threads T1 and T2 of Transaction A scheduled over time on cores 0 and 1, each core with its L1-I.]
Where to migrate?
• Check the last N misses recorded in other caches
  1) No matching cache => Move to an idle core if exists
  2) Matching cache => Move to that core
  3) None of above => Do not move
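The three rules can be sketched as a small selection function. The slide does not define what counts as a "matching" cache, so this sketch assumes a remote L1-I matches when it holds every recorded missed block; `choose_target` and its parameters are hypothetical names, and the dictionary of cache contents stands in for whatever lookup mechanism the hardware would actually use.

```python
def choose_target(current_core, recent_misses, l1i_contents, idle_cores):
    """Pick a destination core, or return None to stay put (illustrative only).

    l1i_contents maps core id -> set of block addresses held in that core's L1-I.
    """
    recent = list(recent_misses)
    # Rule 2: a remote cache that already holds the recently missed blocks.
    # Assumption: "matching" means all recorded misses are present.
    for core in sorted(l1i_contents):
        if core == current_core:
            continue
        if recent and all(addr in l1i_contents[core] for addr in recent):
            return core
    # Rule 1: no matching cache, so fall back to an idle core if one exists.
    for core in sorted(idle_cores):
        if core != current_core:
            return core
    # Rule 3: neither applies, so the thread does not move.
    return None
```

For example, with `caches = {0: {1, 2}, 1: {7, 8, 9}, 2: set()}`, a thread on core 0 whose last misses were blocks 7 and 8 would be sent to core 1, while one that missed on blocks 5 and 6 would go to an idle core if any is available.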
9
Experimental Setup
• Trace Simulation
  – PIN to extract instructions & data accesses per transaction
  – 16 core system
  – 32KB 8-way set-associative L1 caches
  – Miss-threshold is 256
  – Last 6 misses are kept
• Shore-MT as the storage manager
  – Workloads: TPC-C, TPC-E
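To make the "misses per k-instruction" metric concrete, here is a minimal sketch of a trace-driven set-associative cache model. It is not the simulator used for the talk: the 64-byte line size, the LRU replacement policy, and the names `SetAssocCache` and `mpki` are assumptions made for illustration, and the trace is simply an iterable of instruction addresses with one entry per instruction.

```python
from collections import OrderedDict


class SetAssocCache:
    """Toy set-associative cache with LRU replacement (assumed policy)."""

    def __init__(self, size_bytes=32 * 1024, ways=8, line_bytes=64):
        self.ways = ways
        self.line_bytes = line_bytes  # assumption: slides do not give a line size
        self.num_sets = size_bytes // (ways * line_bytes)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.misses = 0

    def access(self, addr):
        """Look up one address; return True on a hit, False on a miss."""
        line = addr // self.line_bytes
        lines = self.sets[line % self.num_sets]
        if line in lines:
            lines.move_to_end(line)      # mark as most recently used
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)    # evict the least recently used line
        lines[line] = True
        return False


def mpki(trace, cache):
    """Misses per 1000 instructions over a trace of instruction addresses."""
    start = cache.misses
    count = 0
    for addr in trace:
        cache.access(addr)
        count += 1
    return 1000.0 * (cache.misses - start) / count if count else 0.0
```

With the default parameters the model has 64 sets, matching a 32KB 8-way cache with the assumed 64-byte lines.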
10
Impact on L1-I Misses
[Figure: instruction misses per k-instruction (lower is better) for No Migration, TMi, and TMi Blind on TPC-C and TPC-E.]
Instruction misses reduced by half
11
Impact on L1-D Misses
[Figure: misses per k-instruction (lower is better), broken down into Instruction, Read Data, and Write Data, for No Migration, TMi, and TMi Blind on TPC-C and TPC-E.]
Cannot ignore increased data misses
12
TMi's Challenges
• Dealing with the data left behind
  – Prefetching
• OS support needed
  – Disabling OS control over thread scheduling
13
Conclusion
• ~50% of the time OLTP stalls on instructions
• Spread computation through thread migration
• TMi
  – Halves L1-I misses
  – Time-wise ~30% expected improvement
  – Data misses should be handled