



Advanced Superscalar Microprocessors

Dr.ShaneMatts,United States,Teacher
Published Date:23-07-2017
Joel Emer
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
Based on the material prepared by Krste Asanovic and Arvind
6.823 L14, October 31, 2005


O-o-O Execution with ROB

[Figure: Register File and Rename Table (tag, valid bit) feeding a Reorder buffer. Each buffer entry has the fields Ins#, use, exec, op, p1, src1, p2, src2, pd, dest, data. Two pointers mark "Next to commit" and "Next available". Below the buffer: Load Unit, FU, FU, FU, Store Unit, and Commit, with results broadcast as <t, result>.]

Basic Operation:
• Enter op and tag or data (if known) for each source
• Replace tag with data as it becomes available
• Issue instruction when all sources are available
• Save dest data when operation finishes
• Commit saved dest data when instruction commits


Unified Physical Register File (MIPS R10K, Alpha 21264, Pentium 4)

[Figure: Rename Table with snapshots for mispredict recovery, a single Reg File, Load Unit, four FUs, and Store Unit; results broadcast as <t, result>. ROB not shown.]

• One regfile for both committed and speculative values (no data in ROB)
• During decode, the instruction result is allocated a new physical register, and source regs are translated to physical regs through the rename table
• Instruction reads data from the regfile at start of execute (not in decode)
• Write-back updates reg. busy bits on instructions in the ROB (associative search)
• Snapshots of the rename table are taken at every branch to recover from mispredicts
• On exception, renaming is undone in reverse order of issue (MIPS R10000)


Speculative & Out-of-Order Execution

[Figure: pipeline of PC, Fetch, Decode & Rename, Reorder Buffer, Commit. Fetch, decode, and rename are In-Order; Execute is Out-of-Order; Commit is In-Order. Execute holds the Physical Reg. File, Branch Unit, ALU, MEM, Store Buffer, and D. Branch Prediction steers fetch; Branch Resolution sends kill signals to the earlier stages and updates the predictors.]


Lifetime of Physical Registers

• Physical regfile holds committed and speculative values
• Physical registers decoupled from ROB entries (no data in ROB)

Before rename        After rename
ld  r1, (r3)         ld  P1, (Px)
add r3, r1, 4        add P2, P1, 4
sub r6, r7, r9       sub P3, Py, Pz
add r3, r3, r6       add P4, P2, P3
ld  r6, (r1)         ld  P5, (P1)
add r6, r6, r3       add P6, P5, P4
st  r6, (r1)         st  P6, (P1)
ld  r6, (r11)        ld  P7, (Pw)

When can we reuse a physical register?
When the next write of the same architectural register commits.


Physical Register Management

Instruction sequence:
ld  r1, 0(r3)
add r3, r1, 4
sub r6, r7, r6
add r3, r3, r6
ld  r6, 0(r1)

Initial state (p = value present):

Rename Table     Physical Regs     Free List
R1 -> P8         P5: R6  p         P0
R3 -> P7         P6: R7  p         P1
R6 -> P5         P7: R3  p         P3
R7 -> P6         P8: R1  p         P2
                                   P4

Each ROB entry holds: use, ex, op, p1, PR1, p2, PR2, Rd, LPRd, PRd
(LPRd requires a third read port on the Rename Table for each instruction)

Renaming, one instruction per step:
• ld r1, 0(r3): R1 is remapped from P8 to P0
• add r3, r1, 4: R3 is remapped from P7 to P1
• sub r6, r7, r6: R6 is remapped from P5 to P3
• add r3, r3, r6: R3 is remapped from P1 to P2
• ld r6, 0(r1): R6 is remapped from P3 to P4

ROB after all five instructions are renamed:

use  op   p1  PR1  p2  PR2  Rd  LPRd  PRd
x    ld   p   P7            r1  P8    P0
x    add      P0            r3  P7    P1
x    sub  p   P6   p   P5   r6  P5    P3
x    add      P1       P3   r3  P1    P2
x    ld       P0            r6  P3    P4

Execute & Commit:
• ld r1 executes and commits: P0 now holds R1 (present), consumers of P0 see their operand as present, and LPRd = P8 is returned to the free list
• add r3 executes and commits: P1 now holds R3 (present), the later add sees P1 as present, and LPRd = P7 is returned to the free list


Reorder Buffer Holds Active Instruction Window

… (Older instructions)
ld  r1, (r3)
add r3, r1, r2
sub r6, r7, r9
add r3, r3, r6
ld  r6, (r1)
add r6, r6, r3
st  r6, (r1)
ld  r6, (r1)
… (Newer instructions)

[Figure: the same window shown at Cycle t and Cycle t + 1, with markers for Commit, Execute, and Fetch.]


Issue Timing

i1  Add R1,R1,1
i2  Sub R1,R1,1

[Timing diagram: Issue1 and Execute1 for i1, Issue2 and Execute2 for i2.]

How can we issue earlier?

[Timing diagram: the same two instructions with an earlier issue of i2.]

What makes this schedule fail?


Issue Queue with latency prediction

[Figure: Issue Queue (Reorder buffer) with entry fields Inst#, use, exec, op, p1, lat1, src1, p2, lat2, src2, dest. ptr2 marks "next to commit" and ptr1 marks "next available". Entries after a BEQZ are labelled Speculative Instructions.]

• Fixed latency: latency included in queue entry ('bypassed')
• Predicted latency: latency included in queue entry (speculated)
• Variable latency: wait for completion signal (stall)


Data-in-ROB vs. Single Register File

Data-in-ROB style
[Pipeline figure. Stage labels: Decode/Rename, Read Reg File, Read ROB Source, FU / Cache, Write ROB Dest, Commit.]

Single-register-file style
[Pipeline figure. Stage labels: Decode/Rename, Issue Queue, Read Reg File, FU / Cache, Write Reg File, Commit.]

How does issue speculation differ?


Superscalar Register Renaming

• During decode, instructions are allocated a new physical destination register
• Source operands are renamed to the physical register with the newest value
• Execution unit only sees physical register numbers

[Figure: Inst 1 and Inst 2 (Op, Dest, Src1, Src2) supply Read Addresses to the Rename Table; the Register Free List supplies Update Mapping. Read Data produces Op, PDest, PSrc1, PSrc2 for each instruction.]

Does this work?
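The renaming and freeing steps in the Physical Register Management slides (translate sources through the rename table, record the old mapping as LPRd, take PRd from the free list, return LPRd to the free list at commit) can be sketched one instruction at a time. This is a minimal illustration, not the lecture's code: the names `Renamer`, `RobEntry`, and `run_slide_example` are hypothetical, and the initial table and free list are the values read off the slides.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RobEntry:
    op: str
    srcs: list   # physical source registers (PR1, PR2)
    rd: str      # architectural destination (Rd)
    lprd: str    # previous physical mapping of Rd (LPRd)
    prd: str     # newly allocated physical destination (PRd)


class Renamer:
    """Hypothetical one-instruction-per-step renamer (illustration only)."""

    def __init__(self, rename_table, free_list):
        self.table = dict(rename_table)
        self.free = deque(free_list)
        self.rob = deque()

    def rename(self, op, rd, srcs):
        # Sources must be translated before the destination mapping changes,
        # otherwise "sub r6, r7, r6" would read its own new register.
        psrcs = [self.table[s] for s in srcs]
        lprd = self.table[rd]        # third rename-table read (LPRd)
        prd = self.free.popleft()    # IndexError here models "no free register";
                                     # real hardware would stall instead
        self.table[rd] = prd
        entry = RobEntry(op, psrcs, rd, lprd, prd)
        self.rob.append(entry)
        return entry

    def commit(self):
        # In-order commit: the oldest entry leaves the ROB and the register
        # that held the previous value of Rd becomes reusable.
        entry = self.rob.popleft()
        self.free.append(entry.lprd)
        return entry


def run_slide_example():
    r = Renamer({"r1": "P8", "r3": "P7", "r6": "P5", "r7": "P6"},
                ["P0", "P1", "P3", "P2", "P4"])
    program = [("ld", "r1", ["r3"]),
               ("add", "r3", ["r1"]),        # immediate 4 needs no renaming
               ("sub", "r6", ["r7", "r6"]),
               ("add", "r3", ["r3", "r6"]),
               ("ld", "r6", ["r1"])]
    entries = [r.rename(*inst) for inst in program]
    return r, entries


if __name__ == "__main__":
    renamer, entries = run_slide_example()
    for e in entries:
        print(e)
    renamer.commit()   # ld commits, P8 is freed
    renamer.commit()   # add commits, P7 is freed
    print(list(renamer.free))
```

Freeing LPRd rather than PRd at commit is the point of the sketch: the old register can only be recycled once the next write of the same architectural register has committed, because until then a mispredict or exception may still need the old value.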
Superscalar Register Renaming

[Figure: Inst 1 and Inst 2 (Op, Dest, Src1, Src2) supply Read Addresses to the Rename Table; the Register Free List supplies Update Mapping through the Write Ports. "=?" comparators check the sources of Inst 2 against the destination of Inst 1. Read Data produces Op, PDest, PSrc1, PSrc2 for each instruction.]

Must check for RAW hazards between instructions issuing in the same cycle. This can be done in parallel with the rename lookup.
(MIPS R10K renames 4 serially-RAW-dependent insts/cycle)


Five-minute break to stretch your legs
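The same-cycle RAW check can be sketched as follows. Every instruction in the group looks up its sources in the rename table as it stood at the start of the cycle, and a comparator stage then overrides any source that matches the destination of an earlier instruction in the same group. This is an illustrative sketch under assumptions: the function name `rename_group` and the tuple layout `(op, rd, srcs)` are hypothetical, the loops stand in for parallel hardware comparators, and LPRd tracking is omitted for brevity.

```python
from collections import deque


def rename_group(table, free_list, group):
    """Rename a group of (op, rd, srcs) tuples decoded in the same cycle.

    Returns a list of (op, pdest, psrcs) and updates table in place.
    """
    # Step 1: all lookups see the rename table from the start of the cycle.
    start = dict(table)
    looked_up = [[start[s] for s in srcs] for _, _, srcs in group]

    # Step 2: one new physical destination per instruction.
    pdests = [free_list.popleft() for _ in group]

    # Step 3: "=?" comparators. If an earlier instruction in the group writes
    # the register we read, use its new PDest instead of the stale lookup.
    renamed = []
    for i, (op, rd, srcs) in enumerate(group):
        psrcs = list(looked_up[i])
        for k, s in enumerate(srcs):
            for j in range(i):              # later matches override earlier ones,
                if group[j][1] == s:        # so the newest writer wins
                    psrcs[k] = pdests[j]
        renamed.append((op, pdests[i], psrcs))

    # Step 4: update the table in program order; the last writer of a
    # register determines its final mapping.
    for (_, rd, _), pd in zip(group, pdests):
        table[rd] = pd
    return renamed


if __name__ == "__main__":
    table = {"r1": "P8", "r3": "P7", "r6": "P5", "r7": "P6"}
    free = deque(["P0", "P1", "P3", "P2", "P4"])
    group = [("ld", "r1", ["r3"]),
             ("add", "r3", ["r1"]),
             ("sub", "r6", ["r7", "r6"]),
             ("add", "r3", ["r3", "r6"])]
    for inst in rename_group(table, free, group):
        print(inst)
```

With the four-instruction group above, the result matches what one-at-a-time renaming would produce, including the serial chain in which each add depends on a register written earlier in the same group.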