# Computer Organization and Design, 5th Edition (Original): Solutions


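## Chapter 1 Solutions

1.5

b. cycles(P1) = $10 \times 3 \times 10^9 = 30 \times 10^9$

cycles(P2) = $10 \times 2.5 \times 10^9 = 25 \times 10^9$

cycles(P3) = $10 \times 4 \times 10^9 = 40 \times 10^9$

c. No. instructions(P1) = $30 \times 10^9 / 1.5 = 20 \times 10^9$

No. instructions(P2) = $25 \times 10^9 / 1 = 25 \times 10^9$

No. instructions(P3) = $40 \times 10^9 / 2.2 = 18.18 \times 10^9$

$\text{CPI}_{new} = \text{CPI}_{old} \times 1.2$, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 2.6

$f = \text{No. instr.} \times \text{CPI} / \text{time}$, then

f(P1) = $20 \times 10^9 \times 1.8 / 7 = 5.14$ GHz

f(P2) = $25 \times 10^9 \times 1.2 / 7 = 4.28$ GHz

f(P3) = $18.18 \times 10^9 \times 2.6 / 7 = 6.75$ GHz

1.6

a. Class A: $10^5$ instr. Class B: $2 \times 10^5$ instr. Class C: $5 \times 10^5$ instr. Class D: $2 \times 10^5$ instr.

Time = No. instr. $\times$ CPI / clock rate

Total time P1 = $(10^5 + 2 \times 10^5 \times 2 + 5 \times 10^5 \times 3 + 2 \times 10^5 \times 3) / (2.5 \times 10^9) = 10.4 \times 10^{-4}$ s

Total time P2 = $(10^5 \times 2 + 2 \times 10^5 \times 2 + 5 \times 10^5 \times 2 + 2 \times 10^5 \times 2) / (3 \times 10^9) = 6.66 \times 10^{-4}$ s

CPI(P1) = $10.4 \times 10^{-4} \times 2.5 \times 10^9 / 10^6 = 2.6$

CPI(P2) = $6.66 \times 10^{-4} \times 3 \times 10^9 / 10^6 = 2.0$

b. clock cycles(P1) = $10^5 \times 1 + 2 \times 10^5 \times 2 + 5 \times 10^5 \times 3 + 2 \times 10^5 \times 3 = 26 \times 10^5$

clock cycles(P2) = $10^5 \times 2 + 2 \times 10^5 \times 2 + 5 \times 10^5 \times 2 + 2 \times 10^5 \times 2 = 20 \times 10^5$

1.7

a. CPI = $T_{exec} \times f$ / No. instr.

Compiler A CPI = 1.1

Compiler B CPI = 1.25

b. $f_B / f_A$ = (No. instr.(B) $\times$ CPI(B)) / (No. instr.(A) $\times$ CPI(A)) = 1.37

c. $T_B / T_{new} = 2.27$

1.8

1.8.1 $C = 2 \times DP / (V^2 \times F)$

Pentium 4: $C = 3.2\text{E-}8$ F

Core i5 Ivy Bridge: $C = 2.9\text{E-}8$ F

1.8.2 Pentium 4: 10/100 = 10%

Core i5 Ivy Bridge: 30/70 = 42.9%

1.8.3 $(S_{new} + D_{new}) / (S_{old} + D_{old}) = 0.90$

$D_{new} = C \times V_{new}^2 \times F$

$S_{old} = V_{old} \times I$

$S_{new} = V_{new} \times I$

Therefore:

$V_{new} = [D_{new} / (C \times F)]^{1/2}$

$D_{new} = 0.90 \times (S_{old} + D_{old}) - S_{new}$

$S_{new} = V_{new} \times (S_{old} / V_{old})$

Pentium 4:

$S_{new} = V_{new} \times (10 / 1.25) = V_{new} \times 8$

$D_{new} = 0.90 \times 100 - V_{new} \times 8 = 90 - V_{new} \times 8$

$V_{new} = [(90 - V_{new} \times 8) / (3.2\text{E-}8 \times 3.6\text{E}9)]^{1/2}$

$V_{new} = 0.85$ V

Core i5:

$S_{new} = V_{new} \times (30 / 0.9) = V_{new} \times 33.3$

$D_{new} = 0.90 \times 70 - V_{new} \times 33.3 = 63 - V_{new} \times 33.3$

$V_{new} = [(63 - V_{new} \times 33.3) / (2.9\text{E-}8 \times 3.4\text{E}9)]^{1/2}$

$V_{new} = 0.64$ V

1.9

1.9.1

| p | # arith inst. | # L/S inst. | # branch inst. | cycles | ex time | speedup |
|---|---|---|---|---|---|---|
| 1 | 2.56E9 | 1.28E9 | 2.56E8 | 7.94E10 | 39.7 | 1 |
| 2 | 1.83E9 | 9.14E8 | 2.56E8 | 5.67E10 | 28.3 | 1.4 |
| 4 | 9.12E8 | 4.57E8 | 2.56E8 | 2.83E10 | 14.2 | 2.8 |
| 8 | 4.57E8 | 2.29E8 | 2.56E8 | 1.42E10 | 7.10 | 5.6 |

1.9.2

| p | ex time |
|---|---|
| 1 | 41.0 |
| 2 | 29.3 |
| 4 | 14.6 |
| 8 | 7.33 |

1.9.3 3

1.10

1.10.1 die area (15 cm) = wafer area / dies per wafer = $\pi \times 7.5^2 / 84 = 2.10$ cm$^2$

yield (15 cm) = $1 / (1 + 0.020 \times 2.10 / 2)^2 = 0.9593$

die area (20 cm) = wafer area / dies per wafer = $\pi \times 10^2 / 100 = 3.14$ cm$^2$

yield (20 cm) = $1 / (1 + 0.031 \times 3.14 / 2)^2 = 0.9093$

1.10.2 cost/die (15 cm) = $12 / (84 \times 0.9593) = 0.1489$

cost/die (20 cm) = $15 / (100 \times 0.9093) = 0.1650$

1.10.3 die area (15 cm) = wafer area / dies per wafer = $\pi \times 7.5^2 / (84 \times 1.1) = 1.91$ cm$^2$

yield (15 cm) = $1 / (1 + 0.020 \times 1.15 \times 1.91 / 2)^2 = 0.9575$

die area (20 cm) = wafer area / dies per wafer = $\pi \times 10^2 / (100 \times 1.1) = 2.86$ cm$^2$

yield (20 cm) = $1 / (1 + 0.03 \times 1.15 \times 2.86 / 2)^2 = 0.9082$

1.10.4 defects per area (yield 0.92) = $(1 - y^{0.5}) / (y^{0.5} \times \text{die area} / 2) = (1 - 0.92^{0.5}) / (0.92^{0.5} \times 2 / 2) = 0.043$ defects/cm$^2$

defects per area (yield 0.95) = $(1 - y^{0.5}) / (y^{0.5} \times \text{die area} / 2) = (1 - 0.95^{0.5}) / (0.95^{0.5} \times 2 / 2) = 0.026$ defects/cm$^2$

1.11

1.11.1 CPI = clock rate $\times$ CPU time / instr. count

clock rate = 1 / cycle time = 3 GHz

CPI(bzip2) = $3 \times 10^9 \times 750 / (2389 \times 10^9) = 0.94$

1.11.2 SPEC ratio = ref. time / execution time

SPEC ratio(bzip2) = 9650 / 750 = 12.86

1.11.3 CPU time = No. instr. $\times$ CPI / clock rate

If CPI and clock rate do not change, the CPU time increase is equal to the increase in the number of instructions, that is, 10%.

1.11.4 CPU time(before) = No. instr. $\times$ CPI / clock rate

CPU time(after) = $1.1 \times$ No. instr. $\times 1.05 \times$ CPI / clock rate

CPU time(after) / CPU time(before) = $1.1 \times 1.05 = 1.155$. Thus, CPU time is increased by 15.5%.

1.11.5 SPECratio = reference time / CPU time

SPECratio(after) / SPECratio(before) = CPU time(before) / CPU time(after) = 1/1.155 = 0.86. The SPECratio is decreased by 14%.

1.11.6 CPI = (CPU time $\times$ clock rate) / No. instr.

CPI = $700 \times 4 \times 10^9 / (0.85 \times 2389 \times 10^9) = 1.37$

1.11.7 Clock rate ratio = 4 GHz / 3 GHz = 1.33

CPI @ 4 GHz = 1.37, CPI @ 3 GHz = 0.94, ratio = 1.45

They are different because, although the number of instructions has been reduced by 15%, the CPU time has been reduced by a lower percentage.

1.11.8 700/750 = 0.933. CPU time reduction: 6.7%

1.11.9 No. instr. = CPU time $\times$ clock rate / CPI

No. instr. = $960 \times 0.9 \times 4 \times 10^9 / 1.61 = 2146 \times 10^9$

1.11.10 Clock rate = No. instr. $\times$ CPI / CPU time

Clock rate (new) = No. instr. $\times$ CPI / ($0.9 \times$ CPU time) = $1/0.9 \times$ clock rate (old) = 3.33 GHz

1.11.11 Clock rate = No. instr. $\times$ CPI / CPU time

Clock rate (new) = No. instr. $\times 0.85 \times$ CPI / ($0.80 \times$ CPU time) = $0.85/0.80 \times$ clock rate (old) = 3.18 GHz

1.12

1.12.1 T(P1) = $5 \times 10^9 \times 0.9 / (4 \times 10^9) = 1.125$ s

T(P2) = $10^9 \times 0.75 / (3 \times 10^9) = 0.25$ s

clock rate(P1) > clock rate(P2), performance(P1) < performance(P2)

1.12.2 T(P1) = No. instr. $\times$ CPI / clock rate

T(P1) = $2.25 \times 10^{-1}$ s

T(P2) = $N \times 0.75 / (3 \times 10^9)$, then $N = 9 \times 10^8$

1.12.3 MIPS = Clock rate $\times 10^{-6}$ / CPI

MIPS(P1) = $4 \times 10^9 \times 10^{-6} / 0.9 = 4.44 \times 10^3$

MIPS(P2) = $3 \times 10^9 \times 10^{-6} / 0.75 = 4.0 \times 10^3$

MIPS(P1) > MIPS(P2), performance(P1) < performance(P2) (from 1.12.1)

1.12.4 MFLOPS = No. FP operations $\times 10^{-6}$ / T

MFLOPS(P1) = $0.4 \times 5\text{E}9 \times 1\text{E-}6 / 1.125 = 1.78\text{E}3$

MFLOPS(P2) = $0.4 \times 1\text{E}9 \times 1\text{E-}6 / 0.25 = 1.60\text{E}3$

MFLOPS(P1) > MFLOPS(P2), performance(P1) < performance(P2) (from 1.12.1)

1.13

1.13.1 $T_{fp} = 70 \times 0.8 = 56$ s. $T_{new} = 56 + 85 + 55 + 40 = 236$ s. Reduction: 5.6%

1.13.2 $T_{new} = 250 \times 0.8 = 200$ s, $T_{fp} + T_{l/s} + T_{branch} = 165$ s, $T_{int} = 35$ s. Reduction time INT: 58.8%

1.13.3 $T_{new} = 250 \times 0.8 = 200$ s, $T_{fp} + T_{int} + T_{l/s} = 210$ s. NO

1.14

1.14.1 Clock cycles = $\text{CPI}_{fp} \times$ No. FP instr. + $\text{CPI}_{int} \times$ No. INT instr. + $\text{CPI}_{l/s} \times$ No. L/S instr. + $\text{CPI}_{branch} \times$ No. branch instr.

$T_{CPU}$ = clock cycles / clock rate = clock cycles / $(2 \times 10^9)$

clock cycles = $512 \times 10^6$; $T_{CPU} = 0.256$ s

To halve the number of clock cycles by improving the CPI of FP instructions:

$\text{CPI}_{improved\,fp} \times$ No. FP instr. + $\text{CPI}_{int} \times$ No. INT instr. + $\text{CPI}_{l/s} \times$ No. L/S instr. + $\text{CPI}_{branch} \times$ No. branch instr. = clock cycles / 2

$\text{CPI}_{improved\,fp}$ = (clock cycles / 2 − ($\text{CPI}_{int} \times$ No. INT instr. + $\text{CPI}_{l/s} \times$ No. L/S instr. + $\text{CPI}_{branch} \times$ No. branch instr.)) / No. FP instr.

$\text{CPI}_{improved\,fp} = (256 - 462) / 50 < 0$, so this is not possible.

1.14.2 Using the clock cycle data from 1.14.1.

To halve the number of clock cycles by improving the CPI of L/S instructions:

$\text{CPI}_{fp} \times$ No. FP instr. + $\text{CPI}_{int} \times$ No. INT instr. + $\text{CPI}_{improved\,l/s} \times$ No. L/S instr. + $\text{CPI}_{branch} \times$ No. branch instr. = clock cycles / 2

$\text{CPI}_{improved\,l/s}$ = (clock cycles / 2 − ($\text{CPI}_{fp} \times$ No. FP instr. + $\text{CPI}_{int} \times$ No. INT instr. + $\text{CPI}_{branch} \times$ No. branch instr.)) / No. L/S instr.

The non-L/S instructions account for $512 - 4 \times 80 = 192$ million cycles, so $\text{CPI}_{improved\,l/s} = (256 - 192) / 80 = 0.8$

1.14.3 Clock cycles = $\text{CPI}_{fp} \times$ No. FP instr. + $\text{CPI}_{int} \times$ No. INT instr. + $\text{CPI}_{l/s} \times$ No. L/S instr. + $\text{CPI}_{branch} \times$ No. branch instr.

$T_{CPU}$ = clock cycles / clock rate = clock cycles / $(2 \times 10^9)$

$\text{CPI}_{int} = 0.6 \times 1 = 0.6$; $\text{CPI}_{fp} = 0.6 \times 1 = 0.6$; $\text{CPI}_{l/s} = 0.7 \times 4 = 2.8$; $\text{CPI}_{branch} = 0.7 \times 2 = 1.4$

$T_{CPU}$ (before improv.) = 0.256 s; $T_{CPU}$ (after improv.) = 0.171 s

1.15

| processors | exec. time/processor | time w/overhead | speedup | actual speedup/ideal speedup |
|---|---|---|---|---|
| 1 | 100 | | | |
| 2 | 50 | 54 | 100/54 = 1.85 | 1.85/2 = 0.93 |
| 4 | 25 | 29 | 100/29 = 3.44 | 3.44/4 = 0.86 |
| 8 | 12.5 | 16.5 | 100/16.5 = 6.06 | 6.06/8 = 0.75 |
| 16 | 6.25 | 10.25 | 100/10.25 = 9.76 | 9.76/16 = 0.61 |

Solutions, Patterson-1610874, 978-0-12-407726-3

## Chapter 2 Solutions

2.1

addi f, h, -5 (note, no subi)
add f, f, g

2.2 f = g + h + i

2.3

sub \$t0, \$s3, \$s4
add \$t0, \$s6, \$t0
lw \$t1, 16(\$t0)
sw \$t1, 32(\$s7)

(Reader's annotation on the scanned page: this looks wrong. After the first instruction, \$t0 holds a plain element index. The next step should multiply it by 4 to turn it into a byte offset, and only then add \$s6, the array's base address.)

2.4 B[g] = A[f] + A[1+f];

2.5

add \$t0, \$s6, \$s0
add \$t1, \$s7, \$s1
lw \$s0, 0(\$t0)
lw \$t0, 4(\$t0)
add \$t0, \$t0, \$s0
sw \$t0, 0(\$t1)

2.6

2.6.1

temp = Array[0];
temp2 = Array[1];
Array[0] = Array[4];
Array[1] = temp;
Array[4] = Array[3];
Array[3] = temp2;

2.6.2

lw \$t0, 0(\$s6)
lw \$t1, 4(\$s6)
lw \$t2, 16(\$s6)
sw \$t2, 0(\$s6)
sw \$t0, 4(\$s6)
lw \$t0, 12(\$s6)
sw \$t0, 16(\$s6)
sw \$t1, 12(\$s6)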
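Most of the Chapter 1 answers (1.5 through 1.7, 1.11, 1.12, 1.14) reuse one relation: CPU time = clock cycles / clock rate, where clock cycles is the sum over instruction classes of count times CPI. The following is a minimal Python sketch of that relation, checked against the figures quoted in the 1.6 solution. The function names and the `COUNTS`, `CPI_P1`, `CPI_P2` constants are illustrative choices, not part of the solutions manual.

```python
def clock_cycles(instr_counts, cpis):
    """Sum of (instruction count x CPI) over all instruction classes."""
    return sum(n * cpi for n, cpi in zip(instr_counts, cpis))


def cpu_time(instr_counts, cpis, clock_rate_hz):
    """CPU time in seconds: total clock cycles divided by clock rate."""
    return clock_cycles(instr_counts, cpis) / clock_rate_hz


def average_cpi(instr_counts, cpis):
    """Global CPI: total clock cycles divided by total instruction count."""
    return clock_cycles(instr_counts, cpis) / sum(instr_counts)


# Figures as quoted in the 1.6 solution (instruction classes A, B, C, D).
COUNTS = [100_000, 200_000, 500_000, 200_000]
CPI_P1 = [1, 2, 3, 3]   # per-class CPI on P1, clock rate 2.5 GHz
CPI_P2 = [2, 2, 2, 2]   # per-class CPI on P2, clock rate 3 GHz

if __name__ == "__main__":
    for name, cpis, rate in (("P1", CPI_P1, 2.5e9), ("P2", CPI_P2, 3e9)):
        print(name,
              clock_cycles(COUNTS, cpis),
              cpu_time(COUNTS, cpis, rate),
              average_cpi(COUNTS, cpis))
```

Keeping the cycle count as its own function mirrors how the solutions reuse it: 1.6 b reports the cycle totals directly, while 1.6 a divides them by the clock rate and by the instruction count.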
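The die area, yield, and cost-per-die steps of exercise 1.10 in Chapter 1 can also be checked numerically. The sketch below assumes the yield model used in that solution, yield = 1 / (1 + defects per area × die area / 2)², and the wafer figures quoted there; the function names are hypothetical. Because it does not round the die area to two decimals first, its results can differ from the printed answers in the fourth decimal place.

```python
import math


def die_area(wafer_diameter_cm, dies_per_wafer):
    """Die area in cm^2: wafer area divided by the number of dies."""
    return math.pi * (wafer_diameter_cm / 2) ** 2 / dies_per_wafer


def die_yield(defects_per_cm2, area_cm2):
    """Yield model from the 1.10 solution: 1 / (1 + defects * area / 2)^2."""
    return 1 / (1 + defects_per_cm2 * area_cm2 / 2) ** 2


def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


if __name__ == "__main__":
    # (diameter cm, dies per wafer, defects per cm^2, wafer cost) as quoted in 1.10.
    for diameter, dies, defects, cost in ((15, 84, 0.020, 12), (20, 100, 0.031, 15)):
        area = die_area(diameter, dies)
        y = die_yield(defects, area)
        print(diameter, round(area, 2), round(y, 4),
              round(cost_per_die(cost, dies, y), 4))
```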
