Computer Organization and Design, 5th Edition: Solutions

2018-12-25 09:07:47, 2.78 MB, PDF

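Solutions for Computer Organization and Design, 5th Edition. The fifth edition was published in several different editions; this is the electronic copy of the solutions for the most common one.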
Chapter 1 Solutions

1.5
b. cycles(P1) = 10 × 3 × 10^9 = 30 × 10^9
   cycles(P2) = 10 × 2.5 × 10^9 = 25 × 10^9
   cycles(P3) = 10 × 4 × 10^9 = 40 × 10^9
c. No. instructions(P1) = 30 × 10^9/1.5 = 20 × 10^9
   No. instructions(P2) = 25 × 10^9/1 = 25 × 10^9
   No. instructions(P3) = 40 × 10^9/2.2 = 18.18 × 10^9
   CPI_new = CPI_old × 1.2, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 2.6
   f = No. instr. × CPI/time, then
   f(P1) = 20 × 10^9 × 1.8/7 = 5.14 GHz
   f(P2) = 25 × 10^9 × 1.2/7 = 4.28 GHz
   f(P3) = 18.18 × 10^9 × 2.6/7 = 6.75 GHz

1.6
a. Class A: 10^5 instr. Class B: 2 × 10^5 instr. Class C: 5 × 10^5 instr. Class D: 2 × 10^5 instr.
   Time = No. instr. × CPI/clock rate
   Total time P1 = (10^5 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3)/(2.5 × 10^9) = 10.4 × 10^-4 s
   Total time P2 = (10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2)/(3 × 10^9) = 6.66 × 10^-4 s
   CPI(P1) = 10.4 × 10^-4 × 2.5 × 10^9/10^6 = 2.6
   CPI(P2) = 6.66 × 10^-4 × 3 × 10^9/10^6 = 2.0
b. clock cycles(P1) = 10^5 × 1 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3 = 26 × 10^5
   clock cycles(P2) = 10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2 = 20 × 10^5

1.7
a. CPI = T × f/No. instr.
   Compiler A CPI = 1.1
   Compiler B CPI = 1.25
b. f_B/f_A = (No. instr.(B) × CPI(B))/(No. instr.(A) × CPI(A)) = 1.37
c. T_B/T_new = 2.27

1.8
1.8.1 C = 2 × DP/(V^2 × F)
   Pentium 4: C = 3.2E-8 F
   Core i5 Ivy Bridge: C = 2.9E-8 F
1.8.2 Pentium 4: 10/100 = 10%
   Core i5 Ivy Bridge: 30/70 = 42.9%
1.8.3 (S_new + D_new)/(S_old + D_old) = 0.90
   D_new = C × V_new^2 × F
   S_old = V_old × I
   S_new = V_new × I
   V_new = [D_new/(C × F)]^(1/2)
   D_new = 0.90 × (S_old + D_old) - S_new
   S_new = V_new × (S_old/V_old)
   Pentium 4:
   S_new = V_new × (10/1.25) = V_new × 8
   D_new = 0.90 × 100 - V_new × 8 = 90 - V_new × 8
   V_new = [(90 - V_new × 8)/(3.2E-8 × 3.6E9)]^(1/2)
   V_new = 0.85 V
   Core i5:
   S_new = V_new × (30/0.9) = V_new × 33.3
   D_new = 0.90 × 70 - V_new × 33.3 = 63 - V_new × 33.3
   V_new = [(63 - V_new × 33.3)/(2.9E-8 × 3.4E9)]^(1/2)
   V_new = 0.64 V

1.9
1.9.1
   p   # arith inst.   # L/S inst.   # branch inst.   cycles    ex. time   speedup
   1   2.56E9          1.28E9        2.56E8           7.94E10   39.7       1
   2   1.83E9          9.14E8        2.56E8           5.67E10   28.3       1.4
   4   9.12E8          4.57E8        2.56E8           2.83E10   14.2       2.8
   8   4.57E8          2.29E8        2.56E8           1.42E10   7.10       5.6
1.9.2
   p   ex. time
   1   41.0
   2   29.3
   4   14.6
   8   7.33
1.9.3 3

1.10
1.10.1 die area_15cm = wafer area/dies per wafer = pi × 7.5^2/84 = 2.10 cm^2
   yield_15cm = 1/(1 + (0.020 × 2.10/2))^2 = 0.9593
   die area_20cm = wafer area/dies per wafer = pi × 10^2/100 = 3.14 cm^2
   yield_20cm = 1/(1 + (0.031 × 3.14/2))^2 = 0.9093
1.10.2 cost/die_15cm = 12/(84 × 0.9593) = 0.1489
   cost/die_20cm = 15/(100 × 0.9093) = 0.1650
1.10.3 die area_15cm = wafer area/dies per wafer = pi × 7.5^2/(84 × 1.1) = 1.91 cm^2
   yield_15cm = 1/(1 + (0.020 × 1.15 × 1.91/2))^2 = 0.9575
   die area_20cm = wafer area/dies per wafer = pi × 10^2/(100 × 1.1) = 2.86 cm^2
   yield_20cm = 1/(1 + (0.03 × 1.15 × 2.86/2))^2 = 0.9082
1.10.4 defects per area_0.92 = (1 - y^.5)/(y^.5 × die area/2) = (1 - 0.92^.5)/(0.92^.5 × 2/2) = 0.043 defects/cm^2
   defects per area_0.95 = (1 - y^.5)/(y^.5 × die area/2) = (1 - 0.95^.5)/(0.95^.5 × 2/2) = 0.026 defects/cm^2

1.11
1.11.1 CPI = clock rate × CPU time/instr. count
   clock rate = 1/cycle time = 3 GHz
   CPI(bzip2) = 3 × 10^9 × 750/(2389 × 10^9) = 0.94
1.11.2 SPEC ratio = ref. time/execution time
   SPEC ratio(bzip2) = 9650/750 = 12.86
1.11.3 CPU time = No. instr. × CPI/clock rate
   If CPI and clock rate do not change, the CPU time increase is equal to the increase in the number of instructions, that is, 10%.
1.11.4 CPU time(before) = No. instr. × CPI/clock rate
   CPU time(after) = 1.1 × No. instr. × 1.05 × CPI/clock rate
   CPU time(after)/CPU time(before) = 1.1 × 1.05 = 1.155. Thus, CPU time is increased by 15.5%.
1.11.5 SPECratio = reference time/CPU time
   SPECratio(after)/SPECratio(before) = CPU time(before)/CPU time(after) = 1/1.155 = 0.86. The SPECratio is decreased by 14%.
1.11.6 CPI = (CPU time × clock rate)/No. instr.
   CPI = 700 × 4 × 10^9/(0.85 × 2389 × 10^9) = 1.37
1.11.7 Clock rate ratio = 4 GHz/3 GHz = 1.33
   CPI @ 4 GHz = 1.37, CPI @ 3 GHz = 0.94, ratio = 1.45
   They are different because, although the number of instructions has been reduced by 15%, the CPU time has been reduced by a lower percentage.
1.11.8 700/750 = 0.933. CPU time reduction: 6.7%
1.11.9 No. instr. = CPU time × clock rate/CPI
   No. instr. = 960 × 0.9 × 4 × 10^9/1.61 = 2146 × 10^9
1.11.10 Clock rate = No. instr. × CPI/CPU time
   Clock rate_new = No. instr. × CPI/(0.9 × CPU time) = 1/0.9 × clock rate_old = 3.33 GHz
1.11.11 Clock rate = No. instr. × CPI/CPU time
   Clock rate_new = No. instr. × 0.85 × CPI/(0.80 × CPU time) = 0.85/0.80 × clock rate_old = 3.18 GHz

1.12
1.12.1 T(P1) = 5 × 10^9 × 0.9/(4 × 10^9) = 1.125 s
   T(P2) = 10^9 × 0.75/(3 × 10^9) = 0.25 s
   clock rate(P1) > clock rate(P2), performance(P1) < performance(P2)
1.12.2 T(P1) = No. instr. × CPI/clock rate
   T(P1) = 2.25 × 10^-1 s
   T(P2) = N × 0.75/(3 × 10^9), then N = 9 × 10^8
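The 1.12.1 and 1.12.2 answers both come from CPU time = No. instr. × CPI/clock rate. The following is a minimal Python sketch of that arithmetic using only the figures quoted in the solution. The helper name `cpu_time` is hypothetical, and the 10^9 instruction count for P1 in 1.12.2 is an assumption inferred from T(P1) = 2.25 × 10^-1 s.

```python
def cpu_time(instr_count, cpi, clock_rate_hz):
    """Return CPU time in seconds: No. instr. x CPI / clock rate."""
    return instr_count * cpi / clock_rate_hz


# 1.12.1: P1 has the higher clock rate but the longer execution time.
t_p1 = cpu_time(5e9, 0.9, 4e9)
t_p2 = cpu_time(1e9, 0.75, 3e9)

# 1.12.2: assumed 1e9 instructions on P1 (inferred from T(P1) = 0.225 s);
# solve T(P2) = N x 0.75 / 3e9 = T(P1) for N.
t_p1_small = cpu_time(1e9, 0.9, 4e9)
n_p2 = t_p1_small * 3e9 / 0.75

print(f"T(P1) = {t_p1:.3f} s, T(P2) = {t_p2:.3f} s")
print(f"N = {n_p2:.2e} instructions")
```

Comparing `t_p1` with `t_p2` shows the point of the exercise: the processor with the higher clock rate is not necessarily the faster one.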
1.12.3 MIPS = Clock rate × 10^-6/CPI
   MIPS(P1) = 4 × 10^9 × 10^-6/0.9 = 4.44 × 10^3
   MIPS(P2) = 3 × 10^9 × 10^-6/0.75 = 4.0 × 10^3
   MIPS(P1) > MIPS(P2), performance(P1) < performance(P2) (from 11a)
1.12.4 MFLOPS = No. FP operations × 10^-6/T
   MFLOPS(P1) = .4 × 5E9 × 1E-6/1.125 = 1.78E3
   MFLOPS(P2) = .4 × 1E9 × 1E-6/.25 = 1.60E3
   MFLOPS(P1) > MFLOPS(P2), performance(P1) < performance(P2) (from 11a)

1.13
1.13.1 T_fp = 70 × 0.8 = 56 s. T_new = 56 + 85 + 55 + 40 = 236 s. Reduction: 5.6%
1.13.2 T_new = 250 × 0.8 = 200 s, T_fp + T_l/s + T_branch = 165 s, T_int = 35 s. Reduction time INT: 58.8%
1.13.3 T_new = 250 × 0.8 = 200 s, T_fp + T_int + T_l/s = 210 s. NO

1.14
1.14.1 Clock cycles = CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr.
   T_CPU = clock cycles/clock rate = clock cycles/(2 × 10^9)
   clock cycles = 512 × 10^6; T_CPU = 0.256 s
   To halve the number of clock cycles by improving the CPI of FP instructions:
   CPI_improved fp × No. FP instr. + CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr. = clock cycles/2
   CPI_improved fp = (clock cycles/2 - (CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr.))/No. FP instr.
   CPI_improved fp = (256 - 462)/50 < 0 ==> not possible
1.14.2 Using the clock cycle data from a.
   To halve the number of clock cycles by improving the CPI of L/S instructions:
   CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_improved l/s × No. L/S instr. + CPI_branch × No. branch instr. = clock cycles/2
   CPI_improved l/s = (clock cycles/2 - (CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_branch × No. branch instr.))/No. L/S instr.
   CPI_improved l/s = (256 - 198)/80 = 0.725
1.14.3 Clock cycles = CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr.
   T_CPU = clock cycles/clock rate = clock cycles/(2 × 10^9)
   CPI_int = 0.6 × 1 = 0.6; CPI_fp = 0.6 × 1 = 0.6; CPI_l/s = 0.7 × 4 = 2.8; CPI_branch = 0.7 × 2 = 1.4
   T_CPU (before improv.) = 0.256 s; T_CPU (after improv.) = 0.171 s

1.15
   processors   exec. time/processor   time w/overhead   speedup            actual speedup/ideal speedup
   1            100
   2            50                     54                100/54 = 1.85      1.85/2 = .93
   4            25                     29                100/29 = 3.44      3.44/4 = 0.86
   8            12.5                   16.5              100/16.5 = 6.06    6.06/8 = 0.75
   16           6.25                   10.25             100/10.25 = 9.76   9.76/16 = 0.61

Patterson, 978-0-12-407726-3

Chapter 2 Solutions

2.1
   addi f, h, -5 (note, no subi)
   add f, f, g

2.2 f = g + h + i

2.3
   sub $t0, $s3, $s4
   add $t0, $s6, $t0
   lw $t1, 16($t0)
   sw $t1, 32($s7)
   Reader's note: this looks wrong. After the sub, $t0 holds an element index, not a byte offset. The next step should multiply it by 4 to convert it to a byte offset before adding the base address in $s6.

2.4 B[g] = A[f] + A[1+f];

2.5
   add $t0, $s6, $s0
   add $t1, $s7, $s1
   lw $s0, 0($t0)
   lw $t0, 4($t0)
   add $t0, $t0, $s0
   sw $t0, 0($t1)

2.6
2.6.1
   temp = Array[0];
   temp2 = Array[1];
   Array[0] = Array[4];
   Array[1] = temp;
   Array[4] = Array[3];
   Array[3] = temp2;
2.6.2
   lw $t0, 0($s6)
   lw $t1, 4($s6)
   lw $t2, 16($s6)
   sw $t2, 0($s6)
   sw $t0, 4($s6)
   lw $t0, 12($s6)
   sw $t0, 16($s6)
   sw $t1, 12($s6)
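The 2.6.2 load/store sequence can be checked against the 2.6.1 C statements by replaying both on the same data. The sketch below is an illustration in Python, not a MIPS simulator. The `lw` and `sw` helpers are hypothetical; they model word-aligned memory as a list, assuming 4-byte words and that `$s6` points at `Array[0]`.

```python
def swap_c(array):
    """Replay the 2.6.1 C statements on a copy of the array."""
    a = list(array)
    temp = a[0]
    temp2 = a[1]
    a[0] = a[4]
    a[1] = temp
    a[4] = a[3]
    a[3] = temp2
    return a


def swap_mips(array):
    """Replay the 2.6.2 lw/sw sequence; a byte offset of 4*i maps to index i."""
    mem = list(array)  # memory starting at the address assumed to be in $s6
    reg = {}           # register file, keyed by register name

    def lw(rt, offset):
        reg[rt] = mem[offset // 4]   # load word from base + offset

    def sw(rt, offset):
        mem[offset // 4] = reg[rt]   # store word to base + offset

    lw("$t0", 0)
    lw("$t1", 4)
    lw("$t2", 16)
    sw("$t2", 0)
    sw("$t0", 4)
    lw("$t0", 12)
    sw("$t0", 16)
    sw("$t1", 12)
    return mem


data = [10, 11, 12, 13, 14]
result_c = swap_c(data)
result_mips = swap_mips(data)
print(result_c, result_mips)
```

Both functions should leave the same contents, which shows that the eight memory instructions implement the six C assignments. The division by 4 in `lw` and `sw` is the same index-to-byte-offset conversion that the reader's note on 2.3 refers to.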
