# Lab4 CacheLab
The original tar file is `cachelab-handout.tar` and the writeup is `cachelab.pdf`. My solution is in `cachelab-handout`.
Read section `6 Working on the Lab` of the writeup before starting either Part A or Part B. These notes sit at the end of the file, so it is easy to miss them until it is too late, but they contain useful information about both tasks.
### Part A
Part A is relatively easy if you keep the following in mind:
- When reading an address, store it in a `long long` so that long addresses fit, and use `%llx` to read a `long long int` written in hex.
- `s`, `E`, and `b` are only known at run time, so you cannot declare the cache as a fixed-size array. Use `malloc` to allocate the memory dynamically on the heap. Note that the compiler will not actually stop you from declaring the array with run-time sizes, but doing so did not work for me: the stored values were lost between function calls.
- Be clear about the three operations `L`, `S`, and `M`:
  - `L`: load the data into the cache. Possible outcomes: `hit`, `miss`, `miss eviction`.
  - `S`: store, that is, write data to memory through the cache. If the data is already in the cache, write it back. If it is not, load it into the cache first and then write it back. Possible outcomes: `hit`, `miss`, `miss eviction`.
  - `M`: modify some data (load it into the cache, modify it, and write it back to memory). This can be treated as an `L` followed by an `S`. After the `L`, the data must be in the cache, so the `S` is always a `hit`.
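The points above can be put together in a small sketch. This is not the code from my `csim.c`; the names `cache_t`, `cache_new`, `cache_access`, and `run_trace_line` are made up for illustration, and LRU replacement is assumed. It allocates the cache with `malloc`/`calloc`, parses addresses with `%llx`, and treats `M` as two accesses to the same address.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* All type and function names here are hypothetical, for illustration only. */
typedef struct {
    int valid;
    unsigned long long tag;
    unsigned long long last_used;   /* timestamp used for LRU replacement */
} cache_line_t;

typedef struct {
    int s, E, b;                    /* set bits, lines per set, block bits */
    unsigned long long clock;       /* incremented on every access */
    int hits, misses, evictions;
    cache_line_t *lines;            /* 2^s * E lines in one flat heap array */
} cache_t;

/* Sizes are only known at run time, so the storage comes from the heap. */
cache_t *cache_new(int s, int E, int b) {
    cache_t *c = malloc(sizeof *c);
    assert(c != NULL);
    c->s = s; c->E = E; c->b = b;
    c->clock = 0;
    c->hits = c->misses = c->evictions = 0;
    c->lines = calloc(((size_t)1 << s) * (size_t)E, sizeof *c->lines);
    assert(c->lines != NULL);       /* calloc also clears every valid bit */
    return c;
}

void cache_free(cache_t *c) { free(c->lines); free(c); }

/* One access: a hit, a miss, or a miss with eviction. */
void cache_access(cache_t *c, unsigned long long addr) {
    unsigned long long set = (addr >> c->b) & ((1ULL << c->s) - 1);
    unsigned long long tag = addr >> (c->s + c->b);
    cache_line_t *row = c->lines + set * (unsigned long long)c->E;
    cache_line_t *victim = &row[0];
    c->clock++;
    for (int i = 0; i < c->E; i++) {
        if (row[i].valid && row[i].tag == tag) {
            row[i].last_used = c->clock;
            c->hits++;
            return;
        }
    }
    c->misses++;
    /* Prefer the first invalid line; otherwise take the least recently used. */
    for (int i = 0; i < c->E; i++) {
        if (!row[i].valid) { victim = &row[i]; break; }
        if (row[i].last_used < victim->last_used) victim = &row[i];
    }
    if (victim->valid) c->evictions++;
    victim->valid = 1;
    victim->tag = tag;
    victim->last_used = c->clock;
}

/* Handle one trace line such as " L 10,1" or " M 20,1". */
void run_trace_line(cache_t *c, const char *line) {
    char op;
    unsigned long long addr;
    int size;
    if (sscanf(line, " %c %llx,%d", &op, &addr, &size) != 3) return;
    if (op == 'I') return;                  /* instruction fetches are ignored */
    cache_access(c, addr);                  /* L, S, and the load half of M */
    if (op == 'M') cache_access(c, addr);   /* store half of M: always a hit */
}
```

Feeding it the seven lines ` L 10,1`, ` M 20,1`, ` L 22,1`, ` S 18,1`, ` L 110,1`, ` L 210,1`, ` M 12,1` with `s = 4`, `E = 1`, `b = 4` gives 4 hits, 5 misses, and 3 evictions.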
### Part B
Part B is much harder. To solve it, recall how a cache works: it extracts the set bits and tag bits from the address. As a result, eviction behaves differently for matrices of different sizes. For example, in the 32x32 problem the cache can hold at most 8 rows, so the 9th row evicts the 1st row; in the 64x64 problem the cache can hold at most 4 rows, so the 5th row evicts the 1st row.
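The 8-row and 4-row figures can be checked with a bit of address arithmetic. The sketch below assumes the Part B cache geometry from the writeup (`s = 5`, `E = 1`, `b = 5`, i.e. 32 sets of 32 bytes), 4-byte `int`s, and a matrix that starts at a line-aligned address; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed Part B geometry: 32 sets (s = 5), 32-byte blocks (b = 5). */
enum { S_BITS = 5, B_BITS = 5, INT_BYTES = 4 };

/* Set index of element (i, j) in a row-major int matrix with ncols columns
 * that starts at byte offset `base`. */
unsigned set_index(size_t base, size_t ncols, size_t i, size_t j) {
    size_t addr = base + (i * ncols + j) * INT_BYTES;
    return (unsigned)((addr >> B_BITS) & ((1u << S_BITS) - 1));
}

/* Smallest r > 0 such that the start of row r maps to the same set as the
 * start of row 0, i.e. how many rows fit before the mapping wraps around. */
size_t rows_until_conflict(size_t ncols) {
    assert(ncols > 0);
    size_t r = 1;
    while (set_index(0, ncols, r, 0) != set_index(0, ncols, 0, 0))
        r++;
    return r;
}
```

With these assumptions `rows_until_conflict(32)` is 8 and `rows_until_conflict(64)` is 4: a 32-`int` row spans 4 cache lines, a 64-`int` row spans 8, and there are only 32 sets to go around.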
The blocking technique is used for all problems in Part B.
Estimating and computing cache misses is an important skill that you pick up and get familiar with while working on this part.
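As a starting point for such estimates, here is a rough lower bound, assuming 32-byte cache lines (8 `int`s per line), line-aligned matrices, and that every line of `A` and `B` is brought into the cache exactly once. The function name is hypothetical, and real miss counts will be higher because of conflict misses.

```c
#include <assert.h>

/* Lower bound on misses for transposing a rows x cols int matrix,
 * assuming 32-byte lines and 4-byte ints (8 ints per line). */
int compulsory_misses(int rows, int cols) {
    assert(rows > 0 && cols > 0);
    int ints_per_line = 32 / 4;
    /* Lines needed to cover one matrix, rounded up. */
    int lines_per_matrix = (rows * cols + ints_per_line - 1) / ints_per_line;
    return 2 * lines_per_matrix;    /* one pass over A plus one pass over B */
}
```

Under these assumptions the bound is 256 for 32 x 32 and 1024 for 64 x 64, which gives a sense of how far a measured miss count is from the ideal.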
There are a lot of notes and explanations in the `trans.c` file.
*Sidenote*: I found the 64x64 problem very hard and turned to Google for help. When searching in English, most results contained only code with no explanation at all. When searching in Chinese for "cache lab 解答" (cache lab solutions), a lot of useful blog posts and answers came up.
- 32 x 32
This part is still manageable. The cache can hold at most 8 rows, so the block size cannot be larger than 8; otherwise cache misses occur even within the same block. It turns out that 8x8 blocking gives a satisfactory result.
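A minimal sketch of plain 8x8 blocking is shown below. It uses the lab's transpose signature (`A` is `N` rows by `M` columns), but the function name is hypothetical and this is not necessarily the exact code in `trans.c`.

```c
#include <assert.h>

/* Transpose A (N x M) into B (M x N) one 8x8 block at a time.
 * Assumes both dimensions are multiples of 8, as in the 32 x 32 case. */
void transpose_blocked8(int M, int N, int A[N][M], int B[M][N]) {
    assert(M % 8 == 0 && N % 8 == 0);
    for (int bi = 0; bi < N; bi += 8) {            /* block row in A */
        for (int bj = 0; bj < M; bj += 8) {        /* block column in A */
            for (int i = bi; i < bi + 8; i++) {
                for (int j = bj; j < bj + 8; j++) {
                    B[j][i] = A[i][j];
                }
            }
        }
    }
}
```

Each 8x8 block touches 8 lines of `A` and 8 lines of `B`, so everything the inner loops need can stay in the cache while the block is processed, apart from conflicts between `A` and `B` themselves.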
- 64 x 64
This part is much harder. The cache can hold at most 4 rows, so the block size cannot be larger than 4. The function `blocksize_4_64_64` applies 4x4 blocking naively, but the result is 1699 misses. The reason is that the block is smaller than a cache line, so the cache is not fully utilized. I improved the function to `blocksize_8_4_64_64`, which considers the four 4x4 blocks inside an 8x8 block at the same time. The performance is better, with 1475 misses, which gives me 6/8 points. I did not pursue further improvement, as I had already spent a lot of time on it.
- 67 x 61
Much simpler than the 64 x 64 case, because the dimensions are not powers of two, which breaks up the regular conflict pattern seen there. Simply apply blocking naively and experiment with different block sizes. Block sizes of 8 and 16 both give satisfactory results.
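For dimensions that are not multiples of the block size, the blocks at the edges have to be clipped. The sketch below shows one way to do that with the block size as a parameter, so different values can be tried; `transpose_blocked`, `min_int`, and `bsize` are hypothetical names, not the ones in `trans.c`.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Transpose A (N x M) into B (M x N) in bsize x bsize blocks.
 * Blocks on the right and bottom edges are clipped to the matrix bounds. */
void transpose_blocked(int M, int N, int A[N][M], int B[M][N], int bsize) {
    assert(bsize > 0);
    for (int bi = 0; bi < N; bi += bsize) {
        for (int bj = 0; bj < M; bj += bsize) {
            int i_end = min_int(bi + bsize, N);
            int j_end = min_int(bj + bsize, M);
            for (int i = bi; i < i_end; i++) {
                for (int j = bj; j < j_end; j++) {
                    B[j][i] = A[i][j];
                }
            }
        }
    }
}
```

Calling it with `bsize` set to 8 and then 16 and comparing the miss counts reported by the lab's test driver is a quick way to run the experiment described above.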
### Thoughts
I wasted around two hours on Part A because I did not use `malloc` to allocate the cache data structure and declared `cache_line_t cache[S][E]` directly instead, even though the writeup explicitly requires the code to use `malloc`. With `cache_line_t cache[S][E]`, the valid bits I had set turned into random values when I tried to access them in other functions. After I switched to `malloc`, the problem went away, and I finished Part A in one or two hours.
The lesson from this is twofold. First, when you need an array whose size is not known at compile time, use `malloc` to allocate memory dynamically. Second, and more importantly, do not be afraid to use `malloc` and pointers in C.
I also wasted some time on Part B. At some point I began to think that each row of the matrix would be mapped to a single cache set, and so on, which is obviously wrong. Coincidentally, I still finished the 32 x 32 case successfully, which made things even worse, because it then took me a very long time to find this serious conceptual mistake.
The process was a bit painful, but it really helped me learn the cache mechanism, blocking, and how to compute cache performance.