# Lab 4: MapReduce on a Fault-tolerant Distributed Filesystem
### Getting started
Before starting this lab, please back up all of your prior labs' solutions.
```bash
$ cd cse-lab
$ git commit -a -m "upload lab3-sol"
# Then, pull this lab from the repo:
$ git pull
# Next, switch to the lab4 branch:
$ git checkout lab4
# Notice: lab4 is based on lab3.
# Please merge with branch lab3, and solve the conflicts.
$ git merge lab3
# After merging the conflicts, you should be able to compile the new project successfully:
$ chmod -R o+w `pwd`
$ sudo docker run -it --rm --privileged --cap-add=ALL -v `pwd`:/home/stu/cse-lab lqyuan980413/cselab_env:2022lab4 /bin/bash
$ cd cse-lab
$ make clean && make
```
(Reference: MIT 6.824 Distributed Systems)
In this lab, you are asked to build a MapReduce framework on top of the distributed filesystem you implemented in Labs 1-3. **Make sure that you can pass all the tests in lab3 before you start.**
You will implement a worker process that calls Map and Reduce functions and handles reading and writing files, and a coordinator process that hands out tasks to workers and copes with failed workers.
You can refer to [the MapReduce paper](https://www.usenix.org/legacy/events/osdi04/tech/full_papers/dean/dean.pdf) for more details (Note that this lab uses "coordinator" instead of the paper's "master").
There are four files added for this lab: `mr_protocol.h`, `mr_sequential.cc`, `mr_coordinator.cc`, `mr_worker.cc`.
### Task 1
- `mr_sequential.cc` is a sequential MapReduce implementation, running Map and Reduce one at a time within a single process.
- Your task is to implement the Mapper and Reducer for Word Count in `mr_sequential.cc` (a minimal sketch follows below).
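For reference, here is a rough sketch of what the Word Count functions could look like. The `KeyVal` struct and the `Map`/`Reduce` signatures below are assumptions for illustration; follow whatever the skeleton in `mr_sequential.cc` actually declares (e.g., Map might also receive the filename).

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical key/value pair type; the skeleton may already define
// an equivalent struct.
struct KeyVal {
    std::string key;
    std::string val;
};

// Map: split the file content into words and emit a <word, "1"> pair
// for every occurrence.
std::vector<KeyVal> Map(const std::string &content) {
    std::vector<KeyVal> ret;
    size_t i = 0;
    while (i < content.size()) {
        // Skip non-alphabetic separators.
        while (i < content.size() && !std::isalpha(static_cast<unsigned char>(content[i]))) i++;
        size_t start = i;
        while (i < content.size() && std::isalpha(static_cast<unsigned char>(content[i]))) i++;
        if (i > start)
            ret.push_back({content.substr(start, i - start), "1"});
    }
    return ret;
}

// Reduce: the count of a word is the sum of all the "1"s emitted for it.
std::string Reduce(const std::string &key, const std::vector<std::string> &values) {
    (void)key;  // the word itself is not needed to compute the count
    unsigned long count = 0;
    for (const auto &v : values)
        count += std::stoul(v);
    return std::to_string(count);
}
```

Emitting the literal string `"1"` per occurrence keeps Map trivial; Reduce then only has to sum the values it is handed for each word.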
### Task 2
- Your task is to implement a distributed MapReduce, consisting of two programs, `mr_coordinator.cc` and `mr_worker.cc`. There will be only one coordinator process, but one or more worker processes executing concurrently.
- The workers should talk to the coordinator via RPC. One way to get started is to think about the RPC protocol in `mr_protocol.h` first (an illustrative sketch follows this list).
- Each worker process will ask the coordinator for a task, read the task's input from one or more files, execute the task, and write the task's output to one or more files.
- The coordinator should notice if a worker hasn't completed its task in a reasonable amount of time, and give the same task to a different worker.
- In a real system, the workers would run on a bunch of different machines, but for this lab you'll run them all on a single machine.
- MapReduce relies on the workers sharing a file system. This is why we ask you to implement a global distributed ChFS in the first place.
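Before writing any worker or coordinator code, it helps to pin down the messages they exchange. The sketch below is illustrative only; `mr_protocol.h` in the handout defines the real protocol, and every name and field here is an assumption:

```cpp
#include <string>

// Illustrative only: mr_protocol.h defines the actual types.
enum mr_tasktype {
    NONE = 0,  // nothing to hand out right now; the worker should retry later
    MAP,
    REDUCE,
};

// Reply to a worker's "ask for a task" RPC.
struct AskTaskReply {
    int tasktype;          // one of mr_tasktype
    int index;             // which Map/Reduce task this is
    std::string filename;  // input file (meaningful for Map tasks only)
};

// Arguments of a worker's "task finished" RPC; echoing the type and index
// lets the coordinator mark exactly that task as done.
struct SubmitTaskArgs {
    int tasktype;
    int index;
};
```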
#### Hints
- The number of Mappers equals the number of files to be processed. Each Mapper processes one file at a time.
- The number of Reducers is a fixed number defined in `mr_protocol.h`.
- The basic loop of one worker is the following: ask the coordinator for a task (Map or Reduce), do the task and write the intermediate key-value pairs into a file, then submit the task to the coordinator to signal completion (see the worker sketch after this list).
- The basic loop of the coordinator is the following: assign the Map tasks first; when all Map tasks are done, assign the Reduce tasks; when all Reduce tasks are done, `Done()` returns true, indicating that all tasks are finished (see the coordinator sketch after this list).
- Workers sometimes need to wait; e.g., Reduce tasks can't start until the last Map has finished. One possibility is for workers to periodically ask the coordinator for work, sleeping between requests. Another possibility is for the relevant RPC handler in the coordinator to have a loop that waits.
- The coordinator, as an RPC server, should be concurrent; hence please don't forget to lock the shared data.
- The Map part of your workers can use a hash function to distribute the intermediate key-values to different files intended for different Reduce tasks.
- A reasonable naming convention for intermediate files is `mr-X-Y`, where X is the Map task number and Y is the Reduce task number. The worker's Map task code will need to store intermediate key/value pairs in files in a way that can be correctly read back during Reduce tasks.
- Intermediate files are stored on the distributed filesystem you implemented in lab 3. If the filesystem's performance is poor, your MapReduce *will not pass the tests*!
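Putting the hints together, a worker's main loop might look roughly like the sketch below. All of the RPC wrappers and `do_map`/`do_reduce` helpers are assumed names for your own code, not the handout's actual API:

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

enum mr_tasktype { NONE = 0, MAP, REDUCE };  // as in the protocol sketch above
struct AskTaskReply { int tasktype; int index; std::string filename; };

// Assumed helpers you would write yourself:
AskTaskReply rpc_ask_task();                       // RPC: request a task
void rpc_submit_task(int tasktype, int index);     // RPC: report completion
void do_map(int index, const std::string &file);   // run Map, write mr-<index>-Y
void do_reduce(int index);                         // run Reduce over mr-X-<index>

// Pick the Reduce task (and hence the intermediate file mr-X-Y) for a key.
// std::hash is stable within a single run, which is all that is needed here.
int reduce_index(const std::string &key, int nreduce) {
    return static_cast<int>(std::hash<std::string>{}(key) % nreduce);
}

void worker_loop() {
    // A real worker also needs an exit condition, e.g. a reply telling it
    // that the whole job has finished.
    for (;;) {
        AskTaskReply reply = rpc_ask_task();
        if (reply.tasktype == MAP) {
            do_map(reply.index, reply.filename);
            rpc_submit_task(MAP, reply.index);
        } else if (reply.tasktype == REDUCE) {
            do_reduce(reply.index);
            rpc_submit_task(REDUCE, reply.index);
        } else {
            // No task available (e.g. Reduces waiting for the last Map):
            // sleep briefly and ask again.
            std::this_thread::sleep_for(std::chrono::milliseconds(200));
        }
    }
}
```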
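On the coordinator side, every RPC handler touches the shared task table, so each one should take the lock; a task that has been out longer than some timeout is treated as failed and handed out again. Below is a sketch under those assumptions (the field names and the 10-second timeout are illustrative, not prescribed by the handout):

```cpp
#include <chrono>
#include <mutex>
#include <vector>

using Clock = std::chrono::steady_clock;

// Per-task bookkeeping; a task handed out more than `timeout` ago is
// assumed lost and becomes assignable again.
struct TaskState {
    bool assigned = false;
    bool done = false;
    Clock::time_point started{};
};

class Coordinator {
public:
    // Hand out the next runnable Map task, or -1 if none is available.
    int assign_map_task() {
        std::unique_lock<std::mutex> lock(mtx);  // RPC handlers run concurrently
        auto now = Clock::now();
        for (size_t i = 0; i < map_tasks.size(); i++) {
            TaskState &t = map_tasks[i];
            if (t.done) continue;
            if (!t.assigned || now - t.started > timeout) {
                t.assigned = true;  // (re)assign: fresh task or timed-out one
                t.started = now;
                return static_cast<int>(i);
            }
        }
        return -1;  // everything is either running or done
    }

    void submit_map_task(int index) {
        std::unique_lock<std::mutex> lock(mtx);
        map_tasks[index].done = true;
    }

    // True once every Reduce task has completed.
    bool Done() {
        std::unique_lock<std::mutex> lock(mtx);
        for (const auto &t : reduce_tasks)
            if (!t.done) return false;
        return true;
    }

private:
    std::mutex mtx;
    std::vector<TaskState> map_tasks;
    std::vector<TaskState> reduce_tasks;
    const std::chrono::seconds timeout{10};  // assumed value; tune to the tests
};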
### Grading
Before grading, we will first check your lab3 implementation using the grade script in lab3.
If lab3's tests fail, you can get at most half of lab4's score.
After you have implemented both tasks, run the grading script:
```
% ./grade.sh
Passed part A (Word Count)
Passed part B (Word Count with distributed MapReduce)
Lab4 passed
Passed all tests!
Score: 100/100
```
We will test your MapReduce following the evaluation criteria above.
## Handin Procedure
After all of the above is done:
```
% make handin
```
That should produce a file called `lab4.tgz` in the current directory. Rename the file with your student id:
```
% mv lab4.tgz lab4_[your student id].tgz
```
Then upload **lab4_[your student id].tgz** file to [Canvas](https://oc.sjtu.edu.cn/courses/49245/assignments/197178) before the deadline.
You'll receive full credit if your code passes the same tests that we gave you when we run your code on our machines.