mincemeat.py: MapReduce on Python
=================================
Introduction
------------
mincemeat.py is a Python implementation of the [MapReduce](http://en.wikipedia.org/wiki/Mapreduce) distributed computing framework.
mincemeat.py is:
* Lightweight - All of the code is contained in a single Python file (currently weighing in at <13kB) that depends only on the Python Standard Library. Any computer with Python and mincemeat.py can be a part of your cluster.
* Fault tolerant - Workers (clients) can join and leave the cluster at any time without affecting the entire process.
* Secure - mincemeat.py authenticates both ends of every connection, ensuring that only authorized code is executed.
* Open source - mincemeat.py is distributed under the [MIT License](http://en.wikipedia.org/wiki/Mit_license), and consequently is free for all use, including commercial, personal, and academic, and can be modified and redistributed without restriction.
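
To make the "Secure" point above concrete, here is a minimal sketch of mutual password authentication using an HMAC challenge-response. It is an illustration only: the function names (`make_challenge`, `respond`, `verify`) and the choice of SHA-256 are assumptions, and mincemeat.py's actual wire protocol may differ. Each side sends a random nonce and accepts the peer only if the peer returns an HMAC of that nonce keyed with the shared password.

```python
import hashlib
import hmac
import os


def make_challenge():
    """Return a fresh random nonce for the peer to sign (hypothetical helper)."""
    return os.urandom(20)


def respond(password, challenge):
    """Prove knowledge of the password by keying an HMAC over the challenge."""
    key = password.encode("utf-8")
    return hmac.new(key, challenge, hashlib.sha256).hexdigest()


def verify(password, challenge, response):
    """Check a response using a constant-time comparison."""
    expected = respond(password, challenge)
    return hmac.compare_digest(expected, response)


# Each side challenges the other, so both ends of the connection are checked.
server_nonce = make_challenge()
client_nonce = make_challenge()
client_ok = verify("changeme", server_nonce, respond("changeme", server_nonce))
server_ok = verify("changeme", client_nonce, respond("changeme", client_nonce))

# A peer that does not know the password fails the check.
intruder_ok = verify("changeme", server_nonce, respond("wrong", server_nonce))
```
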
Download
--------
* Just [mincemeat.py](https://raw.github.com/michaelfairley/mincemeatpy/master/mincemeat.py) (v 0.1.2)
* The full 0.1.2 release (includes documentation and examples)
* Clone this git repository: `https://github.com/michaelfairley/mincemeatpy.git`
Example
-------
Let's look at the canonical MapReduce example, word counting:
example.py:
```python
#!/usr/bin/env python
import mincemeat

data = ["Humpty Dumpty sat on a wall",
        "Humpty Dumpty had a great fall",
        "All the King's horses and all the King's men",
        "Couldn't put Humpty together again",
        ]


def mapfn(k, v):
    # Emit (word, 1) for every whitespace-separated word in the line
    for w in v.split():
        yield w, 1


def reducefn(k, vs):
    # Sum the counts collected for a single word
    result = 0
    for v in vs:
        result += v
    return result


s = mincemeat.Server()

# The data source can be any dictionary-like object
s.datasource = dict(enumerate(data))
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print(results)
```
Execute this script on the server:
```bash
python example.py
```
Run mincemeat.py as a worker on a client:
```bash
python mincemeat.py -p changeme [server address]
```
And the server will print out:
```python
{'a': 2, 'on': 1, 'great': 1, 'Humpty': 3, 'again': 1, 'wall': 1, 'Dumpty': 2, 'men': 1, 'had': 1, 'all': 1, 'together': 1, "King's": 2, 'horses': 1, 'All': 1, "Couldn't": 1, 'fall': 1, 'and': 1, 'the': 2, 'put': 1, 'sat': 1}
```
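
To see what the server and workers compute between them, the same `mapfn` and `reducefn` can be run in a single process. The sketch below does not use mincemeat.py at all; `run_local` is a hypothetical helper that applies the map function to every item, groups the emitted values by key, and reduces each group. It should produce the same counts as the output above.

```python
from collections import defaultdict

data = ["Humpty Dumpty sat on a wall",
        "Humpty Dumpty had a great fall",
        "All the King's horses and all the King's men",
        "Couldn't put Humpty together again",
        ]


def mapfn(k, v):
    # Emit (word, 1) for every whitespace-separated word in the line
    for w in v.split():
        yield w, 1


def reducefn(k, vs):
    # Sum the counts collected for a single word
    return sum(vs)


def run_local(datasource, mapfn, reducefn):
    """Single-process stand-in for the distributed run (illustrative only)."""
    grouped = defaultdict(list)
    # Map phase: apply mapfn to each (key, value) item and group by output key
    for key in datasource:
        for out_key, out_value in mapfn(key, datasource[key]):
            grouped[out_key].append(out_value)
    # Reduce phase: collapse each group of values to a single result
    return dict((k, reducefn(k, vs)) for k, vs in grouped.items())


results = run_local(dict(enumerate(data)), mapfn, reducefn)
```
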
This example is deliberately simple, but the same code works just as well when the datasource is a collection of large files and the client runs on multiple machines. In fact, mincemeat.py has been used to produce word frequency lists for many gigabytes of text using a slightly modified version of this code.
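
As a sketch of the "collection of large files" case, the datasource can be a small dictionary-like class that maps file names to file contents and reads each file only when it is requested. `FileData` is a hypothetical name, and the sketch assumes the server needs only key iteration and `[]` lookup from its datasource; check mincemeat.py itself before relying on that. With such a class, the example would set `s.datasource = FileData(directory)` instead of building a `dict`. The demo below writes two temporary files so that it is self-contained.

```python
import os
import shutil
import tempfile


class FileData(object):
    """Hypothetical dictionary-like view of a directory: file name -> contents."""

    def __init__(self, directory):
        self.directory = directory
        self.names = sorted(os.listdir(directory))

    def __iter__(self):
        # Iterating yields the keys (file names), as iterating a dict does
        return iter(self.names)

    def __len__(self):
        return len(self.names)

    def __getitem__(self, name):
        if name not in self.names:
            raise KeyError(name)
        # Read lazily so only the requested file is held in memory
        with open(os.path.join(self.directory, name)) as f:
            return f.read()


# Demo: build a temporary directory containing two small text files.
tmp = tempfile.mkdtemp()
try:
    for name, text in [("a.txt", "Humpty Dumpty sat on a wall"),
                       ("b.txt", "Humpty Dumpty had a great fall")]:
        with open(os.path.join(tmp, name), "w") as f:
            f.write(text)
    source = FileData(tmp)
    keys = list(source)
    first = source["a.txt"]
    count = len(source)
finally:
    shutil.rmtree(tmp)
```
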
web intelligence and big data
273 files in total (pdf: 9, py: 3, zip: 3); 30.78MB RAR archive, uploaded 2013-06-03 09:32:53.
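Lecture notes and assignments from the open course offered by the Indian Institute of Technology on https://www.coursera.org/, covering MapReduce, Bayesian classification, Bayesian belief networks, and more.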
web intelligence and big data (273 files)
c0001 17KB
1.csv 4.69MB
2.csv 702KB
hw3_result_sorted 8.45MB
LICENSE 1KB
README.md 3KB
p0001 663B
8-Predict Lecture Slides.pdf 3.8MB
3-Load-Lecture-Slides.pdf 1.73MB
2-Listen Lecture Slides.pdf 1.49MB
1-Look Lecture Slides.pdf 1.17MB
0-Introduction Lecture Slides.pdf 842KB
4-Load Lecture Slides.pdf 820KB
5-Learn Lecture Slides.pdf 542KB
6-Connect Lecture Slides.pdf 318KB
HW6.pdf 147KB
mincemeat.py 12KB
hw3.py 4KB
example.py 854B
mincemeat.pyc 13KB
genestrain.tab 58.34MB
genesblind.tab 8.44MB
作业说明文档.txt 3KB
w0001 112B
x000 22KB
x0000 23KB
x0001 24KB
x0002 25KB
x0003 24KB
x0004 24KB
x0005 24KB
x0006 24KB
x0007 27KB
x0008 25KB
x0009 22KB
x001 25KB
x0010 22KB
x0011 23KB
x0012 26KB
x0013 26KB
x0014 27KB
x0015 25KB
x0016 20KB
x0017 24KB
x0018 25KB
x0019 24KB
x002 23KB
x0020 27KB
x0021 26KB
x0022 24KB
x0023 26KB
x0024 26KB
x0025 25KB
x0026 25KB
x0027 25KB
x0028 25KB
x0029 25KB
x003 23KB
x0030 26KB
x0031 25KB
x0032 24KB
x0033 23KB
x0034 25KB
x0035 25KB
x0036 25KB
x0037 23KB
x0038 24KB
x0039 25KB
x004 22KB
x0040 25KB
x0041 24KB
x0042 24KB
x0043 26KB
x0044 27KB
x0045 24KB
x0046 25KB
x0047 25KB
x0048 23KB
x0049 24KB
x005 25KB
x0050 25KB
x0051 25KB
x0052 24KB
x0053 23KB
x0054 27KB
x0055 27KB
x0056 27KB
x0057 25KB
x0058 25KB
x0059 25KB
x006 25KB
x0060 27KB
x0061 28KB
x0062 23KB
x0063 25KB
x0064 27KB
x0065 28KB
x0066 28KB
x0067 25KB
x0068 25KB