Google Architecture
Saturday, November 22, 2008 at 10:01AM
Todd Hoff in BigTable, C, Cluster File System, Example, Geo-distributed Clusters, Java,
Linux, Map Reduce, Python
Update 2: Sorting 1 PB with MapReduce. PB is not peanut-butter-and-
jelly misspelled. It's 1 petabyte, or 1,000 terabytes, or 1,000,000 gigabytes.
It took six hours and two minutes to sort 1 PB (10 trillion 100-byte
records) on 4,000 computers, and the results were replicated three times
across 48,000 disks.
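The arithmetic behind those numbers is easy to sanity check. A quick back-of-envelope sketch in Python (the variable names and the derived figures are mine, not Google's):

    records = 10 * 10**12             # 10 trillion records
    record_size = 100                 # bytes per record
    total_bytes = records * record_size
    print(total_bytes)                # 1e15 bytes = 1 PB (decimal)

    machines = 4000
    disks = 48000
    print(disks / machines)           # 12 disks per machine, on average

    seconds = 6 * 3600 + 2 * 60       # six hours and two minutes
    print(total_bytes / seconds / 1e9)  # roughly 46 GB/s aggregate sort throughput

That works out to roughly 11-12 MB/s of sorted, replicated output per machine, sustained for the entire run.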
Update: Greg Linden points to a new Google article, MapReduce:
Simplified Data Processing on Large Clusters. Some interesting stats:
100k MapReduce jobs are executed each day; more than 20 petabytes of
data are processed per day; more than 10k MapReduce programs have
been implemented; machines are dual-processor with gigabit Ethernet and
4-8 GB of memory.
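Those figures also imply an average job size. A tiny sketch of the arithmetic (mine, not a number reported in the paper):

    jobs_per_day = 100000
    bytes_per_day = 20 * 10**15       # 20 petabytes, decimal
    print(bytes_per_day / jobs_per_day / 10**9)  # ~200 GB processed per job on average

The real distribution of job sizes is almost certainly skewed, so treat 200 GB per job as nothing more than a rough average.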
Google is the King of scalability. Everyone knows Google for their large,
sophisticated, and fast search, but they don't just shine in search. Their
platform approach to building scalable applications allows them to roll out
internet-scale applications at an alarmingly high, competition-crushing
rate. Their goal is always to build a higher-performing, higher-scaling
infrastructure to support their products. How do they do that?
Information Sources
1. Video: Building Large Systems at Google
2. Google Lab: The Google File System
3. Google Lab: MapReduce: Simplified Data Processing on Large Clusters