IBM Systems and Technology Group
By Scott Fadden, IBM Corporation
August 2012
An Introduction to GPFS Version 3.5
Technologies that enable the management of big data.
Contents
Introduction
What is GPFS?
The file system
    Application interfaces
    Performance and scalability
    Administration
Data availability
    Data replication
GPFS Native RAID (GNR)
Information lifecycle management (ILM) toolset
Cluster configurations
    Shared disk
    Network-based block I/O
    Mixed clusters
    Sharing data between clusters
What’s new in GPFS Version 3.5
    Active File Management
    High Performance Extended Attributes
    Independent Filesets
    Fileset Level Snapshots
    Fileset Level Quotas
    File Cloning
    IPv6 Support
    GPFS Native RAID
Summary
Introduction
Big data, cloud storage: it doesn’t matter what you call it, there is certainly
an increasing demand to store larger and larger amounts of unstructured data.
The IBM General Parallel File System (GPFS™) has always been considered a
pioneer of big data storage and continues today to introduce industry leading
storage technologies¹. Since 1998 GPFS has led the industry with many
technologies that make the storage of large quantities of file data possible.
The latest version continues in that tradition: GPFS 3.5 represents a
significant milestone in the evolution of big data management, introducing
revolutionary new features that clearly demonstrate IBM’s commitment to
providing industry leading storage solutions.
This paper does not just throw out a bunch of buzzwords; it explains the
features available today in GPFS that you can use to manage your file data.
These include core GPFS concepts such as striped data storage; cluster
configuration options, including direct storage access and network-based block
I/O; storage automation technologies such as the information lifecycle
management (ILM) tools; and new features including file cloning, more flexible
snapshots and an innovative global namespace feature called Active File
Management (AFM).
This paper is based on the latest release of GPFS, though much of the
information applies to prior releases. If you are already familiar with GPFS,
take a look at the “What’s new” section for a quick update on the new features
introduced in GPFS 3.5.
¹ 2011 Annual HPCwire Readers’ Choice Awards:
http://www.hpcwire.com/specialfeatures/2011_Annual_HPCwire_Readers_Choice_Awards.html
What is GPFS?
GPFS is more than clustered file system software; it is a full featured set of file
management tools. This includes advanced storage virtualization, integrated
high availability, automated tiered storage management and the performance
to effectively manage very large quantities of file data.
GPFS allows a group of computers concurrent access to a common set of file
data over a common SAN infrastructure, a network or a mix of connection types.
The computers can run any mix of AIX, Linux or Windows Server operating
systems. GPFS provides storage management, information lifecycle management
tools and centralized administration, and it allows shared access to file
systems from remote GPFS clusters, providing a global namespace.
A GPFS cluster can be a single node, two nodes providing a high availability
platform supporting a database application, for example, or thousands of nodes
used for applications like the modeling of weather patterns. The largest
existing configurations exceed 5,000 nodes. GPFS has been available since 1998
and has been field proven for more than 14 years on some of the world’s most
powerful supercomputers², providing reliability and efficient use of
infrastructure bandwidth.
GPFS was designed from the beginning to support high performance parallel
workloads and has since been proven very effective for a variety of applications.
Today it is installed in clusters supporting big data analytics, gene sequencing,
digital media and scalable file serving. These applications are used across many
industries including financial, retail, digital media, biotechnology, science and
government. GPFS continues to push technology limits by being deployed in
very demanding, very large environments. You may not need multiple petabytes
of data today, but you will, and when you get there you can rest assured that
GPFS has already been tested in such environments. This leadership is what
makes GPFS a solid solution for applications of any size.
Supported operating systems for GPFS Version 3.5 include AIX, Red Hat, SUSE
and Debian Linux distributions and Windows Server 2008.
The file system
A GPFS file system is built from a collection of arrays that contain the file system
data and metadata. A file system can be built from a single disk or contain
thousands of disks storing petabytes of data. Each file system is accessible
from all nodes within the cluster. There is no practical limit on the size of
a file system; the architectural limit is 2⁹⁹ bytes. As an example, current
GPFS customers are using single file systems up to 5.4 PB in size, and others
have file systems containing billions of files.

² See the top 100 list from November 2011. Source: Top 500 Supercomputer
Sites: http://www.top500.org/
Application interfaces
Applications access files through standard POSIX file system interfaces. Since all
nodes see all of the file data, applications can scale out easily. Any node in the
cluster can concurrently read or update a common set of files. GPFS maintains
the coherency and consistency of the file system using sophisticated byte range
locking, token (distributed lock) management and journaling. This means that
applications using standard POSIX locking semantics do not need to be modified
to run successfully on a GPFS file system.
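For example, the following is a minimal sketch of ordinary POSIX byte-range
locking that runs unmodified on a GPFS file system. The mount point /gpfs/fs1
and the one-region-per-node layout are hypothetical, chosen only to illustrate
concurrent, non-conflicting updates from multiple nodes.

    /* Minimal sketch: standard POSIX byte-range locking, unmodified on
       GPFS. Each instance (run on any node in the cluster) locks and
       updates its own 4 KiB region of a shared file. The path
       /gpfs/fs1/shared.dat is a hypothetical GPFS mount point. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int region = (argc > 1) ? atoi(argv[1]) : 0;  /* region index */
        int fd = open("/gpfs/fs1/shared.dat", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type   = F_WRLCK;               /* exclusive write lock */
        fl.l_whence = SEEK_SET;
        fl.l_start  = (off_t)region * 4096;  /* this node's byte range */
        fl.l_len    = 4096;

        if (fcntl(fd, F_SETLKW, &fl) < 0) {  /* block until granted */
            perror("fcntl");
            return 1;
        }

        char buf[4096] = "updated";          /* update the locked range */
        pwrite(fd, buf, sizeof(buf), fl.l_start);

        fl.l_type = F_UNLCK;                 /* release the lock */
        fcntl(fd, F_SETLK, &fl);
        close(fd);
        return 0;
    }

Because GPFS enforces these semantics cluster-wide through its token
management, instances locking disjoint ranges proceed in parallel while
conflicting requests are serialized.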
In addition to the standard interfaces, GPFS provides a unique set of extended
interfaces that can be used to build advanced application functionality. Using
these extended interfaces, an application can determine the storage pool
placement of a file, create a file clone and manage quotas. These extended
interfaces supplement, rather than replace, the standard POSIX interface.
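As an illustration, the sketch below asks GPFS which storage pool a file
resides in through the gpfs_fcntl() extended interface. The structure and
constant names (gpfsFcntlHeader_t, gpfsGetStoragePool_t,
GPFS_FCNTL_GET_STORAGEPOOL) follow the GPFS programming reference, but treat
this as a sketch and verify the declarations against the gpfs_fcntl.h header
shipped with your installation.

    /* Sketch: query a file's storage pool using the GPFS extended
       interface gpfs_fcntl(). Requires the GPFS headers and library
       (compile with -lgpfs); verify names against your gpfs_fcntl.h. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <gpfs_fcntl.h>

    int main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* gpfs_fcntl() takes a header followed by one or more requests. */
        struct {
            gpfsFcntlHeader_t    hdr;
            gpfsGetStoragePool_t pool;
        } arg;
        memset(&arg, 0, sizeof(arg));
        arg.hdr.totalLength  = sizeof(arg);
        arg.hdr.fcntlVersion = GPFS_FCNTL_CURRENT_VERSION;
        arg.pool.structLen   = sizeof(arg.pool);
        arg.pool.structType  = GPFS_FCNTL_GET_STORAGEPOOL;

        if (gpfs_fcntl(fd, &arg) != 0) { perror("gpfs_fcntl"); return 1; }

        printf("%s is in storage pool: %s\n", argv[1], arg.pool.buffer);
        close(fd);
        return 0;
    }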
Performance and scalability
GPFS provides unparalleled performance for unstructured data. GPFS achieves
high performance I/O by:
- Striping data across multiple disks attached to multiple nodes.
- Performing high performance metadata (inode) scans.
- Supporting a wide range of file system block sizes to match I/O
  requirements.
- Using advanced algorithms to improve read-ahead and write-behind I/O
  operations.
- Using block level locking, based on a very sophisticated and scalable token
  management system, to provide data consistency while allowing multiple
  application nodes concurrent access to the files.
When creating a GPFS file system you provide a list of raw devices, and they
are assigned to GPFS as Network Shared Disks (NSDs). Once an NSD is defined,
all of the nodes in the GPFS cluster can access the disk, either through a
local disk connection or through the GPFS NSD network protocol, which ships
data over a TCP/IP or InfiniBand connection.
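To benefit from this striping, applications typically issue large sequential
I/O requests so that each call maps onto whole file system blocks spread
across the NSDs. The fragment below is a sketch of such a writer; the 4 MiB
request size is an assumption chosen to match a hypothetical file system block
size, not a GPFS requirement, and the target path is a hypothetical mount.

    /* Sketch: sequential writer sized to a hypothetical 4 MiB file
       system block size, so each request covers whole GPFS blocks and
       the striping and write-behind logic can stream data across all
       of the NSDs. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define REQUEST_SIZE (4 * 1024 * 1024)  /* assumed FS block size */
    #define NUM_REQUESTS 256                /* 1 GiB total */

    int main(void)
    {
        int fd = open("/gpfs/fs1/stream.dat",
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        char *buf = malloc(REQUEST_SIZE);
        if (buf == NULL) { perror("malloc"); return 1; }
        memset(buf, 'x', REQUEST_SIZE);

        for (int i = 0; i < NUM_REQUESTS; i++) {
            if (write(fd, buf, REQUEST_SIZE) != REQUEST_SIZE) {
                perror("write");
                return 1;
            }
        }

        free(buf);
        close(fd);
        return 0;
    }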