This paper is targeted toward kernel developers and architects interested in the details of enabling service drivers for PCI Express Ports. The i386 Linux implementation will be used as a reference model to provide insight into the implementation of the PCI Express Port Bus Driver and specific service drivers like the advanced error reporting root service driver and the native hot-plug root service driver.
sysfs is a feature of the Linux 2.6 kernel that allows kernel code to export information to user processes via an in-memory filesystem. The organization of the filesystem directory hierarchy is strict, and based on the internal organization of kernel data structures. The files that are created in the filesystem are (mostly) ASCII files with (usually) one value per file. These features ensure that the information exported is accurate and easily accessible, making sysfs one of the most intuitive and useful features of the 2.6 kernel.
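The "one value per file" convention means a user process can read an exported attribute with ordinary file I/O. The following is a minimal sketch of that idea, not code from the sysfs paper. The helper names (`read_attribute`, `read_attributes`, `make_demo_tree`) and the attribute names and values are hypothetical, and the demo builds a temporary directory that stands in for a real sysfs device directory so that it runs on any system.

```python
import os
import tempfile


def read_attribute(path):
    """Return the single value stored in a sysfs-style attribute file."""
    with open(path, "r", encoding="ascii") as f:
        # Attribute files typically end with a newline; strip it.
        return f.read().strip()


def read_attributes(directory):
    """Map each regular file in `directory` to the value it contains."""
    values = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            values[name] = read_attribute(full)
    return values


def make_demo_tree():
    """Create a temporary stand-in for a device directory (hypothetical data)."""
    root = tempfile.mkdtemp()
    for name, value in {"vendor": "0x8086", "device": "0x1234"}.items():
        with open(os.path.join(root, name), "w", encoding="ascii") as f:
            f.write(value + "\n")
    return root


if __name__ == "__main__":
    demo = make_demo_tree()
    for attr, value in read_attributes(demo).items():
        print(attr, "=", value)
```

On a real system the same `read_attribute` call could be pointed at a file under `/sys`, with the caveat that some attributes are not plain ASCII or are not readable by unprivileged users, so a robust tool would handle those errors.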
At the highest level of description, this book is about data mining. However, it focuses on data mining of very large amounts of data, that is, data so large it does not fit in main memory. Because of the emphasis on size, many of our examples are about the Web or data derived from the Web. Further, the book takes an algorithmic point of view: data mining is about applying algorithms to data, rather than using data to “train” a machine-learning engine of some sort.
Written by high performance computing (HPC) experts, Introduction to High Performance Computing for Scientists and Engineers provides a solid introduction to current mainstream computer architecture, dominant parallel programming models, and useful optimization strategies for scientific HPC. The book facilitates an intuitive understanding of performance limitations without relying on heavy computer science knowledge. It also prepares readers for studying more advanced literature.
This is the User’s Guide for the InnoDB storage engine 1.1 for MySQL 5.5.
Beginning with MySQL version 5.1, it is possible to swap out one version of the InnoDB storage engine and use
another (the “plugin”). This manual documents the latest InnoDB plugin, version 1.1, which works with MySQL 5.5 and
features cutting-edge improvements in performance and scalability.
Bigtable is a distributed storage system for managing
structured data that is designed to scale to a very large
size: petabytes of data across thousands of commodity
servers. Many projects at Google store data in Bigtable,
including web indexing, Google Earth, and Google Finance.
These applications place very different demands
on Bigtable, both in terms of data size (from URLs to
web pages to satellite imagery) and latency requirements
(from backend bulk processing to real-time data serving).
Despite these varied demands, Bigtable has successfully
provided a flexible, high-performance solution for all of
these Google products. In this paper we describe the simple
data model provided by Bigtable, which gives clients
dynamic control over data layout and format, and we describe
the design and implementation of Bigtable.
We have designed and implemented the Google File System,
a scalable distributed file system for large distributed
data-intensive applications. It provides fault tolerance while
running on inexpensive commodity hardware, and it delivers
high aggregate performance to a large number of clients.
While sharing many of the same goals as previous distributed
file systems, our design has been driven by observations
of our application workloads and technological environment,
both current and anticipated, that reflect a marked
departure from some earlier file system assumptions. This
has led us to reexamine traditional choices and explore radically
different design points.
The file system has successfully met our storage needs.
It is widely deployed within Google as the storage platform
for the generation and processing of data used by our service
as well as research and development efforts that require
large data sets. The largest cluster to date provides hundreds
of terabytes of storage across thousands of disks on
over a thousand machines, and it is concurrently accessed
by hundreds of clients.
In this paper, we present file system interface extensions
designed to support distributed applications, discuss many
aspects of our design, and report measurements from both
micro-benchmarks and real-world use.