SQLite Database System
Design and Implementation
Si bsankar Haldar
Motorola Mobility, Inc
809 11th A venue
Sunnyvale, CA 94089
United States of America
October 21, 2011
This page is intentionally kept blank.
Preface
This book aims to setting up a mental framework for readers to study, design, develop, maintain,
and enhance database management system softwares. At the very beginning of the book, I review
many fundamental database concepts that are formulated over the past several decades and that
prevail in many modern database systems. I am mostly brief in discussing these concepts. I put
more emphasize on the presentation of a particular SQL-based relational database management
system, namely SQLite.
In a major part of this book, I discuss design principles, engineering
trade offs, implementation issues, and operations of SQLite. The book presents a comprehensive
description of all important components of SQLite. That is, it presents an end-to-end picture on
SQLite.
Many different varieties of database management systems have been developed over the past
several decades. DB2, Informix, Ingres, MySQL, Oracle, PostgreSQL, SQL Server, Sybase are a few
to mention here. These systems are commercially very successful. There were some more database
systems that were commercially unsuccessful or became obsolete over time. These database systems
(successful or not) profess many theoretical and technological challenges.
All the theories and
technologies are primarily geared toward two goals: user convenience in the management of data,
and efficient ways of storing, retrieving, and manipulating data.
Because the database field is so large, it is impossible to accommodate all known theories and
technologies in a single book like this one, especially when I am primarily interested in presenting
a particular database management system. I do not attempt to cover everything from the field.
As far as theories are concerned, I discuss core concepts that prevail in many modern relational
database management systems. As far as technologies are concerned, I present internals of SQLite
database management system. SQLite database system is the only focus of this book.
It is a small,
embeddable, SQL-based relational database management system.
It has been widely used in low to
medium tier database applications, especially in embedded devices.
1
One can find some well known
users of SQLite at the webpage http:/ /www.sqlite.org/famous.html.
It is an open source software,
and is available in the public domain free of cost. There is no copyright claim on any part of the core
1
SQLite received the 2005 OSCON Google and O'Reilly Integrator category award.
11
SQLite source code.
2
One may download the source code from its homepage http://www.sqlite.org,
and experiment with it by modifying different parts of the code. SQLite runs on Linux, Windows,
MAC
OS X, and a few other operating systems. In this book I restrict myself to the Linux version
of SQLite 3.7.8 release that is the latest version as of September 19, 2011.
This book would be a valuable asset to database course students in academic institutions as
well as to professional developers of database management systems and SQLite applications in
the computer industry.
It can be used as a companion text book for advance database courses in
Computer Science and Information Technology disciplines.
It will be highly useful for those students
who want to quickly learn about internals of a typical database system or take short term projects
in database management systems. Project students will get access to some basic infrastructure
to experiment with a real database system; they will get acquainted with core implementation
concepts in a short time. There is no license issue in using SQLite in what so ever way possible.
It
would certainly be a valuable book for those who want to develop SQLite applications. This book
however does not teach how to diagnose or debug defects in database management system softwares
or applications. In essence, after reading this book, students and professional software developers
would be aware of principal challenges in designing and developing an embedded database system,
and solutions adopted by the SQLite development team in particular. And, they (readers) may get
emotional and start developing their own database management systems.
I would assume that readers are familiar with structured computer programming practices, and
they have some work experience in developing and testing softwares using the
C programming
language. SQLite itself is written in
ANSI C; the source distribution also comes with numerous
test cases in C. All example programs in this book are presented in the
C language syntax, pseudo
codes, and sometimes, in plain English sentences. Readers are also expected to have some basic
knowledge of computer operating system and database theory. I would also assume that they
are familiar with SQL that is widely used as the application programming language for relational
databases. All database applications in this book are written in SQL.
Designing a new database management system depends on what services (aka, features) the
database system would have for users. I will identify many fundamental features, and discuss
them in-depth in this book, geared toward how they are handled in SQLite. In the design process,
a database management system is split into many component subsystems. Different subsystem
offers different services to database users and/or to other subsystems. Designers need to clearly
specify interfaces to these subsystems. Finally, the subsystem softwares are developed, integrated,
and tested to form a complete database management system that can be used in a production
environment. A database management system should be simple, efficient, and reliable. In addition,
2
Interesting copyright statement can be seen at http://www.sqlite.org/copyright.html.
lll
the system should also be easy to maintain and enhance, and easy to port across different computer
platforms. SQLite has these properties.
Contents
This book consists of 11 chapters. Chapter 1 discusses general concepts from the domains of
operating systems and databases, and Chapters
2-10 SQLite internals. Chapter 11 provides some
references to the literature.
Chapter 1 first presents a very general introduction to computer hardware and operating system.
It reviews some essential concepts from the operating system domain, which are required for the
development of database management systems.
It briefly explains why we need operating systems
in computers, and what they really do for computer users.
It presents a bird's-eye view of typical
operating systems, and provides readers an intuitive but overall understanding of working princi-
ples of the systems.
It then explains why we need databases and database management systems.
It explains what the systems really do for database users. It presents a bird's-eye view of typical
database management systems, and provides readers an intuitive understanding of working princi-
ples of the systems. It touches upon almost all major component subsystems of a typical database
management system.
It introduces several key concepts and terminologies related to databases,
especially those from the domain of relational database management. It talks about data model,
relation, schema, integrity constraint, index, transaction, ACID transactional properties, and so
forth.
It explains various ways of manipulating data in databases. It also presents a few simple
examples of SQL programming. These concepts are explored in later chapters, in the context of
the SQLite database system. Overall, it is a very short tour of operating systems and database
management systems.
Chapter 2 presents an overview of SQLite that is an SQL-based relational database manage-
ment system.
It introduces all components of SQLite in top-down fashion. These components are
elaborately discussed in bottom-up fashion in later chapters. The chapter also talks about a few
SQLite APis, and presents some very simple SQLite applications. Overall, the chapter is a short
tour of SQLite, and it sets up the ground plan for the rest of the book.
Chapter 3 presents the lowest level storage organization in SQLite.
It defines naming con-
ventions for database and journal files, and their formats.
It talks about how a database file is
partitioned into fixed size pages, who uses those pages, and organization of the pages.
It also
presents formats of journal files that are used to store log records produced by applications.
Chapter 4 discusses how SQLite manages user queries and updates via transactions. SQLite
executes each and every SQL statement in the abstraction of a transaction, and ensures ACID