Introduction
This chapter is an introduction to graph databases, Neo4j, and the Neo4j object-graph
mapping library (OGM).
What is a graph database?
A graph database is a storage engine that specialises in storing and retrieving vast networks of
information. It efficiently stores data as nodes and relationships and allows high-performance retrieval
and querying of those structures. Properties can be added to both nodes and relationships. Nodes
can carry zero or more labels; relationships are always directed and named.
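The model described above can be sketched in a few lines of plain Java. This is only an illustrative in-memory structure: the names `PropertyGraphSketch`, `Node`, `Relationship`, `addNode`, `connect` and `outgoing` are assumptions made for this example and are not part of Neo4j or its drivers. The linear scan in `outgoing` is also a simplification; it does not reflect how a native graph store follows relationships.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of a property graph; not Neo4j's storage format or API.
class PropertyGraphSketch {

    // A node carries zero or more labels and a map of properties.
    static final class Node {
        final Set<String> labels = new LinkedHashSet<>();
        final Map<String, Object> properties = new HashMap<>();

        Node(String... labels) {
            for (String label : labels) {
                this.labels.add(label);
            }
        }
    }

    // A relationship is always directed (start -> end), has a name (type),
    // and can carry its own properties.
    static final class Relationship {
        final Node start;
        final String type;
        final Node end;
        final Map<String, Object> properties = new HashMap<>();

        Relationship(Node start, String type, Node end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Relationship> relationships = new ArrayList<>();

    Node addNode(String... labels) {
        Node node = new Node(labels);
        nodes.add(node);
        return node;
    }

    Relationship connect(Node start, String type, Node end) {
        Relationship rel = new Relationship(start, type, end);
        relationships.add(rel);
        return rel;
    }

    // Follow outgoing relationships of one type from a node (simple linear scan).
    List<Node> outgoing(Node from, String type) {
        List<Node> result = new ArrayList<>();
        for (Relationship rel : relationships) {
            if (rel.start == from && rel.type.equals(type)) {
                result.add(rel.end);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PropertyGraphSketch graph = new PropertyGraphSketch();
        Node alice = graph.addNode("Person");
        alice.properties.put("name", "Alice");
        Node acme = graph.addNode("Company");
        acme.properties.put("name", "Acme");

        // The relationship itself carries a property, without becoming a "thing".
        Relationship worksAt = graph.connect(alice, "WORKS_AT", acme);
        worksAt.properties.put("since", 2015);

        for (Node employer : graph.outgoing(alice, "WORKS_AT")) {
            System.out.println(employer.properties.get("name"));
        }
    }
}
```

Because direction is part of the relationship, asking for the outgoing `WORKS_AT` relationships of the company node in this sketch returns nothing.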
Graph databases are well suited for storing most kinds of domain models. In almost all domains,
there are certain things connected to other things. In most other modelling approaches, the
relationships between things are reduced to a single link without identity and attributes. Graph
databases let you keep the rich relationships that originate from the domain equally well
represented in the database, without resorting to also modelling the relationships as "things". There is
very little "impedance mismatch" when putting real-life domains into a graph database.
What is an OGM?
An OGM (Object Graph Mapper) maps nodes and relationships in the graph to objects and references
in your domain model. Object instances are mapped to nodes while object references are mapped
using relationships, or serialized to properties (e.g. references to a Date). JVM primitives are mapped
to node or relationship properties. An OGM abstracts the database and provides a convenient way to
persist your domain model in the graph and query it without using low-level drivers. It also lets
the developer supply custom queries where the queries generated by the OGM are
insufficient.
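To make the mapping idea concrete, here is a toy mapper written with plain Java reflection. It is a sketch under stated assumptions, not the Neo4j OGM API: `ToyMapperSketch`, `MappedNode`, `MappedRelationship` and the `Person` class are hypothetical names, the relationship name is simply the upper-cased field name, and a `Date` is stored as epoch milliseconds only because that is the simplest encoding to show. A real OGM has its own conventions for all of these.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Date;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy object-graph mapper; every name here is hypothetical, not the Neo4j OGM API.
class ToyMapperSketch {

    static final class MappedNode {
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();
        MappedNode(String label) { this.label = label; }
    }

    static final class MappedRelationship {
        final MappedNode start;
        final String type;
        final MappedNode end;
        MappedRelationship(MappedNode start, String type, MappedNode end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    // Example domain class: simple fields, a Date, and a reference to another object.
    static final class Person {
        String name;
        int age;
        Date hired;
        Person manager;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    final List<MappedRelationship> relationships = new ArrayList<>();
    // Identity map so each object instance becomes exactly one node, even with cycles.
    private final Map<Object, MappedNode> seen = new IdentityHashMap<>();

    MappedNode map(Object entity) {
        MappedNode existing = seen.get(entity);
        if (existing != null) {
            return existing;
        }
        MappedNode node = new MappedNode(entity.getClass().getSimpleName());
        seen.put(entity, node);
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            Object value;
            try {
                field.setAccessible(true);
                value = field.get(entity);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
            if (value == null) {
                continue;
            }
            if (isSimple(value)) {
                // Primitives (boxed by reflection) and strings become node properties.
                node.properties.put(field.getName(), value);
            } else if (value instanceof Date) {
                // Assumption for this sketch: serialise a Date to epoch milliseconds.
                node.properties.put(field.getName(), ((Date) value).getTime());
            } else {
                // Any other object reference becomes a directed, named relationship.
                String type = field.getName().toUpperCase(Locale.ROOT);
                relationships.add(new MappedRelationship(node, type, map(value)));
            }
        }
        return node;
    }

    static boolean isSimple(Object value) {
        return value instanceof String || value instanceof Number
                || value instanceof Boolean || value instanceof Character;
    }

    public static void main(String[] args) {
        Person boss = new Person("Grace", 50);
        Person dev = new Person("Linus", 30);
        dev.manager = boss;

        ToyMapperSketch mapper = new ToyMapperSketch();
        MappedNode node = mapper.map(dev);
        System.out.println(node.label + " " + node.properties);
        for (MappedRelationship rel : mapper.relationships) {
            System.out.println("(" + rel.start.properties.get("name") + ")-[:" + rel.type
                    + "]->(" + rel.end.properties.get("name") + ")");
        }
    }
}
```

The sketch only handles single references between domain classes defined alongside it; collections, inheritance and persistence to an actual database are left out.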
About Neo4j
Neo4j (http://neo4j.com/) is an open source NoSQL graph database. It is a fully transactional database
(ACID) that stores data structured as graphs consisting of nodes, connected by relationships. Inspired
by the structure of the real world, it allows for high query performance on complex data, while
remaining intuitive and simple for the developer.
Neo4j is well established: it has been in commercial development for 15 years and in production
for over 12 years. Most importantly, it has an active, contributing community. In addition,
Neo4j:
• has an intuitive, rich graph-oriented model for data representation. Instead of tables, rows, and
columns, you work with a graph consisting of nodes, relationships, and properties
(http://neo4j.com/docs/stable/graphdb-neo4j.html).
• has a disk-based, native storage manager optimised for storing graph structures with maximum
performance and scalability.
• is scalable. Neo4j can handle graphs with many billions of nodes/relationships/properties on a
single machine, but can also be scaled out across multiple machines for high availability.
• has a powerful graph query language called Cypher, which allows users to efficiently read/write
data by expressing graph patterns.
• has a powerful traversal framework and query languages for traversing the graph.
• can be deployed as a standalone server, which is the recommended way of using Neo4j.
• can be deployed as an embedded (in-process) database, giving developers access to its core Java