uses association rules on extracted terms to find subsump-
tion relations between classes and properties of different on-
tologies. Again, these systems do not mine the kind of rules
we are interested in.
In [1] association rules and frequency analysis are used to
identify and classify common misusage patterns for relations
in DBpedia. In contrast to our work, this approach does not
mine logical rules, but association rules on the co-occurrence
of values. Since RDF data can be seen as a graph, mining
frequent subtrees [6, 18] is another related field of research.
However, as the URIs of resources in knowledge bases are
unique, these techniques are limited to mining frequent com-
binations of classes.
Several approaches, such as Markov Logic [31] or URDF
[27] use Horn rules to perform reasoning. These approaches
can be consumers of the rules we mine with AMIE.
3. PRELIMINARIES
RDF KBs. In this paper, we focus on RDF knowledge
bases
5
. An RDF KB can be considered a set of facts, where
each fact is a triple of the form hx, r, yi with x denoting the
subject, r the relation (or predicate), and y the object of the
fact. There are several equivalent alternative representations
of facts; in this paper we use a logical notation and represent
a fact as r(x, y). For example, we write father(Elvis,Lisa).
The facts of an RDF KB can usually be divided into an A-
Box and a T-Box. While the A-Box contains instance data,
the T-Box is the subset of facts that define classes, domains,
ranges for predicates, and the class hierarchy. Although T-
Box information can also be used by our mining approach,
we are mainly concerned with the A-Box, i.e., the set of facts
relating one particular entity to another.
In the following, we assume a given KB K as input. Let
R = π
relation
(K) denote the set of relations contained in K
and E = π
subject
(K) ∪ π
object
(K) the set of entities.
Functions. A function is a relation r that has at most one
object for every subject, i.e., ∀x : |{y : r(x, y)}| ≤ 1. A
relation is an inverse function if each of its objects has at
most one subject. Since RDF KBs are usually noisy, even
relations that should be functions (such as hasBirthdate)
may exhibit two objects for the same subject. Therefore,
we use the notion of functionality [33]. The functionality of
a relation r is a value between 0 and 1, that is 1 if r is a
function:
fun(r) :=
#x : ∃y : r(x, y)
#(x, y) : r(x, y)
with #x : X as an abbreviation for |{x : X ∈ K}|. The
inverse functionality is defined accordingly as ifun(r) :=
fun(r
−1
). Without loss of generality, we assume that ∀r ∈
R : f un(r) ≥ ifun(r) (FUN-Property). If that is not
the case for a relation r, we can replace all facts r(x, y)
with the inverse relation, r
−
(y, x), which entails f un(r
−
) ≥
ifun(r
−
). For example, if the KB contains the inverse
functional relation directed(person,movie), we can create the
functional relation isDirectedBy(movie,person) and use only
that one in the rule mining process. Manual inspection
shows, however, that relations in semantic KBs tend to be
more functional than inverse functional. Intuitively, this al-
lows us to consider a fact r(x, y) as a fact about x.
5
http://www.w3.org/TR/rdf-primer/
Rules. An atom is a fact that can have variables at the
subject and/or object position. A (Horn) rule consists of a
head and a body, where the head is a single atom and the
body is a set of atoms. We denote a rule with head r(x, y)
and body {B
1
, ..., B
n
} by an implication
B
1
∧ B
2
∧ ... ∧ B
n
⇒ r(x, y)
which we abbreviate as
~
B ⇒ r(x, y). One example of such
a rule is
hasChild(p, c) ∧ isCitizenOf (p, s) ⇒ isCitizenOf (c, s)
An instantiation of a rule is a copy of the rule, where all
variables have been substituted by entities. A prediction of
a rule is the head atom of an instantiated rule if all body
atoms of the instantiated rule appear in the KB. For ex-
ample, the above rule can predict isCitizenOf(Lisa,USA) if
the KB knows a parent of Lisa (hasChild(Elvis,Lisa)) who
is American (isCitizenOf(Elvis,USA)).
Language Bias. As most ILP systems, AMIE uses a lan-
guage bias to restrict the search space. We say that two
atoms in a rule are connected if they share a variable or an
entity. A rule is connected if every atom is connected tran-
sitively to every other atom of the rule. AMIE mines only
connected rules, i.e., it avoids constructing rules that con-
tain unrelated atoms. We say that a rule is closed if every
variable in the rule appears at least twice. Such rules do not
predict merely the existence of a fact (e.g. diedIn(x, y) ⇒
∃z : wasBornIn(x, z)), but also concrete arguments for it
(e.g. diedIn(x, y) ⇒ wasBornIn(x, y)). AMIE mines only
closed rules. We allow recursive rules that contain the head
relation in the body.
Parallels to Association Rule Mining. Association Rule
Mining discovers correlations in shopping transactions. Thus,
association rules are different in nature from the Horn rules
we aim at. Still, we can show some similarities between
the two approaches. Let us define one transaction for every
set of n entities that are connected in the KB. For exam-
ple, in Figure 1, we will define a transaction for the enti-
ties Elvis, Lisa and Priscilla, because they are connected
through the facts mother(Priscilla,Lisa), father(Elvis,Lisa),
marr(Elvis, Priscilla). We label the transaction with the
set of these entities. Each atom r(x
i
, x
j
) on variables in-
dexed by 1 ≤ i, j ≤ n corresponds to an item. A transaction
with label hC
1
, . . . , C
n
i contains an item r(x
i
, x
j
) if r(C
i
, C
j
)
is in the KB. For example, the transaction hElvis, Lisa,
Priscillai contains the items {mother(x
3
,x
2
), father(x
1
,x
2
),
marr(x
1
,x
3
)}, since the ground atoms mother(Priscilla,Lisa),
father(Elvis,Lisa) and marr(Elvis, Priscilla) are in the KB.
In this representation, association rules are Horn rules. In
the example, we can mine the association rule
{mother(x
3
, x
2
), marr(x
1
, x
3
)} ⇒ {f ather(x
1
, x
2
)}
which corresponds to the Horn rule
mother(x
3
, x
2
) ∧ marr(x
1
, x
3
) ⇒ father(x
1
, x
2
)
Transaction Label Transaction Items
hElvis,Lisa,Priscillai {mother(x
3
,x
2
),father(x
1
,x
2
),marr(x
1
,x
3
)}
hBarack,Mali,Mich.i {mother(x
3
,x
2
),father(x
1
,x
2
),marr(x
1
,x
3
)}
hFran¸cois,Flora,S´egoi {mother(x
3
,x
2
),father(x
1
,x
2
)}
Figure 1: Mining Rules with 3 Variables
Constructing such a table with all possible combinations
of entities is practically not very viable. Apart from that,