Learning Better Representations Using Auxiliary Knowledge
Saed Rezayi
University of Georgia
Department of Computer Science
saedr@uga.edu
Abstract
Representation Learning is at the core of Machine Learning (ML)
and Artificial Intelligence, as it summarizes input data points
into low-dimensional vectors. These low-dimensional vectors
should be accurate portrayals of the input data; it is therefore
crucial to find the most effective and robust representation
possible for a given input, since the performance of the ML task
depends on the resulting representations. In this summary,
we discuss an approach to augmenting representation learning
that relies on external knowledge. We briefly describe the
shortcomings of existing techniques and explain how
an auxiliary knowledge source can yield improved
representations.
Introduction
Neural network-based Representation Learning (RL) has gained
traction over the past few years for a variety of modalities
and applications. In Natural Language Processing (NLP), for
instance, RL allows us to distinguish between “bank” (a financial
institution) and “bank” (the land alongside a river), or to answer
questions such as how similar “espresso” is to “spaghetti”,
which requires underlying knowledge to recognize that both
are of Italian origin.
Successful Representation Learning models require huge
amounts of training data and computational resources, which
are very expensive to acquire. However, pretrained models
are readily available and can be augmented and fine-tuned
for specific applications. This is particularly helpful
when the model used for the task at hand has inherent
limitations. For instance, knowledge graph embedding methods
(i.e., methods that learn representations for the entities and
relations of a knowledge graph) have several drawbacks, such as
ignoring contextualized information and suffering from sparsity,
while language models are limited to the vocabulary of the
corpus they are trained on.
In such scenarios, an auxiliary source of knowledge can
improve the quality of the learned representations and help
overcome the inherent limitations of the vanilla models. For
example, Xie et al. (2016) proposed a representation learning
method for knowledge graphs that embeds entity descriptions,
and Malaviya et al. (2020) used pre-trained language
models to improve node embeddings for the graph completion
task. In what follows, we apply a similar design idea
to propose improved solutions for various NLP and knowledge
graph embedding tasks. Furthermore, we discuss robustness
and explore ideas to enhance RL in the presence of an
adversary.
Copyright © 2023, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
Text as External Source of Knowledge
RQ1: How can RL benefit from incorporating knowledge
from an unstructured, external source of data?
Description: In (Rezayi et al. 2021b), we showed that
incorporating additional textual entities into a graph from an
external source such as WordNet can help obtain more meaningful
embeddings for the entities of knowledge graphs, which in turn
improves performance on downstream tasks such as link
prediction or node classification. Previous work (Kartsaklis,
Pilehvar, and Collier 2018) partially addressed the issue of
sparsity by enriching knowledge graph entities based on “hard”
co-occurrence of words present in both the entities of the
knowledge graph and the external text, whereas we achieve “soft”
augmentation by proposing a knowledge graph enrichment and
embedding framework. Given an original knowledge graph, we first
generate a rich but noisy augmented graph using external texts
at the semantic and structural levels. To distill the relevant
knowledge and suppress the introduced noise, we design a graph
alignment term in a shared embedding space between the original
and augmented graphs. This work was published in NAACL
2021.
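The role of an alignment term can be illustrated with a minimal sketch.
This is not the formulation from the published work: the TransE-style
distance, the squared-distance alignment penalty, the weight lam, and all
function names below are assumptions chosen only to show how a shared
embedding space can tie the original and augmented graphs together.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE-style distance ||h + r - t||; lower means more plausible.
    (Assumed scoring function, used here only for illustration.)"""
    return np.linalg.norm(h + r - t, axis=-1)

def alignment_term(orig_emb, aug_emb, shared_ids):
    """Mean squared distance between the two embeddings of each entity
    that appears in both the original and the augmented graph."""
    diff = orig_emb[shared_ids] - aug_emb[shared_ids]
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def joint_loss(orig_emb, aug_emb, rel_emb, orig_triples, aug_triples,
               shared_ids, lam=0.5):
    """Embedding loss on both graphs plus a weighted alignment penalty."""
    def graph_loss(emb, triples):
        h, r, t = triples[:, 0], triples[:, 1], triples[:, 2]
        return float(np.mean(transe_score(emb[h], rel_emb[r], emb[t])))
    return (graph_loss(orig_emb, orig_triples)
            + graph_loss(aug_emb, aug_triples)
            + lam * alignment_term(orig_emb, aug_emb, shared_ids))

# Toy setup (hypothetical sizes): 4 original entities; the augmented graph
# keeps them as ids 0..3 and adds 2 textual entities as ids 4 and 5.
rng = np.random.default_rng(0)
orig_emb = rng.normal(size=(4, 8))
aug_emb = rng.normal(size=(6, 8))
rel_emb = rng.normal(size=(2, 8))
orig_triples = np.array([[0, 0, 1], [2, 1, 3]])             # (head, rel, tail)
aug_triples = np.array([[0, 0, 1], [0, 1, 4], [3, 1, 5]])
loss = joint_loss(orig_emb, aug_emb, rel_emb, orig_triples, aug_triples,
                  shared_ids=np.arange(4))
```

In this sketch, a larger lam pulls the embeddings of shared entities closer
together, so noisy textual neighbors influence an entity only to the extent
that they stay consistent with the original graph.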
Knowledge Graph as External Source of
Knowledge
RQ2: Can knowledge graphs be used as an external source
of knowledge to assist in obtaining improved embeddings?
Description: In (Rezayi et al. 2021a), we asked whether an
external source of knowledge can guide the search behavior of
a user. We proposed to find entities similar to the user query
with the aid of an external knowledge graph, using the following
pipeline: entity linking followed by customized link prediction,
which leads to the introduction of a new entity that satisfies
the user’s information need.
This work was published in IEEE BigData-2021.
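A toy version of the two-stage pipeline (entity linking, then link
prediction) might look as follows. Everything here is an assumption made
for illustration: the hand-made two-dimensional embeddings, the substring
matching used as a stand-in for entity linking, the TransE-style ranking
used in place of the customized link prediction, and the names link_entity,
predict_tails, and suggest.

```python
import numpy as np

# Hand-made toy embeddings (hypothetical values, not learned).
ENTITY_VECS = {
    "espresso": np.array([0.0, 1.0]),
    "spaghetti": np.array([0.0, 1.2]),
    "italy": np.array([1.0, 1.1]),
    "sushi": np.array([0.0, -1.0]),
    "japan": np.array([1.0, -1.0]),
}
RELATION_VECS = {"origin": np.array([1.0, 0.0])}

def link_entity(query):
    """Naive entity linking: the first graph entity whose name occurs
    in the query text, or None if nothing matches."""
    text = query.lower()
    for name in ENTITY_VECS:
        if name in text:
            return name
    return None

def predict_tails(head, relation, k=1):
    """Rank candidate tail entities by the distance ||h + r - t||."""
    target = ENTITY_VECS[head] + RELATION_VECS[relation]
    candidates = [(float(np.linalg.norm(target - vec)), name)
                  for name, vec in ENTITY_VECS.items() if name != head]
    return [name for _, name in sorted(candidates)[:k]]

def suggest(query, relation="origin", k=1):
    """Link the query to a graph entity, then predict related entities."""
    head = link_entity(query)
    if head is None:
        return []
    return predict_tails(head, relation, k)

suggestions = suggest("best espresso machine")
```

The point of the sketch is the division of labor: linking grounds the free
text query in the graph, and link prediction then proposes an entity that
the query itself never mentioned.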
The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23)