NEUROSCIENCE
A principal odor map unifies diverse tasks in
olfactory perception
Brian K. Lee1†, Emily J. Mayhew2,3†, Benjamin Sanchez-Lengeling1, Jennifer N. Wei1, Wesley W. Qian4,1,5, Kelsie A. Little2, Matthew Andres2, Britney B. Nguyen2, Theresa Moloy2, Jacob Yasonik4,1, Jane K. Parker6, Richard C. Gerkin4,1,7, Joel D. Mainland2,8*, Alexander B. Wiltschko4,1*
Mapping molecular structure to odor perception is a key challenge in olfaction. We used graph
neural networks to generate a principal odor map (POM) that preserves perceptual relationships and
enables odor quality prediction for previously uncharacterized odorants. The model was as reliable as a
human in describing odor quality: On a prospective validation set of 400 out-of-sample odorants, the
model-generated odor profile more closely matched the trained panel mean than did the median
panelist. By applying simple, interpretable, theoretically rooted transformations, the POM outperformed
chemoinformatic models on several other odor prediction tasks, indicating that the POM successfully
encoded a generalized map of structure-odor relationships. This approach broadly enables odor
prediction and paves the way toward digitizing odors.
A fundamental problem in neuroscience is mapping the physical properties of a stimulus to perceptual characteristics. In vision, wavelength maps to color; in audition, frequency maps to pitch. By contrast, the mapping from chemical structures to olfactory percepts is poorly understood. Detailed and modality-specific maps such as the Commission Internationale de l'Éclairage (CIE) color space (1) and Fourier space (2) led to a better understanding of visual and auditory coding. Similarly, to better understand olfactory coding, the field of olfaction needs a better map.
Pitch increases monotonically with frequency. By contrast, the relationship between odor percept and odorant structure is riddled with discontinuities; this is exemplified by Sell’s triplets (3), which are trios of molecules in which the structurally similar pair is not the perceptually similar pair (Fig. 1A). These discontinuities in the structure-odor relationship suggest that standard chemoinformatic representations of molecules—functional group counts, physical properties, molecular fingerprints, and so on—that have been used in recent odor modeling work (4–6) are inadequate to map odor space.
The principal odor map represents perceptual
distances and hierarchies
To generate odor-relevant representations of
molecules, we constructed a message passing
neural network (MPNN) (7), which is a spe-
cific type of graph neural network (GNN) (8),
to map chemical structures to odor percepts.
Each molecule was represented as a graph,
with each atom described by its valence, de-
gree, hydrogen count, hybridization, formal
charge, and atomic number. Each bond was
described by its degree, its aromaticity, and
whether it is in a ring. Unlike traditional fin-
gerprinting techniques (9), which assign equal
weight to all molecular fragments within a
set bond radius, a GNN can optimize fragment
weights for odor-specific applications. Neural
networks have unlocked predictive modeling
breakthroughs in diverse perceptual domains
[e.g., natural images (10), faces (11), and sounds
(12)] and naturally produce intermediate rep-
resentations of their input data that are func-
tionally high-dimensional, data-driven maps.
We used the final layer of the GNN (henceforth,
“our model”) to directly predict odor qualities,
and the penultimate layer of the model as a prin-
cipal odor map (POM). The POM (i) faithfully
represented known perceptual hierarchies and
distances, (ii) extended to out-of-sample (here-
after, “novel”) odorants, (iii) was robust to dis-
continuities in structure-odor distances, and
(iv) generalized to other olfactory tasks.
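To make the graph featurization above concrete, the sketch below shows one way a molecule could be encoded with the listed atom and bond features. RDKit, the function name `featurize`, and the use of raw integer values (rather than the one-hot encodings typical of MPNN inputs) are illustrative assumptions; the paper does not specify its featurization code.

```python
# Illustrative sketch only: RDKit-based extraction of the atom and bond
# features named in the text (valence, degree, hydrogen count, hybridization,
# formal charge, atomic number; bond order, aromaticity, ring membership).
from rdkit import Chem

def featurize(smiles: str):
    """Return per-atom features, per-bond features, and an edge list."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")

    # Atom features, one row per atom.
    atom_features = [
        [
            atom.GetTotalValence(),        # valence
            atom.GetDegree(),              # degree
            atom.GetTotalNumHs(),          # hydrogen count
            int(atom.GetHybridization()),  # hybridization (enum -> int)
            atom.GetFormalCharge(),        # formal charge
            atom.GetAtomicNum(),           # atomic number
        ]
        for atom in mol.GetAtoms()
    ]

    # Bond features and the graph's edge list, one row per bond.
    edges, bond_features = [], []
    for bond in mol.GetBonds():
        edges.append((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
        bond_features.append([
            bond.GetBondTypeAsDouble(),  # bond degree/order
            int(bond.GetIsAromatic()),   # aromaticity
            int(bond.IsInRing()),        # ring membership
        ])
    return atom_features, bond_features, edges

# Example: vanillin, a common vanilla odorant.
atoms, bonds, edges = featurize("O=Cc1ccc(O)c(OC)c1")
```

In an MPNN, per-atom and per-bond vectors like these would initialize node and edge states that are then iteratively updated by learned message-passing functions.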
We curated a reference dataset of ~5000 mol-
ecules, each described by multiple odor labels
(e.g., creamy, grassy), by combining the Good
Scents (13) and Leffingwell & Associates (14)
(GS-LF) flavor and fragrance databases (Fig.
1B). To train our model, we optimized model
parameters with a weighted cross-entropy loss
over 150 epochs using Adam (15) with a learning rate decaying from 5 × 10⁻⁴ to 1 × 10⁻⁵ and
a batch size of 128. The GS-LF dataset was split
80/20 training/test, and the 80% training set was further subdivided into five cross-validation
splits. These cross-validation splits were used
to optimize hyperparameters using Vizier (16),
a Bayesian optimization algorithm, by tuning
across 1000 trials. Details about model archi-
tecture and hyperparameters are given in
the supplementary methods. With proper hyperparameter tuning, performance was robust across many model architectures.
We present results for the model with the high-
est mean area under the receiver operating
characteristic curve (AUROC) on the cross-
validation set (AUROC = 0.89) (17).
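The training configuration above translates directly into code. The following is a minimal, runnable sketch under stated assumptions: PyTorch as the framework, an exponential decay schedule, a linear layer standing in for the GNN, and synthetic data in place of GS-LF. None of these choices are specified by the paper beyond the reported loss, optimizer, learning rates, batch size, and epoch count.

```python
# Hedged sketch of the reported training setup: weighted cross-entropy,
# Adam, learning rate decayed from 5e-4 to 1e-5, batch size 128, 150 epochs.
# The linear "model" and random data are placeholders, not the paper's GNN.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

NUM_LABELS = 138          # placeholder for the GS-LF odor-label count
NUM_FEATURES = 256        # placeholder for the GNN embedding size
EPOCHS, BATCH_SIZE = 150, 128
LR_START, LR_END = 5e-4, 1e-5

# Synthetic stand-ins for GNN outputs and sparse multi-label odor annotations.
x = torch.randn(4000, NUM_FEATURES)
y = (torch.rand(4000, NUM_LABELS) < 0.05).float()
loader = DataLoader(TensorDataset(x, y), batch_size=BATCH_SIZE, shuffle=True)

model = nn.Linear(NUM_FEATURES, NUM_LABELS)   # stand-in for the GNN readout

# Weighted cross-entropy: one common choice is to up-weight rare positive labels.
pos = y.sum(0).clamp(min=1.0)
pos_weight = (y.shape[0] - pos) / pos
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

optimizer = torch.optim.Adam(model.parameters(), lr=LR_START)
gamma = (LR_END / LR_START) ** (1.0 / EPOCHS)  # reaches 1e-5 after 150 epochs
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

for epoch in range(EPOCHS):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

Exponential decay that reaches exactly 1 × 10⁻⁵ at epoch 150 is only one of several schedules consistent with the reported endpoints.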
How does the POM compare to perceptual
odor maps and conventional structure-based
maps of odorants? Empirical perceptual space
(Fig. 1D) intuitively represents perceptual dis-
tances (e.g., two molecules that smell of jas-
mine should be nearer to each other than to a
beefy-smelling molecule) and hierarchies (e.g.,
jasmine and lavender are subtypes of the floral
odor family). We show that whereas this struc-
ture is lost in Morgan fingerprint-based maps
of odorant space (Fig. 1E), the POM preserves
relative perceptual distances and hierarchies
(Fig. 1F and figs. S1 to S3).
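One way to quantify the comparison in Fig. 1, D to F, is to ask how well distances in each map track perceptual distances. The sketch below is an assumption-laden illustration rather than the paper's analysis: Morgan fingerprints at radius 2 with 2048 bits, cosine distance on POM embeddings, and Spearman rank correlation are choices made here for concreteness.

```python
# Illustrative comparison of a structural map vs. an embedding map.
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import spearmanr
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def tanimoto_distance(smiles_a: str, smiles_b: str) -> float:
    """Structural distance from 2048-bit, radius-2 Morgan fingerprints."""
    fps = [
        AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=2048)
        for s in (smiles_a, smiles_b)
    ]
    return 1.0 - DataStructs.TanimotoSimilarity(*fps)

def pom_distance(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Distance between two POM vectors (penultimate-layer embeddings)."""
    return cosine(emb_a, emb_b)

def map_fidelity(perceptual_dists, map_dists) -> float:
    """Rank correlation between perceptual distances and a map's distances;
    a higher value means the map better preserves perceptual structure."""
    return spearmanr(perceptual_dists, map_dists).correlation

# Example structural distance: vanillin vs. ethyl vanillin.
d = tanimoto_distance("O=Cc1ccc(O)c(OC)c1", "O=Cc1ccc(O)c(OCC)c1")
```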
Our model outperformed the median panelist
on the prospective validation task
To test whether our model extends to novel
odorants, we designed a prospective valida-
tion challenge (18) in which we benchmarked
model predictive performance against indi-
vidual human raters. In olfaction, no reliable
instrumental method of measuring odor percep-
tion exists, and trained human sensory panels
are the gold standard for odor characterization
(19). Odor perception is variable across indi-
viduals (20, 21), but group-averaged odor rat-
ings are stable across repeated measurements
(22) and represent our best avenue to estab-
lish the ground-truth odor character for novel
odorants. We trained a cohort of subjects to
describe their perception of odorants using
the rate-all-that-apply method (RATA) and a
55-word odor lexicon. During training sessions,
each term in the lexicon was paired with visual
and odor references (fig. S4 and table S1). Only
subjects that met performance standards on
the pretest of 20 common odorants (data S2;
individual test-retest correlation R > 0.35; rea-
sonable label selection for common odorants)
were invited to join the panel.
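The inclusion criterion above (test-retest correlation R > 0.35 over the 20 pretest odorants) can be sketched as follows; the array layout and the pooling of all odorant-label ratings into a single Pearson correlation are assumptions for illustration, not the panel's documented scoring procedure.

```python
# Hedged sketch of the panelist screening criterion: correlate a subject's
# two replicate RATA ratings of the pretest odorants and apply the threshold.
import numpy as np

def passes_retest(rep1: np.ndarray, rep2: np.ndarray, threshold: float = 0.35) -> bool:
    """rep1, rep2: one subject's ratings, shape (n_odorants, n_lexicon_terms)."""
    r = np.corrcoef(rep1.ravel(), rep2.ravel())[0, 1]
    return r > threshold
```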
To avoid trivial test cases, we applied the fol-
lowing selection criteria for the set of 400 novel
odorants: (i) molecules must be structurally
distinct from each other (fig. S5), (ii) mole-
cules should cover the widest gamut of odor
labels (data S1), and (iii) molecules must be
structurally or perceptually distinct from any
training example (e.g., Fig. 1A and data S1).
Our prospective validation set consisted of 55-odor-label RATA data for 400 novel, intensity-
balanced odorants generated by our cohort of
≥15 panelists (two replicates). Summary sta-
tistics and the correlation structure of the hu-
man perceptual data are presented in figs. S6
to S8. Our panel’s mean ratings were highly
1Google Research, Brain Team, Cambridge, MA, USA. 2Monell Chemical Senses Center, Philadelphia, PA, USA. 3Department of Food Science and Human Nutrition, Michigan State University, East Lansing, MI, USA. 4Osmo Labs, PBC, Cambridge, MA, USA. 5Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA. 6Department of Food and Nutritional Sciences, University of Reading, Reading, Berkshire, UK. 7School of Life Sciences, Arizona State University, Tempe, AZ, USA. 8Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA.
*Corresponding author. Email: jmainland@monell.org (J.D.M.); alex@osmo.ai (A.B.W.)
†These authors contributed equally to this work.