Humor in Word Embeddings:
Cockamamie Gobbledegook for Nincompoops
WARNING: This paper contains words that people rated humorous including many that are offensive in nature.
Limor Gultchin¹, Genevieve Patterson², Nancy Baym³, Nathaniel Swinger⁴, Adam Tauman Kalai³
Abstract
While humor is often thought to be beyond the reach of Natural Language Processing, we show that several aspects of single-word humor correlate with simple linear directions in Word Embeddings. In particular: (a) the word vectors capture multiple aspects discussed in humor theories from various disciplines; (b) each individual’s sense of humor can be represented by a vector, which can predict differences in people’s senses of humor on new, unrated, words; and (c) upon clustering humor ratings of multiple demographic groups, different humor preferences emerge across the different groups. Humor ratings are taken from the work of Engelthaler and Hills (2017) as well as from an original crowdsourcing study of 120,000 words. Our dataset further includes annotations for the theoretically-motivated humor features we identify.
1. Introduction
Detecting and generating humor is a notoriously difficult task for AI systems. While Natural Language Processing (NLP) is making impressive advances in many frontiers such as machine translation and question answering, NLP progress on humor has been slow. This reflects the fact that humans rarely agree upon what is humorous. Multiple types of humor exist, and numerous theories were developed to explain what makes something funny. Recent research supporting the existence of single-word humor (Engelthaler & Hills, 2017; Westbury et al., 2016) defines a more manageable scope to study with existing machine learning tools. Word Embeddings (WEs) have been shown to capture numerous properties of words (e.g., Mikolov et al., 2013a;b); coupled with single-word humor as a possible research direction, it is natural to study if and how WEs can capture this type of humor. To assess the ability of WEs to explain individual word humor, we draw on a long history of humor theories and put them to the test.

¹University of Oxford ²TRASH ³Microsoft Research ⁴Lexington High School. Correspondence to: Limor Gultchin <limor.gultchin@jesus.ox.ac.uk>.

Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Copyright 2019 by the author(s).

To many readers, it may not be apparent that individual words can be amusing in and of themselves, devoid of context. However, Engelthaler & Hills (2017), henceforth referred to as EH, found some words consistently rated as more humorous than others, through a crowdsourced study of about five thousand nouns. We first use their publicly available 5k mean word-humor ratings to identify a “humor vector,” i.e., a linear direction, in several WEs that correlates (over 0.7) with these 5k mean humor ratings. While these correlations establish statistical significance, little insight is obtained into how the embeddings capture different aspects of humor and differences between people’s senses of humor.
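
As a rough illustration of what identifying such a linear direction can look like, the sketch below fits a vector w so that the projection of each word's embedding onto w tracks its mean rating, then measures the correlation on held-out words. It is a minimal sketch under stated assumptions, not the procedure used in this paper: the fitting method (ridge-regularized least squares), the function names, and the data are all hypothetical, and the embeddings and ratings are synthetic stand-ins rather than the EH ratings or a real WE.

```python
import numpy as np


def fit_humor_direction(embeddings, ratings, ridge=1.0):
    """Fit a direction w such that embeddings @ w tracks the ratings.

    Assumption: ridge-regularized least squares; the text above does not
    say how the direction was actually fitted.
    """
    X = embeddings - embeddings.mean(axis=0)  # center the features
    y = ratings - ratings.mean()              # center the targets
    dim = X.shape[1]
    # Closed-form ridge solution: (X^T X + ridge * I) w = X^T y
    return np.linalg.solve(X.T @ X + ridge * np.eye(dim), X.T @ y)


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-in data (hypothetical): 5,000 "words" with 50-dimensional
# vectors whose ratings depend linearly on a hidden direction, plus noise.
rng = np.random.default_rng(0)
n_words, dim = 5000, 50
embeddings = rng.normal(size=(n_words, dim))
hidden_direction = rng.normal(size=dim)
ratings = embeddings @ hidden_direction + rng.normal(scale=2.0, size=n_words)

# Fit on 4,000 words and evaluate on the remaining 1,000 unseen words.
train, test = slice(0, 4000), slice(4000, None)
w = fit_humor_direction(embeddings[train], ratings[train])
held_out_r = pearson(embeddings[test] @ w, ratings[test])
print(f"held-out correlation: {held_out_r:.2f}")
```

Evaluating on words that were not used for fitting matters here: with hundreds of embedding dimensions and only a few thousand rated words, an in-sample correlation would overstate how well a single direction generalizes.
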
To complete this picture, we performed crowdsourcing studies to create additional datasets which we make publicly available: (a) beginning with a set of 120k common words and phrases chosen from a word embedding, a crowdsourcing filtering process yielded a set of 8,120 and a further set of 216 words¹ (in Appendix B) rated most humorous, (b) over 1,500 crowd workers rated these latter 216 words through six-way comparisons each yielding a personal first choice out of dozens of other personally highly-ranked words, and (c) 1,500 words (including highly-rated words drawn from these sets) were each annotated by multiple workers according to six humor features drawn from the aforementioned theories of humor.
Our analysis suggests that individual-word humor indeed possesses many aspects of humor that have been discussed in general theories of humor, and that many of these aspects of humor are captured by WEs. For example, ‘incongruity theory,’ which we discuss shortly, can be found in words which juxtapose surprising combinations of words

¹These words, including gobbledegook and nincompoops, were rated as more humorous than the words in the EH study, which used common words from psychology experiments. The top-rated EH words were booty, tit, booby, hooter, and nitwit.