PromptSource: An Integrated Development Environment
and Repository for Natural Language Prompts
Stephen H. Bach*1,2, Victor Sanh*3, Zheng-Xin Yong1, Albert Webson1, Colin Raffel3, Nihal V. Nayak1, Abheesht Sharma4, Taewoon Kim5, M Saiful Bari6, Thibault Fevry7, Zaid Alyafeai8, Manan Dey9, Andrea Santilli10, Zhiqing Sun11, Srulik Ben-David12, Canwen Xu13, Gunjan Chhablani7, Han Wang14, Jason Alan Fries15,2, Maged S. Al-shaibani8, Shanya Sharma16, Urmish Thakker17, Khalid Almubarak18, Xiangru Tang19, Dragomir Radev19, Mike Tian-Jian Jiang20, Alexander M. Rush3

1Brown University, 2Snorkel AI, 3Hugging Face, 4BITS Pilani, 5VU Amsterdam, 6NTU, 7BigScience, 8KFUPM, 9SAP, 10University of Rome, 11CMU, 12Technion, 13UCSD, 14NYU, 15Stanford University, 16Walmart Labs, 17SambaNova Systems, 18PSAU, 19Yale University, 20ZEALS

*Equal Contribution
Abstract
PromptSource is a system for creating,
sharing, and using natural language prompts.
Prompts are functions that map an example
from a dataset to a natural language input
and target output. Using prompts to train
and query language models is an emerging
area in NLP that requires new tools that
let users develop and refine these prompts
collaboratively. PromptSource addresses
the emergent challenges in this new setting
with (1) a templating language for defining
data-linked prompts, (2) an interface that
lets users quickly iterate on prompt develop-
ment by observing outputs of their prompts
on many examples, and (3) a community-
driven set of guidelines for contributing
new prompts to a common pool. Over
2,000 prompts for roughly 170 datasets are
already available in PromptSource. PromptSource is available at
https://github.com/bigscience-workshop/promptsource.
1 Introduction
Prompt engineering is emerging as a new focus in
NLP, particularly in zero- and few-shot learning
settings. Prompting is the practice of representing
a task as a natural language utterance in order to
query a language model for a response (Liu et al.,
2021). For example, if a language model is con-
ditioned on the text “She hit a home run. The
previous sentence is about ...”, then the model’s
subsequent generation would be interpreted as a
prediction of the topic of the preceding sentence,
e.g. by mapping a response such as “sports” to
a class label. In specific contexts, prompting has
been shown to have advantages over traditional
classification, for example facilitating adaptation
of language models to ad-hoc tasks and improv-
ing sample efficiency in low-data settings (Brown
et al., 2020; Schick and Schütze, 2021b; Le Scao
and Rush, 2021; Gao et al., 2021). These advan-
tages motivate a practical challenge: How can we
enable users to create, refine, and share prompts?
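As a minimal illustration of this querying pattern (not part of PromptSource itself), the following sketch conditions a causal language model on such a prompt using the Hugging Face transformers library; the choice of gpt2 and the decoding settings are illustrative assumptions.

    # Minimal sketch of prompting a language model; "gpt2" is an
    # illustrative model choice and the decoding settings are assumptions.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    prompt = "She hit a home run. The previous sentence is about ..."
    output = generator(prompt, max_new_tokens=5, do_sample=False)[0]["generated_text"]

    # The model's continuation (e.g., "sports") would then be mapped
    # to a class label by the surrounding application.
    print(output[len(prompt):].strip())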
The process of prompt engineering is critical
for successful deployment as choices in prompt-
ing can affect downstream predictions significantly,
particularly in the zero-shot setting (Perez et al.,
2021; Zhao et al., 2021; Webson and Pavlick, 2021).
Furthermore, training directly on collections of
prompts can enable large models to generalize to
new prompts more robustly (Sanh et al., 2021; Wei
et al., 2021; Min et al., 2021; Mishra et al., 2021).
There is therefore a growing need for tools that
support the creation of corpora of prompts.
PromptSource is an integrated development en-
vironment and repository for natural language
prompts to use in the context of zero-shot (or
gradient-based few-shot) learning. It provides a
Web-based GUI that enables developers to write
prompts in a templating language and immediately
view their outputs on different examples. The sys-
tem is integrated with the HuggingFace Datasets
library (Lhoest et al., 2021), so that users can load
any dataset automatically, browse existing prompts,
and create new ones. Through the course of writing
thousands of prompts, we converged on three key
aspects to the design of PromptSource:
• Flexible Templating Language. We adapt a templating language to represent prompts. Prompt authors can define prompts in terms of dataset fields, hard-coded text, and simple control logic. This choice provides the flexibility of a programming environment without the mental overhead of having to write and read arbitrary code. Prompt templates can easily be distributed and used in other systems. (A template sketch follows this list.)
• Tools for Prompt Management.
PromptSource has multiple views to address the needs
of prompt authors at different stages of the
prompt engineering cycle. A global view lets
authors browse datasets and existing prompt
templates. A local view facilitates iteration
on prompt wording and metadata, as well as
testing on individual examples.
• Community-Driven Quality Standards.
PromptSource includes a set of guidelines
for prompting based on a large-scale prompt
writing pilot. PromptSource’s collection
is meant to be useful for a wide range
of research, based on iterative refinement
of a set of quality standards. Prompts in
PromptSource are also annotated with various
pieces of metadata to make finding and using
prompts easier.
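For example, the SNLI template discussed in §3 (and shown in Figure 1) is written in this templating language: dataset fields appear in double braces, literal text is written directly, and "|||" separates the rendered input from the target output.

    {{premise}} Based on the previous passage, is it true that
    "{{hypothesis}}"? Yes, no, or maybe? |||
    {{ answer_choices[label] }}

Here answer_choices holds the prompt's valid outputs ("Yes", "No", or "Maybe"), indexed by the example's integer label.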
The PromptSource system includes over 2,000
open-source prompts for roughly 170 datasets,
which have all been reviewed to meet the quality
standards. This collection, which we call the Public
Pool of Prompts (P3), allows users to materialize
prompted forms of datasets for hundreds of differ-
ent tasks. The T0 series of models (Sanh et al.,
2021) for zero-shot inference were fine-tuned on
a subset of P3. Since then, PromptSource and P3
have been extended for research on multi-lingual
prompting (Lin et al., 2021) and priming, i.e., in-
context few-shot learning (Min et al., 2021). The PromptSource system and its associated content are a first step in the study of systems for prompt engineering, an area that is likely to continue to grow.
2 Background and Related Work
PromptSource builds on recent work in prompting
and prompt engineering. It is also related to work
on systems for other types of annotations.
Prompting
Recently, prompting has emerged
as a new focus within NLP as it can dramati-
cally improve language models’ few-shot and zero-
shot performance in a wide range of downstream
tasks (Brown et al., 2020; Schick and Schütze,
2021a; Sanh et al., 2021; Wei et al., 2021). Prompts
and prompt engineering come in several vari-
eties (Liu et al., 2021). PromptSource is focused on
facilitating research with human-written prompts,
in which natural language is the medium for de-
scribing tasks. This approach has the advantage
that prompts can be understood, modified, and ap-
plied without being tied to a specific model. In
contrast, past work has also aimed to automatically
construct prompts by framing the search for a good
prompt as a learning problem. These prompts can
either be expressed in natural language (Gao et al.,
2021; Shin et al., 2020) or as arbitrary vectors (a.k.a.
“continuous” or “soft” prompts) not corresponding
to words in the model’s original vocabulary (Lester
et al., 2021; Qin and Eisner, 2021).
When using human-written prompts, there are
several possible approaches to learning. One is a
zero-shot setting, where the goal is to generalize to
prompts for which no training examples are given.
Prompts can also be used in a few-shot setting, in which a model is either (1) trained on prompted examples of the target task via gradient updates, or (2) primed (i.e., in-context learning) by including labeled examples in the input sequence so that the model makes predictions without gradient updates (Brown et al., 2020).
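To make the contrast concrete, here is a minimal sketch of building a zero-shot input versus a primed input for a hypothetical sentiment task; the demonstration format (newline-joined input/target pairs) is an illustrative assumption, as priming formats vary across papers.

    # Zero-shot vs. primed inputs; the join format is an assumption.
    demonstrations = [
        ("Review: a joy to watch. Positive or negative?", "positive"),
        ("Review: dull and overlong. Positive or negative?", "negative"),
    ]
    query = "Review: great film. Positive or negative?"

    zero_shot_input = query  # no labeled examples in the input

    # Priming: labeled examples precede the query in one sequence,
    # so the model predicts without any gradient updates.
    primed_input = "\n\n".join(f"{x} {y}" for x, y in demonstrations) + "\n\n" + query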
PromptSource was originally designed for zero-
shot learning, so it emphasizes explicit task instruc-
tions and no priming examples. If needed, users
can extend PromptSource for few-shot learning
(e.g., as done in Lin et al., 2021 and Min et al.,
2021, described in §7).
Systems for Annotating Data
Most work on
collecting annotations has focused on labels and
other annotations at the level of individual exam-
ples (Neves and Ševa, 2021). GATE (Cunningham
et al., 2002) was an early system for annotating
text, and includes support for many data types such
as labels and entity tags. Since then, many Web-
based systems for annotating text have been devel-
oped (Stenetorp et al., 2012; Salgado et al., 2012;
Wei et al., 2013; Yimam et al., 2013; Chen and
Styler, 2013; Eckart de Castilho et al., 2016; Putra
et al., 2020). Other systems support collaboration
among multiple annotators (Yang et al., 2018; Stew-
art et al., 2019). More recently, many annotation
systems have begun to incorporate learned models
to improve workflow, using techniques such as active learning (Lin et al., 2019; Li et al., 2021) and example recommendation (Lee et al., 2020; Kiela et al., 2021). These systems are possible because the annotations to be collected are labels, for which metrics like inter-annotator agreement and model confidence are available.

[Figure 1: The five stages of creating prompts in PromptSource. The Browse view for Dataset Exploration (S1). The Sourcing view for Prompt Writing (S2), Prompt Documentation (S3), and Iteration and Variation (S4). The Browse view for performing a Global Review (S5). The Sourcing panel shows the SNLI template (adapted from the BoolQ prompts in Schick & Schütze 2021): {{premise}} Based on the previous passage, is it true that "{{hypothesis}}"? Yes, no, or maybe? ||| {{ answer_choices[label] }}, along with its answer choices (Yes/No/Maybe) and the Accuracy metric.]
There has also been some work on collecting
annotations other than labels. AlvisAE (Papazian
et al., 2012) and TreeAnnotator (Helfrich et al.,
2018) support creating ontologies and other struc-
tured annotations. Prompts differ from these anno-
tations in that they are semi-structured functions,
requiring new tools for developers.
3 System Design and Workflow
Creating prompts differs from other types of data
collection and annotation. We focus on three challenging aspects in which prompting differs from traditional NLP annotation:
• Functions, not Labels. A single prompt is a function that maps dataset examples (dictionaries of arbitrary fields) to natural language input/target pairs. Creating a prompt is therefore more like programming than typical data annotation. How should a prompt format trade off between expressivity and simplicity? (A rendering sketch follows this list.)
• Dataset-Level Choices.
Prompts are associ-
ated with datasets, unlike label annotations
that are local to single examples. Prompt en-
gineering requires developers to evaluate their
choices across all examples. What interfaces
do authors need to inspect and debug their
prompts?
• Variation in Prompt Construction.
Unlike
with labels, it is often desirable to have varia-
tion within prompt construction, as different
prompt choices may lead to different results.
However, variation complicates quality judg-
ment, and makes it impossible to apply simple
metrics like inter-annotator agreement. How
can multiple authors collaborate to build a
high-quality corpus of prompts and associated
metadata?
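To make the "functions, not labels" point concrete, here is a minimal sketch of a prompt as a function, rendered with the jinja2 library; PromptSource's actual implementation adds metadata handling around this idea, and the example values are illustrative.

    # A prompt as a function: example dict in, (input, target) pair out.
    # Minimal sketch with jinja2; not PromptSource's actual internals.
    from jinja2 import Template

    SNLI_TEMPLATE = Template(
        '{{premise}} Based on the previous passage, is it true that '
        '"{{hypothesis}}"? Yes, no, or maybe? ||| {{ answer_choices[label] }}'
    )
    # Assumed ordering follows SNLI labels: 0=entailment, 1=neutral, 2=contradiction.
    ANSWER_CHOICES = ["Yes", "Maybe", "No"]

    def apply_prompt(example):
        rendered = SNLI_TEMPLATE.render(answer_choices=ANSWER_CHOICES, **example)
        input_text, target = rendered.split("|||")
        return input_text.strip(), target.strip()

    # Illustrative example in SNLI's field format:
    print(apply_prompt({"premise": "A person on a horse jumps over a log.",
                        "hypothesis": "A person is outdoors.",
                        "label": 0}))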
To illustrate these distinct aspects, we start with
a concrete overview of the prompt creation process
of PromptSource. For this example, we imagine
that a user of PromptSource is creating prompts for
a natural language inference dataset, specifically
SNLI (Bowman et al., 2015). The goal is to de-
sign a prompt query such that the answer can be
mapped onto the SNLI classes. A prompt author
can accomplish this goal with PromptSource via
the following five steps (Figure 1):
S1: Dataset Exploration
The prompt author
starts in the Browse view to read the dataset de-
scription, including linked READMEs and papers,
and to browse through examples. In this case, they
would see that SNLI is a dataset for natural lan-
guage inference: assuming a given premise sentence is true, the goal is to determine whether a hypothesis sentence is true (entailment), false (contradiction), or undetermined (neutral).
S2: Prompt Writing
The prompt author uses
the Sourcing view to try out a prompt wording, and
then adjusts it by observing prompted examples
(Figure 1 middle, full example in Figures 3 and 4).
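Outside the GUI, the effect of this view can be approximated by rendering the template over real dataset rows via the Hugging Face Datasets integration; a rough sketch, reusing the apply_prompt helper sketched in the previous section:

    # Preview a prompt on real SNLI examples, as the Sourcing view
    # does interactively. Reuses apply_prompt from the earlier sketch.
    from datasets import load_dataset

    snli = load_dataset("snli", split="validation")
    for example in snli.select(range(5)):
        if example["label"] == -1:  # SNLI uses -1 where annotators lacked consensus
            continue
        input_text, target = apply_prompt(example)
        print(input_text, "->", target)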
S3: Prompt Documentation
To facilitate using
the prompt, the author fills in various metadata in-
cluding possible metrics to evaluate the prompt,
valid outputs if applicable, whether the prompt ex-
presses the original intended task of the dataset,
and whether the template explicitly states the valid
outputs.
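These fields could be recorded, for example, as a simple mapping; the key names below paraphrase the prose above and are not necessarily PromptSource's exact schema.

    # Sketch of prompt metadata; key names are illustrative.
    prompt_metadata = {
        "metrics": ["Accuracy"],                   # possible evaluation metrics
        "answer_choices": "Yes ||| Maybe ||| No",  # valid outputs, if applicable
        "original_task": True,      # prompt expresses the dataset's intended task
        "choices_in_prompt": True,  # template explicitly states the valid outputs
    }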
S4: Iteration and Variation
The prompt author