PaLM 2 Technical Report
Google*
Abstract
We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities
and is more compute-efficient than its predecessor PaLM (Chowdhery et al., 2022). PaLM 2 is a Transformer-based
model trained using a mixture of objectives similar to UL2 (Tay et al., 2023). Through extensive evaluations on English
and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on
downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference
compared to PaLM. This improved efficiency enables broader deployment while also allowing the model to respond
faster, for a more natural pace of interaction. PaLM 2 demonstrates robust reasoning capabilities exemplified by large
improvements over PaLM on BIG-Bench and other reasoning tasks. PaLM 2 exhibits stable performance on a suite of
responsible AI evaluations, and enables inference-time control over toxicity without additional overhead or impact on
other capabilities. Overall, PaLM 2 achieves state-of-the-art performance across a diverse set of tasks and capabilities.
* See authorship section for a list of authors.
Contents
1 Introduction
2 Scaling law experiments
  2.1 Scaling laws
  2.2 Downstream metric evaluations
3 Training dataset
4 Evaluation
  4.1 Language proficiency exams
  4.2 Classification and question answering
  4.3 Reasoning
  4.4 Coding
  4.5 Translation
  4.6 Natural language generation
  4.7 Memorization
5 Responsible usage
  5.1 Inference-time control
  5.2 Recommendations for developers
6 Conclusion
A Detailed results
  A.1 Scaling laws
  A.2 Instruction tuning
  A.3 Multilingual commonsense reasoning
  A.4 Coding
  A.5 Natural language generation
B Examples of model capabilities
  B.1 Multilinguality
  B.2 Creative generation
  B.3 Coding
C Language proficiency exams
D Dataset language composition
E Responsible AI
  E.1 Dataset analysis
  E.2 Evaluation approach
  E.3 Dialog uses
  E.4 Classification uses
  E.5 Translation uses
  E.6 Question answering uses
  E.7 Language modeling
  E.8 Measurement quality rubrics
  E.9 CrowdWorksheets
  E.10 Model Card
1 Introduction
Language modeling has long been an important research area since Shannon (1951) estimated the information in
language with next word prediction. Modeling began with n-gram based approaches (Kneser & Ney, 1995) but rapidly
advanced with LSTMs (Hochreiter & Schmidhuber, 1997; Graves, 2014). Later work showed that language modelling
also led to language understanding (Dai & Le, 2015). With increased scale and the Transformer architecture (Vaswani
et al., 2017), large language models (LLMs) have shown strong performance in language understanding and generation
capabilities over the last few years, leading to breakthrough performance in reasoning, math, science, and language tasks
(Howard & Ruder, 2018; Brown et al., 2020; Du et al., 2022; Chowdhery et al., 2022; Rae et al., 2021; Lewkowycz
et al., 2022; Tay et al., 2023; OpenAI, 2023b). Key factors in these advances have been scaling up model size (Brown
et al., 2020; Rae et al., 2021) and the amount of data (Hoffmann et al., 2022). To date, most LLMs follow a standard
recipe of mostly monolingual corpora with a language modeling objective.
We introduce PaLM 2, the successor to PaLM (Chowdhery et al., 2022), a language model unifying modeling advances,
data improvements, and scaling insights. PaLM 2 incorporates the following diverse set of research advances:
• Compute-optimal scaling: Recently, compute-optimal scaling (Hoffmann et al., 2022) showed that data size is
at least as important as model size. We validate this study for larger amounts of compute and similarly find that
data and model size should be scaled roughly 1:1 to achieve the best performance for a given amount of training
compute (as opposed to past trends, which scaled the model 3× faster than the dataset); see the sketch after this list.
• Improved dataset mixtures: Previous large pre-trained language models typically used a dataset dominated
by English text (e.g., ∼78% of the non-code data in Chowdhery et al. (2022)). We designed a more multilingual and
diverse pre-training mixture, which extends across hundreds of languages and domains (e.g., programming
languages, mathematics, and parallel multilingual documents). We show that larger models can handle more
disparate non-English datasets without causing a drop in English language understanding performance, and apply
deduplication to reduce memorization (Lee et al., 2021).
• Architectural and objective improvements: Our model architecture is based on the Transformer. Past LLMs
have almost exclusively used a single causal or masked language modeling objective. Given the strong results of
UL2 (Tay et al., 2023), we use a tuned mixture of different pre-training objectives to train the model
to understand different aspects of language.
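To make the 1:1 scaling recipe concrete, here is a minimal sketch of the arithmetic, assuming the common approximation that training cost is C ≈ 6·N·D FLOPs for N parameters and D training tokens (Hoffmann et al., 2022); the tokens-per-parameter ratio and compute budgets below are illustrative placeholders, not values from this report.

    # Minimal sketch: compute-optimal sizing when parameters and tokens scale 1:1.
    # Assumes C ~= 6 * N * D training FLOPs (Hoffmann et al., 2022); the ratio k
    # below is an illustrative placeholder, not a value reported for PaLM 2.
    def optimal_sizes(compute_flops: float, tokens_per_param: float = 20.0):
        """Return (params, tokens) that spend `compute_flops` with D = k * N."""
        k = tokens_per_param
        # C = 6 * N * D = 6 * k * N^2  =>  N = sqrt(C / (6 * k)),  D = k * N
        n_params = (compute_flops / (6.0 * k)) ** 0.5
        n_tokens = k * n_params
        return n_params, n_tokens

    if __name__ == "__main__":
        for budget in (1e21, 1e22, 1e23, 1e24):
            n, d = optimal_sizes(budget)
            print(f"C={budget:.0e} FLOPs -> N~{n:.2e} params, D~{d:.2e} tokens")

Because N and D grow together under this rule, a 10× larger compute budget increases both by roughly √10 ≈ 3.2×, rather than spending most of the additional compute on model size alone as in the older recipe.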
The largest model in the PaLM 2 family, PaLM 2-L, is significantly smaller than the largest PaLM model but uses
more training compute. Our evaluation results show that PaLM 2 models significantly outperform PaLM on a variety
of tasks, including natural language generation, translation, and reasoning. These results suggest that model scaling
is not the only way to improve performance. Instead, performance can be unlocked by meticulous data selection
and efficient architecture/objectives. Moreover, a smaller but higher quality model significantly improves inference
efficiency, reduces serving cost, and makes the model viable for more downstream applications and users.
PaLM 2 demonstrates significant multilingual language, code generation and reasoning abilities, which we illustrate in
Figures 2 and 3. More examples can be found in Appendix B.¹ PaLM 2 performs significantly better than PaLM on
real-world advanced language proficiency exams and passes exams in all evaluated languages (see Figure 1). For some
exams, this is a level of language proficiency sufficient to teach that language. In this report, generated samples and
measured metrics are from the model itself without any external augmentations such as Google Search or Translate.
PaLM 2 includes control tokens to enable inference-time control over toxicity, modifying only a fraction of pre-training
as compared to prior work (Korbak et al., 2023). Special ‘canary’ token sequences were injected into PaLM 2
pre-training data to enable improved measures of memorization across languages (Carlini et al., 2019, 2021). We find
that PaLM 2 has lower average rates of verbatim memorization than PaLM, and for tail languages we observe that
memorization rates increase above English only when data is repeated several times across documents. We show that
PaLM 2 has improved multilingual toxicity classification capabilities, and evaluate potential harms and biases across a
range of potential downstream uses. We also include an analysis of the representation of people in pre-training data.
These sections help downstream developers assess potential harms in their specific application contexts (Shelby et al.,
2023), so that they can prioritize additional procedural and technical safeguards earlier in development. The rest of this
report focuses on describing the considerations that went into designing PaLM 2 and evaluating its capabilities.
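As a rough illustration of the control-token mechanism, the sketch below conditions generation on a special low-toxicity tag prepended to the prompt. The tag string and the generate() helper are hypothetical stand-ins; the report does not disclose the actual control tokens or serving interface.

    # Hedged sketch of inference-time control via a control token.
    # Assumption: some pre-training documents were tagged with attribute tokens,
    # so the model can be conditioned on the attribute at inference time.
    LOW_TOXICITY_TAG = "<low_toxicity>"  # hypothetical token, not from the report

    def controlled_prompt(user_prompt: str, tag: str = LOW_TOXICITY_TAG) -> str:
        """Prepend a control token so decoding is conditioned on that attribute."""
        return f"{tag} {user_prompt}"

    def generate(prompt: str) -> str:
        """Placeholder for a real model call; echoes the prompt for demonstration."""
        return f"[model output conditioned on: {prompt!r}]"

    print(generate(controlled_prompt("Reply politely to this customer complaint.")))

In this sketch the control is expressed entirely through the prompt, so no auxiliary classifier or re-ranking step is needed at serving time, matching the "no additional overhead" property described above.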
¹ Note that not all capabilities of PaLM 2 are currently exposed via PaLM 2 APIs.
[Figure 1: bar chart of exam scores (%) for PaLM 2 versus PaLM on C2-level language proficiency exams: HSK 7-9 Writing and Overall (Chinese), J-Test A-C Overall (Japanese), PLIDA C2 Writing and Overall (Italian), TCF Writing and Overall (French), DELE C2 Writing and Overall (Spanish), and Goethe-Zertifikat C2 Writing and Overall (German). PaLM 2 passes every exam shown, while PaLM fails several.]
Figure 1: Performance of PaLM 2 and PaLM on the latest available professional language proficiency exams. We
used exams for each language to test a C2 (mastery or advanced professional) level proficiency following the CEFR
definition. We used the transcripts of the listening portion of the exams and treated them as additional reading questions.
We used the most recent year’s exam where available, otherwise we used practice exams representative of the full exam.
Each writing exam was scored by three professional native speakers. The writing portion was weighted equally
with the non-writing portion in the final score. These exams were conducted under simulated conditions and are not official
results. For more details, see Section 4.1 of this report. * PaLM 2 reached level A/special A on Japanese, whereas PaLM
reached level F. PaLM 2 reached level C1 on French.
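The equal weighting described above amounts to a simple average of the two portions; the snippet below illustrates this with made-up scores, not numbers from Figure 1.

    # Illustrative only: equal weighting of writing and non-writing portions.
    writing_score, non_writing_score = 82.0, 90.0  # made-up scores
    final_score = 0.5 * writing_score + 0.5 * non_writing_score
    print(final_score)  # 86.0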
Fixing a bug with comments in Korean
can you fix this code with a bug and add line by line comments in Korean.
    visited = set()

    def dfs(visited, graph, node):
        if node not in visited:
            print(node)
            for neighbour in graph[node]:
                dfs(visited, graph, neighbour)

The bug in the code is that the visited set is not being updated. The code should be updated to add the current
node to the visited set before calling the dfs function recursively. The updated code with Korean comments
is as follows:

    # 방문한 노드의 집합을 만듭니다.
    visited = set()

    # 깊이 우선 탐색을 수행합니다.
    def dfs(visited, graph, node):
        # 현재 노드가 방문한 노드가 아니라면
        if node not in visited:
            # 현재 노드를 방문한 노드로 표시합니다.
            visited.add(node)
            # 현재 노드를 출력합니다.
            print(node)
            # 현재 노드의 인접 노드에 대해 깊이 우선 탐색을 수행합니다.
            for neighbour in graph[node]:
                dfs(visited, graph, neighbour)
Figure 2: Example of fixing a bug with line-by-line comments in Korean.
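To check that the corrected function behaves as intended, a short standalone driver like the one below can be run; the graph literal is an arbitrary example (not from the figure), and the fixed dfs is repeated so the snippet is self-contained.

    # Hypothetical usage of the corrected dfs from Figure 2.
    visited = set()

    def dfs(visited, graph, node):
        if node not in visited:
            visited.add(node)  # mark the node before recursing (the bug fix)
            print(node)
            for neighbour in graph[node]:
                dfs(visited, graph, neighbour)

    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
    dfs(visited, graph, "A")  # prints A, B, D, C: each node exactly once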