The Claude 3 Model Family: Opus, Sonnet, Haiku
Anthropic
Abstract
We introduce Claude 3, a new family of large multimodal models – Claude 3 Opus, our
most capable offering, Claude 3 Sonnet, which provides a combination of skills and speed,
and Claude 3 Haiku, our fastest and least expensive model. All new models have vision
capabilities that enable them to process and analyze image data. The Claude 3 family
demonstrates strong performance across benchmark evaluations and sets a new standard on
measures of reasoning, math, and coding. Claude 3 Opus achieves state-of-the-art results
on evaluations like GPQA [1], MMLU [2], MMMU [3] and many more. Claude 3 Haiku
performs as well as or better than Claude 2 [4] on most pure-text tasks, while Sonnet and
Opus significantly outperform it. Additionally, these models exhibit improved fluency in
non-English languages, making them more versatile for a global audience. In this report,
we provide an in-depth analysis of our evaluations, focusing on core capabilities, safety,
societal impacts, and the catastrophic risk assessments we committed to in our Responsible
Scaling Policy [5].
1 Introduction
This model card introduces the Claude 3 family of models, which set new industry benchmarks across reasoning, math, coding, multi-lingual understanding, and vision quality.
Like its predecessors, Claude 3 models employ various training methods, such as unsupervised learning and
Constitutional AI [6]. These models were trained using hardware from Amazon Web Services (AWS) and
Google Cloud Platform (GCP), with core frameworks including PyTorch [7], JAX [8], and Triton [9].
A key enhancement in the Claude 3 family is multimodal input capabilities with text output, allowing users to upload images (e.g., tables, graphs, photos) along with text prompts for richer context and expanded use cases, as shown in Figure 1 and Appendix B. (We support JPEG, PNG, GIF, and WebP images, up to 10MB and 8000x8000 px; we recommend avoiding small or low-resolution images.) The model family also excels at tool use, also known as function calling, allowing seamless integration of Claude’s intelligence into specialized applications and custom workflows.
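For illustration, a minimal sketch of passing an image alongside a text prompt through the Anthropic Messages API in Python might look like the following; the file name, prompt, and model identifier are examples chosen for this sketch:

```python
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Encode a local chart image for the Messages API (example file name).
with open("quarterly_chart.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
            {"type": "text", "text": "Summarize the trend shown in this chart."},
        ],
    }],
)
print(message.content[0].text)
```

Tool use follows a similar request pattern, with a `tools` parameter describing the available functions via JSON schemas.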
Claude 3 Opus, our most intelligent model, sets a new standard on measures of reasoning, math, and coding.
Both Opus and Sonnet demonstrate increased proficiency in nuanced content creation, analysis, forecasting,
accurate summarization, and handling scientific queries. These models are designed to empower enterprises
to automate tasks, generate revenue through user-facing applications, conduct complex financial forecasts,
and expedite research and development across various sectors. Claude 3 Haiku is the fastest and most affordable option on the market for its intelligence category, while also including vision capabilities. The entire
Claude 3 family improves significantly on previous generations for coding tasks and fluency in non-English
languages like Spanish and Japanese, enabling use cases like translation services and broader global utility.
Developed by Anthropic and announced in March 2024, the Claude 3 model family will be available in our
consumer offerings (Claude.ai, Claude Pro) as well as enterprise solutions like the Anthropic API, Amazon
Bedrock, and Google Vertex AI. The knowledge cutoff for the Claude 3 models is August 2023.
This model card is not intended to encompass all of our research. For comprehensive insights into our training
and evaluation methodologies, we invite you to explore our research papers (e.g., Challenges in Evaluating AI Systems [10], Red Teaming Language Models to Reduce Harms [11], Capacity for Moral Self-Correction
in Large Language Models [12], Towards Measuring the Representation of Subjective Global Opinions in
Language Models [13], Frontier Threats Red Teaming for AI Safety [14], and our Responsible Scaling Policy
[5] to address catastrophic risks). In addition to our public research, we regularly engage with stakeholders across industry, government, and civil society to share findings and best practices. We expect to release new findings as we continue our research and
evaluations of frontier models.
2 Model Details
2.1 Intended Uses
Claude is trained to be a helpful, honest, and harmless assistant. Claude models excel at open-ended conversation and collaboration on ideas, and also perform exceptionally well in coding tasks and when working with text, whether searching, writing, editing, outlining, or summarizing. (For more information and advice on prompt design, please see our documentation at https://docs.anthropic.com/claude/docs/introduction-to-prompt-design.) The Claude 3 family’s multimodal features can interpret visual input (e.g., charts, graphs, and photos) to support additional use cases and productivity. Claude models have a helpful, conversational tone and can take direction on “personality.” Users have described them as feeling steerable, adaptive, and engaging.
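As a sketch of such steering, a persona can be set via the Messages API’s `system` parameter; the persona text below is invented for the example:

```python
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=512,
    # A system prompt steers tone and persona for the conversation.
    system="You are a patient, encouraging writing coach. Keep feedback concise.",
    messages=[{"role": "user",
               "content": "Critique this opening line: 'It was a dark and stormy night.'"}],
)
print(message.content[0].text)
```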
Claude uses all the text that users input (the prompt) and all the text it has generated so far within the conversation to predict the next words or tokens that would be most helpful. This means that Claude constructs
its responses one set of characters at a time, in order. It cannot go back and edit its responses after they have
been constructed unless users give it a chance to do so in a subsequent prompt. Claude can also only see (and
make predictions on) what appears in its context window. It can’t remember previous separate conversations
unless users reinsert such material in the prompt, nor can it open links.
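This token-by-token behavior can be pictured with a toy loop; `predict_next_token` below is a hypothetical stand-in for the model, not a real API:

```python
def generate(predict_next_token, prompt_tokens, max_new_tokens=100, stop_token=None):
    """Toy autoregressive loop: each new token is predicted from the
    full context so far (prompt plus everything generated), appended,
    and never revised afterwards."""
    context = list(prompt_tokens)
    generated = []
    for _ in range(max_new_tokens):
        token = predict_next_token(context)  # hypothetical model call
        if token == stop_token:
            break
        context.append(token)    # the model sees its own prior output...
        generated.append(token)  # ...but cannot go back and edit it
    return generated
```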
2.2 Unintended Uses
The models should not be used on their own in high-stakes situations where an incorrect answer could cause
harm. For example, while Claude models could support a lawyer or doctor, they should not be deployed
instead of one, and any responses should still be reviewed by a human. Claude models do not currently
search the web (though users can ask them to interact with a document that they share directly), and the
models only answer questions using data up to mid-2023. Claude models can be connected to search tools
and are thoroughly trained to utilize them (over the web or other databases), but unless specifically indicated,
it should be assumed that Claude models are not using this capability. Claude models have multilingual capabilities but perform less strongly on low-resource languages (see our multilingual evaluations in Section 5.6 for more details).
2.3 Prohibited Uses
Our Acceptable Use Policy (AUP) [15] includes details on prohibited use cases. These prohibited uses
include, but are not limited to, political campaigning or lobbying, surveillance, social scoring, criminal justice
decisions, law enforcement, and decisions related to financing, employment, and housing. The AUP also
outlines additional safety requirements for business uses, such as requiring disclosure that an AI system is
being used and outlining what its capabilities and limitations are. The AUP also details which use cases
require implementing human-in-the-loop measures.
The AUP applies to both image and text prompts, and all Anthropic users must read and affirmatively acknowledge the AUP before accessing Claude models. We regularly review and update the AUP to ensure that
our product is as safe and trustworthy as possible.
2.4 Safeguarding Against Misuse
Detecting and mitigating prohibited uses of our technology are essential to preventing bad actors from misusing our models to generate abusive, deceptive, or misleading content. We use automated systems to detect violations of our AUP as they occur in real time. User prompts that are flagged as violating the AUP trigger an instruction to our models to respond even more cautiously. In cases where the user prompt is particularly severe or harmful, we will block the model from responding altogether, and in the case of repeated violations,
we may terminate the user’s Claude access.
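Schematically, this kind of enforcement flow can be sketched as follows; the classifier score, thresholds, and instruction text are illustrative placeholders rather than a description of the production system:

```python
from enum import Enum

class Verdict(Enum):
    ALLOW = "allow"      # serve the request normally
    CAUTION = "caution"  # serve it, but instruct the model to be more careful
    BLOCK = "block"      # do not let the model respond at all

def enforce_aup(violation_score: float) -> Verdict:
    """Map a moderation classifier's score in [0, 1] to an action.
    The thresholds here are illustrative placeholders."""
    if violation_score >= 0.95:
        return Verdict.BLOCK
    if violation_score >= 0.50:
        return Verdict.CAUTION
    return Verdict.ALLOW

def apply_verdict(verdict: Verdict, system_prompt: str) -> str | None:
    """Return the (possibly augmented) system prompt, or None to refuse."""
    if verdict is Verdict.BLOCK:
        return None
    if verdict is Verdict.CAUTION:
        return system_prompt + "\nThis request was flagged; respond with extra caution."
    return system_prompt
```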
2.5 Training Data
Claude 3 models are trained on a proprietary mix of publicly available information on the Internet as of
August 2023, as well as non-public data from third parties, data provided by data labeling services and
paid contractors, and data we generate internally. We employ several data cleaning and filtering methods,
including deduplication and classification. The Claude 3 suite of models has not been trained on any user prompt or output data submitted to us by users or customers, including free users, Claude Pro users, and API customers.
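As a simple illustration of one such cleaning step, a minimal exact-deduplication pass keyed on a content hash might look like this; real pipelines typically add near-duplicate detection (e.g., MinHash) on top:

```python
import hashlib

def dedupe_exact(documents: list[str]) -> list[str]:
    """Keep only the first occurrence of each document, keyed on a
    hash of its normalized text."""
    seen: set[str] = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```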
When Anthropic obtains data by crawling public web pages, we follow industry practices with respect to
robots.txt instructions and other signals that website operators use to indicate whether they permit crawling
of the content on their sites. In accordance with our policies, Anthropic’s crawler does not access password-protected or sign-in pages or bypass CAPTCHA controls, and we conduct diligence on the data that we use. Anthropic operates its crawling system transparently, which means website operators can easily identify
Anthropic visits and signal their preferences to Anthropic.
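A well-behaved crawler’s robots.txt check can be sketched with Python’s standard library; the user-agent string below is a placeholder, not the identifier Anthropic’s crawler actually uses:

```python
from urllib import robotparser
from urllib.parse import urlparse

def may_crawl(url: str, user_agent: str = "ExampleBot") -> bool:
    """Check the site's robots.txt before fetching a page."""
    parts = urlparse(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()  # fetch and parse the site's robots.txt
    return rp.can_fetch(user_agent, url)
```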
2.6 Training Process
Claude was trained with a focus on being helpful, harmless, and honest. Training techniques include pretraining on large, diverse data to acquire language capabilities through methods like word prediction, as well
as human feedback techniques that elicit helpful, harmless, honest responses. Anthropic used a technique
called Constitutional AI [16] to align Claude with human values during reinforcement learning by explicitly
specifying rules and principles based on sources like the UN Declaration of Human Rights. With Claude 3
models, we have added an additional principle to Claude’s constitution to encourage respect for disability
rights, sourced from our research on Collective Constitutional AI [17]. Some of the human feedback data
used to finetune Claude was made public [18] alongside our RLHF [19] and red-teaming research.
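The critique-and-revision stage of Constitutional AI [16] can be sketched roughly as follows; `ask_model` is a hypothetical helper and the principles are paraphrased, so this is an outline of the published method rather than production training code:

```python
import random

# Paraphrased examples; the published constitution draws on sources
# such as the UN Declaration of Human Rights.
PRINCIPLES = [
    "Choose the response that most supports life, liberty, and personal security.",
    "Choose the response most respectful of the rights of people with disabilities.",
]

def critique_and_revise(ask_model, prompt: str, draft: str) -> str:
    """One critique-and-revision step: check a draft response against a
    sampled constitutional principle, then rewrite it to address the
    critique. Revised responses become supervised finetuning data."""
    principle = random.choice(PRINCIPLES)
    critique = ask_model(
        f"Principle: {principle}\n"
        f"Prompt: {prompt}\nResponse: {draft}\n"
        "Point out any ways the response conflicts with the principle."
    )
    revision = ask_model(
        f"Original response: {draft}\n"
        f"Critique: {critique}\n"
        "Rewrite the response to fully address the critique."
    )
    return revision
```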
Once our models are fully trained, we run a suite of evaluations for safety. Our Trust and Safety team also
runs continuous classifiers to monitor prompts and outputs for harmful, malicious use cases that violate our
AUP. See more on both in the evaluations sections below.
2.7 Release Decisions and Maintenance
We take a number of concrete steps to responsibly develop and deploy AI systems, drawing on guidance from
the NIST AI Risk Management Framework and its Map, Measure, Manage, and Govern Subcategories [20].
We clearly document the ways in which our products may and may not be used, as well as the limitations and
potential risks of using our products. We regularly evaluate our systems through interactive red teaming, as
well as assessments against benchmarks for both product performance and potential safety risks. To manage
potential risks, we incrementally roll out access to our products to ensure their safety and reliability; use a
combination of automated monitoring for potential harms and violations of our AUP, as well as human review
to audit the accuracy of our classifiers; and regularly update our models to versions that have been hardened
against newly identified risks and potential vulnerabilities.
We also treat sensitive data and the personal information of the end users of our products and services with
great care. We implement retention policies to ensure that our storage of personal and sensitive information is
proportionate to the need for the data, such as to monitor and improve our Trust and Safety processes. For our
consumer products and use of our website, our privacy policy [21] shares additional details on data privacy,
use, and retention.
We also follow our Responsible Scaling Policy, which guides our development and deployment of increasingly capable AI systems, as described below. As a Public Benefit Corporation (PBC), we are focused on
the safe development and deployment of AI systems at all levels of the organization, up to and including our
executive leadership team.
3 Security
We protect the security of our models’ environment to help ensure their integrity, using a variety of connection authentication and authorization techniques; people are required to use multi-factor authentication at all times. Our advanced models are protected by two-party controls. Access to AI model infrastructure is granted explicitly per user and validated per access attempt. All accounts with access to the serving infrastructure hosting our services are protected via rigorous password requirements and multi-factor authentication. Each account is provisioned with the minimum privilege levels needed by its owner. Additional layers of defense include continuous systems monitoring, 24/7 alert response, endpoint hardening, data storage and sharing controls, personnel vetting, and physical security hardening. We take significant care in testing any code changes, including through code review, prior to deployment to production environments. Finally, we engage with penetration testers to exercise our detection systems and improve our defense posture.
4 Social Responsibility
As a PBC, Anthropic is committed to developing safe and responsible AI systems throughout each stage of
the development process. Claude 3 models show a more nuanced understanding of requests, recognize real
harm, and refuse to answer harmless prompts less often than prior models. That said, they can still make
mistakes and our work to make Claude more helpful, harmless, and honest is ongoing. Ethical considerations
also shape both our AUP, which delineates permissible and impermissible uses of Claude, and the Trust and
Safety processes that enforce it.
4.1 Constitutional AI
Our core research focus has been training Claude models to be helpful, honest, and harmless. Currently, we
do this by giving models a Constitution – a set of ethical and behavioral principles that the model uses to
guide its outputs. The majority of the principles in Claude’s constitution are the same as those we published
in May 2023 [6]. Using this Constitution, models are trained to avoid sexist, racist, and toxic outputs, as well
as to avoid helping a human engage in illegal or unethical activities. In response to our work on Collective
Constitutional AI [17], we added an additional principle informed by our public input process, which instructs Claude to be understanding of and accessible to individuals with disabilities, resulting in lower model
stereotype bias.
4.2 Labor
Anthropic works with several data work platforms that are responsible for engaging and managing data workers who work on Anthropic’s projects.
Data work tasks include selecting preferred model outputs in order to train AI models to align with those
preferences; evaluating model outputs according to a broad range of criteria (e.g., accuracy, helpfulness,
harmlessness, etc.); and adversarially testing (i.e., red teaming) our models to identify potential safety vulnerabilities. This data work is primarily used in our technical safety research, and select aspects of it are also
used in our model training.
4.3 Sustainability
We offset our emissions (including from our cloud computing usage) and work with cloud providers that
prioritize renewable energy and carbon neutrality. Anthropic works to fully offset our operational carbon
emissions each year, partnering with external experts to conduct a rigorous analysis of our company-wide
carbon footprint. Once measured, we invest in verified carbon credits to fully offset our annual footprint.
Our credits directly fund emissions reduction projects. Our goal is to maintain net zero climate impact on an
annual basis through such initiatives and offsets.
5 Core Capabilities Evaluations
We conducted a comprehensive evaluation of the Claude 3 family to analyze trends in their capabilities across
various domains. Our assessment included several broad categories:
• Reasoning: Benchmarks in this category require mathematical, scientific, and commonsense reasoning, testing the models’ ability to draw logical conclusions and apply knowledge to real-world scenarios.
• Multilingual: This category comprises tasks for translation, summarization, and reasoning in multiple languages, evaluating the models’ linguistic versatility and cross-lingual understanding.
• Long Context: These evaluations are focused on question answering and retrieval, assessing the
models’ performance in handling extended texts and extracting relevant information.
• Honesty / Factuality: Questions in this category assess the models’ ability to provide accurate
and reliable responses, either in terms of factual accuracy or fidelity to provided source materials.
When unsure, the models are expected to be honest about their limitations, expressing uncertainty
or admitting that they do not have sufficient information to provide a definitive answer.
• Multimodal: Evaluations include questions on science diagrams, visual question answering, and
quantitative reasoning based on images.
These capabilities evaluations helped measure the models’ skills, strengths, and weaknesses across a range
of tasks. Many of these evaluations are industry standard, and we have invested in additional evaluation
techniques and topics described below. We also present internal benchmarks we’ve developed over the course
of training to address issues with refusals of harmless requests.
5.1 Reasoning, Coding, and Question Answering
We evaluated the Claude 3 family on a series of industry-standard benchmarks covering reasoning, reading comprehension, math, science, and coding. The Claude 3 models demonstrate superior capabilities in
these areas, surpassing previous Claude models, and in many cases achieving state-of-the-art results. These
improvements are highlighted in our results presented in Table 1.
We tested our models on challenging domain-specific questions in GPQA [1], MMLU [2], ARC-Challenge
[22], and PubMedQA [23]; math problem solving in both English (GSM8K, MATH) [24, 25] and multilingual
settings (MGSM) [26]; common-sense reasoning in HellaSwag [27], WinoGrande [28]; reasoning over text in
DROP [29]; reading comprehension in RACE-H [30] and QuALITY [31] (see Table 6); coding in HumanEval
[32], APPS [33], and MBPP [34]; and a variety of tasks in BIG-Bench-Hard [35, 36].
GPQA (A Graduate-Level Google-Proof Q&A Benchmark) is of particular interest because it is a new evaluation released in November 2023, with difficult questions focused on graduate-level expertise and reasoning.
We focus mainly on the Diamond set as it was selected by identifying questions where domain experts agreed
on the solution, but experts from other domains could not successfully answer the questions despite spending
more than 30 minutes per problem, with full internet access. We found the GPQA evaluation to have very
high variance when sampling with chain-of-thought at T = 1. In order to reliably evaluate scores on the Diamond set with 0-shot CoT (50.4%) and 5-shot CoT (53.3%), we compute the mean over 10 different evaluation
rollouts. In each rollout, we randomize the order of the multiple choice options. We see that Claude 3 Opus
typically scores around 50% accuracy. This improves greatly on prior models but falls somewhat short of
graduate-level domain experts, who achieve accuracy scores in the 60-80% range [1] on these questions.
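The rollout-averaging procedure can be sketched as follows; `grade_once` is a hypothetical evaluator that returns the fraction of questions answered correctly in a single pass:

```python
import random

def shuffle_options(question: dict) -> dict:
    """Copy a multiple-choice question with its options freshly shuffled."""
    options = question["options"][:]
    random.shuffle(options)
    return {**question, "options": options}

def mean_accuracy(grade_once, questions: list[dict], n_rollouts: int = 10) -> float:
    """Average accuracy over independent rollouts to tame the variance of
    sampling chain-of-thought at T = 1; option order is re-randomized
    for every rollout."""
    scores = [
        grade_once([shuffle_options(q) for q in questions])
        for _ in range(n_rollouts)
    ]
    return sum(scores) / len(scores)
```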
We leverage majority voting [37] at test time to evaluate performance: we ask models to solve each problem using chain-of-thought reasoning (CoT) [38] N different times, sampling at T = 1, and then report the answer that occurs most often. When we evaluate in this way in a few-shot setting with Maj@32, Opus achieves a score of 73.7% on MATH and 59.5% on GPQA. For the latter, we averaged over 10 iterations of Maj@32, as even with this evaluation methodology there was significant variance (with some rollouts scoring in the low 60s, and others in the mid-to-high 50s).
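A minimal sketch of the Maj@N computation; `sample_answer` stands in for one chain-of-thought rollout at T = 1 and is hypothetical:

```python
from collections import Counter

def maj_at_n(sample_answer, problem, n: int = 32):
    """Maj@N: sample N chain-of-thought solutions at T = 1 and return
    the final answer that occurs most often."""
    answers = [sample_answer(problem) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

def accuracy_maj_at_n(sample_answer, problems, labels, n: int = 32) -> float:
    """Fraction of problems where the majority-vote answer matches the label."""
    hits = sum(maj_at_n(sample_answer, p, n) == y for p, y in zip(problems, labels))
    return hits / len(problems)
```

Averaging this accuracy over 10 independent iterations gives the reported GPQA figure.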