4/23/23, 2:21 PM
How ChatGPT actually works
https://www.assemblyai.com/blog/how-chatgpt-actually-works/?continueFlag=e8b9a5063408f7cd43498176aa606bf5
DEEP LEARNING
How ChatGPT actually works
Since its release, the public has been playing with ChatGPT and seeing what it can do, but how does ChatGPT actually work? While the details of its inner workings have not been published, we can piece together its functioning principles from recent research.
Marco Ramponi
Developer Educator at AssemblyAI
Dec 23, 2022
ChatGPT is the latest language model from OpenAI and represents a significant improvement over its predecessor GPT-3. Similarly to many Large Language Models, ChatGPT is capable of generating text in a wide range of styles and for different purposes, but with remarkably greater precision, detail, and coherence. It represents the next generation in OpenAI's line of Large Language Models, and it is designed with a strong focus on interactive conversations.
The creators have used a combination of both Supervised Learning and Reinforcement Learning to fine-tune ChatGPT, but it is the Reinforcement Learning component specifically that makes ChatGPT unique. The creators use a particular technique called Reinforcement Learning from Human Feedback (RLHF), which uses human feedback in the training loop to minimize harmful, untruthful, and/or biased outputs.
We are going to examine GPT-3's limitations and how they stem from its training process, before learning how RLHF works and understanding how ChatGPT uses RLHF to overcome these issues. We will conclude by looking at some of the limitations of this methodology.
Capability vs Alignment in Large Language Models

"alignment vs capability" can be thought of as a more abstract analogue of "accuracy vs precision"

In the context of machine learning, the term capability refers to a model's ability to perform a specific task or set of tasks. A model's capability is typically evaluated by how well it is able to optimize its objective function, the mathematical expression that defines the goal of the model. For example, a model designed to predict stock market prices might have an objective function that measures the accuracy of the model's predictions. If the model is able to accurately predict the movement of stock prices over time, it would be considered to have a high level of capability for this task.
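To make the idea of an objective function concrete, here is a minimal sketch of the stock-price example. It assumes mean squared error as the objective, which is only one common choice; the article does not say which objective such a model would use. The prices and the two "models" (`model_a`, `model_b`) are invented for illustration.

```python
from typing import Sequence


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Objective function: average squared gap between predictions and reality.

    Lower values mean the predictions track the real prices more closely.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical closing prices and two hypothetical models' predictions.
actual_prices = [100.0, 102.0, 101.0, 105.0]
model_a = [101.0, 101.0, 102.0, 104.0]  # off by 1 on every day
model_b = [110.0, 95.0, 108.0, 99.0]    # off by 10, 7, 7 and 6

loss_a = mean_squared_error(model_a, actual_prices)
loss_b = mean_squared_error(model_b, actual_prices)

# Under this objective, the model with the lower loss is the more capable one.
more_capable = "model_a" if loss_a < loss_b else "model_b"
print(loss_a, loss_b, more_capable)
```

Under this assumed objective, `model_a` would be judged more capable because it optimizes the objective better. Note that this says nothing yet about whether the objective itself captures what we want from the model.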
Alignment, on the other hand, is concerned with what we actually want the model to do versus what it is being trained to do. It asks the question “is that objective function consistent with our intentions?” and refers to the extent to which a model's goals and behavior align with human values and expectations. For a simple concrete example, say we train a
Table of contents
Capability vs Alignment in Large Language Models
How language model training strategies can produce misalignment
Reinforcement Learning from Human Feedback
Performance Evaluation
Shortcomings of the methodology
Selected references for further reading