Enhancing Security of AI-Based Code Synthesis with GitHub
Copilot via Cheap and Efficient Prompt-Engineering
Jakub Res
iresj@fit.vut.cz
Brno University of Technology,
Faculty of Information Technology
Czech Republic
Aleš Smrčka
smrcka@fit.vut.cz
Brno University of Technology,
Faculty of Information Technology
Czech Republic
Ivan Homoliak
ihomoliak@fit.vut.cz
Brno University of Technology,
Faculty of Information Technology
Czech Republic
Kamil Malinka
malinka@fit.vut.cz
Brno University of Technology,
Faculty of Information Technology
Czech Republic
Martin Perešíni
iperesini@fit.vut.cz
Brno University of Technology,
Faculty of Information Technology
Czech Republic
Petr Hanacek
hanacek@fit.vut.cz
Brno University of Technology,
Faculty of Information Technology
Czech Republic
ABSTRACT
AI assistants for coding are on the rise. However, one of the reasons developers and companies avoid harnessing their full potential is the questionable security of the generated code. This paper first reviews the current state-of-the-art and identifies areas for improvement on this issue. Then, we propose a systematic approach based on prompt-altering methods to achieve better code security of (even proprietary black-box) AI-based code generators such as GitHub Copilot, while minimizing the complexity of the application from the user point-of-view, the computational resources, and operational costs. In sum, we propose and evaluate three prompt-altering methods: (1) scenario-specific, (2) iterative, and (3) general clause, and we discuss their combination. Contrary to an audit of code security, the latter two of the proposed methods require no expert knowledge from the user. We assess the effectiveness of the proposed methods on GitHub Copilot using the OpenVPN project in realistic scenarios, and we demonstrate that the proposed methods reduce the number of insecure generated code samples by up to 16% and increase the number of secure code samples by up to 8%. Since our approach does not require access to the internals of the AI models, it can in general be applied to any AI-based code synthesizer, not only GitHub Copilot.
1 INTRODUCTION
With the release of ChatGPT [1], public attention shifted towards AI assistant tools. These assistants are proficient in many areas, including software engineering and coding. The advent of AI coding assistants means transitioning from intelligent code-completion tools to code-generating tools. Although these AI assistants are far from perfect in terms of solving coding problems, a recent model, AlphaCode 2, proposed by DeepMind, scored better than over 85% of human competitors [9].
According to Liang et al. [11], in a survey with 410 GitHub users' responses, 70% of respondents who had experience with GitHub Copilot utilize it at least once a month, while 46% utilize the AI assistant daily. The most frequent reasons developers gave for using AI assistants were fewer keystrokes to write code and faster coding.
Due to the rapidly rising popularity of AI assistants, researchers started to focus on studying the quality of the synthesized code and ways of improving it (see Sec. 5.2). While observing validity or correctness, many studies overlook the crucial aspect of code: security.

Fig. 1: Example of a security issue generated by AI. The scenario comes from the dataset proposed in [17].
In the motivating example, the AI assistant was tasked with generating a code snippet to fill a gap in the context of a C program. Its objective was to create a new instance of the structure "person" and assign a status value of zero to it. Although the AI assistant provided reasonable code (see Fig. 1), the snippet contains CWE-476 [25] (the malloc function could fail to allocate memory, thus resulting in a NULL pointer dereference).
In this research, we aim to study various ways of improving the security of code generated by any proprietary Large Language Model (LLM), and we demonstrate our approach on the well-known GitHub Copilot [6].
There exist a few categories of techniques for improving the code synthesis of AI models, such as output optimization, model fine-tuning, and prompt engineering, and each of them has some pros and cons. In this work, we focus on efficiency, generality, and low costs, and therefore prompt engineering is the most suitable technique for us. While the literature on prompt engineering is mostly general [14, 31, 5, 4], we are more specific and determine four approaches to it, which we further investigate: (1) scenario-specific information and warning providing, (2) iterative security-specific prompting, (3) general alignment shifting using an inception prompt (i.e., general clause), (4) cooperative agents system. In particular, we experiment with the former three approaches, which are orthogonal in their principles.
Contributions. The contributions of our paper are as follows:
(1) We reviewed the literature and identified three different areas of code synthesis improvements of LLMs, involving
(Fig. 1 snippet)
person *newPerson = (person *)malloc(sizeof(person));
newPerson->status = 0;
arXiv:2403.12671v1 [cs.CR] 19 Mar 2024