
A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards

Joshua Harrison, Ehsan Toreini, and Maryam Mehrnezhad

Durham University, joshua.b.harrison@durham.ac.uk

University of Surrey, e.toreini@surrey.ac.uk

Royal Holloway University of London, maryam.mehrnezhad@rhul.ac.uk

August 3, 2023

Abstract

With recent developments in deep learning, the ubiquity of microphones and the rise in online services via personal devices, acoustic side channel attacks present a greater threat to keyboards than ever. This paper presents a practical implementation of a state-of-the-art deep learning model in order to classify laptop keystrokes, using a smartphone integrated microphone. When trained on keystrokes recorded by a nearby phone, the classifier achieved an accuracy of 95%, the highest accuracy seen without the use of a language model. When trained on keystrokes recorded using the video-conferencing software Zoom, an accuracy of 93% was achieved, a new best for the medium. Our results prove the practicality of these side channel attacks via off-the-shelf equipment and algorithms. We discuss a series of mitigation methods to protect users against this series of attacks.

Index terms: Acoustic side channel attack, Deep learning, User security and privacy, Laptop keystroke attacks, Zoom-based acoustic attacks

arXiv:2308.01074v1 [cs.CR] 2 Aug 2023

1 Introduction

Side channel attacks (SCAs) involve the collection and interpretation of signals emitted by a device [30]. Such attacks have been successfully implemented utilising a number of emanation types, such as electromagnetic (EM) waves [34], power consumption [17], mobile sensors [23, 22, 21], as well as sound [4]. With such a wide range of available mediums, target devices have been similarly varied, with compromised devices including printers [5], the Enigma machine [32] and even Intel x86 processors [37]. It was found in [34] that wireless keyboards produce detectable and readable EM emanations; however, there exists a far more prevalent emanation that is both ubiquitous and easier to detect: keystroke sounds [27]. The ubiquity of keyboard acoustic emanations not only makes them a readily available attack vector, but also prompts victims to underestimate (and therefore not try to hide) their output. For example, when typing a password, people will regularly hide their screen but will do little to obfuscate their keyboard’s sound. The lack of concern regarding keyboard acoustics could be due to the relatively small body of modern literature. While multiple papers have created models capable of inferring the correct key from test data, these models are often trained and tested on older, thicker, mechanical keyboards with far more pronounced acoustics than modern ones, especially laptops.

While keyboard acoustics have become less pronounced over time, the technology with which they can be accessed and processed has improved dramatically. Examples include advancements in microphone technology, with Voice over Internet Protocol (VoIP) calls [8] and smartwatches [20] being used to collect keystroke recordings.

Deep Learning (DL) is a subfield of machine learning (ML) in which the model consists of multiple layers of connected neurons. Despite being present in the field of computing since the 1960s, DL saw a boom in research in the 2010s, benefiting from improvements in graphics processing technology and resulting in huge advances in image recognition [18], the invention of Generative Adversarial Networks [14] and the invention of transformers [33]. This trend of improving performance continues, with the recent development of the state-of-the-art CoAt Network for image recognition [9], which combines more traditional convolutional models with transformers. This improvement in DL performance coincides with an increase in access to DL tools. Python packages such as PyTorch [26] provide free and near-universal access to the tools required to run these models on most devices.

With the recent developments in both the performance of (and access to) microphones and DL models, an acoustic attack on keyboards begins to look feasible, as reiterated in recent research [6]. While recent papers have explored the viability of acoustic side channel attacks (ASCAs) on laptop keyboards [6, 8], the area remains under-explored considering that laptops make a prime attack vector. Laptops are more transportable than desktop computers and therefore more available in public areas where keyboard acoustics may be overheard, such as libraries, coffee shops and study spaces. Moreover, laptops are non-modular, meaning the same model will have the same keyboard and hence similar keyboard emanations. This uniformity within laptops could mean that, should a popular laptop prove susceptible to ASCA, a large portion of the population could be at risk.

In the early 2000s, it was suggested in many international standards bodies, such as in the 3GPP security architecture [1], that SCA evaluation be included in cryptographic algorithm evaluation. However, due to a lack of testable methods and practical tools, this important suggestion never turned into practical standards and guidelines. There have been many academic attempts, but none led to standardisation. For instance, a NIST report in 2011 [13] proposed a testing methodology to assess whether a cryptographic module utilising side channel analysis countermeasures can provide resistance to these attacks commensurate with the desired security level. In a recent report [2], the authors developed and compared SCA-protected implementations of three finalists in the NIST LWC standardisation process. While there is no specific research dedicated to side channel attack standardisation, there have been industrial attempts to rectify some of the known attacks. For instance, in 2018 Google proposed a new technique to mitigate the infamous Spectre class of attacks. Similarly, Intel added hardware and firmware mitigations to tackle the same range of side channel attacks. Some general guidelines have also been developed. For instance, the NSA TEMPEST includes acoustic emanations as a side channel, but there are limitations in how it defines acoustic emanations in its terminology. The FIPS 140-3 draft, by contrast, does not include acoustic emanations as a side channel, despite the fact that they have been used to extract RSA private keys from CPUs [12]. Despite these efforts, there is no explicit standardisation work on ASC attacks. W3C specifications on sensors (e.g., motion sensors on mobile devices) have a dedicated section on security and privacy considerations which, among other risks, lists keystroke monitoring as one of the possible threats enabled by such sensors. These sensors have been shown to contribute to ASC attacks. The specifications suggest a range of mitigation strategies, though none of them guarantees full protection.

In this paper, we present a practical, fully-automated ASCA which deploys cutting-edge deep learning models to improve the body of knowledge. We address the following research questions: (RQ1) Can we design and implement a fully automated ASCA pipeline, including keystroke separation, feature extraction and prediction? (RQ2) Can we deploy an accurate deep learning approach for ASCA? (RQ3) Can we perform an accurate remote ASCA on VoIP communications considering the compression and information loss in the audio transmissions?

In this paper, we contribute to the body of knowledge in a number of ways. (1) We propose a novel technique to deploy deep learning models featuring self-attention layers for an ASC attack on a keyboard for the first time. (2) We propose and implement a practical deep learning-based acoustic side channel attack on keyboards, using self-attention transformer layers. (3) We evaluate our designed attack in real-world attack scenarios: laptop keyboards in the same room as the attacker microphone (via a mobile device) and laptop keystrokes via a Zoom call. We perform experiments and run multiple evaluations, and our results outperform those of previous work.

2 Related Work

While they remain a relatively under-explored topic of research, ASCAs are not a new concept in the field of cybersecurity. Encryption devices have been subject to emanation-based attacks since the 1950s, with British spies exploiting the acoustic emanations of Hagelin encryption devices (of very similar design to Enigma) within the Egyptian embassy [35]. Additionally, the earliest paper on emanation-based SCAs found by this review was written for the United States’ National Security Agency (NSA) in 1972 [11]. This governmental origin of ASCAs creates speculation that such an attack may already be possible on modern devices, but remains classified. [4] notes that classified documents produced by the NSA’s side channel specification (TEMPEST) are known to discuss acoustic emanations. Additionally, the partially declassified NSA document NACSIM 5000 [24] explicitly listed acoustic emanations as a source of compromise in 1982. Within the realm of public knowledge, ASCAs have seen varying success when applied to modern keyboards, employing a similarly varied array of methods. Surveying these methods, various observations may be made about the current research landscape.

Footnote (W3C sensor specifications): w3.org/TR/generic-sensor/#mitigation-strategies

In the last decade, the number of microphones within acoustic range of keyboards has increased and will likely continue to do so. To explore these attack vectors, recent research has utilised alternative methods of keystroke collection. As an example, in [38], the authors implemented an attack utilising a number of off-the-shelf smartphones. These devices (as is the case for a majority of modern phones) feature 2 distinct microphones at opposite ends of the phone. When used together, recordings made by the two microphones provided sufficient time difference of arrival (TDoA) information to triangulate keystroke position, achieving over 72.2% accuracy. [6] built upon this research by implementing TDoA via a single smartphone in order to establish distance to a target device, eventually achieving 91.52% keystroke accuracy when used within a larger attack pipeline.
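To make the TDoA idea concrete, the following is a minimal sketch of estimating the arrival-time difference between two microphone recordings by cross-correlation. It is not the method of [38] or [6]; the sample rate, the synthetic click and the function name `estimate_delay_samples` are assumptions chosen purely for illustration.

```python
import numpy as np

SAMPLE_RATE = 44_100      # Hz, assumed recording rate
SPEED_OF_SOUND = 343.0    # m/s in air at roughly 20 degrees Celsius


def estimate_delay_samples(mic_a, mic_b):
    """Return how many samples mic_b lags behind mic_a (negative if it leads)."""
    corr = np.correlate(mic_b, mic_a, mode="full")
    # Index len(mic_a) - 1 of the full cross-correlation corresponds to zero lag.
    return int(np.argmax(corr)) - (len(mic_a) - 1)


# Synthetic keystroke "click": a short decaying burst (illustrative only).
n = np.arange(20)
click = np.exp(-n / 4.0) * np.sin(n)

mic_a = np.zeros(400)
mic_b = np.zeros(400)
mic_a[100:120] = click    # click reaches microphone A first
mic_b[107:127] = click    # and microphone B 7 samples later

lag = estimate_delay_samples(mic_a, mic_b)
tdoa_seconds = lag / SAMPLE_RATE
path_difference_m = tdoa_seconds * SPEED_OF_SOUND
print(lag, tdoa_seconds, path_difference_m)
```

The lag converts to a difference in path length between the sound source and each microphone; with several such differences, a source position can be constrained. Real recordings would need noise handling and sub-sample interpolation, which this sketch omits.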

Alongside smartphones, video conferencing applications have seen promising results as an attack vector. Keystrokes intercepted from a VoIP call were used in [3], achieving a keystroke accuracy of 74.3%, and this success was echoed by [8], which achieved a top-5 accuracy of 91.7% by simply calling a victim over Skype. These successes mark the first ASCAs implemented without the need for physical access to a victim’s vicinity and carry the implication that if a victim’s microphone could be accessed covertly, a similar attack could be performed. The same implication can be found with the use of smartwatches as an attack vector. While it remains unlikely an attacker could covertly place their smartwatch in a private location such as an office, compromising a victim’s smartwatch could allow unbridled collection of acoustic keystroke information. Additionally, smartwatches can uniquely access wrist motion, a concerning property which is utilised by [20] to achieve 93.75% word recovery.
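Because top-5 accuracy is a looser metric than plain (top-1) accuracy, the figures above are not directly comparable. The sketch below shows how the two are computed; the ranked guesses are invented for illustration and the function name `top_k_accuracy` is hypothetical.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=5):
    """Fraction of samples whose true label is among the k highest-ranked guesses."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Each inner list is a classifier's guesses for one keystroke, most likely first.
ranked = [
    ["e", "w", "r", "d", "s", "a"],   # correct key ranked 1st
    ["l", "k", "o", "p", "m", "n"],   # correct key ranked 3rd
    ["q", "w", "a", "s", "z", "o"],   # correct key ranked 6th
]
truth = ["e", "o", "o"]

top1 = top_k_accuracy(ranked, truth, k=1)
top5 = top_k_accuracy(ranked, truth, k=5)
print(top1, top5)
```

In this toy case only one of three keystrokes is classified correctly outright, yet two of three count as hits under the top-5 criterion.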

One approach that saw prominent usage in the 2000s but has become less common in modern papers is the use of hidden Markov models (HMMs). An HMM (in this context) is a model trained on a corpus of text in order to predict the most likely word or character at each position of a sequence. For example, if a classifier output ‘Hwllo’, an HMM could be used to infer that ‘w’ was in fact a falsely classified ‘e’. [39] presents an ASCA on keyboards in which two HMMs are utilised: the first generating likely letters from a series of classes and the second correcting the grammar and spelling of the first. Similarly to [39], [5] used an HMM to correct the output of a classifier and saw an increase from 72% to 95% accuracy when implemented. A difference between the two studies, however, sheds light on a potential drawback to HMM usage (and a possible reason for its lack of recent popularity).
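The ‘Hwllo’ example can be reproduced with a toy character-level HMM decoded by the Viterbi algorithm. This is only a sketch and not the model of [39] or [5]: the four-word corpus, the smoothing constant and the assumed 70% per-key classifier accuracy are all illustrative assumptions.

```python
import math
import string
from collections import defaultdict

ALPHABET = string.ascii_lowercase


def train_bigram_model(words, smoothing=0.01):
    """Estimate log start and log transition probabilities from a word list."""
    start = defaultdict(float)
    trans = defaultdict(lambda: defaultdict(float))
    for word in words:
        start[word[0]] += 1
        for prev, cur in zip(word, word[1:]):
            trans[prev][cur] += 1

    def normalise(counts):
        total = sum(counts.get(c, 0.0) for c in ALPHABET) + smoothing * len(ALPHABET)
        return {c: math.log((counts.get(c, 0.0) + smoothing) / total) for c in ALPHABET}

    return normalise(start), {p: normalise(trans[p]) for p in ALPHABET}


def log_emission(observed, true, p_correct=0.7):
    """Assumed classifier noise: right with p_correct, else uniform over other keys."""
    if observed == true:
        return math.log(p_correct)
    return math.log((1.0 - p_correct) / (len(ALPHABET) - 1))


def viterbi_correct(observed, log_start, log_trans):
    """Return the most likely true letter sequence given noisy classifier output."""
    scores = {s: log_start[s] + log_emission(observed[0], s) for s in ALPHABET}
    backpointers = []
    for obs in observed[1:]:
        new_scores, pointers = {}, {}
        for cur in ALPHABET:
            best_prev = max(ALPHABET, key=lambda p: scores[p] + log_trans[p][cur])
            new_scores[cur] = (
                scores[best_prev] + log_trans[best_prev][cur] + log_emission(obs, cur)
            )
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the highest-scoring final state.
    state = max(ALPHABET, key=scores.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


corpus = ["hello", "help", "hell", "held"]   # toy training text
log_start, log_trans = train_bigram_model(corpus)
corrected = viterbi_correct("hwllo", log_start, log_trans)
print(corrected)
```

Here the hidden states are the true keys and the observations are the classifier's outputs; the language statistics outweigh the single misclassified letter. The quality of such a correction depends on the training text, so input that does not resemble the corpus gains little from it.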

In much of the literature, neural networks are not perceived as very successful models for keystroke recognition. In [39], a neural network was tested against a linear classifier and was deemed less accurate. Additionally, in [16] a neural network was found to perform the worst out of all methods tested, and it is noted that neither [39] nor [16] could reproduce the results achieved in [4] through use of a neural network. [3] found that multiple methods performed better than neural networks in testing, while [32] implemented a neural network that performed third best out of all tested classifiers. A majority of these papers give very little detail regarding the structure or size of the neural networks implemented, making comparison between them difficult, but in none of these cases was a neural network selected as the final model. Given that Transformers were introduced in 2017 by Vaswani et al. [33], this paper is the first use of neural networks featuring self-attention layers for an ASC attack on a keyboard.

Alongside models, variety exists between studies with respect to target devices. [4], the paper most commonly cited as the first ASCA targeting a keyboard, was written in 2004 and attacked the high-profile plastic keyboards typical of the time. Despite being such an early paper in the field, it also found success in attacking an ATM keypad, a corded telephone and 2 keys from a laptop keyboard. While [39] and [7] perform their experiments on keyboards similar to those from [4], [16] investigates a more modern keyboard with a slightly recessed design. The keycaps remain large and plastic, however, and differ greatly from modern laptop keyboards. [16]’s authors do acknowledge that testing laptop keyboards may produce different results, due to a lack of ‘release peak’ in the waveform.

Of the surveyed literature, [6] and [8] were the only 2 papers to feature ASCAs on full laptop keyboards and are the most promising studies with respect to real-world implementation. Both papers utilise two statistical models in similar ways: the first to infer some information regarding the victim’s environment and the second to classify keystrokes into letters. The two papers differ in most other ways, however, with [8] gathering keystrokes via Skype and the inbuilt microphone of the laptops, while [6] utilises a mobile phone placed near the victim’s computer. Additionally, [8] uses k-NN clustering and a Logistic Regression classifier while [6] utilises support vector machines (SVMs). Despite their differences, both papers are notable for their accuracy, with [6] achieving 91.2% in cross validation and 72.25% when attacking unknown victims and keyboards. Meanwhile, [8] achieves a top-5 accuracy of 91.7% given knowledge of the victim’s typing style. [6] implements its attack on 2 laptops, made by Alienware and Lenovo respectively, and is notable for being the only study to feature membrane keyboards. [8] presents a much more representative study of keyboards, attacking 6 laptops, two of each of the following models: MacBook Pro 13” 2014, Lenovo ThinkPad E540 and Toshiba Tecra M2.

3 Attack Design

In this section, we discuss the overall design of our proposed ASC attack. We then explain our approach to data collection, feature extraction and model design.

3.1 Fully-automated On-site and Remote ASCA

In both sets of experiments (via phone and Zoom), 36 of the laptop’s keys were used (0-9, a-z), with each key being pressed 25 times in a row, varying in pressure and finger, and recorded to a single file containing all 25 presses.

Keystroke isolation: Once all presses were recorded, a function was implemented with which individual keystrokes could be extracted. Keystroke extraction is executed in a majority of recent literature [39, 7, 16, 6] via a similar method: performing the fast Fourier transform on the recording and summing the coefficients across frequencies to get ‘energy’. An energy threshold is then defined and used to signify the presence of a keystroke. The complete isolation
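The energy-threshold approach used in the prior literature can be sketched roughly as follows. This illustrates the general technique only and is not the authors' isolation function: the frame length, hop size, threshold value and the noise-free synthetic recording are all assumptions.

```python
import numpy as np


def frame_energy(signal, frame_len=256, hop=128):
    """Sum of FFT magnitudes across frequencies for each overlapping frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        energies[i] = np.abs(np.fft.rfft(frame)).sum()
    return energies


def find_keystrokes(signal, threshold, frame_len=256, hop=128):
    """Return start samples of frames where energy first rises above the threshold."""
    active = frame_energy(signal, frame_len, hop) > threshold
    onsets = []
    for i, is_active in enumerate(active):
        if is_active and (i == 0 or not active[i - 1]):
            onsets.append(i * hop)
    return onsets


# Synthetic, noise-free "recording": silence with two short bursts.
recording = np.zeros(8000)
burst = np.sin(0.3 * np.arange(1, 201))
true_onsets = [1000, 5000]
for start in true_onsets:
    recording[start : start + 200] = burst

onsets = find_keystrokes(recording, threshold=1.0)
print(onsets)
```

Each detected onset marks the first frame that overlaps a burst, so it sits slightly before the true start; a fixed-length window cut from that point would then hold one keystroke. On real recordings the threshold would have to be tuned to the background noise level.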
