【免费】Aardvark协议（BFT）1资源-CSDN文库

需积分: 0 59 浏览量更新于2022-08-03 收藏 429KB PDF 举报

《Aardvark协议（BFT）1》在分布式计算领域，拜占庭容错（Byzantine Fault Tolerance，BFT）是确保系统在存在恶意或错误行为的节点时仍能正常运行的关键技术。Aardvark协议是针对这一问题提出的一种新型BFT协议，旨在提高对拜占庭故障的容忍度，同时保持高效的性能。近年来，虽然已经开发出了一些快速的BFT状态机复制协议，如PBFT（Practical Byzantine Fault Tolerance）、Q/U、HQ和Zyzzyva等，但这些协议在面对单个故障的客户端或服务器时表现得相当脆弱。它们的性能可能会大幅度下降，甚至导致长时间的服务不可用。这种现象与BFT系统的初衷相悖，因为真正的拜占庭容错系统应该能够在出现故障时依然提供可靠服务。本论文首先揭示了现有BFT协议的脆弱性，指出它们在面对拜占庭故障时的不足。然后，它提出了构建BFT服务的一系列原则，以确保即使在发生拜占庭故障时，系统仍能保持实用性。基于这些原则，研究者设计并实现了一种新的BFT协议——Aardvark。 Aardvark协议的设计目标是在保持高效性能的同时，增强对拜占庭故障的容忍度。在测试中，Aardvark协议的峰值性能可以达到最优秀的现有协议的96%，并且在最多f个服务器和任意数量的客户端出现故障时，仍然能提供相当一部分性能。实验结果显示，对于各种注入的故障，Aardvark协议在吞吐量上表现出色，范围在每秒11706到38667个请求之间。 Aardvark协议的核心创新在于其对拜占庭故障的处理策略和优化机制。它采用了更精细的故障检测和恢复机制，以及更为灵活的共识算法，以减少单点故障对整个系统的影响。此外，Aardvark还通过改进通信和验证流程，降低了处理故障时的延迟，提高了系统的可用性。 Aardvark协议是对传统BFT协议的一次重要改进，它不仅提升了系统的容错能力，而且在性能上也取得了显著的提升。这为构建更安全、更可靠的分布式系统提供了新的解决方案，尤其是在那些对容错性和性能有高要求的应用场景中，Aardvark协议具有很大的应用潜力。

USENIX Association NSDI ’09: 6th USENIX Symposium on Networked Systems Design and Implementation 153

Making Byzantine Fault Tolerant Systems

Tolerate Byzantine Faults

Allen Clement, Edmund Wong, Lorenzo Alvisi, Mike Dahlin

The University of Texas at Austin

Mirco Marchetti

The University of Modena and Reggio Emilia

Abstract

This paper argues for a new approach to building Byzan-

tine fault tolerant replication systems. We observe that

although recently developed BFT state machine replica-

tion protocols are quite fast, they don’t tolerate Byzantine

faults very well: a single faulty client or server is capa-

ble of rendering PBFT, Q/U, HQ, and Zyzzyva virtually

unusable. In this paper, we (1) demonstrate that exist-

ing protocols are dangerously fragile, (2) deﬁne a set of

principles for constructing BFT services that remain use-

ful even when Byzantine faults occur, and (3) apply these

principles to construct a new protocol, Aardvark. Aard-

vark can achieve peak performance within 40% of that of

the best existing protocol in our tests and provide a sig-

niﬁcant fraction of that performance when up to f servers

and any number of clients are faulty. We observe useful

throughputs between 11706 and 38667 requests per sec-

ond for a broad range of injected faults.

1 Introduction

This paper is motivated by a simple observation: al-

though recently developed BFT state machine replica-

tion protocols have driven the costs of BFT replication

to remarkably low levels [1, 8, 12, 18], the reality is that

they don’t tolerate Byzantine faults very well. In fact, a

single faulty client or server can render these systems ef-

fectively unusable by inﬂicting multiple orders of mag-

nitude reductions in throughput and even long periods

of complete unavailability. Performance degradations of

such degree are at odds with what one would expect from

a system that calls itself Byzantine fault tolerant—after

all, if a single fault can render a system unavailable, can

that system truly be said to tolerate failures?

To illustrate the the problem, Table 1 shows the mea-

sured performance of a variety of systems both in the

absence of failures and when a single faulty client sub-

mits a carefully crafted series of requests. As we show

later, a wide range of other behaviors—faulty primaries,

recovering replicas, etc.—can have a similar impact. We

believe that these collapses are byproducts of a single-

minded focus on designing BFT protocols with ever

more impressive best-case performance. While this fo-

cus is understandable—after years in which BFT repli-

cation was dismissed as too expensive to be practical,

it was important to demonstrate that high-performance

BFT is not an oxymoron—it has led to protocols whose

complexity undermines robustness in two ways: (1) the

protocols’ design includes fragile optimizations that al-

low a faulty client or server to knock the system off of

the optimized execution path to an expensive alternative

path and (2) the protocols’ implementation often fails to

handle properly all of the intricate corner cases, so that

the implementations are even more vulnerable than the

protocols appear on paper.

The primary contribution of this paper is to advocate a

new approach, robust BFT (RBFT), to building BFT sys-

tems. Our goal is to change the way BFT systems are de-

signed and implemented by shifting the focus from con-

structing high-strung systems that maximize best case

performance to constructing systems that offer accept-

able and predictable performance under the broadest pos-

sible set of circumstances—including when faults occur.

System Peak Throughput Faulty Client

PBFT [8] 61710 0

Q/U [1] 23850 0

†

HQ [12] 7629 N/A

‡

Zyzzyva [18] 65999 0

Aardvark 38667 38667

Table 1: Observed peak throughput of BFT systems in a

fault-free case and when a single faulty client submits a

carefully crafted series of requests. We detail our mea-

surements in Section 7.2.

†

The result reported for Q/U is

for correct clients issuing conﬂicting requests.

‡

The HQ

prototype demonstrates fault-free performance and does

not implement many of the error-handling steps required

to handle inconsistent MACs.

154 NSDI ’09: 6th USENIX Symposium on Networked Systems Design and Implementation USENIX Association

RBFT explicitly considers performance during both

gracious intervals—when the network is synchronous,

replicas are timely and fault-free, and clients correct—

and uncivil execution intervals in which network links

and correct servers are timely, but up to f = 

n−1



servers and any number of clients are faulty. The last

row of Table 1 shows the performance of Aardvark, an

RBFT state machine replication protocol whose design

and implementation are guided by this new philosophy.

In some ways, Aardvark is very similar to traditional

BFT protocols: clients send requests to a primary who

relays requests to the replicas who agree (explicitly or

implicitly) on the sequence of requests and the corre-

sponding results—not unlike PBFT [8], High through-

put BFT [19], Q/U [1], HQ [12], Zyzzyva [18], ZZ [32],

Scrooge [28], etc.

In other ways, Aardvark is very different and chal-

lenges conventional wisdom. Aardvark utilizes signa-

tures for authentication, even though, as Castro correctly

observes, “eliminating signatures and using MACs in-

stead eliminates the main performance bottleneck in pre-

vious systems” [7]. Aardvark performs regular view

changes, even though view changes temporarily prevent

the system from doing useful work. Aardvark utilizes

point to point communication, even though renouncing

IP-multicast gives up throughput deliberately.

We reach these counter-intuitive choices by following

a simple and systematic approach: without ever compro-

mising safety, we deliberately refocus both the design

of the system and the engineering choices involved in

its implementation on the stress that failures can impose

on performance. In applying this strategy for RBFT to

construct Aardvark, we choose an extreme position in-

spired by maxi-min strategies in game theory [26]: we

reject any optimization for gracious executions that can

decrease performance during uncivil executions.

Surprisingly, these counter-intuitive choices impose

only a modest cost on its peak performance. As Table 1

illustrates, Aardvark sustains peak throughput of 38667

requests/second, which is within 40% of the best perfor-

mance we measure on the same hardware for four stat-

of-the-art protocols. At the same time, Aardvark’s fault

tolerance is dramatically improved. For a broad range

of client, primary, and server misbehaviors we prove that

Aardvark’s performance remains within a constant fac-

tor of its best case performance. Testing of the prototype

shows that these changes signiﬁcantly improve robust-

ness under a range of injected faults.

Once again, however, the main contribution of this pa-

per is neither the Aardvark protocol nor implementation.

It is instead a new approach that can—and we believe

should—be applied to the design of other BFT protocols.

In particular, we (1) demonstrate that existing protocols

and their implementations are fragile, (2) argue that BFT

protocols should be designed and implemented with a fo-

cus on robustness, and (3) use Aardvark to demonstrate

that the RBFT approach is viable: we gain qualitatively

better performance during uncivil intervals at only mod-

est cost to performance during gracious intervals.

In Section 2 we describe our system model and the

guarantees appropriate for high assurance systems. In

Section 3 we elaborate on the need to rethink Byzan-

tine fault tolerance and identify a set of design principles

for RBFT systems. In Section 4 we present a system-

atic methodology for designing RBFT systems and an

overview of Aardvark. In Section 5 we describe in detail

the important components of the Aardvark protocol. In

Section 6 we present an analysis of Aardvark’s expected

performance. In Section 7 we present our experimental

evaluation. In Section 8 we discuss related work.

2 System model

We assume the Byzantine failure model where faulty

nodes (servers or clients) can behave arbitrarily [21] and

a strong adversary can coordinate faulty nodes to com-

promise the replicated service. We do, however, assume

the adversary cannot break cryptographic techniques like

collision-resistant hashing, message authentication codes

(MACs), encryption, and signatures. We denote a mes-

sage X signed by principal p’s public key as X

. We

denote a message X with a MAC appropriate for princi-

pals p and r as X

r,p

. We denote a message containing

a MAC authenticator—an array of MACs appropriate for

veriﬁcation by every replica—as X

µ

Our model puts no restriction on clients, except that

their number be ﬁnite: in particular, any number of

clients can be arbitrarily faulty. However, the system’s

safety and liveness properties are guaranteed only if at

most f = 

n−1

 servers are faulty.

Finally, we assume an asynchronous network where

synchronous intervals, during which messages are deliv-

ered with a bounded delay, occur inﬁnitely often.

Deﬁnition 1 (Synchronous interval). During a syn-

chronous interval any message sent between correct pro-

cesses is delivered within a bounded delay T if the sender

retransmits according to some schedule until it is deliv-

ered.

3 Recasting the problem

The foundation of modern BFT state machine replication

rests on an impossibility result and on two principles that

assist us in dealing with it. The impossibility result, of

course, is FLP [13], which states that no solution to con-

sensus can be both safe and live in an asynchronous sys-

tems if nodes can fail. The two principles, ﬁrst applied

by Lamport to his Paxos protocol [20], are at the core

of Castro and Liskov’s seminal work on PBFT [7]. The

ﬁrst states that synchrony must not be needed for safety:

USENIX Association NSDI ’09: 6th USENIX Symposium on Networked Systems Design and Implementation 155

as long as a threshold of faulty servers is not exceeded,

the replicated service must always produce linearizable

executions, independent of whether the network loses,

reorders, or arbitrarily delays messages. The second rec-

ognizes, given FLP, that synchrony must play a role in

liveness: clients are guaranteed to receive replies to their

requests only during intervals in which messages sent to

correct nodes are received within some ﬁxed (but poten-

tially unknown) time interval from when they are sent.

Within these boundaries, the engineering of BFT pro-

tocols has embraced Lampson’s well-known recommen-

dation: “Handle normal and worst case separately as a

rule because the requirements for the two are quite dif-

ferent. The normal case must be fast. The worst case

must make some progress” [22]. Ever since PBFT, the

design of BFT systems has then followed a predictable

pattern: ﬁrst, characterize what deﬁnes the normal (com-

mon) case; then, pull out all the stops to make the system

perform well for that case. While different systems don’t

completely agree on what deﬁnes the common case [16],

on one point they are unanimous: the common case in-

cludes only gracious executions, deﬁned as follows:

Deﬁnition 2 (Gracious execution). An execution is gra-

cious iff (a) the execution is synchronous with some

implementation-dependent short bound on message de-

lay and (b) all clients and servers behave correctly.

The results of this approach continue to be spectac-

ular. Since Zyzzyva last year reported a throughput of

over 85,000 null requests per second [18], several new

protocols have further improved on that mark [16, 28].

Despite these impressive results, we argue that a sin-

gle minded focus on aggressively tuning BFT systems

for the best case of gracious execution, a practice that

we have engaged in with relish [18], is increasingly mis-

guided, dangerous, and even futile.

It is misguided, because it encourages the design and

implementation of systems that fail to deliver on their ba-

sic promise: to tolerate Byzantine faults. While provid-

ing impressive throughput during gracious executions,

today’s high-performance BFT systems are content to

guaranteeing weak liveness guarantees (e.g. “eventual

progress”) in the presence of Byzantine failures. Unfor-

tunately, as we previewed in Figure 1 and show in detail

in Section 7.2, these guarantees are weak indeed. Al-

though current BFT systems can survive Byzantine faults

without compromising safety, we contend that a system

that can be made completely unavailable by a simple

Byzantine failure can hardly be said to tolerate Byzan-

tine faults.

It is dangerous, because it encourages fragile opti-

mizations. Fragile optimizations are harmful in two

ways. First, as we will see in Section 7.2, they make it

easier for a faulty client or server to knock the system off

its hard-won optimized execution path and enter an alter-

native, much more expensive one. Second, they weigh

down the system with subtle corner cases, increasing the

likelihood of buggy or incomplete implementations.

It is (increasingly) futile, because the race to optimize

common case performance has reached a point of dimin-

ishing return where many services’ peak demands are al-

ready far under the best-case throughput offered by ex-

isting BFT replication protocols. For such systems, good

enough is good enough, and further improvements in best

case agreement throughput will have little effect on end-

to-end system performance.

In our view, a BFT system fulﬁlls its obligations

when it provides acceptable and dependable performance

across the broadest possible set of executions, including

executions with Byzantine clients and servers. In par-

ticular, the temptation of fragile optimizations should be

resisted: a BFT system should be designed around an

execution path that has three properties: (1) it provides

acceptable performance, (2) it is easy to implement, and

(3) it is robust against Byzantine attempts to push the sys-

tem away from it. Optimizations for the common case

should be accepted only as long as they don’t endanger

these properties.

FLP tells us that we cannot guarantee liveness in an

asynchronous environment. This is no excuse to cling to

gracious executions only. In particular, there is no theo-

retical reason why BFT systems should not be expected

to perform well in what we call uncivil executions:

Deﬁnition 3 (Uncivil execution). An execution is

uncivil iff (a) the execution is synchronous with some

implementation-dependent short bound on message de-

lay, (b) up to f servers and an arbitrary number of clients

are Byzantine, and (c) all remaining clients and servers

are correct.

Hence, we propose to build RBFT systems that pro-

vide adequate performance during uncivil executions.

Although we recognize that this approach is likely to re-

duce the best case performance, we believe that for a

BFT system a limited reduction in peak throughput is

preferable to the devastating loss of availability that we

report in Figure 1 and Section 7.2.

Increased robustness may come at effectively no ad-

ditional cost as long as a service’s peak demand is be-

low the throughput achievable through RBFT design:

as a data point, our Aardvark prototype reaches a peak

throughput of 38667 req/s.

Similarly, when systems have other bottlenecks, Am-

dahl’s law limits the impact of changing the performance

of agreement. For example, we report in Section 7 that

PBFT can execute almost 62,000 null requests per sec-

ond, suggesting that agreement consumes 16.1µs per re-

quest. If, rather than a null service, we replicate a service

156 NSDI ’09: 6th USENIX Symposium on Networked Systems Design and Implementation USENIX Association

for which executing an average request consumes 100µs

of processing time, then peak throughput with PBFT set-

tles to about 8613 requests per second. For the same ser-

vice, a protocol with twice the agreement overhead of

PBFT (i.e., 32.2µs per request), would still achieve peak

throughput of about 7564 requests/second: in this hy-

pothetical example, doubling agreement overhead would

reduce peak end-to-end throughput by about 12%.

4 Aardvark: RBFT in action

Aardvark is a new BFT system designed and imple-

mented to be robust to failures. The Aardvark pro-

tocol consists of 3 stages: client request transmission,

replica agreement, and primary view change. This is the

same basic structure of PBFT [8] and its direct descen-

dants [4, 18, 19, 33, 32], but revisited with the goal of

achieving an execution path that satisﬁes the properties

outlined in the previous section: acceptable performance,

ease of implementation, and robustness against Byzan-

tine disruptions. To avoid the pitfalls of fragile opti-

mizations, we focus at each stage of the protocol on how

faulty nodes, by varying both the nature and the rate of

their actions and omissions, can limit the ability of cor-

rect nodes to perform in a timely fashion what the proto-

col requires of them. This systematic methodology leads

us to the three main design differences between Aardvark

and previous BFT systems: (1) signed client requests, (2)

resource isolation, and (3) regular view changes.

Signed client requests. Aardvark clients use digital

signatures to authenticate their requests. Digital signa-

tures provide non-repudiation and ensure that all correct

replicas make identical decisions about the validity of

each client request, eliminating a number of expensive

and tricky corner cases found in existing protocols that

make use of weaker (though faster) message authentica-

tion code (MAC) authenticators [7] to authenticate client

requests. The difﬁculty with utilizing MAC authentica-

tors is that they do not provide the non-repudiation prop-

erty of digital signatures—one node validating a MAC

authenticator does not guarantee that any other nodes

will validate that same authenticator [2].

As we mentioned in the Introduction, digital signa-

tures are generally seen as too expensive to use. Aard-

vark uses them only for client requests where it is pos-

sible to push the expensive act of generating the signa-

ture onto the client while leaving the servers with the

less expensive veriﬁcation operation. Primary-to-replica,

replica-to-replica, and replica-to-client communication

rely on MAC authenticators. The quorum-driven nature

of server-initiated communication ensures that a single

faulty replica is unable to force the system into undesir-

able execution paths.

Because of the additional costs associated with verify-

ing signatures in place of MACs, Aardvark must guard

Replica

Clients

Figure 1: Physical network in Aardvark.

against new denial-of-service attacks where the system

receives a large numbers of requests with signatures that

need to be veriﬁed. Our implementation limits the num-

ber of signature veriﬁcations a client can inﬂict on the

system by (1) utilizing a hybrid MAC-signature construct

to put a hard limit on the number of faulty signature veri-

ﬁcations a client can inﬂict on the system and (2) forcing

a client to complete one request before issuing the next.

Resource isolation. The Aardvark prototype imple-

mentation explicitly isolates network and computational

resources.

As illustrated by Fig. 1, Aardvark uses separate net-

work interface controllers (NICs) and wires to connect

each pair of replicas. This step prevents a faulty server

from interfering with the timely delivery of messages

from good servers, as happened when a single broken

NIC shut down the immigration system at the Los An-

geles International Airport [9]. It also allows a node to

defend itself against brute force denial of service attacks

by disabling the offending NIC. However, using phys-

ically separate NICs for communication between each

pair of servers incurs a performance hit, as Aardvark can

no longer use hardware multicast to optimize all-to-all

communication.

As Figure 2 shows, Aardvark uses separate work

queues for processing messages from clients and indi-

vidual replicas. Employing a separate queue for client

requests prevents client trafﬁc from drowning out the

replica-to-replica communications required for the sys-

tem to make progress. Similarly, employing a sepa-

rate queue for each replica allows Aardvark to sched-

ule message processing fairly, ensuring that a replica is

able to efﬁciently gather the quorums it needs to make

progress. Aardvark can also easily leverage separate

hardware threads to process incoming client and replica

requests. Taking advantage of hardware parallelism al-

lows Aardvark to reclaim part of the costs paid to verify

signatures on client requests.

剩余15页未读，继续阅读

身份认证购VIP最低享 7 折!

30元优惠券

资源推荐

资源评论

马李灵珊

粉丝: 41
资源: 297

Aardvark协议（BFT）1

蜜獾BFT协议 1

Aardvark

aardvark最新驱动v2.16

aardvark烧录器驱动

aardvark-api-windows-x86_64-v5.15.zip

aardvark-v4.00用户手册.pdf

swift-Aardvark是一个库能够很容易创建可操作的错误报告

aardvark:Aardvark编程语言

f-111_aardvark.zip

VB.NET PMBUS RS485-IIC 串口穿透及AARDVARK PMBUS IIC

Aardvark是一个库，它使创建可行的错误报告变得非常简单。-Swift开发

Infrastructure-aardvark-proxy：Aardvark代理中间人

aardvark_topsites_multiple.nasl

aardvark:OpenTSDB视图

Aardvark2-crx插件

Aardvark-开源

aardvark-开源

用于全相位 Aardvark I2C/SPI 接口的仪器控制工具箱支持包：使用全相位 Aardvark I2C/SPI 板与 I2C 或 SPI 设备通信。-matlab开发

aardvark.docs:结合了aardvark平台提供的多个软件包的简单示例。 每个平台存储库都带有单独的示例-在这里，我们使用aardvark.rendering和aardvark.media收集总体示例。

aardvark.vr：开源Aardvark渲染引擎的虚拟现实扩展

129) Aardvark - BuddyPress、会员与社区主题 v4.54.zip

TotalPhaseUSB-v3.0.0.rar

Aardvark.OpenImageDenoise：英特尔Open Image Denoise API的Aardvark绑定

caardvark：Aardvark中的扑克牌

apm：Aardvark的经理

【PHP项目源码-毕业设计期末大作业】Aardvark Topsites PHP v5.2.0

STM32通过I2C在线升级(En)1

aardvark.semantictextonforests:语义文本森林的实现（包括用C ++ CLI编写的libsvm的.NET包装器）

ADC08D1520CVAL评估板用户指南

最新资源

aardvark.docs:结合了aardvark平台提供的多个软件包的简单示例。每个平台存储库都带有单独的示例-在这里，我们使用aardvark.rendering和aardvark.media收集总体示例。