A Compact Rijndael Hardware Architecture
with S-Box Optimization
Akashi Satoh, Sumio Morioka, Kohji Takano, and Seiji Munetoh
IBM Research, Tokyo Research Laboratory, IBM Japan Ltd., 1623-14,
Shimotsuruma, Yamato-shi, Kanagawa 242-8502, Japan
{akashi,e02716,chano,munetoh}@jp.ibm.com
Abstract. Compact and high-speed hardware architectures and logic
optimization methods for the AES algorithm Rijndael are described.
Encryption and decryption data paths are combined and all arithmetic
components are reused. By introducing a new composite field, the S-Box
structure is also optimized. An extremely small size of 5.4 Kgates is obtained
for a 128-bit key Rijndael circuit using a 0.11-µm CMOS standard
cell library. It requires only 0.052 mm² of area to support both encryption
and decryption with 311 Mbps throughput. By making effective use
of the SPN parallel feature, the throughput can be boosted up to 2.6
Gbps for a high-speed implementation whose size is 21.3 Kgates.
1 Introduction
DES (Data Encryption Standard) [14,1], which is a common-key block cipher
for US federal information processing standards, has also been used as a de
facto standard for more than 20 years. NIST (National Institute of Standards
and Technology) has selected Rijndael [2] as the new Advanced Encryption Standard
(AES) [13]. Many hardware architectures for Rijndael were proposed and their
performances were evaluated by using ASIC libraries [8,18,10,9] and FPGAs [3,
17,6,11,5]. However, they are simple implementations according to the Rijndael
specification, and none are yet small enough for practical use. The AES has to be
embeddable not only in high-end servers but also in low-end consumer products
such as mobile terminals. Therefore, sharing and reusing hardware resources,
and compressing the gate logic are indispensable to produce a small Rijndael
circuit.
The SPN structure of Rijndael is suitable for highly parallel processing, but
it usually requires more hardware resources compared with the Feistel structure
used in many other ciphers developed after DES. This is because all data is
encoded in each round of Rijndael processing, while only half of the data is processed
at once in DES. In addition, Rijndael has two separate data paths for encryption
and decryption.
In this paper, we describe a compact data path architecture for Rijndael,
where the hardware resources are efficiently shared between encryption and
decryption. The key arithmetic component S-Box has been implemented using
look-up table logic or ROMs in the previous approaches, which requires a lot
of hardware support. Reference [16] proposed the use of composite field
arithmetic to reduce the computation cost of the S-Box, but no detailed hardware
implementation was provided. Therefore, we propose a methodology to optimize
the S-Box by introducing a new composite field, and show its advantages in
comparison to the previous work.
C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 239–254, 2001.
© Springer-Verlag Berlin Heidelberg 2001
2 Rijndael Algorithm
Fig. 1 shows a Rijndael encryption process for 128-bit plain text data string and
a 128-bit secret key, with the number of rounds set to 10. These numbers are
used throughout this paper, including for our hardware implementation. Each
round and the initial stage requires a 128-bit round key, and thus 11 sets of round
keys are generated from the secret key. The input data is arranged as a 4 × 4
matrix of bytes. The primitive functions SubBytes, ShiftRows and MixColumns
are based on byte-oriented arithmetic, and AddRoundKey is a simple 128-bitwise
XOR operation.
SubBytes is a nonlinear transformation that uses 16 byte substitution tables
(S-Boxes). An S-Box computes the multiplicative inverse in the Galois field
$GF(2^8)$ followed by an affine transformation. In the decryption process, the
affine transformation is executed prior to the inversion. The irreducible
polynomial used by a Rijndael S-Box is
$$m(x) = x^8 + x^4 + x^3 + x + 1. \tag{1}$$
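The S-Box construction described above can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware structure proposed in this paper: the function names are hypothetical, the inverse is computed by brute-force exponentiation, and the affine constants ({63} for encryption, {05} for decryption) are taken from the AES specification rather than from the text above.

```python
# Software model of the Rijndael S-Box: inverse in GF(2^8), then affine.
# Names are illustrative; affine constants come from the AES specification.

M = 0x11B  # m(x) = x^8 + x^4 + x^3 + x + 1, equation (1)

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes as polynomials over GF(2), reduced modulo m(x)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= M
        b >>= 1
    return r

def gf_inv(a: int) -> int:
    """Multiplicative inverse computed as a^254; 0 is mapped to 0."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def rotl8(x: int, n: int) -> int:
    """Rotate a byte left by n bit positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def affine(x: int) -> int:
    """Affine transformation used after the inversion (encryption)."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63

def inv_affine(x: int) -> int:
    """Inverse affine transformation used before the inversion (decryption)."""
    return rotl8(x, 1) ^ rotl8(x, 3) ^ rotl8(x, 6) ^ 0x05

def sbox(x: int) -> int:
    return affine(gf_inv(x))

def inv_sbox(x: int) -> int:
    return gf_inv(inv_affine(x))
```

Note that both directions share the same `gf_inv` step and differ only in where the affine map is applied, which is what makes a merged encryption/decryption S-Box attractive.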
ShiftRows is a cyclic shift operation of the last three rows by different offsets.
MixColumns treats the 4-byte data in each column as coefficients of a 4-term
polynomial, and multiplies the data modulo $x^4 + 1$ with the fixed polynomial
given by
$$c(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}. \tag{2}$$
In the decryption process, InvMixColumns multiplies each column with the
polynomial
$$c^{-1}(x) = \{0B\}x^3 + \{0D\}x^2 + \{09\}x + \{0E\} \tag{3}$$
and InvShiftRows shifts the last three rows in the opposite direction from
ShiftRows.
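As a minimal sketch of equations (2) and (3), the following Python model multiplies a column by a fixed polynomial modulo $x^4 + 1$, with coefficients in $GF(2^8)$. The helper names and the convention that list index k holds the coefficient of $x^k$ (so index 0 is the top byte of the column) are assumptions made for this illustration.

```python
# Polynomial form of MixColumns / InvMixColumns, modulo x^4 + 1.
# Index k of each list holds the coefficient of x^k (an assumed convention).

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

C = [0x02, 0x01, 0x01, 0x03]      # c(x), equation (2)
C_INV = [0x0E, 0x09, 0x0D, 0x0B]  # c^-1(x), equation (3)

def poly_mul_mod(p, q):
    """Multiply two 4-term polynomials and reduce modulo x^4 + 1."""
    out = [0, 0, 0, 0]
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            # x^(i+j) wraps around because x^4 = 1 modulo x^4 + 1
            out[(i + j) % 4] ^= gf_mul(pi, qj)
    return out

def mix_column(col):
    return poly_mul_mod(C, col)

def inv_mix_column(col):
    return poly_mul_mod(C_INV, col)
```

Multiplying `C` by `C_INV` with `poly_mul_mod` yields the constant polynomial {01}, which confirms that the two polynomials are inverses of each other modulo $x^4 + 1$.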
The key expander in Fig. 1 generates 11 sets of 128-bit round keys from one
128-bit secret key by using a 4-byte S-Box. These round keys can be prepared on
the fly in parallel with the encryption process. In the decryption process, these
sets of keys are used in reverse order. Therefore, all keys have to be generated and
stored in registers in advance, or the final round key in the encryption process
has to be pre-calculated for on-the-fly key scheduling. Because the first method
requires the equivalent of a 1,408-bit register (128 bits × 11), and is not suitable
for compact hardware, the second approach was chosen for the implementation
described in the next section. Rcon[i] in Fig. 1 is a 4-byte value, and the lower
3 bytes are 0 for all i, and the highest byte is the bit representation of the
polynomial $x^i \bmod m(x)$.

[Fig. 1. Encryption process of Rijndael algorithm. The figure shows the
encryption block (AddRoundKey, SubBytes, ShiftRows, MixColumns) and the key
expander (S-Box, <<8, Rcon[1] to Rcon[10]), which take a 128-bit plain text and
a 128-bit secret key, generate eleven 128-bit round keys, and produce a 128-bit
cipher text.]
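The two key scheduling directions discussed above can be modeled in software as follows. This is a sketch under stated assumptions, not the circuit of this paper: the function names are hypothetical, the S-Box is computed by brute force, and the round constants are listed in the indexing of the AES specification, where the high byte of Rcon[1] is {01} and that of Rcon[10] is {36}.

```python
# Forward and backward 128-bit key schedule (illustrative model only).

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def sbox(x):
    """Inverse in GF(2^8) (0 maps to 0), then the affine transformation."""
    inv = 1
    for _ in range(254):  # x^254 equals x^-1 for nonzero x
        inv = gf_mul(inv, x)
    rot = lambda v, n: ((v << n) | (v >> (8 - n))) & 0xFF
    return inv ^ rot(inv, 1) ^ rot(inv, 2) ^ rot(inv, 3) ^ rot(inv, 4) ^ 0x63

SBOX = [sbox(x) for x in range(256)]
# High bytes of Rcon[1]..Rcon[10] in the AES specification's indexing.
RC = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def g(word, rc):
    """Rotate the word by one byte (<<8), apply the S-Box, add the constant."""
    out = [SBOX[b] for b in word[1:] + word[:1]]
    out[0] ^= rc
    return out

def next_round_key(key, rc):
    """Derive round key r+1 from round key r (encryption direction)."""
    w = [key[i:i + 4] for i in range(0, 16, 4)]
    n0 = xor(w[0], g(w[3], rc))
    n1 = xor(w[1], n0)
    n2 = xor(w[2], n1)
    n3 = xor(w[3], n2)
    return n0 + n1 + n2 + n3

def prev_round_key(key, rc):
    """Recover round key r from round key r+1 (decryption direction)."""
    w = [key[i:i + 4] for i in range(0, 16, 4)]
    p3 = xor(w[3], w[2])
    p2 = xor(w[2], w[1])
    p1 = xor(w[1], w[0])
    p0 = xor(w[0], g(p3, rc))
    return p0 + p1 + p2 + p3

def final_round_key(secret):
    """Run the schedule forward ten times to obtain the last round key."""
    key = list(secret)
    for rc in RC:
        key = next_round_key(key, rc)
    return key
```

Only one 128-bit key register is needed in either direction: decryption starts from the value returned by `final_round_key` and applies `prev_round_key` with the constants in reverse order.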
3 Data Path Architecture
3.1 Data Path Sharing between Encryption and Decryption
In order to minimize the size of our Rijndael hardware, resource sharing in the
data path is fully employed as shown in Fig. 2. This circuit can execute both
encryption and decryption. The 128-bit data (4 × 4 bytes) block is divided into
four 32-bit columns, and is processed column by column through the 32-bit data
bus. Therefore one round takes 4 clock cycles. It is not a good idea to make
the bus width smaller than 32 bits, because the MixColumns operation needs
32 bits of data at one time. A smaller bus requires more registers and selectors,
and resource sharing is hindered, resulting in an inefficient implementation.
The “Enc/Dec block” has 16-byte data registers, and they execute ShiftRows
(or InvShiftRows) operations by themselves. Each 4-byte column is transformed
by four parallel S-Boxes as SubBytes (or InvSubBytes). The order of ShiftRows
and SubBytes is different from that in Fig. 1, though this does not affect the
operations’ results.
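The claim that the order of ShiftRows and SubBytes does not matter can be checked with a small model. SubBytes acts on each byte independently and ShiftRows only moves bytes, so the two commute for any byte substitution table. The table below is an arbitrary stand-in chosen for this illustration, not the Rijndael S-Box, and the state is assumed to be stored as a list of four rows.

```python
# ShiftRows and a bytewise substitution commute (illustrative check).

def shift_rows(state):
    """Rotate row r of a 4x4 byte matrix left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def sub_bytes(state, table):
    """Substitute every byte independently through the given table."""
    return [[table[b] for b in row] for row in state]

# Stand-in substitution table (NOT the Rijndael S-Box); any table works here.
TABLE = [(7 * b + 3) % 256 for b in range(256)]

# Example state with distinct bytes so that misplaced entries would show up.
STATE = [[4 * r + c for c in range(4)] for r in range(4)]

same = sub_bytes(shift_rows(STATE), TABLE) == shift_rows(sub_bytes(STATE, TABLE))
```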
Selectors change the circuit state between encryption and decryption. The
data path
$$\delta \to x^{-1} \to \delta^{-1}\ \text{and affine} \to \text{MixColumns}$$
is selected for encryption, and the path
$$\text{affine}^{-1}\ \text{and}\ \delta \to x^{-1} \to \delta^{-1} \to \text{InvMixColumns}$$
is used for decryption. $\delta$ and $\delta^{-1}$ are isomorphism functions for field conversions.
Details are described in Section 4.
By moving InvMixColumns from the front of each S-Box to the back,
MixColumns and InvMixColumns can be merged and some selectors are eliminated.
As a result, the circuit size and the critical path length are reduced. An
additional InvMixColumns is required in the key expander, but the area impact is
minor.
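One way to read the need for an extra InvMixColumns in the key expander is through linearity: InvMixColumns applied to data XORed with a round key equals InvMixColumns of the data XORed with InvMixColumns of the key. Reordering InvMixColumns and AddRoundKey therefore requires a transformed round key. The sketch below checks this property on sample columns; the names and the sample values are illustrative assumptions, and the matrix is the standard InvMixColumns matrix corresponding to equation (3).

```python
# Linearity of InvMixColumns over XOR (illustrative check).

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

# Matrix form of multiplication by c^-1(x) modulo x^4 + 1.
INV_ROWS = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]

def inv_mix_column(col):
    """Apply InvMixColumns to one 4-byte column."""
    out = []
    for row in INV_ROWS:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

DATA = [0x8E, 0x4D, 0xA1, 0xBC]  # sample data column
KEY = [0x2B, 0x7E, 0x15, 0x16]   # sample round key column

lhs = inv_mix_column(xor(DATA, KEY))
rhs = xor(inv_mix_column(DATA), inv_mix_column(KEY))
```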
3.2 S-Box Sharing with Key Expander
The key expander reuses the S-Boxes in the encryption/decryption block to
generate a 128-bit key in each round. The S-Boxes are used once by the key
expander, and four times by the encryption/decryption block, for a total of five
times in every round. While the key expander uses the S-Boxes, the ShiftRows (or
InvShiftRows) operation is executed simultaneously. As shown in Fig. 1, only the
AddRoundKey operation is executed in the initial round, and the MixColumns
(or InvMixColumns for decryption) is omitted in the final round. This operation
switching is carried out by controlling the 4:1 selector at the bottom of Fig. 2.
The first round key used in AddRoundKey is the initial key data stored in the
key registers, and a transformation with the S-Boxes is not necessary. Therefore
the first round takes four cycles, and the entire encryption process takes 54 (=
4 + 5 × 10) cycles. The decryption process also takes 54 cycles. When a new
secret key is provided, the key expander takes 10 cycles to generate the initial
decryption key, which is the final round key in the encryption.
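The cycle count above can be written out as a short calculation. The throughput helper is an added illustration that assumes one 128-bit block is completed every 54 cycles with no overlap between blocks; the clock frequency is a hypothetical parameter, not a figure taken from this section.

```python
# Cycle count per block and the throughput it implies (illustrative).

def cycles_per_block(rounds: int = 10,
                     first_round_cycles: int = 4,
                     cycles_per_round: int = 5) -> int:
    """4 cycles for the first round, then 5 cycles (1 key + 4 data) per round."""
    return first_round_cycles + cycles_per_round * rounds

def throughput_mbps(clock_mhz: float, block_bits: int = 128) -> float:
    """Throughput in Mbps for a hypothetical clock, assuming no block overlap."""
    return block_bits * clock_mhz / cycles_per_block()
```

For example, `throughput_mbps(100.0)` gives the rate for a hypothetical 100 MHz clock, which is 128 × 100 / 54 Mbps.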
As described in Section 2, Rcon[i] is a 4-byte constant value, and the highest
order byte is generated by modular multiplication on $GF(2^8)$. The circuit RC in
Fig. 3 generates the constant values sequentially during the encryption process,
starting from {01}, and $RC^{-1}$ calculates the same values in reverse order from
{36}. These circuits are also merged as shown in this figure.
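A software analogue of the RC and RC^-1 circuits is sketched below, assuming that each step multiplies or divides the current byte by x modulo m(x). The function names are hypothetical and the code says nothing about how the two circuits are merged in Fig. 3.

```python
# Sequential generation of the round constant byte, forward and backward.

M = 0x11B  # m(x) = x^8 + x^4 + x^3 + x + 1

def rc_next(v: int) -> int:
    """Multiply by x modulo m(x) (encryption direction)."""
    v <<= 1
    if v & 0x100:
        v ^= M
    return v

def rc_prev(v: int) -> int:
    """Divide by x modulo m(x) (decryption direction)."""
    if v & 1:
        v ^= M  # adding m(x) clears the constant term so the shift is exact
    return v >> 1

def rc_sequence(start: int, step, count: int = 10):
    """Return `count` constants beginning at `start`, applying `step` in between."""
    values = [start]
    for _ in range(count - 1):
        values.append(step(values[-1]))
    return values

FORWARD = rc_sequence(0x01, rc_next)   # starts from {01}
BACKWARD = rc_sequence(0x36, rc_prev)  # starts from {36}
```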