Everything we know about CRC but afraid to
forget
Andrew Kadatch
1
and Bob Jenkins
2
1
Google Inc.
2
Microsoft Corporation
September 3, 2010
Abstract
This pap er describes a novel interleaved, parallelizeable word-by-word
CRC computation algorithm which computes N -bit CRC (N ≤ 64) on
modern Intel and AMD processors in 1.2 CPU cycles per byte, improv-
ing state of the art over word-by-word 32-bit and 64-bit CRCs (2.1 CPU
cycles/byte) and classic byte-by-byte CRC computation (6-7 CPU cy-
cles/byte). It computes 128-bit CRC in 1.7 CPU cycles/byte.
CRC implementations are heavily optimized and hard to understand.
This paper describes CRC algorithms as they evolved over time, splitting
complex optimizations into a sequence of natural improvements.
This paper also presents a collection of CRC “tricks” that we found
handy on many occassions.
Contents
1 Definition of CRC 2
2 Related work 3
3 CRC tricks and tips 4
3.1 Incremental CRC computation . . . . . . . . . . . . . . . . . . . 4
3.2 Changing initial CRC value . . . . . . . . . . . . . . . . . . . . . 4
3.3 Concatenation of CRCs . . . . . . . . . . . . . . . . . . . . . . . 5
3.4 In-place modification of CRC-ed message . . . . . . . . . . . . . 5
3.5 Storing CRC value after the message . . . . . . . . . . . . . . . . 6
4 Efficient software implementation 7
4.1 Mapping bitstreams to hardware registers . . . . . . . . . . . . . 7
4.2 Multiplication of D-normalized polynomials . . . . . . . . . . . . 7
4.3 Multiplication of unnormalized polynomial . . . . . . . . . . . . . 7
1