Intel® 64 and IA-32
Architectures
Software Developer’s Manual
Volume 2B:
Instruction Set Reference, N-Z
NOTE: The Intel 64 and IA-32 Architectures Software
Developer's Manual consists of five volumes: Basic Architecture,
Order Number 253665; Instruction Set Reference A-M, Order
Number 253666; Instruction Set Reference N-Z, Order Number
253667; System Programming Guide, Part 1, Order Number
253668; System Programming Guide, Part 2, Order Number
253669. Refer to all five volumes when evaluating your design
needs.
Order Number: 253667-035US
June 2010
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,
EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED
BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH
PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED
WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES
RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY
PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR
INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A
SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers
must not rely on the absence or characteristics of any features or instructions marked "reserved" or
"undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for
conflicts or incompatibilities arising from future changes to them. The information here is subject to
change without notice. Do not finalize a design with this information.
The Intel® 64 architecture processors may contain design defects or errors known as errata. Current
characterized errata are available on request.
Intel® Hyper-Threading Technology requires a computer system with an Intel® processor supporting
Hyper-Threading Technology and an Intel® HT Technology enabled chipset, BIOS and operating system.
Performance will vary depending on the specific hardware and software you use. For more information, see
http://www.intel.com/technology/hyperthread/index.htm; including details on which processors support Intel HT
Technology.
Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, virtual
machine monitor (VMM) and for some uses, certain platform software enabled for it. Functionality,
performance or other benefits will vary depending on hardware and software configurations. Intel®
Virtualization Technology-enabled BIOS and VMM applications are currently in development.
64-bit computing on Intel architecture requires a computer system with a processor, chipset, BIOS,
operating system, device drivers and applications enabled for Intel® 64 architecture. Processors will not
operate (including 32-bit operation) without an Intel® 64 architecture-enabled BIOS. Performance will vary
depending on your hardware and software configurations. Consult with your system vendor for more
information.
Enabling Execute Disable Bit functionality requires a PC with a processor with Execute Disable Bit capability
and a supporting operating system. Check with your PC manufacturer on whether your system delivers
Execute Disable Bit functionality.
Intel, Pentium, Intel Xeon, Intel NetBurst, Intel Core, Intel Core Solo, Intel Core Duo, Intel Core 2 Duo,
Intel Core 2 Extreme, Intel Pentium D, Itanium, Intel SpeedStep, MMX, Intel Atom, and VTune are
trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other
countries.
*Other names and brands may be claimed as the property of others.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing
your product order.
Copies of documents which have an ordering number and are referenced in this document, or other Intel
literature, may be obtained by calling 1-800-548-4725, or by visiting Intel’s website at http://www.intel.com
Copyright © 1997-2010 Intel Corporation
CHAPTER 4
INSTRUCTION SET REFERENCE, N-Z
4.1 IMM8 CONTROL BYTE OPERATION FOR PCMPESTRI /
PCMPESTRM / PCMPISTRI / PCMPISTRM
The notations introduced in this section are referenced in the reference pages of
PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM. The operation of the immediate
control byte is common to these four string text processing instructions of SSE4.2.
This section describes the common operations.
4.1.1 General Description
The operation of PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM is defined by
the combination of the respective opcode and the interpretation of an immediate
control byte that is part of the instruction encoding.
The opcode controls the relationship of input bytes/words to each other (it determines
whether the inputs are terminated strings or whether lengths are expressed explicitly) as
well as the desired output (index or mask).
The Imm8 Control Byte for PCMPESTRM/PCMPESTRI/PCMPISTRM/PCMPISTRI
encodes a significant amount of programmable control over the functionality of those
instructions. Some functionality is unique to each instruction while some is common
across some or all of the four instructions. This section describes functionality which
is common across the four instructions.
The arithmetic flags (ZF, CF, SF, OF, AF, PF) are set as a result of these instructions.
However, the meanings of the flags have been overloaded from their typical
meanings in order to provide additional information regarding the relationships of the
two inputs.
PCMPxSTRx instructions perform arithmetic comparisons between all possible pairs
of bytes or words, one from each packed input source operand. The boolean results
of those comparisons are then aggregated in order to produce meaningful results.
The Imm8 Control Byte is used to affect the interpretation of individual input
elements as well as control the arithmetic comparisons used and the specific
aggregation scheme.
Specifically, the Imm8 Control Byte consists of bit fields that control the following
attributes:
• Source data format — Byte/word data element granularity, signed or unsigned
elements
• Aggregation operation — Encodes the mode of per-element comparison
operation and the aggregation of per-element comparisons into an intermediate
result
• Polarity — Specifies intermediate processing to be performed on the
intermediate result
• Output selection — Specifies final operation to produce the output (depending
on index or mask) from the intermediate result
4.1.2 Source Data Format
If the Imm8 Control Byte has bit[0] cleared, each source contains 16 packed bytes.
If the bit is set, each source contains 8 packed words. If the Imm8 Control Byte has
bit[1] cleared, each source contains unsigned data. If the bit is set, each source
contains signed data.
Table 4-1. Source Data Format
Imm8[1:0]  Meaning         Description
00b        Unsigned bytes  Both 128-bit sources are treated as packed, unsigned bytes.
01b        Unsigned words  Both 128-bit sources are treated as packed, unsigned words.
10b        Signed bytes    Both 128-bit sources are treated as packed, signed bytes.
11b        Signed words    Both 128-bit sources are treated as packed, signed words.
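
As an illustration of Table 4-1, the following Python sketch decodes Imm8[1:0] and splits a 128-bit source into its packed elements. It is not part of the architectural definition: the function names decode_source_format and unpack_elements are hypothetical, and the sketch assumes the source is supplied as 16 bytes in little-endian memory order.

```python
import struct

def decode_source_format(imm8):
    """Return (element_count, is_signed) as selected by Imm8[1:0]."""
    words = bool(imm8 & 0x01)    # bit[0]: 0 = 16 packed bytes, 1 = 8 packed words
    signed = bool(imm8 & 0x02)   # bit[1]: 0 = unsigned data, 1 = signed data
    return (8 if words else 16), signed

def unpack_elements(source, imm8):
    """Split a 16-byte source into a list of integers per Imm8[1:0]."""
    if len(source) != 16:
        raise ValueError("source must be exactly 16 bytes (128 bits)")
    count, signed = decode_source_format(imm8)
    if count == 16:
        code = "b" if signed else "B"   # 8-bit elements
    else:
        code = "h" if signed else "H"   # 16-bit elements
    # "<" selects little-endian byte order (an assumption of this sketch).
    return list(struct.unpack("<" + str(count) + code, source))
```

For example, the same leading byte 0xFF unpacks to 255 with Imm8[1:0] = 00b and to -1 with Imm8[1:0] = 10b, which matters for the "greater than or equal" and "less than or equal" comparisons used by the Ranges mode.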
4.1.3 Aggregation Operation
All 256 possible comparisons for byte elements (64 for word elements) are always
performed. The individual Boolean results of those comparisons are referred to as
“BoolRes[Reg/Mem element index, Reg element index].” Comparisons evaluating to
“True” are represented with a 1, “False” with a 0 (positive logic). The initial results
are then aggregated into a 16-bit intermediate result (8-bit for word elements),
IntRes1, using one of the modes described in the table below, as determined by
Imm8 Control Byte bit[3:2].
See Section 4.1.6 for a description of the overrideIfDataInvalid() function used in
Table 4-3.
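
The all-pairs comparison described above can be sketched in Python as follows. This is an illustrative model only: the function name bool_res_equal is hypothetical, the inputs are assumed to be already unpacked into lists of 16 byte values or 8 word values, only the "equal" comparison is shown, and the validity handling performed by overrideIfDataInvalid() is deliberately left out.

```python
def bool_res_equal(regmem, reg):
    """Build BoolRes for the "equal" comparison.

    The first index is the reg/mem element index and the second is the
    reg element index, matching BoolRes[Reg/Mem element index, Reg element index].
    True is stored as 1 and False as 0 (positive logic).
    """
    if len(regmem) != len(reg) or len(reg) not in (8, 16):
        raise ValueError("expected two sources of 16 bytes or 8 words each")
    return [[1 if m == r else 0 for r in reg] for m in regmem]
```

With 16 byte elements per source the result holds 16 x 16 = 256 entries; with 8 word elements it holds 8 x 8 = 64 entries.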
Table 4-2. Aggregation Operation
Imm8[3:2]  Mode           Comparison
00b        Equal any      The arithmetic comparison is “equal.”
01b        Ranges         Arithmetic comparison is “greater than or equal” between
                          even indexed bytes/words of reg and each byte/word of
                          reg/mem.
                          Arithmetic comparison is “less than or equal” between odd
                          indexed bytes/words of reg and each byte/word of reg/mem.
                          (reg/mem[m] >= reg[n] for n = even, reg/mem[m] <= reg[n]
                          for n = odd)
10b        Equal each     The arithmetic comparison is “equal.”
11b        Equal ordered  The arithmetic comparison is “equal.”
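
The per-element comparison selected by Imm8[3:2] in Table 4-2 can be sketched as below. The constant names and the functions aggregation_mode and compare are hypothetical and exist only for illustration; the sketch covers the single comparison that produces one BoolRes entry, not the aggregation into IntRes1.

```python
# Imm8[3:2] encodings from Table 4-2 (constant names are illustrative).
EQUAL_ANY, RANGES, EQUAL_EACH, EQUAL_ORDERED = 0b00, 0b01, 0b10, 0b11

def aggregation_mode(imm8):
    """Extract the aggregation mode field, Imm8[3:2]."""
    return (imm8 >> 2) & 0b11

def compare(mode, regmem_elem, reg_elem, reg_index):
    """Return one BoolRes entry (1 or 0) for the given mode."""
    if mode == RANGES:
        if reg_index % 2 == 0:
            # Even-indexed reg element: reg/mem[m] >= reg[n]
            return int(regmem_elem >= reg_elem)
        # Odd-indexed reg element: reg/mem[m] <= reg[n]
        return int(regmem_elem <= reg_elem)
    # Equal any, Equal each and Equal ordered all compare for equality.
    return int(regmem_elem == reg_elem)
```

Under this reading, if reg[0] holds 'a' and reg[1] holds 'z', a reg/mem element 'm' yields 1 for both comparisons, while 'A' yields 0 for the even-indexed one.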
Table 4-3. Aggregation Operation
Mode Pseudocode
Equal any
(find characters from a set)
    UpperBound = imm8[0] ? 7 : 15;
    IntRes1 = 0;
    For j = 0 to UpperBound, j++
        For i = 0 to UpperBound, i++
            IntRes1[j] OR= overrideIfDataInvalid(BoolRes[j,i])