Intel and compatable CPU's Programming Information

Intel SSE MMX2 KNI documentation

AMD 64 Bit & Opteron resource on this site

Intel Itanium 64 Bit processor

CPU Heat Dissipation Table

Intel 80386 Reference Programmer's Manual

Our Partners:

up: Chapter 3 -- Applications Instruction Set
prev: 3.3 Decimal Arithmetic Instructions
next: 3.5 Control Transfer Instructions


3.4 Logical Instructions

The group of logical instructions includes:
  • The Boolean operation instructions
  • Bit test and modify instructions
  • Bit scan instructions
  • Rotate and shift instructions
  • Byte set on condition

3.4.1 Boolean Operation Instructions

The logical operations are AND, OR, XOR, and NOT.

NOT (Not) inverts the bits in the specified operand to form a one's complement of the operand. The NOT instruction is a unary operation that uses a single operand in a register or memory. NOT has no effect on the flags.

The AND, OR, and XOR instructions perform the standard logical operations "and", "(inclusive) or", and "exclusive or". These instructions can use the following combinations of operands:

  • Two register operands
  • A general register operand with a memory operand
  • An immediate operand with either a general register operand or a memory operand.
AND, OR, and XOR clear OF and CF, leave AF undefined, and update SF, ZF, and PF.

3.4.2 Bit Test and Modify Instructions

This group of instructions operates on a single bit which can be in memory or in a general register. The location of the bit is specified as an offset from the low-order end of the operand. The value of the offset either may be given by an immediate byte in the instruction or may be contained in a general register.

These instructions first assign the value of the selected bit to CF, the carry flag. Then a new value is assigned to the selected bit, as determined by the operation. OF, SF, ZF, AF, PF are left in an undefined state. Table 3-1 defines these instructions.

Table 3-1. Bit Test and Modify Instructions

Instruction                      Effect on CF            Effect on
                                                         Selected Bit

Bit (Bit Test)                   CF := BIT                (none)
BTS (Bit Test and Set)           CF := BIT                BIT := 1
BTR (Bit Test and Reset)         CF := BIT                BIT := 0
BTC (Bit Test and Complement)    CF := BIT                BIT := NOT(BIT)

3.4.3 Bit Scan Instructions

These instructions scan a word or doubleword for a one-bit and store the index of the first set bit into a register. The bit string being scanned may be either in a register or in memory. The ZF flag is set if the entire word is zero (no set bits are found); ZF is cleared if a one-bit is found. If no set bit is found, the value of the destination register is undefined.

BSF (Bit Scan Forward) scans from low-order to high-order (starting from bit index zero).

BSR (Bit Scan Reverse) scans from high-order to low-order (starting from bit index 15 of a word or index 31 of a doubleword).

3.4.4 Shift and Rotate Instructions

The shift and rotate instructions reposition the bits within the specified operand.

These instructions fall into the following classes:

  • Shift instructions
  • Double shift instructions
  • Rotate instructions

3.4.4.1 Shift Instructions

The bits in bytes, words, and doublewords may be shifted arithmetically or logically. Depending on the value of a specified count, bits can be shifted up to 31 places.

A shift instruction can specify the count in one of three ways. One form of shift instruction implicitly specifies the count as a single shift. The second form specifies the count as an immediate value. The third form specifies the count as the value contained in CL. This last form allows the shift count to be a variable that the program supplies during execution. Only the low order 5 bits of CL are used.

CF always contains the value of the last bit shifted out of the destination operand. In a single-bit shift, OF is set if the value of the high-order (sign) bit was changed by the operation. Otherwise, OF is cleared. Following a multibit shift, however, the content of OF is always undefined.

The shift instructions provide a convenient way to accomplish division or multiplication by binary power. Note however that division of signed numbers by shifting right is not the same kind of division performed by the IDIV instruction.

SAL (Shift Arithmetic Left) shifts the destination byte, word, or doubleword operand left by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL). The processor shifts zeros in from the right (low-order) side of the operand as bits exit from the left (high-order) side. See Figure 3-6.

SHL (Shift Logical Left) is a synonym for SAL (refer to SAL).

SHR (Shift Logical Right) shifts the destination byte, word, or doubleword operand right by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL). The processor shifts zeros in from the left side of the operand as bits exit from the right side. See Figure 3-7.

SAR (Shift Arithmetic Right) shifts the destination byte, word, or doubleword operand to the right by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL). The processor preserves the sign of the operand by shifting in zeros on the left (high-order) side if the value is positive or by shifting by ones if the value is negative. See Figure 3-8.

Even though this instruction can be used to divide integers by a power of two, the type of division is not the same as that produced by the IDIV instruction. The quotient of IDIV is rounded toward zero, whereas the "quotient" of SAR is rounded toward negative infinity. This difference is apparent only for negative numbers. For example, when IDIV is used to divide -9 by 4, the result is -2 with a remainder of -1. If SAR is used to shift -9 right by two bits, the result is -3. The "remainder" of this kind of division is +3; however, the SAR instruction stores only the high-order bit of the remainder (in CF).

The code sequence in Figure 3-9 produces the same result as IDIV for any M = 2^(N), where 0 < N < 32. This sequence takes about 12 to 18 clocks, depending on whether the jump is taken; if ECX contains M, the corresponding IDIV ECX instruction will take about 43 clocks.




3.4.4.2 Double-Shift Instructions

These instructions provide the basic operations needed to implement operations on long unaligned bit strings. The double shifts operate either on word or doubleword operands, as follows:
  1. Taking two word operands as input and producing a one-word output.
  2. Taking two doubleword operands as input and producing a doubleword output.
Of the two input operands, one may either be in a general register or in memory, while the other may only be in a general register. The results replace the memory or register operand. The number of bits to be shifted is specified either in the CL register or in an immediate byte of the instruction.

Bits are shifted from the register operand into the memory or register operand. CF is set to the value of the last bit shifted out of the destination operand. SF, ZF, and PF are set according to the value of the result. OF and AF are left undefined.

SHLD (Shift Left Double) shifts bits of the R/M field to the left, while shifting high-order bits from the Reg field into the R/M field on the right (see Figure 3-10). The result is stored back into the R/M operand. The Reg field is not modified.

SHRD (Shift Right Double) shifts bits of the R/M field to the right, while shifting low-order bits from the Reg field into the R/M field on the left (see Figure 3-11). The result is stored back into the R/M operand. The Reg field is not modified.

3.4.4.3 Rotate Instructions

Rotate instructions allow bits in bytes, words, and doublewords to be rotated. Bits rotated out of an operand are not lost as in a shift, but are "circled" back into the other "end" of the operand.

Rotates affect only the carry and overflow flags. CF may act as an extension of the operand in two of the rotate instructions, allowing a bit to be isolated and then tested by a conditional jump instruction (JC or JNC). CF always contains the value of the last bit rotated out, even if the instruction does not use this bit as an extension of the rotated operand.

In single-bit rotates, OF is set if the operation changes the high-order (sign) bit of the destination operand. If the sign bit retains its original value, OF is cleared. On multibit rotates, the value of OF is always undefined.

ROL (Rotate Left) rotates the byte, word, or doubleword destination operand left by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL). For each rotation specified, the high-order bit that exits from the left of the operand returns at the right to become the new low-order bit of the operand. See Figure 3-12.

ROR (Rotate Right) rotates the byte, word, or doubleword destination operand right by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL). For each rotation specified, the low-order bit that exits from the right of the operand returns at the left to become the new high-order bit of the operand. See Figure 3-13.

RCL (Rotate Through Carry Left) rotates bits in the byte, word, or doubleword destination operand left by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL).

This instruction differs from ROL in that it treats CF as a high-order one-bit extension of the destination operand. Each high-order bit that exits from the left side of the operand moves to CF before it returns to the operand as the low-order bit on the next rotation cycle. See Figure 3-14 .

RCR (Rotate Through Carry Right) rotates bits in the byte, word, or doubleword destination operand right by one or by the number of bits specified in the count operand (an immediate value or the value contained in CL).

This instruction differs from ROR in that it treats CF as a low-order one-bit extension of the destination operand. Each low-order bit that exits from the right side of the operand moves to CF before it returns to the operand as the high-order bit on the next rotation cycle. See Figure 3-15 .






3.4.4.4 Fast "BIT BLT" Using Double Shift Instructions

One purpose of the double shifts is to implement a bit string move, with arbitrary misalignment of the bit strings. This is called a "bit blt" (BIT BLock Transfer.) A simple example is to move a bit string from an arbitrary offset into a doubleword-aligned byte string. A left-to-right string is moved 32 bits at a time if a double shift is used inside the move loop.
MOV   ESI,ScrAddr
MOV   EDI,DestAddr
MOV   EBX,WordCnt
MOV   CL,RelOffset      ; relative offset Dest-Src
MOV   EDX,[ESI]         ; load first word of source
ADD   ESI,4             ; bump source address
BltLoop:
LODS                    ; new low order part
SHLD  EDX,EAX,CL        ; EDX overwritten with aligned stuff
XCHG  EDX,EAS           ; Swap high/low order parts
STOS                    ; Write out next aligned chunk
DEC   EBX
JA    BltLoop
This loop is simple yet allows the data to be moved in 32-bit pieces for the highest possible performance. Without a double shift, the best that can be achieved is 16 bits per loop iteration by using a 32-bit shift and replacing the
XCHG with a ROR by 16 to swap high and low order parts of registers. A more general loop than shown above would require some extra masking on the first doubleword moved (before the main loop), and on the last doubleword moved (after the main loop), but would have the same basic 32-bits per loop iteration as the code above.

3.4.4.5 Fast Bit-String Insert and Extract

The double shift instructions also enable:
  • Fast insertion of a bit string from a register into an arbitrary bit location in a larger bit string in memory without disturbing the bits on either side of the inserted bits.
  • Fast extraction of a bits string into a register from an arbitrary bit location in a larger bit string in memory without disturbing the bits on either side of the extracted bits.
The following coded examples illustrate bit insertion and extraction under variousconditions:
  1. Bit String Insert into Memory (when bit string is 1-25 bits long, i.e., spans four bytes or less):
    ; Insert a right-justified bit string from register into
    ; memory bit string.
    ;
    ; Assumptions:
    ; 1) The base of the string array is dword aligned, and
    ; 2) the length of the bit string is an immediate value
    ;    but the bit offset is held in a register.
    ;
    ; Register ESI holds the right-justified bit string
    ; to be inserted.
    ; Register EDI holds the bit offset of the start of the
    ; substring.
    ; Registers EAX and ECX are also used by this
    ; "insert" operation.
    ;
    MOV   ECX,EDI      ; preserve original offset for later use
    SHR   EDI,3        ; signed divide offset by 8 (byte address)
    AND   CL,7H        ; isolate low three bits of offset in CL
    MOV   EAX,[EDI]strg_base      ; move string dword into EAX
    ROR   EAX,CL       ; right justify old bit field
    SHRD  EAX,ESI,length          ; bring in new bits
    ROL   EAX,length   ; right justify new bit field
    ROL   EAX,CL       ; bring to final position
    MOV   [EDI]strg_base,EAX      ; replace dword in memory
    
  2. Bit String Insert into Memory (when bit string is 1-31 bits long, i.e. spans five bytes or less):
    ; Insert a right-justified bit string from register into
    ; memory bit string.
    ;
    ; Assumptions:
    ; 1) The base of the string array is dword aligned, and
    ; 2) the length of the bit string is an immediate value
    ;    but the bit offset is held in a register.
    ;
    ; Register ESI holds the right-justified bit string
    ; to be inserted.
    ; Register EDI holds the bit offset of the start of the
    ; substring.
    ; Registers EAX, EBX, ECX, and EDI are also used by
    ; this "insert" operation.
    ;
    MOV   ECX,EDI     ; temp storage for offset
    SHR   EDI,5       ; signed divide offset by 32 (dword address)
    SHL   EDI,2       ; multiply by 4 (in byte address format)
    AND   CL,1FH      ; isolate low five bits of offset in CL
    MOV   EAX,[EDI]strg_base      ; move low string dword into EAX
    MOV   EDX,[EDI]strg_base+4    ; other string dword into EDX
    MOV   EBX,EAX     ; temp storage for part of string     + rotate
    SHRD  EAX,EDX,CL  ; double shift by offset within dword + EDX:EAX
    SHRD  EAX,EBX,CL  ; double shift by offset within dword + right
    SHRD  EAX,ESI,length          ; bring in new bits
    ROL   EAX,length  ; right justify new bit field
    MOV   EBX,EAX     ; temp storage for part of string         + rotate
    SHLD  EAX,EDX,CL  ; double shift back by offset within word + EDX:EAX
    SHLD  EDX,EBX,CL  ; double shift back by offset within word + left
    MOV   [EDI]strg_base,EAX      ; replace dword in memory
    MOV   [EDI]strg_base+4,EDX    ; replace dword in memory
    
  3. Bit String Insert into Memory (when bit string is exactly 32 bits long, i.e., spans five or four types of memory):
    ; Insert right-justified bit string from register into
    ; memory bit string.
    ;
    ; Assumptions:
    ; 1) The base of the string array is dword aligned, and
    ; 2) the length of the bit string is 32
    ;    but the bit offset is held in a register.
    ;
    ; Register ESI holds the 32-bit string to be inserted.
    ; Register EDI holds the bit offset of the start of the
    ; substring.
    ; Registers EAX, EBX, ECX, and EDI are also used by
    ; this "insert" operation.
    ;
    MOV   EDX,EDI     ; preserve original offset for later use
    SHR   EDI,5       ; signed divide offset by 32 (dword address)
    SHL   EDI,2       ; multiply by 4 (in byte address format)
    AND   CL,1FH      ; isolate low five bits of offset in CL
    MOV   EAX,[EDI]strg_base      ; move low string dword into EAX
    MOV   EDX,[EDI]strg_base+4    ; other string dword into EDX
    MOV   EBX,EAX     ; temp storage for part of string     + rotate
    SHRD  EAX,EDX     ; double shift by offset within dword + EDX:EAX
    SHRD  EDX,EBX     ; double shift by offset within dword + right
    MOV   EAX,ESI     ; move 32-bit bit field into position
    MOV   EBX,EAX     ; temp storage for part of string         + rotate
    SHLD  EAX,EDX     ; double shift back by offset within word + EDX:EAX
    SHLD  EDX,EBX     ; double shift back by offset within word + left
    MOV   [EDI]strg_base,EAX      ; replace dword in memory
    MOV   [EDI]strg_base,+4,EDX   ; replace dword in memory
    
  4. Bit String Extract from Memory (when bit string is 1-25 bits long, i.e., spans four bytes or less):
    ; Extract a right-justified bit string from memory bit
    ; string into register
    ;
    ; Assumptions:
    ; 1) The base of the string array is dword aligned, and
    ; 2) the length of the bit string is an immediate value
    ;    but the bit offset is held in a register.
    ;
    ; Register EAX holds the right-justified, zero-padded
    ; bit string that was extracted.
    ; Register EDI holds the bit offset of the start of the
    ; substring.
    ; Registers EDI, and ECX are also used by this "extract."
    ;
    MOV  ECX,EDI      ; temp storage for offset
    SHR  EDI,3        ; signed divide offset by 8 (byte address)
    AND  CL,7H        ; isolate low three bits of offset
    MOV  EAX,[EDI]strg_base       ; move string dword into EAX
    SHR  EAX,CL       ; shift by offset within dword
    AND  EAX,mask     ; extracted bit field in EAX
    
  5. Bit String Extract from Memory (when bit string is 1-32 bits long, i.e., spans five bytes or less):
    ; Extract a right-justified bit string from memory bit
    ; string into register.
    ;
    ; Assumptions:
    ; 1) The base of the string array is dword aligned, and
    ; 2) the length of the bit string is an immediate
    ;    value but the bit offset is held in a register.
    ;
    ; Register EAX holds the right-justified, zero-padded
    ; bit string that was extracted.
    ; Register EDI holds the bit offset of the start of the
    ; substring.
    ; Registers EAX, EBX, and ECX are also used by this "extract."
    MOV   ECX,EDI     ; temp storage for offset
    SHR   EDI,5       ; signed divide offset by 32 (dword address)
    SHL   EDI,2       ; multiply by 4 (in byte address format)
    AND   CL,1FH      ; isolate low five bits of offset in CL
    MOV   EAX,[EDI]strg_base      ; move low string dword into EAX
    MOV   EDX,[EDI]strg_base+4    ; other string dword into EDX
    SHRD  EAX,EDX,CL  ; double shift right by offset within dword
    AND   EAX,mask    ; extracted bit field in EAX
    

3.4.5 Byte-Set-On-Condition Instructions

This group of instructions sets a byte to zero or one depending on any of the 16 conditions defined by the status flags. The byte may be in memory or may be a one-byte general register. These instructions are especially useful for implementing Boolean expressions in high-level languages such as Pascal.

SETcc (Set Byte on Condition cc) set a byte to one if condition cc is true; sets the byte to zero otherwise. Refer to Appendix D for a definition of the possible conditions.

3.4.6 Test Instruction

TEST (Test) performs the logical "and" of the two operands, clears OF and CF, leaves AF undefined, and updates SF, ZF, and PF. The flags can be tested by conditional control transfer instructions or by the byte-set-on-condition instructions. The operands may be doublewords, words, or bytes.

The difference between TEST and AND is that TEST does not alter the destination operand. TEST differs from BT in that TEST is useful for testing the value of multiple bits in one operations, whereas BT tests a single bit.


up: Chapter 3 -- Applications Instruction Set
prev: 3.3 Decimal Arithmetic Instructions
next: 3.5 Control Transfer Instructions