3    Main Instruction Set

The assembler's instruction set consists of a main instruction set and a floating-point instruction set. This chapter describes the main instruction set; Chapter 4 describes the floating-point instruction set. For details on the instruction set beyond the scope of this manual, refer to the Alpha Architecture Reference Manual.

The assembler's main instruction set is organized into classes of instructions, and each section of this chapter describes one class.

Tables in this chapter show the format of each instruction in the main instruction set. The tables list the instruction names and the forms of operands that can be used with each instruction. The specifiers used in the tables to identify operands have the following meanings:
Operand Specifier Description
address A symbolic expression whose effective value is used as an address.
b_reg Base register. An integer register containing a base address to which is added an offset (or displacement) value to produce an effective address.
d_reg Destination register. An integer register that receives a value as a result of an operation.
d_reg/s_reg One integer register that is used as both a destination register and a source register.
label A label that identifies a location in a program.
no_operands No operands are specified.
offset An immediate value that is added to the contents of a base register to calculate an effective address.
palcode A value that determines the operation performed by a PALcode instruction.
s_reg, s_reg1, s_reg2 Source registers whose contents are to be used in an operation.
val_expr An expression whose value is used as an absolute value.
val_immed An immediate value that is to be used in an operation.
jhint An address operand that provides a hint of where a jmp or jsr instruction will transfer control.
rhint An immediate operand that provides software with a hint about how a ret or jsr_coroutine instruction is used.


3.1    Load and Store Instructions

Load and store instructions load immediate values and move data between memory and general registers. This section describes the general-purpose load and store instructions supported by the assembler.

Table 3-1 lists the mnemonics and operands for instructions that perform load and store operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-1: Load and Store Formats

Instruction                           Mnemonic
Load Address                          lda[Table Note 1]
Load Byte                             ldb
Load Byte Unsigned                    ldbu
Load Word                             ldw
Load Word Unsigned                    ldwu
Load Sign Extended Longword           ldl[Table Note 1]
Load Sign Extended Longword Locked    ldl_l[Table Note 1]
Load Quadword                         ldq[Table Note 1]
Load Quadword Locked                  ldq_l[Table Note 1]
Load Quadword Unaligned               ldq_u[Table Note 1]
Unaligned Load Word                   uldw
Unaligned Load Word Unsigned          uldwu
Unaligned Load Longword               uldl
Unaligned Load Quadword               uldq
    Operands: d_reg, address

Load Address High                     ldah[Table Note 1]
Load Global Pointer                   ldgp
    Operands: d_reg, offset(b_reg)

Load Immediate Longword               ldil
Load Immediate Quadword               ldiq
    Operands: d_reg, val_expr

Store Byte                            stb
Store Word                            stw
Store Longword                        stl[Table Note 1]
Store Longword Conditional            stl_c[Table Note 1]
Store Quadword                        stq[Table Note 1]
Store Quadword Conditional            stq_c[Table Note 1]
Store Quadword Unaligned              stq_u[Table Note 1]
Unaligned Store Word                  ustw
Unaligned Store Longword              ustl
Unaligned Store Quadword              ustq
    Operands: s_reg, address

Table Notes:

  1. In addition to the normal operands that can be specified with this instruction, relocation operands can also be specified (see Section 2.6.4).

Section 3.1.1 describes the operations performed by load instructions and Section 3.1.2 describes the operations performed by store instructions.


3.1.1    Load Instruction Descriptions

Load instructions move values (addresses, values of expressions, or contents of memory locations) into registers. For all load instructions, the effective address is the 64-bit twos-complement sum of the contents of the index register and the sign-extended offset.
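The effective-address rule can be modeled in a few lines of Python. This is only an illustrative sketch, not part of the assembler: the names sign_extend16 and effective_address are hypothetical, registers are modeled as unsigned 64-bit integers, and the offset is assumed to be the 16-bit displacement field described for store instructions in Section 3.1.2.

```python
MASK64 = (1 << 64) - 1  # registers are modeled as unsigned 64-bit values


def sign_extend16(offset):
    """Interpret the low 16 bits of offset as a signed twos-complement value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def effective_address(base, offset):
    """Return (base + sign-extended 16-bit offset) modulo 2**64."""
    return (base + sign_extend16(offset)) & MASK64
```

In this model, an offset field of 0xFFF8 represents -8, so effective_address(0x1000, 0xFFF8) refers to the location eight bytes below the base address.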

Instructions whose address operands contain symbolic labels imply an index register, which the assembler determines. Some assembler load instructions can produce multiple machine-code instructions (see Section C.4).

Note

Load instructions can generate many code sequences for which the linker must fix the address by resolving external data items.

Table 3-2 describes the operations performed by load instructions.

Table 3-2: Load Instruction Descriptions

Instruction Description
Load Address (lda) Loads the destination register with the effective address of the specified data item.
Load Byte (ldb) Loads the least significant byte of the destination register with the contents of the byte specified by the effective address. Because the loaded byte is a signed value, its sign bit is replicated to fill the other bytes in the destination register. (The assembler uses temporary registers AT and t9 for this instruction.)
Load Byte Unsigned (ldbu) Loads the least significant byte of the destination register with the contents of the byte specified by the effective address. Because the loaded byte is an unsigned value, the other bytes of the destination register are cleared to zeros. (The assembler uses temporary registers AT and t9 for this instruction - unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the ldbu command.)
Load Word (ldw) Loads the two least significant bytes of the destination register with the contents of the word specified by the effective address. Because the loaded word is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

If the effective address is not evenly divisible by two, a data-alignment exception may be signaled. (The assembler uses temporary registers AT and t9 for this instruction.)

Load Word Unsigned (ldwu) Loads the two least significant bytes of the destination register with the contents of the word specified by the effective address. Because the loaded word is an unsigned value, the other bytes of the destination register are cleared to zeros.

If the effective address is not evenly divisible by two, a data-alignment exception may be signaled. (The assembler uses temporary registers AT and t9 for this instruction - unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the ldwu command.)

Load Sign Extended Longword (ldl) Loads the four least significant bytes of the destination register with the contents of the longword specified by the effective address. Because the loaded longword is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

Load Sign Extended Longword Locked (ldl_l) Loads the four least significant bytes of the destination register with the contents of the longword specified by the effective address. Because the loaded longword is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

If an ldl_l instruction executes without generating an exception, the processor records the target physical address in a per-processor locked-physical-address register and sets the per-processor lock flag.

If the per-processor lock flag is still set when a stl_c instruction is executed, the store occurs; otherwise, it does not occur.

Load Quadword (ldq) Loads the destination register with the contents of the quadword specified by the effective address. All bytes of the register are replaced with the contents of the loaded quadword.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

If a literal relocation type is specified in the ldq instruction, one machine instruction is generated and the symbol and offset are stored in the .lita section. Other relocation types generate a sequence of instructions and the symbol and offset are stored in that sequence.

Load Quadword Locked (ldq_l) Loads the destination register with the contents of the quadword specified by the effective address. All bytes of the register are replaced with the contents of the loaded quadword.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled. If an ldq_l instruction executes without generating an exception, the processor records the target physical address in a per-processor locked-physical-address register and sets the per-processor lock flag.

If the per-processor lock flag is still set when a stq_c instruction is executed, the store occurs; otherwise, it does not occur.

Load Quadword Unaligned (ldq_u) Loads the destination register with the contents of the quadword specified by the effective address (with the three low-order bits cleared). The address does not have to be aligned on an 8-byte boundary; it can be any byte address.
Unaligned Load Word (uldw) Loads the two least significant bytes of the destination register with the word at the specified address. The address does not have to be aligned on a 2-byte boundary; it can be any byte address. Because the loaded word is a signed value, its sign bit is replicated to fill the other bytes in the destination register. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)
Unaligned Load Word Unsigned (uldwu) Loads the two least significant bytes of the destination register with the word at the specified address. The address does not have to be aligned on a 2-byte boundary; it can be any byte address. Because the loaded word is an unsigned value, the other bytes of the destination register are cleared to zeros. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)
Unaligned Load Longword (uldl) Loads the four least significant bytes of the destination register with the longword at the specified address. The address does not have to be aligned on a 4-byte boundary; it can be any byte address in memory. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)
Unaligned Load Quadword (uldq) Loads the destination register with the quadword at the specified address. The address does not have to be aligned on an 8-byte boundary; it can be any byte address in memory. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)
Load Address High (ldah) Loads the destination register with the effective address of the specified data item. In computing the effective address, the signed constant offset is multiplied by 65536 before adding to the base register. The signed constant must be in the range -32768 to 32767.
Load Global Pointer (ldgp) Loads the destination register with the global pointer value for the procedure. The sum of the base register and the sign-extended offset specifies the address of the ldgp instruction.
Load Immediate Longword (ldil) Loads the destination register with the value of an expression that can be computed at assembly time. The value is converted to canonical longword form before being stored in the destination register; bit 31 is replicated in bits 32 through 63 of the destination register. (See Appendix B for additional information on canonical forms.)
Load Immediate Quadword (ldiq) Loads the destination register with the value of an expression that can be computed at assembly time.
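The register contents described in Table 3-2 can be illustrated with a small Python model. This is a sketch under stated assumptions, not assembler output: the function names are hypothetical, a register is modeled as an unsigned 64-bit integer, and memory access itself is not modeled (each function receives the datum or address directly).

```python
MASK64 = (1 << 64) - 1  # a register holds 64 bits


def sign_extend(value, nbytes):
    """Replicate the sign bit of an nbytes-wide datum through bit 63 (ldb, ldw, ldl)."""
    bits = 8 * nbytes
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK64


def zero_extend(value, nbytes):
    """Clear every byte above the loaded datum (ldbu, ldwu)."""
    return value & ((1 << (8 * nbytes)) - 1)


def ldq_u_address(address):
    """ldq_u accesses the quadword at the address with its three low-order bits cleared."""
    return address & ~0x7 & MASK64


def ldah_value(base, offset):
    """ldah adds the signed 16-bit offset, multiplied by 65536, to the base register."""
    if not -32768 <= offset <= 32767:
        raise ValueError("offset must be in the range -32768 to 32767")
    return (base + offset * 65536) & MASK64
```

For example, the byte 0x80 becomes 0xFFFFFFFFFFFFFF80 under sign_extend (as for ldb) but stays 0x80 under zero_extend (as for ldbu).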


3.1.2    Store Instruction Descriptions

For all store instructions, the effective address is the 64-bit twos-complement sum of the contents of the index register and the sign-extended 16-bit offset.

Instructions whose address operands contain symbolic labels imply an index register, which the assembler determines. Some assembler store instructions can produce multiple machine-code instructions (see Section C.4).

Table 3-3 describes the operations performed by store instructions.

Table 3-3: Store Instruction Descriptions

Instruction Description
Store Byte (stb) Stores the least significant byte of the source register in the memory location specified by the effective address. (The assembler uses temporary registers AT, t9, and t10 for this instruction - unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the stb command.)
Store Word (stw) Stores the two least significant bytes of the source register in the memory location specified by the effective address.

If the effective address is not evenly divisible by two, a data-alignment exception may be signaled. (The assembler uses temporary registers AT, t9, and t10 for this instruction - unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the stw command.)

Store Longword (stl) Stores the four least significant bytes of the source register in the memory location specified by the effective address.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

Store Longword Conditional (stl_c) Stores the four least significant bytes of the source register in the memory location specified by the effective address, if the lock flag is set. The lock flag is returned in the source register and is then set to zero.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

Store Quadword (stq) Stores the contents of the source register in the memory location specified by the effective address.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

Store Quadword Conditional (stq_c) Stores the contents of the source register in the memory location specified by the effective address, if the lock flag is set. The lock flag is returned in the source register and is then set to zero.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

Store Quadword Unaligned (stq_u) Stores the contents of the source register in the memory location specified by the effective address (with the three low-order bits cleared).
Unaligned Store Word (ustw) Stores the two least significant bytes of the source register in the memory location specified by the effective address. The address does not have to be aligned on a 2-byte boundary; it can be any byte address. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for this instruction.)
Unaligned Store Longword (ustl) Stores the four least significant bytes of the source register in the memory location specified by the effective address. The address does not have to be aligned on a 4-byte boundary; it can be any byte address. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for this instruction.)
Unaligned Store Quadword (ustq) Stores the contents of the source register in a memory location specified by the effective address. The address does not have to be aligned on an 8-byte boundary; it can be any byte address. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for this instruction.)
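The interaction between the locked loads in Table 3-2 and the conditional stores in Table 3-3 can be sketched with a toy Python model. The class name Processor and its attributes are hypothetical, and the model is deliberately incomplete: it covers a single processor and clears the lock flag only when a conditional store executes, whereas this manual does not list the other events that can clear the flag on real hardware.

```python
class Processor:
    """Toy single-processor model of the lock flag used by ldq_l and stq_c."""

    def __init__(self):
        self.memory = {}             # address -> quadword value
        self.lock_flag = 0           # per-processor lock flag
        self.locked_address = None   # per-processor locked-physical-address register

    def ldq_l(self, address):
        """Load a quadword, record its address, and set the lock flag."""
        self.locked_address = address
        self.lock_flag = 1
        return self.memory.get(address, 0)

    def stq_c(self, value, address):
        """Store only if the lock flag is set; return the flag, then clear it."""
        success = self.lock_flag
        if success:
            self.memory[address] = value
        self.lock_flag = 0
        return success  # models the value returned in the source register
```

In this model, a stq_c that follows an ldq_l returns 1 and performs the store, while a second stq_c with no intervening ldq_l returns 0 and leaves memory unchanged.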


3.2    Arithmetic Instructions

Arithmetic instructions perform arithmetic operations on values in registers. (Floating-point arithmetic instructions are described in Section 4.3.)

Table 3-4 lists the mnemonics and operands for instructions that perform arithmetic operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-4: Arithmetic Instruction Formats

Instruction                             Mnemonic
Clear                                   clr
    Operands: d_reg

Absolute Value Longword                 absl
Absolute Value Quadword                 absq
Negate Longword (without overflow)      negl
Negate Longword (with overflow)         neglv
Negate Quadword (without overflow)      negq
Negate Quadword (with overflow)         negqv
Sign-Extension Byte                     sextb
Sign-Extension Longword                 sextl
Sign-Extension Word                     sextw
    Operands: s_reg, d_reg
          or  d_reg/s_reg
          or  val_immed, d_reg

Add Longword (without overflow)         addl
Add Longword (with overflow)            addlv
Add Quadword (without overflow)         addq
Add Quadword (with overflow)            addqv
Scaled Longword Add by 4                s4addl
Scaled Quadword Add by 4                s4addq
Scaled Longword Add by 8                s8addl
Scaled Quadword Add by 8                s8addq
Multiply Longword (without overflow)    mull
Multiply Longword (with overflow)       mullv
Multiply Quadword (without overflow)    mulq
Multiply Quadword (with overflow)       mulqv
Subtract Longword (without overflow)    subl
Subtract Longword (with overflow)       sublv
Subtract Quadword (without overflow)    subq
Subtract Quadword (with overflow)       subqv
Scaled Longword Subtract by 4           s4subl
Scaled Quadword Subtract by 4           s4subq
Scaled Longword Subtract by 8           s8subl
Scaled Quadword Subtract by 8           s8subq
Unsigned Quadword Multiply High         umulh
Divide Longword                         divl
Divide Longword Unsigned                divlu
Divide Quadword                         divq
Divide Quadword Unsigned                divqu
Longword Remainder                      reml
Longword Remainder Unsigned             remlu
Quadword Remainder                      remq
Quadword Remainder Unsigned             remqu
    Operands: s_reg1, s_reg2, d_reg
          or  d_reg/s_reg1, s_reg2
          or  s_reg1, val_immed, d_reg
          or  d_reg/s_reg1, val_immed

Table 3-5 describes the operations performed by arithmetic instructions.

Table 3-5: Arithmetic Instruction Descriptions


Instruction Description
Clear (clr) Sets the contents of the destination register to zero.
Absolute Value Longword (absl) Computes the absolute value of the contents of the source register and places the result in the destination register. If the value in the source register is -2147483648, an overflow exception is signaled.
Absolute Value Quadword (absq) Computes the absolute value of the contents of the source register and places the result in the destination register. If the value in the source register is -9223372036854775808, an overflow exception is signaled.
Negate Longword (without overflow) (negl) Negates the integer contents of the four least significant bytes in the source register and places the result in the destination register. An overflow occurs if the value in the source register is -2147483648, but the overflow exception is not signaled.
Negate Longword (with overflow) (neglv) Negates the integer contents of the four least significant bytes in the source register and places the result in the destination register. If the value in the source register is -2147483648, an overflow exception is signaled.
Negate Quadword (without overflow) (negq) Negates the integer contents of the source register and places the result in the destination register. An overflow occurs if the value in the source register is -9223372036854775808, but the overflow exception is not signaled.
Negate Quadword (with overflow) (negqv) Negates the integer contents of the source register and places the result in the destination register. An overflow exception is signaled if the value in the source register is -9223372036854775808.
Sign-Extension Byte (sextb) Moves the least significant byte of the source register into the least significant byte of the destination register. Because the moved byte is a signed value, its sign bit is replicated to fill the other bytes in the destination register.
Sign-Extension Word (sextw) Moves the two least significant bytes of the source register into the two least significant bytes of the destination register. Because the moved word is a signed value, its sign bit is replicated to fill the other bytes in the destination register.
Sign-Extension Longword (sextl) Moves the four least significant bytes of the source register into the four least significant bytes of the destination register. Because the moved longword is a signed value, its sign bit is replicated to fill the other bytes in the destination register.
Add Longword (without overflow) (addl) Computes the sum of two signed 32-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. Overflow exceptions never occur.
Add Longword (with overflow) (addlv) Computes the sum of two signed 32-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. If the result cannot be represented as a signed 32-bit number, an overflow exception is signaled.
Add Quadword (without overflow) (addq) Computes the sum of two signed 64-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. Overflow exceptions never occur.
Add Quadword (with overflow) (addqv) Computes the sum of two signed 64-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. If the result cannot be represented as a signed 64-bit number, an overflow exception is signaled.
Scaled Longword Add by 4 (s4addl) Computes the sum of two signed 32-bit values. This instruction scales (multiplies) the contents of s_reg1 by four and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.
Scaled Quadword Add by 4 (s4addq) Computes the sum of two signed 64-bit values. This instruction scales (multiplies) the contents of s_reg1 by four and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.
Scaled Longword Add by 8 (s8addl) Computes the sum of two signed 32-bit values. This instruction scales (multiplies) the contents of s_reg1 by eight and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.
Scaled Quadword Add by 8 (s8addq) Computes the sum of two signed 64-bit values. This instruction scales (multiplies) the contents of s_reg1 by eight and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.
Multiply Longword (without overflow) (mull) Computes the product of two signed 32-bit values. This instruction places the 32-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. Overflow is not reported.
Multiply Longword (with overflow) (mullv) Computes the product of two signed 32-bit values. This instruction places the 32-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. If an overflow occurs, an overflow exception is signaled.
Multiply Quadword (without overflow) (mulq) Computes the product of two signed 64-bit values. This instruction places the 64-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. Overflow is not reported.
Multiply Quadword (with overflow) (mulqv) Computes the product of two signed 64-bit values. This instruction places the 64-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. If an overflow occurs, an overflow exception is signaled.
Subtract Longword (without overflow) (subl) Computes the difference of two signed 32-bit values. This instruction subtracts either the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. Overflow exceptions never occur.
Subtract Longword (with overflow) (sublv) Computes the difference of two signed 32-bit values. This instruction subtracts either the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. If the true result's sign differs from the destination register's sign, an overflow exception is signaled.
Subtract Quadword (without overflow) (subq) Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. Overflow exceptions never occur.
Subtract Quadword (with overflow) (subqv) Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. If the true result's sign differs from the destination register's sign, an overflow exception is signaled.
Scaled Longword Subtract by 4 (s4subl) Computes the difference of two signed 32-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 4) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.
Scaled Quadword Subtract by 4 (s4subq) Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 4) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.
Scaled Longword Subtract by 8 (s8subl) Computes the difference of two signed 32-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 8) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.
Scaled Quadword Subtract by 8 (s8subq) Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 8) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.
Unsigned Quadword Multiply High (umulh) Computes the product of two unsigned 64-bit values. This instruction multiplies the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the high-order 64 bits of the 128-bit product in the destination register.
Divide Longword (divl) Computes the quotient of two signed 32-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

The divl instruction rounds toward zero. If the divisor is zero, an error is signaled. Overflow is signaled when dividing -2147483648 by -1. A call_pal PAL_gentrap instruction may be issued for divide-by-zero and overflow exceptions.

Divide Longword Unsigned (divlu) Computes the quotient of two unsigned 32-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

If the divisor is zero, an exception is signaled and a call_pal PAL_gentrap instruction may be issued. Overflow exceptions never occur. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the divlu instruction.)

Divide Quadword (divq) Computes the quotient of two signed 64-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

The divq instruction rounds toward zero. If the divisor is zero, an error is signaled. Overflow is signaled when dividing -9223372036854775808 by -1. A call_pal PAL_gentrap instruction may be issued for divide-by-zero and overflow exceptions. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the divq instruction.)

Divide Quadword Unsigned (divqu) Computes the quotient of two unsigned 64-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

If the divisor is zero, an exception is signaled and a call_pal PAL_gentrap instruction may be issued. Overflow exceptions never occur. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the divqu instruction.)

Longword Remainder (reml) Computes the remainder of the division of two signed 32-bit values. The remainder reml(i,j) is defined as i-(j*divl(i,j)), where j != 0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or by the immediate value and then places the remainder in the destination register.

The reml instruction rounds toward zero; for example, divl(5,-3)=-1 and reml(5,-3)=2.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the reml instruction.)

Longword Remainder Unsigned (remlu) Computes the remainder of the division of two unsigned 32-bit values. The remainder remlu(i,j) is defined as i-(j*divlu(i,j)), where j != 0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the remainder in the destination register.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the remlu instruction.)

Quadword Remainder (remq) Computes the remainder of the division of two signed 64-bit values. The remainder remq(i,j) is defined as i-(j*divq(i,j)) where j != 0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the remainder in the destination register.

The remq instruction rounds toward zero; for example, divq(5,-3)=-1 and remq(5,-3)=2.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the remq instruction.)

Quadword Remainder Unsigned (remqu) Computes the remainder of the division of two unsigned 64-bit values. The remainder remqu(i,j) is defined as i-(j*divqu(i,j)) where j != 0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the remainder in the destination register.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the remqu instruction.)
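Several of the rules in Table 3-5 (wraparound without an exception, overflow detection, scaling, the high half of a product, and division that rounds toward zero) can be checked against a small Python model. This is an illustrative sketch only: the function names simply mirror the mnemonics, operands are assumed to be valid signed values of the stated width, and hardware exceptions are represented by Python exceptions, which is an assumption of the model rather than a description of the PALcode trap mechanism.

```python
MASK64 = (1 << 64) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a signed twos-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def addl(a, b):
    """addl: signed 32-bit sum; an out-of-range result wraps and no exception occurs."""
    return to_signed(a + b, 32)


def addlv(a, b):
    """addlv: signed 32-bit sum; an out-of-range result signals overflow."""
    total = a + b  # a and b are assumed to be valid signed 32-bit values
    if to_signed(total, 32) != total:
        raise OverflowError("integer overflow")
    return total


def s4addq(a, b):
    """s4addq: scale a by four, add b, and keep the low 64 bits as a signed value."""
    return to_signed(4 * a + b, 64)


def umulh(a, b):
    """umulh: high-order 64 bits of the unsigned 128-bit product."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def divq(i, j):
    """divq: signed 64-bit quotient, rounded toward zero."""
    if j == 0:
        raise ZeroDivisionError("divide by zero")
    if i == -(1 << 63) and j == -1:
        raise OverflowError("integer overflow")
    quotient = abs(i) // abs(j)  # Python's // alone would round toward -infinity
    return quotient if (i < 0) == (j < 0) else -quotient


def remq(i, j):
    """remq(i, j) = i - (j * divq(i, j)), where j != 0."""
    return i - j * divq(i, j)
```

With this model, divq(5, -3) yields -1 and remq(5, -3) yields 2, matching the example in the table, whereas Python's own 5 // -3 yields -2 because it rounds toward negative infinity.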


3.3    Logical and Shift Instructions

Logical and shift instructions perform logical operations and shifts on values in registers.

Table 3-6 lists the mnemonics and operands for instructions that perform logical and shift operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-6: Logical and Shift Instruction Formats

Instruction                                 Mnemonic
Logical Complement - NOT                    not
    Operands: s_reg, d_reg
          or  d_reg/s_reg
          or  val_immed, d_reg

Logical Product - AND                       and
Logical Sum - OR                            bis
Logical Sum - OR                            or
Logical Difference - XOR                    xor
Logical Product with Complement - ANDNOT    bic
Logical Product with Complement - ANDNOT    andnot
Logical Sum with Complement - ORNOT         ornot
Logical Equivalence - XORNOT                eqv
Logical Equivalence - XORNOT                xornot
Shift Left Logical                          sll
Shift Right Logical                         srl
Shift Right Arithmetic                      sra
    Operands: s_reg1, s_reg2, d_reg
          or  d_reg/s_reg1, s_reg2
          or  s_reg1, val_immed, d_reg
          or  d_reg/s_reg1, val_immed

Table 3-7 describes the operations performed by logical and shift instructions.

Table 3-7: Logical and Shift Instruction Descriptions

Instruction Description
Logical Complement - NOT (not) Computes the Logical NOT of a value. This instruction performs a complement operation on the contents of s_reg or on the immediate value and places the result in the destination register.
Logical Product - AND (and) Computes the Logical AND of two values. This instruction performs an AND operation between the contents of s_reg1 and either the contents of s_reg2 or the immediate value and then places the result in the destination register.
Logical Sum - OR (bis) Computes the Logical OR of two values. This instruction performs an OR operation between the contents of s_reg1 and either the contents of s_reg2 or the immediate value and then places the result in the destination register.
Logical Sum - OR (or) Synonym for bis.
Logical Difference - XOR (xor) Computes the XOR of two values. This instruction performs an XOR operation between the contents of s_reg1 and either the contents of s_reg2 or the immediate value and then places the result in the destination register.
Logical Product with Complement - ANDNOT (bic) Computes the Logical AND of one value with the ones complement of another. This instruction performs an AND operation between the contents of s_reg1 and the ones complement of either the contents of s_reg2 or the immediate value and then places the result in the destination register.
Logical Product with Complement - ANDNOT (andnot) Synonym for bic.
Logical Sum with Complement - ORNOT (ornot) Computes the Logical OR of one value with the ones complement of another. This instruction performs an OR operation between the contents of s_reg1 and the ones complement of either the contents of s_reg2 or the immediate value and then places the result in the destination register.
Logical Equivalence - XORNOT (eqv) Computes the Logical XOR of one value with the ones complement of another. This instruction performs an XOR operation between the contents of s_reg1 and the ones complement of either the contents of s_reg2 or the immediate value and then places the result in the destination register.
Logical Equivalence - XORNOT (xornot) Synonym for eqv.
Shift Left Logical (sll) Shifts the contents of a register left (toward the sign bit), inserts zeros in the vacated bit positions, and places the result in the destination register. Register s_reg1 contains the value to be shifted, and either the contents of s_reg2 or the immediate value specifies the shift count. If the shift count is greater than 63 or less than zero, s_reg1 is shifted by the result of the following AND operation: shift count AND 63.
Shift Right Logical (srl) Shifts the contents of a register to the right (toward the least significant bit), inserts zeros in the vacated bit positions, and places the result in the destination register. Register s_reg1 contains the value to be shifted, and either the contents of s_reg2 or the immediate value specifies the shift count. If the shift count is greater than 63 or less than zero, s_reg1 is shifted by the result of the following AND operation: shift count AND 63.
Shift Right Arithmetic (sra) Shifts the contents of a register to the right (toward the least significant bit), inserts copies of the sign bit in the vacated bit positions, and places the result in the destination register. Register s_reg1 contains the value to be shifted, and either the contents of s_reg2 or the immediate value specifies the shift count. If the shift count is greater than 63 or less than zero, s_reg1 is shifted by the result of the following AND operation: shift count AND 63.
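
The complement forms and the shift-count rule in Table 3-7 can be illustrated with a small Python model of 64-bit register values. The function names follow the mnemonics, but the functions, MASK64, and to_signed are assumptions introduced for this sketch; they describe the arithmetic only, not the assembler's implementation.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register


def to_signed(x):
    """Interpret a 64-bit pattern as a signed (twos complement) integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def bic(a, b):
    """AND of a with the ones complement of b (andnot is a synonym)."""
    return a & ~b & MASK64


def ornot(a, b):
    """OR of a with the ones complement of b."""
    return (a | ~b) & MASK64


def eqv(a, b):
    """XOR of a with the ones complement of b (xornot is a synonym)."""
    return (a ^ ~b) & MASK64


def sll(a, count):
    """Shift left; only (count AND 63) is used as the shift count."""
    return (a << (count & 63)) & MASK64


def srl(a, count):
    """Shift right, filling the vacated positions with zeros."""
    return (a & MASK64) >> (count & 63)


def sra(a, count):
    """Shift right, filling the vacated positions with the sign bit."""
    return (to_signed(a) >> (count & 63)) & MASK64
```

Under this model a shift count of 64 behaves like a count of 0, and a count of 65 behaves like a count of 1, because only the result of count AND 63 is used.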


3.4    Relational Instructions

Relational instructions compare values in registers.

Table 3-8 lists the mnemonics and operands for instructions that perform relational operations. Each of the instructions listed in the table can take an operand in any of the forms shown.

Table 3-8: Relational Instruction Formats

Instruction                                   Mnemonic  Operands

Compare Signed Quadword Equal                 cmpeq     s_reg1, s_reg2, d_reg
Compare Signed Quadword Less Than             cmplt     or d_reg/s_reg1, s_reg2
Compare Signed Quadword Less Than or Equal    cmple     or s_reg1, val_immed, d_reg
Compare Unsigned Quadword Less Than           cmpult    or d_reg/s_reg1, val_immed
Compare Unsigned Quadword Less Than or Equal  cmpule

Table 3-9 describes the operations performed by relational instructions.

Table 3-9: Relational Instruction Descriptions

Instruction Description
Compare Signed Quadword Equal (cmpeq) Compares two 64-bit values. If the value in s_reg1 equals the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.
Compare Signed Quadword Less Than (cmplt) Compares two signed 64-bit values. If the value in s_reg1 is less than the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.
Compare Signed Quadword Less Than or Equal (cmple) Compares two signed 64-bit values. If the value in s_reg1 is less than or equal to the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.
Compare Unsigned Quadword Less Than (cmpult) Compares two unsigned 64-bit values. If the value in s_reg1 is less than either the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.
Compare Unsigned Quadword Less Than or Equal (cmpule) Compares two unsigned 64-bit values. If the value in s_reg1 is less than or equal to either the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.
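
The difference between the signed and unsigned comparisons in Table 3-9 shows up when the high-order bit of an operand is set. The following Python sketch models the five instructions on 64-bit patterns; the functions, MASK64, and to_signed are assumptions made for this illustration.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register


def to_signed(x):
    """Interpret a 64-bit pattern as a signed (twos complement) integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def cmpeq(a, b):
    return int((a & MASK64) == (b & MASK64))


def cmplt(a, b):
    return int(to_signed(a) < to_signed(b))


def cmple(a, b):
    return int(to_signed(a) <= to_signed(b))


def cmpult(a, b):
    return int((a & MASK64) < (b & MASK64))


def cmpule(a, b):
    return int((a & MASK64) <= (b & MASK64))
```

With a register holding all ones, cmplt against zero produces one (the value is -1 when signed), whereas cmpult against zero produces zero (the value is the largest unsigned quadword).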


3.5    Move Instructions

Move instructions move data between registers.

Table 3-10 lists the mnemonics and operands for instructions that perform move operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-10: Move Instruction Formats

Instruction                            Mnemonic  Operands

Move                                   mov       s_reg, d_reg
                                                 or val_immed, d_reg

Move if Equal to Zero                  cmoveq    s_reg1, s_reg2, d_reg
Move if Not Equal to Zero              cmovne    or d_reg/s_reg1, s_reg2
Move if Less Than Zero                 cmovlt    or s_reg1, val_immed, d_reg
Move if Less Than or Equal to Zero     cmovle    or d_reg/s_reg1, val_immed
Move if Greater Than Zero              cmovgt
Move if Greater Than or Equal to Zero  cmovge
Move if Low Bit Clear                  cmovlbc
Move if Low Bit Set                    cmovlbs

Table 3-11 describes the operations performed by move instructions.

Table 3-11: Move Instruction Descriptions

Instruction Description
Move (mov) Moves the contents of the source register or the immediate value to the destination register.
Move if Equal to Zero (cmoveq) Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is equal to zero.
Move if Not Equal to Zero (cmovne) Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is not equal to zero.
Move if Less Than Zero (cmovlt) Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is less than zero.
Move if Less Than or Equal to Zero (cmovle) Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is less than or equal to zero.
Move if Greater Than Zero (cmovgt) Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is greater than zero.
Move if Greater Than or Equal to Zero (cmovge) Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is greater than or equal to zero.
Move if Low Bit Clear (cmovlbc) Moves the contents of s_reg2 or the immediate value to the destination register if the low-order bit of s_reg1 is equal to zero.
Move if Low Bit Set (cmovlbs) Moves the contents of s_reg2 or the immediate value to the destination register if the low-order bit of s_reg1 is not equal to zero.
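
The conditional moves in Table 3-11 differ only in the test applied to s_reg1. A minimal Python sketch of that behavior follows. It assumes that s_reg1 is tested as a signed 64-bit value and that the destination register keeps its previous contents when the condition is false; the CONDITIONS table, the cmov function, and its parameter names are hypothetical and exist only for this example.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register


def to_signed(x):
    """Interpret a 64-bit pattern as a signed (twos complement) integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


# Test applied to the signed contents of s_reg1 for each mnemonic.
CONDITIONS = {
    "cmoveq": lambda v: v == 0,
    "cmovne": lambda v: v != 0,
    "cmovlt": lambda v: v < 0,
    "cmovle": lambda v: v <= 0,
    "cmovgt": lambda v: v > 0,
    "cmovge": lambda v: v >= 0,
    "cmovlbc": lambda v: (v & 1) == 0,
    "cmovlbs": lambda v: (v & 1) != 0,
}


def cmov(mnemonic, s_reg1, source, d_reg):
    """Return the new destination value: source if the test passes, else d_reg."""
    if CONDITIONS[mnemonic](to_signed(s_reg1)):
        return source & MASK64
    return d_reg & MASK64
```

For instance, cmov("cmoveq", 0, 7, 99) yields 7, while cmov("cmoveq", 1, 7, 99) leaves the destination at 99.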


3.6    Control Instructions

Control instructions change the control flow of an assembly program. They affect the sequence in which instructions are executed by transferring control from one location in a program to another.

Table 3-12 lists the mnemonics and operands for instructions that perform control operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-12: Control Instruction Formats

Instruction                              Mnemonic                     Operands

Branch if Equal to Zero                  beq                          s_reg, label
Branch if Not Equal to Zero              bne
Branch if Less Than Zero                 blt
Branch if Less Than or Equal to Zero     ble
Branch if Greater Than Zero              bgt
Branch if Greater Than or Equal to Zero  bge
Branch if Low Bit is Clear               blbc
Branch if Low Bit is Set                 blbs

Branch                                   br                           d_reg, label
Branch to Subroutine                     bsr                          or label

Jump                                     jmp[Table Note 1]            d_reg, (s_reg), jhint
Jump to Subroutine                       jsr[Table Note 1]            or d_reg, (s_reg)
                                                                      or (s_reg), jhint
                                                                      or (s_reg)
                                                                      or d_reg, address
                                                                      or address

Return from Subroutine                   ret                          d_reg, (s_reg), rhint
Jump to Subroutine Return                jsr_coroutine[Table Note 1]  or d_reg, (s_reg)
                                                                      or d_reg, rhint
                                                                      or d_reg
                                                                      or (s_reg), rhint
                                                                      or (s_reg)
                                                                      or rhint
                                                                      or no_operands

Table Notes:

  1. In addition to the normal operands that can be specified with this instruction, relocation operands can also be specified (see Section 2.6.4).

Table 3-13 describes the operations performed by control instructions. For all branch instructions described in the table, the branch destinations must be defined in the source being assembled, not in an external source file.

Table 3-13: Control Instruction Descriptions

Instruction Description
Branch if Equal to Zero (beq) Branches to the specified label if the contents of the source register is equal to zero.
Branch if Not Equal to Zero (bne) Branches to the specified label if the contents of the source register is not equal to zero.
Branch if Less Than Zero (blt) Branches to the specified label if the contents of the source register is less than zero. The comparison treats the source register as a signed 64-bit value.
Branch if Less Than or Equal to Zero (ble) Branches to the specified label if the contents of the source register is less than or equal to zero. The comparison treats the source register as a signed 64-bit value.
Branch if Greater Than Zero (bgt) Branches to the specified label if the contents of the source register is greater than zero. The comparison treats the source register as a signed 64-bit value.
Branch if Greater Than or Equal to Zero (bge) Branches to the specified label if the contents of the source register is greater than or equal to zero. The comparison treats the source register as a signed 64-bit value.
Branch if Low Bit is Clear (blbc) Branches to the specified label if the low-order bit of the source register is equal to zero.
Branch if Low Bit is Set (blbs) Branches to the specified label if the low-order bit of the source register is not equal to zero.
Branch (br) Branches unconditionally to the specified label. If a destination register is specified, the address of the instruction following the br instruction is stored in that register.
Branch to Subroutine (bsr) Branches unconditionally to the specified label and stores the return address in the destination register. If a destination register is not specified, register $26 (ra) is used.
Jump (jmp) Unconditionally jumps to a specified location. A symbolic address or the source register specifies the target location. If a destination register is specified, the address of the instruction following the jmp instruction is stored in the specified register.
Jump to Subroutine (jsr) Unconditionally jumps to a specified location and stores the return address in the destination register. If a destination register is not specified, register $26 (ra) is used. A symbolic address or the source register specifies the target location. The instruction jsr procname transfers to procname and saves the return address in register $26.
Return from Subroutine (ret) Unconditionally returns from a subroutine. If a destination register is specified, the address of the instruction following the ret instruction is stored in the specified register. The source register contains the return address. If the source register is not specified, register $26 (ra) is used. If a hint is not specified, a hint value of one is used.
Jump to Subroutine Return (jsr_coroutine) Unconditionally returns from a subroutine and stores the return address in the destination register. If a destination register is not specified, register $26 (ra) is used. The source register contains the target address. If the source register is not specified, register $26 (ra) is used.

All jump instructions (jmp, jsr, ret, jsr_coroutine) perform identical operations. They differ only in hints to possible branch-prediction logic. See the Alpha Architecture Reference Manual for information about branch-prediction logic.
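
The way bsr and ret cooperate through register $26 (ra) can be pictured with a toy Python model. It is only a sketch under stated assumptions: registers are held in a dictionary keyed by register number, the addresses are made up, instructions are taken to be 4 bytes long, and the function signatures are hypothetical. Hints, relocation operands, and the conditional branches are not modeled.

```python
INSTRUCTION_BYTES = 4  # assumed instruction length (32-bit instructions)
RA = 26                # register $26 (ra), the default link register


def bsr(regs, pc, target, d_reg=RA):
    """Branch to subroutine: save the return address, then transfer control."""
    regs[d_reg] = pc + INSTRUCTION_BYTES  # address of the following instruction
    return target                         # new program counter


def ret(regs, pc, s_reg=RA, d_reg=None):
    """Return from subroutine: transfer control to the address held in s_reg."""
    target = regs[s_reg]
    if d_reg is not None:
        regs[d_reg] = pc + INSTRUCTION_BYTES
    return target


regs = {}
pc = bsr(regs, pc=0x1000, target=0x2000)  # call: $26 receives 0x1004
pc = ret(regs, pc=pc)                     # return through $26
```

After the two calls, the program counter in this model is back at 0x1004, the address of the instruction that follows the bsr.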


3.7    Byte-Manipulation Instructions

Byte-manipulation instructions perform byte operations on values in registers.

Table 3-14 lists the mnemonics and operands for instructions that perform byte-manipulation operations. Each of the instructions listed in the table can take an operand in any of the forms shown.

Table 3-14: Byte-Manipulation Instruction Formats

Instruction            Mnemonic  Operands

Compare Byte           cmpbge    s_reg1, s_reg2, d_reg
Extract Byte Low       extbl     or d_reg/s_reg1, s_reg2
Extract Word Low       extwl     or s_reg1, val_immed, d_reg
Extract Longword Low   extll     or d_reg/s_reg1, val_immed
Extract Quadword Low   extql
Extract Word High      extwh
Extract Longword High  extlh
Extract Quadword High  extqh
Insert Byte Low        insbl
Insert Word Low        inswl
Insert Longword Low    insll
Insert Quadword Low    insql
Insert Word High       inswh
Insert Longword High   inslh
Insert Quadword High   insqh
Mask Byte Low          mskbl
Mask Word Low          mskwl
Mask Longword Low      mskll
Mask Quadword Low      mskql
Mask Word High         mskwh
Mask Longword High     msklh
Mask Quadword High     mskqh
Zero Bytes             zap
Zero Bytes NOT         zapnot

Table 3-15 describes the operations performed by byte-manipulation instructions.

Table 3-15: Byte-Manipulation Instruction Descriptions

Instruction Description
Compare Byte (cmpbge) Performs eight parallel unsigned byte comparisons between corresponding bytes of register s_reg1 and s_reg2 or the immediate value. A bit is set in the destination register if a byte in s_reg1 is greater than or equal to the corresponding byte in s_reg2 or the immediate value.

The results of the comparisons are stored in the eight low-order bits of the destination register; bit 0 of the destination register corresponds to byte 0 and so forth. The 56 high-order bits of the destination register are cleared.

Extract Byte Low (extbl) Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the low-order byte into the destination register. The seven high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Word Low (extwl) Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the two low-order bytes and stores them in the destination register. The six high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Extract Longword Low (extll) Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the four low-order bytes and stores them in the destination register. The four high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Extract Quadword Low (extql) Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts all eight bytes and stores them in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Extract Word High (extwh) Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the two low-order bytes and stores them in the destination register. The six high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Extract Longword High (extlh) Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the four low-order bytes and stores them in the destination register. The four high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Extract Quadword High (extqh) Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts all eight bytes and stores them in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Byte Low (insbl) Shifts the register s_reg1 left by 0-7 bytes, inserts the byte into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Word Low (inswl) Shifts the register s_reg1 left by 0-7 bytes, inserts the word into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Longword Low (insll) Shifts the register s_reg1 left by 0-7 bytes, inserts the longword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Quadword Low (insql) Shifts the register s_reg1 left by 0-7 bytes, inserts the quadword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Word High (inswh) Shifts the register s_reg1 right by 0-7 bytes, inserts the word into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Longword High (inslh) Shifts the register s_reg1 right by 0-7 bytes, inserts the longword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Insert Quadword High (insqh) Shifts the register s_reg1 right by 0-7 bytes, inserts the quadword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.
Mask Byte Low (mskbl) Sets a byte in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the byte.
Mask Word Low (mskwl) Sets a word in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the word.
Mask Longword Low (mskll) Sets a longword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the longword.
Mask Quadword Low (mskql) Sets a quadword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the quadword.
Mask Word High (mskwh) Sets a word in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the word.
Mask Longword High (msklh) Sets a longword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the longword.
Mask Quadword High (mskqh) Sets a quadword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the quadword.
Zero Bytes (zap) Sets selected bytes of register s_reg1 to zero and places the result in the destination register. Bits 0-7 of register s_reg2 or an immediate value specify the bytes to be cleared to zeros. Each bit corresponds to one byte in register s_reg1; for example, bit 0 corresponds to byte 0. A bit with a value of one indicates its corresponding byte should be cleared to zeros.
Zero Bytes NOT (zapnot) Sets selected bytes of register s_reg1 to zero and places the result in the destination register. Bits 0-7 of register s_reg2 or an immediate value specify the bytes to be cleared to zeros. Each bit corresponds to one byte in register s_reg1; for example, bit 0 corresponds to byte 0. A bit with a value of zero indicates its corresponding byte should be cleared to zeros.
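
Several of the byte-manipulation operations in Table 3-15 are easier to follow with concrete values. The Python sketch below models cmpbge, the low forms of the extract, insert, and mask instructions, and zap and zapnot on 64-bit patterns. The function names ext_low, ins_low, and msk_low, the width letters, and the WIDTH_MASK table are assumptions introduced for this example, and the high forms are deliberately left out.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register

# Field masks for byte, word, longword, and quadword operands.
WIDTH_MASK = {"b": 0xFF, "w": 0xFFFF, "l": 0xFFFFFFFF, "q": MASK64}


def cmpbge(a, b):
    """Set bit i of the result when byte i of a >= byte i of b (unsigned)."""
    result = 0
    for i in range(8):
        byte_a = (a >> (8 * i)) & 0xFF
        byte_b = (b >> (8 * i)) & 0xFF
        if byte_a >= byte_b:
            result |= 1 << i
    return result


def ext_low(a, b, width):
    """extbl/extwl/extll/extql: shift right by bits 0-2 of b bytes, keep the field."""
    return ((a & MASK64) >> (8 * (b & 7))) & WIDTH_MASK[width]


def ins_low(a, b, width):
    """insbl/inswl/insll/insql: place the low field of a at byte offset b, zeros elsewhere."""
    return ((a & WIDTH_MASK[width]) << (8 * (b & 7))) & MASK64


def msk_low(a, b, width):
    """mskbl/mskwl/mskll/mskql: clear the field of a that starts at byte offset b."""
    field = (WIDTH_MASK[width] << (8 * (b & 7))) & MASK64
    return a & MASK64 & ~field


def zap(a, byte_mask):
    """Clear each byte of a whose bit in byte_mask is one."""
    result = a & MASK64
    for i in range(8):
        if (byte_mask >> i) & 1:
            result &= ~(0xFF << (8 * i))
    return result


def zapnot(a, byte_mask):
    """Clear each byte of a whose bit in byte_mask is zero."""
    return zap(a, ~byte_mask & 0xFF)
```

With the value 0x1122334455667788, ext_low with a byte offset of 1 and width "b" gives 0x77, zap with a mask of 0x0F gives 0x1122334400000000, and zapnot with the same mask gives 0x55667788.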


3.8    Special-Purpose Instructions

Special-purpose instructions perform miscellaneous tasks.

Table 3-16 lists the mnemonics and operands for instructions that perform special operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-16: Special-Purpose Instruction Formats

Instruction                           Mnemonic  Operands

Call Privileged Architecture Library  call_pal  palcode

Architecture Mask                     amask     s_reg, d_reg
                                                or val_immed, d_reg

Prefetch Data                         fetch     offset(b_reg)
Prefetch Data, Modify Intent          fetch_m

Read Process Cycle Counter            rpcc      d_reg
Implementation Version                implver

No Operation                          nop       no_operands
Universal No Operation                unop
Trap Barrier                          trapb
Exception Barrier                     excb
Memory Barrier                        mb
Write Memory Barrier                  wmb

Table 3-17 describes the operations performed by special-purpose instructions.

Table 3-17: Special-Purpose Instruction Descriptions

Instruction Description
Call Privileged Architecture Library (call_pal) Unconditionally transfers control to the exception handler. The palcode operand is interpreted by software conventions.
Architecture Mask (amask) The contents of s_reg or the immediate value represent a mask of the architectural extensions that are being requested. Bits that correspond to architectural extensions that are present are cleared, and the result is placed in the destination register.
Prefetch Data (fetch) Indicates that the 512-byte block of data specified by the effective address should be moved to a faster-access part of the memory hierarchy.
Prefetch Data, Modify Intent (fetch_m) Indicates that the 512-byte block of data specified by the effective address should be moved to a faster-access part of the memory hierarchy. In addition, this instruction is a hint that part or all of the data may be modified.
Read Process Cycle Counter (rpcc) Returns the contents of the process cycle counter in the destination register.
Implementation Version (implver) A small integer is placed in the destination register. This integer specifies the major implementation version of the processor on which it is executed. This information can be used to make code-scheduling or tuning decisions. The returned small integer can have the values 0 or 1. 1 indicates an EV5 Alpha chip (21164). 0 indicates EV4, EV45, LCA, and LCA-45 Alpha chips (that is, 21064, 21064A, 21066, 21068, and 21066A, respectively).
No Operation (nop) Has no effect on the machine state.
Universal No Operation (unop) Has no effect on the machine state.
Trap Barrier (trapb) Guarantees that all previous arithmetic instructions are completed, without incurring any arithmetic traps, before any instructions after the trapb instruction are issued.
Exception Barrier (excb) Guarantees that all previous instructions complete any exception-related behavior or rounding-mode behavior before any instructions after the excb instruction are issued.
Memory Barrier (mb) Used to serialize access to memory. See the Alpha Architecture Reference Manual for additional information on memory barriers.
Write Memory Barrier (wmb) Guarantees that all previous store instructions access memory before any store instructions that are issued after the wmb instruction.
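
The amask entry in Table 3-17 describes a bit-clearing operation that is simple to state in code. The Python sketch below is a model under assumptions: the present argument stands for whatever set of extension bits a given processor implements, and the bit values used with it are made up for illustration rather than taken from the architecture. The has_extensions helper is hypothetical.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register


def amask(requested, present):
    """Clear each requested bit whose architectural extension is present."""
    return requested & ~present & MASK64


def has_extensions(requested, present):
    """True when every requested extension is present (the amask result is zero)."""
    return amask(requested, present) == 0
```

In this model, requesting bits 0b0111 on a processor that implements 0b0101 leaves 0b0010 in the destination register, which identifies the one requested extension that is absent.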