Butterfly CPU Core
© 2003-2017 Finitron
Documentation Notes
This document refers to a sixteen bit quantity as a character and a thirty-two bit quantity as a word.
The design objectives help place the processor in its proper field of operation. This design is intended to have a small footprint, in terms of both memory required and FPGA resources consumed. It has also been designed with the resources of a typical FPGA in mind. The following criteria were used as the basis for the design.
A small memory space, potentially less than 65kB but possibly several megabytes of memory.
A shared data / code memory space, most likely using external ROM and RAM resources that are limited in operating frequency.
A narrow bus interface to that memory, 16 bits expected but possibly only eight bits.
Targeted specifically at implementation in a small FPGA, but scalable to a larger project.
Minimal resource use with a level of functionality suitable for most projects. This includes support for a real operating system, externally generated hardware interrupts, and debugging.
An easy target for porting existing high level language compilers and assemblers.
This design meets the above objectives in the following ways. Two versions of the processor are available (produced from the same source code). One is a strictly sixteen bit design with a sixteen bit wide datapath and an address space limited to 65kc (characters) of memory; the other is a thirty-two bit version with a larger address space. A sixteen bit instruction size was chosen to minimize the memory footprint required for programs. The instruction set was designed to be scalable (object code compatible) from the sixteen bit version to the thirty-two bit version. By using a sixteen bit instruction format, the code size benefits of a sixteen bit processor can be obtained with a thirty-two bit processor.
Because this design need not be especially fast, a simple two-stage pipelined processor was chosen. This provides more than adequate performance for the type of memory system that this design is expected to interface to. It is also a non-Harvard architecture: the memory space is shared between code and data to help conserve resources. A reasonably large general purpose register set is available, making the design reasonably compatible with many existing compilers and assemblers. Where needed, additional specialized instructions have been added to the processor to support a sophisticated operating system and interrupt management.
Register Set
General Registers
Code | Register Name | Description |
0000 | r0 or 0 | this register is always zero (unchangeable) |
0001 | r1 | general purpose register |
0010 | r2 | general purpose register |
0011 | r3 | general purpose register |
0100 | r4 | general purpose register |
0101 | r5 | general purpose register |
0110 | r6 | general purpose register |
0111 | r7 | general purpose register |
1000 | r8 | general purpose register |
1001 | r9 | general purpose register |
1010 | r10 | general purpose register |
1011 | r11 | general purpose register |
1100 | r12 | general purpose register |
1101 | r13 | general purpose register (reserved for use by assembler) |
1110 | r14 | general purpose register (reserved for use as the SP by assembler) |
1111 | LR (r15) | (Link Register) This register is automatically updated by the CALL instructions. It is normally used as the subroutine link register and should be used for this purpose where possible. |
Interrupt Link Register
Code | Register Name | Description |
0001 | ILR | Interrupt Link Register |
This register contains the value of the program counter at which an interrupt occurred. It is automatically updated by the interrupt instruction and is used by the interrupt return instruction to restore the original program counter value.
This register may be accessed using the TRS and TSR commands.
Status Register
The status flags are updated during the execution of every instruction. It is highly desirable to hide the internal state of a processor from programs as it prevents program failures due to accidental or malicious manipulation of flags. It also prevents programmers from using status register flags as a means to return information such as error status from subroutines, which is an undesirable way of doing things. Hiding the status flags could be done using interlocked instruction sequences, but it is not done in this design. While it is highly desirable to completely hide the internal state of a processor from a program it is somewhat impractical. A processor that supports interrupt mechanisms typically must have some means of recording and restoring the interrupt mask state at a minimum. This state frequently needs to be saved to memory in a stack frame or as part of process information. Given that this state is visible it makes little sense to hide the remaining flags using processor interlocks, which would only reduce the responsiveness to interrupts.
The real need for the manipulation of status register bits is to manage the interrupt state of the processor. Several older processors allow logical operations on the status register to support this need, however that opens the door to program bugs due to the inadvertent alteration of bits needed to determine branch conditions. To prevent this from happening, and because interrupt management is required, this processor provides specific instructions (DI, EI, RI) for manipulation of the interrupt state without affecting the remaining flags stored in the status register.
There is no reason to manipulate the status register except possibly to save it to or restore it from the stack frame, so no detail of its internal components is provided and it is subject to change. The status register is sixteen bits (a single character) and is split into two halves. The lower portion contains the working copy of the flags and the upper portion contains a backup copy of the flags made when an interrupt occurs. Occurrence of an interrupt automatically copies the working flags to the backup copy, and sets the interrupt mask in the working copy. Execution of an interrupt return instruction (IRET) automatically copies the backup flags back into the working flags.
An additional complication of the status register is support for extended precision operations.
The status register may be accessed using the TRS and TSR commands.
SR (Backup version) | SR (working version) |
Addition / Subtraction / Comparison
Basic addition (ADD), subtraction (SUB) and comparison (CMP) operations are supported; extended precision operations are also supported via the carry input enable (CIN) instruction. The 'add' instruction has a three operand immediate form which is also used as the immediate form for subtract and compare operations. (A - B = A + (-B)). The compare operation is really an alternate form of the subtract operation where the destination register is r0. An additional instruction NEG is provided which is an alternate form of the SUBR instruction.
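The extended precision support mentioned above can be illustrated with a small software model. This is only a sketch: the helper names `add16` and `add32_via_16` are hypothetical, and the model assumes that the carry produced by the low-order addition is simply fed into the high-order addition, as an add-with-carry operation would do.

```python
MASK16 = 0xFFFF

def add16(a, b, carry_in=0):
    """Add two 16-bit values plus a carry-in; return (result, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16

def add32_via_16(x, y):
    """Add two 32-bit values as two chained 16-bit additions."""
    # Low characters: plain add, producing a carry.
    lo, carry = add16(x & MASK16, y & MASK16)
    # High characters: add including the carry from the low half.
    hi, _ = add16(x >> 16, y >> 16, carry_in=carry)
    return (hi << 16) | lo
```

For example, `add32_via_16(0x0001FFFF, 1)` propagates the carry out of the low character into the high character.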
The processor supports a standard set of logical operations: and (AND), inclusive or (OR), and exclusive or (XOR). An additional derived operation is NOT, which is an alternate form of the XOR instruction.
Left shifts are supported via the SHL instruction. Left shifts are the most common type of shift operation because they are frequently used in the multiplication operation and in producing address offsets. Right shifts are rarely used but arithmetic (ASR) and logical (SHR) right shifts of a single bit are supported. Support for right shifts is provided on most processors because it is otherwise difficult to perform. Only single bit rotates (either left (ROL) or right (ROR) ) are supported in this processor because they are rarely used and there are no high level languages that support them. Rotates may also be performed by using a combination of shifting and logical operations.
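As a rough illustration of the last point, a multi-bit rotate can be composed from two shifts and a logical OR. The function name `rol16` and the sixteen bit width are assumptions made for this sketch; it models the arithmetic only, not the instruction sequence a compiler would emit.

```python
def rol16(value, count=1):
    """Rotate a 16-bit value left by count bits using shifts and OR."""
    count %= 16
    value &= 0xFFFF
    # Bits shifted out of the top re-enter at the bottom.
    return ((value << count) | (value >> (16 - count))) & 0xFFFF
```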
Program Flow Control
Branches
Branches are the most frequent form of program flow control operation and are usually performed in a conditional manner. The processor allows a nine bit branch displacement which covers virtually 100% of branch cases. A standard set of branch conditions is provided, outlined in the table below. Additionally a branch to subroutine (program counter relative call) instruction is provided. The branch to subroutine instruction automatically stores the return address in the link register (r15).
Branch conditions are based on the following flags that are maintained in the status register: z, v, c, and n.
Flags
Flag | Operation |
z | set when result is zero |
v | set on signed overflow of result |
n | set if result is negative |
c | set if carry (on add) or borrow (on sub) |
Branch Conditions
Code | Mnemonic | Description | Conditional Test |
0000 | BEQ | Branch if EQual | z |
0001 | BNE | Branch if Not Equal | !z |
0010 | BRA | BRanch Always | 1 |
0011 | BSR | Branch to SubRoutine (relative call) | 1 |
0100 | BMI | Branch if MInus | n |
0101 | BPL | Branch if PLus | !n |
0110 | {reserved} | reserved | |
0111 | {reserved} | reserved | |
1000 | BLT | Branch if Less Than | n ^ v |
1001 | BGE | Branch if Greater than or Equal | !(n ^ v) |
1010 | BLE | Branch if Less than or Equal | (n^v) | z |
1011 | BGT | Branch if Greater Than | !((n^v) | z) |
1100 | BLTU / BCS | Branch if Less Than (Unsigned) | c |
1101 | BGEU / BCC | Branch if Greater than or Equal (Unsigned) | !c |
1110 | BLEU | Branch if Less than or Equal (Unsigned) | c | z |
1111 | BGTU | Branch if Greater Than (Unsigned) | !(c | z) |
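The conditional tests in the table above can be written out as a small lookup. This is a sketch for checking one's reading of the table, not a description of the hardware; the function name `branch_taken` is hypothetical, and the two reserved codes are treated as errors by assumption.

```python
def branch_taken(code, z, v, n, c):
    """Evaluate a 4-bit branch condition code against the z, v, n, c flags."""
    lt = n ^ v  # signed "less than"
    tests = {
        0b0000: z,              # BEQ
        0b0001: not z,          # BNE
        0b0010: True,           # BRA
        0b0011: True,           # BSR
        0b0100: n,              # BMI
        0b0101: not n,          # BPL
        0b1000: lt,             # BLT
        0b1001: not lt,         # BGE
        0b1010: lt or z,        # BLE
        0b1011: not (lt or z),  # BGT
        0b1100: c,              # BLTU / BCS
        0b1101: not c,          # BGEU / BCC
        0b1110: c or z,         # BLEU
        0b1111: not (c or z),   # BGTU
    }
    if code not in tests:
        raise ValueError("reserved branch condition code")
    return bool(tests[code])
```

For instance, comparing 3 with 5 leaves n set and c set (a borrow occurred), so both BLT and BLTU are taken.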
Jumps
Jumps (JMP) are not frequently used in program code so there is minimal support for this operation. Jumps are implemented as a specific case of the jump-and-link (JAL) instruction which is normally used for subroutine calls. By specifying r0 as the register to save the program counter in, a jump operation can be performed because the save of the program counter normally associated with the JAL instruction is nullified.
Subroutine Calls
Subroutine calls may be performed via the CALL (JAL) instructions. One of the problems with a fixed format sixteen bit instruction encoding is that subroutine call operations are complicated. There are simply not enough bits available in the opcode to support a subroutine address. Hence the provision of a call subroutine instruction that uses a program counter relative address to identify the target routine. Obviously, the distance this call can branch is severely limited. The JAL instruction is a simple, powerful instruction common in many newer architectures that provides for most common program flow transfer operations that are not covered by branches. This single instruction can perform regular jump operations, subroutine call operations using absolute or register indirect addresses, and a return from subroutine operation. The default form of the JAL instruction where the return address is stored in the link register (r15) is referred to as CALL.
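The way a single JAL instruction covers jumps, calls and returns can be sketched with a toy register file. The function name `jal`, the list-based register file, and the sixteen bit address wrap are assumptions of this model; the value saved as the "program counter" is whatever return address the hardware defines.

```python
def jal(regs, pc, rd, rs, disp):
    """Model JAL Rd,d[Rs]: save pc in Rd (unless Rd is r0) and return the new pc."""
    target = (regs[rs] + disp) & 0xFFFF
    if rd != 0:          # r0 is always zero, so the save is nullified
        regs[rd] = pc
    return target

regs = [0] * 16
regs[3] = 0x0100
jmp_pc = jal(regs, 0x0020, 0, 3, 4)    # JMP 4[r3]: no link saved
call_pc = jal(regs, 0x0020, 15, 3, 0)  # CALL [r3]: link saved in r15
ret_pc = jal(regs, 0x0200, 0, 15, 0)   # RET: jump back through r15
```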
It is sometimes desirable to stop processing at the current position in a program and wait for an external event to occur. This is often done in operating system code to make the most efficient use of resources. The STOP instruction provides the facility to wait for an externally generated event to occur. The processor monitors the status of an external signal called 'go', and if active allows the processor to exit the stopped state. This means the processor can be synchronized to external hardware. The 'go' signal may or may not be connected to an interrupt signal. Unless the go signal is connected to an interrupt signal, it merely releases the processor from the stop state with no other effect (stop/go do not cause an interrupt by themselves).
Interrupts
This processor supports interrupts in the form of a system call. When a hardware interrupt occurs the appropriate system call instruction is forced into the processor's internal instruction pipeline. There are several pre-defined system calls to specific addresses in order to support hardware interrupts, as listed in the table below. A system call jumps to the interrupt subroutine whose address is encoded directly in the system call instruction. The IRET instruction is used to return from a system call. The interrupt subroutine begins at the address specified in the system call. The pre-defined system call addresses are spaced every four characters in order to allow an extended jump instruction to the actual routine to be placed there. If desired, other program code or data may be placed in these locations. Addresses above FFC0 should not be used as these are reserved for future use.
Occurrence of an interrupt results in the status register (sr) being copied to the backup version, the maskable interrupt mask being set to disable further interrupts, and the interrupt return address being copied to r14, so that both the flags and the return address may be restored by the interrupt return operation.
System Call Address | Vector Type / Cause | Instruction Mnemonic | Comment |
FFFC | reset | RESET | will be called automatically during hardware reset |
FFF8 | non-maskable interrupt | NMI | will be automatically called during hardware interrupt |
FFF4 | maskable interrupt | IRQ | will be automatically called during hardware interrupt |
FFF0 | watch address match | WAT | called automatically when the current address matches the watch address register |
FFC4 | software interrupt | SYS | system call - meant to be used to call OS system functions |
FFC0 | break | BRK | called automatically when a null character is found in the instruction stream |
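The interrupt entry and return sequence described above can be modelled in a few lines. This is a sketch under stated assumptions: the status register layout is deliberately undocumented, so the bit position `IM` of the interrupt mask is hypothetical, as are the function names and the dictionary used to hold processor state. The return address is kept in r14, following the description above.

```python
IM = 0x10  # hypothetical position of the interrupt mask in the working half

def enter_interrupt(state, vector):
    """Copy working flags to the backup half, set the mask, save pc, jump."""
    working = state["sr"] & 0x00FF
    state["sr"] = (working << 8) | working | IM
    state["r14"] = state["pc"]
    state["pc"] = vector

def iret(state):
    """Copy the backup flags back to the working half and resume."""
    backup = (state["sr"] >> 8) & 0xFF
    state["sr"] = (state["sr"] & 0xFF00) | backup
    state["pc"] = state["r14"]

state = {"sr": 0x0003, "pc": 0x1234, "r14": 0}
enter_interrupt(state, 0xFFF4)   # maskable interrupt vector from the table
after_entry = dict(state)
iret(state)
```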
This processor does not support a byte oriented memory format because of a number of factors (internationalization of character sets, availability of large size low cost memory, frequent use of values larger than a byte, and performance to name a few). All addressing is in terms of characters which are sixteen bit quantities. The effect of this is to double the available address range supported by the sixteen bit version of the processor.
This is a load / store architecture; the only operations accessing memory are load and store operations. The processor supports both character (16 bit) and word (32 bit) loads and stores of data to memory (LC, LW, SC, SW). Character values loaded are sign extended to the width of the implementation when loaded.
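Sign extension of a loaded character can be sketched as follows. The helper name `sign_extend` and its parameters are illustrative only; the thirty-two bit default width corresponds to the wider version of the processor.

```python
def sign_extend(value, bits, width=32):
    """Sign extend the low `bits` of value to `width` bits (as an unsigned int)."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):   # sign bit set: value is negative
        value -= 1 << bits
    return value & ((1 << width) - 1)
```

A character of 0x8000 therefore loads as 0xFFFF8000 in the thirty-two bit version, while 0x7FFF loads unchanged.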
Ideally the processor should be able to support three operand instructions (one destination, two sources) including cases where all three operands are registers, in order to allow the compiler to allocate registers more efficiently. However, from a practicality standpoint it is simpler and smaller (and thus faster) to use an instruction set that supports only two register operands at once within an FPGA. This does not degrade the performance of the processor to a significant degree as fewer than 1/4 of the instructions executed actually require three register operands, and of those not all will use three different registers. Supporting a register file with three independent ports consumes over twice as many resources for the register file as supporting a register file with two ports in the typical FPGA.
This is a sixteen register design. Limiting the register set to sixteen registers allows a compact instruction format and is consistent with the efficient use of resources within an FPGA.
The opcode format used here is a fixed sixteen bit format with an optional additional character (16 bits) containing an extended immediate value. The mechanism used to encode immediate values in the instruction stream is somewhat convoluted. The goal is to encode thirty-two bit immediate constants in the instruction stream without using an intermediate register or extra instructions, without lengthening the processor pipeline or causing unnecessary stalls, and without wasting program space. The constant prefix instruction supplies twelve (bits four through fifteen) or twenty-eight additional bits for the constant when the four bit constant value encoded within an instruction is not sufficient. With the constant prefix instruction both sixteen bit and thirty-two bit immediates are supported; sixteen bit immediates are sign extended to thirty-two bits. The constant prefix and the following one or two characters are executed in an interlocked fashion so that no interrupt can occur between them.
Basic Opcode Formats
op4 | op12 | misc | misc / reserved |
op4 | rd4 | rs4 | op4 | rr | register-register | |
op4 | rd4 | op4 | imm4 | rc | register-constant | |
op3 | sz1 | const12 | cp | constant prefix | ||
op3 | cond3 | disp10 | br | branch | ||
op4 | rd4 | rs4 | const4 | rrc | register-register-constant |
More Detailed Opcode Formats
op4 | op12 | inst. | desc. |
0 | op12 | misc | misc. | brk, nop, end, stop, di, ei, ri, sys | |||
1 | rd4 | rs4 | imm4 | ADD | add immediate | add, sub, cmp | |
2 | rd4 | rs4 | op4 | rr | register-register operate | add, sub, cmp, and, or, xor | |
3 | rd4 | op4 | imm4 | ri | register-immediate operate | and, or, xor, subr, tsr, trs | |
4 | const12 | CP | constant prefix | ||||
5 | reserved for constant | ||||||
8 | rd4 | rs4 | disp4 | JAL | jump and link | jal, iret, call, jmp, ret | |
6 | rd4 | imm8 | ADDI8 | ||||
7 | reserved | ||||||
9 | Disp12 | CALL | |||||
A/B | cond4 | disp9 | Bcc | conditional branch | beq, bne, blt, bge, ble, bgt, bra, bsr, bmi, bpl, bltu, bgeu, bleu, bgtu | ||
C | rd4 | rs4 | disp4 | SB | store byte | sc | this is a 8 bit quantity |
D | rd4 | rs4 | disp4 | SW | store word | sw | this is a 16 bit quantity |
E | rd4 | rs4 | disp4 | LB | load byte | lc | this is a 8 bit quantity |
F | rd4 | rs4 | disp4 | LW | load word | lw | this is a 16 bit quantity |
16 bit sign extended constant format
44 | const15..4 |
op12 | const3..0 |
Arithmetic / Logical operations | Subroutine Calls / Jumps | Memory Operations | Miscellaneous | Branches | Interrupt Management | |||
ADC | ROR | JAL | LB | CON | BEQ | BRK | STOP | |
ADD | SBC | JMP | LEA | NOP | BGE | DI | SYS | |
AND | SBCR | CALL | LW | TRS | BGT | EI | ||
ASR | SHL | RET | SB | TSR | BLE | IRQ | ||
CMP | SHR | SW | BLT | NMI | ||||
NEG | SUB | Special | BNE | RI | ||||
NOT | SUBR | BRA | RESET | |||||
OR | XOR | END | IRET | |||||
ROL |
ADC Rd,Rs | |
ADC Rd,#n |
Synopsis
Arithmetic 'add' register with register or immediate, including carry.
Detail
Rd = Rd + Rs + c
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0001 |
Rd = Rd + #imm + c
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0001 | n3..0 |
ADD Rd,Rs | |
ADD Rd,Rs,#n |
Synopsis
Arithmetic 'add' register with register or immediate.
Detail
Rd = Rd + Rs
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0000 |
Rd = Rs + #imm
desc: | op | Rd | Rs | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0001 | R3..0 | R3..0 | n3..0 |
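The two ADD encodings above can be assembled by hand as shown in the following sketch. The function names are hypothetical; the field positions and opcode values are taken from the bit pattern rows above.

```python
def encode_rr(rd, rs, func, op=0b0010):
    """Register-register format: op | Rd | Rs | function."""
    return (op << 12) | ((rd & 0xF) << 8) | ((rs & 0xF) << 4) | (func & 0xF)

def encode_add_imm(rd, rs, imm4):
    """Three operand add immediate: 0001 | Rd | Rs | imm4."""
    return (0b0001 << 12) | ((rd & 0xF) << 8) | ((rs & 0xF) << 4) | (imm4 & 0xF)

add_r1_r2 = encode_rr(1, 2, 0b0000)       # ADD r1,r2
add_r3_r4_5 = encode_add_imm(3, 4, 5)     # ADD r3,r4,#5
```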
AND Rd,Rs |
AND Rd,#n |
Synopsis
Logically 'and' register with register or immediate.
Detail
Rd = Rd & Rs
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0101 |
Rd = Rd & #imm
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0101 | n3..0 |
ASR Rd |
Synopsis
Arithmetically shift register right by one bit.
Detail
Rd = Rd >> 1
The sign bit of the register is preserved during the shift.
desc: | op | rd | op | resv. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 1001 | 0001 |
Bcc label
Synopsis
Branch to label if condition is true
Detail
if cond then
pc = pc + disp
desc: | op | cond. | disp. |
size: | 3 | 3 | 10 |
bits: | 15 13 | 12 10 | 9 0 |
bit pattern: | 101 | c2..0 | n9..0 |
Branch conditions are based on the following flags that are maintained in the status register: c, z, v, and n.
Flags
Flag | Operation |
z | set when result is zero |
v | set on signed overflow of result |
n | set if result is negative |
Branch Conditions
Code | Mnemonic | Description | Conditional Test |
000 | BEQ | Branch if EQual | z = 1 |
001 | BNE | Branch if Not Equal | z = 0 |
010 | BRA | BRanch Always | 1 |
011 | CALL | Branch to SubRoutine (CALL) | 1 |
100 | BLT | Branch if Less Than | n ^ v |
101 | BGE | Branch if Greater than or Equal | !(n ^ v) |
110 | BLE | Branch if Less than or Equal | (n^v) | z |
111 | BGT | Branch if Greater Than | !((n^v) | z) |
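Splitting a branch instruction into its fields, following the three bit condition and ten bit displacement layout shown above, might look like the sketch below. The function name `decode_bcc` is hypothetical, and treating the displacement as a signed two's complement value is an assumption.

```python
def decode_bcc(word):
    """Return (cond, disp) for a branch encoded as 101 | cond3 | disp10."""
    if (word >> 13) != 0b101:
        raise ValueError("not a branch instruction in this format")
    cond = (word >> 10) & 0x7
    disp = word & 0x3FF
    if disp & 0x200:        # assumed: displacement is signed
        disp -= 0x400
    return cond, disp
```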
BRK |
Synopsis
Run break routine.
Detail
r14 = pc; pc = FFC0; flags.backup = flags; flags.im = 1
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0000 | 0000 |
This instruction causes the processor to perform a software initiated 'break' interrupt routine. It is purposely defined as a zero (null) character so the processor will execute the break routine in the event that code flows into a region of memory that has been nulled out. This helps promote reliable system operation.
CALL d[Rs] | |
CALL d12[pc] |
Synopsis
Call subroutine
Detail
r15 = pc; pc = disp + Rs;
Call subroutine using register indirect with displacement mode. This is an alternate form of the JAL instruction.
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0100 | 1111 | R3..0 | n3..0 |
r15 = pc; pc = pc + disp;
Call subroutine using program counter relative form. The twelve bit displacement is shifted left once before use allowing a call +/- 4Kb from the current PC.
desc: | op | disp |
size: | 4 | 12 |
bits: | 15 12 | 11 0 |
bit pattern: | 1001 | d11..0 |
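The target of the program counter relative form can be computed as in the sketch below. The function name `call_target` is hypothetical, and the sketch assumes the twelve bit displacement is a signed value that is doubled and added to the pc value as given; the exact pc reference point is not stated in the description above.

```python
def call_target(pc, disp12, width=16):
    """Compute pc + (signed disp12 << 1), wrapped to the address width."""
    disp = disp12 & 0xFFF
    if disp & 0x800:        # assumed: displacement is signed
        disp -= 0x1000
    return (pc + (disp << 1)) & ((1 << width) - 1)
```

Under these assumptions the reachable range is 4096 characters backward and 4094 characters forward of the pc value.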
CMP Rd,Rs |
CMP Rs,#n (see also ADD, SUB) |
Synopsis
Compare register with register or immediate.
Detail
flags = Rd - Rs
desc: | op | rd | rs | op |
size: | 3 | 4 | 4 | 5 |
bits: | 15 13 | 12 9 | 8 5 | 4 0 |
bit pattern: | 010 | R3..0 | R3..0 | 00110 |
flags = Rs - #imm
desc: | op | Rd | Rs | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0001 | 0000 | R3..0 | n3..0 |
CON #imm |
Synopsis
Constant prefix
desc: | op | size | imm |
size: | 3 | 1 | 12 |
bits: | 15 13 | 12 | 11 0 |
bit pattern: | 100 | sz | n11..0 |
The upper bits of the immediate constant are set for the next instruction, overriding sign extension of the immediate. The constant prefix instruction indicates the presence of an additional twelve or twenty-eight constant bits in the instruction stream. If the size bit is set to one, then the next instruction character contains bits 16 to 31 of the constant for the following instruction. If the size bit is zero, then the constant prefix only includes bits 4 through 15 which are sign extended to produce the constant for the next instruction. Interrupts are prevented from occurring between this instruction and the next instruction. Normally this instruction is automatically inserted by the assembler wherever an extended constant value is required.
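The way the constant is assembled from the prefix and the following instruction can be sketched as below. The function name `build_immediate` and its parameters are hypothetical. The sketch assumes that a four bit immediate with no prefix is sign extended, which the phrase "overriding sign extension" suggests but does not state outright.

```python
def build_immediate(imm4, prefix12=None, upper16=None):
    """Combine the 4-bit field with optional prefix bits into a 32-bit constant."""
    if prefix12 is None:
        value, bits = imm4 & 0xF, 4                       # no prefix
    else:
        # Prefix supplies bits 15..4; the instruction supplies bits 3..0.
        value, bits = ((prefix12 & 0xFFF) << 4) | (imm4 & 0xF), 16
        if upper16 is not None:
            # Size bit set: an extra character supplies bits 31..16.
            value, bits = ((upper16 & 0xFFFF) << 16) | value, 32
    if value & (1 << (bits - 1)):                         # sign extend
        value -= 1 << bits
    return value & 0xFFFFFFFF
```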
DI |
Synopsis
Disable interrupts.
Detail
flags.im = 1
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0001 | 0001 |
This instruction disables maskable hardware interrupts by setting the interrupt mask in the status register. It also disables the IRQ instruction.
EI |
Synopsis
Enable interrupts.
Detail
flags.im = 0
desc: | op | op. | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0001 | 0000 |
This instruction enables maskable hardware interrupts by clearing the interrupt mask in the status register.
END |
Synopsis
No operation.
Detail
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0001 | 1000 |
This instruction is provided for use in a software emulator of the processor. It indicates the end of a sequence of instructions to emulate. The processor will treat this instruction as a NOP instruction.
IRET |
Synopsis
Return from interrupt subroutine.
Detail
pc = r14; flags = backup flags
desc: | op | Rd | Rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0100 | 0000 | 1110 | 0000 |
This instruction returns from an interrupt routine by restoring the flag register and jumping back to the code that was interrupted (whose address is stored in r14). This is a special form of the JAL instruction.
IRQ |
Synopsis
Run irq routine.
Detail
r14 = pc; pc = FFFA; flags.backup = flags; flags.im = 1
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0000 | 1101 |
This instruction causes the processor to perform a software initiated maskable interrupt routine. It has the same effect as an external hardware maskable interrupt. If the interrupt mask in the status register is set, then this instruction will be ignored.
JAL Rd,d[Rs] |
Synopsis
Jump and link to subroutine
Detail
Rd = pc; pc = disp + Rs;
Jump to subroutine using register indirect with displacement mode. The current value of the program counter is stored in the destination register. Note: r14 cannot be used as a source register. When r14 is specified as the source, the JAL is interpreted as an interrupt return (IRET) instead.
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0100 | R3..0 | R3..0 | n3..0 |
JMP d[Rs] |
Synopsis
Jump to target
Detail
Rd = pc; pc = disp + Rs;
Jump to target code using register indirect with displacement mode. This is an alternate form of the JAL instruction.
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0100 | 0000 | R3..0 | n3..0 |
LB | Rd,d[Rs] |
Synopsis
Load register byte from memory
Detail
Rd = memory [Rs + disp]
desc: | op | rd | rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1110 | R3..0 | R3..0 | d3..0 |
The character loaded from memory is sign extended to the register width.
LEA Rd,d[Rs] |
Synopsis
Load effective address.
Detail
Rd = Rs + disp
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0001 | R3..0 | R3..0 | d3..0 |
The effective address is loaded into the target register. This instruction is an alternate form of the ADD instruction.
LW | Rd,d[Rs] |
Synopsis
Load register word from memory
Detail
Rd = memory [Rs + disp]
desc: | op | rd | rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1111 | R3..0 | R3..0 | d3..0 |
NEG Rd |
Synopsis
Take twos complement of register.
Detail
Rd = -Rd
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0010 | 0000 |
NMI |
Synopsis
Run nmi routine.
Detail
r14 = pc; pc = FFFF_FFF8; flags.backup = flags; flags.im = 1
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0000 | 1110 |
This instruction causes the processor to perform a software initiated non-maskable interrupt routine. It has the same effect as an external hardware non-maskable interrupt.
NOP |
Synopsis
No operation
Detail
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0001 | 0100 |
This instruction acts merely as a placeholder. It performs no operation and has no effect on the processor. Many processors lack an explicit NOP operation resulting in different instructions being used for this purpose within the same processor. By providing an explicit NOP instruction some consistency in programs can be achieved. Without a NOP instruction many processors typically use instructions which have side effects in the form of affecting the processor status flags; this is not the case here.
NOT Rd |
Synopsis
Take ones complement of register.
Detail
Rd = ~Rd
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0100 | 1111 |
OR Rd,Rs |
OR Rd,#n |
Synopsis
Logically inclusively 'or' register with register or immediate.
Detail
Rd = Rd | Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0110 |
Rd = Rd | #imm
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0110 | n3..0 |
RI |
Synopsis
Restore interrupt flag from backup copy.
Detail
flags.im = flags.backup im
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0001 | 0010 |
This instruction restores the previous interrupt flag state from the backup copy of the status register. This allows restoring the interrupt state that was present when the backup of the status register was made. This is useful in operating system code where interrupts must be disabled to perform certain system functions (like updating system lists) and then restored after performing the operation.
RESET |
Synopsis
Run reset routine.
Detail
r14 = pc; pc = FFFF_FFFC; flags.backup = flags; flags.im = 1
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0000 | 1111 |
This instruction causes the processor to perform a software initiated reset routine. It has the same effect as an external hardware reset.
ROL Rd |
Synopsis
Rotate register left.
Detail
Rd = Rd << 1
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 1001 | 0001 |
ROR Rd |
Synopsis
Rotate register right.
Detail
Rd = Rd >> 1
desc: | op | rd | op | imm. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 1100 | 0001 |
Only single bit rotates are supported.
RET |
Synopsis
Return from subroutine.
Detail
pc = r15
desc: | op | rd | rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0100 | 0000 | 1111 | 0000 |
This instruction returns to the calling routine by loading the program counter with the contents of the link register. This is an alternate form of the jump-and-link (JAL) instruction. It is possible to return to a point in the program after the subroutine call by specifying a positive displacement instead of zero. This allows constant parameters to a subroutine call to be placed directly in code immediately after the calling instruction.
SBC | Rd,Rs |
Synopsis
Subtract register from register, including carry.
Detail
Rd = Rd - Rs - c
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0011 |
SBCR | Rd,#n |
Synopsis
Subtract register from immediate, including carry.
Detail
Rd = n - Rd - c
Note: this instruction subtracts the register from the immediate constant, which is opposite to what is usually desired.
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0011 | n3..0 |
SB | Rd,d[Rs] |
Synopsis
Store byte from register to memory
Detail
memory [Rs + disp] = Rd
desc: | op | rd | rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1100 | R3..0 | R3..0 | d3..0 |
SHL Rd |
Synopsis
Arithmetically shift register left.
Detail
Rd = Rd << 1
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | 1000 | 0001 |
SHR Rd |
Synopsis
Logically shift register right by one bit.
Detail
Rd = Rd >> 1
desc: | op | rd | op | resv. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 1010 | 0001 |
STOP |
Synopsis
Stop processor.
Detail
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0001 | 0101 |
This instruction stops the processor causing it to wait for the 'go' signal to occur before continuing. This allows the processor to be synchronized with external hardware. The 'go' signal is an external signal provided by hardware that may be tied to the hardware interrupt system. The stop instruction by itself does not automatically cause an interrupt routine to execute.
SUBR | Rd,#n |
Synopsis
Subtract register from immediate.
Detail
Rd = n - Rd
Note: this instruction is usually the opposite of what's needed.
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0010 | n3..0 |
SW | Rd,d[Rs] |
Synopsis
Store word from register to memory
Detail
memory [Rs + disp] = Rd
desc: | op | rd | rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1101 | R3..0 | R3..0 | d3..0 |
SUB | Rd,Rs | |
SUB | Rd,Rs,#n |
Synopsis
Subtract immediate or register from register.
Detail
Rd = Rd - Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0010 |
Rd = Rs - n
This is really the add instruction with the immediate constant automatically negated by the assembler. Because it is really an add instruction, the immediate constant is limited to the negated range of the add immediate, that is, from -(largest positive value) to -(smallest negative value).
desc: | op | rd | rs | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0001 | R3..0 | R3..0 | n3..0 |
SYS |
Synopsis
Perform system call.
Detail
r14 = pc; pc = FFE2; flags.backup = flags; flags.im = 1
desc: | op | op | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0000 | 0000 | 0000 | 0001 |
This instruction causes the processor to perform a system call. It is meant for operating system support.
TRS | Rd,Spr |
Synopsis
Transfer register to special purpose register.
Detail
Special Register = Rd
desc: | op | rd | op | spr |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 1111 | R3..0 |
Special Registers
Currently there are only five special registers defined, the remaining codes are reserved for future use.
Code | Register Name | Description |
0000 | SR | Status Register - contains processor flags |
0001 | ILR | Interrupt link register |
0010 | WAR | Watch Address Register |
TSR | Rd,Spr |
Synopsis
Transfer special purpose register to register.
Detail
Rd = Special Register
desc: | op | rd | op | spr |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 1110 | R3..0 |
Special Registers
Currently there are only five special registers defined, the remaining codes are reserved for future use.
Code | Register Name | Description |
0000 | SR | Status Register - contains processor flags |
0001 | ILR | Interrupt link register |
0010 | WAR | Watch Address Register |
0011 | VER | reserved for core version number |
0100 | ID | Core identifier (hardware thread id) |
XOR Rd,Rs | |
XOR Rd,#n |
Synopsis
Logically exclusively 'or' register with register or immediate.
Detail
Rd = Rd ^ Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0010 | R3..0 | R3..0 | 0100 |
Rd = Rd ^ n
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 0011 | R3..0 | 0100 | n3..0 |