Fastest implementation of rotate shift left (or right) by several bits on AVR and MSP
Brief Introduction: MSP has sixteen 16-bit registers. Four of them are dedicated: the program counter (r0 or pc), the stack pointer (r1 or sp), the status register (r2 or sr/cg1) and the constant generator (r3 or cg2). The remaining 12 registers (r4-r15) are general-purpose registers. There are 52 instructions in total.
Some MSP instructions behave differently from their counterparts on other microcontrollers.
bit rs, rd : computes rs & rd and sets the status flags only; the destination is not written.
Logical instructions set C to the opposite of Z (C is set if the result is NOT zero), and clear V to 0.
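bit #1, r8; C = bit 0 of r8 (r8 is not modified)
rrc r9; rotate r9 right through carry: old C enters bit 15, old bit 0 of r9 goes to C
rrc r8; old bit 0 of r9 enters bit 15 of r8, so r9:r8 is rotated right by 1 bit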
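The three instructions above can be checked on a host machine with a small C model. This is only a sketch: the struct and the function names are made up for illustration, and it assumes r9 holds the high word and r8 the low word of a 32-bit value.

```c
#include <stdint.h>

/* Hypothetical model of the relevant MSP state: two registers and the carry flag. */
typedef struct {
    uint16_t r8;   /* low word (assumption)  */
    uint16_t r9;   /* high word (assumption) */
    unsigned c;    /* carry flag, 0 or 1     */
} msp_state;

/* bit #mask, reg: C = 1 if (reg & mask) is not zero; reg is not written. */
void msp_bit(msp_state *s, uint16_t mask, uint16_t reg) {
    s->c = (reg & mask) != 0;
}

/* rrc reg: old C enters bit 15, old bit 0 becomes the new C. */
uint16_t msp_rrc(msp_state *s, uint16_t reg) {
    unsigned out = reg & 1u;
    uint16_t res = (uint16_t)((reg >> 1) | (s->c << 15));
    s->c = out;
    return res;
}

/* Rotate the 32-bit value r9:r8 right by one bit, following the three instructions. */
uint32_t rotr32_by_1(uint32_t x) {
    msp_state s = { (uint16_t)x, (uint16_t)(x >> 16), 0 };
    msp_bit(&s, 1u, s.r8);        /* bit #1, r8 */
    s.r9 = msp_rrc(&s, s.r9);     /* rrc r9     */
    s.r8 = msp_rrc(&s, s.r8);     /* rrc r8     */
    return ((uint32_t)s.r9 << 16) | s.r8;
}
```

The bit instruction is what makes this a rotate rather than a shift: it copies bit 0 of the low word into C before the first rrc, so that bit ends up in bit 15 of the high word.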
A byte instruction with a register destination clears the higher 8 bits of the register to 0.
mov(.b) @rs+, rd : Indirect autoincrement. The operand is read from memory at the address held in rs, and rs is then incremented by 1 (byte operation) or 2 (word operation).
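A minimal C sketch of the two rules above (byte destination clears the upper 8 bits, autoincrement steps by 1 or 2). The function names and the idea of modelling memory as a byte array with rs as an index are assumptions made for illustration; the word load assumes little-endian byte order and an even address.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical model: 'mem' stands in for data memory, '*rs' is a byte index into it. */

/* mov.b @rs+, rd: load one byte, upper 8 bits of rd become 0, rs += 1. */
uint16_t mov_b_autoinc(const uint8_t *mem, size_t *rs) {
    uint16_t rd = mem[*rs];   /* zero-extended: bits 15..8 are cleared */
    *rs += 1;
    return rd;
}

/* mov @rs+, rd: load one little-endian word, rs += 2. */
uint16_t mov_w_autoinc(const uint8_t *mem, size_t *rs) {
    uint16_t rd = (uint16_t)(mem[*rs] | (mem[*rs + 1] << 8));
    *rs += 2;
    return rd;
}
```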
Calling Convention and ABI Changes in MSP GCC
The stack pointer is always even, so pop and pop.b both increase SP by 2, and push and push.b both decrease SP by 2.
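The SP behaviour for byte pushes and pops can be sketched in C as follows. The struct, the 16-byte stack size and the function names are hypothetical, and the model assumes the pushed byte is stored at the address SP points to after the decrement.

```c
#include <stdint.h>

/* Hypothetical model: a small stack memory and a byte-valued SP that grows downward. */
typedef struct {
    uint8_t  mem[16];
    uint16_t sp;       /* always even */
} msp_stack;

/* push.b src: SP -= 2 (not 1), then the byte is stored at the new SP. */
void push_b(msp_stack *s, uint8_t v) {
    s->sp -= 2;
    s->mem[s->sp] = v;
}

/* pop.b dst: the byte at SP is read, then SP += 2 (not 1). */
uint8_t pop_b(msp_stack *s) {
    uint8_t v = s->mem[s->sp];
    s->sp += 2;
    return v;
}
```

Each pushed byte therefore costs a full word of stack space.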
The least number of instructions needed for a rotate shift by x bits. A positive x means rotating right; a negative x means rotating left. The red pointers mark the basic operations, which cannot be implemented using the others. For example, a rotate left by 2 bits can be implemented as two rotates left by 1 bit, but not the other way round. (The same holds for AVR.)
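The sign convention and the idea of building larger rotations from a basic one can be sketched in C. This is an illustration only: it composes everything from a 1-bit left rotate, which is correct but not the minimal instruction count; the helper names are made up, and x is assumed to be a small value such as -15..15.

```c
#include <stdint.h>

/* Basic operation: rotate a 16-bit word left by one bit (hypothetical helper). */
uint16_t rotl16_by_1(uint16_t v) {
    return (uint16_t)(((uint32_t)v << 1) | (v >> 15));
}

/* Rotate by x with the convention above: x > 0 rotates right, x < 0 rotates left. */
uint16_t rot16(uint16_t v, int x) {
    /* Equivalent left-rotate count in 0..15; a right rotate by x is a left rotate by 16 - x. */
    unsigned left = (unsigned)(((-x) % 16 + 16) % 16);
    for (unsigned i = 0; i < left; i++)
        v = rotl16_by_1(v);
    return v;
}
```

For example, rot16(v, -2) is exactly two applications of rotl16_by_1, while the 1-bit rotate itself cannot be obtained from the 2-bit one.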
On AVR, most of the 133 instructions execute in a single cycle. The rich instruction set is combined with 32 8-bit general-purpose registers (r0-r31) that have single-clock access time. Six of the 32 registers can be used as three 16-bit indirect register pointers (X: r26-r27; Y: r28-r29; Z: r30-r31) for addressing the data space.
The instructions ldi r26, low(key) and ldi r27, high(key) cannot be used in assembly code written for the C toolchain. Use ldi r26, lo8(key) and ldi r27, hi8(key) instead.
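What lo8 and hi8 compute can be written out in C. The functions below are hypothetical C stand-ins for the assembler operators, and key_addr is an assumed 16-bit address of key.

```c
#include <stdint.h>

/* Hypothetical C equivalents of the assembler operators lo8() and hi8(). */
uint8_t lo8(uint16_t addr) { return (uint8_t)(addr & 0xff); }
uint8_t hi8(uint16_t addr) { return (uint8_t)(addr >> 8); }

/* Model of 'ldi r26, lo8(key)' followed by 'ldi r27, hi8(key)': returns the X pointer. */
uint16_t load_x(uint16_t key_addr) {
    uint8_t r26 = lo8(key_addr);   /* low byte of X  */
    uint8_t r27 = hi8(key_addr);   /* high byte of X */
    return (uint16_t)((r27 << 8) | r26);
}
```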
Despite #include "constants.h", some constants, such as KEY_SIZE and NUMBER_OF_ROUNDS, still cannot be used directly. Therefore, immediate numbers are used.
The second operand of adiw must be in the range [0, 63].
Therefore, adiw r28, 176 is wrong (the operand is out of range). It can be replaced by:
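One common way to add a larger constant to a register pair, which may or may not be the replacement the author had in mind, is to subtract the negated constant with subi r28, lo8(-176) followed by sbci r29, hi8(-176). The C sketch below models that pair of instructions on the bytes of Y; the function names are made up for illustration.

```c
#include <stdint.h>

/* adiw accepts only constants in [0, 63]. */
int adiw_in_range(unsigned k) { return k <= 63; }

/* Model of: subi r28, lo8(-k) ; sbci r29, hi8(-k)   (adds k to the pair r29:r28). */
uint16_t add_via_subi_sbci(uint16_t y, uint16_t k) {
    uint16_t neg = (uint16_t)(0u - k);          /* -k modulo 2^16 */
    uint8_t r28 = (uint8_t)y, r29 = (uint8_t)(y >> 8);
    uint8_t klo = (uint8_t)neg, khi = (uint8_t)(neg >> 8);
    unsigned borrow = r28 < klo;                /* carry flag after subi */
    r28 = (uint8_t)(r28 - klo);                 /* subi r28, lo8(-k) */
    r29 = (uint8_t)(r29 - khi - borrow);        /* sbci r29, hi8(-k) */
    return (uint16_t)((r29 << 8) | r28);
}
```

Splitting the constant into several adiw steps of at most 63 each would also work, at the cost of more instructions.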
Implementation problems:
Solution: Definitions such as “#define DFDZero #0xff00” (on MSP) and “#define CONST_F0 0xf0” (on AVR) are used.
AVR: Error: register r24, r26, r28 or r30 required
Solution: In deckeyxor_flash in avr_basic_asm_macros.h, change sbiw z, 16 to sbiw r30, 16. See the AVR-GCC Inline Assembler Cookbook.
AVR: decrypt.c(.text.Decrypt+0xae): relocation truncated to fit: R_AVR_7_PCREL against ‘no symbol’
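mov r4, #0xdbac65e0 gives the error message "invalid constant (dbac65e0) after fixup", because in ARM mode an immediate operand must be an 8-bit value rotated right by an even number of bits.
The following instructions build the same constant one byte at a time: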
mov r4, #0xdb
lsl r4, #8
eor r4, r4, #0xac
lsl r4, #8
eor r4, r4, #0x65
lsl r4, #8
eor r4, r4, #0xe0
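The sequence above can be checked on a host machine, together with the reason the single mov fails. This is a sketch with made-up function names; it assumes the ARM-mode rule that a data-processing immediate is an 8-bit value rotated right by an even amount.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is reduced to 0..31). */
uint32_t rotl32(uint32_t v, unsigned n) {
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* v is encodable if undoing some even right-rotation leaves a value that fits in 8 bits. */
int arm_imm_encodable(uint32_t v) {
    for (unsigned rot = 0; rot < 32; rot += 2)
        if (rotl32(v, rot) <= 0xffu)
            return 1;
    return 0;
}

/* Model of the mov/lsl/eor sequence: each immediate is a single byte, so it is encodable. */
uint32_t build_constant(void) {
    uint32_t r4 = 0xdb;   /* mov r4, #0xdb     */
    r4 <<= 8;             /* lsl r4, #8        */
    r4 ^= 0xac;           /* eor r4, r4, #0xac */
    r4 <<= 8;             /* lsl r4, #8        */
    r4 ^= 0x65;           /* eor r4, r4, #0x65 */
    r4 <<= 8;             /* lsl r4, #8        */
    r4 ^= 0xe0;           /* eor r4, r4, #0xe0 */
    return r4;
}
```

Because the low byte is zero after each shift, eor here has the same effect as orr or add would.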
stmia(stmib), stmdb(stmda), ldmia(ldmib), ldmdb(ldmda)