Commit 2110dc5

docs: Add reference for Thumb2 inline assembler.
Thanks to Peter Hinch for contributing this.
1 parent aef3846 commit 2110dc5

14 files changed

Lines changed: 800 additions & 0 deletions

docs/pyboard/tutorial/assembler.rst

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
1+
.. _pyboard_tutorial_assembler:
2+
13
Inline assembler
24
================
35

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
1+
Arithmetic instructions
2+
=======================
3+
4+
Document conventions
5+
--------------------
6+
7+
Notation: ``Rd, Rm, Rn`` denote ARM registers R0-R7; ``Rdn`` denotes a register used as both destination and first operand. ``immN`` denotes an immediate
8+
value having a width of N bits e.g. ``imm8``, ``imm3``. ``carry`` denotes
9+
the carry condition flag, ``not(carry)`` denotes its complement. In the case of instructions
10+
with more than one register argument, it is permissible for some to be identical. For example
11+
the following will add the contents of R0 to itself, placing the result in R0:
12+
13+
* add(r0, r0, r0)
14+
15+
Arithmetic instructions affect the condition flags except where stated.
16+
17+
Addition
18+
--------
19+
20+
* add(Rdn, imm8) ``Rdn = Rdn + imm8``
21+
* add(Rd, Rn, imm3) ``Rd = Rn + imm3``
22+
* add(Rd, Rn, Rm) ``Rd = Rn + Rm``
23+
* adc(Rd, Rn) ``Rd = Rd + Rn + carry``
24+
25+
Subtraction
26+
-----------
27+
28+
* sub(Rdn, imm8) ``Rdn = Rdn - imm8``
29+
* sub(Rd, Rn, imm3) ``Rd = Rn - imm3``
30+
* sub(Rd, Rn, Rm) ``Rd = Rn - Rm``
31+
* sbc(Rd, Rn) ``Rd = Rd - Rn - not(carry)``
32+
33+
Negation
34+
--------
35+
36+
* neg(Rd, Rn) ``Rd = -Rn``
37+
38+
Multiplication and division
39+
---------------------------
40+
41+
* mul(Rd, Rn) ``Rd = Rd * Rn``
42+
43+
This produces a 32 bit result with overflow lost. The result may be treated as
44+
signed or unsigned according to the definition of the operands.
45+
46+
* sdiv(Rd, Rn, Rm) ``Rd = Rn / Rm``
47+
* udiv(Rd, Rn, Rm) ``Rd = Rn / Rm``
48+
49+
These functions perform signed and unsigned division respectively. Condition flags
50+
are not affected.
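
The behaviour of ``mul``, ``sdiv`` and ``udiv`` described above can be modelled in ordinary Python. This is only an illustrative sketch, not the assembler itself: the helper names (``to_signed``, ``mul32``, ``udiv32``, ``sdiv32``) are hypothetical, and it assumes that signed division rounds toward zero, as the ARM ``SDIV`` instruction does. Division by zero is not modelled.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def mul32(a, b):
    """Model mul: keep the low 32 bits of the product, overflow is lost."""
    return (a * b) & MASK


def udiv32(n, m):
    """Model udiv: both operands are treated as unsigned."""
    return (n & MASK) // (m & MASK)


def sdiv32(n, m):
    """Model sdiv: operands are signed, quotient rounds toward zero."""
    a, b = to_signed(n), to_signed(m)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK


# The same bit pattern divides differently depending on the instruction.
pattern = 0xFFFFFFF9             # 4294967289 unsigned, -7 signed
print(hex(udiv32(pattern, 2)))   # 0x7ffffffc
print(hex(sdiv32(pattern, 2)))   # 0xfffffffd, i.e. -3
print(hex(mul32(0x10000, 0x10000)))  # 0x0, the overflow is lost
```

The multiplication result is the same bit pattern whether the operands are viewed as signed or unsigned, which is why a single ``mul`` instruction suffices while division needs two.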
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
1+
Comparison instructions
2+
=======================
3+
4+
These perform an arithmetic or logical instruction on two arguments, discarding the result
5+
but setting the condition flags. Typically these are used to test data values without changing
6+
them prior to executing a conditional branch.
7+
8+
Document conventions
9+
--------------------
10+
11+
Notation: ``Rd, Rm, Rn`` denote ARM registers R0-R7. ``imm8`` denotes an immediate
12+
value having a width of 8 bits.
13+
14+
The Application Program Status Register (APSR)
15+
----------------------------------------------
16+
17+
This contains four bits which are tested by the conditional branch instructions. Typically a
18+
conditional branch will test multiple bits, for example ``bge(LABEL)``. The meaning of
19+
condition codes can depend on whether the operands of an arithmetic instruction are viewed as
20+
signed or unsigned integers. Thus ``bhi(LABEL)`` assumes unsigned numbers were processed while
21+
``bgt(LABEL)`` assumes signed operands.
22+
23+
APSR Bits
24+
---------
25+
26+
* Z (zero)
27+
28+
This is set if the result of an operation is zero or the operands of a comparison are equal.
29+
30+
* N (negative)
31+
32+
Set if the result is negative.
33+
34+
* C (carry)
35+
36+
An addition sets the carry flag when the result overflows out of the MSB, for example adding
37+
0x80000000 and 0x80000000. By the nature of two's complement arithmetic this behaviour is reversed
38+
on subtraction, with a borrow indicated by the carry bit being clear. Thus 0x10 - 0x01 is executed
39+
as 0x10 + 0xffffffff which will set the carry bit.
40+
41+
* V (overflow)
42+
43+
The overflow flag is set if the result, viewed as a two's complement number, has the "wrong" sign
44+
in relation to the operands. For example adding 1 to 0x7fffffff will set the overflow bit because
45+
the result (0x80000000), viewed as a two's complement integer, is negative. Note that in this instance
46+
the carry bit is not set.
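
The flag rules above can be checked with a small Python model. This is a sketch for illustration only: ``add_flags`` and ``sub_flags`` are hypothetical helpers, and subtraction is modelled as ``a + not(b) + 1``, which is equivalent to adding the two's complement of ``b`` as in the 0x10 - 0x01 example.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def add_flags(a, b, carry_in=0):
    """Model a 32-bit addition; return (result, flags)."""
    a &= MASK
    b &= MASK
    total = a + b + carry_in
    result = total & MASK
    flags = {
        "N": result >> 31,          # sign bit of the result
        "Z": int(result == 0),      # result is zero
        "C": int(total > MASK),     # carry out of the MSB
        # Overflow: operands share a sign that the result does not have.
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
    }
    return result, flags


def sub_flags(a, b):
    """Model a - b as a + not(b) + 1, so C clear means a borrow occurred."""
    return add_flags(a, ~b & MASK, 1)


print(add_flags(0x80000000, 0x80000000))  # carry set, result zero
print(sub_flags(0x10, 0x01))              # carry set: no borrow
print(add_flags(0x7FFFFFFF, 1))           # overflow set, carry clear
```

Running the three examples from the text through the model reproduces the behaviour described: the first sets C, the second sets C because no borrow is needed, and the third sets V and N while leaving C clear.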
47+
48+
Comparison instructions
49+
-----------------------
50+
51+
These set the APSR (Application Program Status Register) N (negative), Z (zero), C (carry) and V
52+
(overflow) flags.
53+
54+
* cmp(Rn, imm8) ``Rn - imm8``
55+
* cmp(Rn, Rm) ``Rn - Rm``
56+
* cmn(Rn, Rm) ``Rn + Rm``
57+
* tst(Rn, Rm) ``Rn & Rm``
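
To illustrate how a conditional branch reads the flags left by a comparison, the following Python sketch encodes a few of the standard ARM condition codes as tests on N, Z, C and V. The ``CONDITIONS`` table and ``condition_holds`` helper are hypothetical names used only for this example, and the flag values are written out by hand rather than computed.

```python
# Each condition is a test on the APSR flags (standard ARM definitions).
CONDITIONS = {
    "eq": lambda f: f["Z"] == 1,                       # equal
    "ne": lambda f: f["Z"] == 0,                       # not equal
    "hi": lambda f: f["C"] == 1 and f["Z"] == 0,       # unsigned higher
    "ls": lambda f: f["C"] == 0 or f["Z"] == 1,        # unsigned lower or same
    "ge": lambda f: f["N"] == f["V"],                  # signed >=
    "lt": lambda f: f["N"] != f["V"],                  # signed <
    "gt": lambda f: f["Z"] == 0 and f["N"] == f["V"],  # signed >
    "le": lambda f: f["Z"] == 1 or f["N"] != f["V"],   # signed <=
}


def condition_holds(name, flags):
    """Return True if a branch on this condition would be taken."""
    return CONDITIONS[name](flags)


# Flags left by cmp(r0, r1) with r0 = 0xffffffff and r1 = 1:
# the subtraction yields 0xfffffffe with no borrow and no signed overflow.
flags = {"N": 1, "Z": 0, "C": 1, "V": 0}

print(condition_holds("hi", flags))  # True: 4294967295 > 1 as unsigned
print(condition_holds("gt", flags))  # False: -1 > 1 does not hold as signed
```

The same flags therefore give opposite answers for ``bhi`` and ``bgt``, which is why the choice of branch instruction must match the intended interpretation of the operands.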
58+
59+
Conditional execution
60+
---------------------
61+
62+
The ``it`` and ``ite`` instructions provide a means of conditionally executing from one to four subsequent
63+
instructions without the need for a label.
64+
65+
* it(<condition>) If then
66+
67+
Execute the next instruction if <condition> is true:
68+
69+
::
70+
71+
cmp(r0, r1)
72+
it(eq)
73+
mov(r0, 100) # runs if r0 == r1
74+
# execution continues here
75+
76+
* ite(<condition>) If then else
77+
78+
If <condition> is true, execute the next instruction, otherwise execute the
79+
subsequent one. Thus:
80+
81+
::
82+
83+
cmp(r0, r1)
84+
ite(eq)
85+
mov(r0, 100) # runs if r0 == r1
86+
mov(r0, 200) # runs if r0 != r1
87+
# execution continues here
88+
89+
This may be extended to control the execution of up to four subsequent instructions: it[x[y[z]]]
90+
where x,y,z=t/e; e.g. itt, itee, itete, ittte, itttt, iteee, etc.
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
Assembler Directives
2+
====================
3+
4+
Labels
5+
------
6+
7+
* label(INNER1)
8+
9+
This defines a label for use in a branch instruction. Thus elsewhere in the code a ``b(INNER1)``
10+
will cause execution to continue with the instruction after the label directive.
11+
12+
Defining inline data
13+
--------------------
14+
15+
The following assembler directives facilitate embedding data in an assembler code block.
16+
17+
* data(size, d0, d1 .. dn)
18+
19+
The data directive creates an array of data values in memory. The first argument specifies the
20+
size in bytes of the subsequent arguments. Hence the first statement below will cause the
21+
assembler to put three bytes (with values 2, 3 and 4) into consecutive memory locations
22+
while the second will cause it to emit two four byte words.
23+
24+
::
25+
26+
data(1, 2, 3, 4)
27+
data(4, 2, 100000)
28+
29+
Data values longer than a single byte are stored in memory in little-endian format.
30+
31+
* align(nBytes)
32+
33+
Align the following instruction to an address that is a multiple of nBytes. ARM Thumb-2 instructions must be two
34+
byte aligned, hence it's advisable to issue ``align(2)`` after ``data`` directives and
35+
prior to any subsequent code. This ensures that the code will run irrespective of the
36+
size of the data array.
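
The memory layout produced by the two ``data`` statements above, and the reason ``align(2)`` is advisable after an odd number of bytes, can be sketched in Python. The helpers ``emit_data`` and ``padding_for_align`` are hypothetical and exist only for this illustration; they assume non-negative values that fit in the stated size and say nothing about what the assembler uses as padding.

```python
def emit_data(size, *values):
    """Return the bytes a data(size, ...) directive would lay out, little-endian."""
    out = bytearray()
    for value in values:
        out += value.to_bytes(size, "little")
    return bytes(out)


def padding_for_align(length, n_bytes):
    """Number of bytes needed to round length up to a multiple of n_bytes."""
    return (-length) % n_bytes


three_bytes = emit_data(1, 2, 3, 4)
two_words = emit_data(4, 2, 100000)

print(three_bytes.hex())  # 020304
print(two_words.hex())    # 02000000a0860100
print(padding_for_align(len(three_bytes), 2))  # 1
```

Three single bytes leave the next address odd, so one byte of padding is needed before code can follow, whereas the two four-byte words need none.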
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
1+
Floating Point instructions
2+
==============================
3+
4+
These instructions support the use of the ARM floating point coprocessor
5+
(on platforms such as the Pyboard which are equipped with one). The FPU
6+
has 32 registers known as ``s0-s31`` each of which can hold a single
7+
precision float. Data can be passed between the FPU registers and the
8+
ARM core registers with the ``vmov`` instruction.
9+
10+
Note that MicroPython doesn't support passing floats to
11+
assembler functions, nor can you put a float into ``r0`` and expect a
12+
reasonable result. There are two ways to overcome this. The first is to
13+
use arrays, and the second is to pass and/or return integers and convert
14+
to and from floats in code.
15+
16+
Document conventions
17+
--------------------
18+
19+
Notation: ``Sd, Sm, Sn`` denote FPU registers, ``Rd, Rm, Rn`` denote ARM core
20+
registers. The latter can be any ARM core register although registers
21+
``R13-R15`` are unlikely to be appropriate in this context.
22+
23+
Arithmetic
24+
----------
25+
26+
* vadd(Sd, Sn, Sm) ``Sd = Sn + Sm``
27+
* vsub(Sd, Sn, Sm) ``Sd = Sn - Sm``
28+
* vneg(Sd, Sm) ``Sd = -Sm``
29+
* vmul(Sd, Sn, Sm) ``Sd = Sn * Sm``
30+
* vdiv(Sd, Sn, Sm) ``Sd = Sn / Sm``
31+
* vsqrt(Sd, Sm) ``Sd = sqrt(Sm)``
32+
33+
Registers may be identical: ``vmul(S0, S0, S0)`` will execute ``S0 = S0*S0``.
34+
35+
Move between ARM core and FPU registers
36+
---------------------------------------
37+
38+
* vmov(Sd, Rm) ``Sd = Rm``
39+
* vmov(Rd, Sm) ``Rd = Sm``
40+
41+
The FPU has a register known as FPSCR, similar to the ARM core's APSR, which stores condition
42+
codes plus other data. The following instructions provide access to this.
43+
44+
* vmrs(APSR\_nzcv, FPSCR)
45+
46+
Move the floating-point N, Z, C, and V flags to the APSR N, Z, C, and V flags.
47+
48+
This is done after an instruction such as an FPU
49+
comparison to enable the condition codes to be tested by the assembler
50+
code. The following is a more general form of the instruction.
51+
52+
* vmrs(Rd, FPSCR) ``Rd = FPSCR``
53+
54+
Move between FPU register and memory
55+
------------------------------------
56+
57+
* vldr(Sd, [Rn, offset]) ``Sd = [Rn + offset]``
58+
* vstr(Sd, [Rn, offset]) ``[Rn + offset] = Sd``
59+
60+
Where ``[Rn + offset]`` denotes the memory address obtained by adding Rn to the offset. This
61+
is specified in bytes. Since each float value occupies a 32 bit word, when accessing arrays of
62+
floats the offset must always be a multiple of four bytes.
63+
64+
Data Comparison
65+
---------------
66+
67+
* vcmp(Sd, Sm)
68+
69+
Compare the values in Sd and Sm and set the FPU N, Z,
70+
C, and V flags. This would normally be followed by ``vmrs(APSR_nzcv, FPSCR)``
71+
to enable the results to be tested.
72+
73+
Convert between integer and float
74+
---------------------------------
75+
76+
* vcvt\_f32\_s32(Sd, Sm) ``Sd = float(Sm)``
77+
* vcvt\_s32\_f32(Sd, Sm) ``Sd = int(Sm)``
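
The difference between moving a value with ``vmov`` and converting it with ``vcvt`` is easy to miss, so here is a Python sketch of it. The helper names are hypothetical, and the sketch assumes that ``vmov`` copies the 32-bit pattern unchanged and that the float-to-integer conversion rounds toward zero, matching Python's ``int()``. Out-of-range values are not modelled.

```python
import struct


def float_to_bits(x):
    """Raw 32-bit pattern of a single precision float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits):
    """Interpret a 32-bit pattern as a single precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def model_vcvt_f32_s32(i):
    """Model Sd = float(Sm): convert an integer value to single precision."""
    return bits_to_float(float_to_bits(float(i)))


def model_vcvt_s32_f32(x):
    """Model Sd = int(Sm): assumed to round toward zero."""
    return int(x)


# An integer 3 moved into an FPU register without conversion is not 3.0:
print(bits_to_float(3))                        # a tiny denormal, not 3.0
print(hex(float_to_bits(model_vcvt_f32_s32(3))))  # 0x40400000, i.e. 3.0
print(model_vcvt_s32_f32(-2.7))                # -2
```

This is why an integer argument passed to an assembler function needs both a ``vmov`` into an FPU register and a ``vcvt_f32_s32`` before it can be used in floating point arithmetic.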
