[STM32] - part 4 - CPU goes brrrrr
This is the final part of a series of articles. I’d suggest going through part 1, part 2 and part 3 first.
I’ve been working on a project (which you can find here), using a Bluepill board with an STM32F103C8T6 microcontroller. It’s a USB keyboard project. It uses the libopencm3 project and I’d consider this a pretty minimalistic, bare-metal project. It’s great for analyzing what happens when an ARM Cortex-M processor boots.
Let’s analyze the linker script and the final binary of that project to understand more about the boot up of an STM32.
A quick primer on the compilation process:
- compiler goes through all of the source files, generates target architecture instructions and saves those in object files (using the ELF format)
- object files hold symbols and each symbol has been assigned a memory section
- linker grabs all those object files and, according to the linker script’s rules, merges the sections and places them in the proper memory segments
- linker also checks if all the necessary symbols are present (resolving symbols)
You’re still with me?
Don’t worry, we’ll take a look at examples of both the compilation and the linking phase.
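To make "each symbol has been assigned a memory section" a bit more concrete, here is a small sketch of a C source file and the sections a GCC-style toolchain would typically pick for each symbol. All the names are made up for illustration and the exact placement depends on the compiler and its flags, so treat the comments as the usual case, not a guarantee.

```c
#include <stdint.h>

/* Typically lands in .data: it has a non-zero initial value, so the value
 * has to be stored in FLASH and copied to RAM at boot. */
int initialized_counter = 4;

/* Typically lands in .bss (or COMMON with older compilers): no initializer,
 * so C guarantees it starts out as 0. */
int zeroed_counter;

/* Typically lands in .rodata: read-only, so it can stay in FLASH. */
const uint8_t lookup_table[4] = {1, 2, 4, 8};

/* The machine code of a function typically lands in .text. */
int sum_everything(void)
{
    int sum = initialized_counter + zeroed_counter;
    for (unsigned i = 0; i < sizeof lookup_table; i++) {
        sum += lookup_table[i];
    }
    return sum;
}
```

Running arm-none-eabi-objdump -h on the resulting object file is one way to check where the symbols actually ended up.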
Compilation of a source file would look like so (make syntax):
CFLAGS := -mcpu=cortex-m3 -mthumb
CFLAGS += -Wall -Wextra -Werror -Wno-char-subscripts -Wno-unused-but-set-variable
CFLAGS += -DSTM32F1 -DDISCOVERY_STLINK $(INCLUDE_PATHS)
CFLAGS += -std=gnu11
CFLAGS += -O3 -g3
build:
	arm-none-eabi-gcc $(CFLAGS) -c main.c -o main.o
The compilation step is done per source file. As long as you’re not using the extern symbols, the only
dependency here is the header files you include in a source file. That’s why we need to add the
include paths, with $(INCLUDE_PATHS).
Let’s assume all the source files have been compiled and make stored the list of the object files
in the OBJ variable.
I guess, for this simple example, adding:
OBJ = main.o
in the Makefile, would do the job, right?
Exactly. Now for the linking stage:
LDFLAGS :=
LDFLAGS := --specs=nano.specs
#LDFLAGS += --specs=nosys.specs
# libraries
LDFLAGS += -lopencm3_stm32f1
LDFLAGS += -L../libopencm3/lib
# Compiler flags on the linking stage.
LDFLAGS += -mthumb -mcpu=cortex-m3
LDFLAGS += -nostartfiles
LDFLAGS += -lc # use libc
# Linker flags
# Stack grows downwards from the end of RAM (0x2000_0000 + 0x5000).
# The RAM size = 20480 B. In hex that's 0x5000.
# This symbol is also defined in the cortex-m-generic.ld.
LDFLAGS += -Wl,--defsym,_stack=0x20005000
LDFLAGS += -Wl,-T,memory.ld
LDFLAGS += -Wl,-Map=mapfile
LDFLAGS += -Wl,-gc-sections
LDFLAGS += -Wl,--print-memory-usage
LDFLAGS += -O3
link:
	arm-none-eabi-gcc -o app.elf $(OBJ) $(LDFLAGS)
You’ve probably noticed that we’re calling the same program, arm-none-eabi-gcc, for linking as we
did for compilation.
That’s because gcc acts as a driver and calls the linker by itself. Flags preceded with -Wl, are
passed through to the linker. This time the input is the list of all of the object files - the
linker’s job is to put all those together into a single binary.
We’re outputting an ELF file here which isn’t something you can use to flash your microcontroller.
It’s just a convenient format for storing instructions, data and debug information. You can very
easily convert it into either HEX or BIN format (using objcopy) - both of which you can flash
your microcontroller with.
I won’t explain each of the flags used here. The libopencm3 is linked as a static library (archive
of ELF object files - nothing special really). The interesting parts are -T,memory.ld and
--defsym,_stack=0x20005000. The first one specifies the linker script and the second one creates
a symbol. Most of the symbols come from the actual source files. Something like where to put the
stack is very much target specific, since different targets have different memory layouts, so it
makes sense to add it here. At the linking stage we’re already very focused on which processor we
are targeting.
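The 0x20005000 passed to --defsym isn’t a magic number, it’s just the start of the RAM plus its size. A minimal sketch of that arithmetic, assuming the 20K of SRAM starting at 0x20000000 that this article uses (the macro and function names are made up for illustration):

```c
#include <stdint.h>

/* Assumed memory layout, taken from the comments in the Makefile. */
#define RAM_ORIGIN 0x20000000u
#define RAM_LENGTH (20u * 1024u) /* 20480 B = 0x5000 */

/* The stack grows downwards, so its initial value is the end of the RAM. */
uint32_t stack_top(uint32_t ram_origin, uint32_t ram_length)
{
    return ram_origin + ram_length;
}
```

On a chip with a different amount of RAM only the two constants would change, which is why this value lives in the build configuration and not in the application code.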
My linker script is divided into two files, where the first one includes the second one. The first one defines the available memory. This’ll differ between different microcontrollers.
/* Define memory regions. */
MEMORY
{
/* FLASH memory 0x0800 0000 : 0801 FFFF (128K) */
rom (rx) : ORIGIN = 0x08000000, LENGTH = 128K
/* SRAM memory (20K = 0x5000) */
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
}
INCLUDE cortex-m-generic.ld
The second one (cortex-m-generic.ld) instructs the linker which sections (collections of symbols)
go where:
/*
* This file gets included in the memory.ld.
*/
/*
* Force symbol to be entered in the output file as an undefined symbol.
* Hope linker will find it in one of the compilation units.
*/
EXTERN(vector_table)
/* Define the entry point of the output file. */
ENTRY(reset_handler)
/*
* The SECTIONS command tells the linker how to map input sections of the object
* files, which are also ELF format, into output sections of the final ELF file
* and how to place the output sections in memory.
*/
SECTIONS
{
/*
* The sections of an object file can be printed with: arm-none-eabi-objdump -h
*
* . is a location counter
* * is a wildcard
* *(.text) means all '.text' input sections in all input files
*/
.text : {
*(.vectors) /* All .vectors sections from all files */
*(.text*) /* All the .textANYTHING sections from all files */
. = ALIGN(4);
*(.rodata*) /* All the .rodataANYTHING sections from all files */
. = ALIGN(4);
} > rom /* This section goes into ROM memory */
/*
* C++ Static constructors/destructors, also used for __attribute__
* ((constructor)) and the likes
*/
.preinit_array : {
. = ALIGN(4);
__preinit_array_start = .;
KEEP (*(.preinit_array)) /* Keep the symbols even if they are not referenced */
__preinit_array_end = .;
} > rom
.init_array : {
. = ALIGN(4);
__init_array_start = .;
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array))
__init_array_end = .;
} > rom
.fini_array : {
. = ALIGN(4);
__fini_array_start = .;
KEEP (*(.fini_array))
KEEP (*(SORT(.fini_array.*)))
__fini_array_end = .;
} > rom
/*
* Another section used by C++ stuff, appears when using newlib with
* 64bit (long long) printf support
*/
.ARM.extab : {
*(.ARM.extab*)
} > rom
/*
* Index table for C++ exceptions unwinding.
*/
.ARM.exidx : {
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
} > rom
. = ALIGN(4);
_etext = .;
/*
* .data - Initialized global, static objects.
* i.e.: static int a = 4;
*/
.data : {
_data = .;
*(.data*) /* Read-write initialized data */
. = ALIGN(4);
_edata = .;
} > ram AT > rom
/*
* So the '> ram AT > rom'...
* The VMA (Virtual Memory Address) for the .data section
* is in the ram and the LMA (Load Memory Address) is in
* the rom. This is the data that will be copied from rom
* to ram as one of the first steps of the boot process.
*/
/* Create a symbol which is referenced in the lib/cm3/vector.c */
_data_loadaddr = LOADADDR(.data);
/*
* .bss - Uninitialized global, static objects.
* i.e.: static int a;
*/
.bss : {
*(.bss*) /* Read-write zero initialized data */
*(COMMON)
. = ALIGN(4);
_ebss = .;
} > ram
/*
* The .eh_frame section appears to be used for C++ exception
* handling. You may need to fix this if you're using C++.
*/
/DISCARD/ : { *(.eh_frame) }
. = ALIGN(4);
end = .;
}
/*
* Define a symbol only if it is referenced and is not defined by any
* object included in the link - basically a fallback definition.
*/
PROVIDE(_stack = ORIGIN(ram) + LENGTH(ram));
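The script above uses . = ALIGN(4); in several places. It rounds the location counter up to the next multiple of 4, so that whatever comes next starts on a word boundary. Here is a rough C equivalent of that rounding, assuming the alignment is a power of two (align_up is a made-up helper name, not something the linker exposes):

```c
#include <stdint.h>

/* Round address up to the next multiple of alignment.
 * Only valid when alignment is a power of two (1, 2, 4, 8, ...). */
uint32_t align_up(uint32_t address, uint32_t alignment)
{
    return (address + alignment - 1u) & ~(alignment - 1u);
}
```

An address that is already aligned stays where it is, anything else gets bumped to the next boundary.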
Linker script’s syntax isn’t that easy to read. I’ve tried to comment this linker script as much as I could. The main pattern is:
.final_section : {
made out of these things;
} > goes into this memory space
Whenever you see names without a dot in front of them, with an assignment operator =, it’s most
probably a new symbol, e.g.: just_a_name = .; or just_a_name = LOADADDR(.data);. That means you
can do this in your code and it will point to a valid part of the memory:
extern uint8_t just_a_name[];
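A linker symbol has an address but no storage of its own, so in C you only ever use where it is, never what it "contains". A minimal sketch of measuring a section with two such symbols follows. To keep it runnable on a PC it uses a made-up stand-in array; on the target you would pass the linker-provided symbols as shown in the comment.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * On the target the linker provides the symbols, so you would write:
 *
 *     extern uint8_t _data[], _edata[];
 *     size_t data_size = section_size(_data, _edata);
 */
size_t section_size(const uint8_t *start, const uint8_t *end)
{
    return (size_t)(end - start);
}

/* Hypothetical stand-in for a 0x28 byte section, just for this sketch. */
static uint8_t fake_section[0x28];

size_t fake_section_size(void)
{
    return section_size(fake_section, fake_section + sizeof fake_section);
}
```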
OK, let’s see if all of this makes sense. Let’s analyze the binary of the project I mentioned at the beginning of this article.
readelf --segments src/pawusb.elf
Here is the output of that command, which prints the memory segments of the ELF file:
Elf file type is EXEC (Executable file)
Entry point 0x8002b15
There are 3 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x014070 0x08004070 0x08004070 0x00008 0x00008 R 0x4
LOAD 0x010000 0x08000000 0x08000000 0x04078 0x04078 R E 0x10000
LOAD 0x020000 0x20000000 0x08004078 0x00028 0x00264 RW 0x10000
Section to Segment mapping:
Segment Sections...
00 .ARM.exidx
01 .text .ARM.exidx
02 .data .bss
Notice the VirtAddr and the PhysAddr columns. Those are the VMA (Virtual Memory Address) and
the LMA (Load Memory Address). For the first two segments the VMA and LMA are the
same - they start with 0x08 which, if you read the second part, you already know points to the
FLASH memory. If you look at the Section to Segment mapping output you’ll see that those two
segments hold the .ARM.exidx and .text sections. The first one is a C++ specific index table for
exception unwinding and the second one holds the actual instructions.
Since the microcontroller executes the instructions from the FLASH memory, the address starting
with 0x08 makes sense.
The last segment is different. The VirtAddr points to the 0x2 address space. Sections that it
includes? .data (variables with a non-zero initial value) and .bss (variables that start out as
zero). Those two sections hold the data. The 0x2 address space points to SRAM.
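If you want to double check which memory an address from the readelf output belongs to, you can compare it against the regions from memory.ld. A small sketch, assuming the 128K FLASH and 20K SRAM regions defined earlier (the enum and function names are made up for illustration):

```c
#include <stdint.h>

/* Region boundaries as declared in memory.ld. */
#define FLASH_ORIGIN 0x08000000u
#define FLASH_LENGTH (128u * 1024u)
#define SRAM_ORIGIN  0x20000000u
#define SRAM_LENGTH  (20u * 1024u)

typedef enum { REGION_FLASH, REGION_SRAM, REGION_OTHER } region_t;

/* Tell which of the two regions, if any, an address falls into. */
region_t classify_address(uint32_t address)
{
    if (address >= FLASH_ORIGIN && address < FLASH_ORIGIN + FLASH_LENGTH) {
        return REGION_FLASH;
    }
    if (address >= SRAM_ORIGIN && address < SRAM_ORIGIN + SRAM_LENGTH) {
        return REGION_SRAM;
    }
    return REGION_OTHER;
}
```

Feeding it the VirtAddr and PhysAddr values of the third segment gives SRAM and FLASH respectively, which is exactly the split described here.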
The process of uploading the binary to the processor only writes into the FLASH memory.
Our MCU has to copy the data from PhysAddr into VirtAddr whenever it boots.
The actual binary has to have instructions for that.
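The numbers in the program headers already tell how much work that boot code has to do. FileSiz is what is really stored in the file (and in FLASH), MemSiz is what the segment occupies in RAM, and the difference is the part that only needs to be zeroed. A quick sketch of that arithmetic, with the values copied from the readelf output (the struct and helper names are made up for illustration):

```c
#include <stdint.h>

/* The four columns of a program header that matter here. */
typedef struct {
    uint32_t vaddr;  /* VirtAddr: where the data lives at run time */
    uint32_t paddr;  /* PhysAddr: where the data is stored (loaded) */
    uint32_t filesz; /* FileSiz: bytes present in the file */
    uint32_t memsz;  /* MemSiz: bytes occupied in memory */
} segment_t;

/* Values copied from the two LOAD program headers. */
const segment_t text_segment = {0x08000000u, 0x08000000u, 0x04078u, 0x04078u};
const segment_t data_segment = {0x20000000u, 0x08004078u, 0x00028u, 0x00264u};

/* Bytes that have to be copied from FLASH to RAM (.data). */
uint32_t bytes_to_copy(const segment_t *s) { return s->filesz; }

/* Bytes that only have to be zeroed in RAM (.bss). */
uint32_t bytes_to_zero(const segment_t *s) { return s->memsz - s->filesz; }

/* First address after the stored part of a segment. */
uint32_t load_end(const segment_t *s) { return s->paddr + s->filesz; }
```

The 0x28 bytes to copy and the 0x23c bytes to zero match the sizes of .data and .bss that readelf --sections reports for this binary, and the stored copy of .data starts right where the first LOAD segment ends.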
The instructions that copy a fixed amount of data from a fixed address in the FLASH memory to a
fixed address in the RAM have been implemented for us in libopencm3/lib/cm3/vector.c.
Part of that file is:
void __attribute__ ((weak)) reset_handler(void)
{
volatile unsigned *src, *dest;
funcp_t *fp;
for (src = &_data_loadaddr, dest = &_data;
dest < &_edata;
src++, dest++) {
*dest = *src;
}
while (dest < &_ebss) {
*dest++ = 0;
}
// More stuff...
...
}
Notice that the _data_loadaddr, _data, _edata, and _ebss symbols aren’t defined in this
file. All of them pop up in the linker script we discussed earlier. Those are the fixed addresses
I just mentioned. _data_loadaddr points to the FLASH memory, while _data, _edata and _ebss
point to SRAM.
The dest pointer isn’t reset between the for and the while loop. That’s because the .bss section
is placed right after the .data section in RAM, so the second loop just keeps going and fills
.bss with zeros.
Nothing happens automagically!
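You can play with the same logic on a PC. The sketch below is not the libopencm3 code, just a stand-in that takes the four addresses as parameters and runs on two made-up arrays pretending to be FLASH and RAM. The RAM is pre-filled with garbage to mimic its undefined content after power-up.

```c
#include <stdint.h>

#define DATA_WORDS 2
#define BSS_WORDS  3

/* Stand-in for FLASH: the initial values of .data at its load address. */
static const uint32_t fake_flash_data[DATA_WORDS] = {0xCAFEF00Du, 4u};

/* Stand-in for RAM: .data immediately followed by .bss, full of garbage. */
static uint32_t fake_ram[DATA_WORDS + BSS_WORDS] = {
    0xAAAAAAAAu, 0xAAAAAAAAu, 0xAAAAAAAAu, 0xAAAAAAAAu, 0xAAAAAAAAu
};

/* Same two loops as in reset_handler, with the symbols passed in. */
void fake_reset_handler(const uint32_t *data_loadaddr, uint32_t *data,
                        uint32_t *edata, uint32_t *ebss)
{
    const uint32_t *src = data_loadaddr;
    uint32_t *dest = data;

    while (dest < edata) { /* copy .data from "FLASH" to "RAM" */
        *dest++ = *src++;
    }
    while (dest < ebss) {  /* dest carries on into .bss, which gets zeroed */
        *dest++ = 0;
    }
}

/* Run the fake boot and return the "RAM" so it can be inspected. */
const uint32_t *run_fake_boot(void)
{
    fake_reset_handler(fake_flash_data, fake_ram,
                       fake_ram + DATA_WORDS,
                       fake_ram + DATA_WORDS + BSS_WORDS);
    return fake_ram;
}
```

After run_fake_boot() the first two words of the fake RAM hold the values from the fake FLASH and the remaining three are zero, which is the state main() expects to start in.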
If you want to know more about the ELF’s sections you can run:
readelf --sections src/pawusb.elf
That will print all the sections. You’ll see some extra sections like those starting with .debug.
Those hold the debug information used by the debugger.
There are 22 section headers, starting at offset 0x714a8:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 08000000 010000 004070 00 AX 0 0 8
[ 2] .preinit_array PREINIT_ARRAY 08004070 020028 000000 04 WA 0 0 1
[ 3] .init_array INIT_ARRAY 08004070 020028 000000 04 WA 0 0 1
[ 4] .fini_array FINI_ARRAY 08004070 020028 000000 04 WA 0 0 1
[ 5] .ARM.exidx ARM_EXIDX 08004070 014070 000008 00 AL 1 0 4
[ 6] .data PROGBITS 20000000 020000 000028 00 WA 0 0 4
[ 7] .bss NOBITS 20000028 020028 00023c 00 WA 0 0 4
[ 8] .debug_info PROGBITS 00000000 020028 00fae5 00 0 0 1
[ 9] .debug_abbrev PROGBITS 00000000 02fb0d 002b85 00 0 0 1
[10] .debug_loc PROGBITS 00000000 032692 00751f 00 0 0 1
[11] .debug_aranges PROGBITS 00000000 039bb1 000848 00 0 0 1
[12] .debug_ranges PROGBITS 00000000 03a3f9 000eb8 00 0 0 1
[13] .debug_macro PROGBITS 00000000 03b2b1 008db5 00 0 0 1
[14] .debug_line PROGBITS 00000000 044066 008976 00 0 0 1
[15] .debug_str PROGBITS 00000000 04c9dc 01fdd3 01 MS 0 0 1
[16] .comment PROGBITS 00000000 06c7af 00004c 01 MS 0 0 1
[17] .ARM.attributes ARM_ATTRIBUTES 00000000 06c7fb 00002b 00 0 0 1
[18] .debug_frame PROGBITS 00000000 06c828 0017fc 00 0 0 4
[19] .symtab SYMTAB 00000000 06e024 002170 10 20 299 4
[20] .strtab STRTAB 00000000 070194 00122a 00 0 0 1
[21] .shstrtab STRTAB 00000000 0713be 0000ea 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
y (purecode), p (processor specific)
If you want to know where each symbol lands in those sections, you can check the mapfile, which
the linker can generate if you use this linker flag: LDFLAGS += -Wl,-Map=mapfile.
Earlier I mentioned that the ARM Cortex-M processors expect a vector table to be present at the
beginning of the executable memory (that’s FLASH in the most common boot mode). Looking at the
mapfile I’ve found this part:
.text 0x0000000008000000 0x4070
*(.vectors)
.vectors 0x0000000008000000 0x150 ../libopencm3/lib/libopencm3_stm32f1.a(vector.o)
0x0000000008000000 vector_table
*(.text*)
.text 0x0000000008000150 0x268 main.o
0x00000000080001f0 _putchar
0x00000000080001f4 str_len
The .text section starts with all the data specified to live in the .vectors section. Looking
at libopencm3/lib/cm3/vector.c one can find this structure:
__attribute__ ((section(".vectors")))
vector_table_t vector_table = {
.initial_sp_value = &_stack,
.reset = reset_handler,
.nmi = nmi_handler,
.hard_fault = hard_fault_handler,
.sv_call = sv_call_handler,
.pend_sv = pend_sv_handler,
.systick = sys_tick_handler,
.irq = {
IRQ_HANDLERS
}
};
The first value references the _stack symbol. This one has been defined both through a command line
parameter, --defsym,_stack=0x20005000, and in the linker script, PROVIDE(_stack = ORIGIN(ram) + LENGTH(ram));.
Both evaluate to the same address and the PROVIDE only acts as a fallback.
The second field points to the reset_handler function. We just looked at that function. That’s
the one that copies the data from FLASH to SRAM.
The same function calls pre_main and finally the main function, which enters your
application code.
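To tie the numbers together, here is a small sketch that reads the first two words of a vector table the way the core does after reset. The two-word fake_image is made up from values seen in this article and it assumes the reset entry holds the same value as the ELF entry point (0x8002b15), since ENTRY(reset_handler) points at the same function. That address is odd because bit 0 marks Thumb code, and the instructions themselves start at the even address below it. The helper names are hypothetical.

```c
#include <stdint.h>

/* Word 0 is the initial stack pointer, words 1..15 are the system
 * exception slots of a Cortex-M vector table. */
#define SYSTEM_ENTRIES 16u

/* First two words of a hypothetical FLASH image (assumed values). */
const uint32_t fake_image[2] = {0x20005000u, 0x08002b15u};

uint32_t initial_sp(const uint32_t *image)   { return image[0]; }
uint32_t reset_vector(const uint32_t *image) { return image[1]; }

/* Bit 0 of a vector marks Thumb code, it is not part of the address. */
int is_thumb(uint32_t vector)            { return (vector & 1u) != 0u; }
uint32_t thumb_target(uint32_t vector)   { return vector & ~1u; }

/* Number of IRQ slots in a table of the given size in bytes. */
uint32_t irq_slots(uint32_t table_bytes)
{
    return table_bytes / 4u - SYSTEM_ENTRIES;
}
```

With the 0x150 bytes that the mapfile reports for .vectors, irq_slots() gives 68 interrupt entries on top of the 16 system ones.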
At this point the microcontroller goes brrrrrrr…
It has been a few years since part 3 of this series. That’s a yikes on my part. Consider this part to be the final one.
Sure, but if you want to learn more, I suggest reading this article: