Overview
Accurate error site recording in embedded programs is crucial for efficient issue localization and analysis when faults occur. The Ameba SoC offers specialized error handling mechanisms, including Crash Dump and Backtrace features:
Crash Dump: Automatically saves the general registers, system registers, and stack information at the moment an exception occurs, enabling the restoration of the error context.
Backtrace: Analyzes and reconstructs the function call path that led to the exception, based on the crash location.
Feature | ARMv8-M | RISC-V |
---|---|---|
Backtrace Start PC | Pushed onto the stack by hardware, needs to be read from the stack | Stored in the mepc register |
Register Saving | Hardware automatically saves some core registers (8 registers) | Hardware does not save general registers, must be saved by software in the Trap Handler |
Frame Pointer (FP) | Usually R7 | Usually x8 (s0/fp) |
Return Address (RA) | LR (Link Register) | ra (Return Address, x1) |
Core Dependency | Relies on hardware stack push mechanism and Fault_Handler | Relies on Trap Handler to save all registers correctly |
Automation Tools | addr2line, GDB, etc., for ELF/memory dump analysis | addr2line, GDB, etc., also apply, following the same principle |
Caution
To improve the performance of the SDK in actual operation, the GCC compiler optimization level is set to -O2 or -Os, which by default enables the -fomit-frame-pointer option. When this option is enabled, the fp register is not saved, meaning that stack backtrace via the fp register is not possible.
Crash Dump
The core principle of crash dump is to utilize the CPU’s internal exception detection mechanism—when an exception occurs, the CPU hardware automatically saves key information at the moment of the exception, then jumps to the exception handling function. The main steps are as follows:
Exception Context Capture: When an exception occurs (such as hard fault, illegal instruction, access exception, etc.), the CPU automatically saves key information of the current execution context, including parts of the general registers, program counter (PC), and exception-related status registers.
Special Register Collection: Core diagnostic information such as error type and fault address is extracted by reading platform-specific exception registers (e.g. ARM’s SCB registers: CFSR, HFSR, MMFAR, BFAR; or RISC-V’s mepc, mcause, mtval, etc.).
Stack Collection: A snapshot of the runtime stack at the exception site is captured through the current stack pointer (ARM: MSP/PSP; RISC-V: sp) for subsequent function backtrace and call chain analysis.
Dump Implementation: During the Fault/Trap handling process, general registers, special registers, and stack data are structured and stored or output (e.g., UART/log/flash), for offline or remote fault analysis.
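Once the dump has been output over UART, the register section can be turned back into structured data on the host. The following is a minimal sketch, assuming the line layout of the sample logs in this document (`[ PC] 0x0e00ed0e` on ARMv8-M, `[x1 -> ra] 0x0c00961e` on RISC-V). The function name `parse_register_dump` is hypothetical and not part of the SDK.

```python
import re

# Matches lines such as "[ PC] 0x0e00ed0e" or "[x1 -> ra] 0x0c00961e".
# Stack-dump rows like "[0x2000e630] 0000000b ..." do not match because the
# value after the bracket has no "0x" prefix.
REG_LINE = re.compile(r"^\[\s*([^\]]+?)\s*\]\s+0x([0-9a-fA-F]{8})\s*$")


def parse_register_dump(log_text):
    """Return {register_name: value} for every register line in the log."""
    registers = {}
    for line in log_text.splitlines():
        match = REG_LINE.match(line.strip())
        if match:
            registers[match.group(1)] = int(match.group(2), 16)
    return registers


# Lines copied from the ARMv8-M sample log; the last row is a stack-dump row
# and is expected to be ignored.
SAMPLE = """\
========== Register Dump ==========
[ LR] 0x0e00ed07
[ PC] 0x0e00ed0e
[xPSR] 0x21000000
[ R3] 0xdeadbeef
[0x2000e630] 0000000b 0e01becb 00000041 deadbeef
"""

regs = parse_register_dump(SAMPLE)
print({name: hex(value) for name, value in regs.items()})
```

Keeping the values as integers rather than strings makes later steps (bit-field decoding, address arithmetic) straightforward.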
Hint
Crash dump makes use of the CPU’s built-in exception detection. This means the following program anomalies cannot be detected by this method:
Program deadlock or infinite loops: Commonly due to logic errors, accesses to non-existent address spaces, or accesses to peripherals without clock enabled, etc.
Resource exhaustion: Typical example is heap exhaustion causing malloc failure
When exceptions occur:
ARM Cortex-M will, at the hardware level, save registers R0~R3, R12, LR, PC, xPSR, etc. If an FPU exists and an exception occurs in a floating-point context, floating-point registers are also saved. See ARM-v8M Exception Stack.
In RISCV, the hardware does not automatically save registers upon exception. They must be saved manually in the Trap Handler. The Ameba SDK already implements saving of X0-X31 and the MSTATUS register, as depicted in RISCV-IMACF Exception Stack.
ARM-v8M Exception Stack
RISCV-IMACF Exception Stack
Note
The stack grows downward and is aligned by 4 Bytes.
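To illustrate the ARMv8-M hardware-stacked frame described above, the sketch below reads the eight basic-frame registers back out of stack-dump rows. It assumes the basic (non-FPU) frame and the `[0xADDR] w0 w1 w2 w3` row format used in the sample log of this document; `parse_stack_rows` and `stacked_frame` are hypothetical helper names.

```python
# Order in which an ARMv8-M core stacks the basic (non-FPU) exception frame,
# lowest address first.
BASIC_FRAME = ("R0", "R1", "R2", "R3", "R12", "LR", "PC", "xPSR")


def parse_stack_rows(rows):
    """Turn '[0xADDR] w0 w1 w2 w3' rows into {address: word}."""
    memory = {}
    for row in rows:
        addr_text, words_text = row.split("]", 1)
        base = int(addr_text.strip().lstrip("["), 16)
        for index, word in enumerate(words_text.split()):
            memory[base + 4 * index] = int(word, 16)
    return memory


def stacked_frame(memory, sp):
    """Read the eight hardware-stacked registers starting at sp."""
    return {name: memory[sp + 4 * i] for i, name in enumerate(BASIC_FRAME)}


# First two rows of the KM4 stack dump in the sample log (PSP = 0x2000e630).
ROWS = [
    "[0x2000e630] 0000000b 0e01becb 00000041 deadbeef",
    "[0x2000e640] 00045418 0e00ed07 0e00ed0e 21000000",
]
frame = stacked_frame(parse_stack_rows(ROWS), 0x2000E630)
for name in BASIC_FRAME:
    print("%-4s = 0x%08x" % (name, frame[name]))
```

The basic frame occupies 8 words (32 bytes), so the interrupted code's stack pointer is 0x2000e630 + 0x20 = 0x2000e650, which is the sp value the sample log reports at the start of its stack trace.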
Backtrace
The core principle of function backtrace is to utilize the stack frame structure: each time a function is called, compiler-generated code creates a stack frame that saves the return address, callee-saved registers, and local variables. The main steps are as follows:
Exception Site Capture: After an exception, the instruction address (PC) at which the exception occurred and the stack must be saved.
Find the Return Address: According to the stack frame size set by the current function and the instruction that manipulates the return address register, locate the previous caller’s return address. Repeat this process until the desired backtrace depth is reached or until the bottom of the stack.
As shown in Function Call Stack, the stack frame structure differs between ISAs:
Function Call Stack
By combining the function stack and specific stack operation instructions in the ELF/AXF/BIN files, the return address of each stack frame can be found, and the complete function call path can be reconstructed.
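The walk described above can be sketched as a small host-side model. This is an illustration under stated assumptions, not the SDK's implementation: `frame_info` is a hypothetical callback standing in for the prologue analysis (it returns the frame size and the offset at which ra was stored), and the memory contents are reconstructed from the `[frame #N]` lines of the RISC-V sample log in this document, since that log does not print the raw stack.

```python
def unwind(read_word, frame_info, sp, pc, max_depth=8):
    """Walk stack frames without relying on a frame pointer.

    read_word(addr) -> 32-bit word stored at addr
    frame_info(pc)  -> (stack_size, ra_offset) for the function containing pc,
                       or None when the prologue cannot be analysed
    """
    frames = []
    for _ in range(max_depth):
        info = frame_info(pc)
        if info is None:
            break
        stack_size, ra_offset = info
        ra = read_word(sp + ra_offset)
        frames.append({"sp": sp, "pc": pc, "stack_size": stack_size, "ra": ra})
        sp += stack_size  # the caller's stack pointer
        pc = ra           # continue the walk in the caller
    return frames


# Reconstructed (hypothetical) stack contents: each function saved ra at
# sp + 12, as crash_SysBreak2 does with "c.swsp ra,12(sp)".
MEMORY = {
    0x2006B970 + 12: 0x0C00964A,
    0x2006B980 + 12: 0x0C009674,
    0x2006B990 + 12: 0x0C0096A2,
    0x2006B9A0 + 12: 0x0C004544,
}
KNOWN_PCS = {0x0C00961E, 0x0C00964A, 0x0C009674, 0x0C0096A2}


def frame_info(pc):
    # Assumption: every analysed function uses a 16-byte frame with ra at +12.
    return (16, 12) if pc in KNOWN_PCS else None


frames = unwind(MEMORY.__getitem__, frame_info, sp=0x2006B970, pc=0x0C00961E)
for i, f in enumerate(frames):
    print("[frame #%d] sp-> 0x%08x, pc-> 0x%08x, stack_size-> %d, ra-> 0x%08x"
          % (i, f["sp"], f["pc"], f["stack_size"], f["ra"]))
```

The walk stops when the prologue of the next function cannot be analysed or when `max_depth` is reached, which mirrors the "desired backtrace depth or bottom of the stack" termination conditions.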
Caution
Under the following conditions, backtrace may fail:
If a crash occurs while processing an interrupt, stack call relationships may be missing, so only the registers at the exception site can be printed. For this reason, interrupt handlers should avoid deep call chains.
When Secure features are enabled, backtrace may fail due to access-protected address ranges.
Analysis and Troubleshooting
By default, when an exception occurs, the crash dump and stack backtrace information are printed via LOGUART:
ARM-v8M
When an ARMv8M CPU exception occurs, the system outputs a detailed crash log. Below is a sample log for an exception caused by Accessing a Non-existent Address:
========== Crash Dump ==========
------------Task Info------------
Fault on task <shell_task>
Task ID: 1
Task TCB:0x2000e720
Current State: 0 (Running)
Base Priority: 5
Current Priority: 5
Run Time Counter: 2
StackTop: 0x2000e630, StackBase: 0x2000d620, StackEnd: 0x2000e6e0, StackSize=1073(word)
Stack High WaterMark: 1001(word)
------------Task Info------------
Exception caught on 0e00ed0e
BFSR: [0x00000004] -> Bus fault is caused by imprecise data access violation
========== Register Dump ==========
[ LR] 0x0e00ed07
[ PC] 0x0e00ed0e
[xPSR] 0x21000000
[EXCR] 0xfffffffd
[ R0] 0x0000000b
[ R1] 0x0e01becb
[ R2] 0x00000041
[ R3] 0xdeadbeef
[ R4] 0x00000000
[ R5] 0x2ffffffc
[ R6] 0x00000001
[ R7] 0x30000000
[ R8] 0x0e01becb
[ R9] 0x0e01c202
[ R10] 0x0e01c204
[ R11] 0x0e01d920
[ R12] 0x00045418
==========KM4 Stack Dump ==========
Current Stack Pointer = 0x2000e630, and dump stack depth = 128
[0x2000e630] 0000000b 0e01becb 00000041 deadbeef
[0x2000e640] 00045418 0e00ed07 0e00ed0e 21000000
[0x2000e650] 2ffffffc 00000001 00000001 e000ed00
[0x2000e660] 20004080 00000000 00011721 20004161
[0x2000e670] 00011671 0e00196b 00000002 0e01d930
[0x2000e680] 2000407c 00000001 00000006 0e00bc77
[0x2000e690] 00000041 200040bc 20004161 00010219
[0x2000e6a0] 2000c3f0 20004160 00000000 2000c470
[0x2000e6b0] 11111111 0e00be41 00000000 01010101
[0x2000e6c0] 04040404 05050505 06060606 07070707
[0x2000e6d0] 08080808 09090909 10101010 0e01040b
[0x2000e6e0] a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5
[0x2000e6f0] a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5
[0x2000e700] 00000000 800001a0 25447cfa d2e11067
[0x2000e710] f5ffef97 38213187 53b359bc c8cadae8
[0x2000e720] 2000e630 79b46159 2000c570 2000c570
[0x2000e730] 2000e720 2000c568 00000006 2000c638
[0x2000e740] 2000c638 2000e720 00000000 00000005
[0x2000e750] 2000d620 6c656873 61745f6c 38006b73
[0x2000e760] 14c74143 3fc569e7 00ad1c45 2000e6e0
[0x2000e770] 00000001 09d35ecf 00000005 00000000
[0x2000e780] 00000000 00000000 00000000 00000002
[0x2000e790] 00000000 2000d00c 2000d074 2000d0dc
[0x2000e7a0] 00000000 00000000 00000000 00000000
[0x2000e7b0] 00000000 00000000 00000000 00000000
[0x2000e7c0] 00000000 00000000 00000000 00000000
[0x2000e7d0] 00000000 00000000 00000000 00000000
[0x2000e7e0] 00000000 00000000 00000000 00000000
[0x2000e7f0] 00000000 00000000 00000000 00000000
[0x2000e800] 00000000 00000000 00000000 00000000
[0x2000e810] 00000000 00000000 00000000 00000000
[0x2000e820] 00000000 00000000 00000000 00000000
========== Stack Trace ==========
Start stack backtracing for sp 0x2000e650, pc 0x0e00ed0e, lr 0x0e00ed07
/opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
========== End of Stack Trace ==========
========== End of Crash Dump ==========
[FAULT-A] SHCSR = 0x000f0002
[FAULT-A] AIRCR = 0xfa054000
[FAULT-A] CONTROL = 0x00000000
Bus Fault:
Secure State: 1
Stacked:
R0 = 0x0000000b
R1 = 0x0e01becb
R2 = 0x00000041
R3 = 0xdeadbeef
R12 = 0x00045418
LR = 0x0e00ed07
PC = 0x0e00ed0e
PSR = 0x21000000
Current:
EXC_RETURN = 0xfffffffd
MSP = 0x20003fe0
PSP = 0x2000e630
xPSR = 0xa0000005
CFSR = 0x00000400
HFSR = 0x00000000
DFSR = 0x00000000
MMFAR = 0x00000000
BFAR = 0x00000000
AFSR = 0x00000000
PriMask = 0x00000000
SVC priority: 0x00
PendSVC priority: 0xe0
Systick priority: 0xe0
MSP_NS = 0x20004000
PSP_NS = 0x00000000
CFSR_NS = 0x00000000
HFSR_NS = 0x00000000
DFSR_NS = 0x00000000
MMFAR_NS = 0x00000000
BFAR_NS = 0x00000000
AFSR_NS = 0x00000000
SVC priority NS: 0x00
PendSVC priority NS: 0x00
Systick priority NS: 0x00
Classification and Troubleshooting of LOG Information:
Task Information: Task Control Block (TCB) information is only printed when an exception occurs during task execution; it will not be printed in bare-metal mode. The task name helps identify the specific task where the error occurred, and by checking the stack usage (e.g., High WaterMark), one can determine if a stack overflow happened. In the above example, the error occurred in the shell task.
Exception Type and Reason: Users can determine the specific error type from the printed message. For example, the printed result
BFSR: [0x00000004] -> Bus fault is caused by imprecise data access violation
indicates a bus fault.
General Registers: When an exception occurs, the general-purpose registers capture what the CPU was accessing. The PC register points to the address of the instruction where the error happened, and the LR register holds the return address in the calling function. Because this bus fault is imprecise, the stacked PC can lie a few instructions after the access that actually faulted. By locating the PC value e00ed0e in the corresponding assembly file, one can find that the exception occurred in rtk_log_memory_dump_word().
0e00ecc8 <rtk_log_memory_dump_word>:
e00ecc8: e92d 47f3 stmdb sp!, {r0, r1, r4, r5, r6, r7, r8, r9, sl, lr}
...
e00ed0e: 2001 movs r0, #1
...
Exception Stack: Records the call stack of function calls at the moment the exception occurred. The stack contains the registers R0~R3, R12, LR, PC, xPSR, etc., that were pushed during the exception.
Stack Backtrace Information: The stack backtrace mechanism can reconstruct the function call sequence. Users can copy this information into the compilation environment of the SDK to directly obtain the function names and their call relationships where errors occurred.
>>> $ /opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
0x0e00ed0e: rtk_log_memory_dump_word at /home/user_name/sdk/component/soc/amebalite/swlib/log.c:152 (discriminator 3)
0x0e001966: cmd_dump_word at /home/user_name/sdk/component/at_cmd/monitor.c:136
0x0e00bc74: shell_cmd_exec_ram at /home/user_name/sdk/component/soc/amebalite/app/monitor/ram/shell_ram.c:68
0x0e00be3c: shell_task_ram at /home/user_name/sdk/component/soc/amebalite/app/monitor/ram/shell_ram.c:236
System Registers: System registers record auxiliary information such as stack pointers, system fault functions, and priority settings.
Caution
The KM4 CPU in Ameba SoC supports stack backtrace functionality, while KM0 does not support it.
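As a cross-check of the ARMv8-M log above, the CFSR value can be split into its three sub-registers by hand. The sketch below does this on the host; the bit names follow the Armv8-M architecture definition of CFSR, while `decode_cfsr` itself is a hypothetical helper, not an SDK function.

```python
# CFSR layout: MMFSR = bits [7:0], BFSR = bits [15:8], UFSR = bits [31:16].
MMFSR_BITS = {0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR",
              4: "MSTKERR", 5: "MLSPERR", 7: "MMARVALID"}
BFSR_BITS = {0: "IBUSERR", 1: "PRECISERR", 2: "IMPRECISERR",
             3: "UNSTKERR", 4: "STKERR", 5: "LSPERR", 7: "BFARVALID"}
UFSR_BITS = {0: "UNDEFINSTR", 1: "INVSTATE", 2: "INVPC", 3: "NOCP",
             4: "STKOF", 8: "UNALIGNED", 9: "DIVBYZERO"}


def _flags(value, names):
    """List the names of all bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


def decode_cfsr(cfsr):
    """Split CFSR into (raw value, flag names) for each sub-register."""
    mmfsr = cfsr & 0xFF
    bfsr = (cfsr >> 8) & 0xFF
    ufsr = (cfsr >> 16) & 0xFFFF
    return {
        "MMFSR": (mmfsr, _flags(mmfsr, MMFSR_BITS)),
        "BFSR": (bfsr, _flags(bfsr, BFSR_BITS)),
        "UFSR": (ufsr, _flags(ufsr, UFSR_BITS)),
    }


# CFSR value taken from the sample log.
decoded = decode_cfsr(0x00000400)
for name, (raw, flags) in decoded.items():
    print("%-5s = 0x%04x %s" % (name, raw, flags))
```

For the sample value 0x00000400 only BFSR is non-zero (0x04, IMPRECISERR), which agrees with the `BFSR: [0x00000004]` line in the log. BFARVALID is clear, so the BFAR value of 0 in the log does not identify the faulting address.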
RISCV
When a RISCV CPU encounters an exception, the system outputs a detailed crash log. The following LOG showcases an exception caused by Hitting Debug Breakpoint:
------------------------------------
Have a test on crash_dump and back trace
crash_SysBreak()
crash_SysBreak1()
crash_SysBreak2()==> Issue BREAK instruction
========== Crash Dump ==========
------------Task Info------------
Fault on task <shell_task>
Task ID: 1
Task TCB:0x2006ba40
Current State: 0 (Running)
Base Priority: 5
Current Priority: 5
Run Time Counter: 2
StackTop: 0x2006b5b8, StackBase: 0x2006b480, StackEnd: 0x2006ba00, StackSize=353(word)
Stack High WaterMark: 78(word)
------------Task Info------------
Exception caught on 0x0c00961e with reason [0x3] -> [Breakpoint]
========== Register Dump ==========
[mscratch] 0x00000000
[mepc] 0x0c00961e
[mcause] 0x00000003
[mtval] 0x00000000
[x0 -> zero] 0x00000000
[x1 -> ra] 0x0c00961e
[x2 -> sp] 0x2006b970
[x3 -> gp] 0x20068f24
[x4 -> tp] 0xffffffff
[x5 -> t0] 0xa5a5a5a5
[x6 -> t1] 0x000bee33
[x7 -> t2] 0xa5a5a5a5
[x8 -> s0/fp] 0x2006b980
[x9 -> s1] 0x00000002
[x10 -> a0] 0x0000002e
[x11 -> a1] 0x00000000
[x12 -> a2] 0x00000000
[x13 -> a3] 0x2006ba40
[x14 -> a4] 0x00000000
[x15 -> a5] 0x00000000
[x16 -> a6] 0x00200000
[x17 -> a7] 0x41014000
[x18 -> s2] 0x20000b04
[x19 -> s3] 0x00000005
[x20 -> s4] 0x00000006
[x21 -> s5] 0x20000be9
[x22 -> s6] 0x0c016374
[x23 -> s7] 0x0c013000
[x24 -> s8] 0xa5a5a5a5
[x25 -> s9] 0xa5a5a5a5
[x26 -> s10] 0xa5a5a5a5
[x27 -> s11] 0xa5a5a5a5
[x28 -> t3] 0x0000000f
[x29 -> t4] 0xa5a5a5a5
[x30 -> t5] 0xa5a5a5a5
[x31 -> t6] 0xa5a5a5a5
========== Stack Trace ==========
Start stack backtracing for sp 0x2006b970, pc 0x0c00961e
[frame #0] sp-> 0x2006b970, pc-> 0x0c00961e, stack_size-> 16, ra-> 0x0c00964a
[frame #1] sp-> 0x2006b980, pc-> 0x0c00964a, stack_size-> 16, ra-> 0x0c009674
[frame #2] sp-> 0x2006b990, pc-> 0x0c009674, stack_size-> 16, ra-> 0x0c0096a2
[frame #3] sp-> 0x2006b9a0, pc-> 0x0c0096a2, stack_size-> 16, ra-> 0x0c004544
========== End of Stack Trace ==========
========== End of Crash Dump ==========
Classification and Troubleshooting of LOG Information:
Task Information: Task Control Block information is only printed when an exception occurs during task execution; it will not be printed in bare-metal mode. The task name helps to confirm that the above error (the debug breakpoint is RISC-V exception code 3) occurred in the shell task.
Exception Type and Reason: Users can determine the specific error type from the printed message. For example, the output
Exception caught on 0x0c00961e with reason [0x3] -> [Breakpoint]
indicates a debug exception triggered by the ebreak instruction.
General Registers: General-purpose registers record what the CPU was accessing before the exception. The mepc register points to the address of the instruction where the error occurred, and the x1 (ra) register holds the return address of the most recent call. By locating the PC value 0x0c00961e in the corresponding assembly file, it can be seen that the exception occurred in crash_SysBreak2().
0c009602 <crash_SysBreak2>:
c009602: 1141 c.addi sp,-16
c009604: c606 c.swsp ra,12(sp)
c009606: c422 c.swsp s0,8(sp)
c009608: 0800 c.addi4spn s0,sp,16
c00960a: 0c0157b7 lui a5,0xc015
c00960e: cd878593 addi a1,a5,-808 # c014cd8 <__func__.2>
c009612: 0c0157b7 lui a5,0xc015
c009616: bbc78513 addi a0,a5,-1092 # c014bbc <pmap_func+0x648>
c00961a: ebefe0ef jal ra,c007cd8 <__wrap_printf>
c00961e: 9002 c.ebreak # Exception Location
c009620: 0001 c.addi zero,0
c009622: 40b2 c.lwsp ra,12(sp)
c009624: 4422 c.lwsp s0,8(sp)
c009626: 0141 c.addi sp,16
c009628: 8082 c.jr ra
Exception Stack: Records the call stack at the time the exception occurred. Not printed in this example; users can change CONFIG_DEBUG_BACK_TRACE to #undef to enable stack printing.
Stack Backtrace Information: The stack backtrace mechanism helps reconstruct the function call sequence. Users can use the PC values found from the log to locate the corresponding functions in the assembly files.
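Unlike the ARMv8-M log, the RISC-V stack trace above does not print a ready-made addr2line command. A minimal sketch for building one from the `[frame #N]` lines follows. The tool path and image path are placeholders that must be replaced with the actual RISC-V toolchain and image of the project; the `-afpiC` flags are the same ones used in the ARMv8-M example.

```python
import re

# Captures the frame index and the pc value of each "[frame #N]" line.
FRAME_LINE = re.compile(r"\[frame #(\d+)\].*?pc->\s*0x([0-9a-fA-F]+)")


def frame_pcs(log_text):
    """Return the pc of every backtrace frame, innermost first."""
    return [int(m.group(2), 16) for m in FRAME_LINE.finditer(log_text)]


def addr2line_argv(tool, image_path, pcs):
    """Build an argument list suitable for subprocess.run()."""
    return [tool, "-e", image_path, "-afpiC"] + ["0x%08x" % pc for pc in pcs]


# Stack trace lines copied from the RISC-V sample log.
SAMPLE = """\
Start stack backtracing for sp 0x2006b970, pc 0x0c00961e
[frame #0] sp-> 0x2006b970, pc-> 0x0c00961e, stack_size-> 16, ra-> 0x0c00964a
[frame #1] sp-> 0x2006b980, pc-> 0x0c00964a, stack_size-> 16, ra-> 0x0c009674
[frame #2] sp-> 0x2006b990, pc-> 0x0c009674, stack_size-> 16, ra-> 0x0c0096a2
[frame #3] sp-> 0x2006b9a0, pc-> 0x0c0096a2, stack_size-> 16, ra-> 0x0c004544
"""

pcs = frame_pcs(SAMPLE)
# Placeholder paths: substitute the real toolchain binary and image file.
argv = addr2line_argv("path/to/riscv-addr2line", "path/to/image.axf", pcs)
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a machine where the toolchain is installed.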
Common Exceptions and Faults
Common ARMv8-M Faults
Typical faults on ARMv8-M include Bus Fault, Usage Fault, MemManage Fault, etc. When the CPU implements the Security Extension, a Secure Fault may also occur.
No. | Name | Brief Description |
---|---|---|
3 | HardFault | All non-maskable unexpected errors, not covered by other faults |
4 | MemManage Fault | Memory management error; code/data access rights violation |
5 | BusFault | Bus access error (e.g., invalid address) |
6 | UsageFault | Usage error (misalignment, illegal instruction, stack overflow, etc.) |
7 | SecureFault | Security error (illegal access to protected area) |
Note
KM0 does not support bus fault, usage fault, and memory management fault. All faults are handled by Hard Fault.
If the enable bits for Bus Fault, Usage Fault, or MemManage Fault are not set in the System Handler Control and State Register (SHCSR), all such faults will be handled as Hard Faults.
If a new Fault occurs while handling a Usage Fault or MemManage Fault, it will escalate to a Hard Fault.
MemManage Fault usually occurs after the MPU is enabled when accessing protected address space.
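Whether a fault is handled by its own handler or escalated can be checked from the SHCSR value printed in the crash log. The sketch below decodes the enable and active bits; the bit positions follow the Armv8-M definition of SHCSR, and `decode_shcsr` is a hypothetical helper name.

```python
# Enable bits: when clear, the corresponding fault escalates to HardFault.
SHCSR_ENABLE_BITS = {
    "MEMFAULTENA": 16,
    "BUSFAULTENA": 17,
    "USGFAULTENA": 18,
    "SECUREFAULTENA": 19,
}
# Active bits: set while the corresponding handler is executing.
SHCSR_ACTIVE_BITS = {
    "MEMFAULTACT": 0,
    "BUSFAULTACT": 1,
    "HARDFAULTACT": 2,
    "USGFAULTACT": 3,
    "SECUREFAULTACT": 4,
}


def decode_shcsr(shcsr):
    """Return which fault handlers are enabled and which are active."""
    enabled = {name: bool(shcsr & (1 << bit))
               for name, bit in SHCSR_ENABLE_BITS.items()}
    active = [name for name, bit in sorted(SHCSR_ACTIVE_BITS.items(),
                                           key=lambda item: item[1])
              if shcsr & (1 << bit)]
    return {"enabled": enabled, "active": active}


# SHCSR value taken from the ARMv8-M sample log.
state = decode_shcsr(0x000F0002)
print(state)
```

For the sample value 0x000f0002 all four enable bits are set and only BUSFAULTACT is active, which is consistent with the log reporting a Bus Fault rather than an escalated Hard Fault.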
Common RISC-V Exceptions
The exception mechanism in RISC-V architecture is quite different from that of ARM. RISC-V identifies the error through the exception code in the mcause register, together with mepc and mtval to locate it. Common exceptions include illegal memory access, stack overflows, illegal instructions, environment calls, and more.
No. / Code | Name | Brief Description |
---|---|---|
0 | Instruction address misaligned | Misaligned instruction address access |
1 | Instruction access fault | Instruction fetch error (e.g., invalid memory) |
2 | Illegal instruction | Illegal or unimplemented instruction |
3 | Breakpoint | Breakpoint exception, typically for debugging |
4 | Load address misaligned | Misaligned memory access on load |
5 | Load access fault | Load (read) access to invalid or forbidden memory |
6 | Store/AMO address misaligned | Misaligned address access on store/AMO |
7 | Store/AMO access fault | Store/AMO access to invalid or forbidden memory |
8 | Environment call from U-mode | Environment call (ecall) from User mode |
9 | Environment call from S-mode | Environment call from Supervisor mode |
11 | Environment call from M-mode | Environment call from Machine mode |
12 | Instruction page fault | Instruction page fault (e.g., non-existent or no permission page) |
13 | Load page fault | Data load page fault |
15 | Store/AMO page fault | Store/AMO page fault |
Note
KR4 is a RISC-V IMACF architecture CPU, and only implements User and Machine modes; Supervisor mode is not implemented.
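The table above can be applied directly to the mcause value printed in the register dump. A minimal sketch, assuming a 32-bit core where the top bit of mcause marks an interrupt and the remaining bits hold the code; `decode_mcause` is a hypothetical helper name.

```python
# Exception codes from the table above (synchronous exceptions only).
EXCEPTION_NAMES = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
    8: "Environment call from U-mode",
    9: "Environment call from S-mode",
    11: "Environment call from M-mode",
    12: "Instruction page fault",
    13: "Load page fault",
    15: "Store/AMO page fault",
}


def decode_mcause(mcause, xlen=32):
    """Return (is_interrupt, code, description) for an mcause value."""
    is_interrupt = bool((mcause >> (xlen - 1)) & 1)
    code = mcause & ((1 << (xlen - 1)) - 1)
    if is_interrupt:
        description = "Interrupt %d" % code
    else:
        description = EXCEPTION_NAMES.get(code, "Reserved or custom")
    return is_interrupt, code, description


# mcause value taken from the RISC-V sample log.
print(decode_mcause(0x00000003))
```

For the sample log, mcause 0x00000003 decodes to a synchronous exception with code 3 (Breakpoint), matching the `reason [0x3] -> [Breakpoint]` line.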
Similarities and Differences
Type | RISC-V Exception Example | ARMv8-M Exception Example |
---|---|---|
Misaligned Address | Instruction/Load/Store misaligned | UsageFault: Unaligned Access |
Access Violation / Invalid Address | Load/Store Access Fault | MemManage/BusFault |
Illegal / Undefined Instruction | Illegal instruction | UsageFault: Undefined Instruction |
System Call / Software Interrupt | Environment call (ecall) | SVCall (SVC) |
Debug Exception | Breakpoint | Debug Monitor / Breakpoint |
Page Table Fault | Series of Page Faults | N/A (Cortex-M lacks MMU, only supports MPU) |
Stack Overflow | Fault caused by access violation | UsageFault: Stack overflow |