Overview

Recording the exact fault site is crucial for locating and analyzing faults in embedded programs efficiently. The Ameba SoC provides dedicated error handling mechanisms, including the Crash Dump and Backtrace features:

  • Crash Dump: Automatically saves the general registers, system registers, and stack information at the moment an exception occurs, enabling the restoration of the error context.

  • Backtrace: Analyzes and reconstructs the function call path that led to the exception, based on the crash location.

Exception Handling Characteristics for Different Instruction Set Architectures

Feature              ARMv8-M                                                             RISC-V
-------------------  ------------------------------------------------------------------  ---------------------------------------------------------------------------------------
Backtrace Start PC   Pushed onto the stack by hardware, needs to be read from the stack  Stored in the mepc register
Register Saving      Hardware automatically saves some core registers (8 registers)      Hardware does not save general registers, must be saved by software in the Trap Handler
Frame Pointer (FP)   Usually R7                                                          Usually x8 (s0/fp)
Return Address (RA)  LR (Link Register)                                                  ra (Return Address, x1)
Core Dependency      Relies on hardware stack push mechanism and Fault_Handler           Relies on Trap Handler to save all registers correctly
Automation Tools     addr2line, GDB, etc., for ELF/memory dump analysis                  addr2line, GDB, etc., also apply, following the same principle

Caution

To improve runtime performance, the SDK is built with GCC optimization level -O2 or -Os, both of which enable the -fomit-frame-pointer option by default. With this option enabled, the compiler does not maintain a frame pointer, so stack backtrace via the fp register is not possible.

Crash Dump

The core principle of crash dump is to utilize the CPU’s internal exception detection mechanism—when an exception occurs, the CPU hardware automatically saves key information at the moment of the exception, then jumps to the exception handling function. The main steps are as follows:

  • Exception Context Capture: When an exception occurs (such as hard fault, illegal instruction, access exception, etc.), the CPU automatically saves key information of the current execution context, including parts of the general registers, program counter (PC), and exception-related status registers.

  • Special Register Collection: Core diagnostic information such as error type and fault address is extracted by reading platform-specific exception registers (e.g. ARM’s SCB registers: CFSR, HFSR, MMFAR, BFAR; or RISC-V’s mepc, mcause, mtval, etc.).

  • Stack Collection: A snapshot of the runtime stack at the exception site is captured through the current stack pointer (ARM: MSP/PSP; RISC-V: sp) for subsequent function backtrace and call chain analysis.

  • Dump Implementation: During the Fault/Trap handling process, general registers, special registers, and stack data are structured and stored or output (e.g., UART/log/flash), for offline or remote fault analysis.
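As an illustration of the Special Register Collection step, the following host-side sketch splits a raw CFSR value into its MMFSR, BFSR, and UFSR sub-registers and names the bits that are set. The helper name `decode_cfsr` is hypothetical and not part of the SDK; the bit names follow the Armv8-M Mainline CFSR layout.

```python
# Host-side sketch (not SDK code): decode an Armv8-M CFSR value.
# CFSR layout: MMFSR = bits [7:0], BFSR = bits [15:8], UFSR = bits [31:16].
MMFSR_BITS = {0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR",
              4: "MSTKERR", 5: "MLSPERR", 7: "MMARVALID"}
BFSR_BITS = {0: "IBUSERR", 1: "PRECISERR", 2: "IMPRECISERR",
             3: "UNSTKERR", 4: "STKERR", 5: "LSPERR", 7: "BFARVALID"}
UFSR_BITS = {0: "UNDEFINSTR", 1: "INVSTATE", 2: "INVPC", 3: "NOCP",
             4: "STKOF", 8: "UNALIGNED", 9: "DIVBYZERO"}


def _names(value, table):
    """Return the names of all bits in `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


def decode_cfsr(cfsr):
    """Hypothetical helper: split CFSR into its three sub-registers."""
    return {
        "MMFSR": _names(cfsr & 0xFF, MMFSR_BITS),
        "BFSR": _names((cfsr >> 8) & 0xFF, BFSR_BITS),
        "UFSR": _names((cfsr >> 16) & 0xFFFF, UFSR_BITS),
    }


if __name__ == "__main__":
    # CFSR value taken from the ARM-v8M sample log in this document.
    print(decode_cfsr(0x00000400))
```

For the CFSR value 0x00000400 printed in the ARM-v8M sample log, only IMPRECISERR in BFSR is reported, which agrees with the log's "imprecise data access violation" message.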

Hint

Crash dump makes use of the CPU’s built-in exception detection. This means the following program anomalies cannot be detected by this method:

  • Program deadlock or infinite loops: Commonly due to logic errors, accesses to non-existent address spaces, or accesses to peripherals without clock enabled, etc.

  • Resource exhaustion: Typical example is heap exhaustion causing malloc failure

When exceptions occur:

  • ARM Cortex-M will, at the hardware level, save registers R0~R3, R12, LR, PC, xPSR, etc. If an FPU exists and an exception occurs in a floating-point context, floating-point registers are also saved. See ARM-v8M Exception Stack.

  • On RISC-V, the hardware does not save any general-purpose registers on exception entry; the Trap Handler must save them in software. The Ameba SDK already saves x0-x31 and the mstatus register, as depicted in RISCV-IMACF Exception Stack.

[Figure: ARM-v8M Exception Stack]

[Figure: RISCV-IMACF Exception Stack]

Note

The stack grows downward and is aligned to 4 bytes.
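To make the ARM hardware-stacked frame concrete, the sketch below unpacks the eight words found at the stacked SP into R0~R3, R12, LR, PC, and xPSR. It assumes the basic frame only (no floating-point context and no additional Secure state context), little-endian memory, and a hypothetical helper name `unpack_basic_frame`; it is a host-side illustration, not SDK code.

```python
import struct

# Order in which the hardware pushes the basic frame, lowest address first.
FRAME_FIELDS = ("R0", "R1", "R2", "R3", "R12", "LR", "PC", "xPSR")


def unpack_basic_frame(stack_bytes):
    """Hypothetical helper: decode the 8-word basic exception frame.

    `stack_bytes` must start at the stacked SP and hold at least 32 bytes
    of little-endian data.
    """
    words = struct.unpack_from("<8I", stack_bytes, 0)
    return dict(zip(FRAME_FIELDS, words))


if __name__ == "__main__":
    # First eight words of the KM4 stack dump in the ARM-v8M sample log
    # (addresses 0x2000e630 to 0x2000e64c).
    dump_words = [0x0000000B, 0x0E01BECB, 0x00000041, 0xDEADBEEF,
                  0x00045418, 0x0E00ED07, 0x0E00ED0E, 0x21000000]
    raw = struct.pack("<8I", *dump_words)
    for name, value in unpack_basic_frame(raw).items():
        print(f"{name:>4} = 0x{value:08x}")
```

The decoded values correspond to the "Stacked:" block of the sample log, where PC is 0x0e00ed0e and LR is 0x0e00ed07.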

Backtrace

The core principle of function backtrace is to utilize the stack frame structure: each time a function is called, compiler-generated code creates a stack frame that holds the return address, callee-saved registers, and local variables. The main steps are as follows:

  • Exception Site Capture: After an exception, the instruction address (PC) at which the exception occurred and the stack must be saved.

  • Find the Return Address: Using the stack frame size allocated by the current function and the instruction that saves the return address register, locate the caller's return address. Repeat this process until the desired backtrace depth is reached or the bottom of the stack is reached.

As shown in Function Call Stack, the stack frame structure differs between ISAs:

[Figure: Function Call Stack]

By combining the function stack and specific stack operation instructions in the ELF/AXF/BIN files, the return address of each stack frame can be found, and the complete function call path can be reconstructed.
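The idea of combining stack contents with stack operation instructions can be sketched on the host as follows. This is an illustration only, not the SDK's backtrace implementation: `decode_prologue` and `unwind_step` are hypothetical names, and only two compressed RISC-V encodings (`c.addi sp, -N` and `c.swsp ra, off(sp)`) are recognised. Larger frames that use `c.addi16sp` or 32-bit instructions would need more decoding.

```python
def decode_prologue(halfwords):
    """Return (frame_size, ra_offset) found in a list of 16-bit RVC opcodes.

    Unrecognised instructions are skipped; a value stays None if the
    corresponding instruction is not found.
    """
    frame_size = None
    ra_offset = None
    for hw in halfwords:
        quadrant = hw & 0x3
        funct3 = (hw >> 13) & 0x7
        if quadrant == 0b01 and funct3 == 0b000 and ((hw >> 7) & 0x1F) == 2:
            # c.addi sp, imm: imm[5] is bit 12, imm[4:0] is bits 6:2.
            imm = (((hw >> 12) & 0x1) << 5) | ((hw >> 2) & 0x1F)
            if imm & 0x20:
                imm -= 0x40  # sign-extend the 6-bit immediate
            if imm < 0:
                frame_size = -imm  # a negative adjustment allocates the frame
        elif quadrant == 0b10 and funct3 == 0b110 and ((hw >> 2) & 0x1F) == 1:
            # c.swsp ra, uimm(sp): uimm[5:2] is bits 12:9, uimm[7:6] is bits 8:7.
            ra_offset = (((hw >> 9) & 0xF) << 2) | (((hw >> 7) & 0x3) << 6)
    return frame_size, ra_offset


def unwind_step(sp, frame_size, ra_offset, read_word):
    """Return (caller_sp, return_address) for one stack frame."""
    return sp + frame_size, read_word(sp + ra_offset)


if __name__ == "__main__":
    # First four opcodes of crash_SysBreak2 from the RISC-V disassembly.
    size, off = decode_prologue([0x1141, 0xC606, 0xC422, 0x0800])
    # Hypothetical one-word memory image, chosen to agree with frame #0
    # of the RISC-V sample log (the log itself prints no stack contents).
    memory = {0x2006B970 + 12: 0x0C00964A}
    caller_sp, ra = unwind_step(0x2006B970, size, off, memory.__getitem__)
    print(size, off, hex(caller_sp), hex(ra))
```

Repeating `unwind_step` with the prologue of each caller walks up the call chain in the same way as the `[frame #N]` lines of the RISC-V sample log.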

Caution

Under the following conditions, backtrace may fail:

  • If a crash occurs while processing an interrupt, stack call relationships may be missing, so only the registers at the exception site can be printed. For this reason, interrupt handlers should avoid deep call chains.

  • When Secure features are enabled, backtrace may fail due to access-protected address ranges.

Analysis and Troubleshooting

By default, when an exception occurs, the crash dump and stack backtrace information are printed via LOGUART:

ARM-v8M

When an exception occurs on an ARMv8-M CPU, the system outputs a detailed crash log. Below is a sample log for an exception caused by Accessing a Non-existent Address:

Accessing a Non-existent Address
========== Crash Dump ==========
------------Task Info------------
Fault on task <shell_task>
Task ID: 1
Task TCB:0x2000e720
Current State: 0 (Running)
Base Priority: 5
Current Priority: 5
Run Time Counter: 2
StackTop: 0x2000e630, StackBase: 0x2000d620, StackEnd: 0x2000e6e0, StackSize=1073(word)
Stack High WaterMark: 1001(word)
------------Task Info------------
Exception caught on 0e00ed0e
BFSR: [0x00000004] -> Bus fault is caused by imprecise data access violation
========== Register Dump ==========
[  LR] 0x0e00ed07
[  PC] 0x0e00ed0e
[xPSR] 0x21000000
[EXCR] 0xfffffffd
[ R0] 0x0000000b
[ R1] 0x0e01becb
[ R2] 0x00000041
[ R3] 0xdeadbeef
[ R4] 0x00000000
[ R5] 0x2ffffffc
[ R6] 0x00000001
[ R7] 0x30000000
[ R8] 0x0e01becb
[ R9] 0x0e01c202
[ R10] 0x0e01c204
[ R11] 0x0e01d920
[ R12] 0x00045418
==========KM4 Stack Dump ==========
Current Stack Pointer = 0x2000e630, and dump stack depth = 128
[0x2000e630] 0000000b 0e01becb 00000041 deadbeef
[0x2000e640] 00045418 0e00ed07 0e00ed0e 21000000
[0x2000e650] 2ffffffc 00000001 00000001 e000ed00
[0x2000e660] 20004080 00000000 00011721 20004161
[0x2000e670] 00011671 0e00196b 00000002 0e01d930
[0x2000e680] 2000407c 00000001 00000006 0e00bc77
[0x2000e690] 00000041 200040bc 20004161 00010219
[0x2000e6a0] 2000c3f0 20004160 00000000 2000c470
[0x2000e6b0] 11111111 0e00be41 00000000 01010101
[0x2000e6c0] 04040404 05050505 06060606 07070707
[0x2000e6d0] 08080808 09090909 10101010 0e01040b
[0x2000e6e0] a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5
[0x2000e6f0] a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5
[0x2000e700] 00000000 800001a0 25447cfa d2e11067
[0x2000e710] f5ffef97 38213187 53b359bc c8cadae8
[0x2000e720] 2000e630 79b46159 2000c570 2000c570
[0x2000e730] 2000e720 2000c568 00000006 2000c638
[0x2000e740] 2000c638 2000e720 00000000 00000005
[0x2000e750] 2000d620 6c656873 61745f6c 38006b73
[0x2000e760] 14c74143 3fc569e7 00ad1c45 2000e6e0
[0x2000e770] 00000001 09d35ecf 00000005 00000000
[0x2000e780] 00000000 00000000 00000000 00000002
[0x2000e790] 00000000 2000d00c 2000d074 2000d0dc
[0x2000e7a0] 00000000 00000000 00000000 00000000
[0x2000e7b0] 00000000 00000000 00000000 00000000
[0x2000e7c0] 00000000 00000000 00000000 00000000
[0x2000e7d0] 00000000 00000000 00000000 00000000
[0x2000e7e0] 00000000 00000000 00000000 00000000
[0x2000e7f0] 00000000 00000000 00000000 00000000
[0x2000e800] 00000000 00000000 00000000 00000000
[0x2000e810] 00000000 00000000 00000000 00000000
[0x2000e820] 00000000 00000000 00000000 00000000
========== Stack Trace ==========
Start stack backtracing for sp 0x2000e650, pc 0x0e00ed0e, lr 0x0e00ed07
/opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
========== End of Stack Trace ==========
========== End of Crash Dump ==========

[FAULT-A] SHCSR = 0x000f0002
[FAULT-A] AIRCR = 0xfa054000
[FAULT-A] CONTROL = 0x00000000

Bus Fault:
Secure State: 1

Stacked:
R0 = 0x0000000b
R1 = 0x0e01becb
R2 = 0x00000041
R3 = 0xdeadbeef
R12 = 0x00045418
LR = 0x0e00ed07
PC = 0x0e00ed0e
PSR = 0x21000000

Current:
EXC_RETURN = 0xfffffffd
MSP = 0x20003fe0
PSP = 0x2000e630
xPSR = 0xa0000005
CFSR  = 0x00000400
HFSR  = 0x00000000
DFSR  = 0x00000000
MMFAR = 0x00000000
BFAR  = 0x00000000
AFSR  = 0x00000000
PriMask = 0x00000000
SVC priority: 0x00
PendSVC priority: 0xe0
Systick priority: 0xe0
MSP_NS = 0x20004000
PSP_NS = 0x00000000
CFSR_NS  = 0x00000000
HFSR_NS  = 0x00000000
DFSR_NS  = 0x00000000
MMFAR_NS = 0x00000000
BFAR_NS  = 0x00000000
AFSR_NS  = 0x00000000
SVC priority NS: 0x00
PendSVC priority NS: 0x00
Systick priority NS: 0x00

Classification and Troubleshooting of LOG Information:

  • Task Information: Task Control Block (TCB) information is only printed when an exception occurs during task execution; it will not be printed in bare-metal mode. The task name helps identify the specific task where the error occurred, and by checking the stack usage (e.g., High WaterMark), one can determine if a stack overflow happened. In the above example, the error occurred in the shell task.

  • Exception Type and Reason: Users can determine the specific error type from the printed message. For example, the printed result BFSR: [0x00000004] -> Bus fault is caused by imprecise data access violation indicates a bus fault.

  • General Registers: The general-purpose registers show the CPU state at the moment of the exception. The PC register holds the address of the instruction at which the exception was taken, and the LR register holds the return address into the calling function. Looking up the PC value 0x0e00ed0e in the corresponding assembly file shows that the exception occurred in rtk_log_memory_dump_word(). Because the bus fault in this example is imprecise, the access that actually faulted may lie a few instructions before the reported PC.

0e00ecc8 <rtk_log_memory_dump_word>:
e00ecc8:     e92d 47f3       stmdb   sp!, {r0, r1, r4, r5, r6, r7, r8, r9, sl, lr}
...
e00ed0e:     2001            movs    r0, #1
...
  • Exception Stack: A dump of the stack contents at the moment the exception occurred. It begins with the registers R0~R3, R12, LR, PC, and xPSR that the hardware pushed on exception entry.

  • Stack Backtrace Information: The stack backtrace mechanism reconstructs the function call sequence. Users can copy the printed addr2line command into a shell in the SDK build environment to obtain the function names, source locations, and call relationships that led to the error.

>>> $ /opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
0x0e00ed0e: rtk_log_memory_dump_word at /home/user_name/sdk/component/soc/amebalite/swlib/log.c:152 (discriminator 3)
0x0e001966: cmd_dump_word at /home/user_name/sdk/component/at_cmd/monitor.c:136
0x0e00bc74: shell_cmd_exec_ram at /home/user_name/sdk/component/soc/amebalite/app/monitor/ram/shell_ram.c:68
0x0e00be3c: shell_task_ram at /home/user_name/sdk/component/soc/amebalite/app/monitor/ram/shell_ram.c:236
  • System Registers: System registers record auxiliary information such as stack pointers, fault status registers, and exception priority settings.
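When logs are collected automatically, the addr2line command can be pulled out of the captured text instead of being copied by hand. The sketch below is a minimal host-side example; the function names are hypothetical, and it assumes the command occupies one log line that starts with the tool path, as in the sample log above.

```python
import re
import shlex

# Matches a whole log line whose first token ends with "addr2line".
ADDR2LINE_RE = re.compile(r"^\S*addr2line\s.*$", re.MULTILINE)

# Excerpt of the ARM-v8M sample log from this document.
SAMPLE = """========== Stack Trace ==========
Start stack backtracing for sp 0x2000e650, pc 0x0e00ed0e, lr 0x0e00ed07
/opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
========== End of Stack Trace ==========
"""


def find_addr2line_command(log_text):
    """Return the argv list of the first addr2line line, or None."""
    match = ADDR2LINE_RE.search(log_text)
    return shlex.split(match.group(0)) if match else None


def backtrace_addresses(argv):
    """Return the hexadecimal address arguments of an addr2line argv list."""
    return [arg for arg in argv if re.fullmatch(r"0x[0-9a-fA-F]+", arg)]


if __name__ == "__main__":
    argv = find_addr2line_command(SAMPLE)
    print(argv[0])
    print(backtrace_addresses(argv))
```

The returned argv list can be passed to `subprocess.run` on a machine where the toolchain and the matching .axf file exist; the sketch deliberately stops short of running it.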

Caution

The KM4 CPU in Ameba SoC supports stack backtrace functionality, while KM0 does not support it.

RISCV

When a RISC-V CPU encounters an exception, the system outputs a detailed crash log. The following log shows an exception caused by Hitting Debug Breakpoint:

Hitting Debug Breakpoint
------------------------------------
Have a test on crash_dump and back trace
crash_SysBreak()
crash_SysBreak1()
crash_SysBreak2()==> Issue BREAK instruction
========== Crash Dump ==========
------------Task Info------------
Fault on task <shell_task>
Task ID: 1
Task TCB:0x2006ba40
Current State: 0 (Running)
Base Priority: 5
Current Priority: 5
Run Time Counter: 2
StackTop: 0x2006b5b8, StackBase: 0x2006b480, StackEnd: 0x2006ba00, StackSize=353(word)
Stack High WaterMark: 78(word)
------------Task Info------------
Exception caught on 0x0c00961e with reason [0x3] -> [Breakpoint]
========== Register Dump ==========
[mscratch] 0x00000000
[mepc]     0x0c00961e
[mcause]   0x00000003
[mtval]    0x00000000
[x0 -> zero] 0x00000000
[x1 -> ra] 0x0c00961e
[x2 -> sp] 0x2006b970
[x3 -> gp] 0x20068f24
[x4 -> tp] 0xffffffff
[x5 -> t0] 0xa5a5a5a5
[x6 -> t1] 0x000bee33
[x7 -> t2] 0xa5a5a5a5
[x8 -> s0/fp] 0x2006b980
[x9 -> s1] 0x00000002
[x10 -> a0] 0x0000002e
[x11 -> a1] 0x00000000
[x12 -> a2] 0x00000000
[x13 -> a3] 0x2006ba40
[x14 -> a4] 0x00000000
[x15 -> a5] 0x00000000
[x16 -> a6] 0x00200000
[x17 -> a7] 0x41014000
[x18 -> s2] 0x20000b04
[x19 -> s3] 0x00000005
[x20 -> s4] 0x00000006
[x21 -> s5] 0x20000be9
[x22 -> s6] 0x0c016374
[x23 -> s7] 0x0c013000
[x24 -> s8] 0xa5a5a5a5
[x25 -> s9] 0xa5a5a5a5
[x26 -> s10] 0xa5a5a5a5
[x27 -> s11] 0xa5a5a5a5
[x28 -> t3] 0x0000000f
[x29 -> t4] 0xa5a5a5a5
[x30 -> t5] 0xa5a5a5a5
[x31 -> t6] 0xa5a5a5a5
========== Stack Trace ==========
Start stack backtracing for sp 0x2006b970, pc 0x0c00961e
[frame #0] sp-> 0x2006b970, pc-> 0x0c00961e, stack_size-> 16, ra-> 0x0c00964a
[frame #1] sp-> 0x2006b980, pc-> 0x0c00964a, stack_size-> 16, ra-> 0x0c009674
[frame #2] sp-> 0x2006b990, pc-> 0x0c009674, stack_size-> 16, ra-> 0x0c0096a2
[frame #3] sp-> 0x2006b9a0, pc-> 0x0c0096a2, stack_size-> 16, ra-> 0x0c004544
========== End of Stack Trace ==========
========== End of Crash Dump ==========

Classification and Troubleshooting of LOG Information:

  • Task Information: Task Control Block (TCB) information is only printed when an exception occurs during task execution; it will not be printed in bare-metal mode. The task name confirms that the above error (a debug breakpoint, RISC-V exception code 3) occurred in the shell task.

  • Exception Type and Reason: Users can determine the specific error type from the printed message. For example, the output Exception caught on 0x0c00961e with reason [0x3] -> [Breakpoint] indicates a debug exception triggered by the ebreak instruction.

  • General Registers: The general-purpose registers show the CPU state at the moment of the exception. The mepc register holds the address of the instruction that raised the exception, and the x1 (ra) register holds the most recently written return address. In this example ra equals mepc because the preceding jal to __wrap_printf set ra to the address of the following instruction, which is the c.ebreak. Looking up the PC value 0x0c00961e in the corresponding assembly file shows that the exception occurred in crash_SysBreak2().

0c009602 <crash_SysBreak2>:
c009602:     1141                    c.addi  sp,-16
c009604:     c606                    c.swsp  ra,12(sp)
c009606:     c422                    c.swsp  s0,8(sp)
c009608:     0800                    c.addi4spn      s0,sp,16
c00960a:     0c0157b7                lui     a5,0xc015
c00960e:     cd878593                addi    a1,a5,-808 # c014cd8 <__func__.2>
c009612:     0c0157b7                lui     a5,0xc015
c009616:     bbc78513                addi    a0,a5,-1092 # c014bbc <pmap_func+0x648>
c00961a:     ebefe0ef                jal     ra,c007cd8 <__wrap_printf>
c00961e:     9002                    c.ebreak  # Exception Location
c009620:     0001                    c.addi  zero,0
c009622:     40b2                    c.lwsp  ra,12(sp)
c009624:     4422                    c.lwsp  s0,8(sp)
c009626:     0141                    c.addi  sp,16
c009628:     8082                    c.jr    ra
  • Exception Stack: A dump of the stack contents at the time the exception occurred. It is not printed in this example; users can #undef CONFIG_DEBUG_BACK_TRACE to enable stack printing.

  • Stack Backtrace Information: The stack backtrace mechanism helps reconstruct the function call sequence. Users can use the PC values found from the log to locate the corresponding functions in the assembly files.
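The `[frame #N]` lines can also be parsed on the host to obtain the list of PC values and to check that the chain is self-consistent. The following is a minimal sketch; the function names are hypothetical, and the regular expression assumes the exact line format shown in the sample log.

```python
import re

FRAME_RE = re.compile(
    r"\[frame #(\d+)\] sp-> (0x[0-9a-fA-F]+), pc-> (0x[0-9a-fA-F]+), "
    r"stack_size-> (\d+), ra-> (0x[0-9a-fA-F]+)"
)

# Frame lines copied from the RISC-V sample log in this document.
SAMPLE = """[frame #0] sp-> 0x2006b970, pc-> 0x0c00961e, stack_size-> 16, ra-> 0x0c00964a
[frame #1] sp-> 0x2006b980, pc-> 0x0c00964a, stack_size-> 16, ra-> 0x0c009674
[frame #2] sp-> 0x2006b990, pc-> 0x0c009674, stack_size-> 16, ra-> 0x0c0096a2
[frame #3] sp-> 0x2006b9a0, pc-> 0x0c0096a2, stack_size-> 16, ra-> 0x0c004544
"""


def parse_frames(log_text):
    """Return one dict per '[frame #N]' line found in the log text."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        frames.append({
            "index": int(m.group(1)),
            "sp": int(m.group(2), 16),
            "pc": int(m.group(3), 16),
            "stack_size": int(m.group(4)),
            "ra": int(m.group(5), 16),
        })
    return frames


def chain_is_consistent(frames):
    """True if each frame follows from the previous one's stack_size and ra."""
    return all(
        nxt["sp"] == cur["sp"] + cur["stack_size"] and nxt["pc"] == cur["ra"]
        for cur, nxt in zip(frames, frames[1:])
    )


if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print([hex(f["pc"]) for f in frames], chain_is_consistent(frames))
```

In the sample log every frame's sp is the previous sp plus stack_size and every pc is the previous ra, so the consistency check passes; a failed check would hint at a corrupted stack or a truncated log.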

Common Exceptions and Faults

Common ARMv8-M Faults

Typical faults on ARMv8-M include Bus Fault, Usage Fault, and MemManage Fault. When the CPU implements the Security Extension, a Secure Fault may also occur.

Common ARMv8-M Faults

No.  Name             Brief Description
---  ---------------  ---------------------------------------------------------------------
3    HardFault        All non-maskable unexpected errors, not covered by other faults
4    MemManage Fault  Memory management error; code/data access rights violation
5    BusFault         Bus access error (e.g., invalid address)
6    UsageFault       Usage error (misalignment, illegal instruction, stack overflow, etc.)
7    SecureFault      Security error (illegal access to protected area)

Note

  • KM0 does not support bus fault, usage fault, and memory management fault. All faults are handled by Hard Fault.

  • If the enable bits for Bus Fault, Usage Fault, or MemManage Fault are not set in the System Handler Control and State Register (SHCSR), all such faults will be handled as Hard Faults.

  • If a new Fault occurs while handling a Usage Fault or MemManage Fault, it will escalate to a Hard Fault.

  • MemManage Fault usually occurs after the MPU is enabled when accessing protected address space.
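Whether the configurable fault handlers are enabled can be read from the SHCSR value in the crash log. The sketch below checks the four enable bits (MEMFAULTENA, BUSFAULTENA, USGFAULTENA, SECUREFAULTENA at bits 16 to 19 in the Armv8-M Mainline layout); the helper name is hypothetical, and SECUREFAULTENA only has meaning when the Security Extension is implemented.

```python
# Enable bits of the Armv8-M System Handler Control and State Register.
SHCSR_ENABLE_BITS = {
    "MemManage": 16,    # MEMFAULTENA
    "BusFault": 17,     # BUSFAULTENA
    "UsageFault": 18,   # USGFAULTENA
    "SecureFault": 19,  # SECUREFAULTENA (Security Extension only)
}


def enabled_fault_handlers(shcsr):
    """Hypothetical helper: map each configurable fault to its enable state."""
    return {name: bool(shcsr & (1 << bit))
            for name, bit in SHCSR_ENABLE_BITS.items()}


if __name__ == "__main__":
    # SHCSR value printed in the ARM-v8M sample log.
    for name, enabled in enabled_fault_handlers(0x000F0002).items():
        state = "enabled" if enabled else "escalates to HardFault"
        print(f"{name}: {state}")
```

For the SHCSR value 0x000f0002 from the sample log, all four enable bits are set, which is why the log reports a Bus Fault rather than a Hard Fault.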

Common RISC-V Exceptions

The exception mechanism in the RISC-V architecture is quite different from that of ARM. RISC-V identifies the error through the exception code in the mcause register, with the faulting instruction address in mepc and additional detail in mtval. Common exceptions include illegal memory access, stack overflows, illegal instructions, environment calls, and more.

Common RISC-V Exceptions and Errors

No. / Code  Name                            Brief Description
----------  ------------------------------  -----------------------------------------------------------------
0           Instruction address misaligned  Misaligned instruction address access
1           Instruction access fault        Instruction fetch error (e.g., invalid memory)
2           Illegal instruction             Illegal or unimplemented instruction
3           Breakpoint                      Breakpoint exception, typically for debugging
4           Load address misaligned         Misaligned memory access on load
5           Load access fault               Load (read) access to invalid or forbidden memory
6           Store/AMO address misaligned    Misaligned address access on store/AMO
7           Store/AMO access fault          Store/AMO access to invalid or forbidden memory
8           Environment call from U-mode    Environment call (ecall) from User mode
9           Environment call from S-mode    Environment call from Supervisor mode
11          Environment call from M-mode    Environment call from Machine mode
12          Instruction page fault          Instruction page fault (e.g., non-existent or no permission page)
13          Load page fault                 Data load page fault
15          Store/AMO page fault            Store/AMO page fault

Note

KR4 is a RISC-V IMACF architecture CPU, and only implements User and Machine modes; Supervisor mode is not implemented.
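The exception codes in the table can be looked up from a raw mcause value as sketched below. The helper name `decode_mcause` is hypothetical; the code assumes the standard mcause layout, in which the most significant bit marks an interrupt and the remaining bits hold the code, and it does not name interrupt causes or platform-specific codes.

```python
# Standard RISC-V synchronous exception codes, as listed in the table above.
EXCEPTION_NAMES = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
    8: "Environment call from U-mode",
    9: "Environment call from S-mode",
    11: "Environment call from M-mode",
    12: "Instruction page fault",
    13: "Load page fault",
    15: "Store/AMO page fault",
}


def decode_mcause(mcause, xlen=32):
    """Hypothetical helper: return (is_interrupt, code, name).

    `name` is None for interrupts and "Reserved or custom" for exception
    codes that are not in the standard table.
    """
    is_interrupt = bool((mcause >> (xlen - 1)) & 1)
    code = mcause & ((1 << (xlen - 1)) - 1)
    if is_interrupt:
        return True, code, None
    return False, code, EXCEPTION_NAMES.get(code, "Reserved or custom")


if __name__ == "__main__":
    # mcause value printed in the RISC-V sample log.
    print(decode_mcause(0x00000003))
```

For the mcause value 0x00000003 from the sample log, the result is a synchronous exception with code 3, Breakpoint, matching the "reason [0x3] -> [Breakpoint]" line.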

Similarities and Differences

Similarities and Differences Between RISC-V and ARMv8-M

Type                                RISC-V Exception Example           ARMv8-M Exception Example
----------------------------------  ---------------------------------  -------------------------------------------
Misaligned Address                  Instruction/Load/Store misaligned  UsageFault: Unaligned Access
Access Violation / Invalid Address  Load/Store Access Fault            MemManage/BusFault
Illegal / Undefined Instruction     Illegal instruction                UsageFault: Undefined Instruction
System Call / Software Interrupt    Environment call (ecall)           SVCall (SVC)
Debug Exception                     Breakpoint                         Debug Monitor / Breakpoint
Page Table Fault                    Series of Page Faults              N/A (Cortex-M lacks MMU, only supports MPU)
Stack Overflow                      Fault caused by access violation   UsageFault: Stack overflow