Overview

Recording the exact state of an embedded program at the moment a fault occurs is essential for locating and analyzing the problem efficiently. The Ameba SoC provides two dedicated error-handling mechanisms, Crash Dump and Backtrace:

  • Crash Dump: Automatically saves the general registers, system registers, and stack information at the moment an exception occurs, enabling the restoration of the error context.

  • Backtrace: Analyzes and reconstructs the function call path that led to the exception, based on the crash location.

Hint

Crash dump relies on the CPU’s built-in exception detection, so it cannot catch anomalies that do not raise a CPU exception, such as:

  • Program deadlocks or infinite loops: Commonly caused by logic errors, accesses to non-existent address spaces, or accesses to peripherals whose clock is not enabled.

  • Resource exhaustion: A typical example is heap exhaustion that causes malloc to fail.

Exception Handling Characteristics for Different Instruction Set Architectures

| Feature | ARMv8-M | RISC-V |
| --- | --- | --- |
| Backtrace Start PC | Pushed onto the stack by hardware, needs to be read from the stack | Stored in the mepc register |
| Register Saving | Hardware automatically saves some core registers (8 registers) | Hardware does not save general registers, must be saved by software in the Trap Handler |
| Frame Pointer (FP) | Usually R7 | Usually x8 (s0/fp) |
| Return Address (RA) | LR (Link Register) | ra (Return Address, x1) |
| Core Dependency | Relies on hardware stack push mechanism and Fault_Handler | Relies on Trap Handler to save all registers correctly |
| Automation Tools | addr2line, GDB, etc., for ELF/memory dump analysis | addr2line, GDB, etc., also apply, following the same principle |

Caution

To improve runtime performance, the SDK is built with GCC optimization level -O2 or -Os, both of which enable the -fomit-frame-pointer option by default. With this option enabled, the fp register is not maintained as a frame pointer, so stack backtrace via the fp register is not possible.

ARM Cortex-M

Crash Dump

ARM Cortex-M processors (such as ARMv8-M) rely on hardware exception detection and automatic context saving mechanisms. The main principles are as follows:

  • Exception Context Capture: When an exception occurs, the hardware automatically pushes registers R0~R3, R12, LR, PC, and xPSR (a total of 8 registers) onto the memory area pointed to by the current stack pointer (MSP or PSP). If the FPU is enabled and the exception occurs during a floating-point context, the hardware also saves floating-point registers automatically. See ARM-v8M Exception Stack.

  • Special Register Collection: Additional information such as exception type and fault address can be obtained by reading CFSR, HFSR, MMFAR, BFAR, and other registers in the SCB (System Control Block).

  • Stack Snapshot: According to the MSP/PSP at the exception trigger point, a snapshot of the stack is exported to provide data for subsequent function backtrace and call chain analysis.

  • Dump implementation: During the Fault handling process, the above on-site information is exported in a structured manner (e.g., via UART, log, or flash) for offline analysis.

../../_images/crash_arm_v8m_exception_stack.svg

ARM-v8M Exception Stack
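
The hardware-pushed frame described above can be modelled in C. The sketch below is illustrative only and is not the SDK fault handler: the names exc_frame_t, frame_is_on_psp, read_exc_frame, and sp_before_exception are hypothetical, and it assumes the basic 8-word frame without floating-point state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Basic (non-FPU) frame pushed by hardware on exception entry,
 * listed from the lowest address upward. */
typedef struct {
    uint32_t r0, r1, r2, r3;
    uint32_t r12;
    uint32_t lr;    /* LR value at the time of the exception */
    uint32_t pc;    /* address at which the exception was taken */
    uint32_t xpsr;
} exc_frame_t;

/* EXC_RETURN bit 2 (SPSEL): 1 = frame was pushed on PSP, 0 = on MSP. */
static bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0;
}

/* Copy the eight stacked words that start at the selected stack pointer. */
static exc_frame_t read_exc_frame(const uint32_t *sp)
{
    exc_frame_t f = {
        .r0 = sp[0], .r1 = sp[1], .r2 = sp[2], .r3 = sp[3],
        .r12 = sp[4], .lr = sp[5], .pc = sp[6], .xpsr = sp[7],
    };
    return f;
}

/* Stack pointer value just before the exception (basic frame only). */
static uint32_t sp_before_exception(uint32_t frame_addr, uint32_t xpsr)
{
    uint32_t sp = frame_addr + 8u * 4u;   /* skip the eight stacked words */
    if (xpsr & (1u << 9)) {
        sp += 4u;                         /* bit 9: hardware added a 4-byte alignment pad */
    }
    return sp;
}
```

With the values from the sample log in this section (EXC_RETURN = 0xfffffffd, PSP = 0x2000e630, stacked xPSR = 0x21000000), bit 2 of EXC_RETURN is set, so the frame lies on PSP, and skipping the 32-byte frame gives 0x2000e650, the sp at which the log starts backtracing.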

Backtrace

  • Principle: Backtrace on ARM Cortex-M (such as ARMv8-M) relies on hardware auto-stacking at exception entry (R0~R3, R12, LR, PC, xPSR) and on the software stack frames that the compiler generates for function calls. A function that calls other functions usually pushes its return address (LR, Link Register) onto the stack in its prologue, and the compiler may also maintain a frame pointer (R7/R11).

  • Exception Context Capture: During an exception, the CPU automatically pushes the current PC and some registers onto the stack, allowing the exception handler to obtain the exception PC and the current stack pointer (MSP/PSP).

  • Backtrace Procedure:

    1. Start from the PC at the time of the exception, along with the stack top address.

    2. According to the function’s stack frame structure and the LR saved on the stack, step by step locate the address of the previous caller.

    3. If the frame pointer is not optimized out, it can be used to find the previous stack frame; otherwise, disassembly and symbol information are needed.

    4. Repeat this process until reaching the stack bottom or the specified backtrace depth.

  • Tools and Dependencies: Symbol tables and disassembled code from ELF/AXF/BIN files are required for validation and location. If the compiler has enabled -fomit-frame-pointer optimization, backtrace becomes more difficult.

The figure ARM Cortex-M Function Call Stack shows the structure of the ARM Cortex-M function stack frame:

../../_images/crash_arm_stack_frame.svg

ARM Cortex-M Function Call Stack

Analysis and Troubleshooting

When an ARMv8-M CPU exception occurs, the system outputs a detailed crash log. The following log illustrates an exception caused by accessing a non-existent address:

Accessing a Non-existent Address
========== Crash Dump ==========
------------Task Info------------
Fault on task <shell_task>
Task ID: 1
Task TCB:0x2000e720
Current State: 0 (Running)
Base Priority: 5
Current Priority: 5
Run Time Counter: 2
StackTop: 0x2000e630, StackBase: 0x2000d620, StackEnd: 0x2000e6e0, StackSize=1073(word)
Stack High WaterMark: 1001(word)
------------Task Info------------
Exception caught on 0e00ed0e
BFSR: [0x00000004] -> Bus fault is caused by imprecise data access violation
========== Register Dump ==========
[  LR] 0x0e00ed07
[  PC] 0x0e00ed0e
[xPSR] 0x21000000
[EXCR] 0xfffffffd
[ R0] 0x0000000b
[ R1] 0x0e01becb
[ R2] 0x00000041
[ R3] 0xdeadbeef
[ R4] 0x00000000
[ R5] 0x2ffffffc
[ R6] 0x00000001
[ R7] 0x30000000
[ R8] 0x0e01becb
[ R9] 0x0e01c202
[ R10] 0x0e01c204
[ R11] 0x0e01d920
[ R12] 0x00045418
==========KM4 Stack Dump ==========
Current Stack Pointer = 0x2000e630, and dump stack depth = 128
[0x2000e630] 0000000b 0e01becb 00000041 deadbeef
[0x2000e640] 00045418 0e00ed07 0e00ed0e 21000000
[0x2000e650] 2ffffffc 00000001 00000001 e000ed00
[0x2000e660] 20004080 00000000 00011721 20004161
[0x2000e670] 00011671 0e00196b 00000002 0e01d930
[0x2000e680] 2000407c 00000001 00000006 0e00bc77
[0x2000e690] 00000041 200040bc 20004161 00010219
[0x2000e6a0] 2000c3f0 20004160 00000000 2000c470
[0x2000e6b0] 11111111 0e00be41 00000000 01010101
[0x2000e6c0] 04040404 05050505 06060606 07070707
[0x2000e6d0] 08080808 09090909 10101010 0e01040b
[0x2000e6e0] a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5
[0x2000e6f0] a5a5a5a5 a5a5a5a5 a5a5a5a5 a5a5a5a5
[0x2000e700] 00000000 800001a0 25447cfa d2e11067
[0x2000e710] f5ffef97 38213187 53b359bc c8cadae8
[0x2000e720] 2000e630 79b46159 2000c570 2000c570
[0x2000e730] 2000e720 2000c568 00000006 2000c638
[0x2000e740] 2000c638 2000e720 00000000 00000005
[0x2000e750] 2000d620 6c656873 61745f6c 38006b73
[0x2000e760] 14c74143 3fc569e7 00ad1c45 2000e6e0
[0x2000e770] 00000001 09d35ecf 00000005 00000000
[0x2000e780] 00000000 00000000 00000000 00000002
[0x2000e790] 00000000 2000d00c 2000d074 2000d0dc
[0x2000e7a0] 00000000 00000000 00000000 00000000
[0x2000e7b0] 00000000 00000000 00000000 00000000
[0x2000e7c0] 00000000 00000000 00000000 00000000
[0x2000e7d0] 00000000 00000000 00000000 00000000
[0x2000e7e0] 00000000 00000000 00000000 00000000
[0x2000e7f0] 00000000 00000000 00000000 00000000
[0x2000e800] 00000000 00000000 00000000 00000000
[0x2000e810] 00000000 00000000 00000000 00000000
[0x2000e820] 00000000 00000000 00000000 00000000
========== Stack Trace ==========
Start stack backtracing for sp 0x2000e650, pc 0x0e00ed0e, lr 0x0e00ed07
/opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
========== End of Stack Trace ==========
========== End of Crash Dump ==========

[FAULT-A] SHCSR = 0x000f0002
[FAULT-A] AIRCR = 0xfa054000
[FAULT-A] CONTROL = 0x00000000

Bus Fault:
Secure State: 1

Stacked:
R0 = 0x0000000b
R1 = 0x0e01becb
R2 = 0x00000041
R3 = 0xdeadbeef
R12 = 0x00045418
LR = 0x0e00ed07
PC = 0x0e00ed0e
PSR = 0x21000000

Current:
EXC_RETURN = 0xfffffffd
MSP = 0x20003fe0
PSP = 0x2000e630
xPSR = 0xa0000005
CFSR  = 0x00000400
HFSR  = 0x00000000
DFSR  = 0x00000000
MMFAR = 0x00000000
BFAR  = 0x00000000
AFSR  = 0x00000000
PriMask = 0x00000000
SVC priority: 0x00
PendSVC priority: 0xe0
Systick priority: 0xe0
MSP_NS = 0x20004000
PSP_NS = 0x00000000
CFSR_NS  = 0x00000000
HFSR_NS  = 0x00000000
DFSR_NS  = 0x00000000
MMFAR_NS = 0x00000000
BFAR_NS  = 0x00000000
AFSR_NS  = 0x00000000
SVC priority NS: 0x00
PendSVC priority NS: 0x00
Systick priority NS: 0x00

Classification and Troubleshooting of LOG Information:

  • Task Information: Task Control Block (TCB) information is only printed when an exception occurs during task execution; it will not be printed in bare-metal mode. The task name helps identify the specific task where the error occurred, and by checking the stack usage (e.g., High WaterMark), one can determine if a stack overflow happened. In the above example, the error occurred in the shell task.

  • Exception Type and Reason: Users can determine the specific error type from the printed message. For example, the printed result BFSR: [0x00000004] -> Bus fault is caused by imprecise data access violation indicates a bus fault raised by an imprecise data access.

  • General Registers: The register dump shows the CPU state at the moment of the exception. The PC register holds the address of the instruction at which the exception was taken, and the LR register holds the return address into the calling function. By locating the PC value 0x0e00ed0e in the corresponding assembly file, one can find that the exception occurred in rtk_log_memory_dump_word(). Because this bus fault is imprecise, the stacked PC may lie a few instructions after the memory access that actually caused the fault.

0e00ecc8 <rtk_log_memory_dump_word>:
e00ecc8:     e92d 47f3       stmdb   sp!, {r0, r1, r4, r5, r6, r7, r8, r9, sl, lr}
...
e00ed0e:     2001            movs    r0, #1
...
  • Exception Stack: A dump of the stack contents at the moment the exception occurred. The first eight words are the registers pushed by hardware on exception entry: R0~R3, R12, LR, PC, and xPSR.

  • Stack Backtrace Information: The stack backtrace mechanism reconstructs the function call sequence. Users can copy the printed addr2line command into the SDK build environment and run it to obtain the names of the functions on the call path, together with their source locations.

>>> $ /opt/rtk-toolchain/asdk-10.3.1-4354/linux/newlib/bin/arm-none-eabi-addr2line -e /home/user_name/sdk/amebalite_gcc_project/project_km4/asdk/image/target_img2.axf -afpiC 0x0e00ed0e 0x0e001966 0x0e00bc74 0x0e00be3c
0x0e00ed0e: rtk_log_memory_dump_word at /home/user_name/sdk/component/soc/amebalite/swlib/log.c:152 (discriminator 3)
0x0e001966: cmd_dump_word at /home/user_name/sdk/component/at_cmd/monitor.c:136
0x0e00bc74: shell_cmd_exec_ram at /home/user_name/sdk/component/soc/amebalite/app/monitor/ram/shell_ram.c:68
0x0e00be3c: shell_task_ram at /home/user_name/sdk/component/soc/amebalite/app/monitor/ram/shell_ram.c:236
  • System Registers: System registers record auxiliary information such as the stack pointers, fault status registers, and exception priority settings.
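
As a rough illustration of how the fault status values in the log can be read, the sketch below splits a CFSR value into its three architectural sub-fields (MMFSR in bits 7:0, BFSR in bits 15:8, UFSR in bits 31:16) and names the most common bus fault bits. The helper names are hypothetical and are not SDK APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* CFSR sub-fields as laid out in the ARMv8-M System Control Block. */
static uint8_t  cfsr_mmfsr(uint32_t cfsr) { return (uint8_t)(cfsr & 0xFFu); }
static uint8_t  cfsr_bfsr(uint32_t cfsr)  { return (uint8_t)((cfsr >> 8) & 0xFFu); }
static uint16_t cfsr_ufsr(uint32_t cfsr)  { return (uint16_t)(cfsr >> 16); }

/* Selected BFSR bits (positions relative to the BFSR byte). */
#define BFSR_IBUSERR     (1u << 0)  /* bus error on instruction fetch */
#define BFSR_PRECISERR   (1u << 1)  /* precise data access error */
#define BFSR_IMPRECISERR (1u << 2)  /* imprecise data access error */
#define BFSR_BFARVALID   (1u << 7)  /* BFAR holds the faulting address */

/* BFAR is only meaningful when BFARVALID is set. */
static bool bfar_is_valid(uint8_t bfsr)
{
    return (bfsr & BFSR_BFARVALID) != 0;
}

/* Short description of the first matching bus fault cause. */
static const char *bfsr_reason(uint8_t bfsr)
{
    if (bfsr & BFSR_PRECISERR)   return "precise data access error";
    if (bfsr & BFSR_IMPRECISERR) return "imprecise data access error";
    if (bfsr & BFSR_IBUSERR)     return "instruction fetch bus error";
    return "other or no bus fault";
}
```

For the log above, CFSR = 0x00000400 yields BFSR = 0x04, the imprecise data access error bit, which matches the printed BFSR line. BFARVALID is clear, which is consistent with BFAR reading 0x00000000.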

Caution

The KM4 CPU in Ameba SoC supports stack backtrace functionality, while KM0 does not support it.

Common Faults

Typical faults on ARMv8-M include Bus Fault, Usage Fault, and MemManage Fault. When the CPU implements the Security Extension, a Secure Fault may also occur.

Common ARMv8-M Faults

| No. | Name | Brief Description |
| --- | --- | --- |
| 3 | HardFault | All non-maskable unexpected errors, not covered by other faults |
| 4 | MemManage Fault | Memory management error; code/data access rights violation |
| 5 | BusFault | Bus access error (e.g., invalid address) |
| 6 | UsageFault | Usage error (misalignment, illegal instruction, stack overflow, etc.) |
| 7 | SecureFault | Security error (illegal access to protected area) |

Note

  • KM0 does not support Bus Fault, Usage Fault, or MemManage Fault. All faults are handled as Hard Faults.

  • If the enable bits for Bus Fault, Usage Fault, or MemManage Fault are not set in the System Handler Control and State Register (SHCSR), all such faults will be handled as Hard Faults.

  • If a new Fault occurs while handling a Usage Fault or MemManage Fault, it will escalate to a Hard Fault.

  • MemManage Fault usually occurs after the MPU is enabled when accessing protected address space.
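
The SHCSR enable bits mentioned in the note can be illustrated with a small host-side sketch. The bit positions are the architectural ones for ARMv8-M; the function names are hypothetical, and on a real target the value would be read from and written to the SHCSR register (SCB->SHCSR in CMSIS) rather than passed around as a plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fault enable bits in the System Handler Control and State Register. */
#define SHCSR_MEMFAULTENA    (1u << 16)
#define SHCSR_BUSFAULTENA    (1u << 17)
#define SHCSR_USGFAULTENA    (1u << 18)
#define SHCSR_SECUREFAULTENA (1u << 19)

/* Return the SHCSR value with the requested fault handlers enabled. */
static uint32_t shcsr_enable_faults(uint32_t shcsr, uint32_t enable_mask)
{
    return shcsr | enable_mask;
}

/* A configurable fault whose enable bit is clear is handled as a Hard Fault. */
static bool fault_escalates_to_hardfault(uint32_t shcsr, uint32_t enable_bit)
{
    return (shcsr & enable_bit) == 0;
}
```

In the sample crash log, SHCSR = 0x000f0002 has all four enable bits set, so the bus fault there is reported by its own handler rather than being escalated.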

BusFault

The most common BusFault is an Imprecise Data Access Violation. This typically occurs when the processor attempts to access a memory region that does not exist or is not mapped. Common scenarios include:

  • The SRAM block is mapped only between 0x2000_0000 and 0x2008_0000 (512KB total), but an address outside this range (e.g., 0x2FFF_FFFF) is accessed.

  • An invalid address is fetched from memory (for example, 0xA54E_D25A, which is not defined in the address map), resulting in a BusFault upon access. This can happen due to:

    • The memory device is not powered up (e.g., code runs from SRAM but instructions are fetched from FLASH which is unpowered)

    • Memory clock settings are incorrect (e.g., memory expects a 20MHz clock but is provided only 10MHz)

    • The memory contains bad blocks or is otherwise defective

  • A corrupted return address stored on the stack, causing the processor to branch to an invalid or unmapped address space.
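
When an address comes from untrusted or possibly corrupted data, a range check before the access can turn a BusFault into an ordinary error path. The sketch below is a minimal example that hard-codes the 512KB SRAM window quoted above; the macro and function names are hypothetical, and a real project would take the bounds from its own memory map.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRAM_BASE 0x20000000u
#define SRAM_SIZE 0x00080000u   /* 512KB, so valid addresses end at 0x2007FFFF */

/* True if the len-byte range starting at addr lies entirely inside SRAM.
 * Written so that addr + len cannot overflow. */
static bool range_in_sram(uint32_t addr, uint32_t len)
{
    if (addr < SRAM_BASE) {
        return false;
    }
    uint32_t offset = addr - SRAM_BASE;
    return offset < SRAM_SIZE && len <= SRAM_SIZE - offset;
}
```

For example, range_in_sram(0x2FFFFFFF, 1) is false, so the access from the first scenario would be rejected before it reaches the bus.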

UsageFault

Common causes of UsageFault are:

  • Undefined instruction: The processor fetches an opcode that is not valid in the instruction set. Typical reasons are:

    • Memory device is not powered (e.g., code execution in SRAM, next instruction needs to be read from FLASH, but FLASH is unpowered)

    • Memory clock configuration is mismatched (for example, 20MHz expected but 10MHz supplied)

    • Flash or memory device has failed cells (bad blocks)

__ASM volatile(".hword 0xde00\n");
  • Stack overflow: Occurs when large local variables or deeply nested function calls exhaust the allocated stack. For example, a large buffer declared inside a task function can exceed the stack space of the task and trigger a UsageFault. Declaring such buffers as global variables avoids this.

void crash_U_StackOverflowTask(void *param)
{
    (void)param;
    uint8_t CPU_RunInfo[512]; // Local variable
    while (1) {
        memset(CPU_RunInfo, 0, 512);
        rtos_time_delay_ms(200); // simulate workload
    }
}

void crash_U_StackOverflow(void)
{
    if (rtos_task_create(NULL, "example_player_thread", crash_U_StackOverflowTask, NULL, 32, 1) != RTK_SUCCESS) {
        DiagPrintf("StackOverflowTest task creation failed!\r\n");
    }
}
  • Invalid state: Occurs when the PC is loaded with an address whose LSB (bit 0) is 0, causing a transition from Thumb state to ARM state, which is not supported on Cortex-M cores.

__ASM volatile("MOV     R0,   0x20000000        \n\t"
               "BX      R0                      \n\t"); // From Thumb state to ARM state (invalid)
  • Unaligned access: If the UNALIGN_TRP bit in SCB->CCR (0xE000ED14[3]) is set, any unaligned memory access (such as a word access at an address not divisible by 4) will generate a UsageFault.

volatile int *p;
volatile int value;
/* Set UNALIGN_TRP (bit 3 of SCB->CCR) so that unaligned accesses trap */
volatile int *SCB_CCR = (volatile int *) 0xE000ED14;
*SCB_CCR |= (1 << 3);

p = (int *) 0x20000001;   /* unaligned word address: this read raises a UsageFault */
value = *p;
printf("addr:0x%02X value:0x%08X\r\n", (int)p, value);
p = (int *) 0x20000004;   /* 4-byte aligned: no fault */
value = *p;
printf("addr:0x%02X value:0x%08X\r\n", (int)p, value);
p = (int *) 0x20000003;   /* unaligned word address: raises a UsageFault */
value = *p;
printf("addr:0x%02X value:0x%08X\r\n", (int)p, value);

Note

KM4 supports unaligned access by default; KM0 does not support unaligned access.

MemManage Fault

A MemManage fault occurs when the Memory Protection Unit (MPU) is enabled and an access violates the configured permissions. The most common case is a Data Access Violation. For instance, if the memory region [0x20050020, 0x20050200] is set as read-only in the MPU, a write access to it raises a MemManage fault.

Secure Fault

Secure faults typically occur on cores supporting the Security Extension (e.g., ARM TrustZone-M), such as KM4, when TrustZone is enabled. The most common secure fault is an Attribution Unit Violation. Examples include:

  • A non-secure access to a memory region that is protected as secure by the TrustZone Attribution Unit. For example, if region [0x0, 0x20000000] is set as Secure, any access from non-secure code will trigger an Attribution Unit Violation.

Caution

Null pointer: When TrustZone is enabled on KM4, address 0x00000000 is marked as Secure. If a function dereferences a NULL pointer (i.e., points to 0x00000000) while executing in non-secure context, an Attribution Unit Violation will occur.

RISC-V

Crash Dump

In the RISC-V architecture, exception context preservation mainly relies on the Trap Handler actively saving state; the hardware does not automatically save general-purpose registers:

  • Exception Context Capture: When a trap occurs, hardware only updates a few control and status registers: the address of the instruction that caused the exception is written to mepc, and the cause and related state are recorded in mcause, mtval, and mstatus. General-purpose registers (x1–x31; x0 is always zero and can be omitted) must be saved by the Trap Handler itself. See RISCV-IMACF Exception Stack.

  • Special Register Collection: Error types and context can be parsed through mcause (exception type), mepc (exception instruction address), mtval (related value), etc.

  • Stack Snapshot: The Trap Handler should export the memory pointed to by the current stack pointer, which holds the call context leading up to the exception.

  • Dump Implementation: In the SDK, the Trap Handler has implemented the saving of x1–x31 and mstatus, and outputs them in a structured way for analysis.

../../_images/crash_riscv_exception_stack.svg

RISCV-IMACF Exception Stack

Backtrace

  • Principle: During function calls in RISC-V, the return address is stored in the ra (x1) register, and typically, both ra and necessary general-purpose registers are pushed onto the stack according to compiler output. Since there is no hardware auto-stacking, backtracing relies heavily on compiler-generated prologue/epilogue code, and is generally performed by analyzing the stack’s ra and possible frame pointer (x8/fp).

  • Exception Context Capture: On exception, the CPU does not automatically save general-purpose registers; the Trap Handler must manually save the PC (mepc) and general-purpose registers, and record stack content at the exception.

  • Backtrace Steps:

    1. Start from the exception PC (saved in mepc), with the current sp (stack pointer).

    2. According to the RISC-V ABI and compiler conventions, locate the ra (x1) on the stack to infer the previous caller.

    3. If frame pointer (x8/fp) is enabled, it can be used to locate the start of the previous stack frame.

    4. Repeat until the stack bottom or the maximum backtrace depth is reached.

  • Tools and Dependencies: Symbol tables and disassembly from ELF/AXF/BIN files should be used in conjunction with tools such as addr2line, GDB, etc., to locate the call chain. Enabling -fomit-frame-pointer will significantly reduce backtrace accuracy.

The figure RISC-V Function Call Stack shows the structure of the RISC-V function stack frame:

../../_images/crash_riscv_stack_frame.svg

RISC-V Function Call Stack
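
The backtrace steps above can be sketched as a frame-pointer walk over a captured stack image. This is a minimal host-side illustration, not the SDK unwinder. It assumes RV32 code built with frame pointers, where each function stores ra at fp - 4 and the fp of its caller at fp - 8 (the layout produced by the crash_SysBreak2 prologue in the disassembly in this document). The names stack_image_t, read_word, and backtrace_fp are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Captured stack memory: words[] holds the contents starting at address base. */
typedef struct {
    uint32_t base;
    const uint32_t *words;
    size_t count;
} stack_image_t;

/* Read one aligned word; returns 0 on success, -1 if addr is outside the image. */
static int read_word(const stack_image_t *img, uint32_t addr, uint32_t *out)
{
    if (addr < img->base || (addr & 3u) != 0) {
        return -1;
    }
    size_t idx = (addr - img->base) / 4u;
    if (idx >= img->count) {
        return -1;
    }
    *out = img->words[idx];
    return 0;
}

/* Walk the fp chain starting from the exception PC and fp.
 * Writes one PC per frame into pcs[] and returns the number of frames found. */
static size_t backtrace_fp(const stack_image_t *img, uint32_t pc, uint32_t fp,
                           uint32_t *pcs, size_t max_depth)
{
    size_t n = 0;
    while (n < max_depth) {
        uint32_t ra, prev_fp;
        pcs[n++] = pc;
        if (read_word(img, fp - 4u, &ra) != 0 ||
            read_word(img, fp - 8u, &prev_fp) != 0) {
            break;                  /* frame record lies outside the captured stack */
        }
        if (prev_fp <= fp) {
            break;                  /* caller frames must sit at higher addresses */
        }
        pc = ra;
        fp = prev_fp;
    }
    return n;
}
```

The walk stops on the first frame record that is unreadable or that does not move toward the stack base, which also bounds the damage when the stack has been corrupted. The max_depth argument corresponds to the maximum backtrace depth in step 4.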

Caution

In the following scenarios, backtrace may fail:

  • If a crash occurs during interrupt processing, due to the lack of a stack call relationship, only the register values at the time of the exception can be printed, and stack backtrace is not possible. In other words, interrupt functions should avoid deep call hierarchies.

  • When Secure features are enabled, certain address regions are protected, which may also cause backtrace failures.

Analysis and Troubleshooting

When a RISC-V CPU exception occurs, the system outputs a detailed crash log via LOGUART. The following log shows an exception caused by hitting a debug breakpoint:

Hitting Debug Breakpoint
------------------------------------
Have a test on crash_dump and back trace
crash_SysBreak()
crash_SysBreak1()
crash_SysBreak2()==> Issue BREAK instruction
========== Crash Dump ==========
------------Task Info------------
Fault on task <shell_task>
Task ID: 1
Task TCB:0x2006ba40
Current State: 0 (Running)
Base Priority: 5
Current Priority: 5
Run Time Counter: 2
StackTop: 0x2006b5b8, StackBase: 0x2006b480, StackEnd: 0x2006ba00, StackSize=353(word)
Stack High WaterMark: 78(word)
------------Task Info------------
Exception caught on 0x0c00961e with reason [0x3] -> [Breakpoint]
========== Register Dump ==========
[mscratch] 0x00000000
[mepc]     0x0c00961e
[mcause]   0x00000003
[mtval]    0x00000000
[x0 -> zero] 0x00000000
[x1 -> ra] 0x0c00961e
[x2 -> sp] 0x2006b970
[x3 -> gp] 0x20068f24
[x4 -> tp] 0xffffffff
[x5 -> t0] 0xa5a5a5a5
[x6 -> t1] 0x000bee33
[x7 -> t2] 0xa5a5a5a5
[x8 -> s0/fp] 0x2006b980
[x9 -> s1] 0x00000002
[x10 -> a0] 0x0000002e
[x11 -> a1] 0x00000000
[x12 -> a2] 0x00000000
[x13 -> a3] 0x2006ba40
[x14 -> a4] 0x00000000
[x15 -> a5] 0x00000000
[x16 -> a6] 0x00200000
[x17 -> a7] 0x41014000
[x18 -> s2] 0x20000b04
[x19 -> s3] 0x00000005
[x20 -> s4] 0x00000006
[x21 -> s5] 0x20000be9
[x22 -> s6] 0x0c016374
[x23 -> s7] 0x0c013000
[x24 -> s8] 0xa5a5a5a5
[x25 -> s9] 0xa5a5a5a5
[x26 -> s10] 0xa5a5a5a5
[x27 -> s11] 0xa5a5a5a5
[x28 -> t3] 0x0000000f
[x29 -> t4] 0xa5a5a5a5
[x30 -> t5] 0xa5a5a5a5
[x31 -> t6] 0xa5a5a5a5
========== Stack Trace ==========
Start stack backtracing for sp 0x2006b970, pc 0x0c00961e
[frame #0] sp-> 0x2006b970, pc-> 0x0c00961e, stack_size-> 16, ra-> 0x0c00964a
[frame #1] sp-> 0x2006b980, pc-> 0x0c00964a, stack_size-> 16, ra-> 0x0c009674
[frame #2] sp-> 0x2006b990, pc-> 0x0c009674, stack_size-> 16, ra-> 0x0c0096a2
[frame #3] sp-> 0x2006b9a0, pc-> 0x0c0096a2, stack_size-> 16, ra-> 0x0c004544
========== End of Stack Trace ==========
========== End of Crash Dump ==========

Classification and Troubleshooting of LOG Information:

  • Task Information: Task Control Block information is only printed when an exception occurs during task execution; it is not printed in bare-metal mode. The task name confirms that the above error (a debug breakpoint, RISC-V exception code 3) occurred in the shell task.

  • Exception Type and Reason: Users can determine the specific error type from the printed message. For example, the output Exception caught on 0x0c00961e with reason [0x3] -> [Breakpoint] indicates a debug exception triggered by the ebreak instruction.

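  • General Registers: The register dump shows the CPU state at the moment of the exception. The mepc register holds the address of the instruction where the exception occurred, and the x1 (ra) register holds the most recently written return address. In this log ra equals mepc because the preceding jal to __wrap_printf set ra to the address of the next instruction; the return address into the caller of crash_SysBreak2() was saved on the stack by the function prologue. By locating the PC value 0x0c00961e in the corresponding assembly file, it can be seen that the exception occurred in crash_SysBreak2().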

0c009602 <crash_SysBreak2>:
c009602:     1141                    c.addi  sp,-16
c009604:     c606                    c.swsp  ra,12(sp)
c009606:     c422                    c.swsp  s0,8(sp)
c009608:     0800                    c.addi4spn      s0,sp,16
c00960a:     0c0157b7                lui     a5,0xc015
c00960e:     cd878593                addi    a1,a5,-808 # c014cd8 <__func__.2>
c009612:     0c0157b7                lui     a5,0xc015
c009616:     bbc78513                addi    a0,a5,-1092 # c014bbc <pmap_func+0x648>
c00961a:     ebefe0ef                jal     ra,c007cd8 <__wrap_printf>
c00961e:     9002                    c.ebreak  # Exception Location
c009620:     0001                    c.addi  zero,0
c009622:     40b2                    c.lwsp  ra,12(sp)
c009624:     4422                    c.lwsp  s0,8(sp)
c009626:     0141                    c.addi  sp,16
c009628:     8082                    c.jr    ra
  • Exception Stack: Records the stack contents at the time the exception occurred. It is not printed in this example; to enable stack printing, change CONFIG_DEBUG_BACK_TRACE to #undef.

  • Stack Backtrace Information: The stack backtrace mechanism reconstructs the function call sequence. Each [frame #N] line in the log gives the stack pointer, PC, frame size, and return address of one call level; users can look up these PC values in the assembly files to identify the corresponding functions.

Common Exceptions

The exception mechanism in the RISC-V architecture differs considerably from that of ARM. RISC-V reports each exception through an exception code in the mcause register. Common exceptions include illegal memory accesses, stack overflows (reported as access faults), illegal instructions, and environment calls.

Common RISC-V Exceptions and Errors

| No. / Code | Name | Brief Description |
| --- | --- | --- |
| 0 | Instruction address misaligned | Misaligned instruction address access |
| 1 | Instruction access fault | Instruction fetch error (e.g., invalid memory) |
| 2 | Illegal instruction | Illegal or unimplemented instruction |
| 3 | Breakpoint | Breakpoint exception, typically for debugging |
| 4 | Load address misaligned | Misaligned memory access on load |
| 5 | Load access fault | Load (read) access to invalid or forbidden memory |
| 6 | Store/AMO address misaligned | Misaligned address access on store/AMO |
| 7 | Store/AMO access fault | Store/AMO access to invalid or forbidden memory |
| 8 | Environment call from U-mode | Environment call (ecall) from User mode |
| 9 | Environment call from S-mode | Environment call from Supervisor mode |
| 11 | Environment call from M-mode | Environment call from Machine mode |
| 12 | Instruction page fault | Instruction page fault (e.g., non-existent or no permission page) |
| 13 | Load page fault | Data load page fault |
| 15 | Store/AMO page fault | Store/AMO page fault |

Note

KR4 is a RISC-V IMACF architecture CPU, and only implements User and Machine modes; Supervisor mode is not implemented.
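
The table above maps directly onto a lookup keyed by the mcause value. The sketch below assumes a 32-bit mcause (RV32), where bit 31 distinguishes interrupts from exceptions and the remaining bits hold the code. The function names are hypothetical and are not part of the SDK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 31 of a 32-bit mcause is set for interrupts and clear for exceptions. */
static bool mcause_is_interrupt(uint32_t mcause)
{
    return (mcause >> 31) != 0;
}

/* Exception (or interrupt) code without the interrupt flag. */
static uint32_t mcause_code(uint32_t mcause)
{
    return mcause & 0x7FFFFFFFu;
}

/* Name of a standard synchronous exception code; codes missing from the
 * table (10, 14, and anything above 15) are reported as reserved. */
static const char *riscv_exception_name(uint32_t code)
{
    static const char *const names[16] = {
        [0]  = "Instruction address misaligned",
        [1]  = "Instruction access fault",
        [2]  = "Illegal instruction",
        [3]  = "Breakpoint",
        [4]  = "Load address misaligned",
        [5]  = "Load access fault",
        [6]  = "Store/AMO address misaligned",
        [7]  = "Store/AMO access fault",
        [8]  = "Environment call from U-mode",
        [9]  = "Environment call from S-mode",
        [11] = "Environment call from M-mode",
        [12] = "Instruction page fault",
        [13] = "Load page fault",
        [15] = "Store/AMO page fault",
    };
    if (code < 16u && names[code] != NULL) {
        return names[code];
    }
    return "Reserved or implementation-defined";
}
```

For the sample log, mcause = 0x00000003 is an exception with code 3, which this lookup names "Breakpoint", the same reason the crash dump prints.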

Instruction Address Misaligned

An instruction address misaligned exception occurs when the PC has bit 0 set to 1, that is, when the processor attempts to fetch an instruction from an address that is not 2-byte aligned. The following example attempts to jump to the address crash_unaligned1 + 1:

__ASM volatile(
    "crash_unaligned0:                      \n\t"
    "la     t1, crash_unaligned1            \n\t"
    "addi   t1, t1, 1                       \n\t" // Set PC unaligned
    "jalr   t1                              \n\t"
    "crash_unaligned1:                      \n\t"
);
  • If the compressed instruction set (16-bit instructions) is implemented, the PC address can be aligned to either 2 bytes or 4 bytes.

  • If only the basic integer instruction set is implemented, the PC address must be aligned to 4 bytes.

Caution

The compiler generally does not generate unaligned PC addresses unless such cases are intentionally constructed.

Instruction Access Fault

Instruction access faults may occur in the following scenarios:

  • Attempting to forcibly switch modes and execute code in a mode that is not currently active. The following example force-switches to user mode, but then executes code intended for machine mode.

    __ASM volatile(
        "li     t1, 0x1800                      \n\t" // MPP mask: mstatus[12:11]
        "csrc   mstatus, t1                     \n\t" // Clear mstatus.MPP so that mret returns to user mode
        "la     t0, crash_temp                  \n\t"
        "csrw   mepc, t0                        \n\t"
        "mret                                   \n\t" // Return
        "crash_temp:                            \n\t"
    );
    
  • Jumping to address regions that only support data access, such as memory-mapped UART register regions.
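
The mode that mret returns to is taken from the MPP field of mstatus (bits 12:11). The helper functions below are a host-side sketch of that field manipulation; the names are hypothetical, and on a target the value would come from and go back to the mstatus CSR.

```c
#include <stdint.h>

#define MSTATUS_MPP_SHIFT 11u
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT)   /* 0x1800 */

/* Privilege encodings used in MPP on a core with only U and M modes. */
enum { PRIV_U = 0, PRIV_M = 3 };

/* Extract the previous-privilege field from an mstatus value. */
static unsigned mstatus_get_mpp(uint32_t mstatus)
{
    return (unsigned)((mstatus & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT);
}

/* Return mstatus with MPP replaced by mode; all other bits are preserved. */
static uint32_t mstatus_set_mpp(uint32_t mstatus, unsigned mode)
{
    return (mstatus & ~MSTATUS_MPP_MASK) |
           (((uint32_t)mode << MSTATUS_MPP_SHIFT) & MSTATUS_MPP_MASK);
}
```

Setting MPP to PRIV_U and then executing mret is what drops the core into user mode, after which fetching from a machine-only region raises the instruction access fault described above.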

Illegal Instruction

Illegal instruction exceptions may occur in the following scenarios:

  • Executing unsupported instruction sets. For example, executing floating-point instructions on a CPU that does not support floating-point operations.

  • Memory corruption, incorrect or invalid clock settings, leading to instructions fetched from memory being invalid or illegal.

The example below shows an illegal instruction:

__ASM volatile(".word 0xFFFFFFFF       \n\t");

Breakpoint Exception for Debugging

RISC-V uses the ebreak instruction to trigger a breakpoint exception, primarily for debugging purposes.

__ASM volatile("EBREAK");

Load Address Misaligned

In the RISC-V architecture, loading multi-byte data (read operations such as lw/ld instructions) requires the target address to be aligned to the data width. Otherwise, a Load address misaligned exception will occur.

The following example shows reading data from unaligned addresses; the first two accesses are not aligned and will cause a fault:

int a = *((volatile int *) 0x20000FFD);   // addr not 4-byte aligned
u16 b = *((volatile u16 *) 0x20000FFD);   // addr not 2-byte aligned
u8  c = *((volatile u8  *) 0x20000FFD);   // OK

When reading data from memory, the data type width must match the address alignment, for example:

  • 4-byte data (uint32_t/32-bit int) must be at addresses divisible by 4 (0x0, 0x4, 0x8, 0xC, …).

  • 2-byte data (uint16_t/16-bit int) must be at addresses divisible by 2 (0x0, 0x2, 0x4, …).
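
The alignment rule can be checked in code before a pointer is dereferenced. The function below is a small sketch with a hypothetical name; it assumes the access width is a power of two (1, 2, 4, or 8 bytes).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr is a multiple of width. width must be a power of two,
 * so the low bits selected by (width - 1) must all be zero. */
static bool is_naturally_aligned(uintptr_t addr, size_t width)
{
    return (addr & (uintptr_t)(width - 1u)) == 0;
}
```

For the address 0x20000FFD used in the example above, the check fails for widths 4 and 2 and passes for width 1, matching which of the three reads fault.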

Load Access Fault

A load access fault occurs during a read operation when the address is invalid or the read access fails. It is reported with mcause = 5.

uint32_t *p = (uint32_t *)0x90000000;  // 0x90000000 is not mapped
uint32_t  v = *p; // This operation will cause a load access fault

Accessing unmapped memory regions in general can also cause this type of error.

Store Address Misaligned

In the RISC-V architecture, this exception occurs when a store instruction (such as sh, sw, or sd) writes to an address that does not satisfy the alignment required by its data width. Byte stores (sb) have no alignment requirement.

  • sw requires the target address to be 4-byte aligned.

  • sh requires the target address to be 2-byte aligned.

  • sd requires the target address to be 8-byte aligned.

    uint8_t buffer[10];
    uint32_t *a = (uint32_t *)(buffer + 1); // a = buffer + 1, assume buffer = 0x1000, so a = 0x1001
    uint16_t *b = (uint16_t *)(buffer + 1); // b = buffer + 1, assume buffer = 0x1000, so b = 0x1001
    uint8_t  *c = (uint8_t  *)(buffer + 1);
    
    *a = 0x12345678; // This will trigger a store address misaligned exception
    *b = 0x1234;     // This will trigger a store address misaligned exception
    *c = 0x12;       // OK
    

Store Access Fault

A data store access fault occurs during a write operation when the address is invalid, illegal (no permission), or a bus error occurs.

uint32_t *p = (uint32_t *)0x90000000; // 0x90000000 is not mapped
*p = 0x12345678; // Triggers a Store access fault

Note

mtval usually holds the invalid address that caused the exception. For misaligned exceptions, it is the misaligned instruction or data address; for access fault exceptions, it is typically the faulting address (though some implementations may set it to 0).

Similarities and Differences

Similarities and Differences Between RISC-V and ARMv8-M

| Type | RISC-V Exception Example | ARMv8-M Exception Example |
| --- | --- | --- |
| Misaligned Address | Instruction/Load/Store misaligned | UsageFault: Unaligned Access |
| Access Violation / Invalid Address | Load/Store Access Fault | MemManage/BusFault |
| Illegal / Undefined Instruction | Illegal instruction | UsageFault: Undefined Instruction |
| System Call / Software Interrupt | Environment call (ecall) | SVCall (SVC) |
| Debug Exception | Breakpoint | Debug Monitor / Breakpoint |
| Page Table Fault | Series of Page Faults | N/A (Cortex-M lacks MMU, only supports MPU) |
| Stack Overflow | Fault caused by access violation | UsageFault: Stack overflow |