Asymmetric Crypto Engine

Introduction

The Asymmetric Crypto Engine (PKE) is a dedicated hardware accelerator within the SoC for public-key cryptographic operations. Compared with a software implementation, it executes asymmetric cryptographic algorithms significantly faster and protects keys through a physical isolation mechanism, which makes it an important part of IoT security systems.

Application Background

In IoT device security scenarios, asymmetric cryptographic algorithms are widely used in critical security processes such as digital signatures, key exchange, and identity authentication. However, these algorithms rely on a large number of complex mathematical operations (e.g., large-integer modular exponentiation and elliptic curve point multiplication). Implemented entirely in software, they consume significant CPU time and run slowly. The Asymmetric Crypto Engine implements these core operations in dedicated hardware circuits, greatly improving operation speed while ensuring security.
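To give a sense of the arithmetic involved, the following is a minimal software sketch of square-and-multiply modular exponentiation. It is illustrative only: the function name `modexp_u32` is hypothetical, operands are limited to 32 bits so that every product fits in a `uint64_t`, and it says nothing about how the engine computes internally. Real RSA operands are thousands of bits wide, which is where the software cost becomes significant.

```c
#include <assert.h>
#include <stdint.h>

/* Square-and-multiply modular exponentiation: (base ^ exp) mod m.
 * Hypothetical helper for illustration; 32-bit operands only, so each
 * intermediate product fits in 64 bits without overflow. */
uint32_t modexp_u32(uint32_t base, uint32_t exp, uint32_t m)
{
    assert(m != 0);                 /* a zero modulus is undefined */

    uint64_t result = 1 % m;        /* handles m == 1, where the answer is 0 */
    uint64_t b = base % m;

    while (exp > 0) {
        if (exp & 1u) {
            result = (result * b) % m;   /* multiply step for a set bit */
        }
        b = (b * b) % m;                 /* square step for every bit */
        exp >>= 1;
    }
    return (uint32_t)result;
}
```

The loop runs once per exponent bit, and each iteration performs one or two modular multiplications. With multi-thousand-bit operands, every one of those multiplications is itself a multi-word computation.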

Working Principle

The Asymmetric Crypto Engine connects to the CPU through the APB bus and internally contains dedicated computation units and storage units. When the application layer needs to perform an asymmetric cryptographic operation, the CPU writes the operation parameters into the engine’s registers and memory area, configures the control register to start the operation, and the engine hardware automatically completes the core cryptographic operation and returns the result. The entire process is transparent to the upper-layer application, and the user can complete the operation through APIs.

Security Features

The architecture of the Asymmetric Crypto Engine includes the following security protections:

  • Physical isolation: The engine’s internal storage unit is isolated from the system bus, preventing keys from being stolen through bus-sniffing attacks.

  • OTP key support: Supports pre-burning private keys into the OTP (One Time Programmable) region. The OTP key is directly connected to the engine and cannot be read or tampered with externally.

  • Side-channel attack protection: Some chip models provide protection against DPA (Differential Power Analysis), SPA (Simple Power Analysis), and timing attacks.

  • TrustZone support: Some chip models support ARM TrustZone technology and can automatically identify the CPU’s secure/non-secure access state.

Advantages

Compared with a pure software implementation, the Asymmetric Crypto Engine offers the following advantages:

  • High performance: Hardware acceleration significantly shortens operation time and reduces CPU usage.

  • Low power consumption: Dedicated hardware circuits consume less power than running software algorithms on a general-purpose CPU.

  • High security: The physical isolation mechanism and OTP key storage effectively prevent key leakage.

  • Ease of use: Provides well-encapsulated APIs, so users do not need to deal with low-level hardware details.

Functional Architecture

The functional specifications of the Asymmetric Crypto Engine, including its basic functions, are listed below by chip series.

RTL8721Dx:

Not supported.

OTP Keys

In addition to software keys, the engine supports private keys pre-burned into the OTP region. An OTP key is connected directly to the engine through a physically isolated path and cannot be read or tampered with externally. This is the recommended way to protect core private keys in production environments.

RTL8721Dx:

Not supported.

Working Principle

The Asymmetric Crypto Engine connects to the CPU through the APB bus and operates in Slave mode. The engine internally contains computation units, storage units, and control registers. The CPU completes operation parameter configuration and result retrieval by reading and writing these registers.

To ensure concurrency safety in a multitasking environment, the engine has a built-in hardware mutex. While a Secure-state CPU holds the lock, all Non-secure accesses are blocked. If a Non-secure CPU holds the lock, the Secure CPU can forcibly take it over through a dedicated preemption register, so that secure tasks take priority.
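The arbitration rule can be modeled in a few lines of C. This is a software model written for illustration, not the hardware interface: the type and function names (`pke_lock_model_t`, `pke_lock_try_acquire`, `pke_lock_preempt`, `pke_lock_release`) are hypothetical. The behavior when the lock is free or already held by the Secure side at preemption time is an assumption, since the text above only describes preempting a Non-secure holder.

```c
#include <assert.h>
#include <stdbool.h>

/* Who currently owns the modeled lock. */
typedef enum {
    PKE_OWNER_NONE,
    PKE_OWNER_NONSECURE,
    PKE_OWNER_SECURE
} pke_owner_t;

typedef struct {
    pke_owner_t owner;
} pke_lock_model_t;

/* Normal acquire: succeeds only when the lock is free. */
bool pke_lock_try_acquire(pke_lock_model_t *lk, pke_owner_t who)
{
    assert(lk != 0 && who != PKE_OWNER_NONE);
    if (lk->owner != PKE_OWNER_NONE) {
        return false;               /* blocked: someone else holds it */
    }
    lk->owner = who;
    return true;
}

/* Preemption: only the Secure side may use it. Assumed to succeed when
 * the lock is free or held by Non-secure, and to fail when the Secure
 * side already holds the lock. */
bool pke_lock_preempt(pke_lock_model_t *lk, pke_owner_t who)
{
    assert(lk != 0);
    if (who != PKE_OWNER_SECURE || lk->owner == PKE_OWNER_SECURE) {
        return false;
    }
    lk->owner = PKE_OWNER_SECURE;
    return true;
}

/* Release: only the current holder may release the lock. */
bool pke_lock_release(pke_lock_model_t *lk, pke_owner_t who)
{
    assert(lk != 0 && who != PKE_OWNER_NONE);
    if (lk->owner != who) {
        return false;
    }
    lk->owner = PKE_OWNER_NONE;
    return true;
}
```

In this model, a Non-secure task that has been preempted finds that its later release call fails, which is one way for software to detect that its operation was interrupted and must be restarted.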

Workflow

The standard operation flow of the engine is as follows:

  1. Acquire the mutex: The CPU acquires the engine’s hardware mutex.

  2. Write operation parameters: Write the algorithm parameters into the engine’s storage unit.

  3. Configure operation mode: Set the control register to select the required operation mode.

  4. Start the operation: Enable the engine to start the computation.

  5. Monitor operation progress: Poll the status register until the operation finishes.

  6. Retrieve operation result: After detecting the completion flag bit, read the operation result from the storage unit.

  7. Release the mutex: Release the mutex so that other tasks can use the engine.
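The seven steps above can be sketched against a simulated engine. Everything in this example is an assumption made for illustration: the `pke_sim_t` layout, the `SIM_*` constants, and the modular-addition "operation" are stand-ins, not the real register map or a real PKE operation mode. On real hardware the engine progresses on its own; here `sim_hw_tick` plays that role so the code can run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the engine; NOT the real register map. */
enum { SIM_STATUS_DONE = 1u << 0 };
enum { SIM_MODE_MODADD = 1u };        /* stand-in operation mode */
enum { SIM_OK = 0, SIM_ERR_BUSY = 1 };

typedef struct {
    bool     locked;      /* hardware mutex */
    uint32_t mem[4];      /* storage unit: a, b, modulus, result */
    uint32_t mode;        /* control register: operation mode */
    bool     start;       /* control register: start bit */
    uint32_t status;      /* status register */
    unsigned busy_ticks;  /* simulated computation latency */
} pke_sim_t;

/* Models the hardware making progress between two status reads. */
static void sim_hw_tick(pke_sim_t *dev)
{
    if (!dev->start) {
        return;
    }
    if (dev->busy_ticks > 0) {
        dev->busy_ticks--;
        return;
    }
    if (dev->mode == SIM_MODE_MODADD) {
        dev->mem[3] = (uint32_t)(((uint64_t)dev->mem[0] + dev->mem[1]) % dev->mem[2]);
    }
    dev->status |= SIM_STATUS_DONE;
    dev->start = false;
}

/* Runs the seven-step flow; returns 0 on success, non-zero on error. */
int pke_sim_modadd(pke_sim_t *dev, uint32_t a, uint32_t b, uint32_t m, uint32_t *out)
{
    assert(dev != 0 && out != 0 && m != 0);

    if (dev->locked) {                 /* 1. acquire the mutex */
        return SIM_ERR_BUSY;
    }
    dev->locked = true;

    dev->mem[0] = a;                   /* 2. write operation parameters */
    dev->mem[1] = b;
    dev->mem[2] = m;

    dev->mode = SIM_MODE_MODADD;       /* 3. configure operation mode */

    dev->status = 0;                   /* 4. start the operation */
    dev->busy_ticks = 3;
    dev->start = true;

    while (!(dev->status & SIM_STATUS_DONE)) {   /* 5. poll for completion */
        sim_hw_tick(dev);
    }

    *out = dev->mem[3];                /* 6. retrieve the result */

    dev->locked = false;               /* 7. release the mutex */
    return SIM_OK;
}
```

The mutex is released on every path that acquired it, so a later caller is never locked out by an earlier operation.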

Exception Handling

The engine provides the following exception handling mechanisms:

  • Error identification: The status register contains error flag bits; if an error is detected during polling, the process can be terminated immediately.

  • Error feedback: The API returns predefined error codes (a non-zero value indicates an error, and 0 indicates normal completion).
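A polling loop that follows both rules might look like the sketch below. The status bit positions, the error code values, and all names (`pke_poll_status`, `scripted_status_t`, `scripted_read`) are hypothetical. The only convention taken from the text is that 0 means normal completion and any non-zero value means an error. The poll limit is an addition not described above, included so that the loop cannot spin forever.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits and return codes (0 = normal completion). */
enum { STS_DONE = 1u << 0, STS_ERROR = 1u << 1 };
enum { PKE_OK = 0, PKE_ERR_HW = 1, PKE_ERR_TIMEOUT = 2 };

/* Abstracts "read the status register" so the loop can run on a host. */
typedef uint32_t (*status_reader_t)(void *ctx);

/* Polls until done, error, or the poll budget is exhausted. */
int pke_poll_status(status_reader_t read_status, void *ctx, unsigned max_polls)
{
    assert(read_status != NULL);

    for (unsigned i = 0; i < max_polls; i++) {
        uint32_t s = read_status(ctx);
        if (s & STS_ERROR) {
            return PKE_ERR_HW;      /* terminate immediately on an error flag */
        }
        if (s & STS_DONE) {
            return PKE_OK;
        }
    }
    return PKE_ERR_TIMEOUT;         /* assumed safeguard, see note above */
}

/* Test double: replays a fixed sequence, repeating the last value. */
typedef struct {
    const uint32_t *seq;
    unsigned len;
    unsigned pos;
} scripted_status_t;

uint32_t scripted_read(void *ctx)
{
    scripted_status_t *s = ctx;
    uint32_t v = s->seq[s->pos];
    if (s->pos + 1 < s->len) {
        s->pos++;
    }
    return v;
}
```

The error flag is checked before the completion flag, so a status word with both bits set is reported as a failure rather than as a valid result.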

Usage

With the working principle and workflow covered, this section describes how to use the engine. Two usage methods are available: the low-level direct API and the MbedTLS integrated API.

Development Phase

During the development phase, users typically use software keys for functional verification and debugging:

  1. Select the API type:

    • Use the low-level API to control the engine directly. It offers more comprehensive functionality but requires an understanding of the engine’s working principle.

    • Use the MbedTLS integrated API, which is more general but supports only software keys.

  2. Configure the SDK:

    • Enable the Asymmetric Crypto Engine related options in SDK menuconfig.

    • Select the algorithms and curves required by the application.

  3. Call the API:

    • Following the example code, call the corresponding API to perform key generation, signing, or verification.

Production Phase

During the production phase, OTP keys are recommended for higher security:

  1. Generate the key pair:

    • Generate the public/private key pair during the development phase.

    • Save the public key for signature verification and the private key for OTP burning.

  2. Burn the OTP key:

    • Burn the private key to the specified OTP address (see the OTP key table above).

    • Configure the key read protection/write protection bits.

    Warning

    The OTP (One Time Programmable) region can be written only once and cannot be erased or revoked. Carefully verify the address and data before executing the write command.

  3. Use the low-level API:

    • The low-level API must be used when using OTP keys.

    • The MbedTLS API does not support OTP keys.

API

For both software keys and OTP keys, Realtek provides comprehensive APIs, so users do not need to deal with low-level register operations.

Low-level API

The low-level API provides full control over the engine and supports both software keys and OTP keys:

  • Key generation API: Generates ECC or RSA key pairs.

  • Signing API: Uses the private key to digitally sign data.

  • Verification API: Uses the public key to verify signature validity.

  • Key exchange API: Performs ECDH key exchange.

  • OTP key configuration API: Configures the access permissions of OTP keys.

MbedTLS Integrated API

To improve compatibility, Realtek has integrated the hardware acceleration engine into the MbedTLS API. Users can use the standard MbedTLS ECDSA/ECDH API, with hardware acceleration automatically invoked underneath. The MbedTLS API only supports software keys, not OTP keys. Due to hardware limitations, MbedTLS support for the SECP521R1 curve has been disabled.

Supported Algorithms and Curves

The algorithms and curves supported by each chip are listed below. Common curve parameters are built into the ROM, so users do not need to supply them:

RTL8721Dx:

Not supported.