TensorFlowLite-Micro (TFLM)

Supported ICs

RTL8721Dx (Supported Kernel: KM4)
RTL8721F (Supported Kernel: KM4)
RTL8720E (Supported Kernel: KM4)
RTL8710E (Supported Kernel: KM4)
RTL8726E (Supported Kernel: KM4, DSP)
RTL8713E (Supported Kernel: KM4, DSP)
RTL8730E (Supported Kernel: CA32)

Overview

TensorFlowLite-Micro (TFLM) is an open-source library, it is a port of TensorFlow Lite designed to run machine learning models on DSPs, microcontrollers and other devices with limited memory.

Ameba-tflite-micro is a version of the TFLM for Realtek Ameba SoCs with platform specific optimizations.

More Information

Repository and Dependencies

Code Repository

Ameba-tflite-micro source code is hosted on GitHub: ameba-tflite-micro.

Dependencies

Ameba-tflite-micro requires one of the following SDKs:

SDK selection depends on the SoC and kernel.

Download Methods

Auto Download: Use git clone --recursive to fetch the SDK, which includes ameba-tflite-micro as a submodule.
Manual Download: Clone ameba-tflite-micro separately and place it in the SDK’s specified path.

SoC	Kernel	SDK	TFLM Path
RTL8721Dx	KM4	ameba-rtos	{SDK}/component/tflite_micro
RTL8721F	KM4	ameba-rtos	{SDK}/component/tflite_micro
RTL8720E	KM4	ameba-rtos	{SDK}/component/tflite_micro
RTL8730E	CA32	ameba-rtos	{SDK}/component/tflite_micro
RTL8713E/RTL8726E	KM4	ameba-rtos	{SDK}/component/tflite_micro
RTL8713E/RTL8726E	DSP	ameba-dsp	{SDK}/lib/tflite_micro

Examples

The examples are located in the examples directory of the ameba-tflite-micro repository.

Example	Description
tflm_hello_world	Basic TFLM inference demo with float and quantized models for model loading and inference
tflm_micro_speech	Audio keyword recognition (“yes”/”no”) using spectrogram-based audio preprocessing
tflm_mnist	Image classification: handwritten digit recognition (0-9) using CNN model

Build TFLM

RTL8721Dx:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8721Dx

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

RTL8720E:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM (KM4 core)

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8726E, RTL8710E

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

RTL8710E:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM (KM4 core)

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8726E, RTL8710E

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

RTL8726E:

DSP:

Enter TFLM directory in DSP SDK

cd {DSPSDK}/lib/tflite_micro

Run build script to build TFLM library

./build/build_amebalite_dsp.sh

Output library will be saved to {DSPSDK}/lib/tflite_micro/amebalite_dsp-out/.

To build example firmware, refer to DSP Build for steps and the README in example directory for software configurations.

More examples are in {DSPSDK}/lib/tflite_micro/examples.

KM4:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM (KM4 core)

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8726E, RTL8710E

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

RTL8713E:

DSP:

Enter TFLM directory in DSP SDK

cd {DSPSDK}/lib/tflite_micro

Run build script to build TFLM library

./build/build_amebalite_dsp.sh

Output library will be saved to {DSPSDK}/lib/tflite_micro/amebalite_dsp-out/.

To build example firmware, refer to DSP Build for steps and the README in example directory for software configurations.

More examples are in {DSPSDK}/lib/tflite_micro/examples.

KM4:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM (KM4 core)

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8726E, RTL8710E

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

RTL8730E:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8730E

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

RTL8721F:

Before building TFLM Example, please refer to SDK Download to download the XDK, and successfully build the default application of the SDK according to CLI Build and Download .

Enter SDK directory and run env.sh to set up environment

cd {SDK}/
source env.sh

Run ameba.py menuconfig to enter configuration interface and enable TFLM

ameba.py menuconfig <ic_name>  ;; Replace with chip name, e.g. RTL8721F

Menu path:

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO

Build example (replace tflm_hello_world with your target example name)

ameba.py build -a tflm_hello_world

More examples are in {SDK}/component/tflite_micro/examples.

Tutorial

MNIST Introduction

The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits, widely used for training and validating machine learning models.

In this tutorial, MNIST database is used to show a full workflow from training a model to deploying it and run inference on Ameba SoCs with TFLM.

Example codes: {SDK}/component/tflite_micro/examples/tflm_mnist.

Experimental Steps

Note

Step 1-4 are for preparing necessary files on a development machine (server or PC etc.). You can skip these steps and use the prepared files in SDK to build and download in Step 5.

Step 1: Environment Setup

Navigate to the example directory:

cd {SDK}/component/tflite_micro/examples/tflm_mnist

First time setup: Create a virtual environment and install dependencies:

python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -r requirements.txt

Existing environment: If you have already created the virtual environment, simply activate it:

source venv/bin/activate  # on Windows: venv\Scripts\activate

Note

Due to requirements of some libraries, Python version needs to be between 3.8-3.11.

Step 2: Train a Model

Use Keras (Tensorflow) or PyTorch to train a classification model for 10 digits of MNIST dataset.

The example uses a simple convolution based model, it will train several epochs and then test accuracy.

Keras:

Run script

python keras_train_eval.py --output keras_mnist_conv

Output files:

keras_mnist_conv/saved_model/ - Keras SavedModel format

Note

Due to the limited computation resources and memory of microcontrollers, it is recommended to pay attention to model size and operation numbers. In keras_train_eval.py, keras_flops library is used:

from keras_flops import get_flops

model.summary()
flops = get_flops(model, batch_size=1)

PyTorch:

Run script

python torch_train_eval.py --output torch_mnist_conv

Output files:

torch_mnist_conv/model.pt - PyTorch model checkpoint
torch_mnist_conv/model.onnx - ONNX format for later conversion

Note

Due to the limited computation resources and memory of microcontrollers, it is recommended to pay attention to model size and operation numbers. In torch_train_eval.py, ptflops library is used:

from ptflops import get_model_complexity_info

macs, params = get_model_complexity_info(model, (1,28,28), as_strings=False)

Step 3: Convert to Tflite

Apply post-training integer quantization on the trained model and output .tflite format.

Float model inference is also supported on Ameba SoCs, however, it is recommended to use integer quantization which can extremely reduce computation and memory with little performance degradation.

Refer to tflite official site for more details about integer-only quantization.

Keras:

Run script

python convert.py --input-path keras_mnist_conv/saved_model --output-path keras_mnist_conv

Note

convert.py script details:

tf.lite.TFLiteConverter is used to convert SavedModel to .tflite format
Below settings for int8 quantization:

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = repr_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_int8_model = converter.convert()

PyTorch:

Run script

python convert.py --input-path torch_mnist_conv/model.onnx --output-path torch_mnist_conv

Note

convert.py script details:

onnx-tensorflow library is used to convert .onnx to SavedModel format
tf.lite.TFLiteConverter is used to convert SavedModel to .tflite format
Below settings for int8 quantization:

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = repr_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_int8_model = converter.convert()

Tip

Recommended libraries for .onnx model conversion:

After conversion, the performance on test set will be validated using the TFLite model.

Output files:

{output}/model_int8.tflite - Int8 quantized TFLite model
testdata/input_int8.npy - Input test data (100 samples)
testdata/label_int8.npy - Label test data (100 samples)

Step 4: Convert to C++ and Prepare for Deployment

Run below commands to convert .tflite model and .npy test data to .cc and .h files for deployment:

cp {output}/model_int8.tflite models/model_int8.tflite
python generate_cc_arrays.py models models/model_int8.tflite
python generate_cc_arrays.py testdata testdata/input_int8.npy testdata/input_int8.npy testdata/label_int8.npy testdata/label_int8.npy

Output files:

models/model_int8_model_data.{cc,h} - Model data
testdata/input_int8_test_data.{cc,h} - Input test data
testdata/label_int8_test_data.{cc,h} - Label test data

Note

If using your own trained model, replace the generated files in models/ and testdata/ directories before building.
The generated files will be used in example_tflm_mnist.cc to run inference on the SoC, calculate accuracy, and profile memory/latency.
Use netron to visualize the .tflite file and check the operations used by the model. When implementing inference, instantiate the operations resolver to register required operations:

using MnistOpResolver = tflite::MicroMutableOpResolver<4>;

TfLiteStatus RegisterOps(MnistOpResolver& op_resolver) {
    TF_LITE_ENSURE_STATUS(op_resolver.AddFullyConnected());
    TF_LITE_ENSURE_STATUS(op_resolver.AddConv2D());
    TF_LITE_ENSURE_STATUS(op_resolver.AddMaxPool2D());
    TF_LITE_ENSURE_STATUS(op_resolver.AddReshape());
    return kTfLiteOk;
}

Refer to tflite-micro official site for more details.

Step 5: Build and Download Firmware

Follow steps in Build TFLM to build the example image and download image using Flash Program Tool .

Step 6: View Logs

After flashing, open the serial terminal to view the inference results:

[TFLITE-MICRO] ~~~TESTS START~~~
[TFLITE-MICRO] model arena size = 16204
[TFLITE-MICRO] Accuracy: 100/100

"Unique Tag","Total ticks across all events with that tag."
CONV_2D, 29763
MAX_POOL_2D, 3422
RESHAPE, 339
FULLY_CONNECTED, 28897
"total number of ticks", 62421

[RecordingMicroAllocator] Arena allocation total 16204 bytes
[RecordingMicroAllocator] Arena allocation head 13968 bytes
[RecordingMicroAllocator] Arena allocation tail 2236 bytes

[TFLITE-MICRO] Total Time: 72.185 ms

[TFLITE-MICRO] ~~~ALL TESTS PASSED~~~

Log Explanation:

model arena size = 16204 - Memory pool size required by the model (in bytes)
Accuracy: 100/100 - Number of correct predictions out of 100 test samples
CONV_2D, MAX_POOL_2D, etc. - Time spent on each operation type (in microseconds)
total number of ticks = 62421 - Total inference time
Total Time: 72.185 ms - Total inference time in milliseconds

Note

The inference time and memory usage may vary on different chips (e.g., RTL8721Dx, RTL8730E), cores (KM4, CA32, DSP), and SDK versions. The values above are for reference only.