TensorFlow Lite (TFLite)

Supported ICs

  • RTL8730E

Overview

TFLite, now known as LiteRT, is Google’s on-device framework for deploying high-performance ML and GenAI models on edge platforms; it provides efficient model conversion, a lightweight runtime, and optimization tooling.

Ameba-tflite is a build of TFLite tailored for Realtek Ameba SoCs.

Repository and Dependencies

Code Repository

Ameba-tflite is hosted on GitHub: ameba-tflite.

This library is designed to be used with Realtek’s Linux SDK.

Code Structure

The repository is organized as follows:

Directory      Description
inc            Header files for the Ameba-tflite library
lib            Precompiled libraries for Ameba SoC
examples       Example applications demonstrating usage
bin            Precompiled tools for testing and debugging
build          Build scripts for the precompiled libraries
patch          Patches applied to the TensorFlow Lite source code
License        License information: Apache License 2.0
Makefile       Makefile for building examples under Ameba Linux SDK
README.md      Documentation and tutorials
version.txt    Version information

Note

The Ameba-tflite library is built using the scripts in the build directory, based on the TensorFlow Lite source code version specified in version.txt, with the patches from the patch directory applied. Ameba-tflite does not include the TensorFlow Lite source code itself; please refer to the official TensorFlow repository for source code details.
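As a rough illustration of that flow (the commands, patch layout, and version.txt format below are assumptions; the scripts in the build directory are the authoritative reference):

# Illustrative sketch only, not the actual build scripts.
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout <commit or tag recorded in ameba-tflite/version.txt>
# Apply the Realtek patches (assuming they ship as *.patch files).
git apply ../ameba-tflite/patch/*.patch
# ...then run the scripts under ameba-tflite/build/ to regenerate the libraries in lib/.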

Download

  • Auto Download: refer to Linux SDK Download to download the Linux SDK, which includes ameba-tflite as a submodule.

  • Manual Download: clone ameba-tflite separately and place it at the TFLite Path listed in the table below (a clone sketch follows the table).

SoC        Kernel   SDK           TFLite Path
RTL8730E   CA32     ameba-linux   {SDK}/sources/development/tflite
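For the manual route, a clone placed at the TFLite Path above would look roughly like this (the repository URL is a placeholder, since only the repository name is given here):

git clone <ameba-tflite repository URL> {SDK}/sources/development/tflite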

Tutorial

TensorFlow Lite C++ minimal example

This example shows how you can build a simple TensorFlow Lite application.

Example code: {SDK}/sources/development/tflite/examples/minimal.
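Conceptually, the minimal example follows the standard TensorFlow Lite C++ flow: load the flatbuffer model, build an interpreter, allocate tensors, and invoke. The sketch below is only an outline of that flow, not the shipped source; see examples/minimal for the actual code.

// Minimal TensorFlow Lite C++ inference flow (illustrative sketch only).
#include <cstdio>
#include <memory>

#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main(int argc, char* argv[]) {
  if (argc != 2) {
    fprintf(stderr, "Usage: %s <model.tflite>\n", argv[0]);
    return 1;
  }

  // Load the .tflite flatbuffer from disk.
  std::unique_ptr<tflite::FlatBufferModel> model =
      tflite::FlatBufferModel::BuildFromFile(argv[1]);
  if (!model) return 1;

  // Build an interpreter with the built-in op resolver.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  if (!interpreter) return 1;

  // Allocate input/output tensor buffers.
  if (interpreter->AllocateTensors() != kTfLiteOk) return 1;

  // Fill inputs here, e.g. via interpreter->typed_input_tensor<float>(0),
  // then run inference and read interpreter->typed_output_tensor<float>(0).
  if (interpreter->Invoke() != kTfLiteOk) return 1;

  printf("Inference done\n");
  return 0;
}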

Build the example executable

First, set EXAMPLE_NAME in {SDK}/sources/development/tflite/Makefile to minimal.

Then run the following commands to build the example executable:

cd {SDK}/
. ./envsetup.sh
bitbake rtk-tflite-algo

After building, the executable file will be located at build_rtl8730elh-va7-full/tmp/work/rtl8730elh_va7-rtk-linux-gnueabi/rtk-tflite-algo/1.0/image/bin/rtk_tflite_algo.

Run the example executable

Push the built executable (renamed to minimal here) and your model.tflite to the device using ADB, then run it with:

./minimal model.tflite

TensorFlow Lite C++ image classification demo

This example shows how you can load a pre-trained and converted TensorFlow Lite model and use it to recognize objects in images.

Example code: {SDK}/sources/development/tflite/examples/label_image.
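Internally, label_image resizes the image to the model’s input size, normalizes the pixels with the --input_mean/--input_std values, runs the interpreter, and prints the top results. The helpers below only sketch the normalization and top-N steps for a float model; the function names are made up for illustration, and the real logic is in examples/label_image/label_image.cc.

#include <algorithm>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Normalize interleaved 8-bit RGB pixels into a float input tensor using the
// --input_mean (-b) and --input_std (-s) values passed to label_image.
void FillFloatInput(const uint8_t* rgb, int num_pixels,
                    float input_mean, float input_std, float* input_tensor) {
  for (int i = 0; i < num_pixels * 3; ++i) {
    input_tensor[i] = (rgb[i] - input_mean) / input_std;
  }
}

// Select the top-N scores from the class-probability output tensor; each
// result corresponds to one "score: index label" line in the demo output.
std::vector<std::pair<float, int>> TopResults(const float* scores,
                                              int num_classes, int top_n) {
  top_n = std::min(top_n, num_classes);
  std::vector<std::pair<float, int>> results;
  results.reserve(num_classes);
  for (int i = 0; i < num_classes; ++i) results.push_back({scores[i], i});
  std::partial_sort(results.begin(), results.begin() + top_n, results.end(),
                    std::greater<>());
  results.resize(top_n);
  return results;
}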

Build the example executable

First, set EXAMPLE_NAME in {SDK}/sources/development/tflite/Makefile to label_image.

Then run the following commands to build the example executable:

cd {SDK}/
. ./envsetup.sh
bitbake rtk-tflite-algo

After building, the executable file will be located at build_rtl8730elh-va7-full/tmp/work/rtl8730elh_va7-rtk-linux-gnueabi/rtk-tflite-algo/1.0/image/bin/rtk_tflite_algo.

Download sample model and labels

You can use any compatible model, but the following MobileNet v1 model offers a good demonstration of a model trained to recognize 1,000 different objects.

# Create a local working directory
mkdir -p ./tmp

# Get model
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz | tar xzv -C ./tmp

# Get labels
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz | tar xzv -C ./tmp mobilenet_v1_1.0_224/labels.txt

mv ./tmp/mobilenet_v1_1.0_224/labels.txt ./tmp/

Run the example executable

Push the data to the device using ADB, e.g.:

# Push the built binary, renaming it to label_image on the device
adb push build_rtl8730elh-va7-full/tmp/work/rtl8730elh_va7-rtk-linux-gnueabi/rtk-tflite-algo/1.0/image/bin/rtk_tflite_algo /data/local/tmp/label_image
adb push tmp/mobilenet_v1_1.0_224.tflite /data/local/tmp
adb push examples/label_image/testdata/grace_hopper.bmp /data/local/tmp
adb push tmp/labels.txt /data/local/tmp

Run it on the device with:

/data/local/tmp/label_image \
 -m /data/local/tmp/mobilenet_v1_1.0_224.tflite \
 -i /data/local/tmp/grace_hopper.bmp \
 -l /data/local/tmp/labels.txt

The output should look something like this (each result line shows the confidence score, class index, and label):

Loaded model /data/local/tmp/mobilenet_v1_1.0_224.tflite
resolved reporter
INFO: Initialized TensorFlow Lite runtime.
invoked
average time: 25.03 ms
0.907071: 653 military uniform
0.0372416: 907 Windsor tie
0.00733753: 466 bulletproof vest
0.00592852: 458 bow tie
0.00414091: 514 cornet

When run with -h or any unsupported flag, label_image lists the supported options:

./label_image -h

label_image
--accelerated, -a: [0|1], use Android NNAPI or not
--old_accelerated, -d: [0|1], use old Android NNAPI delegate or not
--allow_fp16, -f: [0|1], allow running fp32 models with fp16 or not
--count, -c: loop interpreter->Invoke() for certain times
--gl_backend, -g: [0|1]: use GPU Delegate on Android
--hexagon_delegate, -j: [0|1]: use Hexagon Delegate on Android
--input_mean, -b: input mean
--input_std, -s: input standard deviation
--image, -i: image_name.bmp
--labels, -l: labels for the model
--tflite_model, -m: model_name.tflite
--profiling, -p: [0|1], profiling or not
--num_results, -r: number of results to show
--threads, -t: number of threads
--verbose, -v: [0|1] print more information
--warmup_runs, -w: number of warmup runs
--xnnpack_delegate, -x [0:1]: xnnpack delegate

See the label_image.cc source code for other command line options.

Note

This demo executable doesn’t support the XNNPACK, Hexagon, or GPU delegates, since they are not built into the precompiled Ameba-tflite library.

Tools

TFLite Model Benchmark Tool with C++ Binary

A prebuilt C++ binary for benchmarking a TFLite model and its individual operators on device. The binary takes a TFLite model, generates random inputs, and then repeatedly runs the model for a specified number of runs. Aggregate latency statistics are reported after the benchmark completes.

Usage

Push the executable file benchmark_model and your model file to the device using ADB. Then run the following command on the device:

./benchmark_model --graph=<your_model.tflite>

The output will look similar to the following:

./benchmark_model --graph=conv_1024x1x1x1024_input1x1_int8.tflite

INFO: STARTING!
INFO: Log parameter values verbosely: [0]
INFO: Graph: [conv_1024x1x1x1024_input1x1_int8.tflite]
INFO: Signature to run: []
INFO: Loaded model conv_1024x1x1x1024_input1x1_int8.tflite
INFO: The input model file size (MB): 1.07843
INFO: Initialized session in 2.962ms.
INFO: Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
INFO: count=270 first=2147 curr=1839 min=1832 max=2147 avg=1849.66 std=27

INFO: Running benchmark for at least 50 iterations and at least 1 seconds but terminate if exceeding 150 seconds.
INFO: count=539 first=1861 curr=1835 min=1830 max=2137 avg=1848.98 std=25

INFO: Inference timings in us: Init: 2962, First inference: 2147, Warmup (avg): 1849.66, Inference (avg): 1848.98
INFO: Note: as the benchmark tool itself affects memory footprint, the following is only APPROXIMATE to the actual memory footprint of the model at runtime. Take the information at your discretion.
INFO: Memory footprint delta from the start of the tool (MB): init=2.875 overall=3.875

Run it with --help and benchmark_model will list the supported options:

Flags:
--num_runs=50                                        int32   optional        expected number of runs, see also min_secs, max_secs
--graph=                                             string  optional        graph file name
--input_layer=                                       string  optional        input layer names
--enable_op_profiling=false                          bool    optional        enable op profiling
...(more options)
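For instance, combining a few of the flags listed above to also collect per-operator timings (the values are illustrative):

./benchmark_model --graph=conv_1024x1x1x1024_input1x1_int8.tflite --num_runs=100 --enable_op_profiling=true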

Note

This binary tool does not support the XNNPACK, Hexagon, or GPU delegates, as they are not built into the precompiled Ameba-tflite library.