SpeechMind

Overview

SpeechMind is an intelligent voice assistant built on offline AI algorithms for the Realtek Ameba chip. It is designed to make application development as simple as possible.

It provides a complete voice interaction architecture that covers audio capture, processing, music playback, and AI voice functions.

Architecture

Applications interact with SpeechMind as shown in the following figure.

Figure: SpeechMind architecture

Interfaces

Overview

SpeechMind provides an interface layer for the basic operations of the voice assistant: initialization, start, stop, callback registration, and destruction.

API                       Description
SpeechMind_Init           High-level API for applications to initialize SpeechMind.
SpeechMind_Start          High-level API for applications to start SpeechMind.
SpeechMind_Stop           High-level API for applications to stop SpeechMind.
SpeechMind_SetCallback    High-level API for applications to set the SpeechMind callback.
SpeechMind_Deinit         High-level API for applications to destroy SpeechMind.
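The call order implied by these APIs can be sketched as follows. This is only an illustration: the source gives no prototypes except the parameter of SpeechMind_SetCallback, so the int return values, the empty parameter lists, the state checks, and the position of SpeechMind_SetCallback between Init and Start are all assumptions. The functions below are local stand-in stubs, not the SDK implementation, and speechmind_lifecycle_demo and on_timeout are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;
struct SpeechMindCallback {
    /* Only one member is shown here; the real struct has more. */
    void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
};

/* ---- Hypothetical stubs. Real prototypes come from the SDK header. ---- */
enum { SM_UNINIT, SM_READY, SM_RUNNING };
static int g_state = SM_UNINIT;
static SpeechMindCallback *g_cb;

int SpeechMind_Init(void)
{
    if (g_state != SM_UNINIT) return -1;
    g_state = SM_READY;
    return 0;
}

int SpeechMind_SetCallback(SpeechMindCallback *callback)
{
    if (g_state != SM_READY || callback == NULL) return -1;
    g_cb = callback;
    return 0;
}

int SpeechMind_Start(void)
{
    if (g_state != SM_READY) return -1;
    g_state = SM_RUNNING;
    return 0;
}

int SpeechMind_Stop(void)
{
    if (g_state != SM_RUNNING) return -1;
    g_state = SM_READY;
    return 0;
}

int SpeechMind_Deinit(void)
{
    if (g_state != SM_READY) return -1;
    g_cb = NULL;
    g_state = SM_UNINIT;
    return 0;
}

/* Application-side timeout handler (does nothing in this sketch). */
static void on_timeout(SpeechMindCallback *callback)
{
    assert(callback != NULL);
}

/* Assumed order: Init -> SetCallback -> Start -> Stop -> Deinit.
 * Returns 0 when every step succeeds. */
int speechmind_lifecycle_demo(void)
{
    static SpeechMindCallback cb = { on_timeout };

    if (SpeechMind_Init() != 0) return -1;
    if (SpeechMind_SetCallback(&cb) != 0) { SpeechMind_Deinit(); return -1; }
    if (SpeechMind_Start() != 0) { SpeechMind_Deinit(); return -1; }

    /* ... the application runs here while callbacks are delivered ... */

    if (SpeechMind_Stop() != 0) return -1;
    return SpeechMind_Deinit();
}
```

In a real project the stubs would be removed and the SDK header included instead; only the sequence in speechmind_lifecycle_demo is meant to carry over.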

SpeechMind provides a registration function, SpeechMind_SetCallback(SpeechMindCallback *callback), through which applications receive output events when using AiVoice. The application must implement the interfaces defined in SpeechMindCallback, which are described below.

Notifies the application when the beginning or end of a speech segment is detected.

void (*OnVad)(SpeechMindCallback *callback, VadInfo *info);

The VadInfo parameter contains the following fields:

Parameter    Description
status       VAD status:
             0: transition from speech to silence (end point of a speech segment)
             1: transition from silence to speech (start point of a speech segment)
offset_ms    Time offset relative to the reset point
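As an illustration of how the two VadInfo fields might be combined, the handler below measures the duration of a speech segment from a start event and the following end event. The field names come from the table above, but their integer types are assumptions, and on_vad and the g_ variables are hypothetical application-side names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: the field types are guesses, not taken from the SDK. */
typedef struct {
    int status;             /* 0: speech -> silence, 1: silence -> speech */
    unsigned int offset_ms; /* time offset relative to the reset point */
} VadInfo;

static unsigned int g_speech_start_ms; /* offset of the last start event */
static unsigned int g_last_segment_ms; /* duration of the last full segment */
static int g_in_speech;                /* 1 while inside a speech segment */

void on_vad(SpeechMindCallback *callback, VadInfo *info)
{
    (void)callback;
    assert(info != NULL);

    if (info->status == 1) {
        /* Start point: remember where the segment began. */
        g_speech_start_ms = info->offset_ms;
        g_in_speech = 1;
    } else if (info->status == 0 && g_in_speech) {
        /* End point: duration is the difference between the two offsets. */
        g_last_segment_ms = info->offset_ms - g_speech_start_ms;
        g_in_speech = 0;
        printf("speech segment lasted %u ms\n", g_last_segment_ms);
    }
}
```

An end event that arrives without a preceding start event is ignored here; whether that can happen in practice is not stated in the source.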

Notifies the application when the wake-up word is detected.

void (*OnWakeUp)(SpeechMindCallback *callback, WakeUpInfo *info);

The WakeUpInfo parameter contains the following fields:

Parameter       Description
len             Length of the wake-up word
wakeup_words    Pointer to the wake-up words
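A wake-up handler would typically copy the reported word before returning. The sketch below does this with a bounded copy. It assumes that len counts bytes, that wakeup_words points to character data, and that the data is not guaranteed to be NUL-terminated; none of this is confirmed by the source. on_wake_up, g_wake_word, and WAKE_WORD_MAX are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: the field types are guesses, not taken from the SDK. */
typedef struct {
    int len;                  /* length of the wake-up word (assumed: bytes) */
    const char *wakeup_words; /* pointer to the wake-up words */
} WakeUpInfo;

#define WAKE_WORD_MAX 64
static char g_wake_word[WAKE_WORD_MAX]; /* last wake-up word, NUL-terminated */

void on_wake_up(SpeechMindCallback *callback, WakeUpInfo *info)
{
    size_t n = 0;

    (void)callback;
    assert(info != NULL);

    if (info->wakeup_words != NULL && info->len > 0) {
        n = (size_t)info->len;
        if (n > WAKE_WORD_MAX - 1) {
            n = WAKE_WORD_MAX - 1; /* truncate instead of overflowing */
        }
        memcpy(g_wake_word, info->wakeup_words, n);
    }
    g_wake_word[n] = '\0'; /* always terminate the local copy */
}
```

Copying by len rather than with strcpy keeps the handler safe even if the source buffer has no terminator.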

Notifies the application each time an input frame is received.

void (*OnAfe)(SpeechMindCallback *callback, AfeInfo *info);

The AfeInfo parameter contains the following fields:

Parameter    Description
len          Length of the AFE data
data         Pointer to the AFE data
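Because this callback fires for every frame, one reasonable pattern is to copy the data into an application-owned ring buffer and return quickly. The sketch below assumes that len counts bytes, that data points to raw bytes, and that the buffer may be reused by SpeechMind after the callback returns; these points are not stated in the source. on_afe, AFE_RING_SIZE, and the g_ variables are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: the field types are guesses, not taken from the SDK. */
typedef struct {
    int len;                   /* length of the AFE data (assumed: bytes) */
    const unsigned char *data; /* pointer to the AFE data */
} AfeInfo;

#define AFE_RING_SIZE 4096
static unsigned char g_ring[AFE_RING_SIZE]; /* oldest data is overwritten */
static size_t g_write_pos;                  /* next index to write */
static size_t g_total_bytes;                /* bytes received since start */

void on_afe(SpeechMindCallback *callback, AfeInfo *info)
{
    int i;

    (void)callback;
    assert(info != NULL);

    if (info->data == NULL || info->len <= 0) {
        return; /* nothing to copy */
    }
    for (i = 0; i < info->len; i++) {
        g_ring[g_write_pos] = info->data[i];
        g_write_pos = (g_write_pos + 1) % AFE_RING_SIZE; /* wrap around */
    }
    g_total_bytes += (size_t)info->len;
}
```

A separate task could then drain the ring buffer, for example to stream the audio elsewhere, without blocking the callback.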

Notifies the application when a command word is detected.

void (*OnAsr)(SpeechMindCallback *callback, AsrInfo *info);

The AsrInfo parameter contains the following field:

Parameter    Description
id           ID of the JSON string
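A command handler usually maps the reported id to an application action. The sketch below shows such a dispatch. The int type of id, the command values CMD_PLAY and CMD_PAUSE, and the names on_asr, g_playing, and g_unknown_count are all hypothetical; the actual IDs depend on the command set configured for the product.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: the field type is a guess, not taken from the SDK. */
typedef struct {
    int id; /* command ID reported by ASR */
} AsrInfo;

/* Hypothetical command IDs, for illustration only. */
enum { CMD_PLAY = 1, CMD_PAUSE = 2 };

static int g_playing;       /* 1 while playback is active */
static int g_unknown_count; /* number of unrecognized IDs seen */

void on_asr(SpeechMindCallback *callback, AsrInfo *info)
{
    (void)callback;
    assert(info != NULL);

    switch (info->id) {
    case CMD_PLAY:
        g_playing = 1;
        break;
    case CMD_PAUSE:
        g_playing = 0;
        break;
    default:
        g_unknown_count++; /* keep track of IDs the app does not handle */
        break;
    }
}
```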

Notifies the application when ASR/VAD times out.

void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
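Since every callback receives the SpeechMindCallback pointer that was registered, an application can embed that struct as the first member of its own context and recover the context inside the handler. The sketch below shows this pattern for the timeout callback. The five member signatures are taken from the descriptions above, but their order in the struct, the absence of other members, and the names AppContext, app_context_init, and on_asr_rec_timeout are assumptions. The info types are left opaque because they are not needed here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;
typedef struct VadInfo VadInfo;       /* opaque in this sketch */
typedef struct WakeUpInfo WakeUpInfo; /* opaque in this sketch */
typedef struct AfeInfo AfeInfo;       /* opaque in this sketch */
typedef struct AsrInfo AsrInfo;       /* opaque in this sketch */

/* Assumed layout: member order and completeness are not confirmed. */
struct SpeechMindCallback {
    void (*OnVad)(SpeechMindCallback *callback, VadInfo *info);
    void (*OnWakeUp)(SpeechMindCallback *callback, WakeUpInfo *info);
    void (*OnAfe)(SpeechMindCallback *callback, AfeInfo *info);
    void (*OnAsr)(SpeechMindCallback *callback, AsrInfo *info);
    void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
};

/* Hypothetical application context wrapping the callback table. */
typedef struct {
    SpeechMindCallback base; /* must stay first so the cast below is valid */
    int timeout_count;       /* number of ASR/VAD timeouts seen */
    int listening;           /* 1 while waiting for a command */
} AppContext;

void on_asr_rec_timeout(SpeechMindCallback *callback)
{
    AppContext *app;

    assert(callback != NULL);
    /* A pointer to a struct's first member can be converted back
     * to a pointer to the enclosing struct. */
    app = (AppContext *)callback;
    app->timeout_count++;
    app->listening = 0; /* give up waiting for a command */
}

void app_context_init(AppContext *app)
{
    assert(app != NULL);
    /* Other handlers are left NULL only to keep the sketch short. */
    app->base.OnVad = NULL;
    app->base.OnWakeUp = NULL;
    app->base.OnAfe = NULL;
    app->base.OnAsr = NULL;
    app->base.OnAsrRecTimeout = on_asr_rec_timeout;
    app->timeout_count = 0;
    app->listening = 1;
}
```

The application would then pass &app.base to SpeechMind_SetCallback. Whether SpeechMind accepts NULL members is not stated in the source, so a real application should probably implement all five handlers.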