SpeechMind

Overview

SpeechMind is an intelligent voice assistant for the Realtek Ameba chip, built on offline AI algorithms. It is designed to make application development as simple as possible.

It includes a comprehensive intelligent voice interaction architecture, encompassing audio capture, processing, music playback, and AI voice functionalities.

Architecture

Applications interact with SpeechMind as shown in the following figure.

Interfaces

Overview

SpeechMind provides an interface layer for the basic operations of the voice assistant: initialization, start, stop, callback registration, and destruction.

API	Description
SpeechMind_Init	High-level API for applications to initialize SpeechMind.
SpeechMind_Start	High-level API for applications to start SpeechMind.
SpeechMind_Stop	High-level API for applications to stop SpeechMind.
SpeechMind_SetCallback	High-level API for applications to register callbacks with SpeechMind.
SpeechMind_Deinit	High-level API for applications to destroy SpeechMind.
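The following is a minimal sketch of one plausible call order for these five functions. It does not link against the real library: the stub functions, their parameter lists and return types, the one-member SpeechMindCallback struct, and the helper run_voice_session are all assumptions made so the sketch compiles on its own. Only the function names and the SpeechMindCallback * argument of SpeechMind_SetCallback come from this document.

```c
#include <string.h>

/* ASSUMPTION: stand-in declarations so the sketch compiles on its own.
 * In a real project these come from the SpeechMind headers, and the
 * actual parameter lists and return types may differ. */
typedef struct SpeechMindCallback SpeechMindCallback;
struct SpeechMindCallback {
    void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
};

static char g_trace[64]; /* records the order of calls for inspection */

static void trace(const char *step)
{
    strncat(g_trace, step, sizeof(g_trace) - strlen(g_trace) - 1);
}

/* Stubs standing in for the real API (hypothetical signatures). */
static int SpeechMind_Init(void)   { trace("I"); return 0; }
static int SpeechMind_SetCallback(SpeechMindCallback *cb)
{
    trace(cb ? "C" : "?");
    return cb ? 0 : -1;
}
static int SpeechMind_Start(void)  { trace("S"); return 0; }
static int SpeechMind_Stop(void)   { trace("T"); return 0; }
static int SpeechMind_Deinit(void) { trace("D"); return 0; }

static void on_timeout(SpeechMindCallback *callback) { (void)callback; }

/* Runs one full session; returns 0 on success, -1 on the first failure. */
int run_voice_session(void)
{
    /* Static storage: the callback table must outlive the session. */
    static SpeechMindCallback cb = { .OnAsrRecTimeout = on_timeout };

    if (SpeechMind_Init() != 0) {
        return -1;
    }
    if (SpeechMind_SetCallback(&cb) != 0 || SpeechMind_Start() != 0) {
        SpeechMind_Deinit(); /* release what Init acquired */
        return -1;
    }

    /* ... application runs; events arrive through the callbacks ... */

    SpeechMind_Stop();
    SpeechMind_Deinit();
    return 0;
}
```

Registering the callbacks before starting is a design choice in this sketch, made so that no early event is missed; the document does not state a required order.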

The intelligent voice assistant provides a registration function SpeechMind_SetCallback(SpeechMindCallback *callback) for applications to receive output events when using AiVoice. The application needs to implement the interfaces defined in SpeechMindCallback. The relevant interface descriptions are as follows:

Notifies the application when the beginning or end of a speech segment is detected.

void (*OnVad)(SpeechMindCallback *callback, VadInfo *info);

The VadInfo parameter includes the following fields:

Parameter	Description
status	VAD status. 0: transition from speech to silence, indicating the end point of a speech segment. 1: transition from silence to speech, indicating the start point of a speech segment.
offset_ms	Time offset relative to the reset point, in milliseconds
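As an illustration of how the two status values pair up, the sketch below measures the length of a speech segment from the offsets of its start and end events. The VadInfo layout is a stand-in reconstructed from the table above: the field types are assumptions, and the handler name app_on_vad and the g_ variables are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: stand-in types; the real ones come from the SpeechMind headers. */
typedef struct SpeechMindCallback SpeechMindCallback;

typedef struct {
    int status;         /* 0: speech -> silence, 1: silence -> speech */
    uint32_t offset_ms; /* offset from the reset point (type assumed) */
} VadInfo;

static uint32_t g_speech_start_ms; /* offset of the most recent start event */
static int      g_in_speech;       /* 1 while inside a speech segment */
static uint32_t g_last_segment_ms; /* duration of the last finished segment */

/* Matches the OnVad prototype: tracks segment boundaries and duration. */
void app_on_vad(SpeechMindCallback *callback, VadInfo *info)
{
    (void)callback;
    if (info == NULL) {
        return;
    }
    if (info->status == 1) {
        g_speech_start_ms = info->offset_ms;
        g_in_speech = 1;
    } else if (info->status == 0 && g_in_speech) {
        g_last_segment_ms = info->offset_ms - g_speech_start_ms;
        g_in_speech = 0;
    }
}
```

An end event with no preceding start event is ignored, so a stray status 0 cannot produce a bogus duration.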

Notifies the application when the wake-up word is detected.

void (*OnWakeUp)(SpeechMindCallback *callback, WakeUpInfo *info);

The WakeUpInfo parameter includes the following fields:

Parameter	Description
len	Length of the wake-up word
wakeup_words	Pointer to the wake-up words
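Because the callback receives a pointer and a length, a handler that wants to keep the wake-up word should copy it rather than store the pointer. The sketch below assumes that wakeup_words points to character data, that len is a byte count, and that the data may not be NUL-terminated; none of this is confirmed by the document. The WakeUpInfo layout, app_on_wakeup, and g_last_wake_word are hypothetical stand-ins.

```c
#include <stddef.h>
#include <string.h>

/* ASSUMPTION: stand-in types; the real ones come from the SpeechMind headers. */
typedef struct SpeechMindCallback SpeechMindCallback;

typedef struct {
    int len;                  /* length of the wake-up word (assumed: bytes) */
    const char *wakeup_words; /* assumed: text, maybe not NUL-terminated */
} WakeUpInfo;

static char g_last_wake_word[64]; /* private, always NUL-terminated copy */

/* Matches the OnWakeUp prototype: keeps a bounded copy of the wake-up word. */
void app_on_wakeup(SpeechMindCallback *callback, WakeUpInfo *info)
{
    (void)callback;
    if (info == NULL || info->wakeup_words == NULL || info->len <= 0) {
        return;
    }
    size_t n = (size_t)info->len;
    if (n >= sizeof(g_last_wake_word)) {
        n = sizeof(g_last_wake_word) - 1; /* truncate instead of overflowing */
    }
    memcpy(g_last_wake_word, info->wakeup_words, n);
    g_last_wake_word[n] = '\0';
}
```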

Notifies the application each time an input frame is received.

void (*OnAfe)(SpeechMindCallback *callback, AfeInfo *info);

The AfeInfo parameter includes the following fields:

Parameter	Description
len	Length of the AFE data
data	Pointer to the AFE data
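Since this callback fires for every input frame, a handler should usually do as little as possible, for example copying the frame into a buffer that another task drains later. The sketch below shows that pattern with a simple byte ring buffer. It assumes len is a byte count and data points to raw bytes. The AfeInfo layout, AFE_RING_SIZE, app_on_afe, and the g_ variables are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: stand-in types; the real ones come from the SpeechMind headers. */
typedef struct SpeechMindCallback SpeechMindCallback;

typedef struct {
    int len;             /* length of the AFE data (assumed: bytes) */
    const uint8_t *data; /* AFE output samples as raw bytes (assumed) */
} AfeInfo;

#define AFE_RING_SIZE 4096u

static uint8_t g_ring[AFE_RING_SIZE];
static size_t  g_head;  /* next write position in g_ring */
static size_t  g_total; /* total bytes received so far */

/* Matches the OnAfe prototype: appends the frame, overwriting oldest data. */
void app_on_afe(SpeechMindCallback *callback, AfeInfo *info)
{
    (void)callback;
    if (info == NULL || info->data == NULL || info->len <= 0) {
        return;
    }
    for (size_t i = 0; i < (size_t)info->len; i++) {
        g_ring[g_head] = info->data[i];
        g_head = (g_head + 1) % AFE_RING_SIZE;
    }
    g_total += (size_t)info->len;
}
```

A real application would also need to synchronize the consumer side, which this sketch leaves out.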

Notifies the application when a command word is detected.

void (*OnAsr)(SpeechMindCallback *callback, AsrInfo *info);

The AsrInfo parameter includes the following fields:

Parameter	Description
id	ID of the JSON string
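One way to act on a recognized command is to look its id up in a small table. The sketch below assumes id is an integer; the AsrInfo layout, the ID values 1 and 2, the AppAction enum, and app_on_asr are all hypothetical, and real IDs depend on how the command words are configured.

```c
#include <stddef.h>

/* ASSUMPTION: stand-in types; the real ones come from the SpeechMind headers. */
typedef struct SpeechMindCallback SpeechMindCallback;

typedef struct {
    int id; /* command ID (integer type assumed) */
} AsrInfo;

typedef enum { ACTION_NONE, ACTION_PLAY, ACTION_PAUSE } AppAction;

/* Hypothetical command IDs; real values depend on the command-word setup. */
static const struct {
    int id;
    AppAction action;
} k_commands[] = {
    { 1, ACTION_PLAY },
    { 2, ACTION_PAUSE },
};

static AppAction g_last_action = ACTION_NONE;

/* Matches the OnAsr prototype: maps the ID to an action, or ACTION_NONE. */
void app_on_asr(SpeechMindCallback *callback, AsrInfo *info)
{
    (void)callback;
    if (info == NULL) {
        return;
    }
    g_last_action = ACTION_NONE;
    for (size_t i = 0; i < sizeof(k_commands) / sizeof(k_commands[0]); i++) {
        if (k_commands[i].id == info->id) {
            g_last_action = k_commands[i].action;
            break;
        }
    }
}
```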

Notifies the application when ASR/VAD times out.

void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
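A timeout handler typically only needs to reset application state. The sketch below keeps a two-state tracker on the application side and returns to idle on timeout. It assumes a timeout means that no command was recognized while the application was listening, which the document does not spell out. The one-member SpeechMindCallback struct, AppState, app_begin_listening, and app_on_asr_rec_timeout are hypothetical.

```c
/* ASSUMPTION: stand-in type; the real one comes from the SpeechMind headers
 * and has more members than shown here. */
typedef struct SpeechMindCallback SpeechMindCallback;
struct SpeechMindCallback {
    void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
};

typedef enum { APP_IDLE, APP_LISTENING } AppState;

static AppState g_state = APP_IDLE;
static unsigned g_timeouts; /* number of listening periods that timed out */

/* Called by the application when it starts waiting for a command,
 * for example from its wake-up handler. */
void app_begin_listening(void)
{
    g_state = APP_LISTENING;
}

/* Matches the OnAsrRecTimeout prototype: returns to idle on timeout. */
void app_on_asr_rec_timeout(SpeechMindCallback *callback)
{
    (void)callback;
    if (g_state == APP_LISTENING) {
        g_state = APP_IDLE;
        g_timeouts++;
    }
}
```

A timeout that arrives while the tracker is already idle is ignored, so repeated notifications do not inflate the counter.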