SpeechMind
Overview
SpeechMind is an intelligent voice assistant for the Realtek Ameba chip, built on offline AI algorithms. It is designed to make application development as convenient as possible.
It provides a complete voice interaction architecture that covers audio capture, audio processing, music playback, and AI voice functions.
Architecture
Applications interact with SpeechMind as shown in the following figure.
SpeechMind architecture
Interfaces
Overview
SpeechMind provides an interface layer for the basic operations of the voice assistant: initialization, start, stop, callback registration, and destruction.
| API layers | Introduction |
|---|---|
| SpeechMind_Init | High-level API for applications to initialize SpeechMind. |
| SpeechMind_Start | High-level API for applications to start SpeechMind. |
| SpeechMind_Stop | High-level API for applications to stop SpeechMind. |
| SpeechMind_SetCallback | High-level API for applications to set the callback of SpeechMind. |
| SpeechMind_Deinit | High-level API for applications to destroy SpeechMind. |
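The call order suggested by the table above (initialize, register callbacks, start, stop, destroy) can be sketched as follows. This is only an illustration: the text here does not give the argument lists or return types of these functions, so the stubs below assume they take no arguments (apart from the callback pointer) and return 0 on success. The stub bodies, `run_voice_session`, and `on_timeout` are hypothetical and exist only so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in declarations so the sketch is self-contained. In a real project
 * these come from the SDK header; the return types and argument lists below
 * are ASSUMPTIONS, not the documented signatures. */
typedef struct SpeechMindCallback SpeechMindCallback;
struct SpeechMindCallback {
    /* Only the member used in this sketch. */
    void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
};

/* Stub state that mimics what a voice assistant would track. */
static int g_initialized;
static int g_running;
static const SpeechMindCallback *g_registered;

static int SpeechMind_Init(void) { g_initialized = 1; return 0; }
static int SpeechMind_SetCallback(SpeechMindCallback *callback)
{
    if (!g_initialized || callback == NULL) return -1;
    g_registered = callback;
    return 0;
}
static int SpeechMind_Start(void)
{
    if (!g_initialized) return -1;
    g_running = 1;
    return 0;
}
static int SpeechMind_Stop(void) { g_running = 0; return 0; }
static int SpeechMind_Deinit(void)
{
    g_registered = NULL;
    g_initialized = 0;
    return 0;
}

/* Hypothetical application handler. */
static void on_timeout(SpeechMindCallback *callback)
{
    (void)callback;
    puts("ASR/VAD timeout");
}

static SpeechMindCallback g_callbacks = { .OnAsrRecTimeout = on_timeout };

/* Runs one full lifecycle; returns 0 on success, -1 on the first failure. */
int run_voice_session(void)
{
    if (SpeechMind_Init() != 0) return -1;

    /* Register callbacks before starting so no early event is missed. */
    if (SpeechMind_SetCallback(&g_callbacks) != 0 || SpeechMind_Start() != 0) {
        SpeechMind_Deinit();
        return -1;
    }

    /* Stub-only sanity check: we are running with our callbacks registered. */
    assert(g_running && g_registered == &g_callbacks);

    /* ... the application runs here while callbacks deliver events ... */

    SpeechMind_Stop();
    SpeechMind_Deinit();
    return 0;
}
```

Registering the callback before starting is a design choice of this sketch rather than a documented requirement.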
The voice assistant provides a registration function, SpeechMind_SetCallback(SpeechMindCallback *callback), through which applications receive output events when using AiVoice. The application must implement the interfaces defined in SpeechMindCallback.
The relevant interface descriptions are as follows:
Notifies the application when the beginning or end of a speech segment is detected.
void (*OnVad)(SpeechMindCallback *callback, VadInfo *info);
The VadInfo parameter contains the following fields:
| Parameter | Description |
|---|---|
| status | VAD status. 0: transition from speech to silence, marking the end of a speech segment. 1: transition from silence to speech, marking the start of a speech segment. |
| offset_ms | Time offset in milliseconds relative to the reset point |
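As an illustration of how the two VadInfo fields can be combined, the handler below measures the length of a speech segment as the difference between the end offset and the start offset. The field types (`int` and `unsigned int`) are assumptions, since only the field names are given here, and `on_vad` and `last_segment_ms` are hypothetical application names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: the field names come from the table above,
 * the types are guesses for the sake of a compilable sketch. */
typedef struct {
    int status;             /* 1: speech start, 0: speech end */
    unsigned int offset_ms; /* offset from the reset point, in ms */
} VadInfo;

struct SpeechMindCallback {
    /* Only the member used in this sketch. */
    void (*OnVad)(SpeechMindCallback *callback, VadInfo *info);
};

static int g_in_speech;                /* nonzero between start and end */
static unsigned int g_start_ms;        /* offset of the last start event */
static unsigned int g_last_segment_ms; /* length of the last full segment */

/* Hypothetical handler: records segment boundaries and their distance. */
static void on_vad(SpeechMindCallback *callback, VadInfo *info)
{
    (void)callback;
    assert(info != NULL);

    if (info->status == 1) {
        g_in_speech = 1;
        g_start_ms = info->offset_ms;
    } else if (info->status == 0 && g_in_speech) {
        /* Ignore an end event that has no matching start. */
        g_in_speech = 0;
        g_last_segment_ms = info->offset_ms - g_start_ms;
    }
}

/* Length in ms of the most recently completed speech segment. */
unsigned int last_segment_ms(void)
{
    return g_last_segment_ms;
}
```

For example, a start event at offset 1200 followed by an end event at offset 2750 yields a segment length of 1550 ms.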
Notifies the application when the wake-up word is detected.
void (*OnWakeUp)(SpeechMindCallback *callback, WakeUpInfo *info);
The WakeUpInfo parameter contains the following fields:
| Parameter | Description |
|---|---|
| len | The length of the wake-up words |
| wakeup_words | Pointer to the wake-up words |
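Because WakeUpInfo carries a pointer plus a length, a cautious handler copies the text into its own buffer instead of keeping the pointer. The sketch below assumes that `wakeup_words` points to character data, that `len` is a byte count, and that the data may be neither NUL-terminated nor valid after the callback returns. None of this is confirmed here. `on_wake_up`, `last_wake_word`, and `WAKE_WORD_MAX` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: field names from the table above, types guessed. */
typedef struct {
    int len;                  /* length of the wake-up words, in bytes */
    const char *wakeup_words; /* pointer to the wake-up words */
} WakeUpInfo;

struct SpeechMindCallback {
    /* Only the member used in this sketch. */
    void (*OnWakeUp)(SpeechMindCallback *callback, WakeUpInfo *info);
};

#define WAKE_WORD_MAX 64 /* hypothetical buffer size, including the NUL */

static char g_wake_word[WAKE_WORD_MAX];

/* Hypothetical handler: keeps a bounded, NUL-terminated private copy. */
static void on_wake_up(SpeechMindCallback *callback, WakeUpInfo *info)
{
    (void)callback;
    assert(info != NULL);

    if (info->wakeup_words == NULL || info->len <= 0) {
        return;
    }

    size_t n = (size_t)info->len;
    if (n > WAKE_WORD_MAX - 1) {
        n = WAKE_WORD_MAX - 1; /* truncate rather than overflow */
    }
    memcpy(g_wake_word, info->wakeup_words, n);
    g_wake_word[n] = '\0';
}

/* Most recently received wake-up words, or "" if none yet. */
const char *last_wake_word(void)
{
    return g_wake_word;
}
```

Using `len` rather than `strlen` keeps the copy correct even if the source buffer has no terminator.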
Notifies the application each time an input frame is received.
void (*OnAfe)(SpeechMindCallback *callback, AfeInfo *info);
The AfeInfo parameter contains the following fields:
| Parameter | Description |
|---|---|
| len | The length of the AFE data |
| data | Pointer to the AFE data |
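Since this callback fires for every input frame, a handler should do little more than copy the data out. The sketch below appends each frame to a fixed-size buffer and counts any bytes that do not fit. It assumes `len` is a byte count and `data` points to raw bytes, and neither assumption is confirmed here. `on_afe`, `AFE_BUF_BYTES`, and the accessor functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: field names from the table above, types guessed. */
typedef struct {
    int len;                   /* length of the AFE data, assumed bytes */
    const unsigned char *data; /* pointer to the AFE data */
} AfeInfo;

struct SpeechMindCallback {
    /* Only the member used in this sketch. */
    void (*OnAfe)(SpeechMindCallback *callback, AfeInfo *info);
};

#define AFE_BUF_BYTES 4096 /* hypothetical capacity */

static unsigned char g_afe_buf[AFE_BUF_BYTES];
static size_t g_afe_used;    /* bytes currently stored */
static size_t g_afe_dropped; /* bytes discarded because the buffer was full */

/* Hypothetical handler: copy what fits, count what does not. */
static void on_afe(SpeechMindCallback *callback, AfeInfo *info)
{
    (void)callback;
    assert(info != NULL);

    if (info->data == NULL || info->len <= 0) {
        return;
    }

    size_t n = (size_t)info->len;
    size_t room = AFE_BUF_BYTES - g_afe_used;
    size_t take = (n < room) ? n : room;

    memcpy(g_afe_buf + g_afe_used, info->data, take);
    g_afe_used += take;
    g_afe_dropped += n - take;
}

size_t afe_bytes_stored(void)  { return g_afe_used; }
size_t afe_bytes_dropped(void) { return g_afe_dropped; }
```

A consumer task would normally drain the buffer. That part is omitted here, and a real design would also need whatever locking or queueing the platform requires.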
Notifies the application when a command word is detected.
void (*OnAsr)(SpeechMindCallback *callback, AsrInfo *info);
The AsrInfo parameter contains the following fields:
| Parameter | Description |
|---|---|
| id | ID of the JSON string |
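One way an application might act on the `id` field is a small lookup table that maps IDs to its own commands. In the sketch below the ID values (1 and 2), the `Command` enum, and all function names are made up for illustration. The actual IDs depend on the command set in use, and the type of `id` is assumed to be `int`.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;

/* Assumed layout: only the field named in the table above, type guessed. */
typedef struct {
    int id;
} AsrInfo;

struct SpeechMindCallback {
    /* Only the member used in this sketch. */
    void (*OnAsr)(SpeechMindCallback *callback, AsrInfo *info);
};

/* Hypothetical application commands. */
typedef enum { CMD_NONE, CMD_PLAY, CMD_PAUSE } Command;

/* Hypothetical mapping; real IDs come from the configured command set. */
static const struct {
    int id;
    Command cmd;
} k_command_table[] = {
    { 1, CMD_PLAY },
    { 2, CMD_PAUSE },
};

/* Returns the command for an ID, or CMD_NONE if the ID is unknown. */
Command lookup_command(int id)
{
    size_t count = sizeof(k_command_table) / sizeof(k_command_table[0]);
    for (size_t i = 0; i < count; i++) {
        if (k_command_table[i].id == id) {
            return k_command_table[i].cmd;
        }
    }
    return CMD_NONE;
}

static Command g_last_command = CMD_NONE;

/* Hypothetical handler: remember the most recent recognized command. */
static void on_asr(SpeechMindCallback *callback, AsrInfo *info)
{
    (void)callback;
    assert(info != NULL);
    g_last_command = lookup_command(info->id);
}

Command last_command(void)
{
    return g_last_command;
}
```

Keeping the mapping in one table means an unknown ID falls through to `CMD_NONE` instead of triggering an action.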
Notifies the application when ASR/VAD times out.
void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
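Taken together, the wake-up, command, and timeout callbacks suggest a simple two-state flow: idle until a wake-up word arrives, then listening until either a command or a timeout ends the session. That flow is an assumption made for this sketch, not documented behavior. The sketch also assumes SpeechMind hands each callback the same pointer that was registered, which lets the application embed SpeechMindCallback as the first member of its own struct and cast back to reach its state. `App`, `app_init`, and the handler names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SpeechMindCallback SpeechMindCallback;
typedef struct WakeUpInfo WakeUpInfo; /* opaque here: fields are not read */
typedef struct AsrInfo AsrInfo;       /* opaque here: fields are not read */

struct SpeechMindCallback {
    /* Only the members used in this sketch. */
    void (*OnWakeUp)(SpeechMindCallback *callback, WakeUpInfo *info);
    void (*OnAsr)(SpeechMindCallback *callback, AsrInfo *info);
    void (*OnAsrRecTimeout)(SpeechMindCallback *callback);
};

typedef enum { STATE_IDLE, STATE_LISTENING } AssistantState;

/* Hypothetical application object. */
typedef struct {
    SpeechMindCallback base; /* must stay the first member for the cast */
    AssistantState state;
    int commands; /* commands accepted while listening */
    int timeouts; /* listening sessions that ended in a timeout */
} App;

/* A pointer to a struct's first member converts back to the struct. */
static App *app_from(SpeechMindCallback *callback)
{
    assert(callback != NULL);
    return (App *)callback;
}

static void on_wake_up(SpeechMindCallback *callback, WakeUpInfo *info)
{
    (void)info;
    app_from(callback)->state = STATE_LISTENING;
}

static void on_asr(SpeechMindCallback *callback, AsrInfo *info)
{
    App *app = app_from(callback);
    (void)info;
    if (app->state == STATE_LISTENING) { /* ignore commands while idle */
        app->commands++;
        app->state = STATE_IDLE;
    }
}

static void on_timeout(SpeechMindCallback *callback)
{
    App *app = app_from(callback);
    if (app->state == STATE_LISTENING) {
        app->timeouts++;
        app->state = STATE_IDLE;
    }
}

/* Wires the handlers and resets the counters. */
void app_init(App *app)
{
    app->base.OnWakeUp = on_wake_up;
    app->base.OnAsr = on_asr;
    app->base.OnAsrRecTimeout = on_timeout;
    app->state = STATE_IDLE;
    app->commands = 0;
    app->timeouts = 0;
}
```

In this arrangement the application would pass `&app.base` to SpeechMind_SetCallback. If the library copied the struct instead of keeping the pointer, the cast would not be valid and global state would be needed instead.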