Audio Module
Supported ICs: [RTL8735C]
Overview
The audio_module provides AP-side audio capture and playback through the built-in audio hardware. It supports both analog microphone (AMIC) and digital microphone (DMIC) inputs, and drives speaker/headphone output through the DAC.
Within the Multimedia Framework, the audio module acts as a source (capture) for upstream processing such as AAC encoding, and as a sink (playback) for decoded audio from the network or local playback.
[Audio Source] --> [audio_module] --> SISO --> [aac_module] --> ...
(capture) (DMA/ALC)
... --> [aad_module] --> SISO --> [audio_module] --> [Speaker/Headphone]
(playback)
audio_params_t audio_params;
// Read the current parameters so that untouched fields keep their values
mm_module_ctrl(audio_ctx, CMD_AUDIO_GET_PARAMS, (int)&audio_params);
audio_params.sample_rate = ASR_16KHZ;        // 16 kHz, wideband voice
audio_params.channel = 1;                    // mono
audio_params.use_mic_type = USE_AUDIO_AMIC;  // analog microphone
audio_params.mic_gain = MIC_20DB;            // 20 dB analog gain
audio_params.avsync_en = 1;                  // timestamps for A/V sync
mm_module_ctrl(audio_ctx, CMD_AUDIO_SET_PARAMS, (int)&audio_params);
mm_module_ctrl(audio_ctx, MM_CMD_SET_QUEUE_LEN, 6);
mm_module_ctrl(audio_ctx, MM_CMD_INIT_QUEUE_ITEMS, MMQI_FLAG_STATIC);
// Apply the configuration
mm_module_ctrl(audio_ctx, CMD_AUDIO_APPLY, 0);
sample_rate
Selects the audio sample rate.
| Constant | Value | Description |
|---|---|---|
| | 0 | 8000 Hz, narrowband voice |
| | 1 | 16000 Hz, wideband voice |
| | 2 | 32000 Hz |
| | 3 | 44100 Hz, CD quality |
| | 4 | 48000 Hz |
| | 5 | 88200 Hz |
| | 6 | 96000 Hz |
Note
ASP algorithms (AEC, NS, AGC) only support 8 kHz and 16 kHz. Higher sample rates bypass ASP processing.
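The value-to-rate mapping and the ASP restriction in the note can be checked in application code before a configuration is applied. The following is a minimal sketch: the function names are hypothetical and not part of the SDK, and the rates are copied from the Value and Description columns of the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Rate in Hz for each sample_rate value 0..6, copied from the table. */
static const uint32_t k_rate_hz[] = {
    8000, 16000, 32000, 44100, 48000, 88200, 96000
};

/* Returns the rate in Hz, or 0 if the value is outside the table. */
uint32_t sample_rate_value_to_hz(int value)
{
    size_t count = sizeof(k_rate_hz) / sizeof(k_rate_hz[0]);
    if (value < 0 || (size_t)value >= count) {
        return 0;
    }
    return k_rate_hz[value];
}

/* True when ASP (AEC, NS, AGC) can run at this rate: 8 kHz or 16 kHz only. */
bool asp_supported_at(uint32_t rate_hz)
{
    return rate_hz == 8000u || rate_hz == 16000u;
}
```

An application could call `asp_supported_at(sample_rate_value_to_hz(v))` and warn the user when a higher rate would cause ASP processing to be bypassed.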
word_length
| Constant | Value | Description |
|---|---|---|
| | 0 | 16-bit PCM (default) |
| | 1 | 24-bit PCM |
channel
Number of channels: 1 (mono) or 2 (stereo). When using USE_AUDIO_STEREO_DMIC, the two channels are merged into a single mono stream by default.
use_mic_type
Selects the microphone input type.
| Constant | Value | Description |
|---|---|---|
| | 0 | Analog microphone |
| | 1 | Left digital microphone |
| | 2 | Right digital microphone |
| | 3 | Stereo digital microphone |
mic_gain
Analog microphone input gain (only applies when use_mic_type = USE_AUDIO_AMIC).
| Constant | Value | Description |
|---|---|---|
| | 0 | 0 dB gain |
| | 1 | 20 dB gain |
| | 2 | 30 dB gain |
| | 3 | 40 dB gain (default) |
dmic_l_gain / dmic_r_gain
Digital microphone boost gain (only applies when using DMIC). Left and right channels are configured independently.
| Constant | Value | Description |
|---|---|---|
| | 0 | 0 dB gain (default) |
| | 1 | 12 dB gain |
| | 2 | 24 dB gain |
| | 3 | 36 dB gain |
ADC_gain
ADC digital volume controls the input (analog-to-digital) gain. Range: -17.625 dB (0x00) ~ 30 dB (0x7F). Default: 0x66 (~20 dB).
This can be changed at runtime using CMD_AUDIO_SET_ADC_GAIN.
DAC_gain
DAC digital volume controls the output (digital-to-analog) gain. Range: -65.625 dB (0x00) ~ 0 dB (0xAF). Default: 0xAF (0 dB).
This can be changed at runtime using CMD_AUDIO_SET_DAC_GAIN.
Note
Configure digital gain first. Only use analog gain (mic_gain) when digital gain alone is insufficient. Excessive analog gain introduces noticeable noise.
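If the register scale is linear, both documented ranges work out to 0.375 dB per register step (47.625 dB over 127 steps for the ADC, 65.625 dB over 175 steps for the DAC). The sketch below converts between register values and dB under that assumption, which the text does not state explicitly. The function names are illustrative and not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed uniform step size, derived from the documented range endpoints. */
#define GAIN_STEP_DB 0.375

/* ADC digital volume: 0x00 = -17.625 dB ... 0x7F = 30 dB. */
double adc_reg_to_db(uint8_t reg)
{
    assert(reg <= 0x7F);
    return -17.625 + GAIN_STEP_DB * reg;
}

/* DAC digital volume: 0x00 = -65.625 dB ... 0xAF = 0 dB. */
double dac_reg_to_db(uint8_t reg)
{
    assert(reg <= 0xAF);
    return -65.625 + GAIN_STEP_DB * reg;
}

/* Nearest ADC register value for a target gain, clamped to 0x00..0x7F. */
uint8_t adc_db_to_reg(double db)
{
    double steps = (db + 17.625) / GAIN_STEP_DB;
    if (steps < 0.0) {
        steps = 0.0;
    }
    if (steps > 127.0) {
        steps = 127.0;
    }
    return (uint8_t)(steps + 0.5);  /* round to nearest step */
}
```

Under this assumption the default ADC value 0x66 corresponds to 20.625 dB, which is consistent with the "~20 dB" quoted above.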
mix_mode
When set to 1, the module mixes up to 4 input PCM streams into a single output. This enables multi-stream mixing (e.g., local playback + network audio). Default: 0 (single input).
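To illustrate what mixing several PCM streams involves, here is a generic sketch that sums up to four 16-bit streams sample by sample and saturates the result. It is not the module's actual implementation, which is not described here; the function names and the saturating-sum strategy are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MIX_INPUTS 4

/* Clamp a 32-bit sum into the 16-bit PCM range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) {
        return INT16_MAX;
    }
    if (v < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)v;
}

/* Sum n_inputs streams of n_samples each into out, sample by sample. */
void mix_pcm16(const int16_t *const inputs[], size_t n_inputs,
               int16_t *out, size_t n_samples)
{
    assert(n_inputs >= 1 && n_inputs <= MAX_MIX_INPUTS);
    for (size_t i = 0; i < n_samples; i++) {
        int32_t acc = 0;
        for (size_t k = 0; k < n_inputs; k++) {
            acc += inputs[k][i];
        }
        out[i] = saturate16(acc);
    }
}
```

Saturation avoids wrap-around when the inputs add up beyond the 16-bit range, at the cost of clipping loud passages.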
hpf_set
High-pass filter cutoff index (0–7). The cutoff frequency is approximately:
fc ≈ 5e-3 / (hpf_fc + 1) × fs
Default: 0 (HPF disabled). Use this to filter out DC offset and low-frequency noise.
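The cutoff formula can be evaluated directly to see which frequency a given index produces at a given sample rate. This is a minimal sketch of the approximate formula above; the function name is hypothetical.

```c
#include <assert.h>

/* fc ~= 5e-3 / (hpf_fc + 1) * fs, with hpf_fc in 0..7 and fs in Hz. */
double hpf_cutoff_hz(int hpf_fc, double fs_hz)
{
    assert(hpf_fc >= 0 && hpf_fc <= 7);
    return 5e-3 / (hpf_fc + 1) * fs_hz;
}
```

For example, at fs = 16000 Hz the formula gives about 80 Hz for index 0 and about 10 Hz for index 7, so a larger index lowers the cutoff.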
enable_record
Set to 1 to enable the audio recording data path. Default: 0.
avsync_en
Set to 1 to enable hardware-aligned timestamps for A/V sync. Required when combining audio with video in a MIMO pipeline (MP4 recording, RTSP streaming).
Audio Module Configuration Timing
Configure audio_params_t before CMD_AUDIO_APPLY. After the apply and stream start, limit runtime changes to the settings that have a dedicated runtime command.
If a running stream must change the sample rate, mic type, channel count, or HPF setting, update audio_params_t and call CMD_AUDIO_SET_RESET to reinitialize the audio codec:
audio_params.sample_rate = ASR_48KHZ;
mm_module_ctrl(audio_ctx, CMD_AUDIO_SET_PARAMS, (int)&audio_params);
mm_module_ctrl(audio_ctx, CMD_AUDIO_SET_RESET, 0); // reinitialize
mm_module_ctrl(audio_ctx, CMD_AUDIO_SET_TRX, 1); // restart TX+RX
Common customer requests
| Request | Supported timing | What to do |
|---|---|---|
| Change sample rate | Init-time only | Set |
| Change mic type | Init-time only | Set |
| Tune mic volume | Runtime modify supported | Use |
| Tune speaker volume | Runtime modify supported | Use |
| Mute / unmute mic | Runtime operation supported | Use |
| Mute / unmute speaker | Runtime operation supported | Use |
| Start / stop capture | Runtime operation supported | Use |
| Start / stop playback | Runtime operation supported | Use |
| Enable / disable AEC | Runtime operation supported | Use |
| Change AEC level | Runtime modify supported | Use |
| Enable / disable NS | Runtime operation supported | Use |
| Enable / disable AGC | Runtime operation supported | Use |
| Update EQ coefficients | Runtime modify supported | Update the EQ fields in |
| Change HPF cutoff | Init-time only | Set |
CMD_AUDIO_* Parameter Reference
The tables below map every audio parameter to its configuration timing and the exact AP-side command to use.
Init-time only — configure in audio_params_t before CMD_AUDIO_APPLY. Changing these at runtime requires calling CMD_AUDIO_SET_RESET.
| Parameter | | Description |
|---|---|---|
| Sample rate | | |
| Word length | | |
| Channel count | | 1 (mono) or 2 (stereo) |
| Mic type | | |
| AMIC analog gain | | |
| DMIC left boost | | |
| DMIC right boost | | Same options as left DMIC |
| HPF cutoff | | 0–7. fc ≈ 5e-3 / (hpf_fc + 1) × fs |
| Mix mode | | 0 = single input, 1 = mix up to 4 inputs |
| Recording path | | 0 = disabled, 1 = enabled |
| A/V sync | | 0 = disabled, 1 = enabled (for MP4 / RTSP) |
| Mic EQ (5 biquads) | | Per-biquad enable + coefficients (see EQ Setting) |
| Speaker EQ (5 biquads) | | Per-biquad enable + coefficients (see EQ Setting) |
Init + Runtime — may be set in audio_params_t before CMD_AUDIO_APPLY as the initial value, and updated after stream start using the listed command.
| Init field | Runtime CMD | API function | Description |
|---|---|---|---|
| | | | ADC digital volume (−17.625 to 30 dB) |
| | | | DAC digital volume (−65.625 to 0 dB) |
| RX ASP params | | - | Set RX ASP configuration (all AEC/AGC/NS settings) |
| TX ASP params | | - | Set TX ASP configuration (all AGC/NS settings) |
| Mic EQ | | | Reapply mic EQ from current |
| Speaker EQ | | | Reapply speaker EQ from current |
Runtime only — available only after CMD_AUDIO_APPLY. These have no init-time equivalent in audio_params_t.
| CMD | Arg / usage | Description |
|---|---|---|
| | 0 = mute, 1 = unmute | Mute/unmute microphone input (zeros the data path) |
| | 0 = mute, 1 = unmute | Mute/unmute speaker output (zeros the data path) |
| | 0 = stop, 1 = start | Start / stop TX (playback) without affecting RX |
| | 0 = stop, 1 = start | Start / stop RX (capture) without affecting TX |
| | 0 = stop, 1 = start | Start / stop both TX and RX simultaneously |
| | 0 = disable, 1 = enable | Dynamically enable / disable AEC processing |
| | 0 = disable, 1–3 = aggressiveness | Dynamically enable / disable NS processing |
| | 0 = disable, 1–3 = aggressiveness | Dynamically enable / disable AGC processing |
| | 0 = disable, 1 = enable | Dynamically enable / disable voice activity detection |
| | 1–50 | Set AEC cancellation strength (higher = more aggressive) |
| | 0 = disable, 1 = enable | Enable / disable AEC init at next reset |
| | 0–3 | Enable / disable NS init at next reset |
| | 0–3 | Enable / disable AGC init at next reset |
| | AEC mode value | Set AEC mode (NEWAEC only) |
| | 0–3 | Set audio debug log level (0 = none, 1 = all, 2 = warn+err, 3 = err only) |
| | 0 | Reinitialize audio codec with current |
| | ASR constant | Update sample rate field in params (requires |
| | Returns int (via pointer) | Get audio frame duration in milliseconds |
| | Returns uint32_t (via pointer) | Get timestamp of first data frame (useful for A/V sync) |
| | 0 | Force deinitialize audio codec |
Audio Codec
Audio encoding and decoding are provided by standalone MMF modules documented in the MMF Development Guide:
- aac_module: AAC encoder (PCM to AAC bitstream)
- aad_module: AAC decoder (AAC bitstream to PCM)
Bitrate Selection Guide
| Sample Rate | Bitrate | Use Case |
|---|---|---|
| 8 kHz | 16 kbps | Voice, narrowband |
| 16 kHz | 24–32 kbps | Voice, wideband |
| 24 kHz | 48 kbps | Good quality audio |
| 48 kHz | 64–128 kbps | High quality stereo |
Note
Higher bitrates provide better quality but require more bandwidth. For real-time streaming applications, consider network conditions when selecting bitrate.
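The bitrate guide can be encoded as a small lookup so that an application starts from the suggested range for its sample rate. This is a sketch only: the type and function names are hypothetical, and the values are copied from the guide above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t sample_rate_hz;
    uint32_t min_kbps;
    uint32_t max_kbps;
} bitrate_hint_t;

/* Rows copied from the Bitrate Selection Guide. */
static const bitrate_hint_t k_bitrate_hints[] = {
    {  8000, 16,  16 },
    { 16000, 24,  32 },
    { 24000, 48,  48 },
    { 48000, 64, 128 },
};

/* Returns the row for an exact sample-rate match, or NULL if none exists. */
const bitrate_hint_t *bitrate_hint_for(uint32_t sample_rate_hz)
{
    size_t count = sizeof(k_bitrate_hints) / sizeof(k_bitrate_hints[0]);
    for (size_t i = 0; i < count; i++) {
        if (k_bitrate_hints[i].sample_rate_hz == sample_rate_hz) {
            return &k_bitrate_hints[i];
        }
    }
    return NULL;
}
```

Sample rates that the guide does not list return NULL, which leaves the choice to the caller instead of guessing a value.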
Audio Optimization
This section describes the software and hardware optimization solutions for audio.
Audio Setting
Gain Setting
Analog Microphone Gain Setting
The analog microphone input gain is split into two stages: analog gain and digital gain.
Analog Mic Gain
The analog gain supports 0, 20, 30, and 40 dB.
Call audio_mic_analog_gain, or set the mic_gain parameter of the audio module, to configure it.
ADC Gain - ADC Volume
The ADC gain sets the input (analog-to-digital) gain.
The range is -17.625 dB (0x00) to 30 dB (0x7F).
Call audio_adc_digital_vol, or send CMD_AUDIO_SET_ADC_GAIN to the audio module, to set it.
A digital gain is also provided for the audio output. Set a reasonable value through the DAC volume to obtain the desired output level. In most cases a gain of 0 dB (0xAF) gives an output amplitude that meets the board's output volume requirements. Setting the output gain too high causes the sound to break up.
Too much analog gain degrades the sound quality and makes noise clearly audible.
Recommendation: Configure the digital gain first. Raise the analog gain only when more gain is needed and the digital gain is already at its maximum.
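The recommendation can be expressed as a small planning routine: cover the requested gain with digital gain where possible, and step up the analog gain only when the digital range is exhausted. The sketch below assumes that the analog and digital gains simply add in dB, which the text does not state. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ADC_MIN_DB (-17.625)  /* ADC digital volume at 0x00 */
#define ADC_MAX_DB 30.0       /* ADC digital volume at 0x7F */

typedef struct {
    int    analog_db;   /* one of 0, 20, 30, 40 */
    double digital_db;  /* within [ADC_MIN_DB, ADC_MAX_DB] */
} gain_plan_t;

/* Pick the smallest analog step whose remainder fits in the digital range. */
gain_plan_t plan_mic_gain(double target_db)
{
    static const int analog_steps[] = { 0, 20, 30, 40 };
    size_t count = sizeof(analog_steps) / sizeof(analog_steps[0]);
    gain_plan_t plan = { 40, ADC_MAX_DB };  /* fallback: both at maximum */

    for (size_t i = 0; i < count; i++) {
        double digital = target_db - analog_steps[i];
        if (digital <= ADC_MAX_DB) {
            plan.analog_db = analog_steps[i];
            plan.digital_db = digital < ADC_MIN_DB ? ADC_MIN_DB : digital;
            break;
        }
    }
    return plan;
}
```

With these assumptions, a 25 dB request is met with digital gain alone, while a 45 dB request uses the 20 dB analog step plus 25 dB of digital gain.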
Digital Microphone Gain Setting
| Microphone | Description |
|---|---|
| Left Mic Gain | Left DMIC gain supports 0, 12, 24, 36 dB for gain optimization. Use |
| Right Mic Gain | Right DMIC gain supports 0, 12, 24, 36 dB for gain optimization. Use |
DAC Gain - DAC Volume
The DAC gain sets the output (digital-to-analog) gain.
The range is -65.625 dB (0x00) to 0 dB (0xAF).
Call audio_dac_digital_vol, or send CMD_AUDIO_SET_DAC_GAIN to the audio module, to set it.
HPF Setting
A high-pass filter is provided to remove low-frequency noise, such as noise coming from the DC power supply. The default value of 0 is recommended. For other filter types, see EQ Setting.
Here is the function:
void audio_adc_l_hpf(audio_t *obj, BOOL en, audio_hpf_fc hpf_fc);
The parameters mean:
| Parameter | Description |
|---|---|
| obj | Audio object defined in application software |
| en | Enable the high pass filter or not |
| hpf_fc | Set the cutoff frequency, value is 0~7; fc ~= 5e-3 / (hpf_fc + 1) * fs |
EQ Setting
Five biquad filters are available on each of three paths: the left digital mic (or analog mic), the right digital mic, and the audio output.
Through register settings, each biquad filter can be configured as a high-pass, low-pass, band-pass, notch, peak, low-shelf, or high-shelf filter.
The following steps show how to use the EQ:
Select the Biquad Filter
First choose the filter type, sample rate, cutoff frequency, Q value, and gain with the following online calculator:
https://www.earlevel.com/main/2021/09/02/biquad-calculator-v3/
Get the Register Values
Use AmebaPro3_EQ_tool.exe to generate the register settings. For example, for a high-pass filter with a cutoff frequency of 200 Hz and a Q value of 0.707, enter these settings in the tool to obtain the register values (0x1e45618, 0x1c000000, 0x2000000, 0x3c72d61, 0x1e35d500).
Set the Register Value
After obtaining the register values, apply the filter setting with the following functions:
void audio_input_l_eq(audio_t *obj, audio_eq eq, BOOL en, u32 h0, u32 b0, u32 b1, u32 a0, u32 a1);
void audio_input_r_eq(audio_t *obj, audio_eq eq, BOOL en, u32 h0, u32 b0, u32 b1, u32 a0, u32 a1);
void audio_output_l_eq(audio_t *obj, audio_eq eq, BOOL en, u32 h0, u32 b0, u32 b1, u32 a0, u32 a1);
Here are the parameters:
| Parameter | Description |
|---|---|
| obj | Audio object defined in application software |
| eq | Select the EQ number, can be 0~4 |
| en | Enable the EQ filter or not |
| h0, b0, b1, a0, a1 | The register values obtained from AmebaPro3_EQ_tool.exe |
Users can also set the EQ parameters in audio_params_t when using MMF settings.
Other Settings
| Command | Description |
|---|---|
| CMD_AUDIO_SET_RESET | Re-initialize audio setting and ASP algorithms. Reset audio when configuration changes |
| CMD_AUDIO_SET_SAMPLERATE | Set sample rate. Reset required to apply |
| CMD_AUDIO_SET_TRX | Stop/start TX and RX without re-initializing |
| CMD_AUDIO_SET_MIC_ENABLE | Mute/unmute microphone input (sets data to 0) |
| CMD_AUDIO_SET_SPK_ENABLE | Mute/unmute speaker output (sets data to 0) |
Note
When using an audio codec, make sure the sample rate of the audio module matches the sample rate used by the codec.
Audio ASP Algorithm
The following table lists common audio problems, the ASP algorithm used to address each one, and typical cases.
| Situation | Algorithm | Cases |
|---|---|---|
| Distortion | AGC | |
| Low audio volume | AGC | |
| Echo or howling | AEC | |
| Intermittent voice | AEC, NS | |
| Noise floor | NS | |
| Mechanical sound | Network, Device | |
Note
Audio signal processing (ASP) operates on the digital audio signal. If the signal is already distorted before it reaches ASP, the algorithms cannot be expected to produce the desired result.
Enable ASP Algorithm
To use the ASP algorithms, enable the ASP library in the build configuration.
Enable ENABLE_ASP in module_audio.h, then use the 3A algorithms (AGC: automatic gain control; ANS: adaptive noise suppression; AEC: acoustic echo cancellation) to improve audio quality.
Note
The sample_rate and mic_gain parameters, and the initialization of NS, AEC, AGC, and the other algorithms, take effect at CMD_AUDIO_APPLY and CMD_AUDIO_SET_RESET.
To enable the ASP functions, use the following parameters in ASP.h:
// =================== Open ASP algorithm (ASP.h) ================
typedef struct CTNS_cfg_s {
    int16_t NS_EN;
    int NSLevel;
    int16_t HPFEnable;
    int16_t QuickConvergenceEnable;
    int16_t Reserve1;
} CTNS_cfg_t;

typedef struct CTAGC_cfg_s {
    int16_t AGC_EN;
    CT_AGC_MODE AGCMode;
    int16_t ReferenceLvl;
    int16_t RatioFormat; // Ratio format: 0 => integer, range 1~50, 1 => 8.8 fix point, range 26~50*256 (mapping 26/256~50)
    int16_t AttackTime;
    int16_t ReleaseTime;
    int16_t Ratio[3];
    int16_t Threshold[3]; // Threshold1, Threshold2, NoiseGateLvl
    int16_t KneeWidth;
    int16_t NoiseFloorAdaptEnable;
    int16_t RMSDetectorEnable;
    int16_t MaxGainLimit;
} CTAGC_cfg_t;

typedef struct CTAEC_cfg_s {
    int16_t AEC_EN;
    int16_t EchoTailLen;
    int16_t CNGEnable;
    int16_t PPLevel;
    int16_t DTControl;
    int16_t ConvergenceTime;
    int16_t Reserve1;
} CTAEC_cfg_t;

typedef struct VQE_SND_STATE_s {
    int16_t DoA;           // in degrees
    int16_t ERLE;          // in dB
    int16_t SinLvldB;      // in dBFs
    int16_t SoutLvldB;     // in dBFs after AGC (if AGC is enabled)
    int16_t DTState;       // 0 = single talk or 1 = double talk
    int16_t HCDetectState; // 1 = detected, 0 = not detected
    uint8_t AECRun;
    uint8_t AGCRun;
    uint8_t NSRun;
    uint8_t BFRun;
    uint8_t Reserve1;
    uint8_t Reserve2;
    uint8_t Reserve3;
    uint8_t Reserve4;
} VQE_SND_STATE_t;

typedef struct VQE_RCV_STATE_s {
    int16_t RinLvldB;
    int16_t RoutLvldB;
    int16_t HCDetectState; // 1 = detected, 0 = not detected
    uint8_t AGCRun;
    uint8_t NSRun;
    uint8_t Reserve1;
    uint8_t Reserve2;
    uint8_t Reserve3;
    uint8_t Reserve4;
} VQE_RCV_STATE_t;
Parameter Details
CTAEC_cfg_t (AEC Configuration)
| Parameter | Description |
|---|---|
| AEC_EN | Enable the AEC module in AEC process |
| EchoTailLen | Buffer length for echo cancel; higher values increase CPU usage. Suggest 64 for 16KHz, 128 for 8KHz. Support 32/64/128 |
| CNGEnable | Enable comfort noise generation (0 or 1) |
| PPLevel | AEC fine tune; higher = more aggressive echo cancel (may cancel more local). Support 1~50 |
| DTControl | AEC coarse tune: 1 (allow some residual), 2 (attenuate up to 6dB local), 3 (attenuate up to 9dB local) |
| ConvergenceTime | AEC initialization convergence time in msec, support 100~1000 |
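Because each AEC field has a documented range, a configuration can be sanity-checked before it is passed to the audio module. The sketch below uses a local copy of the structure layout so that it compiles on its own. The type name `aec_cfg_sketch_t` and the function `aec_cfg_in_range` are hypothetical, and only the ranges listed in the table are checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local mirror of CTAEC_cfg_t, used only for this sketch. */
typedef struct {
    int16_t AEC_EN;
    int16_t EchoTailLen;
    int16_t CNGEnable;
    int16_t PPLevel;
    int16_t DTControl;
    int16_t ConvergenceTime;
    int16_t Reserve1;
} aec_cfg_sketch_t;

/* True when every field lies inside the range given in the table. */
bool aec_cfg_in_range(const aec_cfg_sketch_t *c)
{
    bool tail_ok = c->EchoTailLen == 32 || c->EchoTailLen == 64 ||
                   c->EchoTailLen == 128;
    return tail_ok
        && (c->CNGEnable == 0 || c->CNGEnable == 1)
        && c->PPLevel >= 1 && c->PPLevel <= 50
        && c->DTControl >= 1 && c->DTControl <= 3
        && c->ConvergenceTime >= 100 && c->ConvergenceTime <= 1000;
}
```

A check like this catches typos such as an echo tail length of 48 before they reach the ASP library, whose behavior for out-of-range values is not described here.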
CTAGC_cfg_t (AGC Configuration)
| Parameter | Description |
|---|---|
| AGC_EN | Enable the AGC module in the AGC process |
| AGCMode | AGC mode: 0 (CT_ALC), 1 (CT_LIMITER) |
| ReferenceLvl | Output target reference level (dBFS), support 0,1,…,30 (0,-1,…,-30dBFs) |
| RatioFormat | Ratio format: 0 = integer (1~50), 1 = 8.8 fix point (26~50*256) |
| AttackTime | Signal amplitude compression transition time (ms), support 1~500 |
| ReleaseTime | Signal amplitude boost transition time (ms), support 1~500 |
| Ratio[3] | Three ratios for adjusting AGC gain curve |
| Threshold[3] | Three thresholds: Threshold1, Threshold2 (0~81), NoiseGateLvl (50~90) |
| KneeWidth | AGC gain curve soft knee width, support 0~10 |
| NoiseFloorAdaptEnable | Enable noise detect on AGC (ignore background noise), 0 or 1 |
| RMSDetectorEnable | 0 = peak detection, 1 = RMS detection |
| MaxGainLimit | Maximum gain in dB for AGC, support 6,12,18,24,30 |
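The AGC ranges interact: the valid range of Ratio depends on RatioFormat. A range check makes that dependency explicit. This is a sketch under stated assumptions: the structure is a hypothetical local mirror with AGCMode stored as a plain integer, the 0~81 range is read as applying to both Threshold1 and Threshold2, and the function names are not part of the SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local mirror of CTAGC_cfg_t (AGCMode kept as int16_t). */
typedef struct {
    int16_t AGC_EN;
    int16_t AGCMode;       /* 0 = CT_ALC, 1 = CT_LIMITER */
    int16_t ReferenceLvl;
    int16_t RatioFormat;
    int16_t AttackTime;
    int16_t ReleaseTime;
    int16_t Ratio[3];
    int16_t Threshold[3];  /* Threshold1, Threshold2, NoiseGateLvl */
    int16_t KneeWidth;
    int16_t NoiseFloorAdaptEnable;
    int16_t RMSDetectorEnable;
    int16_t MaxGainLimit;
} agc_cfg_sketch_t;

static bool in_range(int v, int lo, int hi)
{
    return v >= lo && v <= hi;
}

/* True when every field lies inside the range given in the table. */
bool agc_cfg_in_range(const agc_cfg_sketch_t *c)
{
    /* Integer format: 1..50. 8.8 fixed-point format: 26..50*256. */
    int ratio_lo = c->RatioFormat == 0 ? 1 : 26;
    int ratio_hi = c->RatioFormat == 0 ? 50 : 50 * 256;
    for (int i = 0; i < 3; i++) {
        if (!in_range(c->Ratio[i], ratio_lo, ratio_hi)) {
            return false;
        }
    }
    /* MaxGainLimit must be one of 6, 12, 18, 24, 30. */
    bool max_gain_ok = in_range(c->MaxGainLimit, 6, 30)
                       && c->MaxGainLimit % 6 == 0;
    return in_range(c->AGCMode, 0, 1)
        && in_range(c->ReferenceLvl, 0, 30)
        && in_range(c->RatioFormat, 0, 1)
        && in_range(c->AttackTime, 1, 500)
        && in_range(c->ReleaseTime, 1, 500)
        && in_range(c->Threshold[0], 0, 81)
        && in_range(c->Threshold[1], 0, 81)
        && in_range(c->Threshold[2], 50, 90)
        && in_range(c->KneeWidth, 0, 10)
        && in_range(c->NoiseFloorAdaptEnable, 0, 1)
        && in_range(c->RMSDetectorEnable, 0, 1)
        && max_gain_ok;
}
```

In the 8.8 fixed-point format a ratio of 1.0 is stored as 256, so the same ratio passes the check as 1 when RatioFormat is 0 and as 256 when RatioFormat is 1.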
CTNS_cfg_t (NS Configuration)
| Parameter | Description |
|---|---|
| NS_EN | Enable the NS module in the NS process |
| NSLevel | NS aggressiveness in dB (higher = more aggressive), support 3~35 |
| HPFEnable | Enable HPF before NS, 0 or 1 |
| QuickConvergenceEnable | NS convergence speed: 1 = immediately suppress (quick), 0 = smooth suppress |
ASP Algorithm Usage
Here are the configurations for ASP algorithm:
| Configuration | Description |
|---|---|
| Supported Sample Rates | 8K and 16K audio |
| Default Settings | Defined in module_audio.c as |
| Get RX ASP Parameters | Use |
| Get TX ASP Parameters | Use |
| Set RX ASP Parameters | Use |
| Set TX ASP Parameters | Use |
| AGC Enable (RX/TX) | Set |
| NS Enable (RX/TX) | Set |
| AEC Enable (RX) | Set |
AEC Setting
The AEC algorithm includes three parts: delay adjustment strategy, linear echo estimation, and nonlinear echo suppression.
| Command | Description |
|---|---|
| CMD_AUDIO_RUN_AEC | Dynamically switch AEC_process() usage |
| CMD_AUDIO_SET_AEC_ENABLE | Enable/disable AEC_init() during audio reset |
| CMD_AUDIO_SET_AEC_LEVEL | Set echo cancellation strength |
NS Setting
The NS algorithm reduces noise and ambient sound, so it is recommended to run it before the other ASP algorithms.
| Command | Description |
|---|---|
| CMD_AUDIO_SET_NS_ENABLE | Enable/disable NSx_init() during audio reset |
| CMD_AUDIO_RUN_NS | Dynamically switch NSx_process() usage |
AGC Setting
The AGC algorithm evens out the volume of the audio stream.
| Command | Description |
|---|---|
| CMD_AUDIO_SET_AGC_ENABLE | Enable/disable AGC_init() during audio reset |
| CMD_AUDIO_RUN_AGC | Dynamically switch AGC_process() usage |
For media example usage, see Media Example.