Audio IC Feature Support Table

IC features table
Function	RTL8721Dx	RTL8720E	RTL8726E	RTL8713E	RTL8730E	RTL8721F
I2S	Y	Y	Y	Y	Y	Y
AMIC	N	N	Y	Y	Y	N
DMIC	Y	Y	Y	Y	Y	Y
VAD	N	N	N	N	Y	N
PDM	N	Y	Y	Y	Y	N
LINEOUT	N	N	Y	Y	Y	N
HPO	N	N	N	N	Y	N

Introduction

The overall audio block diagram is illustrated below.

Audio block consists of two parts:

SPORT: 2x
- SPORT0 is for internal digital microphone interface and external I2S interface
- SPORT1 is only for external I2S interface.
DMIC

../../../rst_um/peripherals/8_audio/figures/audio_block_diagram_dplus.svg

Audio block consists of two parts:

SPORT: 1x
- SPORT0 is for internal digital microphone interface and external I2S interface.
DMIC: 2x

../../../rst_um/peripherals/8_audio/figures/audio_block_diagram_green2.svg

Note

SPORT DIN[3:0] and DOUT[3:0] are multiplexed to connect to DIO[3:0] correspondingly.
When connected to external I2S devices, SPORT0 can act as either master or slave.

SPORT

SPORT Data Path

The data paths of SPORT0, SPORT1, SPORT2, and SPORT3 are shown below.

../../../rst_um/peripherals/8_audio/figures/sport0_data_path.svg — SPORT0 data path

../../../rst_um/peripherals/8_audio/figures/sport1_data_path.svg — SPORT1 data path

../../../rst_um/peripherals/8_audio/figures/sport2_data_path.svg — SPORT2 data path

../../../rst_um/peripherals/8_audio/figures/sport3_data_path.svg — SPORT3 data path

SPORT0 and SPORT1 are supported.

SPORT Function

SPORT Feature

General features
- Supports up to 8-channel I2S transmitter
- Supports 16/20/24/32 bits data length
- Supports 16/20/24/32 bits channel length
- Works in master and slave mode.
- Supports sampling rate up to 192kHz
- Supports Multi-IO mode, fs up to 192kHz
Note
- For RTL8730E, SPORT0/1 do not support Multi-IO mode.
General functions
- SPORT fs counter and phase counter under BCLK
  When using the phase counter, rx_bclk_div_ratio/tx_bclk_div_ratio should be configured to 63.
  
  The fs counter counts the number of LRCLK cycles.
  
  By default, starting from each falling edge of LRCLK, the phase counter increments once every two BCLK cycles. On the next falling edge of LRCLK, the phase counter is reset to 0 and starts counting again. In this case, the maximum value of the phase counter is 31.
  
  The phase counter can also increment once per BCLK cycle, in which case its maximum value is 63.
  
  The SPORT fs counter and phase counter are shown below:
- SPORT direct mode feature:
  Transfers data directly between SPORTs, without involving the CPU or DMA.
  
  When two SPORTs work in direct mode, their clocks must run at the same frequency.
- SPORT FIFO:
  TX_FIFO_0 and TX_FIFO_1 are two asynchronous ping-pong FIFOs. Each FIFO is 32 entries deep and 32 bits wide, so the two FIFOs together hold 2*32*4 bytes = 256 bytes (64 words).
  
  RX_FIFO_0 and RX_FIFO_1 are two asynchronous ping-pong FIFOs. Each FIFO is 32 entries deep and 32 bits wide, so the two FIFOs together hold 2*32*4 bytes = 256 bytes (64 words).
  
  For 6-channel or 8-channel data transmission, both FIFO0 and FIFO1 are used. The two FIFOs issue requests at the same time, so four SPORTs can produce 16 requests at the same time.
- WIFI TSF latch SPORT counter
  
  The hardware latches the SPORT counter when it detects a change in the specified TSFT bit; software selects which bit. The latch period is selectable: 1.024/ 2.048/ 4.096 ……/ 131.072.
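
The phase counter limits under BCLK quoted above (31 and 63) can be checked with a small calculation. The sketch below is illustrative only: it assumes that a divider setting of 63 means 64 BCLK cycles per LRCLK period (ratio + 1), and the function name is hypothetical rather than part of any SDK.

```c
#include <stdint.h>

/* Hypothetical helper: largest value the phase counter reaches before the
 * next falling edge of LRCLK resets it to 0.
 *
 * bclk_div_ratio: value written to rx_bclk_div_ratio / tx_bclk_div_ratio.
 *                 ASSUMPTION: the LRCLK period is (ratio + 1) BCLK cycles.
 * bclk_per_count: 2 = one increment every two BCLK cycles (default),
 *                 1 = one increment every BCLK cycle. */
uint32_t sport_phase_counter_max(uint32_t bclk_div_ratio,
                                 uint32_t bclk_per_count)
{
    uint32_t bclk_per_frame = bclk_div_ratio + 1u;

    if (bclk_per_count == 0u || bclk_per_frame < bclk_per_count) {
        return 0u; /* invalid arguments: avoid division by zero / underflow */
    }
    /* The counter starts at 0, so the maximum is (number of counts - 1). */
    return bclk_per_frame / bclk_per_count - 1u;
}
```

With the divider set to 63, the default counting mode gives a maximum of 31 and the one-BCLK mode gives 63, matching the values in the text.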

SPORT0 is for internal digital microphone interface and external I2S interface
SPORT1 is only for external I2S interface.
LRCLK start and stop detect is not supported.
SPORT fs counter and phase counter under SPORT CLK are not supported.
WIFI TSF is supported, but WIFI TSF start SPORT is not.

SPORT0/1 are for the internal audio codec and do not support Multi-IO mode.
SPORT2/3 are for the external I2S interface.
LRCLK start and stop detect is not supported.
SPORT fs counter and phase counter under SPORT CLK are not supported.
Only CHIP_DCUT supports WIFI TSF; WIFI TSF start SPORT is not supported.

SPORT0 is for internal digital microphone interface and external I2S interface.
SPORT direct mode is not supported.
The following additional features are supported:
- SPORT fs counter and phase counter under SPORT CLK
  SPORT CLK is 98.304MHz or 45.1584MHz. The phase counter starts on a rising or falling edge of LRCLK and increments once per SPORT CLK cycle, so with a 98.304MHz SPORT CLK its resolution is about 10ns.
  
  The phase counter can start on the falling edge of LRCLK, after which it increments by one on each SPORT CLK cycle. On the next falling edge of LRCLK, the phase counter is reset to 0 and starts counting again.
  
  The phase counter can also start on the rising edge of LRCLK; the rest of the behavior is the same as described above.
  
  In master mode, LRCLK is divided down from SPORT CLK. In slave mode, LRCLK is supplied by the master.
  
  SPORT fs counter and phase counter under SPORT CLK is shown below:
  
  Note
  
  N = SPORT CLK/LRCLK
- LRCLK start and stop detect: in slave mode, SPORT can use SPORT clock to monitor the start and stop of LRCLK.
  Start condition: detect the rising edge or falling edge (default) of LRCLK. The implementation steps are as follows:
  
  Stop condition: the phase counter is accumulated to a settable threshold. The implementation steps are as follows:
- WIFI TSF start SPORT
  The MAC sends an interrupt to the audio block, and the hardware automatically starts playback on the audio side. The hardware delay is tens of nanoseconds. Software does not need to wait for the MAC interrupt and then set the following two bits itself:
  
  ((u32)SP0_CTRL0) &= ~SP_BIT_TX_DISABLE;
  ((u32)SP0_CTRL0) |= SP_BIT_START_TX;
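
The relation N = SPORT CLK/LRCLK and the roughly 10ns resolution quoted for the phase counter under SPORT CLK can be checked with a short calculation. The helper names below are hypothetical and only illustrate the arithmetic; they do not touch any hardware register.

```c
#include <stdint.h>

/* N = SPORT CLK / LRCLK: number of SPORT CLK cycles, and therefore phase
 * counter increments, in one LRCLK period. Returns 0 for lrclk_hz == 0. */
uint32_t sport_clk_counts_per_lrclk(uint32_t sport_clk_hz, uint32_t lrclk_hz)
{
    return (lrclk_hz == 0u) ? 0u : sport_clk_hz / lrclk_hz;
}

/* Duration of one phase counter increment in picoseconds (truncated).
 * Returns 0 for sport_clk_hz == 0. */
uint32_t sport_clk_period_ps(uint32_t sport_clk_hz)
{
    if (sport_clk_hz == 0u) {
        return 0u;
    }
    return (uint32_t)(1000000000000ULL / sport_clk_hz);
}
```

For SPORT CLK = 98.304MHz and LRCLK = 48kHz this gives N = 2048 and a step of about 10.17ns; for 45.1584MHz and 44.1kHz it gives N = 1024.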

I2S Signal Introduction

The I2S bus has three lines:

Continuous serial clock (SCK/BCLK)
- One SCK pulse generates a data bit
- Master generates SCK
Word select (WS/LRCLK)
The word select line indicates the channel being transmitted:
WS = 0: channel 1 (left)

WS = 1: channel 2 (right)
Changes one clock period before the MSB is transmitted
Serial data (SD)
SD is transmitted in two's complement with the MSB first. The MSB has a fixed position, whereas the position of the LSB depends on the word length.

When the system word length is greater than the transmitter word length, the word is truncated (the least significant data bits are set to 0).
If the receiver is sent more bits than its word length, the bits after the LSB are ignored.

If the receiver is sent fewer bits than its word length, the missing bits are set to zero internally.
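
The receiver-side rules above follow from the MSB having a fixed position. The sketch below models them on raw bit patterns; the function name and the use of unsigned words (to avoid shifting signed values) are choices made for this illustration, not part of the SPORT driver.

```c
#include <stdint.h>

/* Model of how a receiver with rx_bits of word length sees a word that was
 * sent MSB first with tx_bits of word length (both in the range 1..32).
 * The word is handled as a raw two's-complement bit pattern.
 *  - rx_bits < tx_bits: the bits after the receiver's LSB are ignored.
 *  - rx_bits > tx_bits: the missing low bits are set to zero. */
uint32_t i2s_receive_word(uint32_t tx_word, unsigned tx_bits, unsigned rx_bits)
{
    uint32_t rx_mask;

    if (tx_bits == 0u || tx_bits > 32u || rx_bits == 0u || rx_bits > 32u) {
        return 0u; /* unsupported word length */
    }
    rx_mask = (rx_bits == 32u) ? 0xFFFFFFFFu : ((1u << rx_bits) - 1u);

    if (rx_bits <= tx_bits) {
        /* Keep the top rx_bits; the shift is at most 31. */
        return (tx_word >> (tx_bits - rx_bits)) & rx_mask;
    }
    /* Align the MSB and zero-fill the new low bits; the shift is at most 31. */
    return (tx_word << (rx_bits - tx_bits)) & rx_mask;
}
```

For example, the 24-bit word 0xABCDEF is received as 0xABCD by a 16-bit receiver, and the 16-bit word 0xABCD is received as 0xABCD00 by a 24-bit receiver.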

I2S Data Format

The I2S interface supports I2S (Philips) format, Left-justified (MSB) format, Right-justified (LSB) format, PCM, and TDM mode. Software can select any mode by setting the I2S control register. The following figures show the I2S data format.

I2S format

Note

Typically, fs = 8/16/32/44.1/48/88.2/96/192kHz

Channel length: 16/20/24/32 bits (N+1)

SCK: an arbitrary number of cycles within 1/fs, but >= 2*(N+1)*fs and <= 256*fs

Left-justified format

Note

Typically, fs = 8/16/32/44.1/48/88.2/96/192kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 2*(N+1)*fs, <= 256 *fs

PCM mode A

Note

Typically, fs = 8/16/32/44.1/48/88.2/96/192kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 2*(N+1)*fs, <= 256 * fs

PCM mode B

Note

Typically, fs = 8/16/32/44.1/48/88.2/96/192kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 2*(N+1)*fs, <= 256 * fs

I2S TDM 8 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 8*(N+1)*fs, <= 256 * fs

Left-justified TDM 8 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 8*(N+1)*fs, <= 256 * fs

PCM mode A in TDM 8 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 8*(N+1)*fs, <= 256 * fs

PCM mode B in TDM 8 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 8*(N+1)*fs, <= 256 *fs

I2S TDM 6 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 6*(N+1)*fs, <= 256 * fs

Left-justified TDM 6 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 6*(N+1)*fs, <= 256 * fs

PCM mode A in TDM 6 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 6*(N+1)*fs, <= 256 * fs

PCM mode B in TDM 6 mode

Note

Typically, fs = 8/16/32/44.1/48kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 6*(N+1)*fs, <= 256 * fs

I2S TDM 4 mode

Note

Typically, fs = 8/16/32/44.1/48/88.2/96kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 4*(N+1)*fs, <= 256 * fs

Left-justified TDM 4 mode

Note

Typically, fs = 8/16/32/44.1/48/88.2/96kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 4*(N+1)*fs, <= 256 * fs

PCM mode A in TDM 4 mode

Note

Typically, fs = 8/16/32/44.1/48/88.2/96kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 4*(N+1)*fs, <= 256 * fs

PCM mode B in TDM 4 mode

Note

Typically, fs = 8/16/32/44.1/48/88.2/96kHz

Channel length: 16/20/24/32 bits (N+1)

SCK >= 4*(N+1)*fs, <= 256 * fs
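
The SCK limits in the notes above all follow the same pattern: a lower bound of (number of channel slots) * (channel length) * fs and an upper bound of 256 * fs. The sketch below evaluates them; the function names are hypothetical, and "slots" is this example's term for the factor 2, 4, 6 or 8 in the formulas.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower SCK bound in Hz.
 * slots: 2 for the non-TDM formats, 4/6/8 for TDM 4/6/8 mode.
 * channel_len_bits: channel length (N+1), i.e. 16/20/24/32. */
uint32_t i2s_sck_min_hz(uint32_t slots, uint32_t channel_len_bits,
                        uint32_t fs_hz)
{
    return slots * channel_len_bits * fs_hz;
}

/* Upper SCK bound in Hz: 256 * fs for every format listed above. */
uint32_t i2s_sck_max_hz(uint32_t fs_hz)
{
    return 256u * fs_hz;
}

/* True if sck_hz lies within [min, max] for the given configuration. */
bool i2s_sck_in_range(uint32_t sck_hz, uint32_t slots,
                      uint32_t channel_len_bits, uint32_t fs_hz)
{
    return sck_hz >= i2s_sck_min_hz(slots, channel_len_bits, fs_hz) &&
           sck_hz <= i2s_sck_max_hz(fs_hz);
}
```

Note that for TDM 8 mode with 32-bit channels the lower and upper bounds coincide (8 * 32 * fs = 256 * fs), so SCK must be exactly 256 * fs in that configuration.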

I2S supports channel lengths of 16/20/24/32 bits. The relationship between audio data length and channel length is illustrated below.

../../../rst_um/peripherals/8_audio/figures/i2s_data_length_and_channel_length.png

SPORT Parameters

SPORT parameters
Parameter	Supported options
Interface/Format	I2S, Left-justified, PCM Mode A (Short Frame Sync), PCM Mode B (Short Frame Sync)
Sampling rate	192kHz, 96kHz, 88.2kHz, 48kHz, 44.1kHz, 32kHz, 16kHz, 8kHz
Audio bits	16 bits, 20 bits, 24 bits, 32 bits
Channel	Stereo, Mono
Channel length	16 bits, 20 bits, 24 bits, 32 bits
BCLK polarity	BCLK, BCLK inverse
Serial data	MSB first, LSB first
Mode	Master, Slave

I2S PINMUX

The I2S data pins DIO[3:0] can each be used as an output or an input. The direction is set in the I2S part of the PAD register. The I2S data pinmux is shown below.

../../../rst_um/peripherals/8_audio/figures/i2s_data_pinmux.svg

The data PIN DIN[3:0] of I2S is used as input.
The data PIN DOUT[3:0] of I2S is used as output.

Audio Codec

General Description

The digital microphone (DMIC) interface connects digital microphones and supports 2-channel digital microphone recording. The following figure shows the detailed block diagram of the digital microphone interface.

../../../rst_um/peripherals/8_audio/figures/audio_recording_path_configuration.svg

The audio codec is a high-performance, low-power audio codec with an I2S interface of up to 8 channels. The transmitted data can come from the analog input or the digital microphone input, and the received data can be streamed to the line output. The five analog ADC channels can work in low power mode or normal mode. In both modes, THD+N of the five ADC channels is about -80dB, and SNR can reach 98dBA. Two high-performance DACs are included, each with THD+N of about -85dB and SNR of up to 98dBA.

The audio codec integrates five ADCs, each with an independent mic bias voltage and mic boost amplifier, so that each channel delivers valid data and channel crosstalk is eliminated. The analog input ports MIC0_P/N ~ MIC4_P/N can be used as fully differential microphone pins or single-ended line-in pins. Four smart digital mic interfaces provide a low-jitter clock output and decimation filters for up to eight digital mics. Independent digital voice controllers are provided in each channel.

The audio codec integrates two DACs with different outputs, which act as the input signal of a headset or speaker power amplifier. A PDM interface is also supported for PDM digital speaker power amplifiers.

Audio codec includes several DSP features such as a high-pass filter, mixer, Equalizer, and volume control. The 10-band parametric Equalizer contains 10 independent filters with programmable gain, center frequency and bandwidth to tailor the frequency characteristics of the embedded playback system according to user preferences. The 5-band parametric Equalizer contains 5 independent filters with programmable gain, center frequency and bandwidth to tailor the frequency characteristics of the embedded record system according to user preferences.

Features

The DMIC interface has the following features:

8kHz/11.025kHz/12kHz/16kHz/22.05kHz/24kHz/32kHz/44.1kHz/48kHz/88.2kHz/96kHz for digital microphone interface
Asynchronous sample rate converter (ASRC) for each interface
Configurable 0-5 band EQ
Adjustable digital volume control
For digital volume control, supports zero-crossing detection to minimize audible artifacts
DC remove function

Analog features:
- DAC with 98dBA SNR
- ADC with 98dBA SNR
- Differential analog microphone inputs with boost pre-amplifiers and low noise microphone bias
  0/5/10/15/20/25/30/35/40dB microphone boost gain
  
  MIC input to ADC with 0dB boost gain in normal mode, SNR > 98dBA and THD+N is about -80dB
  
  MIC input to ADC with 0dB boost gain in low power mode, SNR > 90dBA and THD+N is about -78dB
  
  -80dB crosstalk between channels
  
  Adjustable MICBIAS with less than -100dBV noise floor and -70dB PSRR
- Dual Stereo DAC outputs with stereo headphone amplifiers
  SNR >= 98dBA (AVDD=1.8V, load=10kΩ, dual differential output)
  
  THD+N is about -85dB (AVDD=1.8V, load=10kΩ, dual differential output)
  
  Dual differential output
  
  -80dB crosstalk between channels
  
  De-pop function in stereo headphone amplifiers
Digital features:
- 8k/11.025k/12k/16k/22.05k/24k/32k/44.1k/48k/88.2k/96k/176.4k/192kHz for DAC path
- 8k/11.025k/12k/16k/22.05k/24k/32k/44.1k/48k/88.2k/96kHz for ADC path
- Digital microphone interface supported
- Asynchronous sample rate converter (ASRC) for each interface
- 10-band flexible equalizer (EQ) for DAC path
- Configurable 0-5 band EQ in 6 channels for ADC path
- Adjustable digital volume control in ADC, DAC
- For digital volume control, supports zero-crossing detection to minimize audible artifacts
- DC remove function for ADC, DAC
- PDM interface function for external speaker AMP
- Side tone function

Audio Codec Data Path

Recording Data Path

The following figure shows the recording data path of digital microphone interface. In the recording path, the input source is 2-channel DMIC.

Playback Data Path

Not supported.

The following figure shows the playback data path of audio codec, which can be SP_L, SP_R, SP_L + SP_R, and reference signal.

SP_L is the data from music channel left.
SP_R is the data from music channel right.
SP_L + SP_R is the sum of the left and right music channel data.
Reference signal is the data from reference signal module for test.

../../../rst_um/peripherals/8_audio/figures/audio_codec_playback_data_path.svg

Audio Codec Functional Description

Audio Recording

Audio Recording Block

Not supported.

There are five analog ADCs with up to five recording channels, so up to five microphones can feed the analog ADCs. The five ADC channels have two types of analog input port, microphone input and line input, both of which support differential and single-ended connections.

IN0-4P/N are microphone-type input ports. Each input port can be configured as a differential or single-ended input. Each microphone input port has its own microphone bias and microphone boost. The low-noise microphone bias improves recording performance and quality. The built-in short-current detection scheme can be used for switch detection. The multi-step microphone boost gain is easy to use in microphone applications. The following figure shows the recording analog block.

A boost amplifier is provided in the input path to the ADC. Its gain can be set manually from 0dB to 40dB in 5dB steps to keep the recording volume constant.

Up to 8 channels of digital microphone interface are available, which share the digital path with the AMIC ADC.

../../../rst_um/peripherals/8_audio/figures/analog_and_digital_mic_recording_path.svg — Analog and digital MIC recording path

The recording part includes five programmable microphone bias outputs (MICBIAS0, MICBIAS1, MICBIAS2, MICBIAS3, MICBIAS4), each capable of providing an output voltage of 1.8V with 3mA of output-current drive. In addition, the MICBIAS outputs can be programmed to switch directly to AVCC_DRV through an on-chip switch, and can be powered down completely when not needed, to save power. The following figure shows the function block of MICBIAS.

../../../rst_um/peripherals/8_audio/figures/micbias_function_block.svg

Note

In low power mode, the power supply of the external AMIC should be switched to AVCC_DRV. To do so, configure GPIO<16:12> to output HIGH. This is equivalent to the phase inverter outputting a high level, and software conduction resistance is about 33Ω.

Digital Feature of Audio Recording

Recording DMIC/AMIC path for sampling rate 8k/11.025k/12k/16k/32k/22.05k/44.1k/48k/88.2k/96kHz
ASRC (asynchronous sample rate converter)
The ADC digital part supports digital volume control, and the gain is between 48dB and -17.625dB in 0.375dB per step.
There is a high pass filter for DC offset
Zero-crossing function
- If the volume is adjusted while the signal is a non-zero value, an audible click can occur, as shown below.
Click noise without zero crossing
- To prevent this click noise, a zero-crossing function is provided. When enabled, the volume is updated only when a zero crossing occurs, minimizing click noise, as shown below.
Minimizing click noise with zero crossing
- When the signal is very quiet and consists mainly of noise, a zero crossing may not occur; in that case the gain changes in steps, as shown below.
Gain update with steps as zero crossing
Equalizer block

The equalizer block cascades 0-5 bands of equalizer to tailor the frequency characteristics of the recording system according to user preferences and to emulate environment sound.
DC remove function block

A high pass filter is implemented to remove DC offset. It is mainly used for ADC recording. The cut-off frequency of the filter is programmable and varies with the sample rate. Under normal conditions, the filter removes the DC offset.
Silence detector block

The Silence detector is used to reduce the noise floor for DAC path or ADC path. When the input signal is below silence level, the input signal will be reduced to suppress the background noise. The reducing level can be set by registers.

Recording DMIC path for sampling rate 8kHz/11.025kHz/12kHz/16kHz/32kHz/22.05kHz/44.1kHz/48kHz/88.2kHz/96kHz
ASRC (asynchronous sample rate converter)
When the ASRC function is enabled, the clock sources of ad_fs and BCLK0 (or BCLK1) are allowed to be asynchronous. ASRC ensures data accuracy and preserves audio performance when the clock sources are asynchronous.
The ADC digital part supports digital volume control, and the gain is between 48dB and -17.625dB in 0.375dB/step.
There is a high pass filter for DC offset
Zero-crossing function

If the volume is adjusted while the signal is a non-zero value, an audible click can occur, as shown in Figure 1-27.

../../../rst_um/peripherals/8_audio/figures/click_noise_without_zero_crossing.png

Figure 1-27 Click noise without zero-crossing

To prevent this click noise, a zero-crossing function is provided. When the function is enabled, the volume is updated only when a zero-crossing occurs, minimizing click noise, as shown in Figure 1-28.

../../../rst_um/peripherals/8_audio/figures/minimizing_click_noise_with_zero_crossing.png

Figure 1-28 Minimizing click noise with zero-crossing

When the signal is very quiet and consists mainly of noise, a zero-crossing may not occur; in that case the gain changes in steps, as shown in Figure 1-29.

../../../rst_um/peripherals/8_audio/figures/gain_update_with_steps_as_zero_crossing.png

Figure 1-29 Gain update with steps as zero-crossing
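
The zero-crossing behavior described above can be modeled in software as follows. This is only a sketch of the idea, not the codec's actual logic: the structure, the function name, the timeout field, and the choice to move the gain by one unit when no zero-crossing occurs are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a zero-crossing volume update. */
typedef struct {
    int32_t  current_gain;  /* gain currently applied (arbitrary step units) */
    int32_t  pending_gain;  /* gain requested by software */
    int16_t  prev_sample;   /* previous input sample, used to detect crossings */
    uint32_t wait_samples;  /* samples seen since the last gain change */
    uint32_t timeout;       /* ASSUMED limit before falling back to stepping */
} zc_volume_t;

/* A zero-crossing occurs when consecutive samples differ in sign. */
static bool zc_sign_changed(int16_t a, int16_t b)
{
    return (a < 0) != (b < 0);
}

/* Feed one sample; returns the gain to apply to this sample. */
int32_t zc_volume_step(zc_volume_t *zc, int16_t sample)
{
    if (zc->current_gain != zc->pending_gain) {
        if (zc_sign_changed(zc->prev_sample, sample)) {
            /* Signal passes through zero: apply the whole change now. */
            zc->current_gain = zc->pending_gain;
            zc->wait_samples = 0;
        } else if (++zc->wait_samples >= zc->timeout) {
            /* No crossing for a while: move one step toward the target. */
            zc->current_gain += (zc->pending_gain > zc->current_gain) ? 1 : -1;
            zc->wait_samples = 0;
        }
    }
    zc->prev_sample = sample;
    return zc->current_gain;
}
```

In this model a loud signal picks up the new gain at its next sign change, while a signal that never crosses zero reaches the target gradually, one step per timeout interval.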

Equalizer block

The equalizer block cascades 0-5 bands of equalizer to tailor the frequency characteristics of the recording system according to user preferences and to emulate environment sound.

DC remove function block

A high pass filter is implemented to remove DC offset. It is mainly used for ADC recording. The cut-off frequency of the filter is programmable and varies with the sample rate. Under normal conditions, the filter removes the DC offset.

Silence detector block

The Silence detector is used to reduce the noise floor for DAC path or ADC path. When the input signal is below silence level, the input signal will be reduced to suppress the background noise. The reducing level can be set by registers.
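
The ADC digital volume range quoted above (48dB down to -17.625dB in 0.375dB steps) spans 175 steps, that is, 176 distinct gain settings. The sketch below maps a step index to a gain in millidecibels. The index convention (0 = lowest gain) is an assumption for illustration; the actual register encoding is not described here, and the names are hypothetical.

```c
#include <stdint.h>

#define ADC_GAIN_MIN_MDB   (-17625)  /* -17.625 dB */
#define ADC_GAIN_MAX_MDB   48000     /*  48.000 dB */
#define ADC_GAIN_STEP_MDB  375       /*   0.375 dB per step */

/* Number of steps between the minimum and maximum gain. */
uint32_t adc_gain_step_count(void)
{
    return (uint32_t)((ADC_GAIN_MAX_MDB - ADC_GAIN_MIN_MDB) / ADC_GAIN_STEP_MDB);
}

/* Gain in millidecibels for a step index.
 * ASSUMPTION: index 0 is the minimum gain; out-of-range indices are
 * clamped to the maximum gain. */
int32_t adc_gain_index_to_mdb(uint32_t index)
{
    uint32_t max_index = adc_gain_step_count();

    if (index > max_index) {
        index = max_index;
    }
    return ADC_GAIN_MIN_MDB + (int32_t)index * ADC_GAIN_STEP_MDB;
}
```

Under this convention index 47 corresponds to 0dB and index 175 to 48dB.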

Audio Playback

Not supported.

Audio Playback Block

Not supported.

Digital Feature of Audio Playback

Not supported.

Playback path for sample rates 8k/11.025k/12k/16k/22.05k/32k/44.1k/48k/88.2k/96k/176.4k/192kHz
Asynchronous sample rate converter (ASRC)
The DAC digital part supports digital volume control, and the gain is between 0dB and -65.625dB in 0.375dB per step.
There is a high pass filter for DC offset, the cut-off frequency of filter is programmable and is varied according to different sample rates.
Zero-crossing function, which is the same as ADC zero-crossing.
PDM interface
Test tone

A test tone module is built in. The tone frequency can be configured as (fs/192) * (tone_fc_sel+1) kHz, and the tone gain can be configured as 0 ~ 6.02 * (gain_sel) dB.
Channel L and channel R mix
Before streaming to DAC, channel L data and channel R data are mixed, then streamed to DAC_L channel and DAC_R channel.
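
The test tone frequency formula above, (fs/192) * (tone_fc_sel+1) kHz, can be evaluated as shown below. The sketch assumes that fs in the formula is expressed in kHz, so the result in Hz is fs_hz * (tone_fc_sel + 1) / 192; the function name is hypothetical and the valid range of tone_fc_sel is not checked.

```c
#include <stdint.h>

/* Test tone frequency in Hz (truncated to an integer).
 * ASSUMPTION: fs in the formula is in kHz, which makes the result
 * fs_hz * (tone_fc_sel + 1) / 192 when fs is given in Hz. */
uint32_t test_tone_freq_hz(uint32_t fs_hz, uint32_t tone_fc_sel)
{
    /* 64-bit intermediate so the product cannot overflow. */
    uint64_t product = (uint64_t)fs_hz * ((uint64_t)tone_fc_sel + 1u);

    return (uint32_t)(product / 192u);
}
```

For example, fs = 48kHz with tone_fc_sel = 3 gives a 1kHz tone, and fs = 192kHz with tone_fc_sel = 0 also gives 1kHz.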

VAD_PITCH

VAD_PITCH is not supported.

VAD_PITCH Features

VAD (Voice Activity Detection) is a low-energy voice detection IP that supports voice trigger. Once the VAD function is enabled, it automatically samples the voice and detects whether the voice energy is above the threshold, even when the processor is in sleep mode.

The overall design of VAD covers two aspects: generating a wake-up interrupt to wake the processor, and transferring voice data after wake-up so that the processor can access the audio data in time for keyword recognition.

The VAD data source can be up to four analog microphones and up to eight digital microphones. Software configures which audio source the VAD uses as input. An APB configuration interface is also supported. When the VAD recognizes a human voice while the CPU is in low power mode, an interrupt is generated and reported to the CPU.

SRAM is used to store audio data buffer during power consumption. It is 128KB in size and supports 64 bits read/write. The data source is parallel to the VAD’s audio data source and is also audio data after the MUX.

At the same time, SRAM can also be read and written by KM0, KM4 and CA32 at workflow.
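
The energy-threshold idea behind VAD can be illustrated with a simple software model. The hardware's actual detection algorithm is not described here, so the sketch below is an assumption: it compares the mean squared amplitude of one frame against a threshold, and the function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the mean energy (mean of squared samples) of the frame
 * exceeds energy_threshold. An empty or NULL frame is never active. */
bool vad_frame_is_active(const int16_t *frame, size_t n,
                         uint64_t energy_threshold)
{
    uint64_t sum = 0;

    if (frame == NULL || n == 0) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        int64_t s = frame[i];
        sum += (uint64_t)(s * s);   /* s * s is at most 2^30, no overflow */
    }
    return (sum / n) > energy_threshold;
}
```

In a real system a result of true would correspond to raising the wake-up interrupt, after which the buffered audio is handed to the processor for keyword recognition.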
