LLM-Powered Voice Interaction Solution


Hybrid Offline-Online AI Solution: Bridging Local Efficiency with Cloud Intelligence

Overview

Realtek provides a hybrid offline-online large-model voice interaction solution that combines efficient on-chip voice processing with cloud-based cognitive capabilities to improve the human-machine interaction experience.

Smart Voice System

Signal Processing

AEC (Acoustic Echo Cancellation)

Dual-stage linear cancellation + residual suppression for effective echo removal
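
As a rough illustration of linear cancellation followed by residual suppression, the sketch below uses a single NLMS adaptive filter for the linear part and a simple magnitude gate for the residual part. It is not Realtek's implementation: the function names, tap count, step size, and thresholds are all assumptions chosen for readability.

```python
def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Linear stage: adapt an FIR estimate of the echo path with NLMS and
    subtract the predicted echo from the microphone signal.
    (Hypothetical helper; parameters are illustrative.)"""
    w = [0.0] * taps          # estimated echo-path coefficients
    x = [0.0] * taps          # most recent far-end samples, newest first
    out = []
    for ref, d in zip(far_end, mic):
        x = [ref] + x[:-1]
        echo_hat = sum(wi * xi for wi, xi in zip(w, x))
        e = d - echo_hat                          # echo-reduced sample
        norm = sum(xi * xi for xi in x) + eps     # input power for normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out


def residual_suppress(signal, floor=0.05, atten=0.1):
    """Residual stage (heavily simplified): attenuate samples whose
    magnitude falls below an assumed floor."""
    return [s if abs(s) >= floor else s * atten for s in signal]
```

In practice the far-end reference is the loudspeaker signal and the output of the linear stage is passed on to the residual stage.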

BF (Beamforming)

Multi-microphone spatial filtering for targeted speech enhancement
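
One common form of multi-microphone spatial filtering is delay-and-sum beamforming. The sketch below is a minimal version with integer sample delays; the page does not say which beamforming method the solution uses, so the technique and the function name are assumptions for illustration.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples, relative to the
              earliest microphone (assumed known for the target direction).
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            idx = t + d               # look ahead to undo this channel's delay
            if 0 <= idx < n:
                acc += ch[idx]
        out.append(acc / len(channels))
    return out
```

Signals arriving from the steered direction add coherently, while sound from other directions is averaged with mismatched delays and is attenuated.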

NS (Noise Suppression)

Two noise-reduction modes: conventional signal processing and neural network

AGC (Automatic Gain Control)

Fixed + adaptive gain adjustment for stable output levels
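
The combination of a fixed gain and an adaptive gain can be sketched as follows. This is a toy model under stated assumptions (a one-pole envelope follower, a target level of 0.5 full scale, and the parameter names shown), not the algorithm shipped in the SDK.

```python
def agc(samples, fixed_gain=2.0, target=0.5, attack=0.1, max_adaptive=8.0):
    """Apply a fixed pre-gain, then an adaptive gain that pulls the
    signal envelope toward a target level. All parameters are illustrative."""
    level = target                                 # running envelope estimate
    out = []
    for s in samples:
        x = s * fixed_gain                         # fixed stage
        level += attack * (abs(x) - level)         # one-pole envelope follower
        adaptive = min(target / max(level, 1e-6), max_adaptive)
        y = max(-1.0, min(1.0, x * adaptive))      # clip to full scale
        out.append(y)
    return out
```

With these settings a steady quiet input and a steady loud input both settle near the same output level, which is the behavior "stable output levels" refers to.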

SSL (Sound Source Localization)

360° directional tracking with microphone arrays
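
Direction estimates of this kind are often built on the time difference of arrival between microphone pairs. The sketch below handles a single pair only, which resolves an angle within a half-plane; full 360° coverage needs an array with more microphones. The function names, the brute-force cross-correlation, and the 343 m/s speed of sound are assumptions for illustration.

```python
import math


def estimate_delay(a, b, max_lag):
    """Return the lag in samples at which b best matches a, found by
    brute-force cross-correlation. A positive lag means b arrives later."""
    best_lag, best_score = 0, float("-inf")
    n = len(a)
    for lag in range(-max_lag, max_lag + 1):
        score = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Convert an inter-microphone delay to an angle from broadside, in degrees.
    mic_spacing is in meters; the result is clamped to [-90, 90]."""
    sin_theta = lag * speed_of_sound / (sample_rate * mic_spacing)
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.degrees(math.asin(sin_theta))
```

For example, a one-sample delay at 16 kHz with microphones 5 cm apart corresponds to a source roughly 25 degrees off broadside.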

Automatic Speech Recognition

KWS (Keyword Spotting)
    Supports fixed and user-defined keywords with fast, accurate on-device response
VAD (Voice Activity Detection)
    Accurate speech/silence detection
ASR (Automatic Speech Recognition)
    Offline command recognition and customizable command words for real-time control
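
To make the speech/silence decision made by VAD concrete, here is a minimal frame-energy detector. Production detectors are typically more robust (and may be model-based); the frame length, threshold, and function name here are assumptions.

```python
def frame_energy_vad(samples, frame_len=160, threshold=0.01):
    """Label each full frame True (speech) or False (silence) by comparing
    its mean energy with an assumed fixed threshold.
    160 samples is 10 ms at a 16 kHz sample rate."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        flags.append(energy > threshold)
    return flags
```

A downstream recognizer would only be run on frames flagged as speech, which saves power on the device.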

Key Advantages

Highly Customizable Local Voice Interaction
  • Custom Wake-up Words: User-level customization for personalized device naming
  • Custom Voice Commands: Define offline instructions via a configuration platform for rapid productization
  • Quick Deployment: One-click configuration to adapt to diverse product forms and scenarios
High-Speed Stable Wi-Fi for Chip-Level Voice Interaction
  • Supports multiple network protocols, compatible with cloud service providers
  • High throughput & low latency for rapid AI response
  • Enhanced network stability ensures smooth AI conversations
Professional & Flexible Multimedia Framework
  • Multi-format audio playback support
  • High-quality audio output for immersive experience
  • Versatile interfaces for diverse application scenarios

Typical Applications

Smart Home

  • Local device control (lights, curtains, AC)
  • Cloud responses for weather, recipes, news

Smart Toys

  • Local media control (playback, volume)
  • Cloud-based Q&A and storytelling

Conference Systems

  • Local signal processing & noise reduction
  • Cloud transcription & summary generation

Development Resources

  • SDK Download
  • AIVoice Development Guide
  • Custom Command Guide
  • Audio Hardware Design Requirements
  • Cloud Platform Reference: Coze
  • Contact Us

Recommended ICs

Features         RTL8721Dx  RTL8720E  RTL8710E  RTL8726E  RTL8713E  RTL8730E  RTL8721F  RTL872xD  RTL8735B
Application
Processor        Cortex-M   Cortex-M  Cortex-M  Cortex-M  Cortex-M  Cortex-A  Cortex-M  Cortex-M  Cortex-M
DSP
ISP
Arm TrustZone
Dual Band
Wi-Fi 6
R-MESH
Ultra-low Power
Ethernet
BT Dual Mode
HMI
Audio ADC
Audio DAC
SDIO Host
SD/EMMC Host
USB
BT Dedicated
Antenna
A2C


Feature RTL8721Dx RTL8726E RTL8713E RTL8730E
AFE Single MIC (Speech Recognition Mode)
AFE Single MIC (Voice Communication Mode)
AFE Dual MIC (Speech Recognition Mode)
AFE Three MIC (Speech Recognition Mode)
AEC (Speech Recognition Mode)
AEC (Voice Communication Mode)
BF (Speech Recognition Mode)
BF (Voice Communication Mode)
NS (Speech Recognition Mode)
NS (Voice Communication Mode)
AGC (Speech Recognition Mode)
AGC (Voice Communication Mode)
SSL (Speech Recognition Mode)
SSL (Voice Communication Mode)
KWS Fixed Keyword
KWS User-defined Keyword
VAD
ASR