LLM-Powered Voice Interaction Solution

Hybrid Offline-Online AI Solution: Bridging Local Efficiency with Cloud Intelligence

Overview

Realtek provides a hybrid offline-online voice interaction solution built on large language models (LLMs). It combines efficient on-chip voice processing with cloud-based cognitive capabilities to improve the human-machine interaction experience.

[Figure: Smart Voice System]

Key Advantages

High-Speed Stable Wi-Fi for Chip-Level Voice Interaction
  • Supports multiple network protocols for compatibility with cloud service providers
  • High throughput & low latency for rapid AI response
  • Enhanced network stability ensures smooth AI conversations
Professional & Flexible Multimedia Framework
  • Multi-format audio playback support
  • High-quality audio output for immersive experience
  • Versatile interfaces for diverse application scenarios
Comprehensive Local AI Algorithms
  • AFE (Acoustic Front-End): Acoustic echo cancellation (AEC), beamforming (BF), noise suppression (NS), automatic gain control (AGC), sound source localization (SSL)
  • KWS (Keyword Spotting): Fixed & user-defined wake words with fast local response
  • VAD (Voice Activity Detection): Accurate speech/silence detection
  • ASR (Automatic Speech Recognition): Offline command recognition for real-time control
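
One way the local stages listed above might fit together is sketched below: recognition stays idle until the wake word is spotted, then speech frames are collected until the speaker stops. This is a minimal illustration under stated assumptions, not the Realtek SDK API. The class `VoiceGate`, its `feed` method, the per-frame `wake_word` and `is_speech` flags (standing in for KWS and VAD outputs), and the KWS, VAD, ASR ordering are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    IDLE = "idle"            # waiting for the wake word (KWS)
    LISTENING = "listening"  # wake word heard, collecting speech frames


@dataclass
class VoiceGate:
    """Hypothetical gate: pass audio to recognition only after a wake word."""

    max_silence_frames: int = 3  # consecutive silent frames that end listening
    state: State = State.IDLE
    _frames: List[bytes] = field(default_factory=list, init=False)
    _silence: int = field(default=0, init=False)

    def feed(self, frame: bytes, wake_word: bool, is_speech: bool) -> Optional[List[bytes]]:
        """Consume one frame plus the (assumed) KWS and VAD decisions for it.

        Returns the collected utterance frames when listening ends with
        speech captured, otherwise None.
        """
        if self.state is State.IDLE:
            # Everything is ignored until the keyword spotter fires.
            if wake_word:
                self.state = State.LISTENING
                self._frames = []
                self._silence = 0
            return None

        if is_speech:
            self._frames.append(frame)
            self._silence = 0
            return None

        # Silent frame while listening: count it and stop after the limit.
        self._silence += 1
        if self._silence >= self.max_silence_frames:
            utterance = self._frames
            self._frames = []
            self._silence = 0
            self.state = State.IDLE
            return utterance or None
        return None
```

In this sketch the returned frames would be handed to the offline recognizer. A real product would likely use a longer silence window and a separate timeout for the case where nothing is said after the wake word.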

Typical Applications

Smart Home
  • Local device control (lights, curtains, AC)
  • Cloud responses for weather, recipes, news
Smart Toys
  • Local media control (playback, volume)
  • Cloud-based Q&A and storytelling
Conference Systems
  • Local signal processing & noise reduction
  • Cloud transcription & summary generation
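
The local/cloud split in these scenarios can be pictured as a small routing step: recognized text that matches an offline command is handled on the device, and everything else is forwarded to the cloud. The sketch below is illustrative only. The command table `LOCAL_COMMANDS`, the `route` function, and the `ask_cloud` callback are hypothetical names, and the cloud call is stubbed rather than tied to any real service API.

```python
from typing import Callable, Dict, Tuple

# Hypothetical offline command table: normalized phrase -> (device, action).
LOCAL_COMMANDS: Dict[str, Tuple[str, str]] = {
    "turn on the light": ("light", "on"),
    "turn off the light": ("light", "off"),
    "open the curtains": ("curtain", "open"),
    "volume up": ("speaker", "volume_up"),
}


def normalize(text: str) -> str:
    """Lowercase, trim surrounding punctuation, and collapse whitespace."""
    return " ".join(text.lower().strip(" .!?").split())


def route(text: str, ask_cloud: Callable[[str], str]) -> Tuple[str, str]:
    """Return (handler, result) for one recognized utterance.

    Known commands are resolved locally with no network round trip;
    anything else is passed unchanged to the supplied cloud callback.
    """
    key = normalize(text)
    if key in LOCAL_COMMANDS:
        device, action = LOCAL_COMMANDS[key]
        return "local", f"{device}:{action}"
    return "cloud", ask_cloud(text)


def fake_cloud(question: str) -> str:
    """Stand-in for a cloud LLM request, used only for this example."""
    return f"cloud answer to: {question}"


if __name__ == "__main__":
    print(route("Turn on the light!", fake_cloud))
    print(route("What's the weather tomorrow?", fake_cloud))
```

Keeping the command table on the device means control requests such as lights or volume still work when the network is unavailable, while open-ended questions depend on the cloud path.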

Development Resources

  • SDK Download
  • AIVoice Development Guide
  • Audio Hardware Design Requirements
  • Cloud Platform Reference: Coze
  • Contact Us

Recommended ICs

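Y = supported; - = not supported.

| Feature                                   | RTL8721Dx | RTL8726E | RTL8713E | RTL8730E |
|-------------------------------------------|-----------|----------|----------|----------|
| AFE Single MIC (Speech Recognition Mode)  | Y         | Y        | Y        | Y        |
| AFE Single MIC (Voice Communication Mode) | -         | Y        | Y        | Y        |
| AFE Dual MIC (Speech Recognition Mode)    | -         | Y        | Y        | Y        |
| AFE Three MIC (Speech Recognition Mode)   | -         | Y        | Y        | Y        |
| AEC (Speech Recognition Mode)             | Y         | Y        | Y        | Y        |
| AEC (Voice Communication Mode)            | -         | Y        | Y        | Y        |
| BF (Speech Recognition Mode)              | -         | Y        | Y        | Y        |
| BF (Voice Communication Mode)             | -         | -        | -        | -        |
| NS (Speech Recognition Mode)              | Y         | Y        | Y        | Y        |
| NS (Voice Communication Mode)             | -         | Y        | Y        | Y        |
| AGC (Speech Recognition Mode)             | Y         | Y        | Y        | Y        |
| AGC (Voice Communication Mode)            | -         | Y        | Y        | Y        |
| SSL (Speech Recognition Mode)             | -         | Y        | Y        | Y        |
| SSL (Voice Communication Mode)            | -         | -        | -        | -        |
| KWS Fixed Keyword                         | Y         | Y        | Y        | Y        |
| KWS User-defined Keyword                  | -         | Y        | Y        | Y        |
| VAD                                       | Y         | Y        | Y        | Y        |
| ASR                                       | -         | Y        | Y        | Y        |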
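
When shortlisting a part, the table above can be treated as a lookup: list the features a product needs and keep the ICs that support all of them. The sketch below encodes a subset of the rows as data. The helper name `chips_supporting` and the shortened feature labels are hypothetical, chosen for this example only.

```python
from typing import Dict, List, Tuple

CHIPS: Tuple[str, ...] = ("RTL8721Dx", "RTL8726E", "RTL8713E", "RTL8730E")

# Subset of rows copied from the table; True = "Y", False = "-".
# Tuple positions follow the order of CHIPS.
SUPPORT: Dict[str, Tuple[bool, bool, bool, bool]] = {
    "AFE dual MIC (speech recognition)": (False, True, True, True),
    "AEC (speech recognition)": (True, True, True, True),
    "AEC (voice communication)": (False, True, True, True),
    "BF (speech recognition)": (False, True, True, True),
    "BF (voice communication)": (False, False, False, False),
    "KWS fixed keyword": (True, True, True, True),
    "KWS user-defined keyword": (False, True, True, True),
    "VAD": (True, True, True, True),
    "ASR": (False, True, True, True),
}


def chips_supporting(*features: str) -> List[str]:
    """Return the ICs that support every requested feature.

    Raises KeyError for a feature label that is not in SUPPORT.
    """
    return [
        chip
        for index, chip in enumerate(CHIPS)
        if all(SUPPORT[feature][index] for feature in features)
    ]


if __name__ == "__main__":
    # A wake-word-only product versus one that also needs offline commands.
    print(chips_supporting("KWS fixed keyword", "VAD"))
    print(chips_supporting("KWS fixed keyword", "VAD", "ASR"))
```

According to the rows encoded here, a design that needs only a fixed wake word and VAD can use any of the four ICs, while adding offline ASR or user-defined wake words narrows the choice to the RTL8726E, RTL8713E, and RTL8730E.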