LLM-Powered Voice Interaction Solution
Hybrid Offline-Online AI Solution: Bridging Local Efficiency with Cloud Intelligence
Overview
Realtek provides a hybrid offline-online voice interaction solution built on large language models (LLMs). It combines efficient on-chip voice processing with cloud-based cognitive capabilities to improve the human-machine interaction experience.
Key Advantages
High-Speed Stable Wi-Fi for Chip-Level Voice Interaction
- Supports multiple network protocols and is compatible with cloud service providers
- High throughput & low latency for rapid AI response
- Enhanced network stability ensures smooth AI conversations
Professional & Flexible Multimedia Framework
- Multi-format audio playback support
- High-quality audio output for immersive experience
- Versatile interfaces for diverse application scenarios
Comprehensive Local AI Algorithms
- AFE (Acoustic Front-End): Acoustic echo cancellation (AEC), beamforming (BF), noise suppression (NS), automatic gain control (AGC), sound source localization (SSL)
- KWS (Keyword Spotting): Fixed & user-defined wake words with fast local response
- VAD (Voice Activity Detection): Accurate speech/silence detection
- ASR (Automatic Speech Recognition): Offline command recognition for real-time control
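The way these local stages could hand off to the cloud can be sketched as a small state machine. This is a minimal illustration, not the Realtek SDK API: the class `HybridVoicePipeline`, the wake word, the command table, and the `cloud_llm` callback are all hypothetical, and audio frames are simulated as `(text, is_speech)` pairs standing in for AFE output and the VAD decision.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Optional


class State(Enum):
    IDLE = auto()       # waiting for the wake word (KWS stage)
    LISTENING = auto()  # wake word heard; collecting speech until VAD reports silence


# Hypothetical offline command table (ASR stage): phrase -> local action name.
OFFLINE_COMMANDS = {
    "turn on the light": "light_on",
    "turn off the light": "light_off",
}


@dataclass
class HybridVoicePipeline:
    cloud_llm: Callable[[str], str]   # stand-in for a request to a cloud LLM service
    wake_word: str = "hello realtek"  # hypothetical wake word
    state: State = State.IDLE
    words: List[str] = field(default_factory=list)

    def feed(self, text: str, is_speech: bool) -> Optional[str]:
        """Process one simulated frame and return a result once an utterance ends."""
        if self.state is State.IDLE:
            # KWS: only the wake word moves the pipeline out of IDLE.
            if is_speech and text == self.wake_word:
                self.state = State.LISTENING
            return None

        if is_speech:
            self.words.append(text)
            return None
        if not self.words:
            return None  # silence before the user has said anything; keep listening

        # VAD reported silence after speech: the utterance is complete.
        utterance = " ".join(self.words)
        self.words = []
        self.state = State.IDLE
        action = OFFLINE_COMMANDS.get(utterance)
        if action is not None:
            return "local:" + action                 # handled offline
        return "cloud:" + self.cloud_llm(utterance)  # forwarded to the cloud


def run(pipeline, frames):
    """Feed frames in order and return the non-empty results."""
    return [r for r in (pipeline.feed(t, s) for t, s in frames) if r is not None]
```

In this sketch a known command is resolved locally without any network round trip, while everything else is passed to the cloud callback. Speech that arrives before the wake word is ignored.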
Typical Applications
| Scenario | Solution |
|---|---|
| Smart Home | |
| Smart Toys | |
| Conference Systems | |
Development Resources
| Resource | Link |
|---|---|
| SDK Download | Link |
| AIVoice Development Guide | Link |
| Audio Hardware Design Requirements | Link |
| Cloud Platform Reference: Coze | Link |
| Contact Us | Link |
Recommended ICs
In the table below, Y marks a supported feature and - marks an unsupported one.
| Feature | RTL8721Dx | RTL8726E | RTL8713E | RTL8730E |
|---|---|---|---|---|
| AFE Single MIC (Speech Recognition Mode) | Y | Y | Y | Y |
| AFE Single MIC (Voice Communication Mode) | - | Y | Y | Y |
| AFE Dual MIC (Speech Recognition Mode) | - | Y | Y | Y |
| AFE Three MIC (Speech Recognition Mode) | - | Y | Y | Y |
| AEC (Speech Recognition Mode) | Y | Y | Y | Y |
| AEC (Voice Communication Mode) | - | Y | Y | Y |
| BF (Speech Recognition Mode) | - | Y | Y | Y |
| BF (Voice Communication Mode) | - | - | - | - |
| NS (Speech Recognition Mode) | Y | Y | Y | Y |
| NS (Voice Communication Mode) | - | Y | Y | Y |
| AGC (Speech Recognition Mode) | Y | Y | Y | Y |
| AGC (Voice Communication Mode) | - | Y | Y | Y |
| SSL (Speech Recognition Mode) | - | Y | Y | Y |
| SSL (Voice Communication Mode) | - | - | - | - |
| KWS Fixed Keyword | Y | Y | Y | Y |
| KWS User-defined Keyword | - | Y | Y | Y |
| VAD | Y | Y | Y | Y |
| ASR | - | Y | Y | Y |
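When shortlisting a part, the matrix above can be encoded as data and queried by required features. The following is a minimal sketch: the names `CHIPS`, `SUPPORT`, and `chips_supporting` are illustrative, and the support flags are copied row by row from the table.

```python
CHIPS = ["RTL8721Dx", "RTL8726E", "RTL8713E", "RTL8730E"]

# Feature label -> support flags in CHIPS order ("Y" supported, "-" not supported).
SUPPORT = {
    "AFE Single MIC (Speech Recognition Mode)": "YYYY",
    "AFE Single MIC (Voice Communication Mode)": "-YYY",
    "AFE Dual MIC (Speech Recognition Mode)": "-YYY",
    "AFE Three MIC (Speech Recognition Mode)": "-YYY",
    "AEC (Speech Recognition Mode)": "YYYY",
    "AEC (Voice Communication Mode)": "-YYY",
    "BF (Speech Recognition Mode)": "-YYY",
    "BF (Voice Communication Mode)": "----",
    "NS (Speech Recognition Mode)": "YYYY",
    "NS (Voice Communication Mode)": "-YYY",
    "AGC (Speech Recognition Mode)": "YYYY",
    "AGC (Voice Communication Mode)": "-YYY",
    "SSL (Speech Recognition Mode)": "-YYY",
    "SSL (Voice Communication Mode)": "----",
    "KWS Fixed Keyword": "YYYY",
    "KWS User-defined Keyword": "-YYY",
    "VAD": "YYYY",
    "ASR": "-YYY",
}


def chips_supporting(required):
    """Return the chips that support every feature label in `required`."""
    return [
        chip
        for idx, chip in enumerate(CHIPS)
        if all(SUPPORT[feature][idx] == "Y" for feature in required)
    ]
```

For example, `chips_supporting(["ASR", "KWS User-defined Keyword"])` leaves out RTL8721Dx, since the table marks both features as unsupported on that part.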



