Video Host Solution

Overview

The USB Video Class (UVC) protocol defines the industry standard for transmitting video data over a USB interface. In Host mode, the Ameba platform can identify and drive external USB camera devices through this protocol.

The UVC host protocol stack on the Ameba platform focuses on video capture scenarios. It implements the USB Video Class (UVC) 1.1 specification, interoperates with UVC 1.5 devices, and abstracts external USB cameras as local video input interfaces of the system. The solution supports plug-and-play, interfaces with the system’s built-in video processing framework, and gives devices a flexible way to add visual perception.

Ameba USB UVC Host

Features

The Ameba UVC host driver is designed to provide efficient and compatible video input capabilities. Its main features are as follows:

  • Extensive Device Compatibility: Supports UVC-compliant USB camera devices (such as USB webcams, industrial cameras, USB microscopes).

  • Automated Enumeration and Configuration: Automatically parses device descriptors, identifies video streaming interfaces and control interfaces, and establishes isochronous transmission pipes.

  • Mainstream Video Format Support:

    • Compressed formats: MJPEG, H.264

    • Uncompressed formats: UVC-compliant uncompressed YUV video formats, with the specific pixel format (e.g., YUY2) determined by the camera’s UVC descriptors.

  • Deep System-Level Integration: Exposes a unified API to upper-layer applications and hides low-level USB transmission details.

  • Hot-Plug Support: Supports plug-and-play and dynamic removal of USB peripherals without restarting the system.

Application Scenarios

As a USB host, Ameba is responsible for enumerating USB cameras, parsing video descriptors, and establishing a stable image data stream. This solution is suitable for embedded applications that require visual data acquisition while also pursuing low power consumption and rapid integration, such as:

  • Intelligent Monitoring and Security: Ameba captures images in real time through USB cameras, combined with network transmission or local storage, for use in home monitoring, doorbell peepholes, or industrial site surveillance.

  • Visual Recognition Terminals: As a front-end acquisition device for edge computing nodes, it acquires image data and passes it to subsequent AI algorithms for face recognition, QR code scanning, or object detection.

  • Video Calling Devices: Combined with Wi-Fi or cellular network modules, it captures user video streams through generic USB cameras to implement low-cost VoIP video intercom functions.

Protocol Introduction

The UVC (USB Video Class) protocol defines standard interfaces within the USB specification framework for control management and video stream transmission between a host and video capture devices. The host driver establishes video data channels through this protocol to achieve real-time camera preview, recording, and parameter adjustment.

Protocol Documents

USB-IF has officially released the UVC class base protocol and specifications for multiple Payload formats. Please refer to the following core documents during development:

Specification Type | Document
--- | ---
UVC 1.1 (Video Class Base Protocol) | Video Class v1.1 document set
UVC 1.5 (Video Class Base Protocol) | Video Class v1.5 document set
Payload Specs (Payload Formats) | Contained within the USB_Video_Payload_*.pdf files in the archive above.

Term Definition

The definitions of general UVC (USB Video Class) technical terms used in this document are as follows:

Term | Description
--- | ---
VC Interface (Video Control Interface) | Video Control Interface. As the core control center of the UVC device, it manages the topology of the video device (such as the connection relationship between Units and Terminals). The host sends control requests via this interface, such as adjusting brightness, contrast, or performing Pan/Tilt/Zoom control.
VS Interface (Video Streaming Interface) | Video Streaming Interface. Responsible for the actual transmission of video payload data, usually using Isochronous or Bulk pipes. Each VS Interface contains specific video format information (such as YUV, MJPEG, H.264) and related frame descriptors.
Input Terminal (IT) | Input Terminal. The entry point where the video data stream enters the UVC function topology. Common input terminals include Camera Sensors or Composite Video Input interfaces. It represents the physical source of data.
Output Terminal (OT) | Output Terminal. The exit point where the video data stream leaves the UVC function topology. The most common output terminal is the USB Streaming Terminal, indicating that data will be sent to the host via the USB bus.
Processing Unit (PU) | Processing Unit. A processing node located after the Input Terminal, used to adjust the video image itself. It provides control capabilities over image quality, such as Brightness, Contrast, Hue, Saturation, and Sharpness.
Extension Unit (XU) | Extension Unit. A functional module that the UVC specification allows manufacturers to customize. Through XU, manufacturers can define specific control commands outside the standard UVC specification and access them on the host side via matching drivers or applications.
Probe & Commit Control | Negotiation and Commit Control. This is a critical mechanism in the process of establishing a video stream. The host first sends a “Probe” request to query the bandwidth and parameters supported by the device. After reaching an agreement, it sends a “Commit” request to lock the configuration before the video stream transmission can be started.
Payload Header | Payload Header. Header information in UVC video stream data packets, containing key synchronization information such as frame toggle (Frame ID), timestamp (PTS/SCR), and error flags.

Protocol Framework

The UVC Host protocol stack adopts a layered architecture design, aiming to decouple the USB transport layer from the upper-layer video application or multimedia framework.

../../../_images/usb_uvc_arch.svg

Component Responsibilities

  • Application

    Located at the top layer of the architecture, responsible for specific business logic processing. Includes video preview, recording applications, or AI algorithms and network streaming services based on video streams.

  • Video Middleware

    Acts as an abstraction layer connecting the upper and lower layers. It provides a unified data acquisition interface upwards for the application layer, shielding underlying differences; it is responsible for video stream encoding/decoding processing, format conversion, and buffer queue management.

  • UVC Class Driver

    The core intermediate layer, implementing behaviors defined by the UVC specification:

    • Topology Parsing: Parses the internal topology (Unit/Terminal) of the Video Control Interface (VC) and format descriptors of the Video Streaming Interface (VS).

    • Stream Negotiation: Implements the Probe and Commit processes to negotiate parameters such as resolution, frame rate, and bandwidth.

    • Frame Reassembly and Submission: Parses the UVC Payload Header, handles frame start/end indicators (FID/EOF), reassembles scattered USB packets into complete video frames, and submits them to the middleware layer frame by frame.

  • USB Core & HCD (Host Controller Driver)

    Underlying drivers responsible for handling USB standard enumeration, isochronous pipe management, and scheduling of underlying physical data transmission.

Communication Mechanism

Standard UVC devices aggregate the following interfaces through the Interface Association Descriptor (IAD):

Video Control Interface (VC Interface) - Topology Control

  • Transfer Mechanism: Based on Control Transfer over the default pipe 0 (EP0).

  • Core Function: Sends class-specific requests to control Unit/Terminal attributes. Probe and Commit requests travel over the same control pipe but are addressed to the VS Interface.

Video Streaming Interface (VS Interface) - Data Pipeline

  • Transfer Mechanism: Uses Isochronous transfer mode to carry high-bandwidth video streams.

  • Features: Reserves bus bandwidth and keeps latency low. Corrupted packets are not retransmitted, which suits real-time video preview.
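The two interface roles above can be told apart from the standard interface descriptor alone. The following is a minimal sketch, not the Ameba driver's code: the class code 0x0E and the subclass codes 0x01/0x02 come from the UVC specification, while `uvc_if_kind_t` and `uvc_classify_interface()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor type, class and subclass codes from the USB/UVC specifications. */
#define USB_DT_INTERFACE        0x04
#define USB_CLASS_VIDEO         0x0E
#define UVC_SC_VIDEOCONTROL     0x01
#define UVC_SC_VIDEOSTREAMING   0x02

/* Hypothetical result type, used only in this sketch. */
typedef enum { UVC_IF_NONE, UVC_IF_CONTROL, UVC_IF_STREAMING } uvc_if_kind_t;

/* Classify one 9-byte standard interface descriptor.
 * Offsets: [0] bLength, [1] bDescriptorType, [5] bInterfaceClass,
 * [6] bInterfaceSubClass. */
uvc_if_kind_t uvc_classify_interface(const uint8_t *desc)
{
    assert(desc != NULL);
    if (desc[0] < 9 || desc[1] != USB_DT_INTERFACE) {
        return UVC_IF_NONE;
    }
    if (desc[5] != USB_CLASS_VIDEO) {
        return UVC_IF_NONE;
    }
    switch (desc[6]) {
    case UVC_SC_VIDEOCONTROL:
        return UVC_IF_CONTROL;
    case UVC_SC_VIDEOSTREAMING:
        return UVC_IF_STREAMING;
    default:
        return UVC_IF_NONE;
    }
}
```

A driver would typically run such a check on every interface descriptor found during enumeration and bind the control and streaming paths accordingly.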

Descriptor Structure

In addition to standard USB descriptors (such as Device Descriptor, Configuration Descriptor, Endpoint Descriptor), UVC devices also define Class-Specific Descriptors.

These descriptors are classified into Class-Specific Video Control Interface Descriptors (VC) and Class-Specific Video Streaming Interface Descriptors (VS) based on the interface they belong to.

Descriptor Topology

Device Descriptor
└── Identifies basic device information (USB Version 2.00, Composite Device)

Configuration Descriptor
├── Contains total length, power supply (500mA), etc.
│
├── Interface Association Descriptor (IAD)
│       └── Groups Interface 0 (VC) and Interface 1 (VS) as a Video Function
│
├── Video Control (VC) Interface Descriptor (Interface 0)
│       ├── Standard Interface Descriptor (AlternateSetting 0, Video Control Class)
│       ├── Class-Specific VC Header (declares UVC version, clock frequency)
│       ├── Class-Specific Descriptor Collection (Topology)
│       │       ├── Input Terminal (Camera)
│       │       ├── Processing Unit
│       │       ├── Extension Unit (Vendor Specific Controls)
│       │       └── Output Terminal (USB Streaming)
│       └── Standard Endpoint Descriptor (Interrupt IN for Status)
│
└── Video Streaming (VS) Interface Descriptor (Interface 1)
                ├── Alternate Setting 0: Control transfer active state (negotiation only, no data endpoint)
                │ ├── Standard Interface Descriptor (Interface 1, Streaming Class)
                │ ├── Class-Specific VS Header
                │ ├── Format Descriptor
                │ │     └── Frame Descriptor (various resolutions and frame rates)
                │ ├── Format Descriptor
                │ │     └── Frame Descriptor (various resolutions and frame rates)
                │ ├── Still Image Frame Descriptor
                │ └── Color Matching Descriptor
                │
                ├── Alternate Setting 1: Data transfer active state (with data endpoint)
                │ ├── Standard Interface Descriptor (Interface 1, Streaming Class)
                │ └── Standard Endpoint Descriptor (ISO IN endpoint, e.g., Low Bandwidth)
                │
                ├── Alternate Setting 2
                │ ...... Can configure multiple different settings as needed (e.g., Medium to High Bandwidth)
                │
                └── ...... (Other Alternate Settings for different packet sizes)

Device Qualifier Descriptor
└── Device information while running in another speed mode (e.g., High Speed vs Full Speed capability)

Other Speed Configuration Descriptor
├── Configuration information while running in another speed mode
│
├── Interface Association Descriptor (IAD)
│       └── Groups Interface 0 (VC) and Interface 1 (VS) as a Video Function
│
├── Video Control (VC) Interface Descriptor (Interface 0)
│       ├── Standard Interface Descriptor (AlternateSetting 0, Video Control Class)
│       └── Class-Specific Descriptor Collection (Same Topology as main configuration)
│               ├── Input Terminal
│               ├── Processing Unit
│               ├── Extension Unit
│               └── Output Terminal
│
└── Video Streaming (VS) Interface Descriptor (Interface 1)
                ├── Alternate Setting 0: Control transfer active state (negotiation only)
                │ ├── Standard Interface Descriptor (Interface 1, Streaming Class)
                │ ├── Class-Specific VS Header
                │ ├── Format Descriptor
                │ │     └── Frame Descriptor
                │ └── Color Matching Descriptor
                │
                ├── Alternate Setting 1: Data transfer active state
                │ ├── Standard Interface Descriptor (Streaming Class)
                │ └── Standard Endpoint Descriptor (ISO IN endpoint)
                │
                └── ...... (Other alternate settings typically available in other speed modes)
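Because every descriptor starts with bLength and bDescriptorType, the topology above can be walked as a flat byte stream. The sketch below counts the Alternate Settings of one interface in a configuration descriptor blob. It is an illustration under assumptions: `uvc_count_alt_settings()` is a hypothetical helper, not an Ameba API, and a real driver would also record the endpoints that follow each setting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USB_DT_INTERFACE 0x04

/* Walk a configuration descriptor blob and count how many standard
 * interface descriptors carry the given bInterfaceNumber. Each one is
 * a separate Alternate Setting of that interface. */
size_t uvc_count_alt_settings(const uint8_t *cfg, size_t len, uint8_t if_num)
{
    size_t count = 0;
    size_t off = 0;

    assert(cfg != NULL);
    while (off + 2 <= len) {
        uint8_t blen = cfg[off];
        uint8_t type = cfg[off + 1];

        /* Stop on a malformed or truncated descriptor. */
        if (blen < 2 || off + blen > len) {
            break;
        }
        if (type == USB_DT_INTERFACE && blen >= 9 && cfg[off + 2] == if_num) {
            count++;
        }
        off += blen;
    }
    return count;
}
```

Class-specific descriptors (type 0x24) are skipped here by the same bLength stepping, which is why the walk stays valid for any device topology.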

UVC Video Control (VC) Interface

  • Video Control Interface Descriptor

Interface Header Descriptor
├── bLength            : 1 byte  → Total descriptor length (13 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x01 (VC_HEADER)
├── bcdUVC             : 2 bytes → Video Class Specification Release Number (0x0100 = 1.00)
├── wTotalLength       : 2 bytes → Total number of bytes for all VC descriptors
├── dwClockFrequency   : 4 bytes → Clock frequency in Hz (e.g., 0x02DC6C00 = 48 MHz)
├── bInCollection      : 1 byte  → Number of VideoStreaming interfaces
└── baInterfaceNr(1)   : 1 byte  → Interface number of the first VideoStreaming interface (0x01)
  • Input Terminal Descriptor (Camera)

Input Terminal Descriptor
├── bLength            : 1 byte  → Total descriptor length (18 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x02 (VC_INPUT_TERMINAL)
├── bTerminalID        : 1 byte  → Unique ID of this Terminal (0x01)
├── wTerminalType      : 2 bytes → 0x0201 (ITT_CAMERA)
├── bAssocTerminal     : 1 byte  → ID of associated Output Terminal (0x00 = None)
├── iTerminal          : 1 byte  → String descriptor index
├── wObjectiveFocalMin : 2 bytes → Min focal length (0 = not supported)
├── wObjectiveFocalMax : 2 bytes → Max focal length
├── wOcularFocalLength : 2 bytes → Ocular focal length
├── bControlSize       : 1 byte  → Size of bmControls (3 bytes)
└── bmControls         : 3 bytes → Bitmap of supported controls
                                                        • D1: Auto-Exposure Mode
                                                        • D3: Exposure Time (Absolute)
                                                        • D5: Focus (Absolute)
                                                        • D9: Zoom (Absolute)
                                                        • D11: PanTilt (Absolute)
  • Processing Unit Descriptor

Processing Unit Descriptor
├── bLength            : 1 byte  → Total descriptor length (11 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x05 (VC_PROCESSING_UNIT)
├── bUnitID            : 1 byte  → Unique ID of this Unit (0x02)
├── bSourceID          : 1 byte  → ID of the source connected to this unit (0x01 = Camera IT)
├── wMaxMultiplier     : 2 bytes → Max digital zoom multiplier
├── bControlSize       : 1 byte  → Size of bmControls (2 bytes)
└── bmControls         : 2 bytes → Bitmap of supported image controls
                                                        • D0: Brightness
                                                        • D1: Contrast
                                                        • D2: Hue
                                                        • D3: Saturation
                                                        • D4: Sharpness
                                                        • D6: White Balance Temperature
  • Extension Unit Descriptor

Extension Unit Descriptor
├── bLength            : 1 byte  → Total descriptor length (29 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x06 (VC_EXTENSION_UNIT)
├── bUnitID            : 1 byte  → Unique ID of this Unit (0x03)
├── guidExtensionCode  : 16 bytes→ Vendor-specific GUID (e.g., {0FB885C3-...})
├── bNumControls       : 1 byte  → Number of controls in this XU (0x05)
├── bNrInPins          : 1 byte  → Number of input pins (0x01)
├── baSourceID[1]      : 1 byte  → ID of the source connected (0x02 = PU)
├── bControlSize       : 1 byte  → Size of bmControls (4 bytes)
└── bmControls         : 4 bytes → Bitmap of supported vendor controls
  • Output Terminal Descriptor

Output Terminal Descriptor
├── bLength            : 1 byte  → Total descriptor length (9 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x03 (VC_OUTPUT_TERMINAL)
├── bTerminalID        : 1 byte  → Unique ID of this terminal (0x05)
├── wTerminalType      : 2 bytes → 0x0101 (TT_STREAMING)
├── bAssocTerminal     : 1 byte  → Associated Input Terminal ID
├── bSourceID          : 1 byte  → ID of the connected Source (0x03 = XU)
└── iTerminal          : 1 byte  → String descriptor index
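All multi-byte descriptor fields are little-endian, so they should be read byte by byte rather than by casting the buffer to a struct. As a rough sketch of how the VC Interface Header fields listed above might be decoded (the struct and function names are hypothetical, chosen for this example only):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the class-specific VC Interface Header (sketch only). */
typedef struct {
    uint16_t bcdUVC;
    uint16_t wTotalLength;
    uint32_t dwClockFrequency;
    uint8_t  bInCollection;
} uvc_vc_header_t;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the bytes are not a valid VC header. */
int uvc_parse_vc_header(const uint8_t *d, size_t len, uvc_vc_header_t *out)
{
    assert(d != NULL && out != NULL);
    if (len < 12 || d[0] < 12 || d[0] > len) {
        return -1;
    }
    if (d[1] != 0x24 || d[2] != 0x01) {   /* CS_INTERFACE, VC_HEADER */
        return -1;
    }
    out->bcdUVC = rd16(d + 3);
    out->wTotalLength = rd16(d + 5);
    out->dwClockFrequency = rd32(d + 7);
    out->bInCollection = d[11];
    /* bLength is 12 plus one baInterfaceNr byte per streaming interface. */
    if (d[0] != 12 + out->bInCollection) {
        return -1;
    }
    return 0;
}
```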

UVC Video Stream (VS) Interface

  • Video Stream Interface Descriptor

Class-Specific VS Input Header Descriptor
├── bLength            : 1 byte  → Total descriptor length (15 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x01 (VS_INPUT_HEADER)
├── bNumFormats        : 1 byte  → Number of video formats supported (0x02)
├── wTotalLength       : 2 bytes → Total length of all VS specific descriptors
├── bEndpointAddress   : 1 byte  → Address of the ISO IN endpoint (0x81)
├── bmInfo             : 1 byte  → Capabilities (0x00)
├── bTerminalLink      : 1 byte  → ID of the Output Terminal in VC interface (0x05)
├── bStillCaptureMethod: 1 byte  → Method of still image capture (0x02)
├── bTriggerUsage      : 1 byte  → Trigger usage (0x00)
├── bControlSize       : 1 byte  → Size of control field (1 byte)
└── bmaControls(n)     : n bytes → Controls for each format
  • Video Stream Format Type Descriptor

Compressed:
Video Streaming Format Type Descriptor(MJPEG)
├── bLength            : 1 byte  → Total descriptor length (11 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x06 (VS_FORMAT_MJPEG)
├── bFormatIndex       : 1 byte  → Index of this format (0x01)
├── bNumFrameDescriptors: 1 byte → Number of frame descriptors (0x09)
├── bmFlags            : 1 byte  → Characteristics (0x01 = Fixed Sample Size)
├── bDefaultFrameIndex : 1 byte  → Default frame index (0x01)
├── bAspectRatioX      : 1 byte  → X dimension of aspect ratio
├── bAspectRatioY      : 1 byte  → Y dimension of aspect ratio
├── bmInterlaceFlags   : 1 byte  → Interlace information (0x00 = Progressive)
└── bCopyProtect       : 1 byte  → Duplication restrictions (0x00)
  • Video Stream Frame Type Descriptor

Compressed:
Video Streaming Frame Type Descriptor (MJPEG)
├── bLength            : 1 byte  → Total descriptor length (e.g., 34 bytes)
├── bDescriptorType    : 1 byte  → 0x24 (CS_INTERFACE)
├── bDescriptorSubtype : 1 byte  → 0x07 (VS_FRAME_MJPEG)
├── bFrameIndex        : 1 byte  → Index of this frame (0x01)
├── bmCapabilities     : 1 byte  → Still image support, etc.
├── wWidth             : 2 bytes → Frame Width (e.g., 0x0A20 = 2592)
├── wHeight            : 2 bytes → Frame Height (e.g., 0x0798 = 1944)
├── dwMinBitRate       : 4 bytes → Min Bit Rate (bps)
├── dwMaxBitRate       : 4 bytes → Max Bit Rate (bps)
├── dwMaxVideoFrameBufferSize: 4 bytes → Max Frame Buffer Size (bytes)
├── dwDefaultFrameInterval: 4 bytes→ Default frame interval in 100ns units
├── bFrameIntervalType : 1 byte  → 0 = Continuous, Non-0 = Number of discrete intervals (n)
└── dwFrameInterval(n) : 4×n bytes→ Discrete frame intervals supported, in 100ns units (e.g., 333333 = 33.3ms for 30fps)
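Frame intervals are expressed in 100 ns units, so one second equals 10,000,000 units. The sketch below converts a frame rate into an interval and picks the closest entry from a discrete dwFrameInterval list. The helper names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One second expressed in the 100 ns units used by dwFrameInterval. */
#define UVC_INTERVAL_UNITS_PER_SEC 10000000u

/* Convert frames per second into a frame interval (0 if fps is 0). */
uint32_t uvc_fps_to_interval(uint32_t fps)
{
    return fps ? UVC_INTERVAL_UNITS_PER_SEC / fps : 0u;
}

/* Pick the entry of a discrete interval list closest to the wanted value. */
uint32_t uvc_pick_interval(const uint32_t *list, size_t n, uint32_t wanted)
{
    assert(list != NULL && n > 0);

    uint32_t best = list[0];
    uint32_t best_diff = (best > wanted) ? best - wanted : wanted - best;

    for (size_t i = 1; i < n; i++) {
        uint32_t diff = (list[i] > wanted) ? list[i] - wanted : wanted - list[i];
        if (diff < best_diff) {
            best = list[i];
            best_diff = diff;
        }
    }
    return best;
}
```

For example, a request for 30 fps maps to an interval of 333333, which matches the discrete value a 30 fps camera usually advertises.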

Note

For detailed field definitions, please refer to the official USB-IF UVC (USB Video Class) protocol documentation.

Note

Confirm Device Capabilities at Current Speed (FS vs HS)

Please be aware that cameras often present different descriptor structures in Full Speed and High Speed modes:

  • Format Support Differences: Due to bandwidth limitations, certain uncompressed formats (like YUY2) may only be visible in High Speed mode. In Full Speed, the device might only support MJPEG or extremely low-resolution YUY2.

  • Descriptor Variations: VS_FRAME_UNCOMPRESSED descriptors may not exist at all when connected at Full Speed, or parameters such as resolution and frame rate may change significantly.

Class-Specific Requests

The UVC Host driver controls device behavior by sending the following requests via control pipe 0. These requests usually target specific Unit IDs or Interfaces.

Request Name | Requirement | Description
--- | --- | ---
SET_CUR | Mandatory | Sets the current attribute value. Used for Probe/Commit negotiation, or controlling PU attributes like brightness and contrast.
GET_CUR | Mandatory | Gets the current attribute value. Reads negotiated parameters or current device status.
GET_MIN / GET_MAX | Optional | Gets the adjustable range of an attribute. The host driver uses this range to limit input values from the application layer.
GET_RES | Optional | Gets the resolution (step size) of an attribute.
GET_DEF | Optional | Gets the default value of an attribute.
GET_LEN | Optional | Gets the data length.
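Each of these requests is an ordinary control transfer whose SETUP packet encodes the control selector in the high byte of wValue and the target Unit/Terminal ID plus interface number in wIndex. The following sketch builds such a packet. The request codes and the brightness selector are taken from the UVC specification, while `usb_setup_t` and `uvc_make_request()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* bRequest codes from the UVC specification. */
#define UVC_SET_CUR 0x01
#define UVC_GET_CUR 0x81
#define UVC_GET_MIN 0x82
#define UVC_GET_MAX 0x83
#define UVC_GET_RES 0x84
#define UVC_GET_LEN 0x85
#define UVC_GET_DEF 0x87

/* Processing Unit control selector for brightness. */
#define UVC_PU_BRIGHTNESS_CONTROL 0x02

/* Logical view of an 8-byte SETUP packet (sketch only). */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_t;

/* Build a class-specific request addressed to a Unit/Terminal.
 * For requests addressed to the VS interface itself, pass entity_id = 0. */
usb_setup_t uvc_make_request(uint8_t request, uint8_t entity_id,
                             uint8_t if_num, uint8_t selector, uint16_t len)
{
    usb_setup_t s;

    /* GET_* codes have bit 7 set: device-to-host, class, interface. */
    s.bmRequestType = (request & 0x80) ? 0xA1 : 0x21;
    s.bRequest = request;
    s.wValue = (uint16_t)(selector << 8);
    s.wIndex = (uint16_t)((entity_id << 8) | if_num);
    s.wLength = len;
    return s;
}
```

Reading the current brightness of the Processing Unit with ID 0x02 on interface 0 would then use `uvc_make_request(UVC_GET_CUR, 0x02, 0, UVC_PU_BRIGHTNESS_CONTROL, 2)`.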

Stream Negotiation

Before officially starting video stream transmission (Stream On), the host and device must strictly follow the UVC state machine to negotiate parameters and bandwidth; otherwise, the device may refuse to start the video stream. According to the official spec, this process consists of two main phases: Probe (Three-step Handshake) and Commit.

Stream Negotiation Timing Diagram

../../../_images/usb_uvc_probecommit.svg

Detailed Core Steps

  1. PROBE1: Initiate Negotiation (SET_CUR)

    The host sends a SET_CUR (PROBE) request to the Video Streaming Interface (VS Interface), carrying the desired video parameter structure.

    • Purpose: The host proposes “I want this resolution, frame rate, and format”.

  2. PROBE2: Get Feedback (GET_CUR)

    The host sends a GET_CUR (PROBE) request to read back the actual parameters supported by the device.

    • Device Correction: If the parameters requested by the host are not supported, the device returns the closest corrected parameters.

    • Key Data: In the returned structure, the dwMaxPayloadTransferSize field is calculated and filled by the device, which is the basis for subsequent bandwidth selection.

  3. PROBE3: Final Confirmation (SET_CUR)

    The host sends the parameters read back in step 2 (which may have been corrected) to the device again via SET_CUR (PROBE).

    • Purpose: Synchronizes state on both sides. The host confirms that it accepts the device’s corrections and ends the negotiation phase.

  4. COMMIT: Apply the Configuration (SET_CUR)

    After parameters are agreed upon, the host sends a SET_CUR (COMMIT) request.

    • Purpose: Notifies the device that the negotiated parameters are final.
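The four steps above can be sketched as a short sequence of control transfers. This is an illustration only: the transfer callback, the reduced parameter struct, and the mock camera are all hypothetical stand-ins so that the sequence can be exercised without hardware. They do not reflect the Ameba driver's internal API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced view of the Probe/Commit structure (sketch only). */
typedef struct {
    uint8_t  bFormatIndex;
    uint8_t  bFrameIndex;
    uint32_t dwFrameInterval;
    uint32_t dwMaxPayloadTransferSize;
} uvc_stream_params_t;

/* Hypothetical control-transfer hook: is_set selects SET_CUR/GET_CUR,
 * is_commit selects the COMMIT/PROBE control. Returns 0 on success. */
typedef int (*uvc_ctrl_fn)(void *dev, int is_set, int is_commit,
                           uvc_stream_params_t *p);

/* PROBE (SET_CUR, GET_CUR, SET_CUR) followed by COMMIT (SET_CUR). */
int uvc_negotiate(uvc_ctrl_fn xfer, void *dev, uvc_stream_params_t *p)
{
    assert(xfer != NULL && p != NULL);
    if (xfer(dev, 1, 0, p) != 0) return -1;   /* PROBE1: propose        */
    if (xfer(dev, 0, 0, p) != 0) return -1;   /* PROBE2: read feedback  */
    if (xfer(dev, 1, 0, p) != 0) return -1;   /* PROBE3: confirm        */
    if (xfer(dev, 1, 1, p) != 0) return -1;   /* COMMIT: apply          */
    return 0;
}

/* Mock camera for demonstration: it cannot run faster than 30 fps and
 * always reports a 3072-byte payload size. */
typedef struct {
    uvc_stream_params_t cur;
    int committed;
} mock_camera_t;

int mock_camera_xfer(void *dev, int is_set, int is_commit,
                     uvc_stream_params_t *p)
{
    mock_camera_t *cam = (mock_camera_t *)dev;

    if (is_commit) {
        cam->committed = 1;
    } else if (is_set) {
        cam->cur = *p;
        if (cam->cur.dwFrameInterval < 333333u) {
            cam->cur.dwFrameInterval = 333333u;   /* device correction */
        }
        cam->cur.dwMaxPayloadTransferSize = 3072u;
    } else {
        *p = cam->cur;
    }
    return 0;
}
```

With the mock, a host that asks for 60 fps ends the negotiation holding the corrected 30 fps interval and the payload size filled in by the device.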

Note

Key Parameters

In the Probe/Commit process, the host mainly relies on the following three core fields to determine the video stream configuration:

  • bFormatIndex: Format Index

  • bFrameIndex: Frame Index

  • dwFrameInterval: Frame Interval

Other Parameters

Whether other parameters in the structure (such as wCompWindowSize, wCompQuality, etc.) are effective or adjustable depends entirely on the actual support of the specific device. For most standard devices, usually only the above three core parameters need attention, and other parameters can maintain the default values returned by the device.

For the complete definition of the Probe and Commit Control structure and details of all fields, please refer to the official spec.
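On the wire, the Probe/Commit control is a packed little-endian structure: 26 bytes in UVC 1.0 and 34 bytes from UVC 1.1 onward. As a sketch of where the three key fields and dwMaxPayloadTransferSize sit in the first 26 bytes (offsets follow the UVC specification; the helper names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size of the UVC 1.0 Probe/Commit block; UVC 1.1 appends 8 more bytes. */
#define UVC_PROBE_LEN_V10 26

/* Field offsets inside the Probe/Commit block. */
#define UVC_PROBE_OFF_FORMAT_INDEX    2   /* bFormatIndex             */
#define UVC_PROBE_OFF_FRAME_INDEX     3   /* bFrameIndex              */
#define UVC_PROBE_OFF_FRAME_INTERVAL  4   /* dwFrameInterval          */
#define UVC_PROBE_OFF_MAX_PAYLOAD     22  /* dwMaxPayloadTransferSize */

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill a zeroed Probe block with the three fields the host chooses. */
void uvc_probe_encode(uint8_t buf[UVC_PROBE_LEN_V10], uint8_t format_index,
                      uint8_t frame_index, uint32_t frame_interval)
{
    assert(buf != NULL);
    memset(buf, 0, UVC_PROBE_LEN_V10);
    buf[UVC_PROBE_OFF_FORMAT_INDEX] = format_index;
    buf[UVC_PROBE_OFF_FRAME_INDEX] = frame_index;
    put_le32(buf + UVC_PROBE_OFF_FRAME_INTERVAL, frame_interval);
}

/* Read the payload size the device filled in during GET_CUR (PROBE). */
uint32_t uvc_probe_max_payload(const uint8_t *buf)
{
    assert(buf != NULL);
    return get_le32(buf + UVC_PROBE_OFF_MAX_PAYLOAD);
}
```

Leaving the remaining fields at zero in the first SET_CUR is a simplification made for this sketch; real devices may expect other defaults.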

Stream Bandwidth Matching and Activation

When parameter negotiation is complete, the host needs to select appropriate hardware interface settings based on the bandwidth requirements fed back by the device to officially open the transmission pipe.

Stream Bandwidth Allocation Diagram

../../../_images/usb_uvc_select_bw.svg
  • Match Bandwidth

The driver traverses all Alternate Settings of the video streaming interface and compares each endpoint's wMaxPacketSize with the dwMaxPayloadTransferSize value obtained during the negotiation phase.

  • Set Interface

Once a suitable Alternate Setting is found, the host sends the standard USB request SET_INTERFACE. This step marks the formal reservation of USB Isochronous Transfer bandwidth, and the video stream transmission begins immediately.

Note

Bandwidth Matching Principle

In isochronous transfer mode, bandwidth selection is crucial. The driver follows a "sufficient but minimal" principle:

  • Search Target: Among all Alternate Settings, look for settings whose endpoint wMaxPacketSize (for High Speed high-bandwidth endpoints, the packet size multiplied by the number of transactions per microframe) is greater than or equal to the negotiated value dwMaxPayloadTransferSize.

  • Optimal Choice: Among all settings that meet the above conditions, choose the one with the smallest wMaxPacketSize.

If bandwidth is insufficient, it will cause data packet truncation, resulting in screen corruption or frame loss. If bandwidth is excessive, it will occupy valuable periodic USB bus bandwidth, potentially causing other devices on the bus (such as USB audio) to fail due to insufficient bandwidth.
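The selection rule above can be sketched as a single pass over the Alternate Settings. The sketch assumes the caller has already collected each setting's isochronous endpoint wMaxPacketSize; `uvc_alt_t` and `uvc_select_alt()` are hypothetical names. Bits 12..11 of wMaxPacketSize encode the additional transactions per microframe of a High Speed high-bandwidth endpoint, as defined in the USB 2.0 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One Alternate Setting and its isochronous endpoint size (sketch only). */
typedef struct {
    uint8_t  bAlternateSetting;
    uint16_t wMaxPacketSize;
} uvc_alt_t;

/* Bytes an isochronous endpoint can move per (micro)frame:
 * bits 10..0 are the packet size, bits 12..11 the extra transactions. */
static uint32_t iso_bytes_per_interval(uint16_t w)
{
    return (uint32_t)(w & 0x07FFu) * (1u + ((w >> 11) & 0x3u));
}

/* Return the smallest sufficient Alternate Setting, or -1 if none fits. */
int uvc_select_alt(const uvc_alt_t *alts, size_t n, uint32_t needed)
{
    int best = -1;
    uint32_t best_size = 0;

    assert(alts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t size = iso_bytes_per_interval(alts[i].wMaxPacketSize);
        if (size >= needed && (best < 0 || size < best_size)) {
            best = alts[i].bAlternateSetting;
            best_size = size;
        }
    }
    return best;
}
```

The returned number is the value a host would pass in the SET_INTERFACE request; a result of -1 means no setting can carry the negotiated payload size.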

Data Transmission Format

UVC video data is not a pure raw data stream but is encapsulated in data packets with a Payload Header. The host driver must first parse and strip this header to obtain the valid video Payload.

Payload Header Detailed Structure

../../../_images/usb_uvc_payloadheader.png
  • Basic Header Fields

    All Payload Headers start with two core bytes:

    • HLE (Header Length): 1 byte. Specifies the length of the entire header (in bytes), including HLE itself, BFH, and optional PTS/SCR fields.

    • BFH[0] (Bit Field Header): 1 byte. Bit field flags indicating attributes of subsequent data and the presence of optional fields.

  • BFH[0] Bit Field Definition

    Bit | Name | Description
    --- | --- | ---
    D0 | FID | Frame Identifier. Used to distinguish different video frames or segments (behavior depends on video format, see below).
    D1 | EOF | End of Frame. Indicates whether the current packet is the last packet of a frame or segment.
    D2 | PTS | Presentation Time Stamp. When set to 1, indicates the header contains a 4-byte PTS field.
    D3 | SCR | Source Clock Reference. When set to 1, indicates the header contains a 6-byte SCR field.
    D4 | RES | Reserved. Usually set to 0.
    D5 | STI | Still Image. Set to 1 indicates the sample belongs to a static image (Stream-based formats usually set to 0).
    D6 | ERR | Error Bit. Set to 1 indicates an error occurred on the device during streaming.
    D7 | EOH | End of Header. Set to 1 indicates the end of BFH fields (i.e., no extension header).

  • Optional Extension Fields

    • PTS (4 bytes): Exists when BFH[0].D2 (PTS) is set. Represents the presentation time of the video frame.

    • SCR (6 bytes): Exists when BFH[0].D3 (SCR) is set. Source clock reference used for audio/video synchronization.
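The header layout above translates into a small decoder. The following is a minimal sketch, assuming one complete packet is available in memory; the struct and function names are hypothetical and the SCR content is skipped rather than decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BFH[0] bit masks. */
#define UVC_BFH_FID 0x01
#define UVC_BFH_EOF 0x02
#define UVC_BFH_PTS 0x04
#define UVC_BFH_SCR 0x08
#define UVC_BFH_ERR 0x40

/* Decoded payload header (sketch only). */
typedef struct {
    uint8_t  header_len;   /* HLE: offset of the first payload byte */
    uint8_t  fid;
    uint8_t  eof;
    uint8_t  err;
    uint8_t  has_pts;
    uint8_t  has_scr;
    uint32_t pts;          /* valid only when has_pts is 1 */
} uvc_payload_hdr_t;

/* Returns 0 on success, -1 if the header is truncated or inconsistent. */
int uvc_parse_payload_header(const uint8_t *pkt, size_t len,
                             uvc_payload_hdr_t *h)
{
    assert(pkt != NULL && h != NULL);
    if (len < 2) {
        return -1;
    }

    uint8_t hle = pkt[0];
    uint8_t bfh = pkt[1];
    /* HLE must cover the two fixed bytes plus every flagged optional field. */
    size_t need = 2u + ((bfh & UVC_BFH_PTS) ? 4u : 0u)
                     + ((bfh & UVC_BFH_SCR) ? 6u : 0u);
    if (hle < need || hle > len) {
        return -1;
    }

    h->header_len = hle;
    h->fid = (bfh & UVC_BFH_FID) ? 1 : 0;
    h->eof = (bfh & UVC_BFH_EOF) ? 1 : 0;
    h->err = (bfh & UVC_BFH_ERR) ? 1 : 0;
    h->has_pts = (bfh & UVC_BFH_PTS) ? 1 : 0;
    h->has_scr = (bfh & UVC_BFH_SCR) ? 1 : 0;
    h->pts = 0;
    if (h->has_pts) {
        h->pts = (uint32_t)pkt[2] | ((uint32_t)pkt[3] << 8) |
                 ((uint32_t)pkt[4] << 16) | ((uint32_t)pkt[5] << 24);
    }
    return 0;
}
```

The video payload of the packet then starts at `pkt + h.header_len` and is `len - h.header_len` bytes long.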

Video Data Transmission Formats

In the UVC specification, video data transmission formats are mainly divided into two categories: Frame-based and Stream-based. These two modes determine how the host parses video data and processes the Payload Header.

  • Frame-based

    This is the most common mode, applicable to MJPEG, Uncompressed (YUY2/NV12), and other formats.

    • Characteristics: Video data is strictly divided into individual independent images (frames).

    • Transmission Logic: The host driver focuses on “frame boundaries”, assembling a complete frame by detecting the FID toggle or EOF flag in the Payload Header.

  • Stream-based

    Mainly used for compressed stream formats like H.264/H.265.

    • Characteristics: Data is treated as a continuous byte stream without a strict physical “frame” boundary concept (or the boundary is handled internally by the decoder).

    • Transmission Logic: The host driver is mainly responsible for transporting the data stream, usually not relying on the Payload Header to determine the start or end of a frame, but passing the data directly to the upper-layer application or decoder to parse the content.

Note

Engineering Practice: Actual Behavior of Stream-based Devices

Although the UVC specification states that Stream-based formats may not use FID and EOF to delimit boundaries, in actual engineering applications:

  • Device Side (Camera):

    Even when outputting Stream-based formats like H.264/H.265, the vast majority of manufacturers still try to follow Frame-based rules as much as possible, marking the boundaries of data blocks (such as NAL Units or frame slices) by toggling FID or setting EOF.

  • Host Side (Host):

    To simplify logic, most generic UVC drivers (such as Linux uvcvideo, Windows system drivers) by default split the packet stream uniformly on FID toggles or EOF flags.

Tip

Engineering Practice: Host Driver Data Processing Mechanism

In actual engineering implementation, the UVC Host Driver adopts a unified transmission processing strategy for Frame-based and Stream-based formats, without making special distinctions for stream transmission. Its core processing logic is as follows:

  • Format Independence of Transport Layer:

    The driver layer is only responsible for data transport and reassembly and is not aware of the specific video encoding format or stream type. Regardless of the data type declared by the device descriptor, the driver treats it as a generic data payload.

  • Unified Packet Assembly Mechanism:

    The driver strictly relies on the FID (Frame Identifier) state toggle and EOF (End of Frame) flag bit in the Payload Header for data delimitation. The driver assembles data packets belonging to the same logical sequence into a complete Payload Buffer and submits it to the application layer as one driver-level frame.

  • Separation of Parsing Responsibility:

    Parsing of data content is the responsibility of the Application Layer.

    For example, for Stream-based formats like H.264, the Buffer submitted by the driver may contain multiple NAL Units or partial frame data, and the application layer needs to parse the stream data structure itself to obtain the actual video frames.
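The unified assembly strategy described above can be sketched as a small state machine keyed on FID and EOF. This is a simplified illustration, not the Ameba implementation: all names are hypothetical, a FID toggle simply restarts the buffer (data of a frame that never reached EOF is discarded), and a frame that hits an error or overflows is dropped until the next toggle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assembly state for one stream (sketch only). */
typedef struct {
    uint8_t *buf;    /* frame buffer supplied by the caller        */
    size_t   cap;    /* capacity of buf in bytes                   */
    size_t   len;    /* bytes collected for the current frame      */
    int      fid;    /* FID of the current frame, -1 before start  */
    int      drop;   /* 1 while the current frame is being dropped */
} uvc_assembler_t;

/* Feed one USB packet. Returns 1 when buf[0..len) holds a complete
 * frame, 0 when more packets are needed, -1 when the packet is bad
 * or the current frame is being dropped. */
int uvc_assemble(uvc_assembler_t *a, const uint8_t *pkt, size_t n)
{
    assert(a != NULL && a->buf != NULL && pkt != NULL);
    if (n < 2 || pkt[0] < 2 || pkt[0] > n) {
        return -1;
    }

    uint8_t hle = pkt[0];
    uint8_t bfh = pkt[1];
    int fid = bfh & 0x01;

    if (fid != a->fid) {          /* FID toggled: a new frame starts */
        a->fid = fid;
        a->len = 0;
        a->drop = 0;
    }
    if (bfh & 0x40) {             /* ERR bit: discard this frame */
        a->drop = 1;
    }

    size_t payload = n - hle;
    if (!a->drop && a->len + payload > a->cap) {
        a->drop = 1;              /* frame does not fit the buffer */
    }
    if (a->drop) {
        return -1;
    }

    memcpy(a->buf + a->len, pkt + hle, payload);   /* strip the header */
    a->len += payload;
    return (bfh & 0x02) ? 1 : 0;  /* EOF closes the frame */
}
```

The caller is expected to consume the buffer as soon as 1 is returned, because the first packet of the next frame resets `len`. For Stream-based formats the same buffer may hold several NAL Units, which is why content parsing stays in the application layer.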

Class Driver

This section describes in detail the internal implementation of the USB UVC host driver, including the driver architecture, video stream management, support status of class-specific requests, and the pipe resource allocation scheme.

Detailed Implementation

The USB UVC Host driver stack is based on a modular design, achieving efficient interaction between the upper-layer application and the USB hardware controller through a layered architecture. This architecture ensures stable capture and processing of high-bandwidth video data and provides a flexible frame buffer management mechanism.

The core architecture and data flow are shown in the figure below:

RTL8730E:
../../../_images/usb_host_swuvc_flow.svg

Divided by functional responsibilities, it consists of the following core modules:

Application Adapter

Responsible for buffering and retrieving video frame data.

As the interface layer between the USB driver and the user application, it manages the final output of video data:

  • Parameter Configuration: Provides the usbh_uvc_set_param() API to allow the application layer to configure resolution, frame rate, and video format.

  • Data Acquisition: Provides the usbh_uvc_get_frame() API. The application layer uses this API to obtain fully assembled video frames from the ready queue.

  • Stream Control: Provides the usbh_uvc_stream_on() and usbh_uvc_stream_off() APIs to control the start and stop of the underlying video stream transmission.

UVC Class Driver Architecture

On the host side, the UVC Class Driver handles standard protocol handshakes, descriptor topology parsing, and video stream maintenance.

This module implements core functionality based on the USB Video Class 1.1 specification, with minimal UVC 1.5 compatibility: only the negotiation fields that UVC 1.5 defines and that are needed for device interoperability are recognized, and the extended features that UVC 1.5 adds beyond UVC 1.1 are not supported. Its main responsibilities include:

  • Enumeration and Interface Binding: Parses the Video Control Interface (VC Interface) and Video Streaming Interface (VS Interface) to establish the internal UVC topology (such as Units and Terminals).

  • Format Negotiation (Probe & Commit): Before the video stream starts, executes the standard Probe/Commit procedure to negotiate resolution, frame rate, and optimal bandwidth settings with the device.

  • Bandwidth Allocation: Before the video stream starts, selects an appropriate Video Streaming interface alternate setting (Alt Setting) based on the Probe/Commit negotiation result, and requests the corresponding endpoint bandwidth resources from the USB stack during configuration. The current version only supports isochronous transfer mode and does not support bulk video streams.
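
The alternate setting selection described above can be sketched as follows. This is a minimal illustration, not the driver's actual code: the vs_alt_t structure and both function names are hypothetical, and the "smallest capacity that still fits" policy is an assumption about one common strategy. Only the wMaxPacketSize bit layout (bits 10..0 for the packet size, bits 12..11 for additional transactions per microframe) comes from the USB 2.0 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one Video Streaming alternate setting. */
typedef struct {
    uint8_t  alt_setting;       /* bAlternateSetting of the interface */
    uint16_t w_max_packet_size; /* raw wMaxPacketSize of the isochronous IN endpoint */
} vs_alt_t;

/* Bytes the endpoint can carry per (micro)frame.
 * Bits 10..0: packet size; bits 12..11: additional transactions
 * per microframe (high-bandwidth high-speed endpoints). */
uint32_t alt_bytes_per_interval(uint16_t w_max_packet_size)
{
    uint32_t size = w_max_packet_size & 0x07FFu;
    uint32_t mult = ((uint32_t)(w_max_packet_size >> 11) & 0x3u) + 1u;
    return size * mult;
}

/* Assumed policy: choose the alternate setting with the smallest capacity
 * that still covers the negotiated dwMaxPayloadTransferSize.
 * Returns NULL when no alternate setting is large enough. */
const vs_alt_t *select_alt(const vs_alt_t *alts, size_t count, uint32_t max_payload)
{
    const vs_alt_t *best = NULL;
    uint32_t best_cap = 0;

    assert(alts != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        uint32_t cap = alt_bytes_per_interval(alts[i].w_max_packet_size);
        if (cap < max_payload) {
            continue; /* too small for the negotiated payload size */
        }
        if (best == NULL || cap < best_cap) {
            best = &alts[i];
            best_cap = cap;
        }
    }
    return best;
}
```

Choosing the smallest sufficient setting avoids reserving more isochronous bandwidth than the negotiated stream needs.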

Core Processing Task (Process Task)

Responsible for the core logic and state machine management of the USB protocol stack.

  • State Machine Maintenance: Manages UVC-specific state transitions, including UVC_STATE_CTRL (control transfer handling), UVC_STATE_TRANSFER (data transfer handling), and state transitions during stream start-up.

  • Event Handling: Responds to transfer completion events reported by the lower layer and coordinates the timing between control transfers and data transfers.

  • Dynamic Hot-Plug Handling: Handles device attach and detach events and automatically releases stream buffers and pipe resources upon device removal.

Frame Assembly Service

RTL8730E:

The software driver is responsible for parsing, reassembling, and error checking of raw USB packets.

  • URB Handling: Receives URB (USB Request Block) packets reported by the lower layer and parses the UVC payload header.

  • Frame Assembly: Handles FID (Frame ID) toggle logic, strips protocol headers, and fills the valid video payload into the frame buffer.

  • Error Detection: Parses the bmHeaderInfo field in the UVC payload header to check the ERR flag and other error information. In conjunction with the FID state, determines frame boundaries to ensure the integrity of video frame data. When hardware acceleration is enabled, this module interfaces directly with the hardware decoder.
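
The FID, EOF, and ERR handling described above can be illustrated with a small frame assembler. This is a sketch under stated assumptions: frame_asm_t and the two functions are hypothetical names, not the driver's structures, and the reset policy is simplified. The header layout (bHeaderLength in byte 0, bmHeaderInfo in byte 1, with FID in bit 0, EOF in bit 1, and ERR in bit 6) follows the UVC payload header definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UVC_HDR_FID 0x01u /* Frame ID: toggles at each new frame */
#define UVC_HDR_EOF 0x02u /* End of frame */
#define UVC_HDR_ERR 0x40u /* Device flagged an error in this payload */

/* Hypothetical assembler state for one frame buffer. */
typedef struct {
    uint8_t *buf;      /* destination frame buffer */
    size_t   cap;      /* capacity of buf in bytes */
    size_t   used;     /* bytes of video data collected so far */
    uint8_t  last_fid; /* FID of the frame being assembled */
    bool     started;  /* a frame is currently in progress */
    bool     bad;      /* error or overflow seen in this frame */
} frame_asm_t;

void frame_asm_init(frame_asm_t *fa, uint8_t *buf, size_t cap)
{
    assert(fa != NULL && buf != NULL);
    fa->buf = buf;
    fa->cap = cap;
    fa->used = 0;
    fa->last_fid = 0;
    fa->started = false;
    fa->bad = false;
}

/* Feed one UVC payload (header + data). Returns true when a complete,
 * error-free frame of fa->used bytes is available in fa->buf. */
bool frame_asm_feed(frame_asm_t *fa, const uint8_t *pkt, size_t len)
{
    assert(fa != NULL && pkt != NULL);
    if (len < 2 || pkt[0] < 2 || pkt[0] > len) {
        return false; /* malformed payload header */
    }
    uint8_t hlen = pkt[0];
    uint8_t info = pkt[1];
    uint8_t fid = info & UVC_HDR_FID;

    if (!fa->started || fid != fa->last_fid) {
        /* FID toggled (or first payload): start a new frame */
        fa->used = 0;
        fa->bad = false;
        fa->started = true;
        fa->last_fid = fid;
    }
    if (info & UVC_HDR_ERR) {
        fa->bad = true;
    }
    size_t n = len - hlen; /* strip the protocol header */
    if (fa->used + n > fa->cap) {
        fa->bad = true; /* frame larger than the buffer */
    } else if (n > 0) {
        memcpy(fa->buf + fa->used, pkt + hlen, n);
        fa->used += n;
    }
    if (info & UVC_HDR_EOF) {
        fa->started = false; /* next payload begins a new frame */
        return !fa->bad;
    }
    return false;
}
```

A frame is delivered only when EOF arrives and neither the ERR flag nor an overflow was seen, which mirrors the integrity check described in the Error Detection item.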

Class-Specific Request Implementation

The driver stack follows the USB Video Class specification and encapsulates the implementation and transmission flow of the core class-specific requests.

The current driver mainly implements class-specific requests related to Video Probe Control and Video Commit Control (SET_CUR/GET_CUR). Source code path: {SDK}/component/usb/host/uvc

Class-Specific Request Type | Remarks
--------------------------- | -------
SET_CUR | Sets the current attribute. Mainly used in Probe and Commit phases to send desired video stream parameters (e.g., dwFrameInterval, bFormatIndex) to the device.
GET_CUR | Gets the current attribute. Reads the currently active configuration parameters from the device, used to verify settings or retrieve current status.
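
To make the two requests concrete, the sketch below builds the USB setup packet for a Video Streaming Probe or Commit request. The request codes, control selectors, and bmRequestType values are taken from the UVC class specification; the usb_setup_t structure and both helper names are hypothetical, and the 34-byte length used in the usage note is the UVC 1.1 probe/commit structure size, which is an assumption about what this driver transfers.

```c
#include <assert.h>
#include <stdint.h>

#define UVC_SET_CUR           0x01u /* bRequest: set current value */
#define UVC_GET_CUR           0x81u /* bRequest: get current value */
#define UVC_VS_PROBE_CONTROL  0x01u /* control selector: probe */
#define UVC_VS_COMMIT_CONTROL 0x02u /* control selector: commit */

/* Hypothetical in-memory form of the 8-byte USB setup packet. */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_t;

/* Build a class-specific request addressed to a Video Streaming interface. */
usb_setup_t uvc_vs_request(uint8_t request, uint8_t selector,
                           uint8_t vs_if, uint16_t len)
{
    usb_setup_t s;

    /* Class request, interface recipient: 0x21 host-to-device (SET_CUR),
     * 0xA1 device-to-host (GET_CUR). */
    s.bmRequestType = (request == UVC_SET_CUR) ? 0x21u : 0xA1u;
    s.bRequest = request;
    s.wValue = (uint16_t)((uint16_t)selector << 8); /* selector in high byte */
    s.wIndex = vs_if;                               /* VS interface number */
    s.wLength = len;
    return s;
}

/* dwFrameInterval is expressed in 100 ns units. */
uint32_t uvc_frame_interval(uint32_t fps)
{
    assert(fps > 0);
    return 10000000u / fps;
}
```

For example, uvc_vs_request(UVC_SET_CUR, UVC_VS_PROBE_CONTROL, 1, 34) describes a probe SET_CUR to interface 1, and uvc_frame_interval(30) gives the dwFrameInterval value for 30 fps.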

Pipe Configuration

During device enumeration (usbh_uvc_attach), the UVC host driver parses the configuration descriptor, automatically identifies the video control and video streaming interfaces, and requests the corresponding pipe resources according to the Alt Setting.

Pipe Type | Description
--------- | -----------
Control IN/OUT Pipe | Default Control Pipe 0 (EP0). Used for sending standard requests and UVC specific requests (e.g., Video Probe and Commit Control).
Isochronous IN Pipe | Belongs to the Video Streaming Interface (VS Interface). Used for receiving high-bandwidth video payload data. The driver supports automatically selecting the appropriate Packet Size (MPS) and transfer mode based on USB speed (Full/High Speed).

API Reference

Driver API

Application Example

Application Design

This section describes in detail the complete development process of the UVC (USB Video Class) host driver, covering driver initialization, hot-plug management, video stream control and data processing, as well as example descriptions for different application scenarios.

Driver Initialization

Before using the UVC host driver, you need to define the UVC configuration structure (such as usbh_uvc_cfg_t) and register the callback structure usbh_uvc_cb_t, then call the core interfaces in sequence to start the USB host stack and the UVC class driver.

Step description:

  • Hardware Configuration: Set the USB speed mode and related interrupt priorities.

  • Callback Registration: Define the usbh_uvc_cb_t structure and register handler functions for various stages (initialization, connection, disconnection, parameter setting completion).

  • Core Initialization: Call usbh_init() to initialize the USB core stack.

  • Class Driver Loading: Call usbh_uvc_init() to initialize the UVC class driver.

/*
 * 1. Configure USB speed, ISR priority, and main task priority.
 */
static usbh_config_t usbh_cfg = {
        .speed = USB_SPEED_HIGH,
        .ext_intr_enable = USBH_SOF_INTR,
        .isr_priority = INT_PRI_MIDDLE,
        .main_task_priority = CONFIG_USBH_UVC_MAIN_THREAD_PRIORITY,
        .tick_source = USBH_SOF_TICK,
};

/*
 * 2. Configure UVC user-level HW priority.
 */
static usbh_uvc_ctx_t uvc_cfg = {
#if USBH_UVC_USE_HW
        .hw_isr_pri = CONFIG_USBH_UVC_HW_IRQ_PRIORITY,
#endif
};

/*
 * 3. Define user callbacks for UVC events.
 */
static usbh_uvc_cb_t uvc_cb = {
        .init = uvc_cb_init,
        .deinit = uvc_cb_deinit,
        .attach = uvc_cb_attach,
        .detach = uvc_cb_detach,
        .setup = uvc_cb_setup,
        .setparam = uvc_cb_setparam,  /* int status: check uvc_setparam_status after semaphore */
};

int ret = 0;

/*
 * 4. Initialize USB host core driver with configuration.
 */
ret = usbh_init(&usbh_cfg, NULL);
if (ret != HAL_OK) {
        goto free_sema_exit;
}

/*
 * 5. Initialize UVC class driver.
 */
ret = usbh_uvc_init(&uvc_cfg, &uvc_cb);
if (ret != HAL_OK) {
        /* If class driver init fails, clean up the core driver */
        usbh_deinit();
        goto usb_deinit_exit;
}

Hot-Plug Event Handling

Monitor UVC camera connection and disconnection by registering the attach and detach callbacks in usbh_uvc_cb_t.

In the example code, a semaphore mechanism is used to synchronize states:

  • Attach: When a camera is inserted and successfully enumerated, the attach callback is triggered, releasing uvc_attach_sema to notify the main thread to create the video capture task.

  • Detach: When the camera is removed, the detach callback is triggered, releasing uvc_detach_sema to wake up the hot-plug management thread (uvc_hotplug_thread) to perform resource cleanup and re-initialization.

/* USB callback */
static usbh_uvc_cb_t uvc_cb = {
        .attach = uvc_cb_attach,
        .detach = uvc_cb_detach,
};

/* Callback executed when device is inserted */
static int uvc_cb_attach(void)
{
        USB_DIAG(USB_LAYER_APP, USB_EVT_LINK, 2);
        rtos_sema_give(uvc_attach_sema);
        return HAL_OK;
}

/* Callback executed when device is removed */
static int uvc_cb_detach(void)
{
        USB_DIAG(USB_LAYER_APP, USB_EVT_LINK, 3);
#if CONFIG_USBH_UVC_HOT_PLUG
        rtos_sema_give(uvc_detach_sema);
#endif
        return HAL_OK;
}

/* Thread Context: Handle the state machine for Hotplug */
static void example_usbh_uvc_hotplug_thread(void *param)
{
        int ret = 0;

        UNUSED(param);

        for (;;) {
                /* Wait for detach signal */
                if (rtos_sema_take(uvc_detach_sema, RTOS_SEMA_MAX_COUNT) == RTK_SUCCESS) {

                        /* Step 1: Set exiting flag so uvc_test can exit gracefully */
                        uvc_task_exiting = 1;

                        /* Step 2: Let uvc_test finish processing the current frame
                         * (img_prepare + put_frame) before tearing down stream resources. */
                        rtos_time_delay_ms(200);

                        /* Step 3: Stop streaming to wake up any blocked get_frame call.
                         * After this, get_frame returns NULL (stream_state != STREAMING_ON). */
                        usbh_uvc_stream_off(CONFIG_USBH_UVC_IF_NUM_0);

                        /* Step 4: Wait for uvc_test to detect the flag, break out of its loop, and self-delete.
                         * No timeout: must not deinit while uvc_test is still running,
                         * as it may call usbh_uvc_put_frame() on already-freed memory. */
                        while (uvc_task != NULL) {
                                rtos_time_delay_ms(100);
                        }

                        RTK_LOGS(TAG, RTK_LOG_INFO, "Hotplug: uvc_test exited\n");

                        /* Step 5: De-init UVC Class and Host Core */
                        usbh_uvc_deinit();
                        usbh_deinit();
                        rtos_time_delay_ms(10);
                        RTK_LOGS(TAG, RTK_LOG_INFO, "Free heap: 0x%x\n", rtos_mem_get_free_heap_size());

                        /* Reset exiting flag before re-init */
                        uvc_task_exiting = 0;

                        /* Step 6: Re-init Host Core and UVC Class for next attach */
                        ret = usbh_init(&usbh_cfg, NULL);
                        if (ret != HAL_OK) {
                                RTK_LOGS(TAG, RTK_LOG_ERROR, "Init USBH fail\n");
                                break;
                        }

                        ret = usbh_uvc_init(&uvc_cfg, &uvc_cb);
                        if (ret < 0) {
                                RTK_LOGS(TAG, RTK_LOG_ERROR, "Init UVC fail\n");
                                usbh_deinit();
                                break;
                        }
                }
        }

        RTK_LOGS(TAG, RTK_LOG_ERROR, "Hotplug thread fail\n");
        rtos_task_delete(NULL);
}

/* Main entry task to initialize USB and wait for connection */
static void example_usbh_uvc_task(void *param)
{
        rtos_task_t hotplug_task;
        int ret = 0;

        /* ... Initialization of semaphores and mutexes ... */

        /* Init USB Host Core */
        ret = usbh_init(&usbh_cfg, NULL);
        if (ret != HAL_OK) {
                goto free_sema_exit;
        }

        /* Init UVC Class Driver */
        ret = usbh_uvc_init(&uvc_cfg, &uvc_cb);
        if (ret != HAL_OK) {
                usbh_deinit();
                goto usb_deinit_exit;
        }

        /* Create Hotplug detection thread */
        ret = rtos_task_create(&hotplug_task, "example_usbh_uvc_hotplug_thread", example_usbh_uvc_hotplug_thread, NULL, 768U, CONFIG_USBH_UVC_HOTPLUG_THREAD_PRIORITY);
        if (ret != RTK_SUCCESS) {
                goto usbh_uvc_deinit_exit;
        }

        /* Wait for device attach callback to release semaphore */
        while (1) {
                if (rtos_sema_take(uvc_attach_sema, RTOS_SEMA_MAX_COUNT) == RTK_SUCCESS) {
                        if (uvc_task == NULL) {
                                /* Create the main UVC test/streaming thread */
                                ret = rtos_task_create(&uvc_task, "example_usbh_uvc_test", example_usbh_uvc_test, NULL,
                                                                           768U, CONFIG_USBH_UVC_TEST_THREAD_PRIORITY);
                                if (ret != RTK_SUCCESS) {
                                        goto delete_hotplug_task_exit;
                                }
                        }
                }
        }
        goto example_exit;

        /* ... Error handling labels ... */
        ......
}

Video Stream Control and Data Processing

After the UVC device is successfully enumerated, the host needs to configure video parameters and start the video stream, then periodically acquire image frame data.

Frame Buffer Size Configuration

The CONFIG_USBH_UVC_FRAME_BUF_SIZE macro determines the size of each image frame buffer. Proper configuration of this value is critical for stable video capture. The recommended size depends on the video format, resolution, bitrate, and frame rate.

The average frame size can be estimated with the following formula:

avg_frame_size (KB) = bitrate (kbps) / framerate (fps) / 8
rec (KB) = avg_frame_size x margin

Where margin is a safety factor that varies by compression format:

Margin by Format

Format | Margin | Explanation
------ | ------ | -----------
MJPEG  | 3x     | Peak frame fluctuation. A 3x safety margin accommodates typical peak frames.
H264   | 5x     | Peak I-frame size. I-frames are significantly larger than P/B-frames. Increase to 8x for high-motion scenes.
YUY2   | N/A    | No margin is needed. Frame size is fixed and uncompressed: width x height x 2 bytes.
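
The formula and margins above can be turned into a small helper. This is a sketch only: the function names are hypothetical and not part of the SDK, and the integer arithmetic is an illustrative choice that truncates fractions.

```c
#include <assert.h>
#include <stdint.h>

/* Average compressed frame size in KB:
 * bitrate (kbps) / framerate (fps) / 8. */
uint32_t uvc_avg_frame_kb(uint32_t bitrate_kbps, uint32_t fps)
{
    assert(fps > 0);
    return bitrate_kbps / fps / 8u;
}

/* Recommended buffer size in KB: average frame size x margin.
 * The multiplication is done first so that integer division does not
 * discard the fractional part of the average too early. */
uint32_t uvc_rec_buf_kb(uint32_t bitrate_kbps, uint32_t fps, uint32_t margin)
{
    assert(fps > 0);
    return bitrate_kbps * margin / fps / 8u;
}

/* Uncompressed YUY2 frame size in bytes: width x height x 2. */
uint32_t uvc_yuy2_frame_bytes(uint32_t width, uint32_t height)
{
    return width * height * 2u;
}
```

For MJPEG at 10 Mbps and 30 fps, uvc_rec_buf_kb(10000, 30, 3) evaluates to 125, so CONFIG_USBH_UVC_FRAME_BUF_SIZE could be set to (125 * 1024) or larger.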

The following table provides reference calculations for common configurations:

Recommended Frame Buffer Size Reference

Format | Resolution        | Bitrate | FPS | Avg Size | Recommended Buffer Size
------ | ----------------- | ------- | --- | -------- | -----------------------
MJPEG  | 720p (1280x720)   | 10 Mbps | 30  | 41 KB    | 125 KB (3 x 41 KB)
MJPEG  | 1080p (1920x1080) | 12 Mbps | 30  | 50 KB    | 150 KB (3 x 50 KB)
H264   | 720p (1280x720)   | 8 Mbps  | 30  | 33 KB    | 166 KB (5 x 33 KB)
H264   | 1080p (1920x1080) | 12 Mbps | 30  | 50 KB    | 250 KB (5 x 50 KB)
YUY2   | 480p (640x480)    | N/A     | 30  | 600 KB   | 600 KB (fixed: 640 x 480 x 2)
YUY2   | 720p (1280x720)   | N/A     | 30  | 1,800 KB | 1,800 KB (fixed: 1280 x 720 x 2)

Note

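  • YUY2 720p @ 30 fps needs about 442 Mbps, which approaches the 480 Mbps theoretical signaling rate of USB 2.0 High Speed, so sustained capture at this setting may not be achievable in practice.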

  • If an oversize error occurs at runtime (Frame overflow log), increase CONFIG_USBH_UVC_FRAME_BUF_SIZE.

  • Ensure the target hardware has sufficient heap memory for the configured buffer size.
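
The bandwidth figure in the first note can be checked with a short calculation. The sketch below uses hypothetical function names and compares the raw stream rate against the 480 Mbps theoretical signaling rate of USB 2.0 High Speed; the usable isochronous payload rate is lower, so passing this check is necessary but not sufficient.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USB2_HS_SIGNALING_BPS 480000000ull /* theoretical, not usable payload */

/* Raw bit rate of an uncompressed stream in bits per second. */
uint64_t uvc_raw_bitrate_bps(uint32_t width, uint32_t height,
                             uint32_t bytes_per_pixel, uint32_t fps)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0 && fps > 0);
    return (uint64_t)width * height * bytes_per_pixel * 8u * fps;
}

/* True when the raw rate is below the USB 2.0 HS signaling rate. */
bool uvc_below_usb2_hs_rate(uint64_t bps)
{
    return bps < USB2_HS_SIGNALING_BPS;
}
```

With 2 bytes per pixel (YUY2) at 30 fps, 640x480 gives 147,456,000 bps, 1280x720 gives 442,368,000 bps, and 1920x1080 gives 995,328,000 bps, which is why 1080p YUY2 is not listed in the table.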

  1. Parameter Configuration (Set Parameters)

    Before starting the video stream, call usbh_uvc_set_param() to set the desired format (for example, MJPEG, H.264, or YUY2), resolution, and frame rate. After the request is issued, wait on the uvc_setparam_sema semaphore, which the setparam callback releases when the Probe/Commit negotiation finishes, and then check the status passed to that callback (stored in uvc_setparam_status in the example) to confirm that the parameters have taken effect.

  2. Start Video Stream (Stream On)

    Call usbh_uvc_stream_on() to start video transmission on the specified interface. The device then begins sending isochronous data packets on the USB bus.

  3. Frame Data Acquisition and Processing

    During the initialization phase, the image frame buffer pool is allocated uniformly by the driver, and its size is configured by the CONFIG_USBH_UVC_FRAME_BUF_SIZE macro. In the main loop, the application must strictly follow the “Get -> Process -> Put” workflow:

    • Get (Get Frame):

      Call usbh_uvc_get_frame() to obtain one fully filled image frame from the driver.

      RTL8730E:

      Drop Oldest mechanism:

      The driver maintains a frame buffer pool and provides frame-assembling services. When the processing speed of the application layer is slower than the data production speed, the driver automatically discards unread old frames in the queue to ensure that this interface always returns the latest image frame.

    • Process:

      Consume the data in the application layer (for example, copy for display, save to file, upload over the network, or perform statistical analysis only).

      Note

      Data Processing Description:

      The UVC driver, as a producer, delivers data to the consumer in units of frames (that is, all UVC payloads that together make up one complete image frame). The driver does not parse the specific encoded data content inside the UVC payload (such as MJPEG/H.264 bitstreams), so the application layer must implement payload decoding or processing logic by itself according to the specific video format.

    • Put (Put Frame):

      Regardless of how the application layer processes the frame (even if it only counts the data size or decides to discard the frame), usbh_uvc_put_frame() must be called after processing is complete.

      Note

      RTL8730E:

      This function is used to return the buffer to the driver’s internal frame pool. If the buffer is not returned in a timely manner, the frame buffer pool will be exhausted and no new image data can be acquired.

  4. Stop Video Stream (Stream Off)

    When video data is no longer needed or before disconnecting, call usbh_uvc_stream_off(). This notifies the device to stop isochronous (ISOC) transmission, releases isochronous endpoint bandwidth, and resets the driver’s internal stream control state.
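
The Get -> Process -> Put workflow and the Drop Oldest mechanism from step 3 can be modeled with a small frame pool. This is a simplified sketch, not the driver's implementation: every type and function name here is hypothetical, the pool depth of 3 is arbitrary, and returning the oldest surviving frame from the get call is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 3 /* hypothetical pool depth */

typedef enum { FRAME_FREE = 0, FRAME_READY, FRAME_HELD } frame_state_t;

typedef struct {
    int           seq;   /* sequence number of the captured frame */
    frame_state_t state;
} frame_t;

typedef struct {
    frame_t  pool[POOL_SIZE];
    frame_t *ready[POOL_SIZE]; /* completed frames, oldest first */
    size_t   ready_cnt;
} frame_pool_t;

void pool_init(frame_pool_t *p)
{
    memset(p, 0, sizeof *p); /* all buffers start as FRAME_FREE */
}

/* Remove and return the oldest entry of the ready queue. */
static frame_t *ready_pop(frame_pool_t *p)
{
    frame_t *f = p->ready[0];
    p->ready_cnt--;
    memmove(&p->ready[0], &p->ready[1], p->ready_cnt * sizeof p->ready[0]);
    return f;
}

/* Producer side: publish frame `seq`. If no buffer is free, the oldest
 * unread frame is recycled (drop oldest). Returns -1 when every buffer
 * is held by the application, i.e. the new frame is lost. */
int pool_produce(frame_pool_t *p, int seq)
{
    frame_t *f = NULL;

    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (p->pool[i].state == FRAME_FREE) {
            f = &p->pool[i];
            break;
        }
    }
    if (f == NULL && p->ready_cnt > 0) {
        f = ready_pop(p); /* drop the oldest unread frame */
    }
    if (f == NULL) {
        return -1;
    }
    f->seq = seq;
    f->state = FRAME_READY;
    p->ready[p->ready_cnt++] = f;
    return 0;
}

/* Consumer side: Get. Returns NULL when no frame is ready. */
frame_t *pool_get(frame_pool_t *p)
{
    if (p->ready_cnt == 0) {
        return NULL;
    }
    frame_t *f = ready_pop(p);
    f->state = FRAME_HELD;
    return f;
}

/* Consumer side: Put. Returns the buffer to the pool. */
void pool_put(frame_t *f)
{
    assert(f != NULL && f->state == FRAME_HELD);
    f->state = FRAME_FREE;
}
```

The model shows why the Put step is mandatory: once the application holds every buffer without returning one, pool_produce() has nothing to recycle and new frames are lost.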

/* Define user callbacks for UVC events */
static usbh_uvc_cb_t uvc_cb = {
        .setup = uvc_cb_setup,
        .setparam = uvc_cb_setparam,
};

/* Define USB user-level setup callback */
static int uvc_cb_setup(void)
{
        USB_DIAG(USB_LAYER_APP, USB_EVT_LINK, 4);
        rtos_sema_give(uvc_start_sema);
        return HAL_OK;
}

/* Define USB user-level setparam callback */
static int uvc_cb_setparam(int status)
{
        USB_DIAG(USB_LAYER_APP, USB_EVT_LINK, 5);
        uvc_setparam_status = status;
        rtos_sema_give(uvc_setparam_sema);
        return HAL_OK;
}

/* Main UVC Test Thread */
static void example_usbh_uvc_test(void *param)
{
        ......

        /* Wait for the device to be ready (Enumeration Complete) */
        while (!uvc_task_exiting) {
                if (rtos_sema_take(uvc_start_sema, RTOS_SEMA_MAX_COUNT) != RTK_SUCCESS) {
                        continue;
                }
                /* 1. Configure UVC Parameters */
                /* Set the desired format, resolution, and frame rate */
                uvc_s_ctx.fmt_type = CONFIG_USBH_UVC_FORMAT_TYPE; // e.g., MJPEG
                uvc_s_ctx.width = CONFIG_USBH_UVC_WIDTH;          // e.g., 1280
                uvc_s_ctx.height = CONFIG_USBH_UVC_HEIGHT;        // e.g., 720
                uvc_s_ctx.frame_rate = CONFIG_USBH_UVC_FRAME_RATE;// e.g., 30
                uvc_s_ctx.frame_buf_size = CONFIG_USBH_UVC_FRAME_BUF_SIZE;

                if (uvc_s_ctx.fmt_type == USBH_UVC_FORMAT_MJPEG) {
                        fmt_name = "MJPEG";
                } else if (uvc_s_ctx.fmt_type == USBH_UVC_FORMAT_H264) {
                        fmt_name = "H264";
                } else if (uvc_s_ctx.fmt_type == USBH_UVC_FORMAT_YUV) {
                        fmt_name = "YUV";
                } else {
                        RTK_LOGS(TAG, RTK_LOG_ERROR, "Unsupport type %d\n", uvc_s_ctx.fmt_type);
                        goto exit;
                }

                /* Trigger the UVC parameter setting process */
                ret = usbh_uvc_set_param(&uvc_s_ctx, CONFIG_USBH_UVC_IF_NUM_0);
                if (ret != RTK_SUCCESS) {
                        RTK_LOGS(TAG, RTK_LOG_ERROR, "Set param req failed: %d\n", ret);
                        goto exit;
                }

                /* Wait for the semaphore indicating parameter setting is actually completed */
                if (rtos_sema_take(uvc_setparam_sema, 5000) == RTK_SUCCESS) {
                        /* Check the status returned by the setparam callback */
                        if (uvc_setparam_status != HAL_OK) {
                                RTK_LOGS(TAG, RTK_LOG_ERROR, "Set paras err: %s %d*%d@%dfps status=%d\n",
                                                fmt_name, uvc_s_ctx.width, uvc_s_ctx.height, uvc_s_ctx.frame_rate, uvc_setparam_status);
                                goto exit;
                        }
                        RTK_LOGS(TAG, RTK_LOG_INFO, "Set paras ok: %s %d*%d@%dfps\n",
                                        fmt_name, uvc_s_ctx.width, uvc_s_ctx.height, uvc_s_ctx.frame_rate);
                } else {
                        RTK_LOGS(TAG, RTK_LOG_ERROR, "Set paras timeout: %s %d*%d@%dfps\n",
                                        fmt_name, uvc_s_ctx.width, uvc_s_ctx.height, uvc_s_ctx.frame_rate);
                        goto exit;
                }

                /* ... Initialize consumer tasks (e.g., VFS thread or HTTPC thread) ... */

                /* 2. Start Video Stream */
                RTK_LOGS(TAG, RTK_LOG_INFO, "Stream on\n");
                ret = usbh_uvc_stream_on(&uvc_s_ctx, CONFIG_USBH_UVC_IF_NUM_0);
                if (ret) goto exit;

                /* 3. Main Capture Loop */
                while (img_cnt < CONFIG_USBH_UVC_LOOP && !uvc_task_exiting) {

                        // 3.1 Get Frame from USB Stack
                        /* Retrieve a filled frame buffer from the UVC driver */
                        buf = usbh_uvc_get_frame(CONFIG_USBH_UVC_IF_NUM_0);

                        if (buf == NULL) {
                                /* Frame not ready yet, fail_cnt ++ , wait and retry */
                                rtos_time_delay_ms(1);
                                continue;
                        }

                        // 3.2 Process the Frame Data
                        /* Overflow check: a length larger than the allocated buffer means the camera */
                        /* sent a frame that does not fit. Return the buffer and abort in that case. */
                        /* ACTION: Please increase 'CONFIG_USBH_UVC_FRAME_BUF_SIZE' in example_usbh_uvc.c */
                        /* to match the camera's actual output size for the current resolution/format. */

                        len = buf->byteused;

                        /* Even on error, we must return the frame buffer */
                        if (len > CONFIG_USBH_UVC_FRAME_BUF_SIZE) {
                                if (usbh_uvc_put_frame(buf, CONFIG_USBH_UVC_IF_NUM_0) != HAL_OK) {
                                        RTK_LOGS(TAG, RTK_LOG_ERROR, "Put frame fail\n");
                                }
                                RTK_LOGS(TAG, RTK_LOG_ERROR, "Frame %d overflow %d > %d\n", img_cnt, len, CONFIG_USBH_UVC_FRAME_BUF_SIZE);
                                goto exit;
                        }

                        /* CONSUMER LOGIC: */
                        /* The actual data processing happens here. */
                        /* - Simple Mode:  Just count bytes and drop data. */
                        /* - VFS Mode: Write `buf->buf` to SD Card. */
                        /* - HTTPC Mode: Send `buf->buf` to Network. */
                        if (len > 0) {
                                RTK_LOGS(TAG, RTK_LOG_INFO, "Captured frame %d, len=%d\n", img_cnt, len);
                                usbh_uvc_img_prepare(buf);
                        }

                        // 3.3 Put Frame back to USB Stack
                        /* CRITICAL: Must return the buffer to driver for the next capture */
                        if (usbh_uvc_put_frame(buf, CONFIG_USBH_UVC_IF_NUM_0) != HAL_OK) {
                                RTK_LOGS(TAG, RTK_LOG_ERROR, "Put frame fail\n");
                        }
                        img_cnt ++;
                }

                /* 4. Calculate throughput */
                uvc_calculate_tp(img_cnt);

                /* 5. Release resources on detach */
                if (uvc_task_exiting) {
                        ......
                        goto exit;
                }
        }

/* exit labels */
}

Driver De-initialization

When the device is disconnected or the USB host function needs to be disabled, the class driver and the host core driver must be unloaded in order, and related system resources must be released.

/* 1. Deinitialize UVC class driver. */
usbh_uvc_deinit();

/* 2. Deinitialize USB host core driver */
usbh_deinit();

Operation Method

This example demonstrates how Ameba, acting as a USB UVC host, captures video frames from a camera. To meet different application scenarios, the example provides three working modes:

Macro | Description | Frame Storage / Upload Method
----- | ----------- | -----------------------------
USBH_UVC_APP_SIMPLE | Basic Test Mode | Video frames are captured only for throughput statistics, without post-processing; each session captures CONFIG_USBH_UVC_LOOP (200) frames.
USBH_UVC_APP_VFS | SD Card Storage Mode | Video frames are written to an SD card (FAT32) via VFS. MJPEG: frames are saved as imgX.jpeg. H264: saved as stream.h264 (default limit: 2 MB, adjustable based on hardware capability).
USBH_UVC_APP_HTTPC | Network Upload Mode | Video frames are uploaded to an HTTP server via HTTP POST. Ringbuffer (RTL8730E, SW UVC): MJPEG is uploaded frame by frame; H264 is uploaded in chunked streaming mode; CONFIG_USBH_UVC_LOOP (200) frames are captured. PSRAM (RTL8721F, HW UVC): frames are captured within a time window and uploaded in batch (approximately 2 seconds of video).

Users can select the active mode by modifying the CONFIG_USBH_UVC_APP macro definition in the code.

Configuration and Compilation

  • Software Configuration

    Open the example_usbh_uvc.c file and modify the CONFIG_USBH_UVC_APP macro definition according to the test requirements:

    /* Supported application example: USBH_UVC_APP_SIMPLE, USBH_UVC_APP_VFS, USBH_UVC_APP_HTTPC */
    #define CONFIG_USBH_UVC_APP                        USBH_UVC_APP_SIMPLE
    
    /* Supported formats: USBH_UVC_FORMAT_MJPEG, USBH_UVC_FORMAT_YUV, USBH_UVC_FORMAT_H264
    * Note: Users must verify which formats their specific camera supports and
    * adjust the definition below accordingly. */
    #define CONFIG_USBH_UVC_FORMAT_TYPE                USBH_UVC_FORMAT_MJPEG
    
    /* Target resolution and frame rate.
    * If the specific camera device does not support
    * these values, the host stack will automatically select the closest match.
    * Always check the logs to confirm the actual parameters applied. */
    #define CONFIG_USBH_UVC_WIDTH                      1280
    #define CONFIG_USBH_UVC_HEIGHT                     720
    #define CONFIG_USBH_UVC_FRAME_RATE                 30
    
    /* Frame buffer size in bytes
    * Size depends on format, resolution, and scene complexity.
    * Formula:
    *   avg_frame_size (KB) = bitrate (kbps) / framerate (fps) / 8
    *   rec = avg x margin
    *
    * Margin by format:
    *   MJPEG: 3 x avg (peak frame fluctuation)
    *   H264:  5 x avg (peak I-frame; increase to 8x for high-motion)
    *   YUY2:  no margin needed, set to width x height x 2 (uncompressed, fixed frame size)
    *
    * Examples:
    *   MJPEG @ 720p  30 fps, 10 Mbps : avg = 41 KB,  rec = 125 KB
    *   MJPEG @ 1080p 30 fps, 12 Mbps : avg = 50 KB,  rec = 150 KB
    *   H264  @ 720p  30 fps, 8 Mbps  : avg = 33 KB,  rec = 166 KB
    *   H264  @ 1080p 30 fps, 12 Mbps : avg = 50 KB,  rec = 250 KB
    *   YUY2  @ 480p  30 fps : frame = 640 x 480 x 2 = 600 KB    (USB 2.0 bandwidth: 147 Mbps)
    *   YUY2  @ 720p  30 fps : frame = 1280 x 720 x 2 = 1,800 KB (USB 2.0 bandwidth: 442 Mbps)
    *   YUY2  @ 1080p 30 fps : not listed - exceeds USB 2.0 HS bandwidth (~480 Mbps theoretical);
    *                            1080p YUY2 @ 30 fps requires ~995 Mbps; use MJPEG or H264 instead.
    * Note: Increase this value if an oversize error occurs; Ensure the target hardware has
    *       sufficient heap memory for this buffer. */
    #define CONFIG_USBH_UVC_FRAME_BUF_SIZE             (150 * 1024)
    
    /* Most cameras have a single video stream interface, so use default 0.
    * If the camera supports dual streams, set this to 1.
    * Note: Current protocol stack supports a maximum of 2 video stream interfaces. */
    #define CONFIG_USBH_UVC_IF_NUM_0                   0
    
    /* Hot plug / memory leak test */
    #define CONFIG_USBH_UVC_HOT_PLUG                   1
    
    /* Check image data validity (0: Disable, 1: Enable) */
    #define CONFIG_USBH_UVC_CHECK_MJEPG_DATA           1
    
    /* Number of frames to capture in the loop */
    #define CONFIG_USBH_UVC_LOOP                       200
    
    /* Maximum continuous error count before stopping */
    #define CONFIG_USBH_UVC_MAX_FAIL_COUNT             5
    

    Depending on the selected mode, also pay attention to the following configuration items:

    If using HTTPC mode, configure the target server’s IP address and port, and ensure the Wi-Fi connection information is correct.

    #define USBH_UVC_HTTPC_SERVER            "xx.xx.xx.xx"
    #define USBH_UVC_HTTPC_PORT              5090
    
  • Compilation and Flashing

    Execute the following commands in the SDK root directory to configure the environment, select the target SoC, compile the project, and flash the generated Image file to the development board:

    # Initialize environment (required for every new terminal)
    source env.sh    # on Windows, run env.bat instead
    
    # Select Target SoC (replace xxx with your specific SoCs)
    ameba.py soc xxx
    
    ameba.py build -a usbh_uvc -p
    
  • Menuconfig Confirmation

    If compilation fails, run ameba.py menuconfig and confirm that USBH UVC has been selected.

    - Choose `CONFIG USB --->`:
    
       [*] Enable USB
             USB Mode (Host)  --->
       [*] UVC
    

Result Verification

General Test Steps:

  1. Connect the USB camera to the USB interface of the development board.

  2. Reset the development board and observe the serial logs.

  3. Ensure there are no USB-related error messages (e.g., enumeration failure).

Check the specific verification steps and expected logs below according to the selected mode:

Simple Mode:

Test Description:

In this mode, after the system detects the camera, it periodically captures images (default 200 frames), only calculating throughput without saving data.

Expected Log:

[UVC-I] Set paras ok: MJPEG 1280*720@60fps
[UVC-I] Stream on
[UVC-I] Captured frame 0, len=20832
[UVC-I] Captured frame 1, len=20912
...
[UVC-I] Captured frame 199, len=108264
[UVC-I] TP 4126 KB/s @ 4953 ms, fps 40/s
[UVC-I] TP 4.0 MB/s-40 (0_20930888/200)
[UVC-I] Stream off
[UVC-I] Test done
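
The throughput and frame rate figures in this log can be related to the total byte count and elapsed time. The sketch below is an assumption about how such figures can be derived, not the example's actual uvc_calculate_tp() code; the function names are hypothetical. With the values from the log (20930888 bytes, 200 frames, 4953 ms) it yields 4126 KB/s and 40 fps.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in KB/s (1 KB = 1024 bytes), truncated to an integer. */
uint32_t tp_kb_per_s(uint64_t total_bytes, uint32_t elapsed_ms)
{
    assert(elapsed_ms > 0);
    return (uint32_t)(total_bytes * 1000u / elapsed_ms / 1024u);
}

/* Average frame rate in frames per second, truncated to an integer. */
uint32_t tp_fps(uint32_t frames, uint32_t elapsed_ms)
{
    assert(elapsed_ms > 0);
    return (uint32_t)((uint64_t)frames * 1000u / elapsed_ms);
}
```

Comparing the reported frame rate with the negotiated one is a quick way to see whether frames are being dropped or the camera is delivering fewer frames than requested.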

Reviewing Captured Media:

  • JPEG images: Open directly with any image viewer on your PC.

  • JPEG sequence to MP4: Use ffmpeg to convert a JPEG frame sequence into a playable MP4 video. Replace <framerate> with your actual frame rate and <drive>: with the SD card drive letter on your system:

    ffmpeg -framerate <framerate> -i <drive>:/img%d.jpeg -c:v libx264 -pix_fmt yuv420p output.mp4
    
  • H.264 raw stream: Use ffplay to play the raw H.264 elementary stream directly:

    ffplay -f h264 stream.h264
    
  • YUY2 raw video: Can be played with VLC or ffplay. Specify the rawvideo parameters such as pixel format, resolution, and frame rate:

    ffplay -f rawvideo -pixel_format yuyv422 -video_size <resolution> -framerate <framerate> raw.yuv
    

Note

The above commands are for reference only. For the full command syntax and available options, refer to the FFmpeg documentation.