nvidia-codec API Reference

nvidia_codec

NVIDIA hardware video decoding for Python.

This package provides a Pythonic interface to NVIDIA's NVDEC hardware video
decoder, allowing fast GPU-accelerated video decoding with PyTorch integration.

Quick Start:
    from datetime import timedelta

    import torch
    from nvidia_codec.utils import Player

    # Stream all frames from a video
    player = Player('/path/to/video.mp4')
    for time, frame in player.frames(torch.float32):
        process(frame)  # frame is [C, H, W] tensor on GPU

    # Extract a single frame
    with Player('/path/to/video.mp4') as player:
        time, frame = player.screenshot(timedelta(seconds=30), torch.uint8)

Requirements:
    - NVIDIA GPU with NVDEC support
    - NVIDIA driver with libnvcuvid.so
    - FFmpeg 8.x (for demuxing)
    - PyTorch with CUDA support

Supported Codecs:
    - H.264 (AVC)
    - HEVC (H.265)
    - VP9
    - AV1 (requires Ampere or newer GPU)
    - MPEG4
    - VC1 / WMV3

Exceptions:
    CodecNotSupportedError: Raised when the GPU doesn't support the video codec.
    NoFrameError: Raised when frame extraction fails.

class CodecNotSupportedError

Raised when the codec is not supported by the GPU's NVDEC.

class NoFrameError

Raised when no frame could be extracted from the video.
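
The two exceptions above can be handled separately when extracting a frame. The following is a minimal sketch, not part of the package: try_screenshot and its open_player parameter are hypothetical names, and the stand-in exception classes exist only so the sketch runs where nvidia_codec is not installed. Which call actually raises CodecNotSupportedError is not stated in this reference, so the sketch wraps both opening the player and taking the screenshot.

```python
from datetime import timedelta

try:
    from nvidia_codec import CodecNotSupportedError, NoFrameError
except ImportError:
    # Stand-ins (assumption) so the sketch runs without the package installed.
    class CodecNotSupportedError(Exception):
        pass

    class NoFrameError(Exception):
        pass


def try_screenshot(open_player, url, at, dtype):
    """Return (time, frame) from player.screenshot, or None on a known failure.

    open_player is any callable that returns a context-manager player,
    for example nvidia_codec.utils.Player.
    """
    try:
        with open_player(url) as player:
            return player.screenshot(at, dtype)
    except CodecNotSupportedError:
        print(f"{url}: codec not supported by this GPU's NVDEC")
    except NoFrameError:
        print(f"{url}: no frame could be extracted at {at}")
    return None
```

With the real package this might be called as try_screenshot(Player, '/path/to/video.mp4', timedelta(seconds=30), torch.uint8).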

nvidia_codec.utils

High-level utilities for video decoding.

- parse, VideoTrack, AudioTrack: Probe metadata without a GPU
- Decoder: GPU-accelerated decoder for a single video track
- Player: Convenience wrapper combining parse, best-track selection, and Decoder

nvidia_codec.utils.player

Convenience player: opens a file, picks the best video track, decodes.

class Player

Convenience class: opens a file, picks the best video track, decodes.

Example:
    from datetime import timedelta
    import torch
    with Player('/path/to/video.mp4') as player:
        t, frame = player.screenshot(timedelta(seconds=30), torch.uint8)

__init__(self, url, target_size, cropping, target_rect, device, track_idx)

nvidia_codec.core.decode

Low-level NVDEC decoder: a thin wrapper around the CUVID API.

This module provides the core decoding interface. It does NOT manage
threading or slot ownership; that is the caller's responsibility
(see nvidia_codec.utils.player for the higher-level interface).

Callbacks:
    pre_decode(pic_idx): Called before cuvidDecodePicture. The caller
        should block here if the picture slot is in use.
    post_decode(pic): Called when a decoded frame is ready for display
        (the "display callback"). pic is a Picture or None (EOS).
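
One way a caller could implement the blocking described for pre_decode is a condition variable over the set of slots still held by a consumer. The sketch below is illustrative only and is not how nvidia_codec.utils.player is implemented; SlotGate, hold, and release are hypothetical names, and the timeout parameter is added purely so the sketch can be exercised without hanging.

```python
import threading


class SlotGate:
    """Tracks which picture slots a consumer still holds (hypothetical helper)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = set()

    def pre_decode(self, idx, timeout=None):
        # Block until slot idx is no longer held by a consumer.
        # Returns True once the slot is free, False if the timeout expired.
        with self._cond:
            return self._cond.wait_for(lambda: idx not in self._held, timeout)

    def hold(self, idx):
        # Call when a displayed picture in slot idx is handed to a consumer.
        with self._cond:
            self._held.add(idx)

    def release(self, idx):
        # Call when the consumer is done; wakes any thread waiting in pre_decode.
        with self._cond:
            self._held.discard(idx)
            self._cond.notify_all()
```

A decoder subclass could forward its pre_decode callback to such a gate and call hold from its display callback; how the slot index is read from a Picture is not specified in this reference.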

class Surface

Post-processed decoded frame accessible as a CUDA array.

Do not instantiate directly; use Picture.map() instead.
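
Because a Surface comes from Picture.map() and exposes free(), pairing the two in a context manager is one way to avoid leaking mapped surfaces. This is a sketch under assumptions: that map(stream) returns the Surface and that free() should be called once the caller is finished with it. The helper name mapped is hypothetical, and the expected type of stream is not documented here.

```python
from contextlib import contextmanager


@contextmanager
def mapped(picture, stream):
    """Map a picture to a surface and free the surface on exit (hypothetical helper)."""
    surface = picture.map(stream)  # assumed to return a Surface
    try:
        yield surface
    finally:
        surface.free()  # runs even if the body raises
```

Usage would look like "with mapped(pic, stream) as surface: ...", with the surface freed when the block exits.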

__init__(self, decoder, index, params, stream, pts)

format(self)

height(self)

width(self)

size(self)

free(self)

shape(self)

class Picture

Decoded frame before post-processing.

A data object — does not own the picture slot. Slot ownership is
managed by the caller via pre_decode/pic_release callbacks.

__init__(self, decoder, index, proc_params, pts)

free(self)

No-op. Slot ownership is managed by the caller (player.py).

map(self, stream)

decide_surface_format(chroma_format, bit_depth, supported_surface_formats, allow_high)

Select the best surface format for the given video parameters.

class BaseDecoder

NVDEC hardware decoder: a thin wrapper around the CUVID API.

Threading and slot ownership are NOT managed here. The caller provides
callbacks (pre_decode, pic_release, surface_acquire, surface_release)
to handle synchronization.

catch_exception(self, func, return_on_error)

handleVideoSequence(self, pUserData, pVideoFormat)

codec(self)

height(self)

width(self)

target_width(self)

target_height(self)

surface_format(self)

pre_decode(self, idx)

NVDEC is reclaiming picture slot idx. After this returns,
slot ownership is transferred back to NVDEC.

post_decode(self, pic)

NVDEC is releasing a picture slot for display.
pic is a Picture or None (EOS).

handlePictureDecode(self, pUserData, pPicParams)

handlePictureDisplay(self, pUserData, pDispInfo)

handleOperatingPoint(self, pUserData, pOPInfo)

__init__(self, codec, decide, device, extradata, coded_width, coded_height)

send(self, packet)

Send a compressed packet to the decoder.

Calls self.pre_decode(idx) before each decode and
self.post_decode(pic) for each displayed frame.

Args:
    packet: (pts, data) tuple or None for end-of-stream.
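
The packet convention above suggests a simple feeding loop: send each (pts, data) tuple, then send None to signal end-of-stream. The sketch below assumes only the send(packet) method documented here; feed_all is a hypothetical helper, and where the packets come from (for example a demuxer) is left to the caller.

```python
def feed_all(decoder, packets):
    """Send every (pts, data) packet, then None for end-of-stream.

    Returns the number of compressed packets sent (excluding the final None).
    """
    count = 0
    for pts, data in packets:
        decoder.send((pts, data))
        count += 1
    decoder.send(None)  # end-of-stream marker
    return count
```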

free(self)