Operations

Every editing primitive in videopython is an Operation subclass — a Pydantic BaseModel whose fields ARE the JSON wire format. Subclasses auto-register via __pydantic_init_subclass__, so importing videopython.editing (or videopython.ai) populates the registry. The registry is what VideoEdit.json_schema() uses to build the discriminated-union schema for LLM-driven plan generation.

Subclass Contract

from typing import ClassVar, Literal
from pydantic import Field

from videopython.editing import Operation, OpCategory
from videopython.base.video import Video, VideoMetadata


class Resize(Operation):
    """Resize the video.

    Args:
        width: Target width in pixels.
        height: Target height in pixels.
    """

    op: Literal["resize"] = "resize"            # discriminator + registry key
    category: ClassVar[OpCategory] = OpCategory.TRANSFORM
    streamable: ClassVar[bool] = True

    width: int | None = Field(None, gt=0)
    height: int | None = Field(None, gt=0)

    def apply(self, video: Video) -> Video: ...
    def predict_metadata(self, meta: VideoMetadata) -> VideoMetadata: ...
    def to_ffmpeg_filter(self, ctx) -> str | None: ...   # streamable transforms only

Notes:

  • op is a one-value Literal field (not a ClassVar). It flows into the JSON wire as the discriminator and is also the registry key.
  • category is OpCategory.TRANSFORM, OpCategory.EFFECT, or OpCategory.SPECIAL.
  • streamable: ClassVar[bool] = True lets VideoEdit.run_to_file() treat this op as streaming-compatible. For transforms that means implementing to_ffmpeg_filter; for effects that means implementing process_frame, plus streaming_init when per-stream precomputation is needed (its default is a no-op).
  • Context-dependent ops declare requires: ClassVar[tuple[str, ...]] = ("transcription",) and use a wider apply signature (def apply(self, video, transcription=None)) with # type: ignore[override].
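
How the declared `requires` keys reach `apply` can be sketched as follows. This is an illustration under stated assumptions: `run_op` is a hypothetical runner, `AddSubtitles` is a hypothetical op, and the video is a plain string so the example runs without videopython installed.

```python
from typing import Any, ClassVar


class Operation:
    """Stand-in base class with the same `requires` ClassVar shape."""

    requires: ClassVar[tuple[str, ...]] = ()

    def apply(self, video: Any) -> Any:
        raise NotImplementedError(f"{type(self).__name__}.apply not implemented")


class AddSubtitles(Operation):
    """Hypothetical context-dependent op."""

    requires: ClassVar[tuple[str, ...]] = ("transcription",)

    def apply(self, video, transcription=None):  # type: ignore[override]
        if transcription is None:
            raise ValueError("transcription is required")
        return f"{video}+subs({transcription})"


def run_op(op: Operation, video: Any, context: dict[str, Any]) -> Any:
    """Hypothetical runner: pass only the context keys the op declares."""
    missing = [key for key in op.requires if key not in context]
    if missing:
        raise KeyError(f"{type(op).__name__} requires {missing}")
    kwargs = {key: context[key] for key in op.requires}
    return op.apply(video, **kwargs)


result = run_op(AddSubtitles(), "clip", {"transcription": "hello", "unused": 1})
print(result)
```

Ops with an empty `requires` receive no extra keyword arguments, so the narrow `apply(self, video)` signature keeps working unchanged.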

Effects

Effect(Operation) adds a window: TimeRange | None field and a shape-and-frame-count-preserving invariant. Subclasses override _apply(self, video); the base Effect.apply resolves the window, slices the video, runs _apply, splices the result back, and asserts the invariant.

class ColorGrading(Effect):
    op: Literal["color_adjust"] = "color_adjust"
    streamable: ClassVar[bool] = True

    brightness: float = Field(0.0, ge=-1, le=1)
    # ... more fields ...

    def _apply(self, video: Video) -> Video: ...

The window field on the wire:

{"op": "color_adjust", "brightness": 0.1, "window": {"start": 1.0, "stop": 3.0}}

Audio-mutating effects (Fade, VolumeAdjust) and ops that don't fit the frame-preserving shape (TranscriptionOverlay) override apply directly.
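
The resolve, slice, run, splice, check sequence performed by the base Effect.apply can be sketched on a plain list of frames. This is a simplified illustration, not the library code: frames are integers, `apply_in_window` and `resolve_window` are hypothetical helper names, and only the frame count (not the full video shape) is checked.

```python
def resolve_window(start, stop, total_seconds):
    """Fill None endpoints and clamp both to the clip length (seconds)."""
    start_s = 0.0 if start is None else float(start)
    stop_s = total_seconds if stop is None else float(stop)
    start_s = min(start_s, total_seconds)
    stop_s = min(stop_s, total_seconds)
    if stop_s < start_s:
        raise ValueError(f"stop ({stop_s}) must be >= start ({start_s})")
    return start_s, stop_s


def apply_in_window(frames, fps, start, stop, effect):
    """Run `effect` on the windowed slice and splice the result back."""
    start_s, stop_s = resolve_window(start, stop, len(frames) / fps)
    start_f = round(start_s * fps)
    end_f = round(stop_s * fps)
    inner = effect(frames[start_f:end_f])
    if len(inner) != end_f - start_f:
        raise RuntimeError("effects must preserve frame count")
    return frames[:start_f] + inner + frames[end_f:]


frames = list(range(10))  # ten dummy frames at 2 fps = 5 seconds
out = apply_in_window(frames, 2.0, 1.0, 3.0, lambda fs: [f * 10 for f in fs])
print(out)  # [0, 1, 20, 30, 40, 50, 6, 7, 8, 9]
```

Frames outside the window pass through untouched, which is what lets the invariant be checked after the splice.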

Registry API

from videopython.editing import Operation

# Snapshot of {op_id: subclass} for every registered operation:
Operation.registry()

# Look up by op_id (raises KeyError if unknown):
cls = Operation.get("resize")

# Discriminated-union JSON Schema covering every registered op:
schema = Operation.json_schema()

AI operations register lazily: run import videopython.ai before inspecting the registry if you need face_crop and friends.

Discovering Operations

from videopython.editing import Operation, OpCategory

for op_id, cls in Operation.registry().items():
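    # __doc__ may be None for classes without a docstring, so guard the lookup.
    doc = (cls.__doc__ or "").strip()
    print(f"{op_id}: {doc.splitlines()[0] if doc else '(no docstring)'}")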

transforms = {k: v for k, v in Operation.registry().items()
              if v.category is OpCategory.TRANSFORM}

Per-Operation JSON Schema

Every subclass exposes cls.model_json_schema() (standard Pydantic), returning the JSON Schema for that specific op's fields:

from videopython.editing import Operation

cls = Operation.get("blur_effect")
schema = cls.model_json_schema()
# {
#   "properties": {
#     "op": {"const": "blur_effect", ...},
#     "mode": {"enum": ["constant", "ascending", "descending"], ...},
#     "iterations": {"type": "integer", "minimum": 1, ...},
#     "window": {"anyOf": [{"$ref": "..."}, {"type": "null"}], ...},
#     ...
#   },
#   ...
# }

Operation.json_schema() is the union over all registered ops, and that's the schema VideoEdit.json_schema() embeds for the operations field.

Registered Operations

Base (no AI dependencies)

ID                    Class                 Category   Streamable
cut_frames            CutFrames             transform  no
cut                   CutSeconds            transform  no
resize                Resize                transform  yes
resample_fps          ResampleFPS           transform  yes
crop                  Crop                  transform  yes
speed_change          SpeedChange           transform  no
reverse               Reverse               transform  no
freeze_frame          FreezeFrame           transform  no
silence_removal       SilenceRemoval        transform  no (requires transcription)
blur_effect           Blur                  effect     yes
zoom_effect           Zoom                  effect     yes
color_adjust          ColorGrading          effect     yes
vignette              Vignette              effect     yes
ken_burns             KenBurns              effect     yes
full_image_overlay    FullImageOverlay      effect     yes
image_overlay         ImageOverlay          effect     yes
fade                  Fade                  effect     yes
volume_adjust         VolumeAdjust          effect     yes
text_overlay          TextOverlay           effect     yes
add_subtitles         TranscriptionOverlay  effect     no (requires transcription)
shake                 Shake                 effect     yes
punch_in              PunchIn               effect     yes
flash                 Flash                 effect     yes
chromatic_aberration  ChromaticAberration   effect     yes
glitch                Glitch                effect     yes
film_grain            FilmGrain             effect     yes
sharpen               Sharpen               effect     yes
pixelate              Pixelate              effect     yes
mirror_flip           MirrorFlip            effect     yes
kaleidoscope          Kaleidoscope          effect     yes

AI (require import videopython.ai)

ID                    Class                 Category   Streamable
face_crop             FaceTrackingCrop      transform  no

API Reference

Operation

Bases: BaseModel

Pydantic base for every editing primitive.

Concrete subclasses MUST declare an op field with a single-value Literal[str] annotation; that value is the discriminator on the JSON wire and the registry key. Subclasses may override the category, streamable, and requires ClassVars.

The default apply raises NotImplementedError; predict_metadata defaults to identity; to_ffmpeg_filter defaults to None (eager).

Source code in src/videopython/editing/operation.py
class Operation(BaseModel):
    """Pydantic base for every editing primitive.

    Concrete subclasses MUST declare an ``op`` field with a single-value
    ``Literal[str]`` annotation; that value is the discriminator on the JSON
    wire and the registry key. Subclasses may override the ``category``,
    ``streamable``, and ``requires`` ClassVars.

    The default ``apply`` raises ``NotImplementedError``; ``predict_metadata``
    defaults to identity; ``to_ffmpeg_filter`` defaults to ``None`` (eager).
    """

    model_config = ConfigDict(extra="forbid", validate_assignment=True)

    op: str

    category: ClassVar[OpCategory] = OpCategory.SPECIAL
    streamable: ClassVar[bool] = False
    requires: ClassVar[tuple[str, ...]] = ()

    _registry: ClassVar[dict[str, type[Operation]]] = {}

    @classmethod
    def __pydantic_init_subclass__(cls, **kwargs: Any) -> None:
        super().__pydantic_init_subclass__(**kwargs)

        op_field = cls.model_fields.get("op")
        if op_field is None:
            return
        annotation = op_field.annotation
        if get_origin(annotation) is not Literal:
            # Abstract intermediate (e.g. Effect) -- no concrete op_id yet.
            return
        literal_values = get_args(annotation)
        if len(literal_values) != 1 or not isinstance(literal_values[0], str):
            raise TypeError(f"{cls.__name__}.op must be Literal of a single str, got {literal_values!r}")
        op_id = literal_values[0]

        existing = Operation._registry.get(op_id)
        if existing is not None and existing is not cls:
            raise ValueError(
                f"Duplicate op_id '{op_id}': "
                f"{cls.__module__}.{cls.__qualname__} vs "
                f"{existing.__module__}.{existing.__qualname__}"
            )
        Operation._registry[op_id] = cls

    @property
    def op_id(self) -> str:
        """Wire / registry identifier. Mirrors ``self.op``."""
        return self.op

    @classmethod
    def registry(cls) -> dict[str, type[Operation]]:
        """Snapshot of ``{op_id: subclass}`` for every registered Operation."""
        return dict(Operation._registry)

    @classmethod
    def get(cls, op_id: str) -> type[Operation]:
        """Look up the Operation subclass for ``op_id``."""
        try:
            return Operation._registry[op_id]
        except KeyError as exc:
            known = ", ".join(sorted(Operation._registry)) or "(none)"
            raise KeyError(f"Unknown op_id {op_id!r}. Known ops: [{known}]") from exc

    @classmethod
    def json_schema(cls) -> dict[str, Any]:
        """Discriminated-union JSON schema over every registered Operation.

        ``op`` is the discriminator tag. This is the LLM-facing schema for
        validating a single operation payload.
        """
        if not Operation._registry:
            return {"type": "object"}
        ops = sorted(Operation._registry.values(), key=lambda c: c.__name__)
        annotated = Annotated[Union[tuple(ops)], Discriminator("op")]  # type: ignore[valid-type]  # noqa: UP007
        return TypeAdapter(annotated).json_schema()

    def apply(self, video: Video) -> Video:
        """Run this operation on ``video``.

        The runner passes pipeline-context values listed in ``cls.requires``
        as keyword arguments (e.g. ``transcription=...``). Subclasses that
        declare ``requires`` widen the signature accordingly -- e.g.
        ``def apply(self, video, transcription=None) -> Video``.
        """
        raise NotImplementedError(f"{type(self).__name__}.apply not implemented")

    def predict_metadata(self, meta: VideoMetadata) -> VideoMetadata:
        """Predict output metadata from input metadata. Default: identity.

        Run during ``VideoEdit.validate()``'s dry-run, before any frames are
        decoded. Beyond predicting shape, this is the fail-fast gate, and it
        has one contract: **reject exactly the plans that would otherwise crash
        or do unrecoverable / expensive work in** :meth:`apply` **/** ``run()``;
        anything ``run()`` can absorb by graceful degradation is NOT rejected.
        ``TranscriptionOverlay`` rejects un-fittable subtitles (they used to
        crash mid-render); ``TextOverlay``/``ImageOverlay`` do not reject
        off-frame geometry (it clips to a valid no-op). Keep the check
        metadata-cheap -- no frame decode.
        """
        return meta

    def to_ffmpeg_filter(self, ctx: FilterCtx) -> str | None:
        """Compile to an ffmpeg ``-vf`` filter expression, or ``None`` for eager.

        Streamable transforms override this. Effects use ``process_frame``
        instead -- they do not go through ffmpeg filters.
        """
        return None

op_id property

op_id: str

Wire / registry identifier. Mirrors self.op.

registry classmethod

registry() -> dict[str, type[Operation]]

Snapshot of {op_id: subclass} for every registered Operation.

get classmethod

get(op_id: str) -> type[Operation]

Look up the Operation subclass for op_id.

json_schema classmethod

json_schema() -> dict[str, Any]

Discriminated-union JSON schema over every registered Operation.

op is the discriminator tag. This is the LLM-facing schema for validating a single operation payload.

apply

apply(video: Video) -> Video

Run this operation on video.

The runner passes pipeline-context values listed in cls.requires as keyword arguments (e.g. transcription=...). Subclasses that declare requires widen the signature accordingly -- e.g. def apply(self, video, transcription=None) -> Video.

predict_metadata

predict_metadata(meta: VideoMetadata) -> VideoMetadata

Predict output metadata from input metadata. Default: identity.

Run during VideoEdit.validate()'s dry-run, before any frames are decoded. Beyond predicting shape, this is the fail-fast gate, and it has one contract: reject exactly the plans that would otherwise crash or do unrecoverable / expensive work in apply / run(); anything run() can absorb by graceful degradation is NOT rejected. TranscriptionOverlay rejects un-fittable subtitles (they used to crash mid-render); TextOverlay/ImageOverlay do not reject off-frame geometry (it clips to a valid no-op). Keep the check metadata-cheap -- no frame decode.
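
The dry-run idea can be sketched as a fold of predict_metadata over the plan. Everything here is an assumption made for illustration: `Meta` stands in for VideoMetadata (its field names are invented), the two ops are simplified, and whether the library's Crop rejects oversized crops is not stated in this document.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Meta:
    """Stand-in for VideoMetadata; field names are assumptions."""

    width: int
    height: int
    fps: float
    frame_count: int


class Resize:
    def __init__(self, width: int, height: int):
        self.width, self.height = width, height

    def predict_metadata(self, meta: Meta) -> Meta:
        return replace(meta, width=self.width, height=self.height)


class Crop:
    def __init__(self, width: int, height: int):
        self.width, self.height = width, height

    def predict_metadata(self, meta: Meta) -> Meta:
        # Fail fast: this plan could not run, so reject it before decoding.
        if self.width > meta.width or self.height > meta.height:
            raise ValueError(
                f"crop {self.width}x{self.height} exceeds frame {meta.width}x{meta.height}"
            )
        return replace(meta, width=self.width, height=self.height)


def dry_run(ops, meta: Meta) -> Meta:
    """Thread metadata through every op without touching any frames."""
    for op in ops:
        meta = op.predict_metadata(meta)
    return meta


source = Meta(width=1920, height=1080, fps=30.0, frame_count=300)
final = dry_run([Resize(640, 360), Crop(320, 320)], source)
print(final)
```

Each op sees the metadata produced by the ops before it, so a crop that fits the source but not the resized frame is still caught.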

to_ffmpeg_filter

to_ffmpeg_filter(ctx: FilterCtx) -> str | None

Compile to an ffmpeg -vf filter expression, or None for eager.

Streamable transforms override this. Effects use process_frame instead -- they do not go through ffmpeg filters.
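
A rough sketch of how streamable transforms might compile into one `-vf` expression follows. The `scale` and `fps` filter names are standard ffmpeg filters; the op classes, the aspect-ratio rule in `Resize`, and the `build_vf` helper are hypothetical and only mirror the shapes described on this page. For brevity the same ctx is passed to every op, whereas the documented FilterCtx reflects state after prior ops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterCtx:
    width: int
    height: int
    fps: float


class Resize:
    def __init__(self, width=None, height=None):
        self.width, self.height = width, height

    def to_ffmpeg_filter(self, ctx):
        width, height = self.width, self.height
        if width is None and height is None:
            width, height = ctx.width, ctx.height
        elif height is None:
            # Assumed rule: keep aspect ratio, rounded to an even size.
            height = round(ctx.height * width / ctx.width / 2) * 2
        elif width is None:
            width = round(ctx.width * height / ctx.height / 2) * 2
        return f"scale={width}:{height}"


class ResampleFPS:
    def __init__(self, fps):
        self.fps = fps

    def to_ffmpeg_filter(self, ctx):
        return f"fps={self.fps}"


class Reverse:
    def to_ffmpeg_filter(self, ctx):
        return None  # eager: needs the whole clip in memory


def build_vf(ops, ctx):
    """Join filter expressions with commas, or None if any op is eager."""
    parts = []
    for op in ops:
        expr = op.to_ffmpeg_filter(ctx)
        if expr is None:
            return None
        parts.append(expr)
    return ",".join(parts)


ctx = FilterCtx(width=1920, height=1080, fps=30.0)
vf = build_vf([Resize(width=640), ResampleFPS(24)], ctx)
print(vf)  # scale=640:360,fps=24
```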

Effect

Bases: Operation

Operation that preserves shape and frame count, with optional streaming.

Subclasses override _apply for in-memory execution and may additionally override streaming_init / process_frame for bounded-memory streaming via editing/streaming.py. The base apply resolves window, slices the video, runs _apply on the slice, splices the result back, and asserts the shape-preserving invariant.

Source code in src/videopython/editing/operation.py
class Effect(Operation):
    """Operation that preserves shape and frame count, with optional streaming.

    Subclasses override :meth:`_apply` for in-memory execution and may
    additionally override :meth:`streaming_init` / :meth:`process_frame` for
    bounded-memory streaming via ``editing/streaming.py``. The base
    :meth:`apply` resolves :attr:`window`, slices the video, runs
    ``_apply`` on the slice, splices the result back, and asserts the
    shape-preserving invariant.
    """

    category: ClassVar[OpCategory] = OpCategory.EFFECT

    window: TimeRange | None = Field(
        None,
        description="Time window for the effect in seconds. Omit to apply across the full duration.",
    )

    def apply(self, video: Video, **context: Any) -> Video:
        from videopython.base.video import Video as _Video

        original_shape = video.video_shape

        if self.window is None or (self.window.start is None and self.window.stop is None):
            result = self._apply(video)
        else:
            start_s, stop_s = self._resolved_window(video.total_seconds)
            start_f = round(start_s * video.fps)
            end_f = round(stop_s * video.fps)
            inner = self._apply(video[start_f:end_f])
            old_audio = video.audio
            result = _Video.from_frames(
                np.r_["0,2", video.frames[:start_f], inner.frames, video.frames[end_f:]],
                fps=video.fps,
            )
            result.audio = old_audio

        if result.video_shape != original_shape:
            raise RuntimeError(
                f"{type(self).__name__} changed video shape from {original_shape} "
                f"to {result.video_shape}; effects must preserve shape and frame count."
            )
        return result

    def predict_metadata(self, meta: VideoMetadata, **_context: Any) -> VideoMetadata:
        """Effects preserve shape and frame count, so the prediction is identity.

        Accepts ``**_context`` so requires-aware effects (``TranscriptionOverlay``)
        validate without subclasses needing to override just to widen the
        signature. Mirrors :meth:`Effect.apply`'s ``**context`` accept-all.
        """
        return meta

    def _resolved_window(self, total_seconds: float) -> tuple[float, float]:
        win = self.window or TimeRange()
        start_s = 0.0 if win.start is None else float(win.start)
        stop_s = total_seconds if win.stop is None else float(win.stop)
        start_s = min(start_s, total_seconds)
        stop_s = min(stop_s, total_seconds)
        if stop_s < start_s:
            raise ValueError(f"Effect stop ({stop_s}) must be >= start ({start_s})")
        return start_s, stop_s

    def _apply(self, video: Video) -> Video:
        """Apply the effect to ``video`` in memory. Override in subclasses."""
        raise NotImplementedError(f"{type(self).__name__}._apply not implemented")

    def streaming_init(self, total_frames: int, fps: float, width: int, height: int) -> None:
        """Hook for per-stream precomputation (per-frame alphas, sigma curves...).

        Default: no-op. Override in subclasses that need it.
        """

    def process_frame(self, frame: np.ndarray, frame_index: int) -> np.ndarray:
        """Process one ``(H, W, 3) uint8`` frame in streaming mode.

        ``frame_index`` is 0-based within this effect's active window.
        """
        raise NotImplementedError(f"{type(self).__name__} does not support streaming")

predict_metadata

predict_metadata(
    meta: VideoMetadata, **_context: Any
) -> VideoMetadata

Effects preserve shape and frame count, so the prediction is identity.

Accepts **_context so requires-aware effects (TranscriptionOverlay) validate without subclasses needing to override just to widen the signature. Mirrors Effect.apply's **context accept-all.

streaming_init

streaming_init(
    total_frames: int, fps: float, width: int, height: int
) -> None

Hook for per-stream precomputation (per-frame alphas, sigma curves...).

Default: no-op. Override in subclasses that need it.

process_frame

process_frame(
    frame: ndarray, frame_index: int
) -> np.ndarray

Process one (H, W, 3) uint8 frame in streaming mode.

frame_index is 0-based within this effect's active window.
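
The two streaming hooks fit together roughly as below: streaming_init precomputes per-frame values once, and process_frame looks them up by index. This is a sketch only. `FadeIn` is a hypothetical effect, frames are flat `bytes` buffers as a stdlib stand-in for `(H, W, 3) uint8` arrays, and `total_frames` is assumed to count the frames in the effect's active window.

```python
class FadeIn:
    """Hypothetical streaming effect: linear fade from black to full."""

    def __init__(self):
        self._alphas: list[float] = []

    def streaming_init(self, total_frames, fps, width, height):
        # Precompute one alpha per frame so process_frame stays cheap.
        last = max(total_frames - 1, 1)
        self._alphas = [index / last for index in range(total_frames)]

    def process_frame(self, frame: bytes, frame_index: int) -> bytes:
        # frame_index is 0-based within the effect's active window.
        alpha = self._alphas[frame_index]
        return bytes(int(value * alpha) for value in frame)


effect = FadeIn()
effect.streaming_init(total_frames=3, fps=30.0, width=2, height=1)
white = bytes([255] * 6)  # one 2x1 RGB frame, flattened
out = [effect.process_frame(white, index) for index in range(3)]
print([frame[0] for frame in out])  # [0, 127, 255]
```

Each output buffer has the same length as its input, which is the streaming counterpart of the shape-preserving invariant.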

TimeRange

Bases: BaseModel

Half-open time window in seconds: [start, stop).

Either endpoint may be None, meaning "from the beginning" / "to the end" respectively. Used by Effect.window and elsewhere.

Source code in src/videopython/editing/operation.py
class TimeRange(BaseModel):
    """Half-open time window in seconds: ``[start, stop)``.

    Either endpoint may be ``None``, meaning "from the beginning" / "to the
    end" respectively. Used by :class:`Effect.window` and elsewhere.
    """

    model_config = ConfigDict(extra="forbid", frozen=True)

    start: float | None = Field(None, ge=0, description="Start time in seconds. None means 0.")
    stop: float | None = Field(None, ge=0, description="Stop time in seconds. None means end of video.")

    @model_validator(mode="after")
    def _validate_order(self) -> TimeRange:
        if self.start is not None and self.stop is not None and self.stop < self.start:
            raise ValueError(f"TimeRange.stop ({self.stop}) must be >= start ({self.start})")
        return self

OpCategory

Bases: str, Enum

Coarse execution category for an Operation subclass.

Source code in src/videopython/editing/operation.py
class OpCategory(str, Enum):
    """Coarse execution category for an Operation subclass."""

    TRANSFORM = "transform"
    EFFECT = "effect"
    SPECIAL = "special"

FilterCtx dataclass

Current pipeline state (post-prior-ops) when compiling to ffmpeg.

Source code in src/videopython/editing/operation.py
@dataclass(frozen=True)
class FilterCtx:
    """Current pipeline state (post-prior-ops) when compiling to ffmpeg."""

    width: int
    height: int
    fps: float