Operations
Every editing primitive in videopython is an Operation subclass — a
Pydantic BaseModel whose fields ARE the JSON wire format. Subclasses
auto-register via __pydantic_init_subclass__, so importing
videopython.editing (or videopython.ai) populates the registry. The
registry is what VideoEdit.json_schema() uses to build the
discriminated-union schema for LLM-driven plan generation.
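The auto-registration idea can be sketched in plain Python. This is a minimal illustration, not videopython's implementation: it uses the standard __init_subclass__ hook rather than Pydantic's __pydantic_init_subclass__, and MiniOperation, its _registry dict, and the plain op class attribute are hypothetical stand-ins.

```python
from typing import ClassVar


class MiniOperation:
    """Hypothetical stand-in for Operation; not the real videopython class."""

    _registry: ClassVar[dict[str, type["MiniOperation"]]] = {}
    op: ClassVar[str] = ""  # the real library uses a one-value Literal field

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Defining a subclass is enough to register it; no explicit call needed.
        if cls.op:
            if cls.op in MiniOperation._registry:
                raise ValueError(f"duplicate op id: {cls.op!r}")
            MiniOperation._registry[cls.op] = cls

    @classmethod
    def registry(cls) -> dict[str, type["MiniOperation"]]:
        # Return a copy so callers cannot mutate the registry by accident.
        return dict(MiniOperation._registry)

    @classmethod
    def get(cls, op_id: str) -> type["MiniOperation"]:
        return MiniOperation._registry[op_id]  # KeyError if unknown


class MiniResize(MiniOperation):
    op = "resize"


class MiniCrop(MiniOperation):
    op = "crop"


registered = sorted(MiniOperation.registry())
print(registered)
```

Because registration happens when the class body is executed, a module that is never imported never registers its ops, which is why the import matters.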
Subclass Contract
from typing import ClassVar, Literal

from pydantic import Field

from videopython.editing import Operation, OpCategory
from videopython.base.video import Video, VideoMetadata


class Resize(Operation):
    """Resize the video.

    Args:
        width: Target width in pixels.
        height: Target height in pixels.
    """

    op: Literal["resize"] = "resize"  # discriminator + registry key
    category: ClassVar[OpCategory] = OpCategory.TRANSFORM
    streamable: ClassVar[bool] = True

    width: int | None = Field(None, gt=0)
    height: int | None = Field(None, gt=0)

    def apply(self, video: Video) -> Video: ...
    def predict_metadata(self, meta: VideoMetadata) -> VideoMetadata: ...
    def to_ffmpeg_filter(self, ctx) -> str | None: ...  # streamable transforms only
Notes:
- op is a one-value Literal field (not a ClassVar). It flows into the JSON wire as the discriminator and is also the registry key.
- category is OpCategory.TRANSFORM, OpCategory.EFFECT, or OpCategory.SPECIAL.
- streamable: ClassVar[bool] = True lets VideoEdit.run_to_file() treat this op as streaming-compatible. For transforms that means implementing to_ffmpeg_filter; for effects that means implementing process_frame and streaming_init.
- Context-dependent ops declare requires: ClassVar[tuple[str, ...]] = ("transcription",) and use a wider apply signature (def apply(self, video, transcription=None)) with # type: ignore[override].
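The requires mechanism in the last note can be pictured as a runner that forwards only the context values an op asks for. This is a hedged sketch, not the library's runner: FakeOp, FakeUppercase, FakeSilenceRemoval, run_op, and the context dict are hypothetical, and videos are represented by plain strings.

```python
from typing import Any, ClassVar


class FakeOp:
    """Hypothetical base class standing in for Operation."""

    requires: ClassVar[tuple[str, ...]] = ()

    def apply(self, video: str) -> str:
        raise NotImplementedError


class FakeUppercase(FakeOp):
    def apply(self, video: str) -> str:
        return video.upper()


class FakeSilenceRemoval(FakeOp):
    requires: ClassVar[tuple[str, ...]] = ("transcription",)

    # Wider signature than the base class, hence the type-checker override.
    def apply(self, video: str, transcription: Any = None) -> str:  # type: ignore[override]
        if transcription is None:
            raise ValueError("transcription is required")
        return f"{video} trimmed using {len(transcription)} words"


def run_op(op: FakeOp, video: str, context: dict[str, Any]) -> str:
    # Forward only the context keys the op declares in `requires`.
    kwargs = {name: context[name] for name in op.requires if name in context}
    return op.apply(video, **kwargs)


context = {"transcription": ["hello", "world"], "unused": 123}
plain = run_op(FakeUppercase(), "clip", context)
trimmed = run_op(FakeSilenceRemoval(), "clip", context)
print(plain, "|", trimmed)
```

Ops that declare nothing in requires keep the narrow apply(video) signature and never see the extra context.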
Effects
Effect(Operation) adds a window: TimeRange | None field and a
shape-and-frame-count-preserving invariant. Subclasses override
_apply(self, video); the base Effect.apply resolves the window,
slices the video, runs _apply, splices the result back, and asserts
the invariant.
class ColorGrading(Effect):
    op: Literal["color_adjust"] = "color_adjust"
    streamable: ClassVar[bool] = True

    brightness: float = Field(0.0, ge=-1, le=1)
    # ... more fields ...

    def _apply(self, video: Video) -> Video: ...
On the wire, the window field is either null or a TimeRange object, a half-open [start, stop) range in seconds.
Audio-mutating effects (Fade, VolumeAdjust) and ops that don't fit
the frame-preserving shape (TranscriptionOverlay) override apply
directly.
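The window handling that the base Effect.apply performs (resolve, slice, run, splice, assert) can be sketched with frames as a plain list. This is an illustration under assumptions, not the library's code: apply_windowed, its parameters, and the rounding of seconds to frame indexes are hypothetical.

```python
from typing import Callable, Optional

Frames = list[int]  # stand-in: one int per frame instead of an image array


def apply_windowed(
    frames: Frames,
    fps: float,
    start: Optional[float],
    stop: Optional[float],
    effect: Callable[[Frames], Frames],
) -> Frames:
    """Run `effect` on the frames inside the half-open window [start, stop)."""
    # Resolve the window: None means "from the beginning" / "to the end".
    first = 0 if start is None else max(0, round(start * fps))
    last = len(frames) if stop is None else min(len(frames), round(stop * fps))
    last = max(first, last)  # an inverted window becomes an empty slice

    # Slice, run the effect, and check the frame-count invariant.
    processed = effect(frames[first:last])
    if len(processed) != last - first:
        raise AssertionError("effect changed the number of frames")

    # Splice the processed slice back between the untouched parts.
    return frames[:first] + processed + frames[last:]


clip = [10, 20, 30, 40, 50, 60]  # 6 frames at 2 fps = 3 seconds
brightened = apply_windowed(clip, 2.0, 1.0, 2.0, lambda fs: [f + 1 for f in fs])
print(brightened)
```

Keeping the slicing in one place means each effect only has to implement the per-slice transformation, which is the role _apply plays above.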
Registry API
from videopython.editing import Operation
# Snapshot of {op_id: subclass} for every registered operation:
Operation.registry()
# Look up by op_id (raises KeyError if unknown):
cls = Operation.get("resize")
# Discriminated-union JSON Schema covering every registered op:
schema = Operation.json_schema()
AI operations register lazily, so run import videopython.ai before
inspecting the registry if you need face_crop and friends.
Discovering Operations
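from videopython.editing import Operation, OpCategory

# Print each op id with the first line of its docstring (empty if it has none).
for op_id, cls in Operation.registry().items():
    doc = (cls.__doc__ or "").strip()
    summary = doc.splitlines()[0] if doc else ""
    print(f"{op_id}: {summary}")

transforms = {k: v for k, v in Operation.registry().items()
              if v.category is OpCategory.TRANSFORM}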
Per-Operation JSON Schema
Every subclass exposes cls.model_json_schema() (standard Pydantic),
returning the JSON Schema for that specific op's fields:
from videopython.editing import Operation

cls = Operation.get("blur_effect")
schema = cls.model_json_schema()
# {
#   "properties": {
#     "op": {"const": "blur_effect", ...},
#     "mode": {"enum": ["constant", "ascending", "descending"], ...},
#     "iterations": {"type": "integer", "minimum": 1, ...},
#     "window": {"anyOf": [{"$ref": "..."}, {"type": "null"}], ...},
#     ...
#   },
#   ...
# }
Operation.json_schema() is the union over all registered ops, and
that's the schema VideoEdit.json_schema() embeds for the operations
field.
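How a union over per-op schemas fits together can be illustrated with plain dicts. The sketch below assumes a general shape (a JSON Schema oneOf with an OpenAPI-style discriminator object) and is not the exact output of Operation.json_schema(); the two hand-written per-op schemas and the build_union_schema helper are hypothetical.

```python
import json


def build_union_schema(per_op: dict[str, dict]) -> dict:
    """Combine per-op schemas into one schema discriminated on the `op` field."""
    defs = {}
    branches = []
    mapping = {}
    for op_id, schema in sorted(per_op.items()):
        name = schema.get("title", op_id)
        defs[name] = schema
        ref = f"#/$defs/{name}"
        branches.append({"$ref": ref})
        mapping[op_id] = ref  # discriminator value -> schema to validate against
    return {
        "$defs": defs,
        "oneOf": branches,
        "discriminator": {"propertyName": "op", "mapping": mapping},
    }


# Hand-written stand-ins for what cls.model_json_schema() might return.
per_op_schemas = {
    "resize": {
        "title": "Resize",
        "type": "object",
        "properties": {"op": {"const": "resize"}, "width": {"type": "integer"}},
    },
    "reverse": {
        "title": "Reverse",
        "type": "object",
        "properties": {"op": {"const": "reverse"}},
    },
}

union = build_union_schema(per_op_schemas)
print(json.dumps(union["discriminator"], indent=2))
```

The discriminator lets a validator pick the one matching branch from the op value instead of trying every branch in turn.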
Registered Operations
Base (no AI dependencies)
| ID | Class | Category | Streamable |
|---|---|---|---|
| cut_frames | CutFrames | transform | no |
| cut | CutSeconds | transform | no |
| resize | Resize | transform | yes |
| resample_fps | ResampleFPS | transform | yes |
| crop | Crop | transform | yes |
| speed_change | SpeedChange | transform | no |
| reverse | Reverse | transform | no |
| freeze_frame | FreezeFrame | transform | no |
| silence_removal | SilenceRemoval | transform | no (requires transcription) |
| blur_effect | Blur | effect | yes |
| zoom_effect | Zoom | effect | yes |
| color_adjust | ColorGrading | effect | yes |
| vignette | Vignette | effect | yes |
| ken_burns | KenBurns | effect | yes |
| full_image_overlay | FullImageOverlay | effect | yes |
| image_overlay | ImageOverlay | effect | yes |
| fade | Fade | effect | yes |
| volume_adjust | VolumeAdjust | effect | yes |
| text_overlay | TextOverlay | effect | yes |
| add_subtitles | TranscriptionOverlay | effect | no (requires transcription) |
| shake | Shake | effect | yes |
| punch_in | PunchIn | effect | yes |
| flash | Flash | effect | yes |
| chromatic_aberration | ChromaticAberration | effect | yes |
| glitch | Glitch | effect | yes |
| film_grain | FilmGrain | effect | yes |
| sharpen | Sharpen | effect | yes |
| pixelate | Pixelate | effect | yes |
| mirror_flip | MirrorFlip | effect | yes |
| kaleidoscope | Kaleidoscope | effect | yes |
AI (require import videopython.ai)
| ID | Class | Category | Streamable |
|---|---|---|---|
| face_crop | FaceTrackingCrop | transform | no |
API Reference
Operation
Bases: BaseModel
Pydantic base for every editing primitive.
Concrete subclasses MUST declare an op field with a single-value
Literal[str] annotation; that value is the discriminator on the JSON
wire and the registry key. Subclasses may override the category,
streamable, and requires ClassVars.
The default apply raises NotImplementedError; predict_metadata
defaults to identity; to_ffmpeg_filter defaults to None (eager).
Source code in src/videopython/editing/operation.py
registry (classmethod)
get (classmethod)
Look up the Operation subclass for op_id.
json_schema (classmethod)
Discriminated-union JSON schema over every registered Operation.
op is the discriminator tag. This is the LLM-facing schema for
validating a single operation payload.
apply
Run this operation on video.
The runner passes pipeline-context values listed in cls.requires
as keyword arguments (e.g. transcription=...). Subclasses that
declare requires widen the signature accordingly -- e.g.
def apply(self, video, transcription=None) -> Video.
predict_metadata
Predict output metadata from input metadata. Default: identity.
Run during VideoEdit.validate()'s dry-run, before any frames are
decoded. Beyond predicting shape, this is the fail-fast gate, and it
has one contract: reject exactly the plans that would otherwise crash
or do unrecoverable / expensive work in apply / run();
anything run() can absorb by graceful degradation is NOT rejected.
TranscriptionOverlay rejects un-fittable subtitles (they used to
crash mid-render); TextOverlay/ImageOverlay do not reject
off-frame geometry (it clips to a valid no-op). Keep the check
metadata-cheap -- no frame decode.
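The dry-run idea can be sketched as threading metadata through each op's predict_metadata and letting an exception stop the plan before any frame work. This is a minimal sketch under assumptions: Meta, SketchResize, SketchTrim, and dry_run are hypothetical stand-ins, not VideoMetadata or the real validate() logic.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Meta:
    """Hypothetical stand-in for VideoMetadata."""

    width: int
    height: int
    fps: float
    frame_count: int


class SketchResize:
    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height

    def predict_metadata(self, meta: Meta) -> Meta:
        return replace(meta, width=self.width, height=self.height)


class SketchTrim:
    def __init__(self, start: int, end: int) -> None:
        self.start, self.end = start, end

    def predict_metadata(self, meta: Meta) -> Meta:
        # Fail fast: reject a range that cannot be satisfied at run time.
        if not 0 <= self.start < self.end <= meta.frame_count:
            raise ValueError(
                f"trim [{self.start}, {self.end}) is outside 0..{meta.frame_count}"
            )
        return replace(meta, frame_count=self.end - self.start)


def dry_run(ops, meta: Meta) -> Meta:
    # Thread metadata through every op; no frames are decoded here.
    for op in ops:
        meta = op.predict_metadata(meta)
    return meta


source = Meta(width=1920, height=1080, fps=30.0, frame_count=300)
predicted = dry_run([SketchTrim(0, 90), SketchResize(640, 360)], source)
print(predicted)

try:
    # The second trim asks for frames the first trim already removed.
    dry_run([SketchTrim(0, 90), SketchTrim(50, 120)], source)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Because each op sees the metadata predicted by the ops before it, an error that depends on an earlier step is caught without rendering that step.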
to_ffmpeg_filter
Compile to an ffmpeg -vf filter expression, or None for eager.
Streamable transforms override this. Effects use process_frame
instead -- they do not go through ffmpeg filters.
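As a rough illustration of compiling transforms to a -vf expression, the sketch below builds ffmpeg scale and fps filter strings and joins them with commas. The helper names (resize_filter, fps_filter, build_vf) are hypothetical, the ctx argument of the real method is left out, and the None-when-nothing-to-scale behaviour is a simplification rather than the library's eager fallback.

```python
from typing import Optional


def resize_filter(width: Optional[int], height: Optional[int]) -> Optional[str]:
    """Build an ffmpeg `scale` filter, or None when no dimension is given."""
    if width is None and height is None:
        return None
    # In ffmpeg's scale filter, -2 keeps the aspect ratio and makes the
    # computed dimension divisible by 2.
    w = -2 if width is None else width
    h = -2 if height is None else height
    return f"scale={w}:{h}"


def fps_filter(fps: float) -> str:
    # ffmpeg's `fps` filter resamples to a constant frame rate.
    return f"fps={fps:g}"


def build_vf(filters: list[Optional[str]]) -> Optional[str]:
    # ffmpeg chains filters inside one -vf expression with commas.
    parts = [f for f in filters if f is not None]
    return ",".join(parts) if parts else None


vf = build_vf([resize_filter(1280, None), fps_filter(30), resize_filter(None, None)])
print(vf)
```

The resulting string would be passed to ffmpeg as the value of -vf, so the whole chain runs in a single decode and encode pass.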
Effect
Bases: Operation
Operation that preserves shape and frame count, with optional streaming.
Subclasses override _apply for in-memory execution and may
additionally override streaming_init / process_frame for
bounded-memory streaming via editing/streaming.py. The base
apply resolves window, slices the video, runs
_apply on the slice, splices the result back, and asserts the
shape-preserving invariant.
predict_metadata
Effects preserve shape and frame count, so the prediction is identity.
Accepts **_context so requires-aware effects (TranscriptionOverlay)
validate without subclasses needing to override just to widen the
signature. Mirrors Effect.apply's **context accept-all.
streaming_init
Hook for per-stream precomputation (per-frame alphas, sigma curves...).
Default: no-op. Override in subclasses that need it.
process_frame
Process one (H, W, 3) uint8 frame in streaming mode.
frame_index is 0-based within this effect's active window.
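The split between streaming_init and process_frame can be sketched as precomputing per-frame values once and then touching one frame at a time. This is a hedged sketch: SketchFadeIn and both method signatures are hypothetical, and a frame is a flat bytes object here rather than an (H, W, 3) uint8 array.

```python
class SketchFadeIn:
    """Hypothetical streaming effect: linear fade from black over the window."""

    def __init__(self) -> None:
        self._alphas: list[float] = []

    def streaming_init(self, total_frames: int) -> None:
        # Precompute one alpha per frame so process_frame stays cheap.
        if total_frames <= 1:
            self._alphas = [1.0] * total_frames
        else:
            self._alphas = [i / (total_frames - 1) for i in range(total_frames)]

    def process_frame(self, frame: bytes, frame_index: int) -> bytes:
        # frame_index is 0-based within the effect's active window.
        alpha = self._alphas[frame_index]
        return bytes(int(value * alpha) for value in frame)


effect = SketchFadeIn()
effect.streaming_init(total_frames=3)

# Three tiny "frames" of one RGB pixel each, produced lazily by a generator.
source_frames = (bytes([200, 100, 50]) for _ in range(3))
faded = [list(effect.process_frame(f, i)) for i, f in enumerate(source_frames)]
print(faded)
```

Only the precomputed alphas and the current frame need to be in memory at once, which is what makes this style bounded-memory.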
TimeRange
Bases: BaseModel
Half-open time window in seconds: [start, stop).
Either endpoint may be None, meaning "from the beginning" / "to the
end" respectively. Used by Effect.window and elsewhere.
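The half-open semantics can be made concrete with a small stand-in. This is a sketch under assumptions: SketchTimeRange is hypothetical, the field names start and stop are inferred from the [start, stop) notation, and the JSON payload shows one possible wire form rather than a confirmed one.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchTimeRange:
    """Hypothetical stand-in for TimeRange: half-open [start, stop) in seconds."""

    start: Optional[float] = None
    stop: Optional[float] = None

    def contains(self, t: float) -> bool:
        after_start = self.start is None or t >= self.start  # start is inclusive
        before_stop = self.stop is None or t < self.stop  # stop is exclusive
        return after_start and before_stop


window = SketchTimeRange(start=1.0, stop=2.5)
open_ended = SketchTimeRange(start=4.0)  # no stop: runs to the end

checks = [window.contains(1.0), window.contains(2.5), open_ended.contains(99.0)]
print(checks)

# One possible wire form of an effect payload carrying a window.
payload = json.dumps({"op": "color_adjust", "brightness": 0.2, "window": asdict(window)})
print(payload)
```

With half-open ranges, two adjacent windows such as [0, 2) and [2, 4) never both claim the instant t = 2.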