Editing Plans (VideoEdit)

VideoEdit represents a complete multi-segment editing plan:

  1. Extract one or more segments from source videos
  2. Apply per-segment transforms, then effects
  3. Concatenate processed segments
  4. Apply post-assembly transforms, then effects

This is the recommended API for JSON/LLM-generated editing plans.

At a Glance

  • Use segments[*].transforms for per-segment transforms and segments[*].effects for per-segment effects
  • Use post_transforms for transforms after concatenation
  • Use post_effects for effects after concatenation (not post_transforms)
  • Validate first with edit.validate() before edit.run() when plans are generated dynamically

Quick Start

from videopython.base import VideoEdit

plan = {
    "segments": [
        {
            "source": "input.mp4",
            "start": 5.0,
            "end": 12.0,
            "transforms": [
                {"op": "crop", "args": {"width": 0.5, "height": 1.0, "mode": "center"}},
                {"op": "resize", "args": {"width": 1080, "height": 1920}},
            ],
            "effects": [
                {"op": "blur", "args": {"mode": "constant", "iterations": 1}, "apply": {"start": 0.0, "stop": 1.0}}
            ],
        },
        {
            "source": "input.mp4",
            "start": 20.0,
            "end": 28.0,
        },
    ],
    "post_effects": [
        {"op": "color_adjust", "args": {"brightness": 0.05}}
    ],
}

edit = VideoEdit.from_dict(plan)

# Dry-run validation using VideoMetadata (no frame loading)
predicted = edit.validate()
print(predicted)

video = edit.run()
video.save("output.mp4")

JSON Plan Format

Top-level shape:

{
  "segments": [
    {
      "source": "path/to/video.mp4",
      "start": 5.0,
      "end": 15.0,
      "transforms": [
        {"op": "crop", "args": {"width": 1080, "height": 1920}}
      ],
      "effects": [
        {"op": "blur_effect", "args": {"mode": "constant", "iterations": 2}, "apply": {"start": 0.0, "stop": 3.0}}
      ]
    }
  ],
  "post_transforms": [
    {"op": "resize", "args": {"width": 1080, "height": 1920}}
  ],
  "post_effects": [
    {"op": "color_adjust", "args": {"brightness": 0.05}}
  ]
}

Notes:

  • segments is required and must be non-empty.
  • post_transforms and post_effects are optional.
  • post_transforms accepts only transform operations.
  • post_effects accepts only effect operations.
  • Segment keys are strict (source, start, end, transforms, effects).
  • Step keys are strict:
      • transform step: op, optional args
      • effect step: op, optional args, optional apply
  • Unknown top-level keys are ignored for forward compatibility.
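The strict-key rules above can be sketched as a stdlib-only pre-check. The helper below is hypothetical and only illustrates the rule; the real VideoEdit parser performs equivalent validation itself:

```python
# Hypothetical pre-check mirroring the strict-key rules above;
# VideoEdit.from_dict() performs the authoritative validation.
SEGMENT_KEYS = {"source", "start", "end", "transforms", "effects"}
TRANSFORM_STEP_KEYS = {"op", "args"}
EFFECT_STEP_KEYS = {"op", "args", "apply"}

def check_segment_keys(segment: dict) -> None:
    unknown = set(segment) - SEGMENT_KEYS
    if unknown:
        raise ValueError(f"Unknown segment keys: {sorted(unknown)}")

check_segment_keys({"source": "a.mp4", "start": 0.0, "end": 1.0})  # OK
try:
    check_segment_keys({"source": "a.mp4", "start": 0.0, "end": 1.0, "fx": []})
except ValueError as e:
    print(e)  # Unknown segment keys: ['fx']
```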

Pipeline Order (Enforced)

VideoEdit always runs operations in this order:

  • Per segment:
      • transforms (in order)
      • effects (in order)
  • After concatenation:
      • post transforms (in order)
      • post effects (in order)

Callers do not control transform/effect interleaving; the plan model enforces this ordering.

Effect Time Semantics

  • Segment effect apply.start / apply.stop are relative to the segment timeline (segment starts at 0).
  • Post effect apply.start / apply.stop are relative to the assembled output timeline.
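To illustrate the difference: an effect at apply.start=1.0 inside the second segment lands later on the assembled output timeline. A small sketch of the offset arithmetic (a hypothetical helper, not part of the library):

```python
def segment_time_to_output_time(segment_index: int, t: float,
                                segment_durations: list[float]) -> float:
    """Map a segment-relative time to the assembled output timeline
    by adding the durations of all preceding segments."""
    return sum(segment_durations[:segment_index]) + t

# Segments of 7.0s and 8.0s: t=1.0 inside segment 1 is 8.0s in the output.
durations = [7.0, 8.0]
print(segment_time_to_output_time(1, 1.0, durations))  # 8.0
```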

Validation and Compatibility Checks

VideoEdit.validate() performs a metadata-only dry run using VideoMetadata (no frame data is loaded). It checks:

  • segment time bounds (start, end)
  • transform metadata prediction (for transforms with registered metadata_method)
  • effect time bounds
  • concatenation compatibility (exact fps, exact dimensions)

Validation returns the predicted final VideoMetadata on success and raises ValueError on invalid plans.
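The concatenation compatibility check is strict equality on fps and dimensions. A stdlib sketch of that rule, using plain dicts as stand-ins for segment output metadata (illustrative only; validate() enforces the real check):

```python
def check_concat_compatible(metas: list[dict]) -> None:
    """Require identical fps and dimensions across all segment outputs,
    mirroring the exact-match rule that validate() enforces."""
    first = metas[0]
    for i, m in enumerate(metas[1:], start=1):
        if m["fps"] != first["fps"]:
            raise ValueError(f"segment {i} fps ({m['fps']}) != segment 0 fps ({first['fps']})")
        if (m["width"], m["height"]) != (first["width"], first["height"]):
            raise ValueError(f"segment {i} dimensions differ from segment 0")

metas = [
    {"fps": 30, "width": 1080, "height": 1920},
    {"fps": 30, "width": 1080, "height": 1920},
]
check_concat_compatible(metas)  # passes: identical fps and dimensions
```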

Validation behavior notes:

  • cut metadata prediction mirrors runtime rounded frame slicing semantics (fractional seconds are rounded to frames).
  • crop metadata prediction mirrors runtime crop slicing behavior, including odd-size center crops and edge clipping.
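The rounded-frame cut semantics can be sketched as follows. This assumes round-to-nearest frame indices; the library's VideoMetadata.cut is authoritative:

```python
def predict_cut_frames(start_s: float, end_s: float, fps: float) -> int:
    """Predict the frame count of a cut when fractional seconds are
    rounded to frame indices (assumed round-to-nearest semantics)."""
    start_frame = round(start_s * fps)
    end_frame = round(end_s * fps)
    return end_frame - start_frame

# Cutting 5.0s..12.0s at 30 fps yields 210 frames.
print(predict_cut_frames(5.0, 12.0, 30))  # 210
```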

JSON Parsing Behavior

Alias normalization

Input aliases are accepted (for example blur), but:

  • VideoEdit.to_dict() emits canonical operation IDs (for example blur_effect)
  • VideoEdit.json_schema() lists canonical operation IDs only
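In other words, round-tripping a plan normalizes op IDs. A minimal sketch of the idea (the alias table here is hypothetical; only blur → blur_effect is attested above):

```python
# Hypothetical alias table; only "blur" -> "blur_effect" is documented above.
ALIASES = {"blur": "blur_effect"}

def canonical_op_id(op: str) -> str:
    """Map an accepted alias to its canonical operation ID."""
    return ALIASES.get(op, op)

print(canonical_op_id("blur"))         # blur_effect
print(canonical_op_id("blur_effect"))  # blur_effect (already canonical)
```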

Common parser constraints

  • resize requires at least one non-null dimension (width or height):
      • valid: {"op": "resize", "args": {"width": 320}}
      • valid: {"op": "resize", "args": {"height": 180}}
      • invalid: {"op": "resize"}
      • invalid: {"op": "resize", "args": {"width": null, "height": null}}

Unsupported operations in JSON plans

The parser rejects operations that are not supported in VideoEdit JSON plans, including:

  • transitions (fade_transition, blur_transition, ...)
  • multi-source operations (picture_in_picture, split_screen, ...)
  • registered operations that are not JSON-instantiable because required constructor args are excluded from registry specs (for example ken_burns, full_image_overlay)

AI operations and lazy registration

AI operation specs are registered only after importing videopython.ai.

If a plan references AI ops (for example face_crop, split_screen), import AI first:

import videopython.ai  # registers AI ops
from videopython.base import VideoEdit

edit = VideoEdit.from_dict(plan)

videopython.base does not auto-import AI modules.

Schema Generation (json_schema)

Use VideoEdit.json_schema() to get a parser-aligned JSON Schema for the current registry state:

from videopython.base import VideoEdit

schema = VideoEdit.json_schema()
print(schema["properties"]["segments"]["minItems"])  # 1

Schema properties:

  • Built dynamically from the operation registry
  • Canonical op IDs only (aliases omitted)
  • Excludes unsupported categories/tags/non-JSON-instantiable ops
  • Reflects current registration state (AI ops appear only if videopython.ai was imported)
  • Encodes parser-aligned constraints (for example resize requires at least one non-null dimension)

Serialization (to_dict)

VideoEdit.to_dict() returns a canonical JSON-ready dict:

  • canonical op IDs
  • deep-copied step args / apply args
  • stable output even if live operation instances are mutated after parsing
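The "stable output" guarantee comes from snapshotting step args at parse time. The mechanism can be sketched with stdlib copy.deepcopy (an illustrative class, not the library's actual _StepRecord):

```python
import copy

class StepRecord:
    """Illustrative snapshot record: args are deep-copied at construction,
    so later mutation of the caller's dict cannot affect serialization."""
    def __init__(self, op_id: str, args: dict):
        self.op_id = op_id
        self.args = copy.deepcopy(args)

args = {"width": 1080, "height": 1920}
record = StepRecord("resize", args)
args["width"] = 9999  # mutate after "parsing"
print(record.args["width"])  # 1080 -- the snapshot is unaffected
```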

API Reference

VideoEdit

Represents a complete multi-segment video editing plan.

Source code in src/videopython/base/edit.py
class VideoEdit:
    """Represents a complete multi-segment video editing plan."""

    def __init__(
        self,
        segments: Sequence[SegmentConfig],
        post_transform_records: Sequence[_StepRecord] | None = None,
        post_effect_records: Sequence[_StepRecord] | None = None,
    ):
        if not segments:
            raise ValueError("VideoEdit requires at least one segment")
        self.segments: tuple[SegmentConfig, ...] = tuple(segments)
        self.post_transform_records: tuple[_StepRecord, ...] = tuple(post_transform_records or ())
        self.post_effect_records: tuple[_StepRecord, ...] = tuple(post_effect_records or ())

        for record in self.post_transform_records:
            if not isinstance(record.operation, Transformation):
                raise TypeError(
                    "VideoEdit.post_transform_records must contain "
                    f"Transformation operations, got {type(record.operation)}"
                )
        for record in self.post_effect_records:
            if not isinstance(record.operation, Effect):
                raise TypeError(
                    f"VideoEdit.post_effect_records must contain Effect operations, got {type(record.operation)}"
                )

    @classmethod
    def from_json(cls, text: str) -> VideoEdit:
        try:
            data = json.loads(text)
        except json.JSONDecodeError as e:
            raise ValueError(f"Invalid VideoEdit JSON: {e.msg} at line {e.lineno} column {e.colno}") from e
        return cls.from_dict(data)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> VideoEdit:
        if not isinstance(data, dict):
            raise ValueError("VideoEdit plan must be a JSON object")

        segments_data = data.get("segments")
        if segments_data is None:
            raise ValueError("VideoEdit plan is missing required key 'segments'")
        if not isinstance(segments_data, list):
            raise ValueError("VideoEdit plan 'segments' must be a list")
        if not segments_data:
            raise ValueError("VideoEdit plan 'segments' must not be empty")

        post_transforms_data = data.get("post_transforms", [])
        post_effects_data = data.get("post_effects", [])
        if not isinstance(post_transforms_data, list):
            raise ValueError("VideoEdit plan 'post_transforms' must be a list")
        if not isinstance(post_effects_data, list):
            raise ValueError("VideoEdit plan 'post_effects' must be a list")

        segments: list[SegmentConfig] = []
        for i, segment_data in enumerate(segments_data):
            location = f"segments[{i}]"
            segments.append(_parse_segment(segment_data, location))

        post_transform_records = [
            _parse_transform_step(step, f"post_transforms[{i}]") for i, step in enumerate(post_transforms_data)
        ]
        post_effect_records = [
            _parse_effect_step(step, f"post_effects[{i}]") for i, step in enumerate(post_effects_data)
        ]

        return cls(
            segments=segments,
            post_transform_records=post_transform_records,
            post_effect_records=post_effect_records,
        )

    def to_dict(self) -> dict[str, Any]:
        """Serialize to canonical JSON-compatible dict.

        Serialization uses `_StepRecord` snapshots as the source of truth. Mutating
        live operation objects after parsing/construction does not affect output.
        """
        return {
            "segments": [self._segment_to_dict(segment) for segment in self.segments],
            "post_transforms": [_step_to_dict(record, include_apply=False) for record in self.post_transform_records],
            "post_effects": [_step_to_dict(record, include_apply=True) for record in self.post_effect_records],
        }

    @classmethod
    def json_schema(cls) -> dict[str, Any]:
        """Return a JSON Schema for `VideoEdit` plans."""
        transform_specs = _videoedit_supported_specs_for_category(OperationCategory.TRANSFORMATION)
        effect_specs = _videoedit_supported_specs_for_category(OperationCategory.EFFECT)

        transform_step_schemas = [
            _videoedit_step_schema_from_spec(spec, include_apply=False) for spec in transform_specs
        ]
        effect_step_schemas = [_videoedit_step_schema_from_spec(spec, include_apply=True) for spec in effect_specs]

        segment_schema: dict[str, Any] = {
            "type": "object",
            "properties": {
                "source": {"type": "string", "description": "Source video path."},
                "start": {"type": "number", "description": "Segment start time in seconds."},
                "end": {"type": "number", "description": "Segment end time in seconds."},
                "transforms": {
                    "type": "array",
                    "items": {"oneOf": transform_step_schemas},
                },
                "effects": {
                    "type": "array",
                    "items": {"oneOf": effect_step_schemas},
                },
            },
            "required": ["source", "start", "end"],
            "additionalProperties": False,
        }

        return {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
                "segments": {
                    "type": "array",
                    "items": segment_schema,
                    "minItems": 1,
                },
                "post_transforms": {
                    "type": "array",
                    "items": {"oneOf": transform_step_schemas},
                },
                "post_effects": {
                    "type": "array",
                    "items": {"oneOf": effect_step_schemas},
                },
            },
            "required": ["segments"],
        }

    def run(self) -> Video:
        """Execute the editing plan and return the final video."""
        video = self._assemble_segments()
        for record in self.post_transform_records:
            video = record.operation.apply(video)
        for record in self.post_effect_records:
            if not isinstance(record.operation, Effect):
                raise TypeError(
                    f"VideoEdit.post_effect_records must contain Effect operations, got {type(record.operation)}"
                )
            video = record.operation.apply(
                video,
                start=_coerce_optional_number(record.apply_args.get("start"), "start"),
                stop=_coerce_optional_number(record.apply_args.get("stop"), "stop"),
            )
        return video

    def validate(self) -> VideoMetadata:
        """Validate the editing plan without loading video data."""
        segment_metas: list[VideoMetadata] = []
        for i, segment in enumerate(self.segments):
            segment_metas.append(self._validate_segment(i, segment))

        if len(segment_metas) > 1:
            first = segment_metas[0]
            for j, other in enumerate(segment_metas[1:], start=1):
                if first.fps != other.fps:
                    raise ValueError(
                        f"Segment 0 output fps ({first.fps}) != segment {j} output fps ({other.fps}). "
                        f"All segments must have identical fps for concatenation."
                    )
                if (first.width, first.height) != (other.width, other.height):
                    raise ValueError(
                        f"Segment 0 output dimensions ({first.width}x{first.height}) != "
                        f"segment {j} output dimensions ({other.width}x{other.height}). "
                        f"All segments must have identical dimensions for concatenation."
                    )

        meta = VideoMetadata(
            height=segment_metas[0].height,
            width=segment_metas[0].width,
            fps=segment_metas[0].fps,
            frame_count=sum(m.frame_count for m in segment_metas),
            total_seconds=round(sum(m.total_seconds for m in segment_metas), 4),
        )

        for record in self.post_transform_records:
            meta = _predict_transform_metadata(
                meta,
                record.op_id,
                record.args,
                context=f"post-assembly ({record.op_id})",
            )
        for record in self.post_effect_records:
            _validate_effect_bounds(record, meta.total_seconds, context="post-assembly")

        return meta

    def _segment_to_dict(self, segment: SegmentConfig) -> dict[str, Any]:
        return {
            "source": str(segment.source_video),
            "start": segment.start_second,
            "end": segment.end_second,
            "transforms": [_step_to_dict(record, include_apply=False) for record in segment.transform_records],
            "effects": [_step_to_dict(record, include_apply=True) for record in segment.effect_records],
        }

    def _validate_segment(self, index: int, segment: SegmentConfig) -> VideoMetadata:
        ctx = f"Segment {index}"
        if segment.start_second < 0:
            raise ValueError(f"{ctx}: start_second ({segment.start_second}) must be >= 0")
        if segment.end_second <= segment.start_second:
            raise ValueError(
                f"{ctx}: end_second ({segment.end_second}) must be > start_second ({segment.start_second})"
            )

        meta = VideoMetadata.from_path(str(segment.source_video))
        if segment.end_second > meta.total_seconds:
            raise ValueError(
                f"{ctx}: end_second ({segment.end_second}) exceeds source duration ({meta.total_seconds}s)"
            )
        meta = meta.cut(segment.start_second, segment.end_second)

        for record in segment.transform_records:
            meta = _predict_transform_metadata(meta, record.op_id, record.args, context=f"{ctx} ({record.op_id})")
        for record in segment.effect_records:
            _validate_effect_bounds(record, meta.total_seconds, context=ctx)
        return meta

    def _assemble_segments(self) -> Video:
        result: Video | None = None
        for segment in self.segments:
            video = segment.process_segment()
            result = video if result is None else result + video
        assert result is not None
        return result

to_dict

to_dict() -> dict[str, Any]

Serialize to canonical JSON-compatible dict.

Serialization uses _StepRecord snapshots as the source of truth. Mutating live operation objects after parsing/construction does not affect output.

Source code in src/videopython/base/edit.py
def to_dict(self) -> dict[str, Any]:
    """Serialize to canonical JSON-compatible dict.

    Serialization uses `_StepRecord` snapshots as the source of truth. Mutating
    live operation objects after parsing/construction does not affect output.
    """
    return {
        "segments": [self._segment_to_dict(segment) for segment in self.segments],
        "post_transforms": [_step_to_dict(record, include_apply=False) for record in self.post_transform_records],
        "post_effects": [_step_to_dict(record, include_apply=True) for record in self.post_effect_records],
    }

json_schema classmethod

json_schema() -> dict[str, Any]

Return a JSON Schema for VideoEdit plans.

Source code in src/videopython/base/edit.py
@classmethod
def json_schema(cls) -> dict[str, Any]:
    """Return a JSON Schema for `VideoEdit` plans."""
    transform_specs = _videoedit_supported_specs_for_category(OperationCategory.TRANSFORMATION)
    effect_specs = _videoedit_supported_specs_for_category(OperationCategory.EFFECT)

    transform_step_schemas = [
        _videoedit_step_schema_from_spec(spec, include_apply=False) for spec in transform_specs
    ]
    effect_step_schemas = [_videoedit_step_schema_from_spec(spec, include_apply=True) for spec in effect_specs]

    segment_schema: dict[str, Any] = {
        "type": "object",
        "properties": {
            "source": {"type": "string", "description": "Source video path."},
            "start": {"type": "number", "description": "Segment start time in seconds."},
            "end": {"type": "number", "description": "Segment end time in seconds."},
            "transforms": {
                "type": "array",
                "items": {"oneOf": transform_step_schemas},
            },
            "effects": {
                "type": "array",
                "items": {"oneOf": effect_step_schemas},
            },
        },
        "required": ["source", "start", "end"],
        "additionalProperties": False,
    }

    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "segments": {
                "type": "array",
                "items": segment_schema,
                "minItems": 1,
            },
            "post_transforms": {
                "type": "array",
                "items": {"oneOf": transform_step_schemas},
            },
            "post_effects": {
                "type": "array",
                "items": {"oneOf": effect_step_schemas},
            },
        },
        "required": ["segments"],
    }

run

run() -> Video

Execute the editing plan and return the final video.

Source code in src/videopython/base/edit.py
def run(self) -> Video:
    """Execute the editing plan and return the final video."""
    video = self._assemble_segments()
    for record in self.post_transform_records:
        video = record.operation.apply(video)
    for record in self.post_effect_records:
        if not isinstance(record.operation, Effect):
            raise TypeError(
                f"VideoEdit.post_effect_records must contain Effect operations, got {type(record.operation)}"
            )
        video = record.operation.apply(
            video,
            start=_coerce_optional_number(record.apply_args.get("start"), "start"),
            stop=_coerce_optional_number(record.apply_args.get("stop"), "stop"),
        )
    return video

validate

validate() -> VideoMetadata

Validate the editing plan without loading video data.

Source code in src/videopython/base/edit.py
def validate(self) -> VideoMetadata:
    """Validate the editing plan without loading video data."""
    segment_metas: list[VideoMetadata] = []
    for i, segment in enumerate(self.segments):
        segment_metas.append(self._validate_segment(i, segment))

    if len(segment_metas) > 1:
        first = segment_metas[0]
        for j, other in enumerate(segment_metas[1:], start=1):
            if first.fps != other.fps:
                raise ValueError(
                    f"Segment 0 output fps ({first.fps}) != segment {j} output fps ({other.fps}). "
                    f"All segments must have identical fps for concatenation."
                )
            if (first.width, first.height) != (other.width, other.height):
                raise ValueError(
                    f"Segment 0 output dimensions ({first.width}x{first.height}) != "
                    f"segment {j} output dimensions ({other.width}x{other.height}). "
                    f"All segments must have identical dimensions for concatenation."
                )

    meta = VideoMetadata(
        height=segment_metas[0].height,
        width=segment_metas[0].width,
        fps=segment_metas[0].fps,
        frame_count=sum(m.frame_count for m in segment_metas),
        total_seconds=round(sum(m.total_seconds for m in segment_metas), 4),
    )

    for record in self.post_transform_records:
        meta = _predict_transform_metadata(
            meta,
            record.op_id,
            record.args,
            context=f"post-assembly ({record.op_id})",
        )
    for record in self.post_effect_records:
        _validate_effect_bounds(record, meta.total_seconds, context="post-assembly")

    return meta

SegmentConfig

SegmentConfig is still exported, but most users should construct plans via VideoEdit.from_dict(...) or VideoEdit.from_json(...).

SegmentConfig dataclass

Configuration for a single video segment in an editing plan.

Source code in src/videopython/base/edit.py
@dataclass
class SegmentConfig:
    """Configuration for a single video segment in an editing plan."""

    source_video: Path
    start_second: float
    end_second: float
    transform_records: tuple[_StepRecord, ...] = field(default_factory=tuple)
    effect_records: tuple[_StepRecord, ...] = field(default_factory=tuple)

    def __post_init__(self) -> None:
        self.transform_records = tuple(self.transform_records)
        self.effect_records = tuple(self.effect_records)
        for record in self.transform_records:
            if not isinstance(record.operation, Transformation):
                raise TypeError(
                    "SegmentConfig.transform_records must contain "
                    f"Transformation operations, got {type(record.operation)}"
                )
        for record in self.effect_records:
            if not isinstance(record.operation, Effect):
                raise TypeError(
                    f"SegmentConfig.effect_records must contain Effect operations, got {type(record.operation)}"
                )

    def process_segment(self) -> Video:
        """Load the segment and apply transforms then effects."""
        video = Video.from_path(
            str(self.source_video),
            start_second=self.start_second,
            end_second=self.end_second,
        )
        for record in self.transform_records:
            video = record.operation.apply(video)
        for record in self.effect_records:
            if not isinstance(record.operation, Effect):
                raise TypeError(
                    f"SegmentConfig.effect_records must contain Effect operations, got {type(record.operation)}"
                )
            video = record.operation.apply(
                video,
                start=_coerce_optional_number(record.apply_args.get("start"), "start"),
                stop=_coerce_optional_number(record.apply_args.get("stop"), "stop"),
            )
        return video

process_segment

process_segment() -> Video

Load the segment and apply transforms then effects.

Source code in src/videopython/base/edit.py
def process_segment(self) -> Video:
    """Load the segment and apply transforms then effects."""
    video = Video.from_path(
        str(self.source_video),
        start_second=self.start_second,
        end_second=self.end_second,
    )
    for record in self.transform_records:
        video = record.operation.apply(video)
    for record in self.effect_records:
        if not isinstance(record.operation, Effect):
            raise TypeError(
                f"SegmentConfig.effect_records must contain Effect operations, got {type(record.operation)}"
            )
        video = record.operation.apply(
            video,
            start=_coerce_optional_number(record.apply_args.get("start"), "start"),
            stop=_coerce_optional_number(record.apply_args.get("stop"), "stop"),
        )
    return video