Open-source video inpainting tool

Remove text from video.
One command.

Auto-detect hardcoded subtitles, watermarks, logos, and timestamps. Generate masks and inpaint — no manual work needed. Multilingual out of the box.


quickstart.py

from videowipe import remove_text

# Mask is optional — subtitle regions
# are auto-detected if omitted
remove_text(
    video="input.mp4",
    output="result/",
)
        

Features

Detect. Select. Inpaint.

A complete pipeline from frame sampling to mask generation to background restoration. Built for production use.

Auto-detection

A DBNet-based text detector samples frames across the video, finds text regions, and clusters them by position. No manual mask drawing. Works with Chinese, English, Korean, Burmese, and more.

Pluggable backends

The default STTN backend runs on CPU. Swap in ProPainter, LaMa, or any external model via --external-command. Same mask pipeline, better quality when you have a GPU.

Intent-driven cleanup

Tell it what to remove in plain English: --intent "remove bottom Chinese subtitles". Optional OCR reads the text content, and --agent adds LLM-backed selection.

Pipeline

Three stages, zero config

The full pipeline runs in one command. Here's what happens under the hood.

Detection

Sample frames across the video. Run DBNet text detection on each. Cluster results by spatial position and select the best preview frame.
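The sampling and clustering steps above can be sketched in a few lines. This is a minimal illustration, not VideoWipe's actual implementation: the `Box` type, the function names, the evenly spaced sampling strategy, and the IoU threshold of 0.5 are all assumptions made for the example, and the DBNet detector itself is replaced by hand-written boxes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned text box in pixels; (x2, y2) is exclusive. Hypothetical type."""
    x1: int
    y1: int
    x2: int
    y2: int


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick evenly spaced frame indices, one from the middle of each segment."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, total_frames)
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    inter_w = max(0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def cluster_by_position(
    detections: List[Tuple[int, Box]], iou_threshold: float = 0.5
) -> List[List[Tuple[int, Box]]]:
    """Greedy clustering: a detection joins the first cluster whose
    first box overlaps it by at least iou_threshold (assumed value)."""
    clusters: List[List[Tuple[int, Box]]] = []
    for frame_idx, box in detections:
        for cluster in clusters:
            if iou(cluster[0][1], box) >= iou_threshold:
                cluster.append((frame_idx, box))
                break
        else:
            clusters.append([(frame_idx, box)])
    return clusters


# Hand-written (frame index, box) pairs standing in for detector output.
detections = [
    (12, Box(100, 900, 500, 960)),  # subtitle line near the bottom
    (37, Box(110, 905, 510, 958)),  # same position, later frame
    (37, Box(20, 20, 120, 60)),     # small box in the top-left corner
]
clusters = cluster_by_position(detections)
```

Here the two bottom boxes overlap heavily and land in one cluster, while the corner box forms its own. A region that recurs in the same place across many sampled frames is a natural candidate for a subtitle band or a watermark.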

Target selection

Classify regions as subtitle, watermark, logo, or timestamp. Optional OCR reads the text. Intent parser maps your instructions to specific targets.
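As a rough illustration of how an instruction could be mapped to targets, here is a keyword-based sketch. It is an assumption made for this example, not the parser VideoWipe ships: the keyword tables, the `parse_intent` and `select_regions` names, and the region dictionaries with normalized `cx`/`cy` centers are all hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical keyword tables; the four target names come from the text above.
TARGET_KEYWORDS = {
    "subtitle": ("subtitle", "caption"),
    "watermark": ("watermark",),
    "logo": ("logo",),
    "timestamp": ("timestamp",),
}
POSITIONS = ("top", "bottom", "left", "right")
LANGUAGES = ("chinese", "english", "korean", "burmese")


def parse_intent(intent: str) -> Dict[str, object]:
    """Extract target types, a position hint, and a language hint."""
    text = intent.lower()
    targets = [
        name for name, words in TARGET_KEYWORDS.items()
        if any(word in text for word in words)
    ]
    position: Optional[str] = next(
        (p for p in POSITIONS if re.search(rf"\b{p}\b", text)), None
    )
    language: Optional[str] = next((l for l in LANGUAGES if l in text), None)
    return {"targets": targets, "position": position, "language": language}


def matches_position(region: Dict[str, object], position: Optional[str]) -> bool:
    """Check a normalized region center (0..1) against a position hint."""
    if position is None:
        return True
    checks = {
        "top": region["cy"] < 0.5,
        "bottom": region["cy"] >= 0.5,
        "left": region["cx"] < 0.5,
        "right": region["cx"] >= 0.5,
    }
    return checks[position]


def select_regions(regions: List[Dict[str, object]], parsed: Dict[str, object]):
    """Keep regions whose kind and position match the parsed intent.
    The language hint is ignored here because it would need OCR output."""
    return [
        r for r in regions
        if (not parsed["targets"] or r["kind"] in parsed["targets"])
        and matches_position(r, parsed["position"])
    ]


parsed = parse_intent("remove bottom Chinese subtitles")
regions = [
    {"kind": "subtitle", "cx": 0.5, "cy": 0.9},
    {"kind": "subtitle", "cx": 0.5, "cy": 0.1},
    {"kind": "watermark", "cx": 0.9, "cy": 0.95},
]
selected = select_regions(regions, parsed)
```

With this input only the bottom subtitle region is selected. A fixed keyword table will miss paraphrases, which is one reason an LLM-backed selector can be useful for free-form instructions.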

Inpainting

Generate masks from the selected regions, then fill the masked areas using temporal information from neighboring frames via STTN or your preferred external model.
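Turning selected regions into a mask amounts to rasterizing boxes onto a blank frame-sized canvas. The sketch below shows the idea with the standard library only. The `boxes_to_mask` name, the `(x1, y1, x2, y2)` box layout with exclusive bottom-right corner, and the padding parameter are assumptions for illustration; a real pipeline would more likely use NumPy or OpenCV arrays.

```python
from typing import Iterable, List, Tuple

BoxT = Tuple[int, int, int, int]  # (x1, y1, x2, y2), bottom-right exclusive


def boxes_to_mask(
    width: int, height: int, boxes: Iterable[BoxT], padding: int = 0
) -> List[bytearray]:
    """Return a height x width mask: 255 where pixels should be inpainted, 0 elsewhere.

    Each box is grown by `padding` pixels on every side and clamped to the
    frame, so anti-aliased glyph edges and outlines are covered as well.
    """
    mask = [bytearray(width) for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        left = max(0, x1 - padding)
        top = max(0, y1 - padding)
        right = min(width, x2 + padding)
        bottom = min(height, y2 + padding)
        if right <= left or bottom <= top:
            continue  # box lies entirely outside the frame
        for row in mask[top:bottom]:
            row[left:right] = b"\xff" * (right - left)
    return mask


# Tiny 8x4 frame with one box, padded by one pixel.
mask = boxes_to_mask(8, 4, [(2, 1, 5, 3)], padding=1)
masked_pixels = sum(value == 255 for row in mask for value in row)
```

A small amount of padding is a common choice because text detectors tend to return tight boxes, and leftover edge pixels are very visible after inpainting.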

Usage

Python API or CLI

Two ways to use VideoWipe — pick whichever fits your workflow.

python — full pipeline

from videowipe import WipeEngine

engine = WipeEngine(
    task="clean",
    detect_mode="balanced",
    ocr="auto",
)
engine.process(
    video="input.mp4",
    targets=["subtitle", "watermark"],
    intent="remove Chinese subtitles",
    output="result/",
)
engine.cleanup()
          

terminal — quick clean

# Auto-detect and remove all text
videowipe clean input.mp4 -o result/

# Target specific types
videowipe clean input.mp4 --target subtitle

# Natural language intent
videowipe clean input.mp4 \
  --intent "remove bottom Chinese subtitles"

# Preview without processing
videowipe clean input.mp4 --preview -o result/

# Docker (CPU)
docker run --rm -v "$(pwd)":/data \
  ghcr.io/kkenny0/videowipe:latest \
  clean /data/input.mp4 -o /data/result/
          

3.8+

Python version required

~480 MB

Docker CPU image

0

Manual masks needed

MIT

Open-source license

Support

Help keep VideoWipe maintained.

If VideoWipe saves you time on subtitle, watermark, or text-overlay cleanup, support helps keep model packaging, Docker images, detection tuning, and documentation maintained.

Support the project
