Open-source video inpainting tool

Remove text from video.
One command.

Auto-detect hardcoded subtitles, watermarks, logos, and timestamps, then generate masks and inpaint with no manual work. Multilingual out of the box.

quickstart.py
from videowipe import remove_text

# Mask is optional: subtitle regions
# are auto-detected if omitted
remove_text(
    video="input.mp4",
    output="result/",
)

$ pip install videowipe

Detect. Select. Inpaint.

A complete pipeline from frame sampling to mask generation to background restoration. Built for production use.

Auto-detection

A DBNet-based text detector samples frames across the video, finds text regions, and clusters them by position. No manual mask drawing. Works with Chinese, English, Korean, Burmese, and more.
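
To illustrate what "clusters them by position" can mean, here is a minimal sketch that groups detected boxes by overlap (IoU). It is not VideoWipe's actual implementation: the `Box` layout, the `cluster_boxes` name, and the 0.3 threshold are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cluster_boxes(boxes: List[Box], threshold: float = 0.3) -> List[List[Box]]:
    """Greedy clustering: a box joins the first cluster whose seed box it overlaps."""
    clusters: List[List[Box]] = []
    for box in boxes:
        for cluster in clusters:
            if iou(box, cluster[0]) >= threshold:
                cluster.append(box)
                break
        else:
            # No existing cluster overlaps enough, so start a new one.
            clusters.append([box])
    return clusters
```

Boxes collected from different sampled frames that land in roughly the same place (a subtitle band, a corner logo) end up in the same cluster, which can then be treated as one candidate region.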

Pluggable backends

The default STTN backend runs on CPU. Swap in ProPainter, LaMa, or any external model via --external-command. Same mask pipeline, better quality when you have a GPU.
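
One common way to keep a single mask pipeline while swapping inpainting models is a small backend registry. The sketch below shows that pattern only. The names `register_backend`, `inpaint`, and the toy `zero-fill` backend are hypothetical and are not part of VideoWipe's documented API.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an image: rows of pixel values.
Frame = List[List[int]]
# A backend takes frames and same-sized masks and returns restored frames.
Inpainter = Callable[[List[Frame], List[Frame]], List[Frame]]

_BACKENDS: Dict[str, Inpainter] = {}


def register_backend(name: str) -> Callable[[Inpainter], Inpainter]:
    """Decorator that records a backend under a lookup name."""
    def decorator(fn: Inpainter) -> Inpainter:
        _BACKENDS[name] = fn
        return fn
    return decorator


@register_backend("zero-fill")
def zero_fill(frames: List[Frame], masks: List[Frame]) -> List[Frame]:
    """Toy backend: overwrite masked pixels with 0 (not real inpainting)."""
    return [
        [[0 if m else p for p, m in zip(frow, mrow)] for frow, mrow in zip(frame, mask)]
        for frame, mask in zip(frames, masks)
    ]


def inpaint(backend: str, frames: List[Frame], masks: List[Frame]) -> List[Frame]:
    """Dispatch to the named backend; the mask format is the same for all of them."""
    try:
        fn = _BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; available: {sorted(_BACKENDS)}"
        ) from None
    return fn(frames, masks)
```

Because every backend receives the same frames and masks, detection and mask generation do not need to change when a heavier model is plugged in.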

Intent-driven cleanup

Tell it what to remove in plain English: --intent "remove bottom Chinese subtitles". Optional OCR reads the text content, and --agent enables LLM-backed selection.
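
As a rough idea of how a plain-English instruction could be turned into structured filters, here is a keyword-based sketch. It is an assumption-heavy illustration, not VideoWipe's parser: the `Intent` dataclass, the vocabulary tables, and `parse_intent` are all hypothetical, and the real tool may rely on an LLM instead.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intent:
    target: Optional[str]    # e.g. "subtitle"
    position: Optional[str]  # e.g. "bottom"
    language: Optional[str]  # e.g. "chinese"


# Hypothetical vocabularies for this sketch.
TARGETS = {
    "subtitle": "subtitle", "subtitles": "subtitle",
    "watermark": "watermark", "watermarks": "watermark",
    "logo": "logo", "logos": "logo",
    "timestamp": "timestamp", "timestamps": "timestamp",
}
POSITIONS = {"top", "bottom", "left", "right"}
LANGUAGES = {"chinese", "english", "korean", "burmese"}


def parse_intent(text: str) -> Intent:
    """Pick the first matching keyword of each kind from the instruction."""
    words = re.findall(r"[a-z]+", text.lower())
    target = next((TARGETS[w] for w in words if w in TARGETS), None)
    position = next((w for w in words if w in POSITIONS), None)
    language = next((w for w in words if w in LANGUAGES), None)
    return Intent(target=target, position=position, language=language)
```

For the instruction quoted above, this sketch yields a subtitle target restricted to the bottom of the frame and to Chinese text. Anything it does not recognize is left as `None`.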

Three stages, zero config

The full pipeline runs in one command. Here's what happens under the hood.

1. Detection

Sample frames across the video. Run DBNet text detection on each. Cluster results by spatial position and select the best preview frame.
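
Sampling "across the video" is often done by taking evenly spaced frame indices. The helper below is a minimal sketch of that idea. The function name and the choice to sample at the middle of each segment are assumptions, not VideoWipe's documented behavior.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Return evenly spaced frame indices, one from the middle of each segment."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more samples than there are frames.
    num_samples = min(num_samples, total_frames)
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]
```

Taking the midpoint of each segment avoids the first and last frames, which are frequently black or fade transitions with no overlay text.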

2. Target selection

Classify regions as subtitle, watermark, logo, or timestamp. Optional OCR reads the text. Intent parser maps your instructions to specific targets.
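
To make the classification step concrete, here is a heuristic sketch based on position, persistence across sampled frames, and optional OCR text. The thresholds, the `Region` fields, and the rules themselves are illustrative assumptions. The project does not document how it actually classifies regions.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Region:
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
    persistence: float              # fraction of sampled frames containing it
    text: Optional[str] = None      # OCR result, if OCR was run


# Clock-like text such as "12:34" or "12:34:56".
TIME_PATTERN = re.compile(r"\d{1,2}:\d{2}")


def classify(region: Region, frame_w: int, frame_h: int) -> str:
    """Assign a coarse label using hand-picked, hypothetical thresholds."""
    x1, y1, x2, y2 = region.box
    width, height = x2 - x1, y2 - y1
    center_y = (y1 + y2) / 2 / frame_h

    if region.text and TIME_PATTERN.search(region.text):
        return "timestamp"
    # Subtitles come and go, sit low in the frame, and span a wide band.
    if region.persistence < 0.9 and center_y > 0.7 and width / frame_w > 0.3:
        return "subtitle"
    # Always-present regions: compact ones look like logos, wide ones like watermarks.
    if region.persistence >= 0.9:
        aspect = width / height if height else 0.0
        return "logo" if aspect < 2.0 else "watermark"
    return "unknown"
```

The labels produced this way are what a `--target subtitle` style filter or an intent parser would then select from.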

3. Inpainting

Generate masks from the selected regions. Fill the masked areas using temporal information from neighboring frames, via STTN or your preferred external model.
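
Mask generation from boxes can be sketched in a few lines. The version below uses plain Python lists to stay dependency-free, whereas a real pipeline would more likely use a NumPy array. The function name, the 255/0 convention, and the default padding are assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def boxes_to_mask(boxes: List[Box], width: int, height: int,
                  padding: int = 4) -> List[List[int]]:
    """Build a binary mask (255 = inpaint, 0 = keep) covering each padded box."""
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        # Grow the box slightly so glyph edges and outlines are covered,
        # then clamp it to the frame.
        left, top = max(0, x1 - padding), max(0, y1 - padding)
        right, bottom = min(width, x2 + padding), min(height, y2 + padding)
        if right <= left or bottom <= top:
            continue  # box lies entirely outside the frame
        for y in range(top, bottom):
            mask[y][left:right] = [255] * (right - left)
    return mask
```

A little padding usually matters more than a tight fit: leftover text edges are more visible than a slightly larger inpainted area.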

Python API or CLI

Two ways to use VideoWipe — pick whichever fits your workflow.

python — full pipeline
from videowipe import WipeEngine

engine = WipeEngine(
    task="clean",
    detect_mode="balanced",
    ocr="auto",
)
engine.process(
    video="input.mp4",
    targets=["subtitle", "watermark"],
    intent="remove Chinese subtitles",
    output="result/",
)
engine.cleanup()

terminal — quick clean
# Auto-detect and remove all text
videowipe clean input.mp4 -o result/

# Target specific types
videowipe clean input.mp4 --target subtitle

# Natural language intent
videowipe clean input.mp4 \
  --intent "remove bottom Chinese subtitles"

# Preview without processing
videowipe clean input.mp4 --preview -o result/

# Docker (CPU)
docker run --rm -v "$(pwd)":/data \
  ghcr.io/kkenny0/videowipe:latest \
  clean /data/input.mp4 -o /data/result/

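
For a folder of clips, the CLI can be driven from a short script. The sketch below only uses the `clean`, `--target`, and `-o` options shown above. Combining `--target` with `-o` in one call and writing each video to its own output subfolder are assumptions, as are the helper names.

```python
import subprocess
from pathlib import Path
from typing import List


def build_clean_command(video: Path, output_root: Path,
                        target: str = "subtitle") -> List[str]:
    """Build one `videowipe clean` invocation with a per-video output folder."""
    out_dir = output_root / video.stem
    return ["videowipe", "clean", str(video), "--target", target, "-o", str(out_dir)]


def run_batch(input_dir: Path, output_root: Path) -> None:
    """Run the CLI once per .mp4 file, stopping at the first failure."""
    for video in sorted(input_dir.glob("*.mp4")):
        # check=True raises CalledProcessError if videowipe exits non-zero.
        subprocess.run(build_clean_command(video, output_root), check=True)


if __name__ == "__main__":
    run_batch(Path("clips"), Path("result"))
```

Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.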
Python version required: 3.8+
Docker CPU image: ~480 MB
Manual masks needed: 0
Open-source license: MIT

Clean video, no manual work.

Install with pip, run one command. Auto-detect handles the rest. Works on CPU, scales to GPU.

Help keep VideoWipe maintained.

If VideoWipe saves you time on subtitle, watermark, or text-overlay cleanup, your support helps keep model packaging, Docker images, detection tuning, and documentation up to date.