MIT licensed · Open source

Talk to your videos.

Name: OpenVideoSearch
Author: OpenVideoSearch

Open-source AI for long-form video Q&A. Self-host on your GPU, or plug into any provider.

Every other tool either caps out after a few minutes of video or ships your footage to someone else's cloud. OpenVideoSearch does neither.

demo.mp4 · coming soon

Under the hood

grounded

Grounded in the source

Exact timestamp on every answer
Combines visual, audio, and context

Local — supports cloud too if you're GPU-poor 🤗

Your videos never leave your machine

Full offline stack on your GPU
Any OpenAI-compatible endpoint for ASR + vision

agentic

Built to stay accurate

Manages its own context — no context rot
Every reasoning step, in the open
Holds up across 10+ hour corpora

indexed

Understood before you ask

Indexed, summarized, and chaptered at upload
Works across video, audio, and images
Scales to 10+ hour recordings

A few use cases

just a few — if you can describe it, it can find it

Meetings & calls

"What did we decide on auth last quarter?"

across calls decisions exact moment

Phone gallery & photos

"Find the sunset from day two"

describe it any format exact file

Lectures & courses

"Where does she derive attention?"

concept search cross-lecture exact slide

Interview & pitch prep

"Was I convincing at minute 12?"

body language pacing content gaps

Security footage

"When did someone enter after hours?"

visual search 10+ hours fully offline

Depositions & compliance

"Every mention of the NDA, with timestamps"

self-hosted cited searchable

Self-host in minutes

Pick your path: bring an API key, or run every model locally on your own GPU.

Read the README →

Local GPU no external API, runs fully offline

# configure local model settings in .env, then:
git clone https://github.com/adnane-errazine/OpenVideoSearch.git
cd OpenVideoSearch && cp .env.example .env
docker compose -f compose.models.yml up # model servers
docker compose -f compose.yml -f compose.gpu.yml up # the app, on your GPU

Cloud API any OpenAI-compatible endpoint

# set your API key in .env, then:
git clone https://github.com/adnane-errazine/OpenVideoSearch.git
cd OpenVideoSearch && cp .env.example .env
docker compose up
# with an NVIDIA GPU, run this instead:
docker compose -f compose.yml -f compose.gpu.yml up

Any OpenAI-compatible models work. The agent needs a vision + tool-calling chat model with a large context window; ingestion a vision model; ASR must return word-level timestamps; embeddings any fixed dimension; a reranker is optional. Model requirements →