Film and documentary transcription, built for the edit
Drop your interviews, dailies, and verite footage – get a speaker-labeled transcript plus the themes, the standout quotes, and a Q&A breakdown, ready to paper-edit before you open the timeline.
60 min free · no card required · we never train on your audio
How do you transcribe interviews?
To transcribe film and documentary footage, upload your interviews or dailies, or paste a link, and Pepys returns a speaker-labeled transcript in minutes, plus the standout quotes, recurring themes, and a clean Q&A breakdown for your paper edit. It's pay-as-you-go with no subscription, and credits never expire.
Made for filmmakers
An hour of interview is an hour you have to live through again to find the one line that carries the scene. Forty hours of dailies is forty. Before a single cut lands, a documentary editor is hunting for the moment that made the room go quiet, the answer that became the spine, the sentence buried at minute thirty-eight of a tape nobody wants to rescrub. That hunt is the job, and it's all sitting inside footage you already shot. It just needs to become text you can read, highlight, and search.
This is where documentary transcription stops being a chore and becomes the cutting room itself. You build a selects reel and a paper edit from the transcript, then string out the best answers long before you sync a single clip to the timeline. Speaker labels keep your director, interviewer, and each subject from collapsing into one wall of text, and every line you highlight carries its own timecode. Search a half-remembered phrase across the whole shoot and jump straight to the frame, instead of scrubbing a card a second time.
Paper-edit your interviews
A clean, speaker-labeled transcript of every interview so you can read, highlight, and build the cut on paper before you open the timeline.
Find the line in the haystack
Search across forty hours of dailies for a half-remembered phrase and jump straight to the timecode instead of rescrubbing tapes.
Standout quotes, surfaced
The most usable lines and answers pulled out for you, ready to drop into a string-out, a trailer, or a festival cut.
Festival-ready captions
Frame-accurate SRT and VTT exports for your screener, accessibility, and the subtitle spotting your distributor will ask for.
Built in, not bolted on
The story beats, the strongest soundbites, and a recap for the edit
Every interview is analyzed automatically the moment it’s transcribed. Here’s a real sample, run through it.
The Sentence That Became the Spine: A Director on the Interview That Reshaped Her Film
A documentary director walks through the on-camera interview she nearly cut for time – a retired miner named Walter, booked as a ninety-second supporting voice. When he spoke about the day the pit closed, one line reorganized the entire film around it. She explains why she refused to ask him to repeat the moment, why she held eleven seconds of silence instead of cutting away, and how reading forty hours of footage as transcript, not waveform, is where the edit actually gets built. Her throughline: the footage is never the film; structure is the real edit.
Themes
Notable quotes
- “The town didn't die when the mine closed, it died the day we stopped talking about it.”
- “The moment you ask someone to repeat their grief, it becomes a performance of grief.”
- “A face holding back is more cinematic than any B-roll I could shoot.”
- “The edit isn't built in the edit suite anymore. It's built in the margins of a transcript.”
- “The footage is never the film. The film is the order you put it in.”
Q&A
What was the line that made you realize this wasn't a supporting character anymore?
It was Walter's line, quoted exactly: 'The town didn't die when the mine closed, it died the day we stopped talking about it.' The director and her focus puller both recognized it as the spine of the movie, and the rest of the film reorganized itself around that sentence.
A lot of directors would chase that line and ask for a cleaner take. Did you?
No. She calls resisting that urge the hardest discipline in documentary, because asking someone to repeat their grief turns it into a performance of grief that you can hear in the cutting room. Instead she let the silence run for eleven seconds and said nothing.
How did you know to hold eleven seconds of silence instead of cutting away?
She learned it the expensive way on her first film, where she covered every silence with a hand, a window, or a photograph and audiences felt managed. Now she trusts the face, because a face holding back is more cinematic than any B-roll she could shoot.
You shot roughly forty hours of interviews for a ninety-minute film. How do you find moments like Walter's in that haystack?
Transcripts. She reads the full forty hours as text and highlights on paper before touching the timeline, because a great line jumps off a page where it can hide in a waveform. As she puts it, the edit is built in the margins of a transcript, and structure, not coverage, is the real edit.
Clean, speaker-labeled, click-to-seek
Ask, don’t scrub
Ask the transcript anything.
An hour-long recording? Don’t skim it – ask. Every answer stays grounded in your transcript and cites the exact timestamp, so you can jump to the moment and check it yourself.
Which line did she say became the spine of the film?
It was Walter's line, quoted exactly: 'The town didn't die when the mine closed, it died the day we stopped talking about it.' She says she and her focus puller both knew that was the spine of the movie, and everything reorganized itself around that sentence.
Why didn't she ask him to repeat the moment for a cleaner take?
She calls resisting that the hardest discipline in documentary, because the moment you ask someone to repeat their grief, it becomes a performance of grief you can't un-hear in the cutting room. Instead she let the silence run for eleven seconds and didn't say a word, since she now trusts the face over cutting to B-roll.
Grounded in your transcript – if the answer isn’t in the audio, it says so instead of guessing.
Who said what
Speaker labels that survive cross-talk
Automatic speaker diarization. Two people, four people, cross-talk and interruptions – interviews, panels, messy meetings. Pepys keeps each voice on its own line instead of blurring them into one, so you never rewind to figure out who was talking.
So the festival nearly didn't happen this year–
–it almost didn't. We lost the venue three weeks out.
Three weeks? How do you even start to–
You call everyone you know. The whole town pitched in.
And that's how it ended up in the park.
Record in any language – 99+ detected automatically
- English
- 中文
- Español
- العربية
- हिन्दी
- Français
- 日本語
- Português
- Русский
- Deutsch
- 한국어
- Italiano
- বাংলা
- Türkçe
- فارسی
- Tiếng Việt
- தமிழ்
- Polski
- ไทย
- Українська
- Nederlands
- עברית
- Ελληνικά
- తెలుగు
- Bahasa Indonesia
- اردو
- Svenska
- मराठी
- Română
- Magyar
- Čeština
- ગુજરાતી
- Kiswahili
- ქართული
- Tagalog
- አማርኛ
Works with the platforms you live in.
Paste a link from YouTube, TikTok, Instagram, Facebook, Spotify, or Apple Podcasts – or drop in any audio or video file. We transcribe it once, then you export it however your workflow needs.
- YouTube
- TikTok
- Spotify
- Apple Podcasts
- or any file
Export to any format
- TXT
- Markdown
- DOCX
- SRT
- VTT
- JSON
Most useful for filmmakers: SRT · VTT · Transcript (DOCX) · TXT · PDF
Timestamps, speaker labels, and subtitle timing carry through to every export.
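As a rough illustration of what that carried-through timing lets you do outside the app, here is a minimal Python sketch that searches an exported SRT for a phrase and returns the start time of each matching cue. It assumes only the standard SRT layout (an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then the caption text). The sample cues, the `NAME:` speaker prefix, and the function names `parse_srt` and `find_phrase` are hypothetical, not a description of Pepys's actual export.

```python
import re

# Hypothetical sample in standard SRT layout; a real export may differ in detail.
SAMPLE_SRT = """1
00:38:02,120 --> 00:38:07,900
WALTER: The town didn't die when the mine closed,

2
00:38:07,900 --> 00:38:12,400
WALTER: it died the day we stopped talking about it.

3
00:38:23,400 --> 00:38:25,000
DIRECTOR: Thank you.
"""

# Matches the timing line of an SRT cue and captures start and end.
CUE_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})"
)


def parse_srt(text):
    """Return a list of (start, end, caption_text) tuples from SRT text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line.strip())
            if match:
                # Everything after the timing line is caption text.
                caption = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((match.group(1), match.group(2), caption))
                break
    return cues


def find_phrase(cues, phrase):
    """Return start timestamps of cues containing the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [start for start, _end, caption in cues if needle in caption.lower()]


cues = parse_srt(SAMPLE_SRT)
hits = find_phrase(cues, "stopped talking")
print(hits)  # prints ['00:38:07,900']
```

This matches within a single cue only, so a phrase split across two captions would be missed. Joining neighboring cues before searching is one way around that.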
How film and documentary transcription works
Upload or paste a link
Drop your interview or paste its link – any audio or video, in any language.
Get your transcript
A clean, speaker-labeled transcript with AI notes tuned to your format, ready in minutes.
Edit and export
Fix anything inline, then export to SRT, VTT, TXT, DOCX, PDF, or JSON.
Why filmmakers pick Pepys
No subscription – pay per project, and credits never expire between shoots.
Themes and standout quotes are pulled for you, with no separate copy-paste into a chatbot.
Upload dailies straight off the card, or paste a link – no wrangling files first.
Speaker labels keep your director, interviewer, and subjects from blurring into one block.
What filmmakers say
captions, chapters AND a hook breakdown straight off the upload. i pull 3 shorts out of every long video now. huge.
Daniel K. · YouTube creator · Product Hunt
I transcribe in the original language and receive a translated version with the subtitles still intact. It saved an entire round of contractor work on my last film. Thank you for building this.
Giulia F. · Documentary filmmaker · email
transcribe once, deliver in another language with the timing preserved. the part of subtitling i used to dread is just... done now.
Lucas D. · Subtitle translator · X
Film and documentary transcription – questions, answered
How do I transcribe a documentary interview?
Upload the interview file or paste its link, and Pepys returns a speaker-labeled transcript in minutes, along with the standout quotes, recurring themes, and a clean Q&A breakdown you can paper-edit from.
Can it handle forty hours of dailies without me babysitting it?
Yes. Queue as many files as you have and let them process. Each comes back as its own searchable, speaker-labeled transcript, so you can read across the whole shoot as text rather than scrubbing tapes one at a time.
Does it tell the interviewer and subjects apart?
Yes. Speaker diarization separates each voice, so a director, an interviewer, and a subject come back labeled instead of as one wall of text. You can rename a speaker once and it updates everywhere in the transcript.
Will the captions line up frame-accurately for my screener?
Exports are timestamped, so SRT and VTT drop into your editor or player without manual re-syncing. That covers screeners, accessibility, and the subtitle spotting a distributor asks for.
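If you want to sanity-check where a caption lands on your timeline, the arithmetic is simple. SRT timestamps are written in milliseconds, while an editor counts frames. The sketch below converts one to the other. It assumes an integer frame rate such as 24, 25, or 30. Fractional rates like 23.976 and drop-frame timecode need different handling that is not shown here. The function names are illustrative and not part of any Pepys API.

```python
import re

# Standard SRT timestamp: HH:MM:SS,mmm
SRT_TIME = re.compile(r"^(\d{2}):(\d{2}):(\d{2}),(\d{3})$")


def srt_time_to_ms(timestamp):
    """Convert an SRT timestamp string to whole milliseconds."""
    match = SRT_TIME.match(timestamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def ms_to_frame(ms, fps):
    """Index of the frame on screen at `ms`, assuming an integer frame rate."""
    return (ms * fps) // 1000


def frame_to_timecode(frame, fps):
    """Non-drop-frame HH:MM:SS:FF timecode, assuming an integer frame rate."""
    total_seconds, ff = divmod(frame, fps)
    total_minutes, ss = divmod(total_seconds, 60)
    hh, mm = divmod(total_minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


ms = srt_time_to_ms("00:38:07,900")
frame = ms_to_frame(ms, 24)
print(frame, frame_to_timecode(frame, 24))  # prints 54909 00:38:07:21
```

At 24 fps a frame lasts about 41.7 ms, so any millisecond timestamp resolves to exactly one frame. The floor division picks the frame that is on screen at that instant.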
Can it transcribe interviews in another language and subtitle them?
Yes. It auto-detects the spoken language across 99+ languages, and you can transcribe in the original and export translated subtitles with the timing preserved, so a foreign-language subject doesn't mean a separate round of contractor work.
Do you train on my footage?
No. Your footage is never used to train a model. For unreleased films and protected subjects, that point is non-negotiable, and it's why editors trust it for sensitive material.
Do I have to subscribe?
No. Pepys is pay-as-you-go – buy a block of hours, use them across this film and the next, and the credits never expire between projects. You can start free with 60 minutes, no card.
Turn your next interview into a transcript, the standout quotes, and a paper edit – and pay only for that project.
Pay as you go – credits never expire, nothing to cancel. Or start free with 60 minutes, no card.