9,438,517 minutes transcribed

Video transcription, built for the edit

Drop in the footage or paste a link – get a speaker-labeled, click-to-seek transcript plus ready-to-burn captions, so you can cut to the words instead of scrubbing the timeline.

60 min free · no card required · we never train on your audio

Trusted by 100,000+ creators, podcasters, journalists & researchers

How do you transcribe videos?

To transcribe a video, upload the file or paste its link and Pepys returns a speaker-labeled, time-coded transcript in minutes – plus exportable SRT and VTT captions and a quick AI summary. It's pay-as-you-go with no subscription, and credits never expire.

Upload or paste a link

Drop your video or paste its link – any audio or video, in any language.

Get your transcript

A clean, speaker-labeled transcript with AI notes tuned to your format, ready in minutes.

Edit and export

Fix anything inline, then export to SRT, VTT, TXT, Markdown, DOCX, PDF, or JSON.

Made for videographers

Scrubbing the timeline for one line
Hunting for a half-remembered quote across hours of interview footage burns the hours you'd rather spend on color – search the transcript and click straight to the frame where it was spoken.
Retyping captions by hand
Typing out every line for burn-in or sidecar subtitles is its own afternoon – Pepys returns frame-accurate SRT and VTT that drop into your NLE, no round-trip through a caption tool.
Paying monthly between shoots
A subscription bills every month even in the weeks you're not editing – Pepys is pay per video and the credits never expire between projects.

Built in, not bolted on

A searchable transcript, summary, and captions – the moment it uploads

Every video is analyzed automatically the moment it’s transcribed. Here’s a real sample, run through it.

hartley-wedding-prep-memo.mp4 · AI analysis, built in

On-Set Memo: Shooting the Hartley Wedding So the Edit Cuts Itself

A two-camera wedding shoot planned out loud before call time. The locked wide is the safety net while the long lens hunts reactions, and the whole approach is built around the vows audio, because the edit is cut to the voice first and the picture is built around it. Lav-and-backup-recorder audio, exposing for faces against a blown-out window, grabbing b-roll and safe portrait frames early, mirrored cards and fresh batteries, and a two-drive backup before leaving all serve one goal: never lose the moment the couple actually paid for.

Key points

Two-camera plan: the A-cam wide of the altar is the locked safety shot, while the B-cam long lens lives on faces – "The story is in the reactions, not the wide."
Audio is treated as the make-or-break: a lav on the officiant plus a backup recorder on the lectern, because "If the lav fails, the whole ceremony is unusable".
Expose for the couple against the harsh west window: "A blown window looks intentional. A muddy gray face looks like a mistake."
The edit is voice-first: "Find the line, then find the frame. The voice drives the cut, never the other way around."
Capture b-roll and the five known-good portrait frames early: "Get the safe shots before you get the pretty shots", since golden hour is roughly twenty minutes of light.
Protect the footage: mirror to two cards per body, fresh batteries at three forty-five, and back up to two drives before leaving – "The footage doesn't exist until it's in two places."

Run this on your own video

Clean, speaker-labeled, click-to-seek

Transcribe your first video free – 60 min

Ask, don’t scrub

Ask the transcript anything.

An hour-long recording? Don’t skim it – ask. Every answer stays grounded in your transcript and cites the exact timestamp, so you can jump to the moment and check it yourself.

hartley-wedding-prep-memo.mp4 · Ask AI

What's the audio plan for the ceremony, and what's the backup if it fails?

She's putting a lav on the officiant plus a recorder in his jacket pocket, because the on-camera mic is garbage at thirty feet. If the lav fails the whole ceremony is unusable, so she's also running a backup recorder on the lectern.

Cited: 0:26

Why does she expose for the couple's faces and let the window blow out?

The four o'clock sun comes straight through the big west window behind the altar, so the couple will be backlit. She exposes for their faces and lets the window go, on the logic that a blown window looks intentional while a muddy gray face looks like a mistake.

Cited: 0:40

Grounded in your transcript – if the answer isn’t in the audio, it says so instead of guessing.

Clean paragraphs. No more um's and ah's.

The left is what Pepys hands back – logical paragraphs with the filler stripped out, punctuated and readable. The right is the raw, one-line-per-segment dump most transcribers leave you with.

Raw · reel-voiceover.mp4

um so yeah everyone keeps telling you to like lead with your best line right but uh honestly if you give away the whole answer in the first second you know there's basically no reason for anyone to keep watching so the hook isn't kind of the smartest thing you say it's like a loop you open that they need to close and um that's the part that actually keeps people around

Who said what

Speaker labels that survive cross-talk

Automatic speaker diarization. Two people, four people, cross-talk and interruptions – interviews, panels, messy meetings. Pepys keeps each voice on its own line instead of blurring them into one, so you never rewind to figure out who was talking.

Reporter

So the festival nearly didn't happen this year–

Mara Okonkwo

–it almost didn't. We lost the venue three weeks out.

Reporter

Three weeks? How do you even start to–

Mara Okonkwo

You call everyone you know. The whole town pitched in.

Reporter

And that's how it ended up in the park.

Captions for every cut
Frame-accurate SRT and VTT files that drop straight into your NLE or your social uploads, no retyping.
A paper edit you can read
A clean, time-coded transcript so you can mark your selects on the page before you ever touch the timeline.
Find any line in seconds
A searchable transcript that jumps you to the exact frame a phrase was spoken, instead of scrubbing for it.
Pull the soundbites
A quick summary surfaces the strongest lines, so the clips you cut for the highlight reel write themselves.

Works with the platforms you live in.

Paste a link from YouTube, TikTok, Instagram, Facebook, Spotify, or Apple Podcasts – or drop in any audio or video file. We transcribe it once, then you export it however your workflow needs.

YouTube
TikTok
Instagram
Facebook
Spotify
Apple Podcasts
or any file

Export to any format

TXT
Markdown
DOCX
PDF
SRT
VTT
JSON

Most useful for videographers: SRT · VTT · TXT · DOCX · PDF

Timestamps, speaker labels, and subtitle timing carry through to every export.

Why videographers pick Pepys

No subscription – pay per video, and credits never expire between shoots.
Captions are built in, not a separate caption tool to round-trip through.
Paste a YouTube, Vimeo, or direct video link – no exporting the file first.
Speaker labels keep your interview subjects from blurring into one block of text.

What videographers say

I used to tell myself I would repurpose long videos and then absolutely not do it. Pepys gives me the transcript, chapters, and hook ideas, which means the clips actually happen instead of living forever on my to-do list.
Daniel K., YouTube creator · Long video to shorts
I had hours of interviews and that horrible feeling that the story was somewhere in there, but I could not see it yet. Reading the transcripts made the shape of the film visible. I could search, highlight, pull quotes, and start building the cut before opening the timeline.
Giulia F., Documentary filmmaker · The edit got unstuck
The translation is useful, but the magic is that the timing survives. That is the part that used to ruin my afternoon.
Lucas D., Subtitle translator · Timing survived translation

Video transcription – questions, answered

How do I transcribe a video?

Upload the video file or paste its link (YouTube, Vimeo, or a direct URL) and Pepys returns a speaker-labeled, time-coded transcript in minutes, along with a short AI summary and exportable captions. You don't need to strip the audio out first.

Can I get burn-in or sidecar captions for my edit?

Yes. Every video exports to SRT and VTT, both frame-accurate and ready to import into Premiere, DaVinci Resolve, Final Cut, or a social uploader. Edit any wording inline before you export.

Does it separate the people speaking in an interview?

Yes. Speaker diarization splits each voice, so a multi-person interview or a two-subject piece comes back labeled rather than as one wall of text. Rename "Speaker 1" to your subject's name and it updates everywhere.

Can I do a paper edit from the transcript?

That's the point. The transcript is time-coded and click-to-seek, so you can read the whole shoot, mark your selects on the page, and jump straight to the frame each line was spoken before you build the timeline.
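The "find the line, then find the frame" step can be sketched in a few lines of Python. This is a hedged illustration over a generic list of time-coded segments, not Pepys's search: the `SEGMENTS` data, its field names, and the `find_lines` and `clock` helpers are assumptions made up for the example, with quotes and timestamps borrowed from the sample memo above.

```python
# Hypothetical time-coded transcript: start time in seconds plus the spoken text.
SEGMENTS = [
    {"start": 26.0, "text": "If the lav fails, the whole ceremony is unusable."},
    {"start": 40.0, "text": "A blown window looks intentional. A muddy gray face looks like a mistake."},
]


def find_lines(segments: list[dict], phrase: str) -> list[tuple[float, str]]:
    """Return (start_seconds, text) for every segment containing the phrase,
    matched case-insensitively."""
    needle = phrase.casefold()
    return [
        (seg["start"], seg["text"])
        for seg in segments
        if needle in seg["text"].casefold()
    ]


def clock(seconds: float) -> str:
    """Format seconds as M:SS, the way a player shows a seek position."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    for start, text in find_lines(SEGMENTS, "blown window"):
        print(clock(start), text)
```

Each hit carries its start time, which is what a click-to-seek transcript hands to the player so you land on the moment the line was spoken.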

What can I export for a project?

SRT and VTT captions, plain text, a DOCX, and a PDF of the transcript. One click each, and the timecodes stay intact so everything lines up back in your NLE.

How does it handle on-location audio and accents?

It auto-detects the spoken language across 99+ languages and handles a range of accents and noisier run-and-gun audio. Anything it mishears you can fix inline in the editor before exporting.

Do I have to subscribe?

No. Pepys is pay-as-you-go – buy a block of hours, use them across as many shoots as you like, and the credits never expire. You can start free with 60 minutes, no card.

Start your first video free

Don't just take our word for it.

Ask ChatGPT, Claude, or Perplexity whether Pepys is the right fit for videographers.

Turn your next shoot into a searchable transcript and ready-to-burn captions – and pay only for that video.

Pay as you go – credits never expire, nothing to cancel. Or start free with 60 minutes, no card.

Start free – 60 minutes, or see pricing
