Transcribe anything.
Drop in any file or link, in any language, and get back an accurate, speaker-labeled transcript – plus AI notes built for your format and a chat that answers anything about it, right in the editor.
60 min free · no card required · we never train on your audio
Native, per-format AI
AI notes built for what you actually make.
One upload, output shaped to what you made. Pick your format and watch your recording come back as the notes that format actually needs – not one generic summary, not a bolt-on chatbot you re-prompt.
The hook
“Stop posting your best idea in the first three seconds.”
Core message
A hook should open a curiosity loop rather than give away the answer; you sustain watch time by paying that loop off in installments, editing the transcript before the footage, and closing on an open question to drive rewatches.
Call to action
If you want the transcript-first checklist I run on every video, comment the word LOOP and I'll send it over. And follow, because tomorrow I'm breaking down the one caption mistake that's tanking your reach.
Retention structure
1. Pattern-interrupt hook that negates accepted advice and names the stakes (watch time)
2. Reframe: redefines what a hook actually is ('the moment you open a loop the viewer needs to close')
3. Proof / credibility beat using personal data ('I pulled the retention graphs on my last forty Reels')
4. Payoff with a concrete, stealable framework (name the mistake, raise stakes, promise the fix, deliver in installments)
5. Tactical escalation: edit the transcript with no audio or visuals, then end on an open question to trigger rewatches
6. Recap of the four moves followed by a two-part CTA (comment 'LOOP' + follow with a next-video tease)
Every persona gets notes built for their format, from the same upload.
Ask, don’t scrub
Ask the transcript anything.
An hour-long recording? Don’t skim it – ask. Every answer stays grounded in your transcript and cites the exact timestamp, so you can jump to the moment and check it yourself.
What's the main claim about memory?
That memory is a reconstruction, not a recording – every time you recall something you rebuild it from fragments and quietly edit it. The guest's blunt version: confidence is the worst possible witness.
Any technique I can actually use today?
Yes – retrieval practice: close the book and force yourself to recall instead of rereading. The struggle to recall is the workout that builds the memory. And protect your sleep – that's when the brain files the day.
Grounded in your transcript – if the answer isn’t in the audio, it says so instead of guessing.
Clean paragraphs. No more um's and ah's.
The left is what Pepys hands back – logical paragraphs with the filler stripped out, punctuated and readable. The right is the raw, one-line-per-segment dump most transcribers leave you with.
um so yeah everyone keeps telling you to like lead with your best line right but uh honestly if you give away the whole answer in the first second you know there's basically no reason for anyone to keep watching so the hook isn't kind of the smartest thing you say it's like a loop you open that they need to close and um that's the part that actually keeps people around
Raw
Who said what
Speaker labels that survive cross-talk
Automatic speaker diarization. Two people, four people, cross-talk and interruptions – interviews, panels, messy meetings. Pepys keeps each voice on its own line instead of blurring them into one, so you never rewind to figure out who was talking.
So the festival nearly didn't happen this year–
–it almost didn't. We lost the venue three weeks out.
Three weeks? How do you even start to–
You call everyone you know. The whole town pitched in.
And that's how it ended up in the park.
Detect any language. Deliver it in another.
We auto-detect the spoken language on the way in, across 99+ languages, then translate the finished transcript into any other language, with timestamps and subtitles intact. Every language is the same flat rate – no language tiers.
- English
- 中文
- Español
- العربية
- हिन्दी
- Français
- 日本語
- Português
- Русский
- Deutsch
- 한국어
- Italiano
- বাংলা
- Türkçe
- فارسی
- Tiếng Việt
- தமிழ்
- Polski
- ไทย
- Українська
- Nederlands
- עברית
- Ελληνικά
- తెలుగు
- Bahasa Indonesia
- اردو
- Svenska
- मराठी
- Română
- Magyar
- Čeština
- ગુજરાતી
- Kiswahili
- ქართული
- Tagalog
- አማርኛ
Auto-detected · 99+ languages
Works with the platforms you live in.
Paste a link from YouTube, TikTok, Instagram, Facebook, Spotify, or Apple Podcasts – or drop in any audio or video file. We transcribe it once, then you export it however your workflow needs.
- YouTube
- TikTok
- Spotify
- Apple Podcasts
- or any file
Export to any format
- TXT
- Markdown
- DOCX
- SRT
- VTT
- JSON
Timestamps, speaker labels, and subtitle timing carry through to every export.
Everything’s included. No feature gates.
Most tools meter you – exports stuck behind the paid tier, speaker labels as an add-on, the “good” AI locked to a higher plan. Here, every feature is on from your very first minute. You read the words; we handle the file.
Private, and no subscription
We never train on your audio. Credits never expire – buy once, transcribe whenever, nothing to cancel.
AI tuned to your format
Hooks for short-form, show notes for podcasts, decisions for meetings, study notes for lectures – built into the editor, not a bolted-on chatbot you re-prompt.
And the basics, never gated:
Accurate & timestamped
Word- and segment-level timestamps on every line.
Speaker labels
Every voice on its own line, automatically.
99+ languages, in and out
Auto-detected in; translate the result on the way out.
Every export format
TXT, SRT, VTT, DOCX, PDF, JSON in one click.
From file to finished, in three steps
Upload or paste a link
Any audio or video. We extract the audio from your file for you.
Get your transcript
Timestamps, speaker labels, and AI tuned to your format – show notes, hooks, or a summary. Ready in minutes.
Export anywhere
SRT, VTT, TXT, DOCX, PDF, JSON. One click. Your credits never expire.
Loved by people who’d rather not type
Creators, podcasters, students, journalists, and researchers turn hours of audio into searchable text – with the AI notes to go with it.
drop the episode in and the show notes + pull-quotes come back done. what used to eat a whole evening is now basically a coffee break.
Maya R. · Podcast producer · via X
had ~2 hrs of interviews to get through on deadline. uploaded the lot, got it back speaker-labeled and fully searchable, so i could jump straight to the quote i half-remembered instead of scrubbing the timeline for twenty minutes. genuinely saved the story.
Tomás H. · Investigative journalist · via Reddit
I work across three languages and it detected each one correctly without me changing a single setting. The timestamps line up to the word – exactly what my research needs.
Priya N. · Linguistics PhD candidate · via Trustpilot
captions, chapters AND a hook breakdown straight off the upload. i pull 3 shorts out of every long video now. huge.
Daniel K. · YouTube creator · via Product Hunt
Every user interview comes back as a clean, searchable transcript I can tag and quote directly in my reports. Synthesis used to be the slowest part of my week and now it's an afternoon. The speaker labels alone are worth it for me.
Sofia L. · UX researcher · via G2
no subscription is the whole reason i switched. buy hours, use em whenever, balance is still there months later. thats the review.
Marcus B. · Indie founder · via Reddit
No subscription. Just the hours you use.
Other tools bill you every month whether you transcribe or not. Pepys is pay-as-you-go from $0.85 an hour, charged only when you use it – credits never expire, nothing to cancel.