9,438,517 minutes transcribed

Multi-Speaker Transcription

Upload audio with several voices or paste a link and get a transcript that separates and labels each speaker.

Accepts any multi-speaker audio or video file – or a link · returns a timestamped transcript with each voice tagged Speaker 1, Speaker 2, Speaker 3, and so on.

Speaker labels are assigned per chunk (Speaker 1, Speaker 2…) from the audio – they are not voiceprint or biometric identification. Rename them to real names after transcription.

60 min free · no card required · we never train on your audio

Trusted by 100k+ users

How do I transcribe multi-speaker audio?

To transcribe multi-speaker audio, upload your recording to Pepys or paste a link. The AI returns a clean, timestamped transcript, in any of 99+ languages, that separates the voices into labels like Speaker 1, Speaker 2, and Speaker 3. You can then rename the speakers and export. Your first 60 minutes are free, no card required.

How multi-speaker transcription works

Add the recording

Upload a file with several voices – a panel, interview, or group call – or paste a link.

Let AI separate the voices

Pepys transcribes and diarizes the audio, tagging each turn Speaker 1, Speaker 2, Speaker 3 with timestamps.

Rename and export

Relabel speakers with real names, edit inline, then export to TXT, Markdown, DOCX, PDF, SRT, VTT, or JSON.
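If you would rather rename speakers in your own scripts after exporting, the idea can be sketched in a few lines of Python. This is only an illustration: the `Segment` fields and the `rename_speakers` and `to_text` helpers are assumptions made up for this example, not the actual Pepys export schema or API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """One speaker turn. Field names are hypothetical, not the Pepys schema."""
    start: float   # seconds from the beginning of the recording
    end: float
    speaker: str   # generic label such as "Speaker 1"
    text: str


def rename_speakers(segments, names):
    """Return new segments with generic labels swapped for real names.

    Labels missing from `names` are left unchanged.
    """
    return [replace(s, speaker=names.get(s.speaker, s.speaker)) for s in segments]


def to_text(segments):
    """Render segments as '[MM:SS] Name: text' lines."""
    lines = []
    for s in segments:
        minutes, seconds = divmod(int(s.start), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {s.speaker}: {s.text}")
    return "\n".join(lines)


segments = [
    Segment(0.0, 4.2, "Speaker 1", "Welcome to the panel."),
    Segment(4.2, 9.8, "Speaker 2", "Thanks for having me."),
    Segment(65.0, 70.5, "Speaker 3", "Glad to be here."),
]
named = rename_speakers(segments, {"Speaker 1": "Host", "Speaker 2": "Dana"})
print(to_text(named))
```

Any label you leave out of the mapping (Speaker 3 here) keeps its generic tag, so you can rename a panel one voice at a time.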

When three or four people talk over each other, a flat block of text is useless – you can't tell who said what. Pepys transcribes multi-speaker audio and separates the turns, so a panel, a group interview, or a roundtable comes back tagged Speaker 1, Speaker 2, Speaker 3 with timestamps you can click straight to.

Once it's labeled you can rename each speaker, fix anything inline, and export a clean, attributed transcript. It works in 99+ languages, includes an AI summary and built-in chat, and we never train on your audio – you pay only for the minutes you transcribe, and credits never expire.

Clean paragraphs. No more um's and ah's.

Pepys hands back logical paragraphs with the filler stripped out, punctuated and readable – not the raw, one-line-per-segment dump most transcribers leave you with.

reel-voiceover.mp4

um so yeah everyone keeps telling you to like lead with your best line right but uh honestly if you give away the whole answer in the first second you know there's basically no reason for anyone to keep watching so the hook isn't kind of the smartest thing you say it's like a loop you open that they need to close and um that's the part that actually keeps people around

Raw
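To make "filler stripped out" concrete, here is a deliberately naive sketch that removes only "um", "uh", and "ah" with a regular expression. It is not how Pepys cleans transcripts; the pattern and the `strip_fillers` name are assumptions for illustration, and a real cleanup also has to handle words like "like" or "you know", which are only sometimes filler.

```python
import re

# Hypothetical, intentionally small filler list. Word boundaries keep
# words such as "umbrella" or "ahead" intact.
FILLERS = re.compile(r"\b(?:um+|uh+|ah+)\b[,.]?\s*", re.IGNORECASE)


def strip_fillers(text: str) -> str:
    """Drop standalone um/uh/ah tokens and collapse leftover whitespace."""
    cleaned = FILLERS.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_fillers("um so yeah, uh honestly it works"))
```

A regex like this cannot add punctuation or paragraph breaks, which is why it is only a starting point.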

Separates several voices into Speaker 1, 2, 3 with clickable timestamps
Rename speakers to real names and export an attributed transcript
AI summary and built-in chat included, so you can skim who decided what
99+ languages, auto-detected · we never train on your audio · credits never expire

Works with the platforms you live in.

Paste a link from YouTube, TikTok, Instagram, Facebook, Spotify, or Apple Podcasts – or drop in any audio or video file. We transcribe it once, then you export it however your workflow needs.

YouTube
TikTok
Instagram
Facebook
Spotify
Apple Podcasts
or any file

Export to any format

TXT
Markdown
DOCX
PDF
SRT
VTT
JSON

Timestamps, speaker labels, and subtitle timing carry through to every export.

Multi-speaker transcription – questions, answered

How do I transcribe multi-speaker audio?

Upload your recording on this page or paste a link. Within minutes, Pepys transcribes it and separates the voices into labeled speakers with timestamps – your first 60 minutes are free, no card required.

How many speakers can it handle?

It works with two voices or a full panel. Pepys detects each turn and tags it; the more distinct the voices, the cleaner the separation.

Does it know each speaker's real name?

No – it assigns per-chunk labels like Speaker 1 and Speaker 2 based on the audio, not voiceprint identity. You rename them to real names yourself, and the labels carry into every export.

Can I transcribe multi-speaker audio in another language?

Yes – language is auto-detected across 99+ languages, and you can translate the finished, speaker-labeled transcript afterward.

Do you keep my recording?

Only as long as needed to transcribe it, and you can have it auto-deleted afterward. We never train AI on your audio or transcripts.

Don't just take our word for it.

Ask ChatGPT, Claude, or Perplexity what Pepys is and who it's for. One click, and your favorite AI does the homework.

Multi-speaker transcription – free to start

Pay as you go – credits never expire, nothing to cancel. Or start free with 60 minutes, no card.

Start free – 60 minutes or see pricing