Audio to VTT – WebVTT Caption Generator
Upload audio and download a timed WebVTT (.vtt) caption file – generated automatically in minutes.
Accepts an audio file – MP3, M4A, WAV, AAC, FLAC, OGG and more · returns a timed WebVTT (.vtt) caption file.
60 min free · no card required · we never train on your audio
How do I convert audio to a VTT file?
To convert audio to VTT, upload your recording to Pepys and it transcribes the speech into timestamped cues, then exports a WebVTT (.vtt) caption file ready for HTML5 video and the web. It works in 99+ languages, auto-detected. Your first 60 minutes are free, no card required.
How audio to VTT works
Upload your audio
Drop in any audio file – we prepare the audio and detect the language for you.
Transcribe into cues
AI transcribes the speech into accurate, timestamped caption cues in minutes.
Download your .vtt
Export a standards-compliant WebVTT file – or choose SRT, TXT, Markdown, DOCX, PDF, or JSON instead.
WebVTT is the caption format the web speaks – it's what the HTML5 <track> element loads to put captions on a video. Pepys gets you there from audio alone: upload a recording and it transcribes the speech into timed cues, then exports a clean .vtt file with correct WebVTT syntax and timestamps.
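To make "correct WebVTT syntax and timestamps" concrete, here is a minimal Python sketch of what a .vtt file looks like when built from timed cues. It is an illustration only, not Pepys's own code: the `Cue` class and the `format_timestamp` and `to_webvtt` helpers are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue (hypothetical structure for this sketch)."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str     # caption text; must not contain "-->" or blank lines


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp: HH:MM:SS.mmm (dot before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Serialize cues as WebVTT: a WEBVTT header line, then cues separated by blank lines."""
    blocks = ["WEBVTT"]
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_webvtt([Cue(0.0, 2.5, "Hello"), Cue(2.5, 5.0, "and welcome back.")]))
```

The file always opens with the line WEBVTT, and each cue is a timing line (start --> end) followed by its text. That is the structure a browser expects when an HTML5 <track> element points at the file.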
Language is auto-detected across 99+ languages, long recordings are chunked and stitched automatically, and you can edit cues or translate the captions before you export. Your first 60 minutes are free, you pay only for what you transcribe, credits never expire, and we never train on your audio.
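Chunking and stitching comes down to shifting each chunk's cue times back onto the timeline of the full recording. The Python sketch below shows the general idea under stated assumptions: chunks do not overlap, and each chunk's start offset is known. Pepys's actual pipeline is not described on this page, and `stitch_chunks` is a hypothetical helper.

```python
from typing import List, Tuple

# A cue is (start_seconds, end_seconds, text), timed relative to its own chunk.
Cue = Tuple[float, float, str]


def stitch_chunks(chunks: List[Tuple[float, List[Cue]]]) -> List[Cue]:
    """Merge per-chunk cues onto a single timeline.

    `chunks` is a list of (chunk_start_seconds, cues) pairs. Each cue's times
    are shifted by the start offset of the chunk it came from.
    Assumes chunks do not overlap (an assumption of this sketch).
    """
    stitched: List[Cue] = []
    for chunk_start, cues in sorted(chunks, key=lambda chunk: chunk[0]):
        for start, end, text in cues:
            stitched.append((chunk_start + start, chunk_start + end, text))
    return stitched


if __name__ == "__main__":
    # Two ten-minute chunks: the second begins 600 seconds into the recording.
    first = (0.0, [(1.0, 3.0, "Welcome to the show.")])
    second = (600.0, [(0.5, 2.0, "Back after the break.")])
    for cue in stitch_chunks([second, first]):
        print(cue)
```

A cue that starts half a second into the second chunk lands at 600.5 seconds in the finished captions, so timing stays continuous across the whole file.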
Clean paragraphs. No more um's and ah's.
The left is what Pepys hands back – logical paragraphs with the filler stripped out, punctuated and readable. The right is the raw, one-line-per-segment dump most transcribers leave you with.
Raw
um so yeah everyone keeps telling you to like lead with your best line right but uh honestly if you give away the whole answer in the first second you know there's basically no reason for anyone to keep watching so the hook isn't kind of the smartest thing you say it's like a loop you open that they need to close and um that's the part that actually keeps people around
Exports a standards-compliant WebVTT (.vtt) file for HTML5 and the web
Every cue is timestamped, so captions sync straight away
Handles long audio – we chunk and stitch large files automatically
99+ languages, auto-detected · translate the captions · credits never expire
Any language – 99+ detected automatically
- English
- 中文
- Español
- العربية
- हिन्दी
- Français
- 日本語
- Português
- Русский
- Deutsch
- 한국어
- Italiano
- বাংলা
- Türkçe
- فارسی
- Tiếng Việt
- தமிழ்
- Polski
- ไทย
- Українська
- Nederlands
- עברית
- Ελληνικά
- తెలుగు
- Bahasa Indonesia
- اردو
- Svenska
- मराठी
- Română
- Magyar
- Čeština
- ગુજરાતી
- Kiswahili
- ქართული
- Tagalog
- አማርኛ
Works with the platforms you live in.
Paste a link from YouTube, TikTok, Instagram, Facebook, Spotify, or Apple Podcasts – or drop in any audio or video file. We transcribe it once, then you export it however your workflow needs.
- YouTube
- TikTok
- Spotify
- Apple Podcasts
- or any file
Export to any format
- TXT
- Markdown
- DOCX
- SRT
- VTT
- JSON
Timestamps, speaker labels, and subtitle timing carry through to every export.
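The two subtitle formats in that list, VTT and SRT, carry the same cue timing in slightly different syntax. SRT numbers each cue and puts a comma before the milliseconds, where WebVTT uses a dot and starts with a WEBVTT header. The Python sketch below writes the same kind of cues as SRT. It is an illustration under those assumptions, not Pepys's exporter, and `srt_timestamp` and `to_srt` are hypothetical names.

```python
from typing import List, Tuple

# A cue is (start_seconds, end_seconds, text).
Cue = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Serialize cues as SRT: a 1-based index, a timing line, then the cue text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hi"), (1.5, 4.0, "there")]))
```

Because only the syntax changes, a cue that starts at 1.5 seconds in the .vtt file starts at 1.5 seconds in the .srt file too.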
Audio to VTT – questions, answered
How do I convert audio to a VTT file?
Upload your audio on this page – the first 60 minutes are free, no card required. Pepys transcribes it into timed cues, and you download a WebVTT (.vtt) caption file in minutes.
Where is a VTT file used?
WebVTT is the standard caption format for web video: the HTML5 <track> element loads it to show captions, and many other video players accept it too. If you need the older SRT format instead, Pepys exports that as well.
Can I make VTT captions in another language?
Yes – language is auto-detected across 99+ languages, and you can translate the finished captions into another language while keeping the cue timing intact.
Can I tweak the captions before downloading?
Yes. Edit the text inline first, then export your .vtt so it comes out clean and correctly timed. We never train AI on your audio or transcripts.
Don't just take our word for it.
Ask ChatGPT, Claude, or Perplexity what Pepys is and who it's for. One click, and your favorite AI does the homework.
Audio to VTT – free to start
Pay as you go – credits never expire, nothing to cancel. Or start free with 60 minutes, no card.