9,438,517 minutes transcribed

Transcript to JSON

Upload a recording or paste a link and get structured JSON – an array of cues with start, end, speaker, and text.

60 min free · no card required · we never train on your audio

Trusted by 100k+ users

How do I export a transcript to JSON?

To get a transcript as JSON, upload a recording or paste a link. Pepys transcribes the audio and exports structured JSON – an array of cues, each with start, end, speaker, and text. The output is ready to parse in any programming language or pipeline, and transcription covers 99+ spoken languages. Your first 60 minutes are free, no card required.

How transcript to JSON works

Upload audio or paste a link

Drop in a recording or paste a link – we extract the audio automatically.

Get your transcript

Within minutes, AI transcribes it into speaker-labeled cues with start and end timestamps.

Export to JSON

Download structured JSON – an array of cues you can parse in code – or export TXT, Markdown, DOCX, PDF, SRT, or VTT.

A .json export hands your code the transcript as data, not prose: an array of cues, each with a start, end, speaker, and text field. Pepys transcribes your audio or link and returns exactly that, so a search index, a captions player, a summarizer, or a dataset can iterate over the cues without scraping or parsing free-form text.

The timestamps are precise enough to sync subtitles, jump a player to a quote, or align segments for translation. Transcription works in 99+ auto-detected languages, every cue is grounded in the real audio, and we never train on your data. Pay only for what you transcribe; credits never expire.
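As a rough illustration of jumping a player to a quote, here is a minimal Python sketch. It assumes start and end are in seconds (the sample values suggest this, but the page does not state the unit) and that cues arrive sorted by start. The helper names cue_at and find_quote are hypothetical, not part of any Pepys API.

```python
import bisect
import json

# Sample data in the shape described on this page; times assumed to be seconds.
SAMPLE = """[
  {"start": 0.00, "end": 4.20, "speaker": "Speaker 1",
   "text": "Welcome back to the show."},
  {"start": 4.20, "end": 9.80, "speaker": "Speaker 2",
   "text": "Glad to be here. Let's get into it."}
]"""


def cue_at(cues, t):
    """Return the cue whose [start, end) interval contains time t, or None."""
    starts = [c["start"] for c in cues]  # assumes cues are sorted by start
    i = bisect.bisect_right(starts, t) - 1
    if i >= 0 and cues[i]["start"] <= t < cues[i]["end"]:
        return cues[i]
    return None


def find_quote(cues, phrase):
    """Return the start time of the first cue containing phrase, or None."""
    needle = phrase.lower()
    for c in cues:
        if needle in c["text"].lower():
            return c["start"]
    return None


cues = json.loads(SAMPLE)
seek_to = find_quote(cues, "glad to be")  # a time you could hand to a player
current = cue_at(cues, 5.0)               # the cue playing at the 5-second mark
```

The binary search keeps lookups fast on long transcripts. A linear scan would work just as well for short ones.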

What the JSON looks like

transcript.json

[
  { "start": 0.00, "end": 4.20, "speaker": "Speaker 1",
    "text": "Welcome back to the show." },
  { "start": 4.20, "end": 9.80, "speaker": "Speaker 2",
    "text": "Glad to be here. Let's get into it." }
]

An array of cues – start, end, speaker, and text fields – ready to parse in any language or runtime
Precise timestamps to sync captions, drive a player, or align segments for translation
Speaker labels included, so your code knows who said each line
99+ languages, auto-detected · we never train on your audio · credits never expire
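To show what iterating over the cues can look like, the Python sketch below parses the sample above and totals speaking time per speaker. It assumes the times are in seconds, and speaking_time is a hypothetical helper name. In practice you would read the downloaded transcript.json rather than an inline string.

```python
import json
from collections import defaultdict

# Inline copy of the sample so the sketch runs on its own.
SAMPLE = """[
  {"start": 0.00, "end": 4.20, "speaker": "Speaker 1",
   "text": "Welcome back to the show."},
  {"start": 4.20, "end": 9.80, "speaker": "Speaker 2",
   "text": "Glad to be here. Let's get into it."}
]"""


def speaking_time(cues):
    """Sum (end - start) for each speaker label; units follow the input."""
    totals = defaultdict(float)
    for cue in cues:
        totals[cue["speaker"]] += cue["end"] - cue["start"]
    return dict(totals)


cues = json.loads(SAMPLE)
totals = speaking_time(cues)

for speaker, duration in totals.items():
    print(f"{speaker}: {duration:.1f}")
```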

Works with the platforms you live in.

Paste a link from YouTube, TikTok, Instagram, Facebook, Spotify, or Apple Podcasts – or drop in any audio or video file. We transcribe it once, then you export it however your workflow needs.

YouTube
TikTok
Instagram
Facebook
Spotify
Apple Podcasts
or any file

Export to any format

TXT
Markdown
DOCX
PDF
SRT
VTT
JSON

Most useful for your work: JSON

Timestamps, speaker labels, and subtitle timing carry through to every export.

Transcript to JSON – questions, answered

How do I export a transcript to JSON?

Upload a recording or paste a link on this page – the first 60 minutes are free, no card required. Pepys transcribes it and lets you download structured JSON cues with timestamps, speakers, and text.

What's the shape of the JSON?

You get an array of cues, where each cue has a start time, an end time, a speaker label, and the text spoken – simple to iterate over in any programming language.
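If you want your code to fail early on unexpected input, one option is to check that shape at load time. The Python sketch below is one way to do it under stated assumptions: the four field names come from this page, while the numeric type for start and end, the Cue type, and the parse_cues helper are illustrative choices rather than a published schema.

```python
import json
from typing import TypedDict


class Cue(TypedDict):
    start: float   # assumed numeric, as in the sample on this page
    end: float
    speaker: str
    text: str


# Expected Python type(s) for each field (an assumption, not an official schema).
FIELDS = (
    ("start", (int, float)),
    ("end", (int, float)),
    ("speaker", str),
    ("text", str),
)


def parse_cues(raw: str) -> list[Cue]:
    """Parse a JSON string and verify it is an array of well-formed cues."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of cues")
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"cue {i} is not an object")
        for key, kind in FIELDS:
            if not isinstance(item.get(key), kind):
                raise ValueError(f"cue {i} has a missing or invalid '{key}'")
        if item["end"] < item["start"]:
            raise ValueError(f"cue {i} ends before it starts")
    return data
```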

Can I convert the JSON to subtitles later?

Yes – the same transcript also exports directly to SRT and VTT, so you can grab JSON for your code and a subtitle file from the one upload.
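If you only have the JSON on hand, the cue timestamps map onto SRT with little code. The Python sketch below assumes start and end are in seconds. Prefixing each line with the speaker label is an illustrative choice, and srt_timestamp and cues_to_srt are hypothetical helper names. The built-in SRT export remains the simpler route.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def cues_to_srt(cues) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for number, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue['start'])} --> {srt_timestamp(cue['end'])}"
        blocks.append(f"{number}\n{timing}\n{cue['speaker']}: {cue['text']}\n")
    return "\n".join(blocks)


cues = [
    {"start": 0.00, "end": 4.20, "speaker": "Speaker 1",
     "text": "Welcome back to the show."},
    {"start": 4.20, "end": 9.80, "speaker": "Speaker 2",
     "text": "Glad to be here. Let's get into it."},
]
srt_text = cues_to_srt(cues)
```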

Does it support multiple speakers and languages?

Yes. Cues come speaker-labeled, and language is auto-detected across 99+ languages – you can translate the finished transcript before exporting too.

Don't just take our word for it.

Ask ChatGPT, Claude, or Perplexity what Pepys is and who it's for. One click, and your favorite AI does the homework.

Ask ChatGPT · Ask Claude · Ask Perplexity

Transcript to JSON – free to start

Pay as you go – credits never expire, nothing to cancel. Or start free with 60 minutes, no card.

Start free – 60 minutes or see pricing