Transcription is a methodological choice, not clerical work
Transcription is an act of representation that shapes what your analysis can find (Oliver, Serovich & Mason, 2005). Every transcript reflects interpretive decisions about what you capture and representational decisions about how you render it (Bucholtz, 2000). Your committee will expect you to name the convention you used and defend it, not treat the file as a neutral copy of the audio.
So write the choice into your methodology chapter. State whether you transcribed naturalized or denaturalized verbatim, why that fits your analytic approach, and who did the transcribing. That paragraph is short, but examiners look for it. It shows you understood transcription as a research decision, not a typing task.
This guide covers the parts specific to a dissertation. For recording setup and the general AI-first-pass-then-clean workflow, start with the pillar on how to transcribe an interview, then come back here for convention, coding, and ethics.
Naturalized or denaturalized verbatim – which does your analysis need?
The core split is naturalism versus denaturalism. Naturalized verbatim keeps every utterance in as much detail as possible; denaturalized verbatim corrects grammar, removes stutters and pauses, and standardizes non-standard accents (Oliver, Serovich & Mason, 2005). Neither is more accurate. They answer different questions.
If you're doing conversation or discourse analysis, you need the detail. A standard orthographic transcript bleaches out how and when things are said, which is exactly what those methods analyze (Hoey & Kendrick, 2022). Jeffersonian notation captures it: overlaps, timed pauses, emphasis. For thematic or content analysis, denaturalized clean verbatim is usually enough and much faster to code.
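To make the difference concrete, here is a minimal sketch of what denaturalizing a single turn removes. It assumes timed pauses are written Jefferson-style as (0.8) and micropauses as (.), and it treats only "um" and "uh" as fillers; the function name and the filler list are illustrative choices, not a standard. Repeated words and grammar are left for you to correct by hand against the audio.

```python
import re

# Jefferson-style timed pauses such as (0.8) and micropauses written (.)
PAUSE = re.compile(r"\(\d+(?:\.\d+)?\)|\(\.\)")
# Illustrative filler list only; extend it to match your own convention.
FILLER = re.compile(r"\b(?:um+|uh+)\b,?\s*", re.IGNORECASE)


def denaturalize(turn: str) -> str:
    """Strip pause markers and simple fillers from one transcript turn."""
    turn = PAUSE.sub("", turn)
    turn = FILLER.sub("", turn)
    # Collapse the double spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", turn).strip()


naturalized = "Well (0.8) um, I started (.) in the lab"
print(denaturalize(naturalized))  # Well I started in the lab
```

Run it the other way in your head and you can see what a conversation analyst loses: the 0.8-second pause and the hesitation are exactly the data those methods work with.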
Pick one before you edit a single line, because the choice changes every turn. Switching conventions halfway leaves an inconsistent corpus that's hard to defend and harder to code.
Get an AI first pass, then transcribe for your dissertation codebook
Doing it by hand is the slow way. Manual transcription of one hour of interview audio can take up to six hours of work (Haberl et al., 2023, citing Bell et al., 2018). Across a dozen hour-long interviews, that can approach two full working weeks of typing. An AI first pass turns each hour into minutes of processing plus focused correction against the audio.
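To budget this for your own corpus, the arithmetic can be sketched as below. The six-to-one ratio is the upper bound cited above; the 40-hour working week is an assumption, and the function name is hypothetical.

```python
def manual_hours(interviews: int, hours_each: float, ratio: float = 6.0) -> float:
    """Upper-bound estimate of manual transcription time in hours.

    ratio is hours of typing per hour of audio (up to 6, per the figure above).
    """
    return interviews * hours_each * ratio


total = manual_hours(interviews=12, hours_each=1.0)
weeks = total / 40  # assumes a 40-hour working week
print(total, weeks)  # 72.0 1.8
```

Swap in your own interview count and average length to see where your project lands.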
Get a speaker-labeled first-pass draft, then correct it to your convention. Clean speaker separation matters more here than in journalism: coding software keys on who said what, so an interviewer's prompt shouldn't get coded as a participant's answer.
Read the draft against the recording and fix what AI misses: participant pseudonyms, domain terms, and numbers said quickly. Mark anything genuinely unclear as [inaudible] with its timestamp, so a coded excerpt rests on what was actually said rather than on a confident guess.
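If you want the markers to be consistent and easy to audit, a small helper can do it. This is a sketch under one assumption: the marker format [inaudible HH:MM:SS] is a choice made here for illustration, so adapt it to whatever your convention specifies. Both function names are hypothetical.

```python
import re


def inaudible_marker(seconds: float) -> str:
    """Build an [inaudible HH:MM:SS] marker from an offset in seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[inaudible {hours:02d}:{minutes:02d}:{secs:02d}]"


INAUDIBLE = re.compile(r"\[inaudible (\d{2}:\d{2}:\d{2})\]")


def list_inaudible(transcript: str) -> list[str]:
    """Return the timestamp of every inaudible marker, in document order."""
    return INAUDIBLE.findall(transcript)


line = f"P07: We moved to {inaudible_marker(872)} after the funding ended."
print(line)
print(list_inaudible(line))  # ['00:14:32']
```

The list gives you a checklist of spots to re-listen to before you code, and a count you can report in your methodology chapter.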
Format your dissertation transcript for NVivo, ATLAS.ti, or MAXQDA
Your transcript has to import cleanly into your CAQDAS. NVivo reads plain text, rich text, or Word documents, optionally with a speaker column; ATLAS.ti accepts .doc, .docx, .rtf, .odt and .txt; MAXQDA takes the same formats. DOCX is the safe common denominator.
Structure matters as much as format. In MAXQDA, each contribution starts a new paragraph with the speaker's name followed by a colon, and the software auto-codes every turn to that speaker. Keep the pattern consistent – 'Interviewer:' and 'P07:' at each turn's start – and your speaker coding is done on import.
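Before you import, it is worth checking that every paragraph really does open with a speaker label. The sketch below assumes turns are separated by blank lines and that labels are either 'Interviewer:' or a participant code like 'P07:'; the pattern and function name are illustrative, so adjust them to your own labels.

```python
import re

# A turn must open with "Interviewer:" or a code like "P07:", then text.
SPEAKER = re.compile(r"^(Interviewer|P\d{2}): \S")


def unlabeled_paragraphs(transcript: str) -> list[int]:
    """Return 1-based positions of paragraphs that lack a speaker label."""
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    return [i for i, p in enumerate(paragraphs, start=1) if not SPEAKER.match(p)]


sample = (
    "Interviewer: How did you start?\n\n"
    "P07: I began in 2019.\n\n"
    "and then I moved labs."
)
print(unlabeled_paragraphs(sample))  # [3]
```

An empty list means every turn carries a label; any number it returns is a paragraph to fix before the software assigns it to the wrong speaker or to no one.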
Export a coding-ready DOCX with those speaker labels intact for import, and keep a copy for your appendix if your program requires transcripts to be submitted with the thesis.
Consent, special-category data, and how to cite participants
Identifiable interview data carries obligations. Under the US Common Rule, private information for which a subject's identity may readily be ascertained is identifiable, and collecting it makes you a human-subjects researcher (45 CFR 46.102). That is what your IRB consent and data-handling plan exist to cover.
If your interviews reveal health, religion, ethnicity, political opinions, or sexual orientation, they may be special-category data whose processing is prohibited without an Article 9 condition under GDPR. Keep identifiable audio in access-controlled storage, and pseudonymize the working copy of the transcript while keeping an unredacted master secure.
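The pseudonymizing step can be sketched as a keyed find-and-replace. Everything in the example is hypothetical: the names, the codes, and the function itself. It only catches the exact strings you list, so indirect identifiers (a job title, a small town, a rare diagnosis) still need a manual read, and the key file belongs with the secure master, not with the working copy.

```python
import re


def pseudonymize(text: str, mapping: dict[str, str]) -> str:
    """Replace each listed identifier with its pseudonym, whole words only."""
    # Longest identifiers first, so "Maria Lopez" is handled before "Maria".
    for name in sorted(mapping, key=len, reverse=True):
        pattern = rf"\b{re.escape(name)}\b"
        # A function replacement keeps the pseudonym literal (no escape handling).
        text = re.sub(pattern, lambda _m, code=mapping[name]: code, text)
    return text


key = {"Maria Lopez": "P07", "Maria": "P07", "Ohio": "[state]"}
print(pseudonymize("Maria Lopez said Maria left Ohio.", key))
# P07 said P07 left [state].
```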
Cite participants correctly. In APA, you don't cite your own research participants as personal communications – you quote them directly from your data. Personal-communication citations are for outside interviews you reference, not for the interviews that are your dataset.