Start with IRB approval and consent to record
A clinical research interview is human-subjects research, and that changes what you do before you press record. Under the Common Rule, a human subject is a living individual an investigator studies through interaction, or by using identifiable private information about them (45 CFR 46.102). An interview is both. So your protocol, your consent, and your data-handling plan sit under an IRB's review, not just your own judgment.
You need legally effective informed consent before a participant takes part. The Common Rule requires it (45 CFR 46.116(a)), and consent is generally documented on a signed, IRB-approved form (45 CFR 46.117(a)). The rule's enumerated consent elements do not name audio recording. Getting explicit permission to record, and to keep the recording, is standard IRB practice rather than regulatory wording – so follow the protocol your board approved.
Write down what you promised. If your consent form says the audio will be de-identified and deleted on a set timeline, your workflow has to match that. A transcription step you never disclosed is a gap an IRB can flag later.
De-identify before the file leaves your control
The safest workflow keeps identifiers out of what you share and store. HIPAA's Safe Harbor method lists 18 categories of identifiers to remove – including names, dates more specific than a year, geographic units smaller than a state (ZIP codes among them), and biometric identifiers such as voice prints (45 CFR 164.514(b)(2)). Remove them, and have no actual knowledge that what remains could identify someone, and the data meets the standard the rule sets: no reasonable basis to believe it can be used to identify an individual (45 CFR 164.514(a)).
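As a rough illustration of a last check before a transcript is shared, the sketch below scans text for a few pattern-shaped identifiers. The pattern set and the function name `find_residual_identifiers` are assumptions made for this example, not part of any rule or tool. It covers only a handful of the 18 categories, so treat a clean result as a prompt for human review, not as proof of de-identification.

```python
import re

# Hypothetical, US-centric patterns for a few pattern-shaped identifiers.
# They will not catch names, addresses, or most other Safe Harbor categories.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "full_date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def find_residual_identifiers(text):
    """Return (category, matched_text) pairs for every hit, in order of appearance."""
    hits = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), category, match.group()))
    # Sort by position in the text so a reviewer can walk the transcript top to bottom.
    return [(category, value) for _, category, value in sorted(hits)]


if __name__ == "__main__":
    sample = "Call me at 555-010-1234 or jane.doe@example.org after 3/14/2021."
    for category, value in find_residual_identifiers(sample):
        print(category, value)
```

Anything the scan flags still needs a person to decide whether to remove or generalize it.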
Stripping names isn't enough, because ordinary details still combine to point at one person. Latanya Sweeney found that ZIP code, gender, and date of birth alone made 87% of Americans unique in 1990 census data. A later recomputation on 2000 data put the figure near 63%. Either way, generalize the quasi-identifiers: dates to years, locations to broad regions, rare roles to categories.
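To make the generalization step concrete, here is a minimal sketch that coarsens a participant record and then checks how small the smallest group of identical records is. The field names, the `REGION_BY_ZIP_PREFIX` and `ROLE_CATEGORY` lookup tables, and both function names are hypothetical, chosen only for illustration. Your own categories should come from your protocol.

```python
from collections import Counter

# Illustrative lookups only; real studies need their own, fuller mappings.
REGION_BY_ZIP_PREFIX = {"0": "Northeast", "1": "Northeast", "9": "West"}
ROLE_CATEGORY = {
    "pediatric neurosurgeon": "physician",
    "transplant surgeon": "physician",
}


def generalize(record):
    """Coarsen one record: birth date to year, ZIP to region, rare role to category."""
    return {
        "birth_year": record["birth_date"][:4],  # assumes ISO "YYYY-MM-DD" strings
        "region": REGION_BY_ZIP_PREFIX.get(record["zip"][:1], "other"),
        "role": ROLE_CATEGORY.get(record["role"], "other staff"),
        "gender": record["gender"],
    }


def smallest_group(records):
    """Size of the smallest set of records that share every field value."""
    counts = Counter(tuple(sorted(r.items())) for r in records)
    return min(counts.values())


if __name__ == "__main__":
    participants = [
        {"birth_date": "1961-04-02", "zip": "02139", "gender": "F",
         "role": "pediatric neurosurgeon"},
        {"birth_date": "1961-11-20", "zip": "10027", "gender": "F",
         "role": "transplant surgeon"},
    ]
    print(smallest_group(participants))                           # raw records
    print(smallest_group([generalize(p) for p in participants]))  # generalized
```

A smallest group of 1 means at least one person is unique on those fields. Generalizing further, or dropping a field, is the usual response.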
The mechanics are their own craft. You strip direct identifiers and generalize quasi-identifiers on the transcript, and you keep a separate, access-controlled key that maps codes back to people. Don't email the un-redacted version around, and don't leave it on a shared drive.
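One way to picture the code key is as a small mapping from a participant code to the name variants that appear in the transcript. The sketch below assumes that shape. The `redact` function, the code format, and the sample names are all hypothetical. It handles only the names you list, so it complements a manual read-through and does not replace one.

```python
import re


def redact(transcript, code_key):
    """Replace every listed name variant with its bracketed participant code.

    code_key maps a code such as "P01" to the name variants for that person.
    """
    # Longest variants first, so "Maria Lopez" is replaced before "Maria".
    pairs = sorted(
        ((variant, code) for code, variants in code_key.items() for variant in variants),
        key=lambda pair: -len(pair[0]),
    )
    redacted = transcript
    for variant, code in pairs:
        pattern = rf"\b{re.escape(variant)}\b"
        redacted = re.sub(pattern, f"[{code}]", redacted, flags=re.IGNORECASE)
    return redacted


if __name__ == "__main__":
    # The key itself identifies people: store it apart from the working transcript.
    key = {"P01": ["Maria Lopez", "Maria"], "D01": ["Dr. Chen"]}
    print(redact("Maria Lopez said Dr. Chen called. Maria agreed.", key))
```

Because the key is an input and never written into the transcript, the redacted text can move to your coding software while the key stays under its own access control.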
Does HIPAA apply to clinical research interview transcription?
It depends on who you are and whether the data is still PHI. HIPAA binds covered entities (health plans, clearinghouses, and providers who transmit health information electronically for standard transactions such as billing) and their business associates (45 CFR 160.103). An independent academic researcher is often neither, which is an inference from those definitions, not a blanket exemption. Your IRB and grant terms may still bind you.
When PHI is involved, the rules get specific. A covered entity may disclose PHI to a business associate only with a written agreement, a BAA, giving satisfactory assurance the data is safeguarded (45 CFR 164.502(e)). Research use of PHI otherwise needs the participant's authorization (45 CFR 164.508) or a waiver from an IRB or privacy board that finds no more than minimal privacy risk (45 CFR 164.512(i)).
For a tool like Pepys: it does not sign a BAA and isn't for identifiable PHI, so use it only for de-identified recordings. If your file still contains PHI, that's a covered-entity and BAA question to settle first – de-identify, or keep the work inside a HIPAA-covered pipeline.
Capture the verbatim detail your analysis depends on
In a clinical interview the exact words are the data, not decoration. Transcribing by hand can take up to six hours for one hour of audio, per the aTrain study citing Bell et al. An AI first pass cuts that to minutes of processing plus focused correction, so your attention goes to verifying the lines that carry meaning.
How much you clean the wording depends on your method. If you're capturing patient-reported outcomes, keep it strict. The FDA defines a PRO as "any report of the status of a patient's health condition that comes directly from the patient, without interpretation of the patient's response by a clinician or anyone else" (2009 guidance). Tidying a participant's phrasing can quietly change what they reported.
The coding and export mechanics (naturalized versus denaturalized styles, CAQDAS-ready formatting) belong to qualitative research transcription as a craft, so this guide won't repeat them. For the first pass, a de-identified recording gives you a speaker-labeled, timestamped draft you can correct against the audio and hand to your coding software.
Handle the recording and transcript securely
Voice is itself an identifier, so the raw recording stays sensitive even after you scrub names from the text. Safe Harbor lists biometric identifiers, including voice prints, among the 18 to remove (45 CFR 164.514(b)(2)(i)(P)). That's why de-identification happens on the transcript, and the audio is tightly controlled or deleted rather than stored casually.
An IRB waiver of authorization turns in part on an adequate plan to protect the identifiers and then destroy them at the earliest opportunity the research allows (45 CFR 164.512(i)). Build that into your process: access-controlled storage, a set deletion date for raw audio, and written assurance the data won't be reused. Pick a tool that doesn't train on your files and lets you delete them after processing; Pepys does both.
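A set deletion date only helps if something checks it. As a minimal sketch, assuming you keep a simple log of recordings with a recording date and a deletion date, the function below lists raw audio that has outlived its retention window. The log fields, the `overdue_recordings` name, and the 90-day default are hypothetical. Use the timeline your consent form and protocol actually promise.

```python
from datetime import date, timedelta


def overdue_recordings(recordings, today, retention_days=90):
    """Return IDs of raw audio files that are past their promised deletion date."""
    overdue = []
    for rec in recordings:
        delete_by = rec["recorded_on"] + timedelta(days=retention_days)
        # Only recordings that still exist (no deletion date logged) can be overdue.
        if rec["deleted_on"] is None and today > delete_by:
            overdue.append(rec["id"])
    return overdue


if __name__ == "__main__":
    log = [
        {"id": "INT-001", "recorded_on": date(2024, 1, 10), "deleted_on": None},
        {"id": "INT-002", "recorded_on": date(2024, 1, 10), "deleted_on": date(2024, 2, 1)},
        {"id": "INT-003", "recorded_on": date(2024, 5, 1), "deleted_on": None},
    ]
    print(overdue_recordings(log, today=date(2024, 6, 1)))
```

Running a check like this on a schedule gives you a record that the deletion promise was kept, which is useful if your IRB asks.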
Keep the un-redacted master and the code key separate from the working transcript, each under its own access control. If you ever need to verify a quote against the original, you can – without that recording drifting through email or a shared drive.