Digitize genealogy recordings before the tape fails
The tape is the emergency, not the transcript. Magnetic tape breaks down chemically. Its polyurethane binder absorbs moisture and hydrolyzes, and the most common failure, sticky-shed syndrome, leaves a gummy residue that can hinder or even prohibit playback. Once a tape sheds, you may not get a clean transfer at all. Digitize first, transcribe second.
Even undamaged tape is at risk, because you may no longer be able to play it. Magnetic formats are obsolete, and IASA and UNESCO warn that replay equipment in operable condition is disappearing rapidly, with routine tape transfer estimated to end around 2025. The National Archives makes the same point: beyond deterioration of the media, the proper equipment must exist to play the record back. Degradation and obsolescence are two clocks, both running.
That gives you a deadline. The National Archives recommends reformatting audio tape older than 15 years and video tape older than five years. Most family cassettes, reels, and camcorder tapes cleared that mark long ago, so treat any inherited recording as overdue rather than safe.
Transfer to uncompressed audio, not a lossy MP3. For preservation, IASA recommends a minimum of 48 kHz sampling at 24-bit, in Broadcast WAVE (BWF), the de facto archival standard. Keep that WAV as your untouched master and do all your transcription and editing on a copy.
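If you want to confirm that a transfer actually meets those numbers, a short script can read the WAV header. The following is a minimal sketch using Python's standard-library wave module; the function name check_master and the example file name are illustrative, not part of any standard. It assumes the file is uncompressed PCM, and the module may refuse files it cannot parse, in which case a different tool is needed.

```python
import wave


def check_master(path, min_rate=48_000, min_bits=24):
    """Return a list of problems with a WAV master; an empty list means it meets the targets.

    The defaults mirror the 48 kHz / 24-bit minimum quoted above.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()          # samples per second
        bits = wav.getsampwidth() * 8      # bytes per sample -> bits per sample
    if rate < min_rate:
        problems.append(f"sample rate {rate} Hz is below {min_rate} Hz")
    if bits < min_bits:
        problems.append(f"bit depth {bits} is below {min_bits}")
    return problems
```

Calling check_master("grandma_reel_master.wav") (a hypothetical file name) returns an empty list for a 48 kHz, 24-bit file and a list of shortfalls otherwise. It only reads the header, so it says nothing about how good the transfer sounds.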
What degraded audio does to your transcript
Old recordings transcribe worse, and the reason is measurable. Automatic transcription accuracy tracks the signal-to-noise ratio. Word error rate falls as the ratio rises, and performance drops sharply once the SNR is below about 5 dB (Frontiers in Signal Processing, 2022). Hiss, mains hum, and a relative sitting across the room from a 1970s cassette recorder land right in that danger zone.
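To put a rough number on a recording, you can compare a stretch of speech with a stretch of room noise from the same tape. Signal-to-noise ratio in decibels is 10 x log10(signal power / noise power). The sketch below is one simple way to estimate it, assuming you have already loaded two lists of sample values yourself; the function names are illustrative, and the 5 dB default is only the approximate threshold cited above.

```python
import math


def mean_power(samples):
    """Average squared amplitude of a list of sample values."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(s * s for s in samples) / len(samples)


def estimate_snr_db(speech_samples, noise_samples):
    """Estimate SNR in dB from a speech stretch and a noise-only stretch.

    The speech stretch also contains the background noise, so the noise
    power is subtracted to approximate the speech power on its own.
    Returns None when the estimate is not meaningful.
    """
    noise = mean_power(noise_samples)
    signal = mean_power(speech_samples) - noise
    if noise <= 0 or signal <= 0:
        return None
    return 10 * math.log10(signal / noise)


def needs_cleanup(snr_db, threshold_db=5.0):
    """Flag a recording whose estimated SNR is unknown or below the threshold."""
    return snr_db is None or snr_db < threshold_db
```

This is a crude estimate. It depends on picking a noise-only stretch that is representative of the whole tape, so treat the result as a guide to whether cleanup is worth the effort rather than a precise measurement.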
So clean the audio, not the transcript. De-noise a copy of the digitized file to reduce hiss and hum before you transcribe, and keep the master untouched. Then run the first pass and audit it against the audio rather than trusting it. Our guide to improving transcription accuracy covers the noise-reduction and audit-edit steps in more detail.
Expect trouble with overlapping voices. Family recordings tend to be single-mic, kitchen-table affairs, with several relatives talking over each other, and speaker separation struggles there. A tool built for separating overlapping speakers helps, but plan to correct speaker turns by hand around the crosstalk.
Transcribe the names and places most carefully
The words that matter most in genealogy are the ones AI gets wrong most often. Ancestor names, maiden names, townlands, and parishes are out-of-vocabulary named entities, which the speech-recognition literature identifies as a common cause of spoken-language errors (Interspeech, 2012). A model that handles ordinary conversation cleanly will still mangle a surname like Szczepański or a place like Ballinrobe.
So audit every name by ear. Don't accept the draft's spelling of a surname or a village. Pause on each proper noun, replay it, and check it against records you already hold – a census page, a headstone photo, a parish register. Where the audio is genuinely unclear, bracket it with a timestamp instead of guessing: [inaudible 12:04] is honest, and a confident wrong spelling propagates straight into your tree.
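If your player shows elapsed time in seconds, a tiny helper keeps those markers consistent. This is a sketch only: the function name is illustrative, and the hours:minutes:seconds form for passages past the first hour is an assumption, since the example above only shows minutes and seconds.

```python
def inaudible_marker(seconds):
    """Format an [inaudible mm:ss] marker from a position given in seconds."""
    if seconds < 0:
        raise ValueError("position cannot be negative")
    total = int(seconds)                      # drop any fraction of a second
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        # Assumed convention for long recordings: h:mm:ss
        return f"[inaudible {hours}:{minutes:02d}:{secs:02d}]"
    return f"[inaudible {minutes:02d}:{secs:02d}]"


print(inaudible_marker(724))   # [inaudible 12:04]
```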
Keep a name key as you go. Jot a short list mapping what the tool heard to the correct spelling, then find-and-replace across the file. Relatives pronounce old family names their own way, and a key keeps a great-grandmother's surname spelled the same across two hours of talk.
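A name key is easy to apply with a short script if your editor's find-and-replace gets tedious. The sketch below assumes the key is a plain dictionary mapping what the tool heard to the verified spelling; the function name and the misheard spellings in the example are made up for illustration.

```python
import re


def apply_name_key(text, name_key):
    """Replace each misheard spelling with its verified spelling, whole words only."""
    if not name_key:
        return text
    # Match case-insensitively, so index the key by lowercase spelling.
    lookup = {heard.lower(): correct for heard, correct in name_key.items()}
    # Try longer spellings first so a short key cannot pre-empt a longer one.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(h) for h in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )
    # One pass over the text, so a corrected name is never replaced again.
    return pattern.sub(lambda m: lookup.get(m.group(0).lower(), m.group(0)), text)


name_key = {
    "Shepanski": "Szczepański",   # hypothetical mishearing
    "Ballin Robe": "Ballinrobe",  # hypothetical mishearing
}
draft = "Grandma shepanski came from Ballin Robe."
print(apply_name_key(draft, name_key))
# Grandma Szczepański came from Ballinrobe.
```

Because it matches whole words only, a plural such as "Shepanskis" is left alone and still needs a manual check. Run it on a working copy of the transcript, and only after each entry in the key has been verified by ear.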
Treat the recording as a genealogical source
A family transcript is evidence, and evidence carries standards. The Genealogical Proof Standard's five components include complete and accurate source citations and thorough analysis and correlation (Board for Certification of Genealogists). Cite the recording itself: who spoke, who made it, the date, where the file now lives, and the timestamp of the passage you're quoting.
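One way to make sure none of those five details gets dropped is to keep them in a small record alongside each quotation. The sketch below is an illustration only: the class, its field names, and the output wording are assumptions, not a citation format prescribed by the Board for Certification of Genealogists, and the values in the example are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordingCitation:
    """The details to capture when citing a passage from a family recording."""
    speaker: str         # who spoke
    recorded_by: str     # who made the recording
    recorded_on: str     # the date, as precisely as it is known
    file_location: str   # where the digitized file now lives
    timestamp: str       # where the quoted passage starts

    def as_text(self) -> str:
        # Illustrative wording only; adapt it to the citation style you follow.
        return (
            f"{self.speaker}, recorded by {self.recorded_by}, {self.recorded_on}; "
            f"digital file held at {self.file_location}; passage at {self.timestamp}."
        )


citation = RecordingCitation(
    speaker="Maria Szczepańska",          # hypothetical example values
    recorded_by="Jan Szczepański",
    recorded_on="about 1978",
    file_location="family archive, external drive A",
    timestamp="12:04",
)
print(citation.as_text())
```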
Know what kind of evidence you have. A recording your grandmother made is an original source, but the information inside it can be secondhand. Elizabeth Shown Mills' evidence-analysis map separates original from derivative sources, and firsthand from secondhand information; accounts that repeat hearsay, tradition, or lore don't speak from personal knowledge and need corroboration. Your grandmother's memory of her own wedding is firsthand; her account of her grandfather's birth year is not.
If you're capturing a new interview rather than transcribing an old tape, record it to archival standard from the start. Our oral history transcription guide walks through recording, consent, and repository-ready formatting for a fresh recording. This guide assumes the tape already exists and the clock is against you.
Archive the audio and transcript so they survive
Digitizing a tape once is not the same as preserving it. The Library of Congress lays out a four-step personal-archiving workflow. The first three steps are to identify your files, decide which ones matter most, and organize them with descriptive names. The fourth is the one people skip: make copies and store them in places as physically far apart as practical. One copy in one location is a single flood or failed drive away from gone.
Follow the 3-2-1 rule. Keep three copies, on two kinds of media, with one stored off-site (Texas State Library and Archives). For a family archive, that's the WAV master plus the transcript, held on your computer and an external drive, with a third copy in cloud storage.
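The rule is simple enough to check mechanically against a list of where your copies live. The following is a minimal sketch, assuming you describe each copy by hand; the class, field names, and media labels are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveCopy:
    label: str      # a name for the copy, for your own reference
    medium: str     # kind of media, e.g. "internal disk", "external drive", "cloud"
    offsite: bool   # True if stored away from the other copies


def check_3_2_1(copies):
    """Return the parts of the 3-2-1 rule a set of copies fails; empty means it passes."""
    gaps = []
    if len(copies) < 3:
        gaps.append("fewer than three copies")
    if len({c.medium for c in copies}) < 2:
        gaps.append("fewer than two kinds of media")
    if not any(c.offsite for c in copies):
        gaps.append("no off-site copy")
    return gaps


family_archive = [
    ArchiveCopy("computer", "internal disk", offsite=False),
    ArchiveCopy("backup drive", "external drive", offsite=False),
    ArchiveCopy("cloud storage", "cloud", offsite=True),
]
print(check_3_2_1(family_archive))   # []
```

The check only counts what you tell it. It does not confirm that the copies exist or still open, so test each copy now and then.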
Export a readable access copy too. Keep the WAV as the preservation master, but also export the finished transcript to a document you can attach to the tree or hand to relatives. Name every file for the surname, date, and place, so the archive is still findable years from now.
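A consistent naming pattern is easier to keep if a script builds the names for you. The sketch below puts surname, date, and place in that order; the underscore separators, ISO date format, ASCII-only spelling, and the trailing role label are assumptions chosen for portability between systems, not a published standard.

```python
import re
import unicodedata
from datetime import date


def slug(text):
    """Lowercase ASCII form of a name, safe for file names on most systems."""
    # Split accented letters into base letter plus accent, then drop the accents.
    # Letters with no ASCII base form (such as the Polish barred l) are dropped
    # entirely, so check the result for unusual names.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def archive_name(surname, recorded_on, place, role, extension):
    """Build a file name of the form surname_date_place_role.extension."""
    return f"{slug(surname)}_{recorded_on.isoformat()}_{slug(place)}_{role}.{extension}"


print(archive_name("Szczepański", date(1978, 6, 14), "Ballinrobe", "master", "wav"))
# szczepanski_1978-06-14_ballinrobe_master.wav
```

The file name is only a finding aid. Keep the correct spelling, accents included, in the transcript and the citation.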