Small toolset for converting downloaded Amazon Transcribe JSON into a clean speaker-separated transcript.
This repo includes:
- A web app (Vite + React + TypeScript) where you paste JSON directly.
- A CLI you can run as npx amazon-transcribe-parser.
Use the raw JSON output downloaded from Amazon Transcribe job results.
The parser reads:
- results.audio_segments + results.items (primary path)
- results.speaker_labels.segments + results.items (fallback)
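The primary/fallback choice above can be sketched as follows. This is a minimal illustration, not the parser's actual code: the interfaces list only the fields the sketch touches (field names follow the Amazon Transcribe output format as commonly documented), and `Turn`, `findTurn`, and `extractTurns` are hypothetical names. For brevity the primary path takes its text straight from each audio segment's `transcript` field, so it does not show how the real parser combines audio segments with `results.items`.

```typescript
// Assumed shapes: only the fields this sketch reads, not the full Transcribe schema.
interface TranscribeItem {
  type: "pronunciation" | "punctuation";
  start_time?: string; // punctuation items carry no timestamps
  end_time?: string;
  alternatives: { content: string }[];
}
interface AudioSegment {
  start_time: string;
  end_time: string;
  speaker_label?: string;
  transcript: string;
}
interface SpeakerSegment {
  start_time: string;
  end_time: string;
  speaker_label: string;
}
interface TranscribeResults {
  items: TranscribeItem[];
  audio_segments?: AudioSegment[];
  speaker_labels?: { segments: SpeakerSegment[] };
}
// Hypothetical output shape for one speaker turn.
interface Turn {
  speaker: string;
  start: number;
  end: number;
  text: string;
}

// Return the first turn whose time range contains the given time.
function findTurn(turns: Turn[], time: number): Turn | undefined {
  for (const turn of turns) {
    if (time >= turn.start && time <= turn.end) return turn;
  }
  return undefined;
}

function extractTurns(results: TranscribeResults): Turn[] {
  // Primary path: audio segments already carry assembled text.
  if (results.audio_segments && results.audio_segments.length > 0) {
    return results.audio_segments.map((seg) => ({
      speaker: seg.speaker_label ?? "unknown",
      start: Number(seg.start_time),
      end: Number(seg.end_time),
      text: seg.transcript,
    }));
  }

  // Fallback: rebuild the text by assigning each word to the speaker
  // segment whose time range contains the word's start time.
  const segments = results.speaker_labels?.segments ?? [];
  const turns: Turn[] = segments.map((seg) => ({
    speaker: seg.speaker_label,
    start: Number(seg.start_time),
    end: Number(seg.end_time),
    text: "",
  }));

  let last: Turn | undefined;
  for (const item of results.items) {
    const content = item.alternatives[0]?.content ?? "";
    if (item.type === "punctuation") {
      // Punctuation has no timestamp, so attach it to the previous word.
      if (last) last.text += content;
      continue;
    }
    const turn = findTurn(turns, Number(item.start_time));
    if (!turn) continue;
    turn.text += turn.text ? " " + content : content;
    last = turn;
  }
  return turns.filter((turn) => turn.text.length > 0);
}
```

Both paths return the same `Turn` shape, so renaming speakers and formatting can stay independent of which part of the JSON the text came from.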
- Install deps:
npm install
- Start the app:
npm run dev
- Open the local URL shown by Vite.
- Paste the downloaded Amazon Transcribe JSON.
- Click Parse Transcript.
- Optionally rename speaker labels (spk_0, spk_1, ...).
- Copy all text or download transcript.txt.
Run with NPX:
npx amazon-transcribe-parser --input ./transcribe-output.json
If no --speaker mapping is provided and you're in an interactive terminal, the CLI prompts you to name each detected speaker.
Options:
- -i, --input, --source <path>: source Amazon Transcribe JSON file.
- -s, --speaker <spk=name>: rename speaker IDs, repeatable.
- -o, --output <path>: write parsed transcript to file instead of stdout.
- --list-speakers: print detected speaker IDs and exit.
- --no-interactive: disable prompts.
- -h, --help: show help.
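To illustrate how a repeatable spk=name flag might be collected, here is a small sketch. It is an assumption about one reasonable approach, not the CLI's actual implementation; `parseSpeakerFlags` and `renameSpeaker` are hypothetical helper names, and the error messages are invented for the example.

```typescript
// Collect every "-s spk=name" / "--speaker spk=name" pair from an argv-style array.
function parseSpeakerFlags(argv: string[]): Record<string, string> {
  const names: Record<string, string> = {};
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] !== "-s" && argv[i] !== "--speaker") continue;
    const value: string | undefined = argv[++i];
    if (value === undefined) {
      throw new Error("--speaker expects a value like spk_0=Agent");
    }
    const eq = value.indexOf("=");
    // Require a non-empty ID before "=" and a non-empty name after it.
    if (eq <= 0 || eq === value.length - 1) {
      throw new Error("Invalid --speaker value: " + value);
    }
    // A later flag for the same ID overrides an earlier one.
    names[value.slice(0, eq)] = value.slice(eq + 1);
  }
  return names;
}

// Fall back to the raw speaker ID when no name was supplied for it.
function renameSpeaker(label: string, names: Record<string, string>): string {
  return Object.prototype.hasOwnProperty.call(names, label) ? names[label] ?? label : label;
}
```

With the arguments `--speaker spk_0=Agent --speaker spk_1=Customer`, the sketch yields a mapping from spk_0 to Agent and from spk_1 to Customer; any speaker ID without a mapping keeps its original label.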
Identify source file only:
npx amazon-transcribe-parser --input ./call.json
Identify speakers explicitly:
npx amazon-transcribe-parser \
--input ./call.json \
--speaker spk_0=Agent \
--speaker spk_1=Customer
Save output to file:
npx amazon-transcribe-parser --input ./call.json --output ./transcript.txt
List speaker IDs first:
npx amazon-transcribe-parser --input ./call.json --list-speakers
Run the CLI from a local clone of this repo:
npm run cli -- --input ./call.json
The parser outputs entries like:
[00:01.4 -> 00:05.8] Agent:
Hello, thanks for calling.
[00:05.9 -> 00:08.2] Customer:
Hi, I need help with my order.
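The entry layout above can be reproduced with a short formatter. This is a sketch under stated assumptions rather than the parser's own code: `Entry`, `formatTimestamp`, and `formatEntry` are hypothetical names, times are taken as seconds, and rounding to the nearest tenth is a guess (the real tool may truncate instead).

```typescript
// Hypothetical shape for one transcript entry, with times in seconds.
interface Entry {
  speaker: string;
  start: number;
  end: number;
  text: string;
}

// Render seconds as MM:SS.t, working in whole tenths so that a value
// such as 59.96 rolls over to 01:00.0 instead of printing 00:60.0.
function formatTimestamp(seconds: number): string {
  const tenths = Math.round(seconds * 10);
  const minutes = Math.floor(tenths / 600);
  const secs = Math.floor((tenths % 600) / 10);
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return pad(minutes) + ":" + pad(secs) + "." + (tenths % 10);
}

// One entry: a "[start -> end] Speaker:" header line, then the text.
function formatEntry(entry: Entry): string {
  const range = "[" + formatTimestamp(entry.start) + " -> " + formatTimestamp(entry.end) + "]";
  return range + " " + entry.speaker + ":\n" + entry.text;
}

const sample: Entry = {
  speaker: "Agent",
  start: 1.4,
  end: 5.8,
  text: "Hello, thanks for calling.",
};
console.log(formatEntry(sample));
```

Minutes are not wrapped at 60 in this sketch, so a recording longer than an hour would print values such as 75:12.3.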