Transcription API

The transcription flow has three resources:

  • File (/api/files/) — the audio you upload. Once uploaded, transcription starts automatically.
  • Transcript (/api/transcripts/) — the result, with metadata (language, duration, word count) and download links for SRT and DOCX.
  • Task (/api/tasks/) — the background job that did the work (useful for retry).

All endpoints in this section require either a session cookie or an API key — see Authentication.


Workflow

  1. Upload an audio file into an existing project with POST /api/files/. The response includes the file id, and transcription is enqueued automatically.
  2. Poll GET /api/files/{id}/ until status becomes transcribed (success) or failed.
  3. Read the resulting transcript record with GET /api/transcripts/ or GET /api/transcripts/{id}/.
  4. Fetch a time-limited download URL for the SRT or DOCX file with GET /api/transcripts/{id}/download/?type=srt|docx.

POST /api/files/

Upload an audio file. Transcription is triggered automatically when the caller’s subscription has minutes remaining.

Auth: API key or session.

Content-Type: multipart/form-data

Form fields:

| Field | Required | Description |
|---|---|---|
| project | yes | ID of the project the file belongs to. Must be visible to the caller. |
| file | yes | The audio file. Common formats are supported (mp3, wav, m4a, flac, ogg, mp4, …). |
| file_type | yes | Set to audio for transcription. Other values (csv, xlsx, srt, vtt) are used for non-audio uploads. |

Response (201 Created):

{
  "id": 4217,
  "project": 312,
  "original_filename": "interview.wav",
  "file_type": "audio",
  "status": "transcribing",
  "duration_seconds": 1834.2,
  "uploaded_at": "2026-05-17T14:02:11Z"
}

Example:

curl -X POST https://app.uspeech.io/api/files/ \
  -H "Authorization: Api-Key abc123.longersecretstringhere" \
  -F "project=312" \
  -F "file_type=audio" \
  -F "file=@./interview.wav"

⚠️ Important: a 201 only means the upload was accepted and the job was queued. The transcript itself is produced asynchronously — see the polling pattern below.


GET /api/files/{id}/

Read a single file, including its current status.

Auth: API key or session.

Response (200 OK): same shape as the POST response above.

| Status | Meaning |
|---|---|
| uploaded | File saved, transcription not yet queued (used by non-audio uploads). |
| transcribing | Transcription is in progress. |
| transcribed | Transcription completed successfully. A Transcript record now exists. |
| failed | Transcription failed. Check Transcript.error_reason for details (see Status Codes & Errors). |
| analyzing, completed | Set when downstream analysis is running or finished (not relevant for the transcription-only flow). |

Polling pattern:

file_id=4217
while :; do
  # Fetch the file record and extract its status field.
  status=$(curl -sS "https://app.uspeech.io/api/files/${file_id}/" \
    -H "Authorization: Api-Key $USPEECH_KEY" | jq -r .status)
  echo "$(date -u +%H:%M:%S) status=$status"
  # Stop on either terminal state.
  case "$status" in
    transcribed|failed) break ;;
  esac
  sleep 10
done

💡 Tip: transcription typically takes ~1/8 of the audio’s duration. A 10-minute call is usually done in well under two minutes.
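
The loop above runs until the file reaches a terminal state, which can mean waiting indefinitely if a job stalls. A minimal sketch of a bounded variant follows, using the 1/8 rule of thumb to size the wait. The helper names (get_status, estimate_wait_seconds, poll_file) and the exit codes are illustrative assumptions, not part of the API.

```shell
# Hypothetical helper: fetch the current status of a file (network call).
get_status() {
  curl -sS "https://app.uspeech.io/api/files/$1/" \
    -H "Authorization: Api-Key $USPEECH_KEY" | jq -r .status
}

# Rule-of-thumb wait in seconds: about 1/8 of the audio duration, rounded up.
# $1 = audio duration in whole seconds
estimate_wait_seconds() {
  echo $(( ($1 + 7) / 8 ))
}

# Poll until the file is transcribed or failed, or the attempt budget runs out.
# $1 = file id, $2 = max attempts, $3 = seconds between attempts (default 10)
# Returns 0 on transcribed, 1 on failed, 2 if the budget was exhausted.
poll_file() {
  attempts=0
  while [ "$attempts" -lt "$2" ]; do
    status=$(get_status "$1")
    case "$status" in
      transcribed) return 0 ;;
      failed)      return 1 ;;
    esac
    attempts=$((attempts + 1))
    sleep "${3:-10}"
  done
  return 2
}
```

For the 1834-second file in the upload response, estimate_wait_seconds 1834 prints 230. Allowing twice that (an arbitrary safety margin) at 10-second intervals gives a budget of 46 attempts: poll_file 4217 46.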


GET /api/transcripts/

List transcripts visible to the caller (scoped to projects they can see).

Auth: API key or session.

Response (200 OK): paginated list. Each item:

{
  "id": 901,
  "uploaded_file": 4217,
  "language": "en",
  "duration_seconds": 1834.2,
  "word_count": 3120,
  "srt_path": "tenant/312/results/transcripts/interview.srt",
  "docx_path": "tenant/312/results/transcripts/interview.docx",
  "status": "completed",
  "created_at": "2026-05-17T14:05:33Z",
  "updated_at": "2026-05-17T14:08:47Z"
}

status here is one of queued | processing | completed | failed.


GET /api/transcripts/{id}/

Retrieve a single transcript by ID. Same field shape as the list response.

curl https://app.uspeech.io/api/transcripts/901/ \
  -H "Authorization: Api-Key abc123.longersecretstringhere"

If the transcript exists but isn’t visible to the caller, the response is 404 Not Found.


GET /api/transcripts/{id}/download/

Return a time-limited URL for the transcript artefact (SRT or DOCX). The URL points at private storage, so use it from the same machine that received the response.

Auth: API key or session.

Query parameters:

| Param | Required | Description |
|---|---|---|
| type | no | srt (default) or docx. |

Response (200 OK):

{
  "download_url": "https://…/interview.srt?X-Amz-Signature=…"
}

If the requested artefact hasn’t been generated (e.g. the transcript failed), the response is 404 Not Found.

Example:

# Get the URL …
url=$(curl -sS "https://app.uspeech.io/api/transcripts/901/download/?type=srt" \
  -H "Authorization: Api-Key abc123.longersecretstringhere" | jq -r .download_url)
# … then download.
curl -o interview.srt "$url"
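
A missing artefact returns 404, so a script may want to check the HTTP status before reading download_url. The following is a sketch only: fetch_download_url is a hypothetical helper, and the local check of the type value is an assumption based on the parameter table, not documented server behaviour.

```shell
# Hypothetical helper: print the signed download URL, or fail with a message.
# $1 = transcript id, $2 = artefact type (srt or docx; defaults to srt)
# Returns 0 on success, 1 on a non-200 response, 2 on an unsupported type.
fetch_download_url() {
  kind="${2:-srt}"
  case "$kind" in
    srt|docx) ;;
    *) echo "unsupported type: $kind" >&2; return 2 ;;
  esac

  # Write the body to a temp file so the HTTP status can be captured separately.
  body=$(mktemp)
  code=$(curl -sS -o "$body" -w '%{http_code}' \
    "https://app.uspeech.io/api/transcripts/$1/download/?type=$kind" \
    -H "Authorization: Api-Key $USPEECH_KEY")

  if [ "$code" = "200" ]; then
    jq -r .download_url "$body"
    rc=$?
  else
    echo "download URL request returned HTTP $code" >&2
    rc=1
  fi
  rm -f "$body"
  return "$rc"
}
```

A caller could then write url=$(fetch_download_url 901 docx) and only start the download when the helper succeeds.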

Retrying a failed transcription

If a file ends up in failed, you can retry it via the Task API.

  1. Find the transcription Task for the file at GET /api/tasks/?task_type=transcription (or follow the link from the web app).
  2. Call POST /api/tasks/{task_id}/retry/ to re-queue it.

Don’t retry on terminal failures like audio_too_short — see Status Codes & Errors for the full list of error_reason values.
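
A sketch of the retry decision, under stated assumptions: the error_reason value has already been read from the transcript record, and the task ID has already been found via the task list. TERMINAL_REASONS contains only the one value named in this section, and is_retryable and retry_task are hypothetical helper names.

```shell
# Reasons that should not be retried. Only audio_too_short is named in this
# section; extend the list from Status Codes & Errors.
TERMINAL_REASONS="audio_too_short"

# Succeeds (exit 0) when the given error_reason is not in the terminal list.
is_retryable() {
  for reason in $TERMINAL_REASONS; do
    [ "$1" = "$reason" ] && return 1
  done
  return 0
}

# Re-queue a task by ID (network call).
retry_task() {
  curl -sS -X POST "https://app.uspeech.io/api/tasks/$1/retry/" \
    -H "Authorization: Api-Key $USPEECH_KEY"
}

# Usage sketch: error_reason and task_id are looked up beforehand.
# is_retryable "$error_reason" && retry_task "$task_id"
```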


End-to-end example

# 1. Upload
resp=$(curl -sS -X POST https://app.uspeech.io/api/files/ \
  -H "Authorization: Api-Key $USPEECH_KEY" \
  -F "project=312" -F "file_type=audio" \
  -F "file=@./interview.wav")
file_id=$(echo "$resp" | jq -r .id)

# 2. Poll
while :; do
  status=$(curl -sS "https://app.uspeech.io/api/files/${file_id}/" \
    -H "Authorization: Api-Key $USPEECH_KEY" | jq -r .status)
  [ "$status" = "transcribed" ] && break
  [ "$status" = "failed" ] && { echo "transcription failed"; exit 1; }
  sleep 10
done

# 3. Find the transcript and download SRT
transcript_id=$(curl -sS "https://app.uspeech.io/api/transcripts/?uploaded_file=${file_id}" \
  -H "Authorization: Api-Key $USPEECH_KEY" | jq -r '.results[0].id')
url=$(curl -sS "https://app.uspeech.io/api/transcripts/${transcript_id}/download/?type=srt" \
  -H "Authorization: Api-Key $USPEECH_KEY" | jq -r .download_url)
curl -o interview.srt "$url"