Transcription API

The transcription flow has three resources:

File (/api/files/) — the audio you upload. Once uploaded, transcription starts automatically.
Transcript (/api/transcripts/) — the result, with metadata (language, duration, word count) and download links for SRT and DOCX.
Task (/api/tasks/) — the background job that did the work (useful for retry).

All endpoints in this section require either a session cookie or an API key — see Authentication.

The flow at a glance

1. POST /api/files/ uploads an audio file into an existing project. The response gives you a file.id, and transcription is enqueued automatically.
2. Poll GET /api/files/{id}/ until status becomes transcribed (success) or failed.
3. GET /api/transcripts/ or GET /api/transcripts/{id}/ returns the resulting transcript record.
4. GET /api/transcripts/{id}/download/?type=srt|docx returns a time-limited download URL for the file.

POST /api/files/

Upload an audio file. Auto-triggers transcription when the caller’s subscription has remaining minutes.

Auth: API key or session.

Content-Type: multipart/form-data

Form fields:

Field	Required	Description
`project`	yes	ID of the project the file belongs to. Must be visible to the caller.
`file`	yes	The audio file. Common formats are supported (mp3, wav, m4a, flac, ogg, mp4, …).
`file_type`	yes	Set to `audio` for transcription. Other values (`csv`, `xlsx`, `srt`, `vtt`) are used for non-audio uploads.
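
A pre-flight check can catch an obviously wrong file before the upload starts. The sketch below is illustrative only: looks_like_supported_audio is a hypothetical helper, and it knows just the six extensions named in the table. Because that list is open-ended, a non-match is a warning sign, not proof that the API would reject the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the API): returns 0 when the file's
# extension is one of the formats named in the table above.
looks_like_supported_audio() {
  local name ext
  name=${1##*/}                 # strip any directory part
  case "$name" in
    *.*) ext=${name##*.} ;;     # text after the last dot
    *)   return 1 ;;            # no extension at all
  esac
  # Compare case-insensitively so "Call.MP3" matches too.
  ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|wav|m4a|flac|ogg|mp4) return 0 ;;
    *) return 1 ;;
  esac
}

looks_like_supported_audio "./interview.wav" && echo "ok to upload"
```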

Response (201 Created):

{
  "id": 4217,
  "project": 312,
  "original_filename": "interview.wav",
  "file_type": "audio",
  "status": "transcribing",
  "duration_seconds": 1834.2,
  "error_reason": "",
  "uploaded_at": "2026-05-17T14:02:11Z"
}

Example:

curl -X POST https://app.uspeech.io/api/files/ \
  -H "Authorization: Api-Key abc123.longersecretstringhere" \
  -F "project=312" \
  -F "file_type=audio" \
  -F "file=@./interview.wav"

⚠️ Important: a 201 only means the upload was accepted and the job was queued. The transcript itself is produced asynchronously — see the polling pattern below.
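
To act on the status code in a script, curl's -w option can append it to the output. The following is a minimal sketch, not an official client: split_response and upload_audio are hypothetical names, and USPEECH_KEY is assumed to hold your API key.

```shell
#!/usr/bin/env bash
# Split "<body>\n<http_code>" (as produced by curl -w '\n%{http_code}')
# into the global variables http_code and body.
split_response() {
  local raw=$1
  http_code=${raw##*$'\n'}   # everything after the last newline
  body=${raw%$'\n'*}         # everything before the last newline
}

# usage: upload_audio PROJECT_ID PATH
# Succeeds only on 201; the JSON response is left in $body either way.
upload_audio() {
  local raw
  raw=$(curl -sS -X POST "https://app.uspeech.io/api/files/" \
    -H "Authorization: Api-Key $USPEECH_KEY" \
    -F "project=$1" -F "file_type=audio" -F "file=@$2" \
    -w '\n%{http_code}') || return 1
  split_response "$raw"
  [ "$http_code" = "201" ]
}
```

A caller might run upload_audio 312 ./interview.wav and, on success, read the file ID from $body before starting to poll.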

GET /api/files/{id}/

Read a single file, including its current status.

Auth: API key or session.

Response (200 OK): same shape as the POST response above.

File status reference

Status	Meaning
`uploaded`	File saved, transcription not yet queued (used by non-audio uploads).
`transcribing`	Transcription is in progress.
`transcribed`	Transcription completed successfully. A `Transcript` record now exists.
`failed`	Transcription failed. The `error_reason` field in this same response tells you why (`audio_too_short`, `hallucinated_transcript`, or empty for other errors) — see Status Codes & Errors.
`analyzing`, `completed`	Set when downstream analysis is running or finished (not relevant for the transcription-only flow).
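
A polling script has to decide what each of these values means for its loop. One way to keep that decision in a single place is sketched below. The function name and the phase labels are made up for illustration; only the status strings come from the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a file status from the table above to a coarse
# phase that a polling loop can branch on.
file_status_phase() {
  case "$1" in
    uploaded)            echo "not-queued" ;;
    transcribing)        echo "in-progress" ;;
    transcribed)         echo "done" ;;
    failed)              echo "failed" ;;
    analyzing|completed) echo "downstream" ;;
    *)                   echo "unknown" ;;   # e.g. "null" from an error body
  esac
}

for s in uploaded transcribing transcribed failed analyzing; do
  echo "$s -> $(file_status_phase "$s")"
done
```

Treating unrecognised values as their own phase lets a loop stop and report them instead of sleeping forever.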

Polling pattern

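file_id=4217
while :; do
  # --fail makes curl exit non-zero on HTTP errors (401, 404, ...),
  # so the loop stops instead of polling an error body forever.
  resp=$(curl -sS --fail "https://app.uspeech.io/api/files/${file_id}/" \
    -H "Authorization: Api-Key $USPEECH_KEY") || { echo "request failed" >&2; break; }
  status=$(echo "$resp" | jq -r .status)
  echo "$(date -u +%H:%M:%S) status=$status"
  case "$status" in
    transcribed) break ;;
    failed)
      # error_reason is "" (not null) for uncategorised errors, so test for empty explicitly.
      echo "failed, reason: $(echo "$resp" | jq -r 'if (.error_reason // "") == "" then "unknown" else .error_reason end')"
      break ;;
  esac
  sleep 10
done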

💡 Tip: transcription typically takes ~1/8 of the audio’s duration. A 10-minute call is usually done in well under two minutes.
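
The rule of thumb is easy to turn into a number, for example to decide how long a script should keep polling. This is a rough sketch: estimate_transcription_seconds is a hypothetical helper, and the 1/8 ratio is only the typical figure quoted above, not a guarantee.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimates processing time as 1/8 of the audio length.
# Accepts duration_seconds as returned by the API (for example 1834.2).
estimate_transcription_seconds() {
  local whole=${1%.*}          # drop fractional seconds: 1834.2 -> 1834
  echo $(( (whole + 7) / 8 ))  # integer division by 8, rounded up
}

estimate_transcription_seconds 600      # 10-minute call, prints 75
estimate_transcription_seconds 1834.2   # the sample file above, prints 230
```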

GET /api/transcripts/

List transcripts visible to the caller (scoped to projects they can see).

Auth: API key or session.

Response (200 OK): paginated list. Each item:

{
  "id": 901,
  "uploaded_file": 4217,
  "language": "en",
  "duration_seconds": 1834.2,
  "word_count": 3120,
  "srt_path": "tenant/312/results/transcripts/interview.srt",
  "docx_path": "tenant/312/results/transcripts/interview.docx",
  "status": "completed",
  "error_reason": "",
  "created_at": "2026-05-17T14:05:33Z",
  "updated_at": "2026-05-17T14:08:47Z"
}

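The transcript's status moves through queued → processing and ends in either completed or failed. These values are separate from the file statuses returned by /api/files/. When status is failed, error_reason holds a machine-readable cause (audio_too_short, hallucinated_transcript) or is empty for other errors (see Status Codes & Errors).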

GET /api/transcripts/{id}/

Retrieve a single transcript by ID. Same field shape as the list response.

curl https://app.uspeech.io/api/transcripts/901/ \
  -H "Authorization: Api-Key abc123.longersecretstringhere"

If the transcript exists but isn’t visible to the caller, the response is 404 Not Found.

GET /api/transcripts/{id}/download/

Return a time-limited URL for the transcript artefact (SRT or DOCX). The URL points at private storage — use it from the same machine that received the response.

Auth: API key or session.

Query parameters:

Param	Required	Description
`type`	no	`srt` (default) or `docx`.

Response (200 OK):

{
  "download_url": "https://…/interview.srt?X-Amz-Signature=…"
}

If the requested artefact hasn’t been generated (e.g. the transcript failed), the response is 404 Not Found.
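
In a script, the type parameter and the 404 case can both be handled before anything is written to disk. The sketch below is one possible shape under stated assumptions: download_endpoint and fetch_transcript are hypothetical helpers, USPEECH_KEY is assumed to hold your API key, and jq is assumed to be installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the download endpoint and rejects any type
# other than the two documented in the table above.
download_endpoint() {   # usage: download_endpoint TRANSCRIPT_ID [srt|docx]
  local id=$1 kind=${2:-srt}   # srt is the documented default
  case "$kind" in
    srt|docx) ;;
    *) echo "unsupported type: $kind" >&2; return 2 ;;
  esac
  echo "https://app.uspeech.io/api/transcripts/${id}/download/?type=${kind}"
}

# Fetches the artefact to OUTFILE. curl --fail turns the documented 404
# (artefact not generated) into a non-zero exit instead of a saved error body.
fetch_transcript() {    # usage: fetch_transcript TRANSCRIPT_ID TYPE OUTFILE
  local endpoint resp url
  endpoint=$(download_endpoint "$1" "$2") || return
  resp=$(curl -sS --fail "$endpoint" \
    -H "Authorization: Api-Key $USPEECH_KEY") || return
  url=$(printf '%s' "$resp" | jq -r .download_url)
  curl -sS --fail -o "$3" "$url"
}
```

For example, fetch_transcript 901 docx interview.docx would save the DOCX, or return non-zero if it was never generated.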

Example:

# Get the URL …
url=$(curl -sS "https://app.uspeech.io/api/transcripts/901/download/?type=srt" \
  -H "Authorization: Api-Key abc123.longersecretstringhere" | jq -r .download_url)

# … then download.
curl -o interview.srt "$url"

Retrying a failed transcription

If a file ends up in failed, you can retry it via the Task API.

1. Find the transcription Task for the file with GET /api/tasks/?task_type=transcription (or follow the link from the web app).
2. Call POST /api/tasks/{task_id}/retry/ to re-queue it.

Don’t retry on terminal failures like audio_too_short — see Status Codes & Errors for the full list of error_reason values.
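
A retry script can enforce that rule before it calls the Task API. The following is a minimal sketch with hypothetical helper names. It treats only audio_too_short as terminal because that is the one value named here; check Status Codes & Errors before extending the list. USPEECH_KEY is assumed to hold your API key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the error_reason should not be retried.
is_terminal_failure() {
  case "$1" in
    audio_too_short) return 0 ;;
    *) return 1 ;;
  esac
}

# usage: retry_if_worthwhile ERROR_REASON TASK_ID
# Re-queues the task unless the failure is terminal.
retry_if_worthwhile() {
  if is_terminal_failure "$1"; then
    echo "not retrying: $1 is a terminal failure" >&2
    return 1
  fi
  curl -sS -X POST "https://app.uspeech.io/api/tasks/$2/retry/" \
    -H "Authorization: Api-Key $USPEECH_KEY"
}
```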

End-to-end example

# 1. Upload
resp=$(curl -sS -X POST https://app.uspeech.io/api/files/ \
  -H "Authorization: Api-Key $USPEECH_KEY" \
  -F "project=312" -F "file_type=audio" \
  -F "file=@./interview.wav")
file_id=$(echo "$resp" | jq -r .id)

# 2. Poll
while :; do
  resp=$(curl -sS "https://app.uspeech.io/api/files/${file_id}/" \
    -H "Authorization: Api-Key $USPEECH_KEY")
  status=$(echo "$resp" | jq -r .status)
  [ "$status" = "transcribed" ] && break
  [ "$status" = "failed" ] && {
    echo "transcription failed: $(echo "$resp" | jq -r 'if (.error_reason // "") == "" then "unknown" else .error_reason end')"
    exit 1
  }
  sleep 10
done

# 3. Find the transcript and download SRT
transcript_id=$(curl -sS "https://app.uspeech.io/api/transcripts/?uploaded_file=${file_id}" \
  -H "Authorization: Api-Key $USPEECH_KEY" | jq -r '.results[0].id')

url=$(curl -sS "https://app.uspeech.io/api/transcripts/${transcript_id}/download/?type=srt" \
  -H "Authorization: Api-Key $USPEECH_KEY" | jq -r .download_url)

curl -o interview.srt "$url"