💬 Transcriptions Guide

This guide covers everything you need to know about transcribing audio and video files in USpeech Analytics.

Overview

USpeech Analytics uses advanced AI to convert your audio and video recordings into accurate, searchable text. The transcription engine:

Supports multiple languages (English, Spanish, Ukrainian and more, with auto-detection)
Identifies different speakers automatically
Provides turn-level timestamps (optional)
Handles various audio qualities and accents

Quick How-To

📋 How to create a project

Navigate to the Transcription panel
Select Create Project in the Project drop-down menu
Choose project configuration:
- Project Title - enter a name for your project
- Audio File Type - select the type of audio files you want to transcribe. Available types: Interview (in-depth interviews), Focus Group (group discussions), and Call (contact center calls)
- Return Timestamps - select this option if you want to include timestamps in the transcript
- Language - select the language of your audio files. The only available option is Auto

📝 How to Transcribe a File

Navigate to your Audio Analysis project
Click Upload Files and select one or more audio/video files
Wait for the upload to complete. Files appear in the My Files table with the status Uploaded (or Transcribed for SRT files)
Select one or more files to transcribe and click the Transcribe button. Alternatively, click the three dots next to the file name and select Transcribe from the drop-down menu
The status changes to “Processing”. Processing typically takes about 1/8 of the audio length
Once complete, the status will change to “Transcribed” and you can click on the file name to review
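
For planning purposes, the 1/8 rule of thumb above can be turned into a quick estimate. The sketch below is a local helper, not part of USpeech Analytics: the function name and the default ratio are assumptions for illustration, and real processing time varies with file size and audio quality.

```python
def estimate_processing_minutes(audio_minutes: float, ratio: float = 1 / 8) -> float:
    """Rough processing-time estimate: about 1/8 of the audio length."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * ratio


# A 60-minute interview: roughly 7.5 minutes of processing.
print(estimate_processing_minutes(60))  # 7.5
```

Treat the result as a ballpark figure only; the email notification described below is the reliable signal that a job has finished.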

💡 Tip: You’ll get an email notification when the transcription is finished. To enable email notifications, go to Profile, check “Notify job completion” and click Save.

⚠️ Important: Transcribed audio files are billed by duration once transcription is complete, regardless of how many analysis types are run on them. Uploaded transcription files, however, are not billed until an analysis is performed on them.

👁️‍🗨️ How to Review Your Transcript

Click on the file name to open the file details or transcript view. Only “Transcribed” files can be reviewed
Review the transcript

💾 How to Download Your Transcript

Select one or more files with the “Transcribed” status
Click the Download button. The transcript is downloaded as a Word document (.docx)

Supported File Types

Audio Formats

Format	Extension	Notes
MP3	.mp3	Most common, widely supported
WAV	.wav	Uncompressed, highest quality
M4A	.m4a	Apple format, good compression
AAC	.aac	Advanced Audio Coding
OGG	.ogg	Open format
FLAC	.flac	Lossless compression

Video Formats

Format	Extension	Notes
MP4	.mp4	Most common video format
MOV	.mov	Apple QuickTime format
AVI	.avi	Windows format
WebM	.webm	Web-optimized format

Transcription Formats

Format	Extension	Notes
SRT	.srt	SubRip Subtitle format
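
If you have not worked with SRT before: an SRT file is a sequence of cues separated by blank lines, where each cue has a numeric index, a time range in the form HH:MM:SS,mmm --> HH:MM:SS,mmm, and one or more lines of text. The sketch below is a generic SRT reader you could use to sanity-check a file before uploading it. It is not the parser USpeech Analytics uses, and names such as Cue and parse_srt are illustrative.

```python
import re
from dataclasses import dataclass

# Matches an SRT time range such as "00:00:01,000 --> 00:00:04,500".
TIME_RANGE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: int
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into cues; malformed blocks are skipped."""
    cues = []
    # Drop a leading byte-order mark and normalize Windows line endings.
    normalized = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.strip().split("\n")
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue
        match = TIME_RANGE.match(lines[1].strip())
        if match is None:
            continue
        parts = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start=_seconds(*parts[:4]),
                end=_seconds(*parts[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


sample = """1
00:00:01,000 --> 00:00:04,500
Thank you for joining us today.

2
00:00:05,000 --> 00:00:09,250
Of course. I've been using the product for about six months.
"""

for cue in parse_srt(sample):
    print(cue.index, cue.start, cue.end, cue.text)
```

If the number of parsed cues is much lower than the highest index in the file, some blocks are probably malformed and worth fixing before upload.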

File Limits

Maximum duration: 2 hours per file
Maximum file size: 500 MB
Minimum quality: 16 kbps audio bitrate recommended
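
When you prepare many files at once, it can help to check them against these limits before uploading. The following is a minimal local sketch, not a USpeech Analytics feature. It assumes that 1 MB means 1,000,000 bytes (the guide does not say which definition applies) and that you supply the duration yourself, since reading it from a file requires an audio library. The function and constant names are illustrative.

```python
from pathlib import Path

# Extensions listed under "Supported File Types" above.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac",  # audio
    ".mp4", ".mov", ".avi", ".webm",                  # video
    ".srt",                                           # transcription
}
MAX_SIZE_BYTES = 500 * 1000 * 1000  # assumption: 1 MB = 1,000,000 bytes
MAX_DURATION_SECONDS = 2 * 60 * 60  # 2 hours per file


def check_file(name, size_bytes, duration_seconds=None):
    """Return a list of problems; an empty list means the file looks OK."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 500 MB")
    if duration_seconds is not None and duration_seconds > MAX_DURATION_SECONDS:
        problems.append("recording is longer than 2 hours")
    return problems


print(check_file("interview.mp3", 40_000_000, 45 * 60))
print(check_file("notes.wma", 600_000_000, 3 * 60 * 60))
```

The first call reports no problems; the second reports all three, because the extension is not in the supported list and both limits are exceeded.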

Project Settings

When creating or editing an Audio Analysis project, you can configure default settings that apply to all new uploads.

Audio Type

Choose the type of content you’re transcribing to optimize results:

Audio Type	Best For	How It Helps
Interview	One-on-one conversations	Optimized for 2 speakers, clear turn-taking
Focus Group	Group discussions	Better handles multiple speakers, overlapping speech
Call	Phone/video calls	Handles varying audio quality, echo

Language Settings

Option	Description
Auto	Automatically detects the spoken language (recommended for mixed-language content)

💡 Tip: Use auto-detection unless you’re experiencing recognition issues with a specific language.

Timestamps

Setting	Description
Off (default)	Transcript shows speaker turns without timestamps — cleaner for reading
On	Includes turn-level timestamps, useful for video editing and precise citations

Understanding Transcript Output

Speaker Identification

The system automatically identifies different speakers and labels them. If possible, it also attempts to identify the role of each speaker (e.g., interviewer, participant):

Interviewer: Thank you for joining us today. Can you tell me about your experience?

Participant: Of course. I've been using the product for about six months now...

Note: Speaker labels are consistent within a file (Speaker 1 is always the same person), but may vary across different files.

Transcript Structure

Your transcript includes:

Speaker labels — Who is speaking
Speech content — What was said
Timestamps (if enabled) — When it was said
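
If you want to work with a transcript outside the app, this structure is easy to process. The sketch below counts words per speaker. It assumes you have copied the text out of the downloaded Word document, that timestamps are off, and that each turn starts with a short label followed by a colon, as in the example above. The exact layout of the exported document is not specified in this guide, so treat the parsing rule as an assumption.

```python
from collections import Counter


def words_per_speaker(transcript: str) -> Counter:
    """Count words spoken by each labeled speaker.

    Assumes one turn per line in the form "Label: text". Lines without
    a label are treated as a continuation of the previous turn.
    """
    counts = Counter()
    current = None
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        label, sep, rest = line.partition(": ")
        # Heuristic: treat the prefix as a speaker label only if it is short.
        if sep and len(label.split()) <= 3:
            current, line = label, rest
        if current is not None:
            counts[current] += len(line.split())
    return counts


sample = (
    "Interviewer: Thank you for joining us today. "
    "Can you tell me about your experience?\n"
    "\n"
    "Participant: Of course. I've been using the product "
    "for about six months now...\n"
)
print(words_per_speaker(sample))
```

A very uneven word count between speakers can be a quick hint that speaker labels need a manual check.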

File Status Reference

Status	Meaning	Action
Uploaded	File uploaded, ready for transcription	Click “Transcribe” to start
Processing	Processing in progress	Wait for completion
Transcribed	Transcription complete	View or download transcript
Failed	An error occurred	Try re-uploading or contact support

Tips for Best Results

Before Recording

Use quality equipment — A decent microphone makes a big difference
Minimize background noise — Record in quiet environments
Position microphones correctly — Ensure all speakers are audible
Test levels — Do a short test recording first

File Preparation

Trim unnecessary sections — Remove long silences or irrelevant parts
Use common formats — MP3 or M4A work best
Keep files under 2 hours — Split longer recordings
Ensure consistent volume — Normalize audio if needed
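
For recordings longer than 2 hours, you can work out the cut points before opening your audio editor. This is a small sketch with illustrative names: it only computes start and end times in seconds, and the actual cutting is left to whatever tool you use.

```python
MAX_SEGMENT_SECONDS = 2 * 60 * 60  # 2-hour limit per file


def plan_segments(total_seconds: int, max_seconds: int = MAX_SEGMENT_SECONDS):
    """Return (start, end) pairs in seconds, each at most max_seconds long."""
    if total_seconds <= 0 or max_seconds <= 0:
        raise ValueError("durations must be positive")
    segments = []
    start = 0
    while start < total_seconds:
        end = min(start + max_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


# A 5-hour recording (18,000 s) splits into 2 h + 2 h + 1 h.
print(plan_segments(5 * 60 * 60))
# [(0, 7200), (7200, 14400), (14400, 18000)]
```

In practice, pass a slightly smaller max_seconds to leave a safety margin, and move each cut to a nearby pause so that no sentence is split between files.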

After Transcription

Review the transcript — AI is accurate but not perfect
Check speaker labels — Verify speaker identification
Note any corrections — Keep track of specialized terms or names

Troubleshooting

Common Issues

Transcription is taking too long

Large files (>1 hour) may take 15-30 minutes
Check your internet connection
Very poor audio quality requires more processing

Poor accuracy

Ensure audio quality is sufficient (clear speech, minimal noise)
If your project offers a manual language option, try setting the language explicitly instead of relying on auto-detect
Background music or noise significantly impacts accuracy

Speakers not identified correctly

Very similar voices may be confused
Overlapping speech can cause misattribution
Short utterances may not be correctly attributed

File won’t upload

Check that the file format is supported
Ensure the file is under 500 MB
Try converting to MP3 format

Error Messages

Error	Cause	Solution
“File too large”	Exceeds 500 MB limit	Compress or split the file
“Unsupported format”	File type not recognized	Convert to MP3 or WAV
“Processing failed”	Audio couldn’t be processed	Check audio quality, re-upload
“Insufficient hours”	Usage limit reached	Upgrade subscription or wait for reset

Frequently Asked Questions

Q: How accurate are the transcriptions? A: Accuracy typically ranges from 90-98% depending on audio quality, accents, and background noise. Clear recordings in English achieve the highest accuracy.

Q: Can I edit the transcript? A: Currently, transcripts are read-only in the app. To make edits, download the transcript as a Word document.

Q: Are my recordings secure? A: Yes. Files are encrypted in transit and at rest. Recordings are stored privately and only accessible to your team.

Q: What happens to my audio files after transcription? A: Files are retained in your project until you delete them. You can remove files at any time.

Q: Can I transcribe in multiple languages in one file? A: Yes, use the “Auto” language setting. The system will detect and transcribe mixed-language content.

Next Steps

Once you have transcripts, you can:

Analyze conversations to extract insights
Export transcripts for use in other tools
Share results with your team