💬 Transcriptions Guide

This guide covers everything you need to know about transcribing audio and video files in USpeech Analytics.


USpeech Analytics uses advanced AI to convert your audio and video recordings into accurate, searchable text. The transcription engine:

  • Supports multiple languages (English, Spanish, Ukrainian and more, with auto-detection)
  • Identifies different speakers automatically
  • Provides turn-level timestamps (optional)
  • Handles various audio qualities and accents

To create a project:

  1. Navigate to the Transcription panel
  2. Select Create Project in the Project drop-down menu
  3. Choose the project configuration:
    • Project Title - enter a name for your project
    • Audio File Type - select the type of audio files you want to transcribe. Available types are Interview (in-depth interviews), Focus Group (group discussions), and Call (contact center calls)
    • Return Timestamps - select this option to include timestamps in the transcript
    • Language - select the language of your audio files. The only available option is Auto

To upload and transcribe files:

  1. Navigate to your Audio Analysis project
  2. Click Upload Files and select one or more audio/video files
  3. Wait for the upload to complete. Files appear in the My Files table with the status “Uploaded” (or “Transcribed” for SRT files)
  4. Select one or more files to transcribe and click the Transcribe button. You can also click the three dots next to the file name and select Transcribe from the drop-down menu
  5. The status changes to “Processing”. Processing typically takes about 1/8 of the audio length
  6. Once complete, the status changes to “Transcribed” and you can click the file name to review the transcript
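
As a rough planning aid, the 1/8 ratio mentioned in step 5 can be turned into a quick estimate. This is only a sketch: the ratio comes from this guide, while the function name `estimate_processing_time` is a hypothetical helper, and actual times will vary with file size and audio quality.

```python
from datetime import timedelta

# Ratio taken from the guide: processing typically takes about 1/8 of the audio length.
PROCESSING_RATIO = 1 / 8


def estimate_processing_time(audio_length: timedelta) -> timedelta:
    """Return a rough estimate of how long one file may take to transcribe."""
    return audio_length * PROCESSING_RATIO


if __name__ == "__main__":
    for minutes in (16, 60, 120):
        estimate = estimate_processing_time(timedelta(minutes=minutes))
        print(f"{minutes} min of audio: about {estimate} of processing")
```

Treat the result as a lower bound for planning; queueing and poor audio quality can add time.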

💡 Tip: You’ll get an email notification when the transcription is finished. To enable email notifications, go to Profile, check “Notify job completion” and click Save.

⚠️ Important: Transcribed audio files are billed by duration once the transcription is done, regardless of how many analysis types you run on them afterwards. Uploaded transcript files are not billed until an analysis is performed on them.

👁️‍🗨️ How to Review Your Transcript

  1. Click the file name to open the file details or transcript view. Only files with the “Transcribed” status can be reviewed
  2. Review the transcript

To download transcripts:

  1. Choose one or more files with the “Transcribed” status
  2. Click the Download button. The transcript is downloaded in Word Document (docx) format

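Audio formats:

| Format | Extension | Notes |
|--------|-----------|-------|
| MP3 | .mp3 | Most common, widely supported |
| WAV | .wav | Uncompressed, highest quality |
| M4A | .m4a | Apple format, good compression |
| AAC | .aac | Advanced Audio Coding |
| OGG | .ogg | Open format |
| FLAC | .flac | Lossless compression |

Video formats:

| Format | Extension | Notes |
|--------|-----------|-------|
| MP4 | .mp4 | Most common video format |
| MOV | .mov | Apple QuickTime format |
| AVI | .avi | Windows format |
| WebM | .webm | Web-optimized format |

Transcript formats:

| Format | Extension | Notes |
|--------|-----------|-------|
| SRT | .srt | SubRip Subtitle format |

File limits:

  • Maximum duration: 2 hours per file
  • Maximum file size: 500 MB
  • Minimum quality: 16 kbps audio bitrate recommended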
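
If you prepare many files at once, a small local check against the formats and limits listed above can catch problems before you upload. The sketch below is not part of USpeech Analytics: `check_file` is a hypothetical helper, the duration has to be supplied by you (for example from your recorder's metadata), and it assumes 1 MB means 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import List, Optional

# Values copied from the guide's format tables and file limits.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac",  # audio
    ".mp4", ".mov", ".avi", ".webm",                  # video
    ".srt",                                           # transcript
}
MAX_SIZE_BYTES = 500 * 1024 * 1024    # assumption: 1 MB = 1024 * 1024 bytes
MAX_DURATION_SECONDS = 2 * 60 * 60    # 2 hours per file


def check_file(name: str, size_bytes: int,
               duration_seconds: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 500 MB")
    # Duration is optional because it cannot be read from the file name alone.
    if duration_seconds is not None and duration_seconds > MAX_DURATION_SECONDS:
        problems.append("recording is longer than 2 hours")
    return problems


if __name__ == "__main__":
    print(check_file("interview.mp3", 40 * 1024 * 1024, 45 * 60))
    print(check_file("workshop.wav", 600 * 1024 * 1024, 3 * 60 * 60))
```

The check covers only the documented limits; it does not inspect bitrate or audio quality.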

When creating or editing an Audio Analysis project, you can configure default settings that apply to all new uploads.

Choose the type of content you’re transcribing to optimize results:

| Audio Type | Best For | How It Helps |
|------------|----------|--------------|
| Interview | One-on-one conversations | Optimized for 2 speakers, clear turn-taking |
| Focus Group | Group discussions | Better handles multiple speakers, overlapping speech |
| Call | Phone/video calls | Handles varying audio quality, echo |

Language:

| Option | Description |
|--------|-------------|
| Auto | Automatically detects the spoken language (recommended for mixed-language content) |

💡Tip: Use auto-detection unless you’re experiencing recognition issues with a specific language.

Return Timestamps:

| Setting | Description |
|---------|-------------|
| Off (default) | Transcript shows speaker turns without timestamps — cleaner for reading |
| On | Includes word-level timestamps — useful for video editing, precise citations |

The system automatically identifies different speakers and labels them. If possible, it also attempts to identify the role of each speaker (e.g., interviewer, participant):

Interviewer: Thank you for joining us today. Can you tell me about your experience?
Participant: Of course. I've been using the product for about six months now...

Note: Speaker labels are consistent within a file (Speaker 1 is always the same person), but may vary across different files.

Your transcript includes:

  1. Speaker labels — Who is speaking
  2. Speech content — What was said
  3. Timestamps (if enabled) — When it was said
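
If you copy a transcript into plain text for further processing, the speaker-label structure shown above is easy to split into turns. This is a minimal sketch under stated assumptions: each turn starts with `Label: text` as in the example, timestamps are off (their exact format is not described here), and `Turn` and `parse_turns` are hypothetical names rather than anything the product provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # label such as "Interviewer" or "Participant"
    text: str     # what was said in this turn


def parse_turns(transcript: str) -> List[Turn]:
    """Split 'Label: text' lines into speaker turns."""
    turns: List[Turn] = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() and text.strip():
            turns.append(Turn(speaker.strip(), text.strip()))
        elif turns:
            # No "Label:" prefix: treat the line as a continuation of the previous turn.
            turns[-1].text += " " + line
    return turns


sample = """Interviewer: Thank you for joining us today. Can you tell me about your experience?
Participant: Of course. I've been using the product for about six months now..."""

for turn in parse_turns(sample):
    print(f"{turn.speaker}: {len(turn.text.split())} words")
```

A continuation line that itself contains a colon would be mistaken for a new speaker, so check the result against the original transcript.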

| Status | Meaning | Action |
|--------|---------|--------|
| Uploaded | File uploaded, ready for transcription | Click “Transcribe” to start |
| Processing | Processing in progress | Wait for completion |
| Transcribed | Transcription complete | View or download transcript |
| Failed | An error occurred | Try re-uploading or contact support |
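
If you track files in your own notes or scripts, the four statuses above map naturally onto a small enumeration. This is an illustrative sketch only: this guide does not describe an API, so `FileStatus`, `NEXT_ACTION`, and `ready_for_review` are hypothetical names, with labels and actions copied from the table.

```python
from enum import Enum
from typing import Dict, List


class FileStatus(Enum):
    UPLOADED = "Uploaded"
    PROCESSING = "Processing"
    TRANSCRIBED = "Transcribed"
    FAILED = "Failed"


# Suggested next step for each status, as listed in the status table.
NEXT_ACTION: Dict[FileStatus, str] = {
    FileStatus.UPLOADED: "Click Transcribe to start",
    FileStatus.PROCESSING: "Wait for completion",
    FileStatus.TRANSCRIBED: "View or download transcript",
    FileStatus.FAILED: "Try re-uploading or contact support",
}


def ready_for_review(files: Dict[str, FileStatus]) -> List[str]:
    """Return the names of files whose transcripts can be opened."""
    return sorted(name for name, status in files.items()
                  if status is FileStatus.TRANSCRIBED)


if __name__ == "__main__":
    my_files = {
        "call_01.mp3": FileStatus.TRANSCRIBED,
        "call_02.mp3": FileStatus.PROCESSING,
        "focus_group.mp4": FileStatus.FAILED,
    }
    print(ready_for_review(my_files))
    for name, status in my_files.items():
        print(f"{name}: {NEXT_ACTION[status]}")
```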

When recording:

  1. Use quality equipment — A decent microphone makes a big difference
  2. Minimize background noise — Record in quiet environments
  3. Position microphones correctly — Ensure all speakers are audible
  4. Test levels — Do a short test recording first

Before uploading:

  1. Trim unnecessary sections — Remove long silences or irrelevant parts
  2. Use common formats — MP3 or M4A work best
  3. Keep files under 2 hours — Split longer recordings
  4. Ensure consistent volume — Normalize audio if needed

After transcription:

  1. Review the transcript — AI is accurate but not perfect
  2. Check speaker labels — Verify speaker identification
  3. Note any corrections — Keep track of specialized terms or names
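
For the advice to split recordings longer than 2 hours, one option is the external tool ffmpeg, which is not part of USpeech Analytics and must be installed separately. The sketch below only builds the command using ffmpeg's segment muxer; `build_split_command` and `split_recording` are hypothetical helpers, and the default chunk length of 1 hour 50 minutes is an assumption chosen to leave a margin under the limit.

```python
import subprocess
from pathlib import Path
from typing import List


def build_split_command(source: str, chunk_seconds: int = 6600) -> List[str]:
    """Build an ffmpeg command that cuts `source` into numbered chunks."""
    src = Path(source)
    # Output names look like session_part000.mp3, session_part001.mp3, ...
    pattern = str(src.with_name(f"{src.stem}_part%03d{src.suffix}"))
    return [
        "ffmpeg", "-i", str(src),
        "-f", "segment",                      # use the segment muxer
        "-segment_time", str(chunk_seconds),  # target length of each chunk
        "-c", "copy",                         # copy streams without re-encoding
        pattern,
    ]


def split_recording(source: str) -> None:
    """Run the split; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_split_command(source), check=True)


if __name__ == "__main__":
    print(" ".join(build_split_command("session.mp3")))
```

Because streams are copied rather than re-encoded, cut points may land slightly off the requested time, so verify each chunk's duration before uploading.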

Transcription is taking too long

  • Large files (>1 hour) may take 15-30 minutes
  • Check your internet connection
  • Very poor audio quality requires more processing

Poor accuracy

  • Ensure audio quality is sufficient (clear speech, minimal noise)
  • If a specific language option is available in your project settings, select it instead of Auto
  • Background music or noise significantly impacts accuracy

Speakers not identified correctly

  • Very similar voices may be confused
  • Overlapping speech can cause misattribution
  • Short utterances may not be correctly attributed

File won’t upload

  • Check file format is supported
  • Ensure file is under 500 MB
  • Try converting to MP3 format

| Error | Cause | Solution |
|-------|-------|----------|
| “File too large” | Exceeds 500 MB limit | Compress or split the file |
| “Unsupported format” | File type not recognized | Convert to MP3 or WAV |
| “Processing failed” | Audio couldn’t be processed | Check audio quality, re-upload |
| “Insufficient hours” | Usage limit reached | Upgrade subscription or wait for reset |

Q: How accurate are the transcriptions? A: Accuracy typically ranges from 90-98% depending on audio quality, accents, and background noise. Clear recordings in English achieve the highest accuracy.

Q: Can I edit the transcript? A: Currently, transcripts are read-only in the app. Download the transcript as a Word document (docx) to make edits.

Q: Are my recordings secure? A: Yes. Files are encrypted in transit and at rest. Recordings are stored privately and only accessible to your team.

Q: What happens to my audio files after transcription? A: Files are retained in your project until you delete them. You can remove files at any time.

Q: Can I transcribe in multiple languages in one file? A: Yes, use the “Auto” language setting. The system will detect and transcribe mixed-language content.


Once you have transcripts, you can:

  • Analyze conversations to extract insights
  • Export transcripts for use in other tools
  • Share results with your team