📊​ Survey Analysis Guide

This guide explains how to use Uspeech Analytics to automatically code and categorize open-ended survey responses.


Survey Analysis uses AI to automatically categorize open-ended text responses from surveys, saving hours of manual coding work. The system:

  • Identifies themes in free-text responses
  • Creates consistent codes across all responses
  • Handles large datasets efficiently (thousands of responses)
  • Exports results in analysis-ready Excel format

This is ideal for market researchers, UX researchers, and anyone working with open-ended survey data.


  1. Create a Survey Analysis project

    • Click Survey Analysis in the navigation bar
    • Click New Project
    • Name your project
    • Choose Google Forms
    • Click Create
  2. Upload your data file

    • Click the Upload File button and choose the file to analyze
    • Supported formats: CSV, Excel (.xlsx, .xls) exported from Google Forms
  3. Configure columns

    • After uploading, you’ll see the Configure Google Forms dialog
    • Select the column containing user ID
    • Select the column containing timestamps
  4. Configure analysis options

    • Next, you’ll see the Configure Survey Analysis dialog
    • Review the default settings
    • Adjust if needed
    • You don’t need to configure column names; they are already set
    • Click Save Settings
  1. Create a Survey Analysis project

    • Click Survey Analysis in the navigation bar
    • Click New Project
    • Name your project
    • Click Create
  2. Upload your data file

    • Click the Upload File button and choose the file to analyze
    • Supported file formats: CSV, Excel (.xlsx, .xls)
    • See the detailed format description below
  3. Configure columns and analysis options

    • Next, you’ll see the Configure Survey Analysis dialog
    • Select the required columns in the Response Column and User ID Column dropdowns
    • Set up the question in the Question dropdown
      • Either choose the question column name from the dropdown (the question text should be in the first row of that column)
      • Or choose the “Custom” option and enter the question text manually
      • Or choose a template name to apply the question-to-sheet mapping from that predefined template
    • Set up predefined categories if needed; by default, none are used
      • Either choose “Select from columns” in the Code Source dropdown, then set the corresponding column names in the Code id and Code name dropdowns
      • Or choose the name of a predefined category vocabulary from the dropdown
    • Adjust analysis parameters if needed
    • Click Save Settings
  1. Create a Survey Analysis project

    • Click Survey Analysis in the navigation bar
    • Click New Project
    • Name your project
    • Choose a template name from the dropdown (see Templates & Codes Guide)
    • Click Create
  2. Upload your data file

    • Click the Upload File button and choose the file to analyze
    • Supported file formats: CSV, Excel (.xlsx, .xls)
    • See the detailed format description below
  3. Configure analysis options

    • Next, you’ll see the Configure Survey Analysis dialog
    • Review the default settings
    • Adjust if needed
    • You don’t need to configure column names; they are already set in the template
    • Click Save Settings
  1. Run the analysis

    • Click Analyze
    • Processing time depends on the number of responses
    • Processing is done per question. Once processing is complete for a question, the Status will change to “Completed” on the corresponding row of the table
  2. Review results

    • View the results for each question in the table by clicking on the question row
    • You will see a pie chart with category distribution
  3. Download results

    • Check all questions you want to review or download results for
    • Click the Export button
    • Results are available as Excel files
    • Each response is coded with one or more category numbers

Best for: General open-ended questions like “What did you like about the product?” or “How can we improve?”

How it works:

  1. AI reads all responses to understand common themes
  2. Creates a codebook of categories that represent the data
  3. Assigns each response to one or more categories
  4. Produces frequency counts and percentages
  5. Output Excel file will contain:
    • codebook with category definitions
    • category codes in front of each response

Example input:

  • “The app is really fast and easy to use”
  • “I love how quickly I can find what I need”
  • “The interface is confusing and hard to navigate”

Example output codebook:

| Code | Category | Count | % |
|------|----------|-------|---|
| 1 | Speed/Performance | 45 | 22% |
| 2 | Ease of Use | 38 | 19% |
| 3 | Navigation Issues | 31 | 15% |
| 4 | Visual Design | 25 | 12% |
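
The counting step (step 4 above) can be illustrated with a short Python sketch. It assumes the coding is already done and that each response carries a list of category codes. The sample data and the `summarize` helper are hypothetical and are not part of Uspeech Analytics.

```python
from collections import Counter

# Hypothetical coded responses: each one has one or more category codes.
coded = [
    {"response": "The app is really fast and easy to use", "codes": [1, 2]},
    {"response": "I love how quickly I can find what I need", "codes": [1]},
    {"response": "The interface is confusing and hard to navigate", "codes": [3]},
]
codebook = {1: "Speed/Performance", 2: "Ease of Use", 3: "Navigation Issues"}


def summarize(coded, codebook):
    """Count responses per category and express each count as a share of all responses."""
    # set() guards against a code being listed twice on the same response.
    counts = Counter(code for row in coded for code in set(row["codes"]))
    total = len(coded)
    return [
        {"code": c, "category": codebook[c], "count": n, "pct": round(100 * n / total, 1)}
        for c, n in sorted(counts.items())
    ]


for row in summarize(coded, codebook):
    print(row)
```

Because one response can carry several codes, the percentages in such a summary can add up to more than 100%.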

Best for: Questions asking about brand awareness, preferences, or mentions like “What brands come to mind when you think of smartphones?”

How it works:

  1. AI identifies all brand names mentioned in responses
  2. Standardizes variations (e.g., “Apple,” “apple,” “APPLE” → “Apple”)
  3. Creates a code for each unique brand
  4. Counts mentions and calculates share of voice
  5. Output Excel file will contain:
    • codebook with brand definitions
    • brand codes in front of each response

Example input:

  • “I usually buy Samsung or sometimes Apple”
  • “Definitely iPhone, I’ve always been an Apple person”
  • “I prefer Google Pixel phones”

Example output codebook:

| Code | Brand | Mentions | % |
|------|-------|----------|---|
| 1 | Apple/iPhone | 156 | 35% |
| 2 | Samsung | 98 | 22% |
| 3 | Google Pixel | 45 | 10% |
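
To make the standardize-and-count idea concrete, here is a minimal Python sketch. It swaps the AI step for a hand-written alias table (`ALIASES`, a hypothetical name) and naive substring matching, so it only illustrates the bookkeeping, not how the product recognizes brands. Share of voice is computed here as a brand's mentions divided by all brand mentions, which is an assumption about the denominator.

```python
from collections import Counter

# Hypothetical alias table: lower-cased variant -> canonical brand name.
ALIASES = {
    "apple": "Apple/iPhone",
    "iphone": "Apple/iPhone",
    "samsung": "Samsung",
    "google pixel": "Google Pixel",
}


def brands_in(text):
    """Return the canonical brands whose aliases appear in the text (each brand once)."""
    lowered = text.lower()
    # Naive substring match: good enough for a demo, would misfire on e.g. "pineapple".
    return {brand for alias, brand in ALIASES.items() if alias in lowered}


responses = [
    "I usually buy Samsung or sometimes Apple",
    "Definitely iPhone, I've always been an Apple person",
    "I prefer Google Pixel phones",
]

mentions = Counter(brand for text in responses for brand in brands_in(text))
total_mentions = sum(mentions.values())
share = {brand: round(100 * n / total_mentions, 1) for brand, n in mentions.items()}
print(mentions)
print(share)
```

Note that “iPhone” and “Apple” in the same response count as a single mention of the merged brand.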

| Format | Extension | Notes |
|--------|-----------|-------|
| CSV | .csv | Comma-separated values, UTF-8 encoding recommended |
| Excel | .xlsx | Modern Excel format (2007+) |
| Excel (Legacy) | .xls | Older Excel format |

Your file should have:

  • Header row — Column names in the first row
  • Response column — At least one column with text responses
  • ID column (optional) — Respondent identifiers for tracking

Example data structure:

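| UserID | Response | Age | Gender | Question |
|--------|----------|-----|--------|----------|
| 001 | “I love the new design” | 25 | F | “What do you think about our new design?” |
| 002 | “Too expensive for what you get” | 34 | M | |
| 003 | “Fast shipping was great” | 28 | F | |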
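
If you build the input file programmatically, a sketch like the following produces CSV text with the structure shown above. It uses only Python's standard `csv` module. The column names simply mirror the example and can be anything, since you pick the columns in the configuration dialog.

```python
import csv
import io

FIELDS = ["UserID", "Response", "Age", "Gender", "Question"]

rows = [
    {"UserID": "001", "Response": "I love the new design", "Age": 25, "Gender": "F",
     "Question": "What do you think about our new design?"},
    {"UserID": "002", "Response": "Too expensive for what you get", "Age": 34, "Gender": "M",
     "Question": ""},
    {"UserID": "003", "Response": "Fast shipping was great", "Age": 28, "Gender": "F",
     "Question": ""},
]

# Write to an in-memory buffer; the header row comes first, as the guide requires.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# To save it, write csv_text to a file opened with encoding="utf-8" and newline="".
```

Keeping `UserID` as a string preserves leading zeros such as `001`.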

When you upload an Excel file with multiple sheets:

  • Each sheet is treated as a separate question
  • The sheet name becomes the question identifier
  • Results are generated per sheet

This is ideal for surveys with multiple open-ended questions — put each question’s responses in a separate sheet.

You can include a predefined codebook in your Excel file to guide the analysis. This is useful when you have specific categories or themes you want to identify in the responses.

Another option is to set the codebook in the analysis settings.

How to use:

  • Add two columns - one with codes and one with descriptions - to each sheet of your Excel file where you want to use a predefined codebook
  • These columns can be in any position in the sheet, and the rows of the codes are independent of the response rows
  • The AI will use these codes to categorize responses. It may still add new codes if needed, unless “Freeze Codes” is enabled
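
The layout described above, with codebook columns sitting beside the response columns and their rows unrelated to each other, can be sketched in Python as follows. The column names `Code` and `Description` are hypothetical; you choose the actual columns in the settings.

```python
from itertools import zip_longest

responses = [
    ("001", "I love the new design"),
    ("002", "Too expensive for what you get"),
    ("003", "Fast shipping was great"),
]
codebook = [(1, "Product Quality"), (2, "Price/Value")]

# Build sheet rows: response columns on the left, codebook columns on the right.
# The two lists are just placed side by side; row 1 of the codebook has no
# relation to response 001. Shorter columns are padded with empty cells.
sheet_rows = [["UserID", "Response", "Code", "Description"]]
for response, code in zip_longest(responses, codebook, fillvalue=("", "")):
    sheet_rows.append([*response, *code])

for row in sheet_rows:
    print(row)
```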

Select the column containing the text you want to analyze.

💡 Tips:

  • Choose the column with the actual response text
  • Avoid columns with just codes or numbers
  • For Excel files, ensure the correct sheet is selected

Select the column containing unique identifiers for each respondent.

Why it matters:

  • Allows you to match results back to your original data
  • Required if you want to merge results with other survey data
  • If not specified, row numbers are used

Provide the original question that was asked. Although the AI can often infer the question from the context, providing it explicitly can help the AI generate more relevant category names and descriptions.

Why it helps:

  • Gives AI context for better categorization
  • Results in more relevant category names
  • Especially useful for ambiguous responses

Advanced settings in the Configure Survey Analysis dialog

What it does: Sets the share of responses below which a category is merged into “Other.” The value is entered as a fraction, so 0.04 means 4%.

| Example | Meaning |
|---------|---------|
| 0.04 (default) | Categories with <4% of responses become “Other” |
| 0.10 | Categories with <10% become “Other” (fewer categories) |
| 0.01 | Even rare categories (1%+) are kept (more categories) |

  • Lower threshold = More categories, more detail
  • Higher threshold = Fewer categories, cleaner results
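
As a rough illustration of the threshold's effect, the sketch below folds small categories into “Other”. The function name and sample counts are hypothetical, and the product's internal logic may differ in detail (for example, in how it treats responses with several rare codes).

```python
def apply_other_threshold(counts, total_responses, threshold=0.04):
    """Merge categories whose share of responses is below the threshold into "Other"."""
    kept, other = {}, 0
    for category, n in counts.items():
        if n / total_responses < threshold:
            other += n  # too rare: fold into "Other"
        else:
            kept[category] = n
    if other:
        kept["Other"] = other
    return kept


counts = {"Speed": 45, "Ease of Use": 38, "Pricing": 6, "Dark mode": 3}

print(apply_other_threshold(counts, total_responses=200))                  # default 0.04
print(apply_other_threshold(counts, total_responses=200, threshold=0.01))  # keeps rare ones
```

With 200 responses, “Pricing” (3%) and “Dark mode” (1.5%) fall below the default 4% and are merged; at 0.01 both survive.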

What it does: Sets the minimum number of responses required for a category to be included in the results. Categories with fewer responses are merged into “Other”.

Important: This setting works in conjunction with “Other Threshold”. If a category has fewer responses than the minimum, it will be merged into “Other” regardless of its percentage.

Important: This setting does not affect special categories like “Other” or “No Response”.

| Example | Meaning |
|---------|---------|
| 1 (default) | Categories with at least 1 response are included |
| 5 | Categories with at least 5 responses are included (filters out very rare categories) |
| 10 | Categories with at least 10 responses are included (only common categories) |

  • Lower minimum = More categories, more detail
  • Higher minimum = Fewer categories, more reliable results
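
One way to read the interplay of the two settings is that a category must pass both checks to stay, while special categories are exempt. The sketch below encodes that reading; `is_kept` and `SPECIAL` are hypothetical names, and this is an interpretation of the rules above rather than the product's actual code.

```python
# Special categories are never merged away, whatever their size.
SPECIAL = {"Other", "No Response"}


def is_kept(name, count, total, other_threshold=0.04, min_responses=1):
    """Return True if the category survives both the minimum-count and the threshold check."""
    if name in SPECIAL:
        return True
    return count >= min_responses and count / total >= other_threshold


# 6 of 100 responses is 6%, which passes the default 4% threshold...
print(is_kept("Pricing", 6, 100))
# ...but the same category is merged into "Other" once the minimum is raised to 10.
print(is_kept("Pricing", 6, 100, min_responses=10))
```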

What it does: Suggests how many categories the AI can assign to each response. Some responses fit several categories. If the number is too low, only the most relevant categories are assigned.

| Setting | Behavior |
|---------|----------|
| Auto (blank) | AI decides based on each response, usually 1-3 categories |
| Specific number | AI aims for approximately this many categories |

When to specify:

  • You need a specific number for your analysis framework
  • Responses are long and you want to get the most important categories
  • You are running a brand analysis and want to focus on the top brands

What it does: Controls whether AI can create new categories beyond predefined codes.

Important: This setting only works when predefined codes are provided.

| Setting | Behavior |
|---------|----------|
| Off (default) | AI uses predefined codes AND creates new ones if needed |
| On | AI only uses predefined codes; unmatched responses go to “Other” |

Use “On” when:

  • You have a complete, fixed codebook
  • Consistency across studies is critical
  • You’re replicating a previous analysis
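
The difference between the two modes can be sketched as follows. Here `suggested` stands in for whatever category the AI would propose for a response. The helper name, the exact-match lookup, and the use of 98 as the “Other” code are illustrative assumptions.

```python
OTHER_CODE = 98  # assumed code number for "Other"


def resolve_code(suggested, codebook, freeze_codes=False):
    """Map a suggested category name to a code number.

    codebook is a dict of number -> description and is extended in place
    when a new code is created (only possible with freeze_codes=False).
    """
    for number, description in codebook.items():
        if description.lower() == suggested.lower():
            return number  # matches a predefined code
    if freeze_codes:
        return OTHER_CODE  # frozen: unmatched responses go to "Other"
    # Not frozen: add a new code after the highest regular code.
    new_number = max((n for n in codebook if n < OTHER_CODE), default=0) + 1
    codebook[new_number] = suggested
    return new_number


codebook = {1: "Product Quality", 2: "Price/Value", 98: "Other"}
print(resolve_code("Delivery", dict(codebook), freeze_codes=True))   # falls back to Other
print(resolve_code("Delivery", dict(codebook), freeze_codes=False))  # creates a new code
```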

Predefined codes let you specify categories in advance, ensuring consistent coding across studies.

  • Tracking studies — Same codes across multiple waves
  • Team standardization — Everyone uses the same categories
  • Known categories — You already know the likely response types
  • Comparative analysis — Matching categories across questions
  1. From Templates (recommended)

    • Create a template with your codes (see Templates Guide)
    • Select the template when configuring analysis
    • Codes are automatically applied
  2. From File Columns

    • If your file already has a code column, select it as “Code ID Column”
    • Select the “Code Name Column” for category descriptions
    • Existing codes will be used as the starting point

Codes are structured as number → description pairs:

| Code | Description |
|------|-------------|
| 1 | Product Quality |
| 2 | Price/Value |
| 3 | Customer Service |
| 4 | Ease of Use |
| 98 | Other |
| 99 | None |

Results are downloaded as Excel files with these columns:

| Column | Description |
|--------|-------------|
| UserID/Resp | Respondent identifier |
| Answer | Original response text |
| Code1, Code2 | Category numbers assigned to this response |
| Cod_Num | Category code number |
| Cod_Key | Category description |
| Count | Number of responses in this category |
| Fraction | Percentage of total responses |

Please note that the column names may vary depending on the column names in the input file and project/template settings.

A single response can be assigned multiple codes when it mentions several themes:

Response: “Great quality but too expensive”

  • Code1: 1 (Product Quality)
  • Code2: 2 (Price/Value)
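
Once exported, the code columns can be gathered per response and joined back to your original data by respondent ID. The sketch below works on plain Python dicts and assumes export columns named `UserID`, `Code1`, and `Code2`; as noted above, the real column names may differ. The second sample row is invented for illustration.

```python
# Hypothetical rows as they might be read from the exported file.
exported = [
    {"UserID": "001", "Answer": "Great quality but too expensive", "Code1": "1", "Code2": "2"},
    {"UserID": "002", "Answer": "Support never replied", "Code1": "3", "Code2": ""},
]
# Original survey data, keyed by respondent ID.
original = {"001": {"Age": 25, "Gender": "F"}, "002": {"Age": 34, "Gender": "M"}}
codebook = {1: "Product Quality", 2: "Price/Value", 3: "Customer Service"}


def codes_of(row):
    """Collect the non-empty Code1, Code2, ... cells of one exported row as integers."""
    keys = sorted(
        (k for k in row if k.startswith("Code") and k[4:].isdigit()),
        key=lambda k: int(k[4:]),
    )
    return [int(row[k]) for k in keys if str(row[k]).strip()]


merged = [
    {
        "UserID": row["UserID"],
        **original[row["UserID"]],
        "categories": [codebook[c] for c in codes_of(row)],
    }
    for row in exported
]

for row in merged:
    print(row)
```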

| Category | Meaning |
|----------|---------|
| Other | Responses that don’t fit main categories (below threshold) |
| None | Irrelevant responses or cases when the user did not provide a meaningful response |

  1. Clean your data — Remove test responses, duplicates
  2. Check encoding — Use UTF-8 for CSV files
  3. Handle blanks — They will be cleaned up automatically, so you don’t need to worry about them
  4. Consistent format — Same structure across all sheets
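
For step 1, a small pre-processing pass might look like the sketch below. The `clean` helper, the `UserID` and `Response` field names, and the idea of listing test respondent IDs explicitly are all assumptions made for illustration.

```python
def clean(rows, test_ids=frozenset()):
    """Drop known test respondents and exact duplicate answers from the same respondent."""
    seen = set()
    cleaned = []
    for row in rows:
        # Compare answers case-insensitively and ignore surrounding whitespace.
        key = (row["UserID"], row["Response"].strip().lower())
        if row["UserID"] in test_ids or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = [
    {"UserID": "001", "Response": "I love the new design"},
    {"UserID": "001", "Response": "I love the new design "},  # duplicate submission
    {"UserID": "test", "Response": "asdf"},                   # test response
    {"UserID": "002", "Response": "Too expensive for what you get"},
]

print(clean(rows, test_ids={"test"}))
```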

| Scenario | Recommended Settings |
|----------|----------------------|
| First-time analysis, exploring data | Auto categories, 4% threshold |
| Tracking study with existing codes | Use template, freeze codes ON |
| Detailed analysis needed | Lower threshold (1-2%), more categories |
| Executive summary needed | Higher threshold (5-10%), fewer categories |

  1. Check category names — Do they make sense for your data?
  2. Review “Other” — Are important themes being missed?
  3. Spot-check assignments — Verify a sample of codings
  4. Adjust if needed — Re-run with different settings
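
For the spot-check in step 3, drawing a reproducible random sample of coded rows is usually enough. A minimal sketch, with a hypothetical helper name and made-up rows:

```python
import random


def spot_check_sample(coded_rows, k=20, seed=0):
    """Return up to k randomly chosen rows; the fixed seed makes the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(coded_rows, min(k, len(coded_rows)))


# Made-up coded rows, only to show the call.
coded_rows = [{"UserID": f"{i:03d}", "Code1": 1 + i % 4} for i in range(1, 51)]

for row in spot_check_sample(coded_rows, k=5):
    print(row)
```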

Analysis is taking too long

  • Large files (10,000+ responses) take longer
  • Complex responses require more processing
  • Check your internet connection

Too many categories

  • Increase the “Other” threshold
  • Increase “Minimum responses per category”
  • Use predefined codes

Too few categories

  • Decrease the “Other” threshold
  • Decrease “Minimum responses per category”
  • Review if data has enough variety

Important themes in “Other”

  • Lower the “Other” threshold
  • Add the theme to predefined codes
  • Decrease “Minimum responses per category”

Incorrectly assigned categories to responses

  • When using predefined codes, some responses may be assigned to the wrong category. Making the code definitions more detailed and explicit can improve accuracy.

Wrong file column selected

  • Re-upload and carefully select the correct column
  • Check that your file has proper headers

| Error | Cause | Solution |
|-------|-------|----------|
| “No responses found” | Selected column is empty | Choose correct response column |
| “Invalid file format” | File is corrupted or unsupported | Re-export from source, try CSV |
| “Processing failed” | Error in settings configuration | Check settings (especially column names) and try again |
| “Processing failed” | High load in LLM service | Try again later |

Q: How many responses can I analyze at once? A: The system handles thousands of responses efficiently. For very large datasets (50,000+), consider splitting into batches.
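
Splitting a very large dataset into batches before upload can be done with a few lines of Python. The batch size below is arbitrary and the helper name is hypothetical.

```python
def batches(rows, size):
    """Yield consecutive slices of rows, each with at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


all_rows = list(range(120_000))  # stand-in for 120,000 responses
for i, batch in enumerate(batches(all_rows, 50_000), start=1):
    print(f"batch {i}: {len(batch)} rows")
```

If you split a study this way, a shared template with predefined codes keeps the categories comparable across batches.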

Q: How long does analysis take? A: Typically 1-5 minutes for up to 1,000 responses. Larger datasets take proportionally longer.

Q: Are my predefined codes preserved? A: Yes, when using templates or predefined codes, they remain in your results even if the AI extends them.

Q: What languages are supported? A: The system can handle a large variety of languages, though the quality of results may vary by language.

Q: Can I merge results back with my original data? A: Yes, use the respondent ID column to match results with your original dataset.

Q: How do I ensure consistent coding across multiple files? A: Create a template with predefined codes and apply it to all files. Set “Freeze Codes” to On for strict consistency.