📊 Survey Analysis Guide

This guide explains how to use Uspeech Analytics to automatically code and categorize open-ended survey responses.

Overview

Survey Analysis uses AI to automatically categorize open-ended text responses from surveys, saving hours of manual coding work. The system:

Identifies themes in free-text responses
Creates consistent codes across all responses
Handles large datasets efficiently (thousands of responses)
Exports results in analysis-ready Excel format

This is ideal for market researchers, UX researchers, and anyone working with open-ended survey data.

⏩ Quick How-To

How to Set Up and Run Google Forms Survey Analysis

Create a Survey Analysis project
- Click Survey Analysis in the navigation bar
- Click New Project
- Name your project
- Choose Google Forms
- Click Create
Upload your data file
- Click the Upload File button and choose the file to analyze
- Supported formats: CSV, Excel (.xlsx, .xls) exported from Google Forms
Configure columns
- After uploading, you’ll see the Configure Google Forms dialog
- Select the column containing the user ID
- Select the column containing the timestamps
Configure analysis options
- Next, you’ll see the Configure Survey Analysis dialog
- Review the default settings
- Adjust if needed
- You don’t need to configure column names; they are already set
- Click Save Settings

How to Set Up Survey Analysis (Custom Format)

Create a Survey Analysis project
- Click Survey Analysis in the navigation bar
- Click New Project
- Name your project
- Click Create
Upload your data file
- Click the Upload File button and choose the file to analyze
- Supported file formats: CSV, Excel (.xlsx, .xls)
- See File Requirements below for a detailed format description
Configure columns and analysis options
- Next, you’ll see the Configure Survey Analysis dialog
- Set the required column names in the Response Column and User ID Column dropdowns
- Set the question in the Question dropdown
  - Either choose the question column name from the dropdown (the question text should be in the first row of the column)
  - Or choose the “Custom” option and enter the question text manually
  - Or choose a template name to apply its predefined question-to-sheet mapping
- Set up predefined categories if needed; by default, none are used
  - Either choose “Select from columns” in the Code Source dropdown, then set the corresponding column names in the Code id and Code name dropdowns
  - Or choose the name of a predefined categories vocabulary from the dropdown
- Adjust analysis parameters if needed
- Click Save Settings

How to Set Up and Run Survey Analysis from Template

Create a Survey Analysis project
- Click Survey Analysis in the navigation bar
- Click New Project
- Name your project
- Choose template name in the dropdown (see Templates & Codes Guide)
- Click Create
Upload your data file
- Click the Upload File button and choose the file to analyze
- Supported file formats: CSV, Excel (.xlsx, .xls)
- See File Requirements below for a detailed format description
Configure analysis options
- Next, you’ll see the Configure Survey Analysis dialog
- Review the default settings
- Adjust if needed
- You don’t need to configure column names; they are already set in the template
- Click Save Settings

How to Run the Analysis and View Results

Run the analysis
- Click Analyze
- Processing time depends on the number of responses
- Processing is done per question. Once processing is complete for a question, the Status will change to “Completed” on the corresponding row of the table
Review results
- View the results for each question in the table by clicking on the question row
- You will see a pie chart with category distribution
Download results
- Check all questions you want to review or download results for
- Click the Export button
- Results are available as Excel files
- Each response is coded with one or more category numbers

Analysis Types

Open-Ended Analysis

Best for: General open-ended questions like “What did you like about the product?” or “How can we improve?”

How it works:

AI reads all responses to understand common themes
Creates a codebook of categories that represent the data
Assigns each response to one or more categories
Produces frequency counts and percentages
Output Excel file will contain:
- codebook with category definitions
- category codes in front of each response

Example input:

“The app is really fast and easy to use”
“I love how quickly I can find what I need”
“The interface is confusing and hard to navigate”

Example output codebook:

Code	Category	Count	%
1	Speed/Performance	45	22%
2	Ease of Use	38	19%
3	Navigation Issues	31	15%
4	Visual Design	25	12%

Brand Analysis

Best for: Questions asking about brand awareness, preferences, or mentions like “What brands come to mind when you think of smartphones?”

How it works:

AI identifies all brand names mentioned in responses
Standardizes variations (e.g., “Apple,” “apple,” “APPLE” → “Apple”)
Creates a code for each unique brand
Counts mentions and calculates share of voice
Output Excel file will contain:
- codebook with brand definitions
- brand codes in front of each response

Example input:

“I usually buy Samsung or sometimes Apple”
“Definitely iPhone, I’ve always been an Apple person”
“I prefer Google Pixel phones”

Example output codebook:

Code	Brand	Mentions	%
1	Apple/iPhone	156	35%
2	Samsung	98	22%
3	Google Pixel	45	10%
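To make the standardize-and-count idea concrete, here is a small rule-based sketch in Python. It is only an illustration: Uspeech Analytics uses AI for this step, and the `ALIASES` table, the function name, and the choice to compute share as a fraction of all mentions are assumptions made for the example.

```python
from collections import Counter

# Hypothetical alias table: lower-cased variant -> canonical brand label.
ALIASES = {
    "apple": "Apple/iPhone",
    "iphone": "Apple/iPhone",
    "samsung": "Samsung",
    "google pixel": "Google Pixel",
}

def brands_in(response):
    """Return the set of canonical brands mentioned in one response."""
    text = response.lower()
    # Plain substring matching; a real system needs something more robust.
    return {brand for alias, brand in ALIASES.items() if alias in text}

responses = [
    "I usually buy Samsung or sometimes Apple",
    "Definitely iPhone, I've always been an Apple person",
    "I prefer Google Pixel phones",
]

mentions = Counter()
for response in responses:
    mentions.update(brands_in(response))  # each brand counts once per response

# Assumption: share of voice = brand mentions / all brand mentions.
total = sum(mentions.values())
share = {brand: count / total for brand, count in mentions.items()}
```

Because “iPhone” and “Apple” map to the same label, the second response adds a single Apple/iPhone mention rather than two.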

File Requirements

Supported Formats

Format	Extension	Notes
CSV	.csv	Comma-separated values, UTF-8 encoding recommended
Excel	.xlsx	Modern Excel format (2007+)
Excel (Legacy)	.xls	Older Excel format

Data Structure

Your file should have:

Header row — Column names in the first row
Response column — At least one column with text responses
ID column (optional) — Respondent identifiers for tracking

Example data structure:

UserID	Response	Age	Gender	Question
001	“I love the new design”	25	F	“What do you think about our new design?”
002	“Too expensive for what you get”	34	M
003	“Fast shipping was great”	28	F
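If you build your upload file with a script, a minimal sketch using Python's standard csv module could look like this. The column names come from the example above; the file name in the final comment is hypothetical.

```python
import csv
import io

# Rows mirror the example table; the question text sits only in the first data row.
rows = [
    {"UserID": "001", "Response": "I love the new design", "Age": 25, "Gender": "F",
     "Question": "What do you think about our new design?"},
    {"UserID": "002", "Response": "Too expensive for what you get", "Age": 34,
     "Gender": "M", "Question": ""},
    {"UserID": "003", "Response": "Fast shipping was great", "Age": 28,
     "Gender": "F", "Question": ""},
]

def to_csv_text(rows):
    """Serialize rows to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=["UserID", "Response", "Age", "Gender", "Question"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv_text(rows)

# To save the file for upload, write it with explicit UTF-8 encoding, e.g.:
# with open("survey_responses.csv", "w", encoding="utf-8", newline="") as f:
#     f.write(csv_text)
```

Keeping IDs as strings preserves leading zeros such as “001”.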

Excel Files with Multiple Sheets

When you upload an Excel file with multiple sheets:

Each sheet is treated as a separate question
The sheet name becomes the question identifier
Results are generated per sheet

This is ideal for surveys with multiple open-ended questions — put each question’s responses in a separate sheet.

Predefined Codebook

You can include a predefined codebook in your Excel file to guide the analysis. This is useful when you have specific categories or themes you want to identify in the responses.

Another option is to set the codebook in the analysis settings.

How to use:

Add two columns - one with codes and one with descriptions - to each sheet of your Excel file where you want to use a predefined codebook
These columns can be in any position in the sheet, and the rows of the codes are independent of the response rows
The AI will use these codes to categorize responses. It may still add new codes if needed, unless “Freeze Codes” is enabled

Analysis Settings

Response Column

Select the column containing the text you want to analyze.

💡 Tips:

Choose the column with the actual response text
Avoid columns with just codes or numbers
For Excel files, ensure the correct sheet is selected

Respondent ID Column

Select the column containing unique identifiers for each respondent.

Why it matters:

Allows you to match results back to your original data
Required if you want to merge results with other survey data
If not specified, row numbers are used
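As an illustration of matching results back to your data, the sketch below joins two small tables on the respondent ID using only Python's standard library. It assumes both files have been saved as CSV and that the ID column is called `UserID`; your exported column names may differ.

```python
import csv
import io

# Stand-ins for your original data and the exported results (saved as CSV).
original_csv = "UserID,Age,Gender\n001,25,F\n002,34,M\n003,28,F\n"
results_csv = (
    "UserID,Answer,Code1,Code2\n"
    "001,I love the new design,4,\n"
    "002,Too expensive for what you get,2,\n"
)

def merge_on_id(original_text, results_text, id_column="UserID"):
    """Attach coded columns to each original row that has a matching ID."""
    results_by_id = {
        row[id_column]: row for row in csv.DictReader(io.StringIO(results_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(original_text)):
        coded = results_by_id.get(row[id_column], {})
        extra = {key: value for key, value in coded.items() if key != id_column}
        merged.append({**row, **extra})
    return merged

merged = merge_on_id(original_csv, results_csv)
```

Rows without a match (respondent 003 here) keep their original columns and simply gain no code columns. IDs are compared as strings, so “001” and “1” would not match.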

Question Text (Optional)

Provide the original question that was asked. Although the AI can often infer the question from the context, providing it explicitly can help the AI generate more relevant category names and descriptions.

Why it helps:

Gives AI context for better categorization
Results in more relevant category names
Especially useful for ambiguous responses

Advanced Settings

The settings below appear under Advanced settings in the Configure Survey Analysis dialog.

Other Threshold

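What it does: Sets the minimum share of responses a category must reach to stay separate. Categories below this share are grouped into “Other.” The value is entered as a fraction, so 0.04 means 4%.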

Example	Meaning
0.04 (default)	Categories with <4% of responses become “Other”
0.10	Categories with <10% become “Other” (fewer categories)
0.01	Even rare categories (1%+) are kept (more categories)

Lower threshold = More categories, more detail
Higher threshold = Fewer categories, cleaner results
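The effect of the threshold can be sketched in a few lines of Python. This is not the product's implementation; the function name and sample counts are made up, and the sketch assumes each response carries exactly one category so that merged counts can simply be added.

```python
def apply_other_threshold(counts, total_responses, threshold=0.04):
    """Merge categories whose share of responses is below `threshold` into "Other"."""
    kept = {}
    other = 0
    for name, count in counts.items():
        if count / total_responses < threshold:
            other += count  # too rare: fold into "Other"
        else:
            kept[name] = count
    if other:
        kept["Other"] = other
    return kept

# Hypothetical category counts for a survey with 100 responses.
counts = {"Speed": 45, "Ease of Use": 38, "Pricing": 14, "Login bugs": 3}

default_result = apply_other_threshold(counts, total_responses=100)
strict_result = apply_other_threshold(counts, total_responses=100, threshold=0.15)
```

With the default 0.04, only “Login bugs” (3%) is folded into “Other”; raising the threshold to 0.15 also folds in “Pricing” (14%).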

Min Answers per Category

What it does: Sets the minimum number of responses required for a category to be included in the results. Categories with fewer responses are merged into “Other”.

Important: This setting works in conjunction with “Other Threshold”. If a category has fewer responses than the minimum, it will be merged into “Other” regardless of its percentage.

Important: This setting does not affect special categories like “Other” or “None”.

Example	Meaning
1 (default)	Categories with at least 1 response are included
5	Categories with at least 5 responses are included (filters out very rare categories)
10	Categories with at least 10 responses are included (only common categories)

Lower minimum = More categories, more detail
Higher minimum = Fewer categories, more reliable results
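One way to picture how this setting combines with “Other Threshold” is as two checks that a category must both pass. The Python sketch below is an assumption-based illustration of that rule, not the tool's actual code; the function and parameter names are hypothetical.

```python
def survives(count, total_responses, other_threshold=0.04, min_answers=1):
    """Return True if a category passes both the share check and the count check."""
    share = count / total_responses
    return share >= other_threshold and count >= min_answers

# Small survey: an 8% share passes the threshold, but 4 answers is below the minimum of 5.
small_survey = survives(4, 50, min_answers=5)

# Large survey: 20 answers passes the minimum of 10, but a 2% share is below the threshold.
large_survey = survives(20, 1000, min_answers=10)

# A category with 60 of 1,000 answers passes both checks.
common = survives(60, 1000, min_answers=10)
```

In the first two cases the category would be merged into “Other”, each time because of a different setting.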

Number of Categories

What it does: Suggests how many categories the AI may assign to each response. Some responses may warrant multiple categories. If the number is too low, only the most relevant categories are assigned.

Setting	Behavior
Auto (blank)	AI decides based on each response — usually 1-3 categories
Specific number	AI aims for approximately this many categories

When to specify:

You need a specific number for your analysis framework
Responses are long and you want to get the most important categories
You are running a brand analysis and want to focus on the top brands

Freeze Codes

What it does: Controls whether AI can create new categories beyond predefined codes.

Important: This setting only works when predefined codes are provided.

Setting	Behavior
Off (default)	AI uses predefined codes AND creates new ones if needed
On	AI only uses predefined codes; unmatched responses go to “Other”

Use “On” when:

You have a complete, fixed codebook
Consistency across studies is critical
You’re replicating a previous analysis
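The difference between the two modes can be sketched as follows. This Python example is illustrative only: the codebook mirrors the sample in Predefined Codes Format, while `resolve_category` and the rule for numbering new codes (next free number below 98) are assumptions, not documented behavior.

```python
# Sample predefined codebook: code number -> description.
PREDEFINED = {
    1: "Product Quality",
    2: "Price/Value",
    3: "Customer Service",
    4: "Ease of Use",
    98: "Other",
    99: "None",
}

def resolve_category(suggested_name, codebook, freeze_codes):
    """Map a suggested category name to a code, honoring the freeze setting."""
    by_name = {name: code for code, name in codebook.items()}
    if suggested_name in by_name:
        return by_name[suggested_name]
    if freeze_codes:
        return by_name["Other"]  # frozen: unmatched themes go to "Other"
    # Not frozen: extend the codebook (hypothetical numbering rule).
    new_code = max(code for code in codebook if code < 98) + 1
    codebook[new_code] = suggested_name
    return new_code

frozen_book = dict(PREDEFINED)
frozen_code = resolve_category("Delivery Speed", frozen_book, freeze_codes=True)

open_book = dict(PREDEFINED)
open_code = resolve_category("Delivery Speed", open_book, freeze_codes=False)
```

With Freeze Codes on, the codebook stays unchanged; with it off, the new theme gets its own entry while the original codes remain as they were.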

Using Predefined Codes

Predefined codes let you specify categories in advance, ensuring consistent coding across studies.

When to Use Predefined Codes

Tracking studies — Same codes across multiple waves
Team standardization — Everyone uses the same categories
Known categories — You already know the likely response types
Comparative analysis — Matching categories across questions

How to Apply Predefined Codes

From Templates (recommended)
- Create a template with your codes (see Templates Guide)
- Select the template when configuring analysis
- Codes are automatically applied
From File Columns
- If your file already has a code column, select it as “Code ID Column”
- Select the “Code Name Column” for category descriptions
- Existing codes will be used as the starting point

Predefined Codes Format

Codes are structured as number → description pairs:

Code	Description
1	Product Quality
2	Price/Value
3	Customer Service
4	Ease of Use
98	Other
99	None

Understanding Results

Output File Structure

Results are downloaded as Excel files with these columns:

Column	Description
UserID/Resp	Respondent identifier
Answer	Original response text
Code1, Code2…	Category numbers assigned to this response
Cod_Num	Category code number
Cod_Key	Category description
Count	Number of responses in this category
Fraction	Percentage of total responses

Please note that the column names may vary depending on the column names in the input file and project/template settings.

Multiple Codes Per Response

A single response can be assigned multiple codes when it mentions several themes:

Response: “Great quality but too expensive”

Code1: 1 (Product Quality)
Code2: 2 (Price/Value)
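If you post-process the export yourself, per-code counts can be tallied from the Code1, Code2… columns. The Python sketch below uses made-up rows and assumes the fraction is computed over the number of responses; column names in your export may differ.

```python
from collections import Counter

# Hypothetical coded rows, as they might look after reading the results file.
coded_rows = [
    {"Answer": "Great quality but too expensive", "Code1": 1, "Code2": 2},
    {"Answer": "Support was helpful", "Code1": 3, "Code2": None},
    {"Answer": "Overpriced", "Code1": 2, "Code2": None},
]

def code_frequencies(rows, code_columns=("Code1", "Code2")):
    """Return {code: (count, fraction of responses)} across all code columns."""
    counts = Counter()
    for row in rows:
        # A set ensures a code is counted once per response.
        codes = {row[col] for col in code_columns if row.get(col) is not None}
        counts.update(codes)
    total = len(rows)
    return {code: (n, n / total) for code, n in counts.items()}

frequencies = code_frequencies(coded_rows)
```

Because one response can carry several codes, the fractions can add up to more than 100%.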

Special Categories

Category	Meaning
Other	Responses that don’t fit main categories (below threshold)
None	Irrelevant responses or cases when the user did not provide a meaningful response

💡 Tips for Best Results

Preparing Your Data

Clean your data — Remove test responses, duplicates
Check encoding — Use UTF-8 for CSV files
Handle blanks — They will be cleaned up automatically, so you don’t need to worry about them
Consistent format — Same structure across all sheets

Choosing Settings

Scenario	Recommended Settings
First-time analysis, exploring data	Auto categories, 4% threshold
Tracking study with existing codes	Use template, freeze codes ON
Detailed analysis needed	Lower threshold (1-2%), more categories
Executive summary needed	Higher threshold (5-10%), fewer categories

Reviewing Results

Check category names — Do they make sense for your data?
Review “Other” — Are important themes being missed?
Spot-check assignments — Verify a sample of codings
Adjust if needed — Re-run with different settings

Troubleshooting

Common Issues

Analysis is taking too long

Large files (10,000+ responses) take longer
Complex responses require more processing
Check your internet connection

Too many categories

Increase the “Other” threshold
Increase “Min Answers per Category”
Use predefined codes

Too few categories

Decrease the “Other” threshold
Decrease “Min Answers per Category”
Review if data has enough variety

Important themes in “Other”

Lower the “Other” threshold
Add the theme to predefined codes
Decrease “Min Answers per Category”

Incorrectly assigned categories to responses

When using predefined codes, some responses may be assigned to the wrong categories. Making the code definitions more detailed and explicit can help improve accuracy.

Wrong file column selected

Re-upload and carefully select the correct column
Check that your file has proper headers

Error Messages

Error	Cause	Solution
“No responses found”	Selected column is empty	Choose correct response column
“Invalid file format”	File is corrupted or unsupported	Re-export from source, try CSV
“Processing failed”	Error in settings configuration	Check settings (especially column names) and try again
“Processing failed”	High load in LLM service	Try again later

❔Frequently Asked Questions

Q: How many responses can I analyze at once?
A: The system handles thousands of responses efficiently. For very large datasets (50,000+), consider splitting into batches.

Q: How long does analysis take?
A: Typically 1-5 minutes for up to 1,000 responses. Larger datasets take proportionally longer.

Q: Are my predefined codes preserved?
A: Yes, when using templates or predefined codes, they remain in your results even if the AI extends them.

Q: What languages are supported?
A: The system can handle a large variety of languages. Results may vary by language.

Q: Can I merge results back with my original data?
A: Yes, use the respondent ID column to match results with your original dataset.

Q: How do I ensure consistent coding across multiple files?
A: Create a template with predefined codes and apply it to all files. Set “Freeze Codes” to On for strict consistency.

Next Steps

Learn how to create and manage Templates for consistent analysis
Return to Quick Start Guide for an overview
Explore Conversation Analysis for interview insights
