Summarizing engagement surveys for a non-technical audience using the ChatGPT API

Distill survey highlights and an overall sentiment pulse check in fewer than 10 steps!

Shilpa Leo
5 min read · Aug 26, 2023
Photo by Jacqueline Munguía on Unsplash

Companies run surveys ALL the time! From engagement surveys to new program effectiveness surveys and onboarding/offboarding experience surveys, there are many of them! Survey results drive the business actions that fix problems and keep the target audience engaged. That puts a lot of weight on the teams responsible for summarizing the insights from these surveys. In non-technical teams especially, this often means a lot of manual work analyzing the survey data in applications like Excel to distill business-presentable insights.

How the ChatGPT API can help turn text into insights

First, it is worth explaining why this article uses the API rather than ChatGPT’s web user interface. The interactive web tool is the most popular way to use ChatGPT and many of us rely on it. But the API route has some advantages:

  • You can use ChatGPT from within your Python notebook and carry its output straight into other downstream analyses.
  • Going the “programmatic” way, after testing the waters in the “UI”, allows for scalability and automation: the same code can be reused on a new dataset to get quick insights!
  • OpenAI documentation also suggests using the API for text analysis use cases, given its higher flexibility.

Code

The following code can be run in Jupyter or Google Colab with the necessary adaptations, especially in the portion that reads the .csv file.

# STEP 1: install the OpenAI Python library

#!pip install openai # uncomment and run for first time installation

The OpenAI Python library must be configured with your account’s secret key, available on this website. You’ll need to set an API key up to proceed further.

  • Click on Login if you have an existing ChatGPT account, or click on Sign Up if you don’t
  • Under API Keys, click on Create new API key and be sure to copy the created API key before proceeding

# STEP 2: pass the created API key

import openai
openai.api_key = "sk-..." # insert your own API key within " "
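
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. Here is a minimal sketch of one alternative, assuming you have stored the key in an environment variable beforehand. The variable name OPENAI_API_KEY and the helper load_api_key are illustrative choices, not part of the original code.

```python
import os

def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable (hypothetical helper)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the notebook")
    return key

# Usage in the notebook, in place of the hard-coded string:
# import openai
# openai.api_key = load_api_key()
```

With this approach the notebook itself never contains the secret, so it can be shared as-is.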

I loaded a local .csv file into a Google Colab notebook with the code below (adapted from this source). If you are using Jupyter Notebook instead, you can read the local file directly from its path with the Pandas pd.read_csv() call.

# STEP 3: read file with text comments

from google.colab import files
uploaded = files.upload()

Once the above code executes, choose the local file (my file was called ‘survey.csv’), then parse its contents into a Pandas DataFrame.

# STEP 4: parse file into Pandas DataFrame

import pandas as pd
import io

df = pd.read_csv(io.BytesIO(uploaded['survey.csv']))
print(df.shape)
df.head()
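
For Jupyter users, the upload step is not needed. Below is a rough sketch of the local-file route. The load_survey helper and its column check are assumptions for illustration (the 'comments' column name is taken from the later steps), and the in-memory CSV with made-up rows stands in for a real survey.csv path.

```python
import io
import pandas as pd

def load_survey(source, text_column="comments"):
    """Read a survey CSV from a path or file-like object and check the text column exists."""
    df = pd.read_csv(source)
    if text_column not in df.columns:
        raise KeyError(f"Expected a '{text_column}' column, found: {list(df.columns)}")
    return df

# In Jupyter you would pass a path, e.g. load_survey("survey.csv").
# Here a small in-memory CSV (made-up rows) keeps the example self-contained.
sample_csv = io.StringIO("id,comments\n1,Great onboarding\n2,Too many meetings\n")
df_demo = load_survey(sample_csv)
print(df_demo.shape)
```

Checking for the text column up front gives a clearer error than a failure several steps later.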

Some text formatting was then done using the Pandas Series method str.cat(). Called without other arguments, it concatenates all the strings in a column into a single string, with an optional separator between them. The default separator is an empty string (‘’), so calling str.cat() without specifying one joins the strings with no characters in between.

Using this technique, I made one big block containing all rows of text from my raw data.

# STEP 5: prepare text for chatgpt summary

combined_comments = df['comments'].astype(str).str.cat()
print(combined_comments)
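
With the default empty separator, the last word of one comment runs straight into the first word of the next. If that matters for your data, one option is to pass a separator and drop empty cells first. This is a small sketch on made-up comments, not the author's dataset.

```python
import pandas as pd

# Made-up comments, with None standing in for an empty survey cell
demo = pd.DataFrame({"comments": ["Great onboarding", None, "Too many meetings"]})

# Default behaviour: no separator, so adjacent comments run together
run_together = demo["comments"].dropna().astype(str).str.cat()

# With a newline separator each comment stays on its own line
one_per_line = demo["comments"].dropna().astype(str).str.cat(sep="\n")
print(one_per_line)
```

Dropping missing values before astype(str) also avoids the text "nan" ending up in the block sent to the model.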

Finally, the magic happens with a helper function that calls OpenAI’s gpt-3.5-turbo model through the chat completions endpoint for our task.

  • The documentation shows what an example API call looks like (the function and the parameters you need to pass)
  • It also outlines what the Chat completions API response looks like, which can be treated as a Python (nested) dictionary
  • The chat assistant’s reply can be extracted with response['choices'][0]['message']['content']
  • Note: The API is non-deterministic by default. This means you might get a slightly different completion every time you call it, even if your prompt stays the same. Setting temperature = 0 will make the outputs mostly deterministic, but a small amount of variability will remain.

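
To make the nested-dictionary point concrete, here is a small sketch using a hand-written stand-in for a response. It mimics only the fields used in this article and is not real API output.

```python
# Hand-written stand-in for a chat completions response (not real API output)
mock_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Overall sentiment: mixed"},
            "finish_reason": "stop",
        }
    ]
}

# Walk the nesting: first choice -> message -> content
reply = mock_response["choices"][0]["message"]["content"]
print(reply)
```
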
# STEP 6: create helper function for task

def comment_summarization(prompt, model="gpt-3.5-turbo"):
    # Note: openai.ChatCompletion is the pre-1.0 interface of the openai library
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # this is the degree of randomness of the model's output
    )
    # Extract the assistant's reply text from the first choice
    return response.choices[0].message["content"]
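
The helper above is written against the pre-1.0 openai package. In the 1.x releases, openai.ChatCompletion was removed in favor of a client object. Below is a minimal sketch of the same helper in that style, assuming openai>=1.0 is installed and a key is available. The function name summarize_with_client is hypothetical, and the client is passed in as an argument so the function can be exercised without a network call.

```python
def summarize_with_client(client, prompt, model="gpt-3.5-turbo"):
    """Send one user prompt through a 1.x-style client and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the output as repeatable as the API allows
    )
    return response.choices[0].message.content

# Usage with the real library (assumes openai>=1.0 and a key in OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# print(summarize_with_client(client, "Summarize: ..."))
```
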

You can infer from my prompt that I’m being super clear and specific in outlining the expectations for my task. This is one of the key prompt-engineering principles OpenAI recommends to developers. These clear instructions act as guidelines for ChatGPT while it analyzes the text and reports the output in the exact requested format, which is intentionally dictionary-like for further downstream processing.

# STEP 7: summarize comments

prompt = f"""
Perform the following actions:
- Summarize the text delimited by triple backticks into 3 items in a python list
- Analyse and report the overall sentiment only one out of the following sentiments: [positive, negative, neutral, mixed]
- Explain your assessment of sentiment in bullets format

Return the output in the following format:

{{"Highlights": ```[summary here as 3 list items]```,
"Sentiments": ```overall sentiment here```,
"Sentiment analysis":```sentiment analysis here```}}

Here are the source comments to analyse:
```{combined_comments}```
"""
response = comment_summarization(prompt)
print(response)
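
One detail in the prompt that is easy to miss: because it is an f-string, the literal braces of the requested output format are doubled, while single braces insert the value of combined_comments. Here is a small sketch of that behaviour with a shortened, made-up prompt. The build_prompt name and the <comments> delimiter are illustrative, not the author's.

```python
def build_prompt(comments):
    """Build a shortened prompt; doubled braces become literal braces in the result."""
    return f"""
Return the output in the following format:
{{"Highlights": [3 list items], "Sentiments": "overall sentiment"}}

Here are the source comments to analyse:
<comments>{comments}</comments>
"""

demo_prompt = build_prompt("Great onboarding. Too many meetings.")
print(demo_prompt)
```

Wrapping the prompt in a function like this also makes it easy to reuse the same instructions on a different dataset.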
  • The output returned adhered to the requested specs, had highlights distilled from the bunch of comments combined for analysis, the overall sentiment deduced and reasoning for the deduced sentiment — which any non-technical team can
  • If you examine the type of the response variable, you will see it is a string. When the model follows the requested format, that string is JSON and can be converted to a Python dictionary with json.loads() from the standard library’s json module. The resulting dictionary can be assigned to a variable and, for example, converted to a Pandas DataFrame for further processing.

# STEP 8: convert JSON string output (response) into a Python dictionary
import json
json.loads(response)
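
The model is asked for a dictionary-like reply but is not guaranteed to return valid JSON every time, so it can help to guard the conversion. Below is a minimal sketch, assuming the three keys requested in the prompt. The parse_summary helper and the sample string are hypothetical, not output from the author's data.

```python
import json

EXPECTED_KEYS = {"Highlights", "Sentiments", "Sentiment analysis"}

def parse_summary(raw):
    """Return the reply as a dict, or None if it is not valid JSON with the expected keys."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict) or not EXPECTED_KEYS.issubset(parsed):
        return None
    return parsed

# Made-up reply in the requested shape
sample_reply = '{"Highlights": ["a", "b", "c"], "Sentiments": "mixed", "Sentiment analysis": "- pros and cons"}'
summary = parse_summary(sample_reply)
print(summary["Sentiments"] if summary else "Reply was not valid JSON; inspect it or re-run the prompt")
```

Returning None instead of raising keeps a batch run going when one reply is malformed. Whether that is preferable to failing fast depends on your workflow.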

Summary

In fewer than 10 steps, we saw how the ChatGPT API can be embedded in a Python notebook to go from data to insights quickly! The insights can be shared with non-technical audiences as a high-level digest of survey responses. Do try it out on your own text dataset and let me know how it went: did you have to enhance the prompts further, or did the prompts here work the same magic for you as they did on my data? I’d love to hear!

This was just a first trial of using this amazing API for business insights. I’ll explore more and add more complexity to the task, so stay tuned for future follow-up articles!

Inspired by knowledge gained from this awesome DeepLearning.AI tutorial.

Shilpa Leo

Data Scientist| EdTech Instructor| Data Analytics| AWS| Public Speaker| Python| ML| NLP| Power BI | SQL| RPA| https://www.linkedin.com/in/shilpa-sindhe/