Starbucks Sentiment Analysis Using VADER

Uploaded by

Arief Setiawan

Uncovering Starbucks Customer Reviews Using VADER Analysis

By Yulian Farid Wahyudi

Background
Have you ever wondered why Starbucks is
known as having the best coffee? It earned
the prestigious title of 'No. 1 Best Coffee' in
Zagat's Survey of National Chain Restaurants
from 2009 to 2011.

At Starbucks, it's not just coffee, it's a cultural
experience. With a cosy vibe, a diverse menu,
and a commitment to quality, Starbucks is
unique. Many enjoy not just coffee but a daily
ritual of excellence.

It's no surprise that Starbucks is hugely
popular. But with all this popularity, what's the
community's real opinion about Starbucks?
Is Starbucks just overrated, or is there more to it?
This question stems from a curiosity about how
the public truly perceives the renowned coffee
giant. In this analysis, I've chosen to use VADER.

VADER (Valence Aware Dictionary and sEntiment
Reasoner) is a lexicon- and rule-based sentiment
analyser, available as an NLTK module, that provides
sentiment scores based on the words used.

VADER has the advantage of assessing the
sentiment of any given text without the need for
previous training data. The result generated by
VADER is a dictionary of 4 keys: neg, neu, pos, and
compound. The keys neg, neu, and pos mean negative,
neutral, and positive respectively. The compound
key is a single metric, ranging from -1 to +1, that
represents the overall sentiment of a piece of text.
About Dataset
Source :
The dataset is secondary data obtained from Kaggle, with 850 observations. The data was collected
by web scraping customer reviews and ratings from the ConsumerAffairs website, which is based in
the USA. Note that this dataset is for research and analysis purposes and may be subject to the
terms and conditions specified by ConsumerAffairs.
https://www.kaggle.com/datasets/harshalhonde/starbucks-reviews-dataset

Content :
a. Name: The reviewer's name (simulation).
b. Location: The location or city associated with the reviewer, if provided.
c. Date: The date when the review was posted.
d. Rating: The star rating given by the reviewer, ranging from 1 to 5.
e. Review: The textual content of the review, capturing the reviewer's experience and opinions.
Flow Analysis
01 Start
02 Data Preprocessing
03 EDA
04 VADER Sentiment Scoring
05 Wordcloud
06 Conclusion and Suggestions
07 End
Data Pre-Processing
A. Split the Location Variable and Drop the Unnecessary Variable
The location variable contains two pieces of information: city and state. It benefits us to split the location
into two variables because we can then explore the data by both city and state.

We also drop the unnecessary variable Image_Links, because it plays no part in the sentiment
analysis.
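
The split described above can be sketched with pandas as follows. This is a minimal sketch, not the author's code: it assumes the column is named `location` and holds strings shaped like "City, ST", and the helper name `split_location` is hypothetical.

```python
import pandas as pd

def split_location(df, col="location"):
    """Split a 'City, ST' column into 'city' and 'state' and drop Image_Links."""
    # Split on the first comma only; reindex guarantees two columns
    # even when no row contains a comma.
    parts = df[col].str.split(",", n=1, expand=True).reindex(columns=[0, 1])
    out = df.copy()
    out["city"] = parts[0].str.strip()
    out["state"] = parts[1].str.strip()
    # errors="ignore" keeps the sketch usable if the column is already gone.
    return out.drop(columns=["Image_Links"], errors="ignore")

# Toy rows standing in for the real dataset (assumed format).
sample = pd.DataFrame({
    "location": ["Wichita Falls, TX", "New York, NY"],
    "Image_Links": ["['No Images']", "['No Images']"],
})
split = split_location(sample)
```

Rows whose location has no comma end up with a missing `state`, which the missing-value step can then pick up.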
Data Pre-Processing
B. Detecting and Handling Missing Values
The next step is to detect and drop the missing values. Dropping missing values is often done to maintain data integrity and
ensure compatibility with certain analyses or models. This approach simplifies the data by removing instances where
information is incomplete, which can be crucial for maintaining the accuracy of statistical analyses and preventing biases.

From the output above, we know that there are 145 missing values in the Rating variable. We remove those rows and
check again. The second check shows that no missing values remain, so we can
continue the analysis.
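
A minimal sketch of the detect-then-drop step, assuming a pandas DataFrame with a `Rating` column; the helper name `drop_missing_ratings` is hypothetical and the toy data is not taken from the real dataset.

```python
import numpy as np
import pandas as pd

def drop_missing_ratings(df, col="Rating"):
    """Return (missing-value counts per column, DataFrame without missing ratings)."""
    missing_before = df.isna().sum()
    cleaned = df.dropna(subset=[col]).reset_index(drop=True)
    return missing_before, cleaned

# Toy data with one missing rating.
toy = pd.DataFrame({
    "Review": ["great latte", "slow service", "no rating given"],
    "Rating": [5.0, 1.0, np.nan],
})
missing_before, cleaned = drop_missing_ratings(toy)
missing_after = cleaned.isna().sum()
```

Comparing `missing_before` with `missing_after` reproduces the "check, drop, check again" sequence described above.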
Data Pre-Processing
C. Text Data Pre-Processing
In this step, we will prepare the text data. The pre-processing of this data
contains several steps :
a. Convert Text to Lowercase
It ensures uniformity in the dataset, as the same word in different
cases is treated as identical.
b. Remove the Unnecessary Contents
In this case, we'll remove the links, @ mentions, hashtags, contractions, and
punctuation inside the text data that are not important in this analysis.
c. Tokenize the Text
Tokenization allows for a word-level analysis of the text. By breaking
down sentences into individual words, it becomes easier to
understand the semantic meaning and relationships between words.
d. Remove the Stopwords
Stopwords are common words (e.g., "and," "the," "is") that occur
frequently in a language but contribute little to the overall meaning of
a text. By removing stopwords, we reduce noise in the data, allowing
the analysis to focus on more meaningful words.
e. Join the Words Back into a Cleaned Sentence
During preprocessing, the original text is transformed into a list
of individual words or tokens. Joining these words back together
reconstructs the cleaned text, making it readable and usable for
downstream tasks.
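
Steps a to e can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the author's pipeline: the stopword set is a tiny hand-written stand-in for a full list such as NLTK's, tokenization is a simple whitespace split, and contraction handling is left out because the source does not say how it was done. The name `clean_text` is hypothetical.

```python
import re
import string

# Tiny illustrative stopword list (assumption); a real run would use a full list.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "was", "were",
             "to", "of", "in", "it", "i", "my", "this", "that"}

def clean_text(text):
    """Lowercase, strip links/mentions/hashtags/punctuation, drop stopwords, rejoin."""
    text = text.lower()                                    # a. lowercase
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)     # b. links
    text = re.sub(r"[@#]\w+", " ", text)                   # b. @mentions and #hashtags
    text = text.translate(str.maketrans("", "", string.punctuation))  # b. punctuation
    tokens = text.split()                                  # c. tokenize on whitespace
    tokens = [t for t in tokens if t not in STOPWORDS]     # d. remove stopwords
    return " ".join(tokens)                                # e. join back

cleaned_example = clean_text(
    "The coffee is GREAT! Visit https://example.com @starbucks #latte"
)
```

The result would typically be stored in a new column (the later steps refer to one called 'clean').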
Data Pre-Processing
C. Text Data Pre-Processing
After the text pre-processing, the data is ready to be used for analysis. We can check the data sample below, and it looks clean
enough to be analysed.
Exploratory Data Analysis (EDA)
A. Reviews by Rating
The bar chart shows that many customers give low ratings. Rating 1 is the most common, indicating strong dissatisfaction, and
rating 2 comes second, signalling moderate dissatisfaction among customers.
Exploratory Data Analysis (EDA)
B. Ratings by State
If we look at the ratings by state, the chart shows that among the 10 states with the most ratings, the majority of reviews gave a
rating of 1, with CA (California) as the biggest contributor.
Exploratory Data Analysis (EDA)
C. Ratings by City
But if we look at the ratings by city, the chart shows that among the 10 cities with the most ratings, the majority of reviews gave a
rating of 1, with New York as the biggest contributor.

The discrepancy between the state-level and city-level leaders is likely due to the way the data is
aggregated. California (CA) can have the most ratings overall because its reviews are spread across many cities, while at the city
level New York is the largest single contributor of rating 1. As a result, California leads when the data is grouped by state,
but New York stands out when it is grouped by city.
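
The aggregation effect can be reproduced with a small made-up example. The counts below are invented purely to illustrate the mechanism and are not taken from the dataset; the column names `state`, `city`, and `Rating` are assumptions.

```python
import pandas as pd

# Invented toy data: CA's 1-star reviews are spread over four cities,
# NY's are concentrated in a single city.
reviews = pd.DataFrame({
    "state": ["CA", "CA", "CA", "CA", "NY", "NY", "NY"],
    "city": ["Los Angeles", "San Diego", "San Jose", "Fresno",
             "New York", "New York", "New York"],
    "Rating": [1, 1, 1, 1, 1, 1, 1],
})

one_star = reviews[reviews["Rating"] == 1]
by_state = one_star["state"].value_counts()   # CA: 4, NY: 3
by_city = one_star["city"].value_counts()     # New York: 3, each CA city: 1

top_state = by_state.idxmax()
top_city = by_city.idxmax()
```

Grouping by state puts CA first, while grouping by city puts New York first, even though the underlying rows are identical.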
Exploratory Data Analysis (EDA)
D. Ratings by Year
From the graph below, we can see that 2017 had the most ratings collected. The public was very active in posting reviews and
ratings about Starbucks that year.

If we look back to 2017, according to the Starbucks annual meeting press release, Starbucks boosted its digital offerings with
innovations like the expanded Mobile Order & Pay platform, which allows customers to order via Amazon Alexa and Ford
vehicles. This may help explain the number of people who provided ratings and reviews of
Starbucks in 2017, although the data alone cannot confirm the link. After 2017, the number of reviews decreased.
https://stories.starbucks.com/press/2017/press-release-starbucks-annual-meeting-2017/
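
Counting reviews per year can be sketched as follows. The sketch assumes the `Date` column holds free-text strings that contain a four-digit year (for example "Reviewed Sept. 13, 2017"); the exact date format in the dataset is not shown in the source, and `reviews_per_year` is a hypothetical helper.

```python
import pandas as pd

def reviews_per_year(df, date_col="Date"):
    """Count reviews per year by pulling the first four-digit number out of each date string."""
    years = df[date_col].astype(str).str.extract(r"(\d{4})", expand=False)
    return years.dropna().astype(int).value_counts().sort_index()

# Toy dates in the assumed format.
toy_dates = pd.DataFrame({"Date": [
    "Reviewed Sept. 13, 2017",
    "Reviewed July 2, 2017",
    "Reviewed Jan. 1, 2019",
]})
per_year = reviews_per_year(toy_dates)
```

If the real column parses cleanly, `pd.to_datetime(...).dt.year` would be the more idiomatic route.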
VADER Sentiment Scoring
A. Employ NLTK’s SentimentIntensityAnalyzer
We will employ NLTK's SentimentIntensityAnalyzer to obtain the negativity, neutrality, and positivity scores of the text. This method
is largely a "bag of words" strategy:
a. Stop words have already been removed in pre-processing, and words without an entry in VADER's lexicon contribute no valence.
b. Each word found in the lexicon is assigned a score, and the scores are then aggregated to derive a total score.

Once the sia object is created, we can use its methods, such as
polarity_scores, to get sentiment scores, including negativity,
neutrality, and positivity, for a given text.

For example:
“The taste is so Bad!”
'neg': 0.545 - This indicates that approximately 54.5% of the text is
classified as negative.
'neu': 0.455 - This indicates that approximately 45.5% of the text is
classified as neutral.
'pos': 0.0 - This indicates that 0% of the text is classified as positive.
'compound': -0.6988 - This is the overall sentiment score. It is computed
from the summed valence of the words (after VADER's rule adjustments)
and normalised to lie between -1 (most negative) and +1 (most positive);
it is not derived from neg, neu, and pos. In this case, the compound
score is negative, indicating an overall negative sentiment.
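
A minimal sketch of scoring one sentence with NLTK, plus the normalisation VADER applies to obtain the compound score. `score_text` and `vader_normalize` are hypothetical helper names; `score_text` assumes `nltk` is installed and that the `vader_lexicon` resource can be downloaded. The normalisation mirrors VADER's formula x / sqrt(x² + α) with α = 15.

```python
import math

def vader_normalize(raw_sum, alpha=15.0):
    """Squash an unbounded valence sum into (-1, 1), as VADER does for 'compound'."""
    return raw_sum / math.sqrt(raw_sum * raw_sum + alpha)

def score_text(text):
    """Return VADER's {'neg', 'neu', 'pos', 'compound'} dict for one text."""
    # Imported lazily so the helper above works without NLTK installed.
    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    sia = SentimentIntensityAnalyzer()
    return sia.polarity_scores(text)

# A raw valence sum of -4.0 (illustrative value) maps to roughly -0.72.
example_compound = vader_normalize(-4.0)
```

Calling `score_text("The taste is so Bad!")` returns a dictionary with the four keys discussed above.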
VADER Sentiment Scoring
B. Employ Sentiment Analysis on dataset
In the next step, we apply sentiment analysis using the SentimentIntensityAnalyzer (sia) to the entire dataset (df). The code iterates
through each row in the dataset, extracts the cleaned text from the 'clean' column, and calculates the polarity scores using the
polarity_scores method of the SentimentIntensityAnalyzer. The results, which include the compound sentiment score, are stored
in a dictionary (res) with the 'name' of each row as the key.
VADER Sentiment Scoring
B. Employ Sentiment Analysis on dataset
After that, we create a new DataFrame (vaders) that combines the sentiment analysis results with the original dataset based on
the 'name' column. From the output below, we can see that the neg, neu, pos, and compound scores are now part of the dataset.
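
The loop and merge described above can be sketched as follows. The function name `add_sentiment_scores` is hypothetical, and the scorer is passed in as a parameter so the sketch runs without NLTK; in the real analysis it would be `sia.polarity_scores`. The toy scorer below is a stand-in, not VADER.

```python
import pandas as pd

def add_sentiment_scores(df, scorer, text_col="clean", key_col="name"):
    """Score every row's text and merge the scores back onto the original rows."""
    res = {}
    for _, row in df.iterrows():
        # One score dict per row, keyed by the reviewer name.
        res[row[key_col]] = scorer(row[text_col])
    # Dict of dicts -> one row per key, one column per score.
    vaders = pd.DataFrame(res).T
    vaders = vaders.reset_index().rename(columns={"index": key_col})
    return vaders.merge(df, how="left", on=key_col)

def toy_scorer(text):
    """Stand-in for sia.polarity_scores, used only to make the sketch runnable."""
    compound = 1.0 if "good" in text else -1.0
    return {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": compound}

toy = pd.DataFrame({
    "name": ["A", "B"],
    "clean": ["good coffee", "bad service"],
    "Rating": [5, 1],
})
vaders = add_sentiment_scores(toy, toy_scorer)
```

One caveat of keying by name: if two reviewers share a name, the later row overwrites the earlier score and the merge duplicates rows, so a unique row id would be a safer key.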
VADER Sentiment Scoring
C. Sentiment Scoring vs Rating
To check whether the sentiment scores are in line with the ratings, we visualise the compound score for each rating.

From the boxplot above, the higher the rating, the higher the compound score tends to be. For middle (neutral) ratings, the
compound score stays close to zero, and for low ratings, the compound score is lower.
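
The same comparison can be made numerically by summarising the compound score per rating. This is a sketch assuming a DataFrame with `Rating` and `compound` columns; the toy values are invented and `compound_by_rating` is a hypothetical helper.

```python
import pandas as pd

def compound_by_rating(vaders):
    """Median, mean, and count of the compound score for each star rating."""
    return vaders.groupby("Rating")["compound"].agg(["median", "mean", "count"])

# Invented toy scores, only to show the shape of the output.
toy = pd.DataFrame({
    "Rating": [1, 1, 3, 5, 5],
    "compound": [-0.8, -0.6, 0.0, 0.7, 0.9],
})
summary = compound_by_rating(toy)
```

If sentiment and rating agree, the median column should rise as the rating increases.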
Visualise The Classification using Wordcloud
A. Positive Sentiment
From the wordcloud below, we know that "Service", "Time", "Order", and "Employee" are the words that appear most frequently in positive
reviews.

1. The frequent mention of "Service" indicates that customers highly appreciate the quality of service provided.
2. Keywords like "Time" and "Order" suggest that customers are satisfied with the timely processing of their orders,
reflecting operational efficiency.
Visualise The Classification using Wordcloud
B. Negative Sentiment
From the wordcloud below, we know that "Time", "Order", "Manager", and "Service" are the words that appear most frequently in negative
reviews.

1. The mention of "Service" indicates that customers are mainly unhappy with customer service and employee behaviour.
2. The frequent mention of "Time" in negative reviews indicates that customers may face challenges related to service speed
or timely order fulfilment.
3. The frequent mention of "Order" implies that negative experiences might be linked to issues with order accuracy,
fulfilment, or general processing.
4. The word "Manager" appearing in negative reviews indicates that customers may have encountered situations requiring
managerial intervention.
Visualise The Classification using Wordcloud
C. Neutral Sentiment
From the wordcloud below, we know that "Ordered", "Location", "Grande", and "Venti" are the words that appear most frequently in neutral
reviews.

In the context of neutral reviews, the frequent appearance of the words suggests that customers are likely discussing their
orders, the store location, and specific drink sizes. This information can be valuable for understanding common topics or
aspects that customers mention without a strong positive or negative sentiment.
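
The source does not state how reviews were assigned to the positive, negative, and neutral groups. A common convention for VADER is to call a text positive when compound >= 0.05, negative when compound <= -0.05, and neutral otherwise; the sketch below assumes that convention and uses plain word counts as a stand-in for the wordcloud. The names `label_sentiment` and `top_words` are hypothetical.

```python
from collections import Counter
import pandas as pd

def label_sentiment(compound, threshold=0.05):
    """Map a compound score to a class using the conventional +/-0.05 cut-offs (assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def top_words(vaders, label, n=4, text_col="clean"):
    """Most frequent words among reviews of one sentiment class."""
    labels = vaders["compound"].apply(label_sentiment)
    counts = Counter()
    for text in vaders.loc[labels == label, text_col]:
        counts.update(text.split())
    return counts.most_common(n)

# Invented toy rows, one per class.
toy = pd.DataFrame({
    "clean": ["great service service", "slow order", "ordered grande"],
    "compound": [0.8, -0.7, 0.0],
})
positive_top = top_words(toy, "positive", n=1)
```

The word counts per class are the same frequencies a wordcloud renders as font size.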
Conclusion and Suggestions
A. Conclusion
From the analysis that we have carried out, the following conclusions are obtained.
1. Customer Dissatisfaction Hotspots:
- The bar chart reveals a significant number of low ratings, particularly rating 1, signalling dissatisfaction.
- The state of California and New York City stand out as the areas with the highest dissatisfaction levels, necessitating targeted
improvement efforts.
2. Digital Innovations Impact:
- The boost in digital offerings in 2017, such as the Mobile Order & Pay platform, contributed to heightened enthusiasm and
increased customer engagement.
- Post-2017, there is a noticeable decline in public enthusiasm, highlighting the need for continued innovation and adaptation to
evolving consumer expectations.
3. Service Excellence and Operational Efficiency:
- Customer reviews consistently emphasise the importance of "Service," showcasing a positive sentiment towards the quality of
service provided.
- Positive mentions of "Time" and "Order" indicate satisfaction with efficient order processing and operational excellence.
4. Challenges Leading to Negative Experiences:
- Negative reviews featuring "Time" suggest challenges related to service speed and timely order fulfilment.
- Negative experiences related to "Order" point to potential issues with accuracy, fulfilment processes, or general processing.
- The appearance of "Manager" in negative reviews implies situations requiring managerial intervention.
5. Insights from Neutral Reviews:
- Neutral reviews highlight discussions around routine topics like orders, store locations, and specific drink sizes.
- These insights can guide Starbucks in understanding common customer concerns that may not strongly lean towards either
positive or negative sentiment.
Conclusion and Suggestions
B. Suggestions
Based on the conclusions presented above, the recommendations are as follows.
1. Addressing Dissatisfaction Hotspots:
- Implement targeted improvement initiatives in areas with the highest dissatisfaction, particularly in the state of California
and New York City.
- Conduct thorough investigations into low-rated experiences to identify specific pain points and tailor solutions.
2. Revitalising Digital Strategies:
- Enhance public enthusiasm by introducing fresh digital innovations and enhancements.
- Leverage customer feedback to adapt digital offerings, ensuring they align with evolving expectations and
preferences.
3. Sustaining Service Excellence:
- Strengthen the focus on service quality, acknowledging its crucial role in overall customer satisfaction.
- Emphasise efficient order processing, operational excellence, and positive interactions with staff members.
4. Addressing Challenges for Negative Experiences:
- Tackle challenges related to service speed and order fulfilment highlighted in negative reviews.
- Implement corrective measures to address issues with order accuracy, fulfilment processes, and general processing.
5. Managerial Intervention:
- Address situations requiring managerial intervention, as indicated by the appearance of "Manager" in negative reviews.
- Provide additional training and support for staff and managers to handle challenging scenarios effectively.
6. Leveraging Insights from Neutral Reviews:
- Use insights from routine discussions in neutral reviews to enhance customer experiences.
- Consider incorporating feedback on orders, store locations, and specific drink sizes into operational improvements.
Thank You
Yulian Farid Wahyudi
yulianfarid4@gmail.com
https://www.linkedin.com/in/yulianfarid-wahyudi/
