Let’s use machine learning! This week we try to better understand Amazon Alexa by scraping a whole lot of Amazon Alexa reviews.
A Cloudy resolution
Instead of just talking about and pondering AI/ML, my resolution for 2019 is to use more machine learning in Cloudy itself. While I am not going to build an algorithm to do all of my writing, something that Forbes is experimenting with, hopefully by using AI/ML on different topics we can extract interesting insights.
A good place to start is by applying machine learning to a product that relies on machine learning: Amazon Alexa. After Amazon announced it had sold over 100 million Alexa units, I wanted to get a better understanding of how people are using their Alexa devices and what they think of them. Instead of doing a survey or interviewing people, I decided to use machine learning.
First, I scraped several thousand reviews of the Amazon Echo (which is the Amazon’s Choice for the word echo). To make things fun, I used open-source and AWS ML tools to see what sort of insights I could come up with: Amazon ML on Amazon ML.
Side note: We use a proprietary Ogilvy tool for this type of work, so let me know if you are interested in learning more.
A disclaimer on reviews
Amazon reviews don’t tell the full picture of a product, especially Alexa. In my sample, the median review is 15 words long, and 20% of reviews have fewer than six words. A lot of people just put “Love it!” or “Great”, which makes you wonder why people are writing reviews in the first place. If you’ve never seen reviews on Amazon before, here are some examples:
Love my Amazon 2nd generation. The speakers are amazing
What can I say other than I LOVE MY ALEXA!! I had a dot and it was great, but this has the SOUND! I bought another dot to give my son and gave my previous dot to my other son. These are so cool. Quality made and it has great sound. Thanks Amazon for another great product.
Lot of fun
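The word-count figures above are simple to compute once the reviews are in a list. Here is a minimal sketch, assuming whitespace-separated tokens count as words; the function name `word_count_summary` and the tiny sample (snippets quoted in this post) are illustrative and not the original analysis code.

```python
from statistics import median


def word_count_summary(reviews, short_threshold=6):
    """Return the median word count and the share of reviews below the threshold."""
    counts = [len(review.split()) for review in reviews]
    short_share = sum(c < short_threshold for c in counts) / len(counts)
    return median(counts), short_share


# Review snippets quoted in this post, used as a tiny stand-in sample.
sample_reviews = [
    "Love it!",
    "Great",
    "Lot of fun",
    "Love my Amazon 2nd generation. The speakers are amazing",
]

median_words, short_share = word_count_summary(sample_reviews)
print(f"median words: {median_words}, share under 6 words: {short_share:.0%}")
```

On the full scraped sample, the same function would be expected to reproduce the 15-word median and the 20% share reported above.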
What are people saying?
The first thing I tried was a technique called Latent Dirichlet Allocation (Dirichlet rhymes with hooray), which attempts to assign different words to topics. While the results aren’t perfect, you can see some interesting associations:
|Topic|Top words|
|---|---|
|Sound Quality|Great, good, sound, quality, sounds|
|Music|Music, like, sound, play|
|Setup|Easy, setup, use|
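To give a feel for how LDA arrives at word groupings like the ones in the table, here is a toy collapsed Gibbs sampler in pure Python. It is only a sketch: the post does not say which LDA implementation was used, and the tokenised reviews, the hyperparameters `alpha` and `beta`, and the function names are all made up for illustration. A real analysis would more likely use a library implementation on the full review set.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=3, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns one word Counter per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                        # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic given all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word


def top_words(topic_word, n=3):
    """Most frequent words per topic, ignoring words with a zero count."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Invented, already-tokenised reviews (stop words removed), for illustration only.
toy_docs = [
    ["great", "sound", "quality"],
    ["sound", "quality", "good"],
    ["play", "music", "like"],
    ["music", "play", "sound"],
    ["easy", "setup", "use"],
    ["easy", "use", "setup"],
]

topics = lda_gibbs(toy_docs, n_topics=3)
for index, words in enumerate(top_words(topics)):
    print(f"topic {index}: {', '.join(words)}")
```

Note that LDA only returns groups of words. Labels such as "Sound Quality" or "Setup" are a human reading of each group, and the groups can change with the random seed and the number of topics.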
Many of the reviews focus on the sound quality of the speaker and how convenient it is to play music. It seems that for this specific version of Alexa, people aren’t raving about the AI capabilities of the device, but rather like it for being a superior speaker. This is also supported by a survey from voicebot.ai, which shows that listening to music is the #2 use case (answering a question is #1).
I then decided to put the reviews through Amazon’s keyword extraction API, which does exactly that: it extracts keywords from text. The Amazon API identifies music and sound quality as the top keywords, but also identifies gift/Christmas, indicating some people have either given or received Alexa as a gift.
Amazon Keyword Result
|Keyword||Rank (Dec/Jan 2019)|
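A keyword ranking like this could be produced by counting how often each extracted phrase shows up across reviews. The sketch below assumes the API in question is Amazon Comprehend's `detect_key_phrases` called through `boto3`; the post does not name the service, so treat that as a guess. Ranking phrases by raw count is also an assumption, and `rank_key_phrases` is a hypothetical helper.

```python
from collections import Counter


def rank_key_phrases(client, reviews, language_code="en"):
    """Count how often each key phrase is returned across all reviews.

    `client` is expected to behave like boto3.client("comprehend")
    (an assumption; the post does not name the exact service).
    """
    counts = Counter()
    for review in reviews:
        response = client.detect_key_phrases(Text=review, LanguageCode=language_code)
        for phrase in response["KeyPhrases"]:
            counts[phrase["Text"].lower()] += 1
    return counts.most_common()


# Usage, assuming AWS credentials and a default region are configured:
#   import boto3
#   client = boto3.client("comprehend")
#   for phrase, n in rank_key_phrases(client, reviews)[:10]:
#       print(phrase, n)
```

Passing the client in as an argument keeps the counting logic separate from the AWS call, so it can be tried out with a stand-in object before spending money on API requests.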
This insight aligns with data from Google Trends, which shows that queries for Alexa are highly cyclical and peak during the Christmas season, while also getting some lift from CES.
Google Trends for Alexa (peaks are Christmas time)
How does machine learning compare to just listing the most frequently occurring words? After removing words like I, he, she, etc., the top three words are “sound”, “great”, and “music”, with “gift” ranking #51, implying there is some added benefit from using machine learning instead of just listing the top words.
If you are the brand manager for Alexa, an interesting insight here would be to talk about what a great gift idea the Alexa could be. Or maybe you advertise the lower priced Alexa as a stocking stuffer. Not that Amazon needs to sell any more Alexas…
While each reviewer is forced to give a numeric rating of the product, I wanted to see if the text of the review matched the rating. To do this, I passed the reviews into Amazon’s sentiment analysis. This confirmed that people who give favorable ratings also use positive language to describe the product. The distribution of the API’s sentiment labels is similar to the distribution of actual ratings, with the large majority of reviews rated “Positive”.
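The plain word-count baseline takes only a few lines. This is a sketch under stated assumptions: the stop-word list is a short hypothetical one, the tokeniser is a simple regular expression, and the toy reviews are invented, so the ranks it prints will not match the ones from the real sample.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real run would use a fuller one.
STOPWORDS = {"i", "he", "she", "it", "a", "an", "the", "and", "for", "my", "to", "is"}


def word_frequencies(reviews, stopwords=STOPWORDS):
    """Count lowercase word tokens across reviews, skipping stop words."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z0-9']+", review.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


def rank_of(word, counts):
    """1-based position of `word` in the frequency ranking, or None if absent."""
    for position, (candidate, _) in enumerate(counts.most_common(), start=1):
        if candidate == word:
            return position
    return None


# Invented reviews, for illustration only.
toy_reviews = ["Great sound", "Great music and great sound", "A gift for my son"]
counts = word_frequencies(toy_reviews)
print(counts.most_common(3))
print("rank of 'gift':", rank_of("gift", counts))
```

A helper like `rank_of` is how you would check where a word such as “gift” lands in the raw ranking compared with its position in the keyword extraction results.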
Actual Rating Distribution/API Sentiment Distribution
|Actual Rating||% of reviews||Amazon API Rating||% of reviews|
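Comparing the two distributions comes down to mapping star ratings onto sentiment labels and tallying both sides. The sketch below assumes the sentiment API is Amazon Comprehend's `detect_sentiment` via `boto3`, which the post does not confirm. The star-to-label bucketing (4 to 5 stars positive, 3 neutral, 1 to 2 negative) is also an assumption made for illustration, as are the function names.

```python
from collections import Counter


def stars_to_label(stars):
    """Hypothetical bucketing: 4-5 stars positive, 3 neutral, 1-2 negative."""
    if stars >= 4:
        return "POSITIVE"
    if stars == 3:
        return "NEUTRAL"
    return "NEGATIVE"


def api_label(client, text, language_code="en"):
    """Sentiment label from a client that behaves like boto3.client("comprehend")."""
    return client.detect_sentiment(Text=text, LanguageCode=language_code)["Sentiment"]


def distribution(labels):
    """Share of each label, as fractions that sum to 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def compare(client, rated_reviews):
    """`rated_reviews` is a list of (stars, text) pairs."""
    star_labels = [stars_to_label(stars) for stars, _ in rated_reviews]
    api_labels = [api_label(client, text) for _, text in rated_reviews]
    matches = sum(s == a for s, a in zip(star_labels, api_labels))
    return distribution(star_labels), distribution(api_labels), matches / len(rated_reviews)


# Usage, assuming AWS credentials and a default region are configured:
#   import boto3
#   star_dist, api_dist, agreement = compare(boto3.client("comprehend"), rated_reviews)
```

Comprehend can also return a `MIXED` label, which has no obvious star equivalent, so two similar-looking distributions do not guarantee review-by-review agreement. That is why the sketch reports the agreement rate as well.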
While people aren’t raving about the AI capabilities of Alexa, they are satisfied and even seem to be having fun with the product. The main feature of this specific Alexa is the speaker, which implies maybe Amazon should go into the speaker business?
One final note: while I didn’t scrape reviews for the Amazon Dot, I received one for Christmas…which means I have two Alexas sitting unplugged in my kitchen drawer.