Conversational UI is revolutionizing how we interact with technology. Even if you aren’t personally convinced, all the major powerhouses of technology are. Microsoft, Facebook, Google, Apple and Amazon have invested heavily in messaging apps and conversational AI.
Amazon takes conversational experiences even further with their successful Echo product, a connected speaker with far-field microphone technology, access to all of Amazon’s e-commerce capabilities, and a smart AI assistant named Alexa. On the Echo, all commands are delivered vocally, free of hands or devices, which differentiates their voice UI experience from text-driven Messenger bots, iMessage mini-app windows, invasive Slackbots, and the device-specific experiences of Siri or Cortana.
Here at TOPBOTS, we’ve built conversational and voice UI experiences on several platforms (including Messenger, Slack, and the Amazon Echo) and have had the opportunity to compare the developer experience across them. Despite the potential of Alexa’s pervasive and invisible UI, the device is one of the least friendly platforms for an Amazon Echo developer to build on.
Here are four important reasons why:
Common Technical Challenges For Alexa Skill Developers
1. Your Skill invocation phrase will likely conflict with a song on Prime Music
Invocation phrases are unique verbal sentences that are used to activate your particular Skill on the Echo. They’re already inherently tricky to design since voice recognition and interpretation can be volatile, but the Echo adds an extra hurdle for developers to overcome.
The vast majority of Echo customers have Amazon Prime, which means they also have access to the entire digital music library in Prime Music. This is an epic feature from the user perspective, but as an Amazon Echo developer of Alexa Skills, user access to such a gigantic musical database means that whatever invocation phrase you choose for your Skill is very likely to conflict with a random song, album, playlist, or artist.
Earlier this summer, we released a guided meditation Skill for Echo called “Clear Mind”. This skill would allow you to choose between three different kinds of guided meditations: Breath Awareness, Relaxation, or Loving Kindness. Turns out there was organic demand and excitement for this Skill, but we still ended up with a bunch of 1-star ratings in the Alexa Store.
Why? Because users would try to activate the Skill by saying “Alexa, start Clear Mind”, but would end up activating a random musical track with the words “clear” and “mind” in the title. This led frustrated users to write 1-star reviews claiming the Skill “didn’t work”.
Even checking against search results on the Amazon Music library doesn’t necessarily mean you’re in the clear. We tried switching the name of our skill to “Everyday Meditation”, which seemed to only have 3 conflicts on Prime Music. However, when we tested the invocation phrase in development, we discovered to our chagrin that there was a song called “Everyday Meditation” by Meditation House that conflicted with our Skill each time.
As an Amazon Echo developer, you can reduce the risk of this problem by searching for potential naming conflicts on both Amazon and the Amazon Music library, and by having beta users test your invocation phrases on a real device. We eventually renamed our meditation Skill to Peaceful Habit, which luckily no musicians thought of before we did.
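If you collect the titles that turn up in your searches, a few lines of Python can flag the obvious collisions before you submit. This is only a rough sketch: the function names are our own, the title list is hypothetical and hand-collected (there is no catalog API call here), and the rule "a title conflicts if it contains every word of the invocation phrase" is an assumption modeled on what happened with Clear Mind, not a description of how Alexa actually resolves requests.

```python
import re


def normalize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_conflicts(invocation_name, catalog_titles):
    """Return the titles that contain every word of the invocation name."""
    wanted = normalize(invocation_name)
    return [
        title for title in catalog_titles
        if wanted and wanted <= normalize(title)
    ]


# Hypothetical titles gathered by hand from music search results.
sample_titles = [
    "Clear Your Mind",
    "Everyday Meditation",
    "Morning Rain",
]

print(find_conflicts("Clear Mind", sample_titles))      # ['Clear Your Mind']
print(find_conflicts("Peaceful Habit", sample_titles))  # []
```

A clean result here doesn’t prove the name is safe. It only catches the collisions you already know about, so testing on a real device is still the final check.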
2. You can’t respond to or remove erroneous reviews
Much like in the iOS app store, the easiest way for a customer to report a bug is to give you a one-star review. The same frustration is true for Alexa. After receiving a slew of 1-star reviews claiming that our Skill “didn’t work”, we lost organic traffic that could have been recouped if we had the opportunity to respond to reviews informing users of how to get around the Prime Music conflict.
Reviews can be cleared if you engage with Amazon Echo developer support, but this process can take days and involves quite a bit of back and forth.
3. You can’t remove a Skill without a drawn-out conversation with developer support
On iOS, you can push one button in your developer console that removes an app from distribution. While this seems like a no-brainer capability to offer developers, we discovered you can’t remove a Skill from the Alexa Store without engaging in a lengthy discussion with Amazon’s developer support on why you’re doing it.
Here are the steps we went through to remove Clear Mind from the Skills store:
- Googled how to remove a Skill
- Found a post in an Amazon developer forum where developers complained about not being able to remove a Skill
- Clicked on a link in the post to an Amazon customer support form
- Filled out the form
- Waited two days for a response
- Received a response requesting an explanation of why we needed to remove the Skill rather than update or fix it & a link to the same post in the developer forum we had previously referenced
- Filled out the form AGAIN
- Finally received a confirmation that the skill would be removed
- Breathed a sigh of relief
- Vowed to publish our Skill removal travails in a ranty article 😛
- Published our travails in a ranty article
- Sent ranty article to Amazon support, which promised to escalate
- Got email notice from Amazon that the Clear Mind Skill had been removed
- Noticed weeks later that Clear Mind was still in the store
- Sent angry followup email to Amazon asking why
- Got email from Amazon saying they have internally escalated the issue
- Still waiting on Skills removal to this day…
4. Alexa is still missing essential functionality to independently complete most user actions
Most critical of all these pitfalls, the Amazon Echo platform doesn’t equip designers and developers with the minimum functionality to craft an amazing experience for users of Alexa Skills.
When we first released Clear Mind, which is a guided meditation Skill, the Echo platform would only allow developers to play 90-second audio files or pause Alexa for 90 seconds. Unless you’re a Buddhist monk who’s a super pro at meditation, 90 seconds is not enough for a guided meditation track to ease you into a peaceful, focused state.
Only at the end of August did Amazon enable full audio streaming capabilities for Skills developers. As soon as we got the news, we scrambled to revamp our meditation Skill with 5-, 10-, and 20-minute meditation tracks and finally found a name that (*fingers crossed*) won’t mis-invoke songs from Prime Music. We’re excited to finally produce a Skill that might actually be useful.
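For the curious, here is roughly what a long-form audio response can look like. Treat it as a sketch under assumptions rather than our production code: the field names follow our understanding of the Alexa AudioPlayer interface (an `AudioPlayer.Play` directive carrying an HTTPS stream URL), so check the current documentation before relying on them. The track URLs, the `TRACKS` table, and the `build_play_response` helper are all hypothetical.

```python
import json

# Hypothetical HTTPS locations for the 5, 10, and 20 minute tracks.
TRACKS = {
    5: "https://example.com/audio/meditation-5.mp3",
    10: "https://example.com/audio/meditation-10.mp3",
    20: "https://example.com/audio/meditation-20.mp3",
}


def build_play_response(minutes):
    """Build a Skill response that streams the track of the given length."""
    url = TRACKS[minutes]  # raises KeyError for unsupported lengths
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": f"Starting your {minutes} minute meditation.",
            },
            "directives": [
                {
                    "type": "AudioPlayer.Play",
                    "playBehavior": "REPLACE_ALL",
                    "audioItem": {
                        "stream": {
                            "token": f"meditation-{minutes}",
                            "url": url,
                            "offsetInMilliseconds": 0,
                        }
                    },
                }
            ],
            "shouldEndSession": True,
        },
    }


print(json.dumps(build_play_response(10), indent=2))
```

The Skill hands the stream to the device and ends the session, which is how a 20-minute track can play without the old 90-second ceiling.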
Many other Skills, even those from major brands, also struggle to move beyond lackluster user experiences.
With the Domino’s Skill, you can only reorder a pizza you’ve already ordered previously. If you haven’t ordered from Domino’s before, you’d have to download the mobile app and order a pizza just so you can re-order it on the Echo.
With the Uber Skill, you can call an Uber, but you can’t get updates when it is nearby or has arrived. You’d have to go back to the mobile app for those functions, which undermines the voice-only, hands-free, device-independent promise of the Echo.
There are many legitimate, complex security and privacy reasons why persistent background awareness has not yet been enabled for Echo Skills developers. But the lack of this functionality means that you likely won’t be able to complete full action loops in your Skills without forcing your users to pick up their phones or handle companion apps.
Until these challenges are fixed, Echo developers & designers won’t be able to fully deliver on the seamless & convenient experiences that voice UI has the potential to create.