40% of a typical bot’s users engage in only one conversation.

This statistic, calculated by Ilker Koksal, co-founder of Botanalytics, suggests bot makers need to invest more effort to measure bot metrics and performance and deliver value to users. Traditional metrics like DAU / MAU and analytics tools like Google Analytics or Mixpanel work well for websites and mobile apps, but the unique conversational nature of chatbots requires a different perspective on performance.

Some traditional metrics may even be misleading. Session length is often used as a proxy for user engagement on web and mobile. However, many chatbots are utilitarian and should be a functional shortcut compared to their app or website counterparts. Increased session length could mean users are confused or the conversational flow is inefficient.

With the skyrocketing popularity of chatbots, bot developers have collected enough data to learn what is and isn’t working. Bot analytics companies like Dashbot and Botanalytics have collectively processed close to 100 million messages, which gives them a bird’s-eye view of which bot metrics are most useful for developers. Developers on their platforms have tried dozens of new measurements to identify the best ways to improve their bots.

Here are the 5 chatbot metrics that we’ve learned produce the most useful insights.

What Are The Most Critical Bot Metrics You Need To Track?


1. Active & Engaged Rates

Many users barely interact with a chatbot before churning. 40% of a bot’s users interact only once. Given this high churn, identifying and nurturing active and engaged users is key to long-term success.

Dennis Yang, co-founder of Dashbot, recommends using Active and Engaged Rates to combat churn. When a user reads a message in a session, that session is considered “active”. When a user responds with a message in a session, that session is considered “engaged”.

Active rate = number of active sessions of a user / total number of sessions of that user.

Engaged rate = number of engaged sessions of a user / total number of sessions of that user.
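
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not Dashbot’s implementation: the `Session` record and its `messages_read` and `messages_sent` fields are hypothetical names chosen for the example, and how a “read” is detected depends on what the messaging platform reports.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One session of a single user (hypothetical record layout)."""
    messages_read: int  # bot messages the user read in this session
    messages_sent: int  # messages the user sent in this session


def active_rate(sessions):
    """Share of a user's sessions in which the user read a message."""
    if not sessions:
        return 0.0
    active = sum(1 for s in sessions if s.messages_read > 0)
    return active / len(sessions)


def engaged_rate(sessions):
    """Share of a user's sessions in which the user sent a message."""
    if not sessions:
        return 0.0
    engaged = sum(1 for s in sessions if s.messages_sent > 0)
    return engaged / len(sessions)


# Four sessions for one user: three active, two of them also engaged.
sessions = [Session(3, 2), Session(1, 0), Session(0, 0), Session(2, 1)]
print(active_rate(sessions))   # 0.75
print(engaged_rate(sessions))  # 0.5
```

Both functions take the sessions of one user, matching the per-user definitions above. Averaging the results across users would give a bot-level figure.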

How do you optimize active and engaged rates? Yang suggests you answer this question:

What are the top messages users send my chatbot?

Machaao, a popular Facebook Messenger chatbot for cricket fans, increased its user engagement by 300% by analyzing and adapting to how the most active and engaged users spoke to the bot. Users’ messages reflect their expectations of how a bot should behave. Fitting their mental models is usually a winning strategy for boosting engagement.

“We figured out that the top message sent to our bot was the Like button,” says Harshal Dhir, founder of Machaao. Inspired by watching active rates, engaged rates, and top messages, Machaao enabled easier expression of Likes and also prioritized news and schedule formats that matched the expectations of their users.
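
Answering Yang’s question about top messages amounts to a frequency count over inbound messages. The sketch below assumes the messages are available as a plain list of strings. The `top_messages` helper and the sample data are illustrative only, and lowercasing plus trimming is just one possible normalization.

```python
from collections import Counter


def top_messages(messages, n=10):
    """Return the n most frequent user messages, ignoring case and padding."""
    normalized = (m.strip().lower() for m in messages if m.strip())
    return Counter(normalized).most_common(n)


# Hypothetical inbound messages from users.
inbound = ["Like", "score", "like ", "schedule", "LIKE", "score"]
print(top_messages(inbound, n=2))  # [('like', 3), ('score', 2)]
```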


2. Confusion Triggers

The nascent chatbot industry has yet to develop the optimal user experience with conversational UI. Challenges exist throughout the funnel: bringing users to a bot, communicating functionality, driving towards action, and handling inquiries and errors.

Given the huge range of possible user input, chatbots often misinterpret or can’t understand what a user wants. Thus, the instances when your bot says a version of “I don’t understand” must be closely watched. It is also useful to see which user inputs caused the bot’s confusion.

StreakTrivia is a bot that runs a massively multiplayer trivia game on Facebook Messenger every day. The bot asks players True or False questions and then presents quick reply buttons to answer with. By closely watching confusion rate, StreakTrivia’s team was able to catch an issue they would otherwise have missed. It turned out that the bot was often confused because users typed “true” or “false” instead of using the provided buttons.
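
One way to track both the confusion rate and its triggers is to scan logged exchanges for fallback replies. The sketch below is an assumption-laden illustration: the `(user_input, bot_reply)` log format, the `FALLBACK_MARKERS` phrases, and the `[button:...]` notation for quick replies are all hypothetical, and a real bot would more likely tag its fallback intent explicitly than rely on string matching.

```python
from collections import Counter

# Hypothetical phrases that mark a fallback ("confused") reply.
FALLBACK_MARKERS = ("i don't understand", "i didn't get that")


def is_confused(bot_reply):
    """True if the bot's reply contains one of the fallback phrases."""
    reply = bot_reply.lower()
    return any(marker in reply for marker in FALLBACK_MARKERS)


def confusion_report(exchanges):
    """Return (confusion rate, inputs that triggered confusion, most common first)."""
    triggers = Counter()
    for user_input, bot_reply in exchanges:
        if is_confused(bot_reply):
            triggers[user_input.strip().lower()] += 1
    confused = sum(triggers.values())
    rate = confused / len(exchanges) if exchanges else 0.0
    return rate, triggers.most_common()


# Made-up log: typed answers confuse the bot, button presses do not.
exchanges = [
    ("true", "Sorry, I don't understand."),
    ("[button:TRUE]", "Correct!"),
    ("True", "Sorry, I don't understand."),
    ("false", "Sorry, I don't understand."),
    ("[button:FALSE]", "Not quite."),
]
rate, triggers = confusion_report(exchanges)
print(rate)      # 0.6
print(triggers)  # [('true', 2), ('false', 1)]
```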

Tracking confusion rate also helps triage when human intervention is needed. Just as bad customer support associates ruin a customer’s opinion of your brand, so will a bad chatbot experience. Homing in on high-risk scenarios and escalating to trained staff dramatically reduces churn and provides a critical opportunity to learn about user needs.


3. Conversation Steps

“A long conversation doesn’t necessarily mean an engaged user,” says Ilker Koksal of Botanalytics. He points out that chatbots like Uber’s want users to order a car with as few steps as possible. If a user’s conversation with an Uber chatbot takes more than 50 back-and-forth messages, the experience is a clear failure since the user would be better off using the app.

Koksal defines “conversation step” as a single back-and-forth exchange between a user and a bot. For example, if a user says “hi” and the bot replies with “hi” back, that’s one conversation step.

“Every chatbot needs to know their average conversation steps,” says Koksal. Utility-driven chatbots have lower average conversation steps than entertainment-driven chatbots. Regardless of the chatbot type, conversations that either significantly exceed or fall well short of the average number of conversation steps usually indicate a bad user experience. Either a user gave up too quickly or a bot took too long to complete a user’s goal.
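
Koksal’s definition can be turned into a small counting routine, followed by a check for conversations far from the average. This is a sketch under stated assumptions: the `(sender, text)` message format is hypothetical, and the `low_factor` and `high_factor` cutoffs are arbitrary illustrations of “significantly” rather than values from Botanalytics.

```python
from statistics import mean


def count_steps(messages):
    """Count user-then-bot exchanges in a list of (sender, text) pairs."""
    steps = 0
    awaiting_reply = False
    for sender, _text in messages:
        if sender == "user":
            awaiting_reply = True
        elif sender == "bot" and awaiting_reply:
            steps += 1
            awaiting_reply = False
    return steps


def flag_outliers(step_counts, low_factor=0.5, high_factor=2.0):
    """Return indexes of conversations far below or far above the average."""
    if not step_counts:
        return [], []
    average = mean(step_counts)
    too_short = [i for i, s in enumerate(step_counts) if s < low_factor * average]
    too_long = [i for i, s in enumerate(step_counts) if s > high_factor * average]
    return too_short, too_long


print(count_steps([("user", "hi"), ("bot", "hi")]))  # 1
print(flag_outliers([5, 6, 5, 6, 30, 1]))            # ([5], [4])
```

A mean is easily skewed by a few very long conversations, so a median or percentile cutoff may be a better baseline on real data.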

PennyCat, a Facebook Messenger bot that allows users to play games and find coupons, uses conversation steps to segment and redirect its users. The team easily separates users into “Game Lovers” and “Discount Lovers” because Game Lovers’ conversations typically exceed 40 conversation steps.

Once PennyCat identified the “Game Lovers”, they targeted them with select coupons in order to convert them to “Discount Lovers”. Segmenting and targeting users based on conversation steps led to a 70% increase in coupon use.
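
A segmentation like PennyCat’s can be approximated with a single threshold. The sketch below is not PennyCat’s actual logic: it assumes each user’s typical step count has already been computed, and it treats every user at or below the 40-step mark as a “Discount Lover”, which is a simplification made for the example.

```python
GAME_LOVER_MIN_STEPS = 40  # threshold taken from the PennyCat example above


def segment_users(steps_by_user, threshold=GAME_LOVER_MIN_STEPS):
    """Split users into two segments by their typical conversation steps."""
    segments = {"Game Lovers": [], "Discount Lovers": []}
    for user_id, steps in steps_by_user.items():
        label = "Game Lovers" if steps > threshold else "Discount Lovers"
        segments[label].append(user_id)
    return segments


# Hypothetical typical step counts per user.
print(segment_users({"u1": 55, "u2": 12, "u3": 40}))
# {'Game Lovers': ['u1'], 'Discount Lovers': ['u2', 'u3']}
```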


4. Average Number of Conversations / User

The number of conversations a user has with a bot is just like the number of sessions a user starts with a mobile app. The metric is highly correlated with engagement. According to Koksal, the average number of conversations per user per month ranges from 1.42 to 4.79 for the bots on the Botanalytics platform.

Paying attention to how Average # of Conversations / User fluctuates over time gives bot developers insight into potential shortcomings and how to fix them. The team behind one recruiting bot on the Facebook Messenger platform noticed when the bot’s average number of conversations per user dropped. To resolve the problem, they studied users whose number of conversations was below the bot’s average and noticed that they weren’t engaging with the job offerings presented. The team changed the bot to ask users for more background information, delivered more relevant results, and saw a boost to engagement and retention.

Average # of Conversations / User can also reveal if a new feature is working. StreakTrivia started out with an average of 2.5 conversations per user, but then saw a 224% gain to 8.1 after they implemented a “play with friends” feature.
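
The metric and the percentage gain quoted above are simple to compute. The sketch below assumes a hypothetical log of `(user_id, conversation_id)` pairs for the period of interest. The function names are illustrative, and the last line only reproduces the arithmetic behind the StreakTrivia figures.

```python
from collections import defaultdict


def average_conversations_per_user(conversation_log):
    """Average number of distinct conversations per user in the log."""
    per_user = defaultdict(set)
    for user_id, conversation_id in conversation_log:
        per_user[user_id].add(conversation_id)
    if not per_user:
        return 0.0
    return sum(len(convs) for convs in per_user.values()) / len(per_user)


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# User "a" has three conversations (one logged twice), user "b" has one.
log = [("a", "c1"), ("a", "c2"), ("a", "c3"), ("b", "c4"), ("a", "c1")]
print(average_conversations_per_user(log))  # 2.0
print(round(percent_change(2.5, 8.1)))      # 224
```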


5. Retention: 1 Day, 7 Day, and 30 Day

Retention is not a chatbot-specific metric, but the best retention period to focus on varies based on a bot’s purpose.

For example, finding a job usually takes a minimum of 20 days of searching, so a 1 Day or 7 Day retention metric is insufficient. When the recruiting bot mentioned earlier noticed that retention dropped after 16 days, they fixed their problem by increasing the quality and relevance of jobs shown after day 12.

“Bots that offer repeat services like food delivery should focus on 7 Day Retention,” says Koksal, “whereas a content or media bot relies on daily engagement and benefits most from analyzing 1 Day Retention. If a user doesn’t like the format of the content presented, they’re unlikely to come back the next day.”
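
The article does not spell out how N Day Retention is calculated. A common convention, assumed here, is the share of users who are active again exactly N days after their first day of activity. The `activity` mapping, the sample dates, and the function name below are hypothetical.

```python
from datetime import date, timedelta


def n_day_retention(activity, n):
    """Share of users active exactly n days after their first active day.

    activity maps each user id to the set of dates that user was active.
    Assumes every user's first day is at least n days before the end of the
    observed data; newer users should be filtered out before calling this.
    """
    cohort = 0
    retained = 0
    for days in activity.values():
        if not days:
            continue
        cohort += 1
        if min(days) + timedelta(days=n) in days:
            retained += 1
    return retained / cohort if cohort else 0.0


# Made-up activity data for four users.
activity = {
    "a": {date(2017, 1, 1), date(2017, 1, 2), date(2017, 1, 8)},
    "b": {date(2017, 1, 1), date(2017, 1, 8)},
    "c": {date(2017, 1, 3)},
    "d": {date(2017, 1, 2), date(2017, 1, 3)},
}
print(n_day_retention(activity, 1))   # 0.5
print(n_day_retention(activity, 7))   # 0.5
print(n_day_retention(activity, 30))  # 0.0
```

Other conventions exist, such as counting a user as retained if they return on or after day N, so it is worth checking which definition an analytics tool uses before comparing numbers.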



The 5 bot metrics developers have found most useful are:

  1. Active / Engaged Rates
  2. Confusion Triggers
  3. Conversation Steps
  4. Average Number of Conversations Per User
  5. 1 Day, 7 Day, and 30 Day Retention

By tracking these 5 bot metrics, bot developers can develop a nuanced awareness of problem areas in their conversational flow, segment users to provide the best user experience, and boost long-term use and engagement.