Upskilling Test Engineers for Chatbot Projects

If you have a hammer, every problem looks like a nail.

This is Mjolnir, Thor’s hammer
  • I have to test a WhatsApp chatbot, can you help me set up Appium for it?
  • For our client I have to test a chatbot embedded in their app, can I test it with Botium?
  • I am having trouble testing the customer support chatbot on our website, Selenium says <some random Selenium error code>
  • … and so on

End-to-end (E2E) tests that drive a chatbot through its user interface:
  1. are extremely slow to execute, as they basically run in real time. Even a medium-size chatbot project typically needs a five-figure number of test cases for satisfactory test coverage, so running those tests in an E2E scenario takes hours in the best case
  2. require a large amount of computing resources or access to expensive browser/device cloud services
  3. are flaky, as the required infrastructure is itself error-prone
  4. cannot provide a holistic view of the quality of the test object, as some important assertions, such as pure NLP performance, are technically not possible at all with E2E testing.
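
To make the first point concrete, here is a back-of-the-envelope estimate in Python. The 30 seconds per test case and the 10 parallel sessions are assumed figures for illustration only, not measurements from a real project.

```python
def e2e_runtime_hours(test_cases: int, seconds_per_case: float,
                      parallel_sessions: int = 1) -> float:
    """Estimate wall-clock hours for a real-time E2E run.

    Assumes every test case takes the same time and that the cases
    are spread evenly over the parallel sessions.
    """
    return test_cases * seconds_per_case / parallel_sessions / 3600


# Assumed figures: 10,000 test cases at 30 seconds each
print(f"{e2e_runtime_hours(10_000, 30):.1f} h")      # one session: 83.3 h
print(f"{e2e_runtime_hours(10_000, 30, 10):.1f} h")  # ten sessions: 8.3 h
```

Even with ten browser or device sessions running in parallel, which is exactly the expensive infrastructure from point 2, the run still takes most of a working day under these assumptions.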

API First

The most important metric for a chatbot is: can it hold a meaningful conversation with a client? In every chatbot project team there are conversation designers who, well, design the conversations that will make up the final user experience. The chatbot engine is trained (or coded) to provide the logic for these conversations.
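
As a rough sketch of what checking a designed conversation at API level, below the user interface, could look like: the Python snippet below replays a scripted conversation against a function that stands in for the chatbot API. The function fake_bot_api, the helper run_conversation and the sample flow are all hypothetical. In a real project the stand-in would be replaced by a call to the chatbot's actual API, and a tool like Botium would take over the scripting and reporting.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (user says, text expected somewhere in the bot reply)


def fake_bot_api(user_text: str) -> str:
    """Hypothetical stand-in for a request to the chatbot's API."""
    text = user_text.lower()
    if "hello" in text:
        return "Hi! How can I help you?"
    if "order" in text:
        return "Sure, what is your order number?"
    return "Sorry, I did not understand that."


def run_conversation(send: Callable[[str], str], turns: List[Turn]) -> List[str]:
    """Send each user turn and collect one message per mismatch."""
    failures = []
    for step, (user_text, expected) in enumerate(turns, start=1):
        reply = send(user_text)
        if expected.lower() not in reply.lower():
            failures.append(f"step {step}: expected {expected!r} in {reply!r}")
    return failures


# A conversation as a conversation designer might specify it (made up here)
order_status_flow = [
    ("Hello", "how can I help"),
    ("Where is my order?", "order number"),
]

failures = run_conversation(fake_bot_api, order_status_flow)
print("PASS" if not failures else failures)
```

Because no browser or device is involved, a check like this runs in milliseconds rather than in real time, which is what makes five-figure test suites practical.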

Conversation flow as visualized by Botium

Testing the NLP engine

Most chatbots have some kind of natural language processing (NLP) component as part of the processing pipeline. It enables users to communicate with the chatbot in natural language, and that is what actually makes up a chatbot. As a test engineer, it is your job to explore the limits of the NLP engine, and this requires a basic understanding of machine learning concepts, such as

  • intents, entities and prediction confidence
  • accuracy, sensitivity, specificity, precision, recall, F1-score
  • confusion matrix
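
To show how these concepts fit together, here is a minimal Python sketch that builds a confusion matrix from test results and derives accuracy, precision, recall and F1-score per intent. The intent names and the result pairs are invented for illustration. In practice the predicted intents would come from the NLP engine under test.

```python
from collections import Counter

# Hypothetical test results: (expected intent, intent predicted by the NLP engine)
results = [
    ("greeting", "greeting"),
    ("greeting", "greeting"),
    ("greeting", "order_status"),
    ("order_status", "order_status"),
    ("order_status", "order_status"),
    ("order_status", "cancel_order"),
    ("cancel_order", "cancel_order"),
    ("cancel_order", "order_status"),
]

# Confusion matrix stored as (expected, predicted) -> count
confusion = Counter(results)
intents = sorted({label for pair in results for label in pair})


def scores(intent):
    """Return (precision, recall, f1) for one intent."""
    tp = confusion[(intent, intent)]
    fp = sum(confusion[(other, intent)] for other in intents if other != intent)
    fn = sum(confusion[(intent, other)] for other in intents if other != intent)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


accuracy = sum(confusion[(i, i)] for i in intents) / len(results)
print(f"accuracy: {accuracy:.3f}")
for intent in intents:
    p, r, f = scores(intent)
    print(f"{intent:13s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Recall is the same quantity as sensitivity. In this made-up data set the order_status intent has a precision of only 0.5, because utterances from two other intents were wrongly classified as order_status, which is the kind of finding a confusion matrix makes visible.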

E2E Smoketest

Testing the end-user experience at user interface level is an important part of a testing strategy. If the previous steps were done right, you now have confidence that the conversation flow and the NLP component are doing their work, so it is time to add some user interface testing to the mix. The recommendation is to

  • write a small number of test cases that together cover all of the possible user interaction elements
  • run those tests on a mix of representative browser versions / operating systems / smartphone devices, both virtual and physical
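
The two recommendations can be sketched as a small test matrix in Python. The interaction elements and platforms listed here are hypothetical examples. The real lists depend on what the chatbot offers and on the devices its users actually have.

```python
from itertools import product

# Hypothetical examples; replace with what your chatbot and user base require
interaction_elements = ["text input", "quick reply button", "carousel", "file upload"]
platforms = [
    ("Chrome", "Windows", "virtual"),
    ("Safari", "iOS", "physical device"),
    ("Chrome", "Android", "physical device"),
]

# One smoke test per combination of interaction element and platform
smoke_tests = [
    {"element": element, "browser": browser, "os": os_name, "device": device}
    for element, (browser, os_name, device) in product(interaction_elements, platforms)
]

print(len(smoke_tests))  # 4 elements x 3 platforms = 12
```

Keeping both lists short keeps the slow, flaky E2E layer down to a dozen or so runs, while conversation flow and NLP quality are covered by the faster tests described earlier.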

Non-Functional Testing

Finally, there are also non-functional tests, such as performance tests and security tests, to add to the test mix. In contrast to the other test types, these are typically run at certain milestones in the project.


A new generation of apps such as chatbots requires a new generation of testing tools, like Botium. Test engineers have to develop additional skills for testing conversational interfaces like chatbots.

Botium Test Project Types




Co-Founder and CTO Botium🤓 — Guitarist 🎸 — 3xFather 🐣

Florian Treml