Security Threats and Security Testing for Chatbots

UPDATE 2020/11/01: Botium’s free plan is live! With Botium Box Mini you will be able to:

  • use multiple chatbot technologies
  • set up test automation in a few minutes
  • enjoy a new improved user interface
  • get the benefits of a hosted, free service

Take it for a test drive

This article points out security threats and attack vectors of typical chatbot architectures, based on the OWASP Top 10 and on adversarial attacks.

The well-known OWASP Top 10 is a list of the top security threats for web applications. Most chatbots out there are available over a public web frontend, and as such all the OWASP security risks apply to those chatbot frontends as well. Two of these risks are especially important to defend against because, in contrast to the others, they are nearly always a serious threat when talking about chatbots: XSS and SQL Injection.

Recently, another kind of security threat has emerged that specifically targets NLP models: so-called “adversarial attacks”.

Cross-Site Scripting - XSS

A typical chatbot frontend works like this:

  • There is a chat window with an input box
  • Everything the user enters in the input box is mirrored in the chat window
  • Chatbot response is shown in the chat window

The XSS vulnerability lies in the second step: when the user enters text containing malicious JavaScript code, the chatbot frontend runs the injected code and the XSS attack succeeds:

<script>alert(document.cookie)</script>

This vulnerability is easy to defend against by validating and sanitizing user input, but even companies like IBM have published vulnerable code on GitHub that is still available now or was only fixed recently.
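As a minimal sketch (not the Botium implementation; render_user_message is a hypothetical helper), escaping user input before mirroring it into the chat window defuses this class of attack:

    # Minimal sketch: escape user input before mirroring it into the chat window.
    # render_user_message is a hypothetical helper, not from any real framework.
    import html

    def render_user_message(text: str) -> str:
        # html.escape turns <, >, & and quotes into harmless entities, so an
        # injected <script> tag is displayed as text instead of being executed.
        return html.escape(text, quote=True)

    print(render_user_message('<script>alert(document.cookie)</script>'))
    # -> &lt;script&gt;alert(document.cookie)&lt;/script&gt;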

Possible Chatbot Attack Vector

  1. The attacker tricks the victim into clicking a hyperlink pointing to the chatbot frontend, with some malicious code embedded in the hyperlink
  2. The malicious code is injected into the website
  3. It reads the victim's cookies and sends them to the attacker without the victim even noticing
  4. The attacker can use those cookies to get access to the victim's account on the company website

SQL Injection — SQLI

A typical chatbot backend works like this:

  • The user tells the chatbot some information item
  • The chatbot backend queries a data source for this information item
  • Based on the result, a natural language response is generated and presented to the user

With SQL Injection, the attacker may trick the chatbot backend into treating malicious content as part of the information item:

my order number is "1234; DELETE FROM ORDERS"

Developers typically trust their tokenizers and entity extractors to defend against injection attacks, but extracted entities should never be concatenated into raw query strings.
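A safer approach is to bind extracted entities as query parameters instead of concatenating them into the SQL string. Here is a minimal sketch with Python's built-in sqlite3 module; the table layout is made up for illustration:

    # Minimal sketch: parameterized queries keep extracted entities as data,
    # never as executable SQL. The orders table is made up for illustration.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT, status TEXT)")
    conn.execute("INSERT INTO orders VALUES ('1234', 'shipped')")

    order_number = '1234; DELETE FROM ORDERS'   # malicious "entity" from the user

    # Vulnerable pattern: f"SELECT status FROM orders WHERE id = '{order_number}'"
    # Safe pattern: the ? placeholder binds the value, so the injected SQL
    # is treated as a (non-matching) literal id and nothing gets deleted.
    row = conn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_number,)
    ).fetchone()
    print(row)   # -> None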

Possible Chatbot Attack Vector

  1. The attacker tells the chatbot an information item containing malicious SQL code
  2. The chatbot backend concatenates the extracted item into a query string and runs it against the data source
  3. The injected SQL statement is executed, deleting or exposing data

Adversarial Attack

An adversarial attack tries to identify blind spots in a classifier by applying tiny, in the worst case invisible, changes (noise) to the classifier's input data. A famous example is tricking an image classifier into a wrong classification by adding a small amount of noise that is invisible to the human eye.

A more dangerous real-life attack is tricking an autonomous car into ignoring a stop sign by adding some stickers to it.

An adversarial example for a picture classifier: adding a tiny amount of noise causes the model to classify this pig as an airliner. Image from this article.
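To illustrate the mechanics with a toy sketch (not the attack behind the image above), the fast gradient sign method (FGSM), one standard way to craft adversarial examples, nudges every input feature a small step in the direction that increases the model's loss:

    # Toy FGSM sketch on a logistic-regression classifier: a tiny, bounded
    # perturbation of the input pushes the prediction toward the wrong class.
    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(size=16)        # fixed "trained" weights of the toy model
    b = 0.0
    x = rng.normal(size=16)        # some clean input sample

    def predict(x):
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))   # P(class = 1)

    y = 1.0 if predict(x) > 0.5 else 0.0   # take the model's own prediction as label

    # Gradient of the cross-entropy loss with respect to the input: (p - y) * w
    grad_x = (predict(x) - y) * w
    eps = 0.3                              # small per-feature perturbation budget
    x_adv = x + eps * np.sign(grad_x)      # FGSM step

    print(f"clean: {predict(x):.3f}  adversarial: {predict(x_adv):.3f}")
    # The probability moves sharply toward the opposite class, although no
    # single feature changed by more than 0.3.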

The same concept can be applied to voice apps: some background noise, not noticed by human listeners, could trigger IoT devices in the same room to unlock the front door or place online shop orders.

When talking about text-based chatbots, the only difference is that it is not possible to totally hide the added noise from the human eye, as noise in this case means changing single characters or whole words.

There is an awesome article, “What are adversarial examples in NLP?”, from the TextAttack makers, available here.
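A naive sketch of the text variant (far simpler than TextAttack's actual search strategies): swap adjacent characters and check whether the classifier changes its prediction while a human still reads the same intent. classify() below is a stand-in for whatever intent classifier is under test:

    # Naive character-swap perturbations for probing an NLP intent classifier.
    import random

    def perturb(text: str, n_swaps: int = 1, seed: int = 42) -> str:
        rng = random.Random(seed)
        chars = list(text)
        for _ in range(n_swaps):
            i = rng.randrange(len(chars) - 1)
            chars[i], chars[i + 1] = chars[i + 1], chars[i]   # swap neighbours
        return "".join(chars)

    original = "I want to cancel my order"
    for k in range(1, 4):
        candidate = perturb(original, n_swaps=k)
        print(candidate)
        # An adversarial example is found when classify(candidate) differs
        # from classify(original) although the intent is still readable.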

Possible Chatbot Attack Vector

  • The attacker tricks the victim into playing a manipulated audio file from a malicious website
  • In the background, the voice device is activated and the commands embedded in the audio file are executed

To be honest, it is hard to imagine a real-life security threat for text-based chatbots, but two scenarios come to mind:

  • User experience is an important success factor for a chatbot. An NLP model that is not robust enough to handle typical human typing habits (typographic errors, character swapping, emojis) provides a bad user experience, even without any malicious attacker involved.
  • It could be possible to trick a banking chatbot into performing transactions with some hidden commands and at the same time to deny, based on the chatbot logs, that the transaction was intended … (I know, not that plausible …)

Security and Penetration Testing with Botium Box

Penetration Testing with OWASP ZAP Zed Attack Proxy

E2E Test Sets for SQL Injection and XSS

  • Over 70 different XSS scenarios
  • More than 10 different SQL Injection scenarios
  • Exceptional cases like character encoding, emoji flooding and more (see the sketch below)
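The idea behind such a test set can be sketched in a few lines. The endpoint URL, the message format and the ask_chatbot helper below are hypothetical, and Botium Box ships far more scenarios than this tiny sample:

    # Sketch of an E2E injection test loop against a chatbot HTTP API.
    # Endpoint and payload format are made up; the payload list is a sample.
    import requests

    CHATBOT_URL = "https://chatbot.example.com/api/message"   # hypothetical

    PAYLOADS = [
        "<script>alert(document.cookie)</script>",            # XSS probe
        'my order number is "1234; DELETE FROM ORDERS"',      # SQLi probe
        "😀" * 500,                                           # emoji flooding
    ]

    def ask_chatbot(text: str) -> str:
        resp = requests.post(CHATBOT_URL, json={"text": text}, timeout=10)
        resp.raise_for_status()
        return resp.json().get("reply", "")

    for payload in PAYLOADS:
        reply = ask_chatbot(payload)
        # A safe bot must never reflect executable markup back unescaped.
        assert "<script>" not in reply, f"possible XSS reflection: {payload!r}"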

Humanification Testing

  • simulating typing speed
  • common typographic errors based on keyboard layout
  • punctuation marks
  • and more … (see the sketch below)
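As a rough sketch of how such inputs can be generated (not Botium's implementation), typographic errors can be derived from keyboard adjacency so that the generated test inputs look like plausible human slips:

    # Sketch of keyboard-layout-based typo generation (QWERTY neighbours).
    import random

    NEIGHBOURS = {              # a small excerpt of a QWERTY adjacency map
        "a": "qwsz", "e": "wsdr", "o": "iklp", "r": "edft", "t": "rfgy",
    }

    def humanify(text: str, error_rate: float = 0.1, seed: int = 7) -> str:
        rng = random.Random(seed)
        out = []
        for ch in text:
            if ch in NEIGHBOURS and rng.random() < error_rate:
                out.append(rng.choice(NEIGHBOURS[ch]))   # hit a neighbouring key
            else:
                out.append(ch)
        return "".join(out)

    print(humanify("please transfer one hundred euros to my savings account"))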

Read more in the Botium Wiki.

Paraphrasing

Load Testing

Give Botium Box a test drive today and start with the free trial. We are happy to hear from you if you find it useful!

Botium Socks

Looking for contributors
