
January 9, 2021

Bad Bots: What They Are and How to Fight Them

in Hashing Out Cyber Security

Bad internet bot traffic rose by 18.1% in 2019, and it now accounts for nearly one-quarter of all internet traffic


Note: This article, which was originally published in 2019, has been updated to include related news & media resources.

The figure above, which comes from Imperva’s 2020 Bad Bot Report, should serve as a warning to all internet users, especially companies and organizations that maintain their own online infrastructure, to take this problem seriously.

The pervasiveness of malicious bots (or “bad bots” for short) not only places additional strain on networks and drives up infrastructure costs; it also indicates rampant cyberattacks and other malicious activity by cybercriminals and threat groups.

Imperva’s report also reveals concerning trends such as attempts to rebrand bad bots as legitimate services, the growth of massive bot-driven credential stuffing attacks, and the increasing complexity of bot operations. Considering these developments, it’s only prudent to understand what bad bots are and how to fight them.

Let’s hash it out.

What Are Internet Bots and How Are They Used?

Graphic illustration representing internet bots

Simply put, internet bots are software applications that are designed to automate many tedious and mundane tasks online. They’ve become an integral part of what makes the internet tick and are used by many internet applications and tools.

For example, internet search engines like Google rely on bots that crawl through web content in order to index information. Bots go through millions of web pages’ text to find and index terms that these pages contain. So, when a user searches for a particular term, the search engine will know which pages contain that particular information.
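To make the indexing idea concrete, here is a minimal sketch of an inverted index, the kind of term-to-page lookup described above. It is an illustration only: the URLs and page text are made up, the helper names `build_index` and `search` are hypothetical, and real search engines do far more (ranking, stemming, deduplication) than this toy does.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text (stand-ins for fetched content).
PAGES = {
    "https://example.com/flights": "Cheap flights and hotel deals",
    "https://example.com/hotels": "Hotel rooms and hotel reviews",
    "https://example.com/bots": "What bots are and how bots crawl pages",
}


def build_index(pages):
    """Map each lowercase term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(url)
    return index


def search(index, term):
    """Return the sorted list of URLs that contain the term."""
    return sorted(index.get(term.lower(), set()))


index = build_index(PAGES)
print(search(index, "hotel"))
```

Because the index is built once ahead of time, answering a query is a single dictionary lookup rather than a fresh scan of every page.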

Travel aggregators use bots to continuously check and gather information on flight details and hotel room availability so that they can display the most up-to-date information for users. This means that users no longer need to check different websites individually. The aggregators’ bots consolidate all of the information, allowing the service to display the data all at once.

Thanks to developments in artificial intelligence and machine learning, bots are also being used to complete more complex tasks. Business intelligence services use bots to crawl through product reviews and social media comments to provide insights on how a particular brand is perceived.

How Bots Can Positively (and Negatively) Impact Your Organization

Imagine if these tasks were done manually by a human. It would be quite a slow and error-prone process. By using bots, these tasks are completed quickly and more accurately. This frees up your organization’s “human assets” to collaborate and focus on higher-level projects and goals.

Bots have an impact on the infrastructure of websites and applications they come into contact with. Since bots essentially “visit” websites, they consume computing resources such as server capacity and bandwidth. Because of this, even good bots can inadvertently cause harm. An aggressive search engine or aggregator bot can take down a site with limited resources. Fortunately, proper site configuration can prevent this from happening.

What Are Bad Bots?

In general, bot activity is already something that most organizations have been dealing with for years. However, what’s worrisome is the traffic that comes from the “bad bots” — the bots that have been appropriated by malicious actors to serve as tools for various hacking and fraud campaigns.

The most common uses for bad bots include:

Web scraping: Hackers can steal web content by crawling websites and copying their entire contents. Fake or fraudulent sites can use the stolen content to appear legitimate and trick visitors.
Data harvesting: Aside from stealing entire websites’ content, bots are also used to harvest specific data such as personal, financial, and contact information that can be found online.
Price scraping: Product prices can also be scraped from ecommerce websites so that competitors can use them to undercut the original seller.
Brute-force logins and credential stuffing: Malicious bots interact with pages containing log-in forms and attempt to gain access to sites by trying out different username and password combinations, which in credential stuffing are typically taken from previously leaked credential lists.
Digital ad fraud: Hackers can game pay-per-click (PPC) advertising systems by using bots to “click” on ads on a page. Unscrupulous site owners can earn from these fraudulent clicks.
Spam: Bots can also automatically interact with forms and buttons on websites and social media pages to leave phony comments or false product reviews.
Distributed denial-of-service attacks: Malicious bots can be used to overwhelm a network or server with immense amounts of traffic. Once the allotted resources are used up, sites and applications supported or hosted by the network become inaccessible to legitimate users.

Hackers are also becoming more sophisticated and creative in how they use these bots. To start, they’re designing bots that can circumvent conventional bot mitigation solutions, which makes them harder to detect. Some enterprising parties even build seemingly legitimate services on top of bad bots, for example services that help buyers get ahead of queues in time-sensitive transactions such as buying limited edition products or event tickets.

Hackers can perform these activities on a massive scale through the use of botnets, which are networks of devices capable of running bots. Many of these devices were compromised in previous hacks. The Mirai botnet, which was responsible for several massive denial-of-service attacks, was composed of tens of thousands of compromised internet-of-things (IoT) devices such as IP cameras and routers.

Industries are suffering from these bad bots. According to the same Imperva report, the sectors with the highest share of bad bot traffic are financial services (47.7%), education (45.7%), IT and services (45.1%), and marketplaces (39.8%). In these industries, bots look to breach accounts through brute force, steal intellectual property, and scrape prices.

Malicious Bots: How Do We Fight Them?

Falling victim to bad bots can have serious consequences. Aside from the computing resources it consumes, bot traffic can affect business performance. Price scraping can leave businesses at a pricing disadvantage against their competitors. Content scraping can hurt search rankings. Spam can affect a site’s image and credibility in the eyes of search engines.

Getting breached can open up networks to other forms of cyber attacks, including data theft and ransomware. Clearly, steps must be taken to prevent bad bots from running rampant.

Here are three critical measures to fight back against these bad bots:

1. Recognize the Problem

Organizations must be proactive in dealing with bad bots. This starts with recognizing and identifying the problem. IT teams can assess whether their networks are being attacked by bots by reviewing their web analytics and traffic.

Bad bots graphic that shows a breakdown of website traffic for bots and humans — Image Source: Imperva

Spikes in bandwidth consumption and log-in attempts can be signs of increased bot activities. Traffic from unusual countries of origin can also hint at bad bots probing a site for vulnerabilities. Checking IP addresses and geolocations of traffic sources can reveal potential bot activity.
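As a rough illustration of spotting a spike in log-in attempts, the sketch below scans access-log entries and flags any IP address that hits a log-in path too often within a short window. Everything specific here is an assumption made for the example: the log entries are made up and already parsed, the IP addresses come from ranges reserved for documentation, and the `/login` path, one-minute window, threshold of five, and the function name `suspicious_ips` are hypothetical choices rather than recommended values.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, pre-parsed access-log entries: (timestamp, client IP, request path).
LOG = [
    (datetime(2021, 1, 9, 10, 0, s), "203.0.113.7", "/login") for s in range(0, 50, 5)
] + [
    (datetime(2021, 1, 9, 10, 0, 3), "198.51.100.4", "/login"),
    (datetime(2021, 1, 9, 10, 5, 0), "198.51.100.4", "/products"),
]


def suspicious_ips(entries, path="/login", window=timedelta(minutes=1), threshold=5):
    """Return IPs that hit `path` more than `threshold` times within any `window`."""
    times_by_ip = defaultdict(list)
    for timestamp, ip, request_path in entries:
        if request_path == path:
            times_by_ip[ip].append(timestamp)

    flagged = set()
    for ip, times in times_by_ip.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Move the left edge forward until the span fits inside `window`.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 > threshold:
                flagged.add(ip)
                break
    return sorted(flagged)


print(suspicious_ips(LOG))
```

A flagged address is only a lead, not proof of a bot. Shared office networks and mobile carriers can legitimately produce bursts from one IP, so results like these are best combined with the geolocation and bandwidth signals mentioned above.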

Bad bots graphic that is a visual breakdown of global traffic by region — Image Source: Google Analytics

Business performance can also be an indication of malicious bot activity. For instance, a sudden drop in conversion rates for ecommerce sites can point to price scraping, since scraper visits add traffic without adding purchases.

2. Employ Defensive and Protective Measures

It’s critical for organizations to adopt and enhance cybersecurity measures that protect their respective infrastructure. Among the best practices to implement are:

Using robots.txt. A robots.txt file placed in the root directory of a website can prevent bots such as search engine crawlers from overloading it with requests. The file essentially tells bots which pages they may and may not crawl. However, it’s important to note that robots.txt only helps with legitimate crawlers that honor such directives and doesn’t keep bad bots out. Still, it can help prevent overly aggressive crawlers from taking sites down.
Using challenges to distinguish between human users and bot traffic. Bots can be programmed to automatically fill out forms in order to spam websites and web applications or carry out credential stuffing against them. Using challenges that require human input or user validation, such as CAPTCHA, can help prevent bots from completing their intended attacks.
Adopting network protection solutions. In most cases, it’s best for organizations to invest in more advanced forms of protection. Cloud application security solutions and cloud-based web application firewalls (WAFs) now employ advanced methods to stop bot traffic from even interacting with a site. These solutions are capable of identifying and blocking bots according to their behaviors, origins, and signatures. Some industry-leading solutions are even capable of preventing massive DDoS attacks from causing any downtime to sites under their protection.
Deploying strict access controls. Multi-factor authentication requires users to provide additional credentials such as one-time passwords (OTPs), which deters bot attacks such as credential stuffing. Using identity and access management (IAM) also allows administrators to strictly define which resources within their network can be accessed by specific user accounts. This way, if a bot “cracks” the credentials of one account, its access to the network is still limited, which minimizes the potential damage.
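To show what the robots.txt measure looks like from a well-behaved crawler’s side, here is a small sketch using Python’s standard `urllib.robotparser` module. The robots.txt content, the paths, the crawl delay, and the `ExampleBot` name are all made up for illustration. Note also that `Crawl-delay` is a non-standard directive that some crawlers ignore.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and crawl delay are illustrative.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler asks before fetching each URL.
print(parser.can_fetch("ExampleBot", "https://example.com/products/"))
print(parser.can_fetch("ExampleBot", "https://example.com/admin/users"))

# It also reads the requested pause (in seconds) between requests, if any.
print(parser.crawl_delay("ExampleBot"))
```

The check happens entirely on the crawler’s side, which is why robots.txt does nothing against bots that simply choose not to ask.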

3. Monitor and Test Security

It’s important to constantly monitor and test the behavior of all security measures that are put in place. Misconfiguration or faulty implementation does happen. As such, checks like penetration tests and attack simulations should be performed routinely to verify if the measures work as intended. Adopting even the most expensive tools and solutions would only lead to waste if they are improperly configured.

Bad bots graphic: Google reCAPTCHA image showing the number of site requests and site traffic to help identify suspicious activity — Image Source: Google reCAPTCHA

It’s also crucial to test whether the measures are having a negative effect on business goals. Poorly configured bot detection can prevent good bots from getting through. Blocking search engine crawlers can quickly tank a site’s ranking. If a site relies on partnerships with aggregators to drive its business, inadvertently blocking aggregator bots can likewise break the service altogether.
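One way to catch this kind of misconfiguration early is a simple regression check that runs a blocking rule against a list of crawlers the site wants to keep. The sketch below is an assumption-laden illustration: both rules and the helper `wrongly_blocked` are hypothetical, and `BadScraper` is a made-up client name. User-agent strings are also easy to spoof, so a real deployment would not rely on them alone to identify good bots.

```python
import re

# Hypothetical blocking rule of the kind a misconfigured filter might use.
NAIVE_RULE = re.compile(r"bot|crawler|spider", re.IGNORECASE)
# A narrower hypothetical rule that targets only specific unwanted clients.
NARROW_RULE = re.compile(r"python-requests|curl|BadScraper", re.IGNORECASE)

# User-agent strings of crawlers that a site would normally want to let through.
GOOD_BOTS = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
]


def wrongly_blocked(rule, user_agents):
    """Return the user agents from the allow-list that the rule would block."""
    return [ua for ua in user_agents if rule.search(ua)]


print(len(wrongly_blocked(NAIVE_RULE, GOOD_BOTS)))
print(len(wrongly_blocked(NARROW_RULE, GOOD_BOTS)))
```

Running a check like this whenever blocking rules change gives an early warning before a rule that is too broad reaches production and starts turning search engines away.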

Final Thoughts on Bad Bots

Site owners should pay close attention to their traffic considering how malicious bots continue to run rampant. Left unchecked, bad bot traffic can evolve from a nuisance to something more serious such as a full-on cyber attack in no time. Knowing how to mitigate bad bot traffic can help to safeguard your infrastructure and create a more secure internet for everyone.

Updated on January 9, 2021

Recent News Related to Bot Traffic

Why Has Click Fraud Had a Busy 2020

Cybersecurity Firm Helps Advertisers Avoid ‘Click Fraud’

Goldman Sachs leads acquisition of bot mitigation company White Ops, by Paul Sawers
