Vadim Smirnov for CKEditor

Posted on Feb 28

How to Detect Bot Traffic using Next.js Middleware: A Quick Guide

#nextjs #javascript #security #webdev

Bots make up a significant portion of internet traffic. Some, like Googlebot, are beneficial, while others — such as scrapers — can harm your site. In this article, we'll explore how to detect and block bots using Next.js Middleware.

What is Bot Detection and Why Does it Matter?

Not all web traffic is human. By some industry estimates, almost half of web traffic is automated. Bots can:

Scrape content
Spam your website
Slow down your app

Why Should You Implement Bot Detection?

From a security point of view, bot detection is an essential part of your toolset.

With bot detection, you can improve security by blocking malicious traffic, reduce server load by filtering out unnecessary requests, and keep analytics accurate by excluding bot traffic from your metrics.

Since Next.js Middleware runs before a request is processed, it's a natural place to implement bot detection.

What is Middleware in Next.js?

Middleware in Next.js lets you run code before a request is completed, so you can inspect the incoming request and decide how to respond.

By default, Middleware runs for every route, but you can limit it to specific routes. Middleware also works well on platforms like Vercel and Netlify.

How Next.js Middleware Works

Let's simplify the explanation by identifying key facts:

Middleware must be placed in the project root as a middleware.ts or middleware.js file, depending on whether you use TypeScript.
It automatically applies to all routes unless configured otherwise.
It uses NextResponse to perform actions such as redirecting, rewriting, or modifying request and response headers.
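To illustrate the "unless configured otherwise" part, here is a minimal sketch of a route filter. It assumes the `config.matcher` export that Next.js reads from the middleware file; the specific pattern is only an example and should be adapted to your own routes.

```typescript
// middleware.ts (sketch): limit which routes the Middleware runs on.
// The pattern below is an example: it matches every path except
// Next.js build assets, optimized images, and the favicon.
export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
```

With a matcher like this in place, requests for excluded assets never reach the Middleware function at all.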

A Simple Middleware Implementation

Before we add bot detection, let's create a basic Middleware implementation to log requests. First, we need to create middleware.ts in the root directory.

For a basic implementation, our Middleware will do the following:

Log every incoming request in the terminal.
Pass the request through without modification.

import { NextResponse } from "next/server";

// Path prefixes and file extensions that identify static assets; we don't log these.
const EXCLUDED_PATHS = ["/_next/", "/static/"];
const EXCLUDED_EXTENSIONS = [".svg", ".js", ".css", ".ico", ".png", ".jpg", ".jpeg", ".gif", ".webp", ".woff2"];

export function middleware(req: Request) {
  const url = new URL(req.url);

  // Let Next.js internals and static assets through without logging.
  if (EXCLUDED_PATHS.some(path => url.pathname.startsWith(path)) || EXCLUDED_EXTENSIONS.some(ext => url.pathname.endsWith(ext))) {
    return NextResponse.next();
  }

  // Log the request, then continue to the matched route unchanged.
  console.log(`Request received: ${req.url}`);
  return NextResponse.next();
}

Now, let's run our Next.js application to verify that our Middleware is functioning as expected.

Start your Next.js app by running npm run dev in the terminal. Then visit localhost:3000 in the browser and check the terminal. You should see the following log:

Request received: http://localhost:3000/
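The static-asset check in the Middleware above can also be pulled out into a small pure function, which makes it easy to test without starting the app. This is a sketch, not part of Next.js: `isStaticAsset` is a hypothetical helper name, and lowercasing the path (so that `/logo.PNG` is also skipped) is an added assumption that the original check does not make.

```typescript
const EXCLUDED_PATHS = ["/_next/", "/static/"];
const EXCLUDED_EXTENSIONS = [".svg", ".js", ".css", ".ico", ".png", ".jpg", ".jpeg", ".gif", ".webp", ".woff2"];

// Hypothetical helper: returns true when the path points at a static asset
// or a Next.js internal, i.e. a request we don't want to log or inspect.
function isStaticAsset(pathname: string): boolean {
  const lower = pathname.toLowerCase(); // assumption: treat ".PNG" like ".png"
  return (
    EXCLUDED_PATHS.some((prefix) => lower.startsWith(prefix)) ||
    EXCLUDED_EXTENSIONS.some((ext) => lower.endsWith(ext))
  );
}
```

Inside the Middleware, the long condition could then be replaced with `isStaticAsset(new URL(req.url).pathname)`.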

Extend Middleware with Bot Detection

Now, let's improve our Middleware to detect bot-like User-Agents. We can do that by updating middleware.ts with Bot Detection:

import { NextResponse } from "next/server";

// Path prefixes and file extensions that identify static assets; we skip these.
const EXCLUDED_PATHS = ["/_next/", "/static/"];
const EXCLUDED_EXTENSIONS = [".svg", ".js", ".css", ".ico", ".png", ".jpg", ".jpeg", ".gif", ".webp", ".woff2"];
// Case-insensitive keywords that commonly appear in bot User-Agent strings.
const BOT_PATTERNS = [/bot/i, /crawler/i, /spider/i, /curl/i, /wget/i];

export function middleware(req: Request) {
  const url = new URL(req.url);
  // A missing User-Agent header is treated as an empty string.
  const userAgent = req.headers.get("user-agent") || "";

  // Let Next.js internals and static assets through untouched.
  if (EXCLUDED_PATHS.some(path => url.pathname.startsWith(path)) || EXCLUDED_EXTENSIONS.some(ext => url.pathname.endsWith(ext))) {
    return NextResponse.next();
  }

  // Redirect any request whose User-Agent matches a bot keyword.
  if (BOT_PATTERNS.some(pattern => pattern.test(userAgent))) {
    console.log(`Bot detected! (${userAgent}) - Redirecting to /bot-detected.`);
    return NextResponse.redirect(new URL("/bot-detected", req.url));
  }

  console.log(`Request received: ${req.url} | User-Agent: ${userAgent}`);

  return NextResponse.next();
}

Let's review what changed in this version of our Middleware:

We extract the User-Agent header from the request.
We check whether it matches one of the bot keywords (bot, crawler, spider, etc.).
We redirect detected bots to /bot-detected.
We log each request's User-Agent for debugging.
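Note that the keyword check above also matches beneficial crawlers such as Googlebot, because their User-Agent contains "bot". One possible variant, sketched below, checks an allowlist first. The names `classifyUserAgent`, `AgentKind`, and `ALLOWED_BOT_PATTERNS` are hypothetical, and the allowlist entries are only examples. Since the User-Agent header can be spoofed, a match here does not prove the request really comes from that crawler.

```typescript
type AgentKind = "allowed-bot" | "blocked-bot" | "unknown";

// Example allowlist (assumption): crawlers we choose not to redirect.
const ALLOWED_BOT_PATTERNS = [/googlebot/i, /bingbot/i];
// Same keywords as in the Middleware above.
const BOT_PATTERNS = [/bot/i, /crawler/i, /spider/i, /curl/i, /wget/i];

// Hypothetical helper: classify a User-Agent string by keyword only.
function classifyUserAgent(userAgent: string): AgentKind {
  // Check the allowlist first, because "Googlebot" also matches /bot/i.
  if (ALLOWED_BOT_PATTERNS.some((pattern) => pattern.test(userAgent))) {
    return "allowed-bot";
  }
  if (BOT_PATTERNS.some((pattern) => pattern.test(userAgent))) {
    return "blocked-bot";
  }
  return "unknown";
}
```

With this variant, the Middleware would redirect only when the result is "blocked-bot". The Middleware as written in this article redirects every match, including Googlebot.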

Test the Middleware

Now, let's test our Middleware by using curl to simulate a bot request. Keep the Next.js application running locally and run the following in a separate terminal:

curl -i -A "Googlebot" http://localhost:3000

If you implemented the Middleware correctly, you should see a response similar to the following in the terminal:

HTTP/1.1 307 Temporary Redirect
location: /bot-detected
Date: Mon, 24 Feb 2025 17:00:34 GMT
Connection: keep-alive
Keep-Alive: timeout=5
Transfer-Encoding: chunked

/bot-detected

As you can see, the bot detection works!

Review of Common Bot Detection Techniques

Now that our basic bot detection Middleware works, let’s explore some common techniques.

User-Agent Header Checking

This is the approach we implemented above as a simple demonstration of bot detection.

It is straightforward and works well for demo purposes. However, the User-Agent header is set by the client, and many bots send User-Agent strings that mimic real browsers, which makes this approach insufficient on its own for real-life applications.
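The limitation is easy to demonstrate with the same keyword list. In this sketch, `looksLikeBot` is a hypothetical helper and the browser-style string is just an example of what a script could send. A client that copies a browser User-Agent, or sends none at all, passes the check.

```typescript
const BOT_PATTERNS = [/bot/i, /crawler/i, /spider/i, /curl/i, /wget/i];

// Hypothetical helper: the same keyword test the Middleware performs.
function looksLikeBot(userAgent: string): boolean {
  return BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
}

// A scraper can send any header it likes, including a browser-style one.
const spoofedUserAgent =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

console.log(looksLikeBot("Googlebot/2.1"));   // true: contains "bot"
console.log(looksLikeBot(spoofedUserAgent));  // false: no keyword matches
console.log(looksLikeBot(""));                // false: a missing header is not flagged
```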

IP Address Filtering

This approach can be more effective and can be applied to production applications. However, you can't rely on a list of addresses you gathered manually, so you need an API that analyzes IP addresses.

That approach offers better protection but may be ineffective for more advanced bots.
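As a rough sketch of the mechanics, the helpers below read the client address from an X-Forwarded-For value and test it against IPv4 CIDR ranges. All names here are hypothetical, and the blocked range is a placeholder from the documentation address space. In a real application the ranges or verdicts would come from an IP intelligence API rather than a hard-coded list. The sketch also assumes the header is set by your hosting platform or proxy; a value supplied directly by the client cannot be trusted.

```typescript
// Take the first address from a value such as "203.0.113.42, 10.0.0.1".
function clientIpFromForwardedFor(headerValue: string | null): string | null {
  if (!headerValue) return null;
  const first = headerValue.split(",")[0]?.trim() ?? "";
  return first === "" ? null : first;
}

// Convert a dotted IPv4 string to an unsigned 32-bit number; null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.trim().split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when the IPv4 address falls inside the CIDR range, e.g. "203.0.113.0/24".
function isInCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  if (slash === -1) return false;
  const prefixText = cidr.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText)) return false;
  const prefix = Number(prefixText);
  const ipInt = ipv4ToInt(ip);
  const rangeInt = ipv4ToInt(cidr.slice(0, slash));
  if (ipInt === null || rangeInt === null || prefix > 32) return false;
  // Addresses in the same block share the same quotient by the block size.
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ipInt / blockSize) === Math.floor(rangeInt / blockSize);
}

// Placeholder range (assumption); replace with data from an IP analysis service.
const BLOCKED_RANGES = ["203.0.113.0/24"];

function isBlockedIp(ip: string | null): boolean {
  if (ip === null) return false;
  const address = ip;
  return BLOCKED_RANGES.some((cidr) => isInCidr(address, cidr));
}
```

In the Middleware, this could be called as `isBlockedIp(clientIpFromForwardedFor(req.headers.get("x-forwarded-for")))` before deciding whether to redirect.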

Third-Party Bot Detection APIs

Building on the previous approach, if we want enterprise-grade protection, a dedicated bot detection service like BotD or DataDome is the most robust option and typically offers higher detection accuracy than the techniques above.

While these solutions are not free, they are a good fit for production-ready applications.

If you want to get started with one of the solutions mentioned above, Vercel provides starter projects that let you see both of them in action.

Conclusion

Bots can have a huge impact on your website, from content scraping and spam to security risks and performance issues. Fortunately, Next.js Middleware provides an efficient way to filter out unwanted traffic before it reaches your app.

While basic bot detection is a great starting point, bots can bypass simple User-Agent checks. If you need stronger protection, consider third-party services like BotD or DataDome to detect even the most advanced bots.

Now, it's time for you to find the optimal solution for your needs!
