Aadarsh Nagrath

Posted on Dec 11

Top 10 Common Database Performance Mistakes in API Development

Building an API server is an exciting endeavor, but it comes with its share of challenges—particularly when it comes to database performance. Regardless of the technology stack, developers often stumble into the same performance pitfalls time and again. These mistakes can be avoided with some foresight and careful planning.

In this article, I’ll walk you through common database performance mistakes in API development, how to identify them, and strategies to address them effectively.

Mistake #1: Querying Information That Rarely Changes

During my work on APIs, I’ve encountered scenarios where the same static data was queried repeatedly—an unnecessary drain on resources. For example, if your application uses data like country codes or configuration flags that seldom change, there’s no need to hit the database for every request.

A better approach is to load such static datasets into memory when the server starts and access them from a singleton. Here’s an example in pseudocode:

private static Task<List<Country>>? _cachedCountries = null;

private Task<List<Country>> GetCachedCountries()
{
    if (_cachedCountries == null) {
        _cachedCountries = Database.Countries.ToListAsync();
    }
    return _cachedCountries;
}

Storing static data in memory reduces unnecessary database queries, improves response time, and keeps your API performant.

Mistake #2: Status Checks That Overuse the Database

Health checks are essential for monitoring an API server’s status, ensuring it’s configured correctly, connected to external services, and has valid database credentials. However, using database queries in health checks can overwhelm your system if the status page is called frequently.

A smarter approach is to cache the result of database checks for a short duration, like 30 seconds. This reduces the query load without compromising on functionality:

public static DateTime LastCheckTime = DateTime.MinValue;
public const int SECONDS_FOR_RETEST = 30;

public static bool Status()
{
    var now = DateTime.UtcNow;
    if ((now - LastCheckTime).TotalSeconds > SECONDS_FOR_RETEST) {
        // Perform database checks here
        LastCheckTime = now;
    }
    return true;
}

Mistake #3: Excessive Queries for API Authentication

API authentication often involves multiple database calls to validate a user’s credentials, permissions, and preferences. These queries can create a bottleneck when users make frequent requests.

How to Address:

Use a hash table in memory for the user’s credentials as the first check.
Fall back to a fast key-value store like Redis.
If neither has the data, query the database and cache the result.

public async Task<bool> AuthenticateUser(string userId)
{
    if (_memoryCache.TryGetValue(userId, out var user)) {
        return ValidateUser(user);
    }

    var userData = await _redis.GetAsync(userId) ?? await _database.GetUserAsync(userId);
    _memoryCache.Set(userId, userData, TimeSpan.FromMinutes(5));
    return ValidateUser(userData);
}

Mistake #4: Not Indexing Properly

Indexes significantly speed up data retrieval operations, but many developers either forget to add indexes or add them without proper planning.

How to Address:

Analyze your query patterns and add indexes to frequently queried columns.
Use compound indexes for queries involving multiple columns.

Mistake #5: Using Object Relational Mappers Inefficiently

Object Relational Mappers (ORMs) simplify database interactions but can inadvertently lead to performance issues. For example, avoid making database calls inside a loop:

// Inefficient Example
foreach (var item in records)
{
    count += await GetCountAsync(item);
}

// Optimized Example
count = await _database.GetAggregateCountAsync(ids);

Mistake #6: Ignoring the Impact of “Fast” Queries

Even fast queries can add up. A single query taking 1 millisecond might seem negligible, but ten such queries in an API request add 10 milliseconds of latency.

How to Address:

Minimize the total number of queries per request.
Combine queries when possible.

Even seemingly negligible queries can accumulate latency.

Code Example:

// Example of reducing multiple fast queries into one combined query
// Before: Multiple fast queries
public async Task<List<User>> GetUsersWithDetails(List<int> userIds) {
    var users = new List<User>();
    foreach (var id in userIds) {
        users.Add(await _database.Users.FindAsync(id));
    }
    return users;
}

// After: Combined query
public async Task<List<User>> GetUsersWithDetails(List<int> userIds) {
    return await _database.Users.Where(u => userIds.Contains(u.Id)).ToListAsync();
}

This approach reduces the number of database round trips from n queries to a single query.

Mistake #7: Overusing Joins in Large Datasets

Excessive or poorly structured joins on large tables can degrade performance.

How to Address:

Use denormalization judiciously.
Break down complex queries into smaller parts.

Overusing joins can lead to performance bottlenecks.

Code Example:

-- Before: Complex join with large datasets
SELECT orders.id, customers.name
FROM orders
JOIN customers ON orders.customer_id = customers.id
WHERE customers.country = 'USA';

-- After: Denormalized dataset
SELECT id, customer_name
FROM orders_with_customer_names
WHERE country = 'USA';

By creating a denormalized or materialized view (orders_with_customer_names), you can eliminate the need for repetitive joins in high-traffic queries.

Mistake #8: Neglecting Connection Pooling

Failing to manage database connections efficiently can result in bottlenecks. Repeatedly opening and closing connections increases latency.

How to Address:

Use connection pooling provided by your database driver or framework.
Fine-tune the connection pool settings based on your application’s load patterns.

Efficient connection pooling is crucial for handling high-concurrency applications.

Code Example:

# Example using a connection pool with SQLAlchemy in Python
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Using a connection pool
engine = create_engine(
    "postgresql+psycopg2://user:password@localhost/dbname",
    pool_size=10,  # Maximum connections in the pool
    max_overflow=5,  # Connections beyond the pool size
    pool_timeout=30  # Timeout in seconds
)
Session = sessionmaker(bind=engine)
session = Session()

# Using the session
result = session.execute("SELECT * FROM users LIMIT 10")

This configuration allows the application to reuse connections, reducing the overhead of creating new ones.

Mistake #9: Not Monitoring and Profiling Queries

Without actively monitoring query performance, developers often miss slow or expensive queries.

How to Address:

Use tools like EXPLAIN or EXPLAIN ANALYZE to understand query execution plans.
Set up automated query performance monitoring and alerts.

Proactively monitoring queries helps identify inefficiencies.

Code Example:

-- Profiling a query
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE status = 'Pending';

-- Output shows the query execution plan and time, enabling optimization.

Set up query monitoring in your application, for example, with an ORM:

// Example in Entity Framework Core
var query = _context.Users.Where(u => u.IsActive);
var sql = query.ToQueryString(); // Outputs the SQL query string for review

Mistake #10: Overlooking Database Write Costs

Frequent write operations, such as logging or updating audit tables, can create contention.

How to Address:

Batch writes where possible.
Use asynchronous operations for non-critical writes.
Optimize transactional scope to avoid holding locks unnecessarily.

Proactively monitoring queries helps identify inefficiencies.

Code Example:

-- Profiling a query
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE status = 'Pending';

-- Output shows the query execution plan and time, enabling optimization.

Set up query monitoring in your application, for example, with an ORM:

// Example in Entity Framework Core
var query = _context.Users.Where(u => u.IsActive);
var sql = query.ToQueryString(); // Outputs the SQL query string for review

Conclusion

Database performance issues can undermine an otherwise robust API. By caching static data, optimizing health checks, reducing redundant queries, and using ORMs efficiently, you can build APIs that are fast, reliable, and scalable.

If you’re tackling similar challenges, feel free to reach out. Let’s build better APIs together!

DEV Community

Top 10 Common Database Performance Mistakes in API Development

Mistake #1: Querying Information That Rarely Changes

Mistake #2: Status Checks That Overuse the Database

Mistake #3: Excessive Queries for API Authentication

How to Address:

Mistake #4: Not Indexing Properly

How to Address:

Mistake #5: Using Object Relational Mappers Inefficiently

Mistake #6: Ignoring the Impact of “Fast” Queries

How to Address:

Code Example:

Mistake #7: Overusing Joins in Large Datasets

How to Address:

Code Example:

Mistake #8: Neglecting Connection Pooling

How to Address:

Code Example:

Mistake #9: Not Monitoring and Profiling Queries

How to Address:

Code Example:

Mistake #10: Overlooking Database Write Costs

How to Address:

Code Example:

Conclusion

Top comments (0)

Read next

Best AI project ideas, in google's opinion 🫶

🚀 Attention Developers: The Unwritten Laws of Software Development You NEED to Know! 🚀

Circle Cursor Js #GSAP

I've tried v0 and Figma, but neither feels right. v0 doesn't offer visual editing, and Figma doesn't have great code export or the ability to write code. Curious if anyone finds this annoying and if other devs want to visually edit + code?