What Is the N+1 Query Problem and How to Detect It

In this article, we’ll dive deep into the N + 1 query problem using a practical example, its effects on application performance, and how innovative solutions can help detect and mitigate the N + 1 query problem.

If you have used any ORM framework like Hibernate, you can attest to how easy it is to define data models in simple regular classes and map them to any relational database. Much of the convenience that we enjoy when working with Object Oriented Programming languages and relational databases has been made possible by ORM tools like Hibernate.

Although programming languages have different ORM tools, their general purpose is similar: They speed up development, keep your code DRY, improve security, and lessen the need for mastering raw SQL queries.

As a developer, however, you should also watch out for the shortcomings of any technology. The infamous N + 1 query problem is an anti-pattern that stems from the leaky abstraction that ORMs provide. Because ORMs make it extremely simple to access data without making queries, they also make it easy to access that data inefficiently. This problem is mainly associated with how ORM frameworks handle the lazy loading of related entries. You can find the details below.

What is The N+1 Query Problem?​

To better understand the N + 1 query problem, we’ll first need to understand the concept of data fetching in ORM tools. By default, most ORM tools, such as Hibernate and Django, use lazy loading as the default data fetching behavior. This means they only fetch data explicitly requested from the database and delay any related data you have not requested.

Let’s look at an example. Suppose you are building a blog application with multiple blog posts, and every blog post is linked to its respective author through a one-to-many relationship. If your application fetches a list of blogs and then subsequently fetches the associated author for each blog using a separate query, this results in many roundtrips to the database.

// Fetching All Blogs
List<Blog> blogs = blogRepository.findAll();
// N+1 Queries: One Query for Blogs, N Queries for Authors
for (Blog blog : blogs) {
    Author author = blog.getAuthor(); // Triggers a separate query for each blog post's author
}

Enter fullscreen mode Exit fullscreen mode

The tricky part is noticing that this is happening. In your code, you would seem only to be iterating over objects. Because ORMs provide an abstraction over SQL, they also make it very difficult to guess which object property access will result in a query. In this case, you are running one query to fetch the list of blog posts and then running multiple queries that fetch N number of authors corresponding to every blog post, hence the name N + 1 query problem.

The more blog posts and Authors in your database, the more severe the problem can be, impacting application performance and user experience. Here are some issues that can be caused by the N + 1 query problem:

  • Slow response times: The N + 1 query problem slows down data retrieval, which makes your application less responsive to user requests.

  • Limited scalability: As your application grows, the N + 1 query problem can limit your application’s ability to handle the growing user demands.

  • Increase Database load: The increased number of queries due to the N + 1 query problem strains your database.

  • Limited throughput: The N + 1 query problem reduces the number of transactions your database can handle per section.

  • Degrades user experience: With slow load times, users are likely to wait longer for pages to load, which is frustrating, to say the least.

Fixing N+1 Query Issues

Most ORMs also provide the means to avoid N+1 queries by controlling the lazy vs. eager loading of related entities. With Lazy load, the related entity is loaded as a proxy, only to be initialized when one of the entity properties is accessed in the code. On the other hand, Eager loading performs a SQL join when initializing the original entity so that no further queries are necessary.

Hibernate provides several ways to configure eager loading, either on the relationship mapping itself or per query:

1. Configuring the entity relationship as eager

This is the most aggressive of the three strategies as it fundamentally changes how the entity is loaded. Most ORM tools like Hibernate lazy load-related entities by default, resulting in multiple database queries. Configuring the relationship between related entities as eager instructs your ORM to fetch the main entity and its related entities in a single query.

@Entity
public class Post {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String title;
    private String content;
    @ManyToOne(fetch = FetchType.EAGER) // Eager loading
    @JoinColumn(name = "author_id")
    private Author author;
    // Getters and setters
}

Enter fullscreen mode Exit fullscreen mode

Use this setting only if you know that ‌entity relationships and properties will almost always be accessed.

2. Setting batching preferences on the mapping

When dealing with entities that have complex relationships, this can be a good solution to the N + 1 query problem. Setting a batching preference allows you to batch multiple queries into one single query. In our earlier Post and Author example, this means that we can fetch all Posts and their respective Authors in a single query. If you are using Springboot, you can use the @BathSize annotation on lazy association.

@Entity
public class Author {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String name;
    @OneToMany(mappedBy = "author", fetch = FetchType.LAZY)
    @BatchSize(size = 10) // Batch loading strategy
    private List<Post> posts = new ArrayList<>();
    // Getters and setters
}

Enter fullscreen mode Exit fullscreen mode

Instructing your ORM tool to batch multiple queries reduces the number of roundtrips to the database when fetching related entities, effectively solving the N + 1 query problem.

3. Eager loading on query level

You can also prevent N + 1 query problems by explicitly specifying queries that should be eagerly loaded alongside the main entity. This approach is useful when you know exactly which related entities you want to fetch, eliminating the need for subsequent querying.

public interface PostRepository extends JpaRepository<Post, Long> {
    @Query("SELECT p FROM Post p JOIN FETCH p.author")
    List<Post> findAllPostsWithAuthorsEagerly();
}

Enter fullscreen mode Exit fullscreen mode

In cases where you only need to fetch related entities occasionally, and the relationships between your entities are fairly simple, then eager loading on a query level is an overkill. You can opt to use manual loading or ‌rely on the ORM’s default eager loading.

4. Manually load-related entities in the code

This approach gives you granular control when loading related entities from the database. ORM frameworks typically provide methods that allow you to load related entities. For instance, if your application has a Post entity related to Author, you can manually load related Authors using a method such as post.getAuthor(). You avoid running into the N + 1 query problem using this manual approach.

@Service
public class PostService {
    @Autowired
    private PostRepository postRepository;
    public Post getPostWithAuthor(Long postId) {
        Post post = postRepository.findById(postId).orElse(null);
        if (post != null) {
            // Manually load the author for every blog post
            Hibernate.initialize(post.getAuthor());
        }
        return post;
    }
}

Enter fullscreen mode Exit fullscreen mode

Understanding Continuous Feedback

Digma Continuous Feedback is a runtime linter that lets developers quickly identify risky code, potential errors, and bottlenecks in complex codebases. It uses OpenTelemetry behind the scenes to collect data such as traces, logs, and metrics when you run your code locally.

Once the data is collected, Digma looks for regressions, anomalies, code smells, or other patterns that can be useful for knowing about your code while in the development phase. This allows you to accelerate your development cycles and, at the same time, catch potential problems in your code.

How Continuous Feedback Can Help Detect The N+1 Query Problem

As your application grows, such query issues can become serious performance bottlenecks that significantly increase the database load and slow down application performance

. If you have a complex application performing hundreds of queries to the database, it can be difficult to spot them by simply reading through the code. However, with Digma, you can easily detect such queries even when working with complex codebases.

Problems such as repeated queries associated with N+1 Select statements are easy to detect in the runtime data. Specifically, in traces that Digma automatically collects using OpenTelemetry.

Here is what an N+1 Select query looks like in a trace:

Once Digma has collected and analyzed the data, it then highlights that the second query responsible for fetching user roles has a potential N + 1 query problem. Digma then generates an insightful alert next to the query source code/

Apart from that, you can also notice that Digma also highlights the impact on ‌response time, showing that the endpoint handling this query (e.g., GET /owner) is affected by the N+1 problem. While the duration of the query time of 1.1 ms for the 10 repeat DB calls seems small, this can quickly increase if your application makes more queries, taking a toll on your application’s performance.

Digma also goes a step further and provides insights at the endpoint level. This allows you to note the endpoint that is potentially affected by an N + 1 query problem, the severity of the impact, and the method causing the issue, as shown in the output below.

Armed with Digma insights above, at the query and endpoint level, you can easily find and fix code that’s causing N + 1 query problems.

Case Study: Reducing Execution Time for Specific APIs

Markus Westergren is an experienced Senior Consultant/Staff Engineer with a demonstrated history of working in the information technology and services industry. Markus is a Beta user of Digma and recently shared with us that they could detect such N+1 query issues in their code, which reduced execution time for specific APIs.

What coding issues did you encounter that led you to try Digma?

I recently joined a team that is developing a new product. There were performance issues with some use cases. My job was to find and fix them. I used manual analysis and a profiler to try and pinpoint hotspots in the code. Any changes I made had to be tested and measured to see if an issue was fixed.

How did Digma help you solve these issues?
With Digma, I get continuous feedback on the application’s performance. It will instantly show me where the biggest bottlenecks are, so I don’t have to search for them myself. I can now get direct feedback on how my latest fix performs compared with the previous version. It also found bottlenecks that we did not even know about. It takes the guesswork out of the picture. We save lots of time finding and fixing performance issues.

How did Digma help handle suspected N-plus-1 in SQL queries?

We are using Hibernate, and it makes it much easier to see what’s really going on.

We have already identified some cases where Hibernate produces N-plus-1 queries which are affecting the performance of the application.

We’ve made big improvements, and Digma has helped us identify some cases where N+1 queries may affect the application’s performance.

Besides the insights on the repeating database spans, Digma also provides key endpoint insights that allow us to detect and fix N + 1 queries related to specific endpoints, among other performance issues at their source.

Digma has certainly been a great addition to our development workflow, and we will continue leveraging it to optimize and improve our code and provide the best user experience.

Conclusion

The N + 1 query problem is a common problem when working with ORM frameworks such as Hibernate. Although different solutions, such as overriding lazy loading provided by ORM frameworks, can help avoid this problem, detecting such queries is crucial. Digma is a reliable tool that can save you the manual effort of finding such queries. With Digma in your development workflow, you can leverage continuous evidence-based feedback to optimize your code.

Happy Observing:)

原文链接:What Is the N+1 Query Problem and How to Detect It

© 版权声明
THE END
喜欢就支持一下吧
点赞15 分享
评论 抢沙发

请登录后发表评论

    暂无评论内容