Background
We have a .NET 6 application with a search functionality that uses LINQ queries with .Include()
that needs to join more than 30 tables to fetch data. The feature worked perfectly at the beginning, even after importing legacy data into the new system. But gradually, the feature has begun to slow down considerably because of the huge growth in data volume.
Cartesian Explosion 💥
The LINQ query was experiencing "Cartesian explosion". 😞
What is Cartesian explosion?
Cartesian Explosion is an Entity Framework performance problem arising when using the Include method in loading more than one level of related data. This can, in turn, multiply the count of records coming back, so it takes time and consumes additional memory.
The following query was pulling data in 30 secs. (Query changed for privacy purposes).
_context.SomeTableName
.AsNoTracking()
.Include(x => x.Tasks.OrderBy(y => y.Id))
.ThenInclude(x => x.Owner)
.Include(x => x.Tasks)
.ThenInclude(x => x.TaskStatus)
.Include(x => x.Tasks)
.ThenInclude(x => x.TaskComments)
.ThenInclude(x => x.CreatedBy)
.OrderBy(x => x.CreatedOn)
.Include(x => x.Participants.OrderBy(y => y.Id))
.ThenInclude(x => x.User)
.Include(x => x.Participants)
.ThenInclude(x => x.Role)
.Include(x => x.Regions)
.ThenInclude(x => x.Region)
.Include(x => x.Countries)
.ThenInclude(x => x.Country)
.Include(x => x.Product)
.Include(x => x.Groups)
.ThenInclude(x => x.Group)
.Include(x => x.Businesses)
.ThenInclude(x => x.Business)
.Include(x => x.Tiers)
.ThenInclude(x => x.Tier)
.Include(x => x.Segments)
.ThenInclude(x => x.Segment)
.Include(x => x.Stocks)
.ThenInclude(x => x.Stock)
.Include(x => x.Types)
.ThenInclude(x => x.Type)
.Include(x => x.Policy)
.Include(x => x.FinalOutcome)
.Include(x => x.Status)
.Include(x => x.Attachments)
.AsSplitQuery() to the Rescue
There are a number of ways to make a LINQ query more efficient, but applying this EF Core feature enhanced performance by 4X.
By adding .AsSplitQuery()
to the LINQ query, the time it took to retrieve data was cut down to 7.5 seconds.
Before vs After
Below are the Before vs After results:
What's the Magic?
Rather than passing a single SQL query with join, EF Core when utilizing .AsSplitQuery()
will pass several SQL queries without join, each retrieving smaller amounts of data and then merging the results, this aids in avoiding the risk of cartesian explosion.
Trade-offs
There are also few trade-offs linked to this approach:
Multiple Round-Trips: Increases the database round-trips, which may introduce latency, this can worst if the UI app and API are in different regions.
Consistency Issues: Data may get changed between the execution of split queries, which results in inconsistent data.
Smaller Data: This may not be helpful for smaller dataset.
Thank You!
If you enjoy the content then please leave a like. Thank you! 🙂
Top comments (0)