In a previous post. I applied the ESR rule to multiple databases. I used a simplified example with no equality predicate to focus on covering the Sort and Range by the index. However, for the demo in MongoDB, I added an Equality predicate that doesn't filter anything, {$gt:MinKey}
, to get a full Equality, Sort, Range.
On my first day as a Developer advocate for MongoDB, I discussed this with the product managers, and the reason and possible optimization are tracked in the improvement idea Tighten index bounds and allow compound index to be chosen when predicate on leading field is not provided.
In this article, I'm showing it, with a simple workaround.
A table to experiment with the ESR rule
Here is a collection with fields that I'll use for queries with Equality, Sort, Range:
mdb> db.demoesr.drop();
mdb> db.demoesr.insertMany([
{ e:42, s:"a", r:10 },
{ e:42, s:"b", r:20 },
{ e:42, s:"b", r:10 },
{ e:42, s:"d", r:30 },
{ e:42, r:40 } // add one doc with no sort value
]);
Index and query on Equality, Sort, Range (ESR)
I create a compound index on fields "e", "s", and "r", in the order recommended by the ESR rule:
mdb> // index on Equality, Sort, Range
mdb> db.demoesr.createIndex(
{ e: 1 , s: 1, r : 1 }
);
e_1_s_1_r_1
I query with an equality predicate on "e", a range pI execute a query with an equality predicate on "e", a range predicate on "r", and a sort on "s", along with a projection of only indexed fields to ensure the query is fully covered by the index (the id must be explicitly excluded from the projection "_id":0
):
mdb> db.demoesr.find(
{ e:{$eq: 42} ,r:{$gt:10} } , {e:1,s:1,r:1,"_id":0 }
).sort({s:1});
[
{ e: 42, s: null, r: 40 },
{ e: 42, s: 'b', r: 20 },
{ e: 42, s: 'd', r: 30 }
]
I know that the result comes from the index entry directly, without fetching the document, because I can see s: null
for the document where "s" doesn't exist. The index has a value for all index key fields, with a null value when it does not exist in the document.
I confirm from the execution plan that there's no FETCH and no documents were read:
mdb> db.demoesr.find(
{ e:{$eq: 42} ,r:{$gt:10} } , {e:1,s:1,r:1,"_id":0 }
).sort({s:1}).explain("executionStats").executionStats;
{
nReturned: 3,
totalKeysExamined: 5,
totalDocsExamined: 0,
executionStages: {
stage: 'PROJECTION_COVERED',
nReturned: 3,
transformBy: { e: 1, s: 1, r: 1, _id: 0 },
inputStage: {
stage: 'IXSCAN',
nReturned: 3,
keyPattern: { e: 1, s: 1, r: 1 },
indexName: 'e_1_s_1_r_1',
direction: 'forward',
indexBounds: {
e: [ '[42, 42]' ],
s: [ '[MinKey, MaxKey]' ],
r: [ '(10, inf.0]' ]
},
keysExamined: 5,
seeks: 3,
}
}
This is the perfect execution plan for such queries as Equality, Sort, and Range were pushed down to the IXSCAN:
-
e: [ '[42, 42]' ]
is first in the index bounds and returns a single range for the filter on "e". The range holdskeysExamined: 5
index entries. -
s: [ '[MinKey, MaxKey]' ]
is second in the index bounds and returns this range ordered by "s", still returningkeysExamined: 5
index entries. -
r: [ '(10, inf.0]' ]
cannot use a single seek to get this range because there are multiple values in the preceding "s" but the filter is applied to the index entries and returnsnReturned: 3
index entries.
Additionally, because all fields in the projection are in the index entries, no documents had to be fetched: totalDocsExamined: 0
This is a perfect example of the ESR rule with a covering index where avoiding a sort is the goal.
In my example, I've only one value for "e". Let's remove the equality condition.
Index and query on Sort, Range (SR)
I want to run the same without an equality predicate. To avoid a sort operation, the index starts with the "s" column:
mdb> // index on Sort, Range
mdb> db.demoesr.createIndex(
{ s: 1, r : 1 }
);
s_1_r_1
Here is a query with only a Range and Sort:
mdb> db.demoesr.find(
{ r:{$gt:10} } , {e:1,s:1,r:1,"_id":0 }
).sort({s:1});
[
{ e: 42, r: 40 },
{ e: 42, s: 'b', r: 20 },
{ e: 42, s: 'd', r: 30 }
]
The non-existent "s" is absent in the result, instead of being shown as null
, which indicates that what is displayed was retrieved from the document rather than the index entry.
I verify this with the execution plan:
mdb> db.demoesr.find(
{ r:{$gt:10} } , {e:1,s:1,r:1,"_id":0 }
).sort({s:1}).explain("executionStats").executionStats;
n Returned: 3,
totalKeysExamined: 5,
totalDocsExamined: 5,
executionStages: {
stage: 'PROJECTION_SIMPLE',
nReturned: 3,
transformBy: { e: 1, s: 1, r: 1, _id: 0 },
inputStage: {
stage: 'FETCH',
filter: { r: { '$gt': 10 } },
nReturned: 3,
docsExamined: 5,
inputStage: {
stage: 'IXSCAN',
nReturned: 5,
keyPattern: { s: 1, r: 1 },
indexName: 's_1_r_1',
direction: 'forward',
indexBounds: { s: [ '[MinKey, MaxKey]' ], r: [ '[MinKey, MaxKey]' ] },
keysExamined: 5,
seeks: 1,
This is not good because no filtering has been applied to the IXSCAN: : [ '[MinKey, MaxKey]'
and docsExamined: 5
have been fetched, more than what we need for the result: nReturned: 3
.
Adding an unbounded range on the sort field
The workaround is to get back to an ESR query with a predicate on the first column, which covers the sort. I can't use an equality predicate as it may have multiple values, and use MinKey
or maxKey
to get a full range on this value:
mdb> db.demoesr.find(
{ s:{$gte:MinKey}, r:{$gt:10} } , {e:1,s:1,r:1,"_id":0 }
).sort({s:1});
[
{ e: 42, r: 40 },
{ e: 42, s: 'b', r: 20 },
{ e: 42, s: 'd', r: 30 }
]
After my earlier observation that a fully covered query would show s: null
coming from the index entry, the documents were retrieved from the collection. This is necessary because "e" is not in the index used here, so the document must be fetched and examined to obtain its value for the projection.
I need to display the execution plan to grasp the benefits of adding a dummy filter to the sort field, when it is the first field of the index:
mdb> db.demoesr.find(
{ s:{$gte:MinKey}, r:{$gt:10} } , {e:1,s:1,r:1,"_id":0 }
).sort({s:1}).explain("executionStats").executionStats;
{
nReturned: 3,
executionTimeMillis: 0,
totalKeysExamined: 5,
totalDocsExamined: 3,
executionStages: {
stage: 'PROJECTION_SIMPLE',
nReturned: 3,
transformBy: { e: 1, s: 1, r: 1, _id: 0 },
inputStage: {
stage: 'FETCH',
nReturned: 3,
docsExamined: 3,
inputStage: {
stage: 'IXSCAN',
nReturned: 3,
keyPattern: { s: 1, r: 1 },
indexName: 's_1_r_1',
direction: 'forward',
indexBounds: { s: [ '[MinKey, MaxKey]' ], r: [ '(10, inf.0]' ] },
keysExamined: 5,
seeks: 3,
The index bound on the first column used for Sort is the same: s: [ '[MinKey, MaxKey]' ]
, which confirms that adding s:{$gte:MinKey}
didn't change the number of index entries (keysExamined: 5
). What has changed is that the range predicate is now pushed down to the IXSCAN as r: [ '(10, inf.0]' ]
, and only docsExamined: 3
documents have been fetched to get the nReturned: 3
documents for the result.
Even if I've added a range predicate using MinKey or MaxKey, it is equivalent to the Equality in the ESR rule because what matters is that it returns a single range, ensuring that it is sorted on the following field of the composite index.
In summary, the ESR (Equality, Sort, Range) Rule outlined by MongoDB is an excellent framework for designing your composite indexes. However, it's essential to review the execution plan to ensure you meet your objective: avoiding the retrieval of too many documents.
Top comments (0)