Comprehensive Guide to MongoDB Aggregation Framework
MongoDB’s Aggregation Framework is a powerful tool that allows you to process and transform data stored in MongoDB documents. It enables complex data manipulations such as filtering, grouping, sorting, and joining data in a flexible and efficient way. The aggregation framework is essential for tasks like creating reports, performing data analysis, and transforming data in MongoDB.
This article is a detailed guide to the MongoDB Aggregation Framework, covering the key stages, operators, syntax, and real-world examples. We will walk through each aggregation stage, how it works, and what the output looks like for each example.
1. What is Aggregation in MongoDB?
Aggregation in MongoDB refers to the process of transforming and combining data from multiple documents into a meaningful result. It is similar to the GROUP BY operation in SQL but far more flexible and powerful.
MongoDB offers a variety of methods for performing aggregation, but the aggregation pipeline is the most common and powerful method. The aggregation pipeline processes data by passing it through multiple stages, each performing specific operations on the data.
2. The Aggregation Pipeline
The aggregation pipeline consists of a sequence of stages, each transforming the data in some way. The data flows through these stages sequentially, with each stage receiving the output of the previous stage.
Basic Syntax:
db.collection.aggregate([
{ stage1 },
{ stage2 },
{ stage3 },
...
])
Each stage is enclosed in {}, and multiple stages are passed in an array. The data is processed step by step, with each stage operating on the results of the previous one.
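To make the stage-by-stage flow concrete, here is a minimal plain-JavaScript model of a pipeline. It is only an analogy, not how MongoDB executes pipelines: `runPipeline` is a hypothetical helper, the sample documents are made up, and each stage is modeled as a function from an array of documents to a new array.

```javascript
// Plain-JavaScript model of a pipeline (an analogy, not MongoDB itself).
// runPipeline is a hypothetical helper: each stage is a function that takes
// an array of documents and returns a new array of documents.
function runPipeline(docs, stages) {
  // Feed the output of each stage into the next one, in order.
  return stages.reduce((current, stage) => stage(current), docs);
}

// Made-up sample documents.
const users = [
  { name: "Alice", age: 35 },
  { name: "Bob", age: 40 },
  { name: "Carol", age: 28 },
];

const result = runPipeline(users, [
  (docs) => docs.filter((d) => d.age > 30),          // like $match
  (docs) => [...docs].sort((a, b) => b.age - a.age), // like $sort (descending)
  (docs) => docs.slice(0, 1),                        // like $limit
]);

console.log(result); // [ { name: 'Bob', age: 40 } ]
```

Reordering the functions changes the result, which is also true of stages in a real pipeline.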
Key Stages in the Aggregation Pipeline
- $match: Filters the data based on conditions (similar to SQL WHERE).
- $group: Groups documents based on a field and performs aggregation operations (similar to SQL GROUP BY).
- $project: Reshapes the data by selecting and renaming fields.
- $sort: Sorts the documents in the specified order.
- $limit: Limits the number of documents passed to the next stage.
- $skip: Skips a specified number of documents.
- $lookup: Performs a left outer join between two collections.
3. Aggregation Stages Explained
3.1 $match Stage
The $match stage filters documents based on a specified condition. It is equivalent to the WHERE clause in SQL.
Syntax:
{
$match: { field: value }
}
Example:
Filter documents where age is greater than 30:
db.users.aggregate([
{ $match: { age: { $gt: 30 } } }
])
Output:
[
{ "_id": 1, "name": "Alice", "age": 35 },
{ "_id": 2, "name": "Bob", "age": 40 }
]
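Several conditions on the same field can be combined in a single $match stage. The sketch below builds such a stage as a plain JavaScript object. `matchAgeRange` is a hypothetical helper named for illustration, and it assumes documents with a numeric `age` field as in the example above.

```javascript
// Hypothetical helper: builds a $match stage that keeps documents whose
// age is greater than minAge and less than or equal to maxAge.
function matchAgeRange(minAge, maxAge) {
  return { $match: { age: { $gt: minAge, $lte: maxAge } } };
}

const pipeline = [matchAgeRange(30, 40)];

console.log(JSON.stringify(pipeline));
// [{"$match":{"age":{"$gt":30,"$lte":40}}}]

// In mongosh, the pipeline would be passed as: db.users.aggregate(pipeline)
```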
3.2 $group Stage
The $group stage groups documents by a specified expression (usually a field) and computes aggregate values with accumulator operators such as $sum, $avg, $min, and $max. To count the documents in each group, use { $sum: 1 }.
Syntax:
{
  $group: {
    _id: <expression>,
    field1: { <operator>: <expression> },
    field2: { <operator>: <expression> }
  }
}
Example:
Group users by their age and calculate the average salary for each age group:
db.users.aggregate([
  {
    $group: {
      _id: "$age", // Group by age
      averageSalary: { $avg: "$salary" }
    }
  }
])
Output:
[
{ "_id": 25, "averageSalary": 5000 },
{ "_id": 30, "averageSalary": 6000 },
{ "_id": 35, "averageSalary": 7000 }
]
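The grouping logic can be sketched in plain JavaScript to show what the accumulators compute. This is an analogy only: `groupByAge` is a hypothetical helper, the sample documents are made up, and the `count` field is an addition that corresponds to `{ $sum: 1 }`.

```javascript
// Plain-JavaScript analogy of:
//   { $group: { _id: "$age",
//               averageSalary: { $avg: "$salary" },
//               count: { $sum: 1 } } }
// groupByAge is a hypothetical helper, not a MongoDB function.
function groupByAge(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // One accumulator per distinct age value (the group key, _id).
    const acc = groups.get(doc.age) ?? { total: 0, count: 0 };
    acc.total += doc.salary;
    acc.count += 1;
    groups.set(doc.age, acc);
  }
  // Emit one output document per group.
  return [...groups].map(([age, acc]) => ({
    _id: age,
    averageSalary: acc.total / acc.count,
    count: acc.count,
  }));
}

// Made-up sample documents.
const grouped = groupByAge([
  { name: "Alice", age: 25, salary: 4000 },
  { name: "Bob", age: 25, salary: 6000 },
  { name: "Carol", age: 30, salary: 6000 },
]);

console.log(grouped);
// [ { _id: 25, averageSalary: 5000, count: 2 },
//   { _id: 30, averageSalary: 6000, count: 1 } ]
```

MongoDB does not guarantee the order of documents produced by $group, so add a $sort stage after it when the order matters.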
3.3 $project Stage
The $project stage reshapes each document by selecting and/or renaming fields, adding new fields, or excluding fields. It is similar to the SELECT clause in SQL.
Syntax:
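{
  $project: {
    field1: 1,              // 1 includes a field
    _id: 0,                 // 0 excludes a field
    newField: <expression>  // computed field
  }
}
// Inclusions (1) and exclusions (0) cannot be mixed, except for _id.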
Example:
Project only the name and age fields, exclude _id (which is included by default), and create a new field ageInMonths:
db.users.aggregate([
  {
    $project: {
      _id: 0,
      name: 1,
      age: 1,
      ageInMonths: { $multiply: ["$age", 12] }
    }
  }
])
Output:
[
{ "name": "Alice", "age": 25, "ageInMonths": 300 },
{ "name": "Bob", "age": 30, "ageInMonths": 360 }
]
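For intuition, the same reshaping can be written as a plain-JavaScript `map` over an array. This is an analogy rather than MongoDB code; the sample documents and the extra `salary` field are made up, and the stage in the comment excludes `_id` explicitly.

```javascript
// Plain-JavaScript analogy of:
//   { $project: { _id: 0, name: 1, age: 1,
//                 ageInMonths: { $multiply: ["$age", 12] } } }

// Made-up sample documents.
const users = [
  { _id: 1, name: "Alice", age: 25, salary: 5000 },
  { _id: 2, name: "Bob", age: 30, salary: 6000 },
];

// Keep name and age, drop every other field, and add a computed field.
const projected = users.map(({ name, age }) => ({
  name,
  age,
  ageInMonths: age * 12,
}));

console.log(projected);
// [ { name: 'Alice', age: 25, ageInMonths: 300 },
//   { name: 'Bob', age: 30, ageInMonths: 360 } ]
```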
3.4 $sort Stage
The $sort stage orders the documents in ascending or descending order based on a field or fields.
Syntax:
{
$sort: { field: 1 } // 1 for ascending, -1 for descending
}
Example:
Sort users by age in descending order:
db.users.aggregate([
{ $sort: { age: -1 } }
])
Output:
[
{ "_id": 2, "name": "Bob", "age": 30 },
{ "_id": 1, "name": "Alice", "age": 25 }
]
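When sorting on several fields, later fields only break ties in earlier ones. The plain-JavaScript sketch below mirrors a stage such as `{ $sort: { age: -1, name: 1 } }`. The sample documents are made up, and the comparator is an analogy rather than MongoDB's own comparison logic.

```javascript
// Plain-JavaScript analogy of: { $sort: { age: -1, name: 1 } }

// Made-up sample documents.
const users = [
  { name: "Carol", age: 30 },
  { name: "Alice", age: 25 },
  { name: "Bob", age: 30 },
];

// Copy before sorting so the input array is left unchanged.
const sorted = [...users].sort(
  (a, b) =>
    b.age - a.age ||                                  // age descending
    (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)  // then name ascending
);

console.log(sorted.map((u) => u.name)); // [ 'Bob', 'Carol', 'Alice' ]
```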
3.5 $limit Stage
The $limit stage limits the number of documents passed to the next stage in the pipeline. This is useful for pagination and restricting the number of results.
Syntax:
{
$limit: <number>
}
Example:
Limit the result to 5 documents:
db.users.aggregate([
{ $limit: 5 }
])
3.6 $skip Stage
The $skip stage skips a specified number of documents and passes the remaining documents to the next stage in the pipeline. This is useful for pagination purposes.
Syntax:
{
$skip: <number>
}
Example:
Skip the first 5 documents:
db.users.aggregate([
{ $skip: 5 }
])
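$skip and $limit are typically combined for pagination. The following sketch builds the stages for one page. `pageStages` is a hypothetical helper, and sorting by `_id` is an assumption made here so that the page boundaries stay stable between calls.

```javascript
// Hypothetical helper: builds the stages for one page of results.
// page is 1-based; pageSize is the number of documents per page.
function pageStages(page, pageSize) {
  return [
    { $sort: { _id: 1 } },            // fixed order so pages do not overlap
    { $skip: (page - 1) * pageSize }, // skip the documents of earlier pages
    { $limit: pageSize },             // keep a single page
  ];
}

console.log(JSON.stringify(pageStages(3, 5)));
// [{"$sort":{"_id":1}},{"$skip":10},{"$limit":5}]

// In mongosh: db.users.aggregate(pageStages(3, 5))
```

$skip comes before $limit here. With the order reversed, the pipeline would first cut the result to 5 documents and then skip 10 of them, returning nothing.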
3.7 $lookup Stage (Join)
The $lookup stage performs a left outer join between two collections. It is similar to a SQL LEFT OUTER JOIN and lets you combine data from multiple collections. The matching documents from the joined collection are added to each input document as an array.
Syntax:
{
  $lookup: {
    from: "other_collection",                    // The collection to join
    localField: "field_in_local_collection",     // Field from the local collection
    foreignField: "field_in_foreign_collection", // Field from the foreign collection
    as: "output_field"                           // The name of the field to store the results
  }
}
Example:
Join users with orders:
db.users.aggregate([
  {
    $lookup: {
      from: "orders",        // The collection to join
      localField: "order_id", // Field in `users` collection
      foreignField: "_id",    // Field in `orders` collection
      as: "order_details"     // Name of the new field in the output
    }
  }
])
Output:
[
  {
    "_id": 1,
    "name": "Alice",
    "order_id": 101,
    "order_details": [{ "_id": 101, "product": "Laptop" }]
  },
  {
    "_id": 2,
    "name": "Bob",
    "order_id": 102,
    "order_details": [{ "_id": 102, "product": "Phone" }]
  }
]
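The left outer join behavior is easier to see with a user that has no matching order. The plain-JavaScript sketch below is an analogy of the $lookup stage above. `lookupOrders` is a hypothetical helper, and the sample data (including the user Carol) is made up.

```javascript
// Plain-JavaScript analogy of the $lookup stage above.
// lookupOrders is a hypothetical helper, not a MongoDB function.
function lookupOrders(users, orders) {
  return users.map((user) => ({
    ...user,
    // All orders whose _id equals this user's order_id (possibly none).
    order_details: orders.filter((order) => order._id === user.order_id),
  }));
}

// Made-up sample data; Carol references an order that does not exist.
const joined = lookupOrders(
  [
    { _id: 1, name: "Alice", order_id: 101 },
    { _id: 2, name: "Bob", order_id: 102 },
    { _id: 3, name: "Carol", order_id: 999 },
  ],
  [
    { _id: 101, product: "Laptop" },
    { _id: 102, product: "Phone" },
  ]
);

console.log(JSON.stringify(joined[0].order_details)); // [{"_id":101,"product":"Laptop"}]
console.log(JSON.stringify(joined[2].order_details)); // []
```

Carol stays in the result with an empty `order_details` array. That is what makes the join a left outer join: input documents without a match are kept rather than dropped.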
Conclusion
MongoDB's Aggregation Framework provides a powerful and flexible way to manipulate and process data. By combining aggregation stages like $match, $group, $project, $sort, $limit, $skip, and $lookup, you can create complex data processing pipelines to achieve a wide variety of results.
Through the aggregation pipeline, you can filter, transform, group, and join data, providing a rich set of tools to extract meaningful insights from your MongoDB collections.
This guide covered the essential stages and operations in MongoDB’s aggregation framework, providing examples and explanations of how to use them effectively. Whether you're performing simple queries or complex data transformations, the aggregation framework is a core component of MongoDB that can help you achieve your goals efficiently.