Introduction
Aggregation operation are very important in any type of database whether it is SQL or NoSQL .To perform aggregations operations MongoDB groups values from multiple documents together and then performs a variety of operations on grouped data to return a single result .SQL uses aggregate function to return a single value calculated from values in columns.
MongoDB has three ways to perform aggregation:
- The aggregation pipeline
- The map reduce Function
- Single purpose aggregation method
In this article we will focus on Aggregation pipeline .
Aggregation Pipeline
Aggregation pipelines can have one or more "stages". The order of these stages are important. Each stage acts upon the results of the previous stage.
examples
db.posts.aggregate([
// Stage 1: Only find documents that have more than 1 like
{
$match: { likes: { $gt: 1 } }
},
// Stage 2: Group documents by category and sum each categories likes
{
$group: { _id: "$category", totalLikes: { $sum: "$likes" } }
}
])
Various stages in pipeline are
- $project select, reshape data
- $match filter data
- $group aggregate data
- $sort shorts data
- $skip skips data
- $limit limit data
- $unwind normalizes data
Stages of pipeline
$project
in project phase we can added a key ,remove a key , reshape a key.There are some simple functions that we can use on the key $toUpper, $toLower, $add,$multiply etc
db.mycollection.aggregate([{
$project: {
_id: 0,
'department': {
$toUpper: '$dept_name'
},
'new_experience': {
$add: ['$experience', 1]
}
}
}])
$match
It works exactly like ‘where clause' in SQL to filter out the records. The reason we might want to match, is because we would like to filter the results and only aggregate a portion of the documents or search for particular parts of the results set, after we do the grouping. Let's say, in our collection, we want to aggregate documents having department equals to sales. The query will be.
db.mycollection.aggregate([{
$match: {
dept_name: 'Sales'
}
}])
$group
As the name suggested $group group the document based on some key
,lets say we want to group employs on their department name and we want to find the number of employees in each department .
db.mycollection.aggregate([{
$group: {
_id: '$dept_name',
no_of_employees: {
$sum: 1
}
}
}])
$sort
Sort helps you to sort the data after aggregation, in ascending or descending order as per your need. Let’s say, we want to group department name in ascending order and find out the number of employees.
db.mycollection.aggregate([{
$group: {
_id: '$dept_name',
no_of_employees: {
$sum: 1
}
}
}, {
$sort: {
_id: 1
}
}])
$skip and $limit
$skip and $limit, as the names suggest, skip and limit work respectively when we do a simple find. It doesn’t make any sense to skip and limit unless we first sort, otherwise, the result is undefined.
We first skip records and then we limit those.
db.mycollection.aggregate([{
$group: {
_id: '$dept_name',
no_of_employees: {
$sum: 1
}
}
}, {
$sort: {
_id: 1
}
}, {
$skip: 2
}, {
$limit: 1
}])
$unwind
As we know in MongoDB, documents can have arrays. It is not easy to group on something within an array. $unwind first un-joins the array data and then basically rejoins it in a way that lets us do grouping calculations on it.
Let’s say, we have a document like this.
{
a: somedata,
b: someotherdata,
c: [arr1, arr2, arr3]
}
After $unwind on‘ c’, we will get three documents.
{
a: somedata,
b: someotherdata,
c: arr1
} {
a: somedata,
b: someotherdata,
c: arr2
} {
a: somedata,
b: someotherdata,
c: arr3
}
By mastering these stages, you can harness the full power of MongoDB's aggregation pipeline to analyze and manipulate data efficiently. Let me know if you need further clarification!
Top comments (0)