DEV Community

Iftakhar
Iftakhar

Posted on

MongoDB Aggregation Pipeline

Introduction

Aggregation operation are very important in any type of database whether it is SQL or NoSQL .To perform aggregations operations MongoDB groups values from multiple documents together and then performs a variety of operations on grouped data to return a single result .SQL uses aggregate function to return a single value calculated from values in columns.

MongoDB has three ways to perform aggregation:

  1. The aggregation pipeline
  2. The map reduce Function
  3. Single purpose aggregation method

In this article we will focus on Aggregation pipeline .

Aggregation Pipeline

Aggregation pipelines can have one or more "stages". The order of these stages are important. Each stage acts upon the results of the previous stage.
examples

db.posts.aggregate([
  // Stage 1: Only find documents that have more than 1 like
  {
    $match: { likes: { $gt: 1 } }
  },
  // Stage 2: Group documents by category and sum each categories likes
  {
    $group: { _id: "$category", totalLikes: { $sum: "$likes" } }
  }
])
Enter fullscreen mode Exit fullscreen mode

Various stages in pipeline are

  • $project select, reshape data
  • $match filter data
  • $group aggregate data
  • $sort shorts data
  • $skip skips data
  • $limit limit data
  • $unwind normalizes data

Stages of pipeline

$project

in project phase we can added a key ,remove a key , reshape a key.There are some simple functions that we can use on the key $toUpper, $toLower, $add,$multiply etc

db.mycollection.aggregate([{
    $project: {
        _id: 0,
        'department': {
            $toUpper: '$dept_name'
        },
        'new_experience': {
            $add: ['$experience', 1]
        }
    }
}])
Enter fullscreen mode Exit fullscreen mode

$match

It works exactly like ‘where clause' in SQL to filter out the records. The reason we might want to match, is because we would like to filter the results and only aggregate a portion of the documents or search for particular parts of the results set, after we do the grouping. Let's say, in our collection, we want to aggregate documents having department equals to sales. The query will be.

db.mycollection.aggregate([{
    $match: {
        dept_name: 'Sales'
    }
}])
Enter fullscreen mode Exit fullscreen mode

$group

As the name suggested $group group the document based on some key
,lets say we want to group employs on their department name and we want to find the number of employees in each department .

db.mycollection.aggregate([{
    $group: {
        _id: '$dept_name',
        no_of_employees: {
            $sum: 1
        }
    }
}])
Enter fullscreen mode Exit fullscreen mode

$sort

Sort helps you to sort the data after aggregation, in ascending or descending order as per your need. Let’s say, we want to group department name in ascending order and find out the number of employees.

db.mycollection.aggregate([{
    $group: {
        _id: '$dept_name',
        no_of_employees: {
            $sum: 1
        }
    }
}, {
    $sort: {
        _id: 1
    }
}])
Enter fullscreen mode Exit fullscreen mode

$skip and $limit

$skip and $limit, as the names suggest, skip and limit work respectively when we do a simple find. It doesn’t make any sense to skip and limit unless we first sort, otherwise, the result is undefined.

We first skip records and then we limit those.

db.mycollection.aggregate([{
    $group: {
        _id: '$dept_name',
        no_of_employees: {
            $sum: 1
        }
    }
}, {
    $sort: {
        _id: 1
    }
}, {
    $skip: 2
}, {
    $limit: 1
}])
Enter fullscreen mode Exit fullscreen mode

$unwind

As we know in MongoDB, documents can have arrays. It is not easy to group on something within an array. $unwind first un-joins the array data and then basically rejoins it in a way that lets us do grouping calculations on it.

Let’s say, we have a document like this.

{
    a: somedata,
    b: someotherdata,
    c: [arr1, arr2, arr3]
}

After $unwind on‘ c’, we will get three documents.

{
    a: somedata,
    b: someotherdata,
    c: arr1
} {
    a: somedata,
    b: someotherdata,
    c: arr2
} {
    a: somedata,
    b: someotherdata,
    c: arr3
}
Enter fullscreen mode Exit fullscreen mode

By mastering these stages, you can harness the full power of MongoDB's aggregation pipeline to analyze and manipulate data efficiently. Let me know if you need further clarification!

Top comments (0)