What is the MongoDB aggregate pipeline?

Using Mongoose driver

Consider the following code:

Connecting to database

const mongoose = require("mongoose");
require("dotenv").config();

mongoose.connect(process.env.DB);
const userSchema = new mongoose.Schema({ name: String }, {collection: 'test'});
const Model = mongoose.model('test', userSchema);

Creating dummy document

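async function createDocs() {
    // Start from an empty collection, then insert four sample users.
    await Model.deleteMany({});
    await Model.insertMany([{name: "User1"}, {name: "User2"}, {name: "User3"}, {name: "User4"}]);
}
// Returns a promise; queries started before it resolves may not see the inserted documents.
createDocs();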

Filtering data using Model.find()

async function findDoc() {
    // find() resolves to an array of matching documents.
    let docs = await Model.find({name: 'User1'});
    console.log(`Using find method : `, docs);
}
findDoc();

Filtering data using Model.aggregate()


async function matchDoc() {
    let doc = await Model.aggregate([
        {
            $match: {name : 'User1'}
        }
    ])
    console.log(`Using aggregate pipeline : `, doc);
}
matchDoc();

• Both approaches produce the same output

Q1) What is an aggregate pipeline and why use it?

Q2) Which method of retrieving data is faster?

CodePudding user response：

I won't go into much detail, since there is a lot of information online. Essentially, the aggregation pipeline gives you access to many powerful operators, mainly used for data analysis and for restructuring documents. For simple get and set operations there is no need for it.

A "real life" example of when you'd want the aggregation pipeline is calculating the average of a certain value in your data. This is just the tip of the iceberg in terms of what the feature can do.
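The averaging case above could look roughly like this. It is only a sketch: the `score` field, the `averagePipeline` constant and the `averageScore` helper are made up for illustration and are not part of the schema in the question.

```javascript
// Hypothetical: assumes documents shaped like { name: String, score: Number }.
// `score` does not exist in the question's schema; it is only for illustration.
const averagePipeline = [
    // Keep only documents that actually have a numeric score.
    { $match: { score: { $type: "number" } } },
    // _id: null puts every remaining document into a single group.
    { $group: { _id: null, avgScore: { $avg: "$score" }, count: { $sum: 1 } } }
];

// Model is any Mongoose model, for example the one defined in the question.
async function averageScore(Model) {
    const [result] = await Model.aggregate(averagePipeline);
    // aggregate() resolves to an empty array when nothing matched.
    return result ? result.avgScore : null;
}
```

With find you would have to fetch every document and compute the average in application code; here the server does the work and returns one small result.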

find is slightly faster. The aggregation pipeline has some overhead compared to the query language, because each stage has to load the BSON documents into memory, whereas find doesn't. If your use case is just a simple query, then find is the way to go.

CodePudding user response：

You are only looking at a small piece of the aggregation pipeline.

You cannot build a multi-stage pipeline with a single find query.

Let's say you want to find all the orders that contain a product purchased within a given period. Here, orders and customers are two different collections, so you need multiple stages.
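A rough sketch of such a multi-stage pipeline follows. All collection and field names (`customers`, `customerId`, `product`, `purchasedAt`) and the `ordersInRangePipeline` helper are assumptions made up for this example.

```javascript
// Hypothetical shapes: orders { customerId, product, purchasedAt },
// customers { _id, name }. None of these names come from the original posts.
function ordersInRangePipeline(product, from, to) {
    return [
        // Stage 1: orders for the product purchased within [from, to).
        { $match: { product: product, purchasedAt: { $gte: from, $lt: to } } },
        // Stage 2: join each order with its customer from the other collection.
        {
            $lookup: {
                from: "customers",
                localField: "customerId",
                foreignField: "_id",
                as: "customer"
            }
        },
        // Stage 3: $lookup yields an array; flatten it to a single object.
        // Orders without a matching customer are dropped by this stage.
        { $unwind: "$customer" },
        // Stage 4: return only the fields the caller needs.
        { $project: { _id: 0, product: 1, purchasedAt: 1, customerName: "$customer.name" } }
    ];
}

// Usage with a Mongoose model for the orders collection (assumed to be called Order):
// const rows = await Order.aggregate(
//     ordersInRangePipeline("Laptop", new Date("2022-01-01"), new Date("2022-02-01"))
// );
```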

Let's say you stored data in a different format, for example a date as a string or an integer as a decimal. If you want to convert on the fly, you can use aggregation.
From MongoDB 4.2 onward, you can also use aggregation operators in an update operation.
You can restructure data with aggregation, which find cannot do. Let's say you need just a few fields derived from an array field.
You cannot easily pick out a particular array or object element matching a condition with a simple find; $elemMatch is not that powerful.
You cannot group things with a simple find.
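To make a couple of these points concrete, here is a small sketch that converts types on the fly and keeps only the array elements matching a condition. The document shape and the field names (`createdAt`, `qty`, `items`, `price`) are hypothetical, and `$toDate` / `$toInt` need MongoDB 4.0 or later.

```javascript
// Hypothetical document shape (not from the original posts):
// { createdAt: "2022-01-15", qty: 3.0, items: [{ sku: "a", price: 120 }, { sku: "b", price: 5 }] }
const reshapePipeline = [
    {
        $addFields: {
            // Convert a date stored as a string into a real Date.
            createdAt: { $toDate: "$createdAt" },
            // Convert a decimal/double into an integer (the fraction is truncated).
            qty: { $toInt: "$qty" },
            // Keep only the array elements that satisfy the condition.
            expensiveItems: {
                $filter: {
                    input: "$items",
                    as: "item",
                    cond: { $gte: ["$$item.price", 100] }
                }
            }
        }
    },
    // Return just the reshaped fields.
    { $project: { _id: 0, createdAt: 1, qty: 1, expensiveItems: 1 } }
];

// Usage with any Mongoose model: const docs = await Model.aggregate(reshapePipeline);
```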

And many more.

I suggest you check the aggregation operators and the relevant examples in the documentation.

Retrieval speed does not depend on whether you use aggregation or find. It depends on the data, hardware, and index settings.
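Rather than guessing, you can ask the server how it runs each variant on your own data. This is a minimal sketch, assuming a recent Mongoose version in which both `Query#explain` and `Aggregate#explain` accept a verbosity string; `compareFindAndAggregate` is a made-up helper name, and the shape of the returned plans varies by server version, so it is returned as-is.

```javascript
// Model is assumed to be a Mongoose model such as the one in the question.
async function compareFindAndAggregate(Model, filter) {
    // Execution plan and statistics for the plain query.
    const findPlan = await Model.find(filter).explain("executionStats");
    // Execution plan and statistics for the equivalent single-stage pipeline.
    const aggPlan = await Model.aggregate([{ $match: filter }]).explain("executionStats");
    return { findPlan, aggPlan };
}

// Usage: const plans = await compareFindAndAggregate(Model, { name: "User1" });
```

Comparing the two plans before and after adding an index on the filtered field shows how much the index, not the choice of method, drives the result.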