Home > Software design >  percentage in mongo/ aggregation over collection
percentage in mongo/ aggregation over collection

Time:02-26

hello
I've this kind of data :

[
  {
    "id_article": "ARE-12-P",
    "status": "Failed",
    "NbOfarticles": 2,
    "DateExtraction": "21/10/2021"
    
  },
  {
    "id_article": "ARE-12-P",
    "status": "ok",
    "NbOfarticles": 21,
    "DateExtraction": "21/10/2021"

  },
  {
    "id_article": "WW-XX-P",
    "status": "ok",
    "NbOfarticles": 32,
    "DateExtraction": "21/11/2021"

  },
  {
    "id_article": "WW-XX-P",
    "status": "Failed",
    "NbOfarticles": 32,
    "DateExtraction": "21/11/2021"

  },
]

and I'd like to have this kind of aggregation (with percentage of articles by day , for each entry):

[
  {
    "id_article": "ARE-12-P",
    "status": "Failed",
    "Percent" : "0.30",
    "DateExtraction": "21/10/2021"
    
  },


  ,<... and so on ... >
]

Do you have any idea how I can proceed with mongo aggregations ?

Many many thanks !

CodePudding user response:

count each document

db.collection.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$DateExtraction",
      sortBy: {},
      output: {
        s: {
          $sum: "$NbOfarticles",
          window: {
            documents: [
              "unbounded",
              "unbounded"
            ]
          }
        }
      }
    }
  },
  {
    "$set": {
      "Percent": {
        $round: [
          {
            "$divide": [
              "$NbOfarticles",
              "$s"
            ]
          },
          2
        ]
      }
    }
  }
])

mongoplayground

CodePudding user response:

For those not yet on v5.0, here is an alternative that uses the "group and push $$CURRENT" approach to hang on to all the inputs that go into a group so they can be used later.

db.foo.aggregate([
    {$group: {_id: "$DateExtraction", total: {$sum:"$NbOfarticles"},
              X: {$push: "$$CURRENT"} }}
    ,{$unwind: "$X"}

    // At this point, you can see all the raw materials for the percentage are
    // now at hand.  Reading this next stage "inside out", we first compute
    // pct, then assign it to key 'pct' in a new object, which we then merge
    // with the existing object (effectively, this is 'X.pct = calc'), and
    // lastly we turn the full-fledged 'X' object -- which includes '_id' and
    // everything else -- into the root doc: 
    ,{$replaceRoot: {newRoot: {$mergeObjects: ["$X",
                       {"pct":{$round:[{$divide:["$X.NbOfarticles","$total"]},2]}}
                                              ]}
    }}
]);

Here is a variation that moves the work into a $map before the $unwind. Operating on the array X inside the doc instead of as unwound objects gives us the additional opportunity to cut down the size of the array and thus create fewer new objects, although in this example we do not do that. Note that the heart of the matter (calc pct, set up a new object with just pct:calc, and merge) is the same for both approaches.

db.foo.aggregate([
    {$group: {_id: "$DateExtraction", total: {$sum:"$NbOfarticles"}, X: {$push: "$$CURRENT"} }}

    ,{$project: {X: {$map: {input: "$X",
        in: {$mergeObjects: [ "$$this",
              {'pct':{$round:[{$divide:["$$this.NbOfarticles","$total"]},2]} }
                            ]
            }
                           }}
    }}

    ,{$unwind: "$X"}
    ,{$replaceRoot: {newRoot: "$X"}}
]);
  • Related