Home > Net >  how to calculate the statistic inside an array in mongodb aggregation?
how to calculate the statistic inside an array in mongodb aggregation?

Time:12-29

I am working on mongodb to build a simple survey app, the hardest part is to get the statistic using mongodb 's aggregation framework.

Here is one submit answers of the survey. The answers property is a array carrying the answers of each question .

{
   uid:'xxxx',
   surveyId:'xxxxx',
   answers:[
     {   answer: A },    // question 1 ‘s answer
     {   answer: B },    // question 2 ‘s answer 
     {   answer: C }      // question 3 ‘s answer 
   ]
}

The final result I want to have is to know the overall statistic of the survey.

   Question 1:
   A  50%  B 40% C 10%
   Question 2:
   A  60%  B 40% C 0%

The tricky part of mongodb aggregation is how to deal the 'array'

CodePudding user response:

A simple option is to $unwind and $group:

db.collection.aggregate([
  {$unwind: {path: "$answers", includeArrayIndex: "index"}},
  {$group: {
      _id: {
        surveyId: "$surveyId",
        index: "$index"
      },
      A: {$sum: {$cond: [{$eq: ["$answers.answer", "A"]}, 1, 0]}},
      B: {$sum: {$cond: [{$eq: ["$answers.answer", "B"]}, 1, 0]}},
      C: {$sum: {$cond: [{$eq: ["$answers.answer", "C"]}, 1, 0]}},
      all: {$sum: 1}
  }},
  {$project: {
      A: {$multiply: [{$divide: ["$A", "$all"]}, 100]},
      B: {$multiply: [{$divide: ["$B", "$all"]}, 100]},
      C: {$multiply: [{$divide: ["$C", "$all"]}, 100]},
      question: {$add: ["$_id.index", 1]},
      surveyId: "$_id.surveyId",
      _id: 0
  }}
])

See how it works on the playground example

Or a more generic approach can use $group twice without knowing the answer options for each question:

db.collection.aggregate([
  {$unwind: {path: "$answers", includeArrayIndex: "index"}},
  {$group: {
      _id: {
        surveyId: "$surveyId",
        index: "$index",
        res: "$answers.answer"
      },
      count: {$sum: 1}
  }},
  {$group: {
      _id: {
        surveyId: "$_id.surveyId",
        index: "$_id.index"
      },
      data: {$push: {answer: "$_id.res", count: "$count"}},
      all: {$sum: "$count"}
  }},
  {$project: {
      data: {
        $map: {
          input: "$data",
          in: {
            answer: "$$this.answer",
            percent: {$multiply: [{$divide: ["$$this.count", "$all"]}, 100]}
          }
        }
      },
      question: {$add: ["$_id.index", 1]},
      surveyId: "$_id.surveyId",
      _id: 0
  }}
])

See how it works on the playground example

CodePudding user response:

> db.collection.find({},{"_id":0});
< { uid: 'xxxx',
    surveyId: 'xxx',
    answers: [ { answer: 'A' }, { answer: 'B' }, { answer: 'C' } ] }
  { uid: 'xxxx',
    surveyId: 'xxx',
    answers: [ { answer: 'B' }, { answer: 'C' }, { answer: 'C' } ] }
  { uid: 'xxxx',
    surveyId: 'xxx',
    answers: [ { answer: 'B' }, { answer: 'B' }, { answer: 'B' } ] }
  { uid: 'xxxx',
    surveyId: 'xxx',
    answers: [ { answer: 'B' }, { answer: 'A' }, { answer: 'C' } ] }
  { uid: 'xxxx',
    surveyId: 'xxx',
    answers: [ { answer: 'B' }, { answer: 'B' }, { answer: 'C' } ] }

Using custom $accumulator,

> db.collection.aggregate([
  {
      $group: {
          "_id": "$surveyId",
          "statistics": {
              $accumulator: {
                  initArgs: [3],                        //number of questions as argument for the init function
                  init: function(questions){            //initialise 3 arrays for each question, with all counts set to 0
                      let answerArray = new Array();
                      for(i=1;i<=questions;i  ){
                          answerArray.push({"question":i,"A":0,"B":0,"C":0,"total":0});
                      }
                      return answerArray;
                  },
                  accumulateArgs: ["$answers"],         //answers list as argument for accumulate function
                  accumulate: function(state, answers) {            //increment the counter/total for each answer
                      for(i=0;i<answers.length;i  ){
                          switch(answers[i].answer){
                              case "A": state[i].A = state[i].A 1; break;
                              case "B": state[i].B = state[i].B 1; break;
                              case "C": state[i].C = state[i].C 1; break;
                          }
                          state[i].total = state[i].total 1
                      }
                      return state;
                  },
                  merge: function(state1, state2) {                 //merge function to merge the intermediate results
                      for(i=0;i<state.length;i  ){
                          state[i].A = state1[i].A state2[i].A
                          state[i].B = state1[i].B state2[i].B
                          state[i].C = state1[i].C state2[i].C
                          state[i].total = state1[i].total state2[i].total
                          }
                      return state;
                  },
                  finalize: function(state){                        //finalize the result by calculating % ratio
                      for(i=0;i<state.length;i  ){
                          state[i].A = state[i].A*100/state[i].total;
                          state[i].B = state[i].B*100/state[i].total;
                          state[i].C = state[i].C*100/state[i].total;
                          delete state[i].total;
                      }
                      return state;
                  },
                  lang: "js"
              }
          }
      }
  }
  ]);
< {
      "_id" : "xxx",
      "statistics" : [ 
          {
              "question" : 1.0, "A" : 20.0, "B" : 80.0, "C" : 0.0
          }, 
          {
              "question" : 2.0, "A" : 20.0, "B" : 60.0, "C" : 20.0
          }, 
          {
              "question" : 3.0, "A" : 0.0, "B" : 20.0, "C" : 80.0
          }
      ]
  }
  • Related