Home > Net >  Mongodb aggregate frequencies of every field (dichotomous) in one query
Mongodb aggregate frequencies of every field (dichotomous) in one query

Time:09-26

i am fairly new to mongodb and try to develop a nice way of evaluating a so called multiple choice question.

The data looks like this:

db.test.insertMany([
  {
    Q2_1: 1,
    Q2_2: -77,
    Q2_3: 1
  },
  {
    Q2_1: -77,
    Q2_2: -77,
    Q2_3: 1
  },
  {
    Q2_1: 1,
    Q2_2: 0,
    Q2_3: 0
  },
  {
    Q2_1: 0,
    Q2_2: 1,
    Q2_3: 0
  }
])

In this example we have 4 probands, who gave answers to 3 items. Every field can contain one of three values -77, 0, 1

-77: proband did not see the item. So it is neither calculated in the 'base' NOR in 'abs'.

0: proband did see the item, but did not choose it. (counts for 'base' BUT NOT for 'abs')

1: proband did see the item, and chose it. (counts for 'base' AND for 'abs')

now i want a result for every item. So item 1 (Q2_1 has the key value of 1 and so on)

so item 1 would have been seen by 3 probands so the 'base' would be 3. it would have been chosen by two probands so the 'abs' would be 2. and therefore the 'perc' would be 0.666666.

expected result array:

[
  {
    "key": 1,
    "abs": 2,
    "base": 3,
    "perc": 0.6666666666
  },
  {
    "key": 2,
    "abs": 1,
    "base": 2,
    "perc": 0.5
  },
  {
    "key": 3,
    "abs": 2,
    "base": 4,
    "perc": 0.5
  }
]

Is it possible to do this evaluation in one aggregation query and get this expected result array?

thanks for help :-)

CodePudding user response:

Query

  • $objectToArray to remove the data from the keys (you should not save data on fields, fields are for the schema only, MongoDB query language is not made for data in fields)
  • unwind and replace root
  • group and 2 condition based accumulators base and abs
  • add the perc and fix the key, split on _ and take the second part
  • sort by key

*query is bigger because data on fields doesn't work good in MongoDB, so try to avoid it

Playmongo (you can put the mouse in the end of each stage to see what it does)

aggregate(
[{"$unset": ["_id"]}, {"$set": {"data": {"$objectToArray": "$$ROOT"}}},
 {"$unwind": "$data"}, {"$replaceRoot": {"newRoot": "$data"}},
 {"$group": 
   {"_id": "$k",
    "base": {"$sum": {"$cond": [{"$eq": ["$v", -77]}, 0, 1]}},
    "abs": {"$sum": {"$cond": [{"$eq": ["$v", 1]}, 1, 0]}}}},
 {"$set": {"key": {"$arrayElemAt": [{"$split": ["$_id", "_"]}, 1]}}},
 {"$set": {"_id": "$$REMOVE", "perc": {"$divide": ["$abs", "$base"]}}},
 {"$sort": {"key": 1}}])
  • Related