I have a collection whose documents have a field named "data" that can contain arbitrary fields. I need to get every field name that exists inside "data" across all documents in the collection, or alternatively get the documents that have different sets of fields inside "data".
For example, if I have:
[
{
_id: "45454",
name: "fulano",
city: "cali",
data: {
age: 12,
lastName: "panguano",
cars: 0
}
},
{
_id: "67899",
name: "juanito",
city: "cali",
data: {
age: 23,
lastName: "merlano",
cars: 2
}
},
{
_id: "67899",
name: "olito",
city: "nw",
data: {
lastName: "betito",
cars: 2
}
},
{
_id: "11223",
name: "cabrito",
city: "trujillo",
data: {
age: 28,
cars: 1,
moto: 3
}
},
]
What I would like to get:
["age", "lastName", "cars", "moto"]
or the documents whose "data" fields differ from one another, regardless of the values:
[
{
_id: "45454",
name: "fulano",
city: "cali",
data: {
age: 12,
lastName: "panguano",
cars: 0
}
},
{
_id: "67899",
name: "olito",
city: "nw",
data: {
lastName: "betito",
cars: 2
}
},
{
_id: "11223",
name: "cabrito",
city: "trujillo",
data: {
age: 28,
cars: 1,
moto: 3
}
}
]
The collection has a very large number of documents, so fetching everything with a find and then looping over the results (for example with a for loop) could be a problem in terms of resources.
CodePudding user response:
Here's a way to do it in JavaScript once you have an array of all the documents in the collection:
let arr = [
{
_id: "45454",
name: "fulano",
city: "cali",
data: {
age: 12,
lastName: "panguano",
cars: 0
}
},
{
_id: "67899",
name: "juanito",
city: "cali",
data: {
age: 23,
lastName: "merlano",
cars: 2
}
},
{
_id: "67899",
name: "olito",
city: "nw",
data: {
lastName: "betito",
cars: 2
}
},
{
_id: "11223",
name: "cabrito",
city: "trujillo",
data: {
age: 28,
cars: 1,
moto: 3
}
},
]
You can use the .map method to get an array of the data objects like so:
arr = arr.map(obj => obj.data)
This will return
[
{
"age": 12,
"lastName": "panguano",
"cars": 0
},
{
"age": 23,
"lastName": "merlano",
"cars": 2
},
{
"lastName": "betito",
"cars": 2
},
{
"age": 28,
"cars": 1,
"moto": 3
}
]
Then you can get an array of data object keys by looping through the array of data objects like so:
let dataKeys = [];
arr.forEach(obj => {
  dataKeys = [...dataKeys, ...Object.keys(obj)]
})
This gives an array of keys that still contains duplicates:
dataKeys = [
"age",
"lastName",
"cars",
"age",
"lastName",
"cars",
"lastName",
"cars",
"age",
"cars",
"moto"
]
Then keep only the first occurrence of each key using the .filter and .findIndex methods:
let uniqueKeys = dataKeys.filter((elem, index) => dataKeys.findIndex(obj => obj === elem) === index)
And this will give you
[
"age",
"lastName",
"cars",
"moto"
]
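As a shorter variant of the steps above, here is a minimal sketch that uses a Set for the unique keys and a Map to keep one document per distinct set of "data" keys (the second thing the question asks for). The names docs, uniqueKeys, firstByShape and variedDocs are made up for illustration, and the sample array is trimmed to the relevant fields. It still assumes the documents are already loaded in memory.

```javascript
// Sample documents from the question, trimmed to the relevant fields.
const docs = [
  { name: "fulano", data: { age: 12, lastName: "panguano", cars: 0 } },
  { name: "juanito", data: { age: 23, lastName: "merlano", cars: 2 } },
  { name: "olito", data: { lastName: "betito", cars: 2 } },
  { name: "cabrito", data: { age: 28, cars: 1, moto: 3 } },
];

// Collect the keys of every "data" object; the Set drops duplicates
// and keeps the order in which keys were first seen.
const uniqueKeys = [
  ...new Set(docs.flatMap(doc => Object.keys(doc.data ?? {}))),
];

// Keep the first document seen for each distinct set of "data" keys.
// Sorting the keys makes {a, b} and {b, a} count as the same shape.
const firstByShape = new Map();
for (const doc of docs) {
  const shape = JSON.stringify(Object.keys(doc.data ?? {}).sort());
  if (!firstByShape.has(shape)) {
    firstByShape.set(shape, doc);
  }
}
const variedDocs = [...firstByShape.values()];

console.log(uniqueKeys); // [ 'age', 'lastName', 'cars', 'moto' ]
console.log(variedDocs.map(doc => doc.name)); // [ 'fulano', 'olito', 'cabrito' ]
```

The Set lookup avoids rescanning the whole key array for every element, which is what the .filter/.findIndex combination does.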
CodePudding user response:
Regardless of how you execute this (in memory or on the database), it is a very expensive query. That said, I agree that doing it in memory is the wrong approach.
Here's how to do it using the aggregation pipeline and some standard operators like $map and $objectToArray:
db.collection.aggregate([
  {
    $project: {
      keys: {
        $map: {
          input: {
            "$objectToArray": "$data"
          },
          in: "$$this.k"
        }
      }
    }
  },
  {
    "$unwind": "$keys"
  },
  {
    $group: {
      _id: "$keys"
    }
  }
])
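The pipeline above returns one result document per key. Two possible variations, sketched here as plain arrays that could be passed to db.collection.aggregate(...): one that collapses the keys into a single array, and one that keeps a representative document per distinct set of "data" keys. The variable names are made up for illustration, and the second pipeline assumes MongoDB 5.2 or later because it relies on $sortArray. Neither has been run against a real dataset here, so treat them as a starting point.

```javascript
// Variation 1: a single result document whose "keys" array holds
// every distinct field name found under "data".
const allKeysPipeline = [
  {
    $project: {
      keys: { $map: { input: { $objectToArray: "$data" }, in: "$$this.k" } },
    },
  },
  { $unwind: "$keys" },
  // $addToSet removes duplicates; the order of the resulting array is unspecified.
  { $group: { _id: null, keys: { $addToSet: "$keys" } } },
];

// Variation 2: one representative document per distinct set of "data" keys.
// Assumption: MongoDB 5.2+ ($sortArray). Sorting the key names makes
// the grouping independent of the field order inside "data".
const distinctShapesPipeline = [
  {
    $group: {
      _id: {
        $sortArray: {
          input: { $map: { input: { $objectToArray: "$data" }, in: "$$this.k" } },
          sortBy: 1,
        },
      },
      // Keep the first document the group encounters as its representative.
      doc: { $first: "$$ROOT" },
    },
  },
  // Return the representative documents in their original shape.
  { $replaceRoot: { newRoot: "$doc" } },
];

// Usage in mongosh (not executed here):
// db.collection.aggregate(allKeysPipeline)
// db.collection.aggregate(distinctShapesPipeline)
```

Both variations still read every document in the collection, so the cost caveat above applies to them as well.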