I am fairly new with MongoDB, so apologizes for what might be a basic question.
I have a bunch of documents in a collection (call it CollA
), and taking a subset of it and the particular field in question would look like:
docs = [
{Value: "1,000,000 "},
{Value: "40,000"},
{Value: "2,000,000 "},
{Value: 500000},
{Value: 400000}]
Note the mixed float and string datatypes within the field "Value".
I am trying to find a way to:
- Convert "Value" to a float (so I assume remove the "," and do a
$ToDouble
on it) - In the cases where the string contains a " ", I would like to set a new field "Truncated" to "true".
The desired output would be
docs = [
{Value: 1,000,000,
Truncated: true},
{Value: 40,000},
{Value: 2,000,000,
Truncated: true}},
{Value: 500000},
{Value: 400000}]
So now they would all be "doubles", and the new field would be there in the documents where the " " existed.
Thanks so much!
CodePudding user response:
You can use $type to differentiate strings vs numbers. For string it's a combinations of if-else conditions, to add "Truncated" field, remove non-number characters, and eventually convert to number with $toDouble. The final $replaceWith is to bring additional fields to the root level:
db.colA.aggregate([
{
"$project": {
doc: {
"$cond": {
"if": {
$eq: [
{
"$type": "$Value"
},
"string"
]
},
"then": {
"$cond": {
"if": {
"$regexMatch": {
"input": "$Value",
"regex": "\\ $"
}
},
"then": {
"Value": {
"$toDouble": {
"$replaceAll": {
"input": {
"$replaceAll": {
"input": "$Value",
"find": " ",
"replacement": ""
}
},
"find": ",",
"replacement": ""
}
}
},
"Truncated": true
},
"else": {
"Value": {
"$toDouble": {
"$replaceAll": {
"input": "$Value",
"find": ",",
"replacement": ""
}
}
}
}
}
},
"else": {
"Value": "$Value"
}
}
}
}
},
{
"$replaceWith": "$doc"
}
])
CodePudding user response:
Query
- if not string (number) convert to double
- else(if string)
- split on (first $let)
- save the first part, removing , and to double (second $let)
- if 2 parts add the first part and
Truncated
- else add only the first part
aggregate(
[{"$replaceRoot":
{"newRoot":
{"$cond":
[{"$not": [{"$eq": [{"$type": "$Value"}, "string"]}]},
{"Value": {"$toDouble": "$Value"}},
{"$let":
{"vars": {"parts": {"$split": ["$Value", " "]}},
"in":
{"$let":
{"vars":
{"parts1":
{"$toDouble":
{"$replaceAll":
{"input": {"$arrayElemAt": ["$$parts", 0]},
"find": ",",
"replacement": ""}}}},
"in":
{"$cond":
[{"$arrayElemAt": ["$$parts", 1]},
{"Value": "$$parts1", "Truncated": true},
{"Value": "$$parts1"}]}}}}}]}}}])