I am trying to use $addFields to dynamically create a field where any camelCase words or any other non-standard word breaks (word.break, word_break) in a certain field are replaced with a space to be caught in a phrase match.
Looking through the string operators, I thought $replaceAll might do the trick, but I can't see any way to reference a captured group from the find field, and the find field doesn't seem to accept a $regex either.
In theory I was thinking it would go something like this:
aggregate.push({
$addFields: {
businessNameBreakWords: {
$replaceAll: {
input: '$businessName',
find: /([a-z])([A-Z])/, // a regex would go here
replacement: '$1 $2',
},
},
},
});
Is what I'm trying to do possible?
Input (i.e. the fields as stored in mongo) e.g.
Lisbon FunSushiBar
Escape Room @york.dungeons.minster
Output: using $addFields, I want it to look something like:
Lisbon Fun Sushi Bar
Escape Room york dungeons minster
I need to do this using $addFields or a projection, as this aggregation includes a compound phrase search which I need to run against the created field. This is so someone can search "Lisbon Sushi" and have that result appear with a high match score, which it currently doesn't because the "Sushi" inside the camelCased word does not start at a word boundary.
Thanks.
Note: I have also tried $function, but it is unavailable to me and fails with the error "$function not allowed in this atlas tier".
Mongo v: 4.4.1
CodePudding user response:
Split the businessName string into an array of Unicode characters, then reduce that array back into a string, replacing each special character (., @, _) with a space and inserting a space before each uppercase character.
db.users.aggregate([
{
"$addFields": {
"nameRange": {
"$map": {
"input": {
"$range": [
0,
{
"$strLenCP": "$businessName"
},
1
]
},
"as": "inp",
"in": {
"$substrCP": [
"$businessName",
"$$inp",
1
]
}
}
}
}
},
{
"$addFields": {
"businessNameBreakWords": {
"$reduce": {
"input": "$nameRange",
"initialValue": "",
"in": {
"$cond": [
{
"$regexMatch": {
"input": "$$this",
"regex": "^[A-Z]$"
}
},
{
"$concat": [
"$$value",
" ",
"$$this"
]
},
{
"$cond": [
{
"$regexMatch": {
"input": "$$this",
"regex": "^[.@_]$"
}
},
{
"$concat": [
"$$value",
" "
]
},
{
"$concat": [
"$$value",
"$$this"
]
}
]
}
]
}
}
}
}
},
{
"$project": {
"businessName": 1,
"businessNameBreakWords": 1
}
}
])
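To see what the $reduce stage produces without running it against a database, the same character-by-character logic can be mirrored in plain JavaScript. This is only a rough sketch for checking expected output: breakWords and collapseSpaces are hypothetical helper names, not MongoDB operators. It also shows that a space is inserted before every uppercase letter, including one at the start of a word, so the raw result has a leading space and some doubled spaces.

```javascript
// Plain-JavaScript mirror of the $reduce logic, for checking expected output locally.
// breakWords and collapseSpaces are hypothetical helper names, not MongoDB APIs.
function breakWords(name) {
  // Array.from splits the string by Unicode code point.
  return Array.from(name).reduce((value, ch) => {
    if (/^[A-Z]$/.test(ch)) return value + " " + ch; // space before an uppercase letter
    if (/^[.@_]$/.test(ch)) return value + " ";      // separator becomes a space
    return value + ch;                               // everything else is kept as-is
  }, "");
}

// Trim the ends and collapse runs of whitespace into a single space.
function collapseSpaces(s) {
  return s.trim().replace(/\s+/g, " ");
}

console.log(JSON.stringify(breakWords("Lisbon FunSushiBar")));
// → " Lisbon  Fun Sushi Bar"
console.log(collapseSpaces(breakWords("Escape Room @york.dungeons.minster")));
// → "Escape Room york dungeons minster"
```

If the extra spaces matter for your search, something similar could probably be done in the pipeline itself, for example by wrapping the result in $trim and using $replaceAll to turn two consecutive spaces into one.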