Home > Back-end >  Using the Aggregate Framework, how can I embelish a Collection in MongoDB with data from a pipeline
Using the Aggregate Framework, how can I embelish a Collection in MongoDB with data from a pipeline

Time:11-26

db.Parent.insertMany([
{ParentName: "MS", Type: "TAA"},
{ParentName: "MS", Type: "TAB"},
{ParentName: "MS", Type: "TAC"},
{ParentName: "MS", Type: "TAD"},
{ParentName: "GP", Type: "TAA"},
{ParentName: "GP", Type: "TAB"},
{ParentName: "GP", Type: "TAC"},
])

db.Mapper.insertMany([
{Type: "TAA", Identifier: "RA"},
{Type: "TAB", Identifier: "SA"},
{Type: "TAC", Identifier: "TA"},
{Type: "TAD", Identifier: "WA"},
])

db.Items.insertMany ([
{Name: "One", Identifier: "RA"},
{Name: "Two", Identifier: "RA"},
{Name: "Three", Identifier: "RA"},
{Name: "Four", Identifier: "SA"},
{Name: "Five", Identifier: "SA"},
{Name: "Six", Identifier: "WA"},
])

For a given input parameter of "GP" I want to embelish Items with the Type from Parent, mapped via Mapper, so that the output is as follows. I want all Items regardless of whether there are any matches from Parent/Mapper or not.

[
{Name: "One",  Identifier: "RA", Type: "TAA"},
{Name: "Two",  Identifier: "RA", Type: "TAA"},
{Name: "Three",Identifier: "RA", Type: "TAA"},
{Name: "Four", Identifier: "SA", Type: "TAB"},
{Name: "Five", Identifier: "SA", Type: "TAB"},
{Name: "Six",  Identifier: "WA},
]

I got as far as selecting the right rows from Mapper for a given input of "GP" in Parent and got stuck there.

I'm not sure if I could switch the query around and start with Items and then insert any matches.

db.Parent.aggregate(
{$match: {ParentName: "GP}},
{$project: {Type:1, _id:0}},
{$group: {_id:1, types: {$addToSet: "$Type"}}},
{$lookup: {
  from: "Mapper",
  let: {types: "$types"},
  pipeline:[{
    $match: {$expr: {$in: ["$Type", "$$types"]}}    
  }],
  as: "mapper"
}},
??? Right outer join with Items here ???
).pretty()

CodePudding user response:

Query1

  • not nested join(unwind first)
  • lookup with Mapper on Identifier
  • unwind and project to keep only what we need
  • lookup with Parent with 2 creterias
    • "$ParentName" = "GP"
    • and same Type
  • unwind with preserveNullAndEmptyArrays to keep those that didn't join also
  • project to keep only what is needed

Test code here

items.aggregate(
[{"$lookup": 
    {"from": "Mapper",
      "localField": "Identifier",
      "foreignField": "Identifier",
      "as": "joined"}},
  {"$unwind": {"path": "$joined"}},
  {"$project": {"Type": "$joined.Type", "Name": 1, "Identifier": 1}},
  {"$lookup": 
    {"from": "Parent",
      "let": {"type": "$Type", "indentifier": "$Identifier"},
      "pipeline": 
      [{"$match": 
          {"$expr": 
            {"$and": 
              [{"$eq": ["$ParentName", "GP"]}, {"$eq": ["$$type", "$Type"]}]}}},
        {"$set": {"Identifier": "$$indentifier"}}],
      "as": "joined"}},
  {"$unwind": {"path": "$joined", "preserveNullAndEmptyArrays": true}},
  {"$project": 
    {"_id": 0, "Name": 1, "Identifier": "$joined.Identifier", "Type": 1}}])

Query2

  • nested join
  • join on itemIndetifier
  • nested join on type only if ParentName="GP"
  • unwind with option preserveNullAndEmptyArrays like above
  • final unwind
  • project to keep what we need

Test code here

items.aggregate(
[{"$lookup": 
    {"from": "Mapper",
      "let": {"itemIndetifier": "$Identifier"},
      "pipeline": 
      [{"$match": {"$expr": {"$eq": ["$$itemIndetifier", "$Identifier"]}}},
        {"$lookup": 
          {"from": "Parent",
            "let": {"type": "$Type"},
            "pipeline": 
            [{"$match": 
                {"$expr": 
                  {"$and": 
                    [{"$eq": ["$ParentName", "GP"]},
                      {"$eq": ["$$type", "$Type"]}]}}}],
            "as": "joined1"}},
        {"$unwind": 
          {"path": "$joined1", "preserveNullAndEmptyArrays": true}}],
      "as": "joined2"}},
  {"$unwind": {"path": "$joined2"}},
  {"$project": 
    {"_id" : 0,"Name": 1, "Identifier": 1, "Type": "$joined2.joined1.Type"}}])

*Indexes to work on pipeline lookup you need MongoDB 5, if you you benchmark those 2 on your data, send some feedback if you can on which was faster. I think both queries do what you need.

  • Related