Home > Software engineering >  Retrieve duplicate documents from mongo collection matching certain conditions
Retrieve duplicate documents from mongo collection matching certain conditions

Time:06-25

Below is a mongo collection named customerFinance containing a list of 4 documents

[
    {"VIN": 100,
    "CUSTOMER_TYPE": "1-PRIMARY BORROWER",
    "CUSTOMER_FIRST_NAME":"MATT",
    "DEAL_TYPE":"LEASE",
    "CONTRACT_START_DATE": "04/30/2021"},
    {"VIN": 101,
    "CUSTOMER_TYPE": "1-PRIMARY BORROWER",
    "CUSTOMER_FIRST_NAME":"JOHN",
    "DEAL_TYPE":"LEASE",
    "CONTRACT_START_DATE": "04/30/2021"},
    {"VIN": 101,
    "CUSTOMER_TYPE": "6-CO BORROWER",
    "CUSTOMER_FIRST_NAME":"SUSAN",
    "DEAL_TYPE":"LEASE",
    "CONTRACT_START_DATE": "04/30/2022"},
    {"VIN": 100,
    "CUSTOMER_TYPE": "1-PRIMARY BORROWER",
    "CUSTOMER_FIRST_NAME":"MATT",
    "DEAL_TYPE":"RETAIL",
    "CONTRACT_START_DATE": "04/30/2022"}
]

My goal is to retrieve only the first and fourth document from this collection as you can see they both have the following fields same - VIN, CUSTOMER_TYPE, CUSTOMER_FIRST_NAME

With the below piece of Java code, I am able to filter those VINs whose count is greater than 1

GroupOperation groupByVin = group("VIN").count().as("VIN_COUNT");
MatchOperation filterVins = match(new Criteria("VIN_COUNT").gt(1));
Aggregation aggregation = newAggregation(groupByVin, filterVins);
AggregationResults<Document> result = mongoTemplate.aggregate(aggregation, "customerFinance", Document.class);
return result.getMappedResults();

And I get the below result

   [{"_id": 100,
    "VIN_COUNT": 2}]

But I want my result to look like the below format

[
{"VIN": 100,
 "CUSTOMER_TYPE": "1-PRIMARY BORROWER",
 "CUSTOMER_FIRST_NAME":"MATT",
 "DEAL_TYPE":"LEASE",
 "CONTRACT_START_DATE": "04/30/2021"
},
{"VIN": 100,
 "CUSTOMER_TYPE": "1-PRIMARY BORROWER",
 "CUSTOMER_FIRST_NAME":"MATT",
 "DEAL_TYPE":"RETAIL",
 "CONTRACT_START_DATE": "04/30/2022"
}
]

Could you please help me understand on how can I achieve this?

CodePudding user response:

With mongoDB syntax you can do it like this:

Since you want the original documents in your results, you need to keep them during the $group step. One way of doing it, is $push: "$ROOT". The last two steps are just to format it as you wanted, since they are grouped together.

db.collection.aggregate([
  {$group: {_id: "$VIN", data: {$push: "$$ROOT"}, count: {$sum: 1}}},
  {$match: {count: {$gt: 1}}},
  {$unwind: "$data"},
  {$replaceRoot: {newRoot: "$data"}}
])

See how it works on the playground example

  • Related