mongoDB delete n documents

Time:12-16

I have a huge MongoDB collection with 1 million records. I would like to delete all of them except for some random 10k records. Is there any way to delete everything except those 10k records in a single shot? Oracle, for example, has rownum. Is this even possible in MongoDB?

My sample collection:

{
    "_id" : ObjectId("5efe4fd0fe4we5e9000185c660a"),
    "modelNumber" : "517083bb-2b35-4fb9-b3b9-004101342418",
    "name" : "abc"
}

db.collection.count() reports 1 million records.

I want to wipe out all 1 million records except for some random 10k records that I want to keep.

CodePudding user response:

I don't think you can do it in a single shot. You first have to use an aggregation pipeline to collect the ids of the documents you want to delete, and then call delete_many (deleteMany in the shell) with those ids as the filter.

The following pymongo code shows the idea:

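# aggregate() yields documents like {'_id': ...}, so pull out the bare ids for $in
cursor = db.collection.aggregate([{'$sample': {'size': 990000}}, {'$project': {'_id': 1}}])
list_of_ids = [doc['_id'] for doc in cursor]
results = db.collection.delete_many({'_id': {'$in': list_of_ids}})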

This is how you can achieve your result.
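
A variant of the approach above, as a rough sketch rather than a tested recipe: sample the 10k ids to keep instead of the 990k ids to delete, then delete everything else with $nin. The filter stays small this way, whereas a filter listing 990,000 ids may run into MongoDB's 16 MB document size limit. The function name delete_all_but_sample and the database name in the usage comment are made up for illustration; the collection argument is assumed to be a pymongo Collection.

```python
def delete_all_but_sample(collection, keep=10_000):
    """Delete every document except `keep` randomly sampled ones.

    `collection` is assumed to be a pymongo Collection (anything exposing
    compatible aggregate() and delete_many() methods will do).
    Returns the number of deleted documents.
    """
    # $sample picks `keep` random documents; $project keeps only their _id.
    cursor = collection.aggregate([
        {"$sample": {"size": keep}},
        {"$project": {"_id": 1}},
    ])
    keep_ids = [doc["_id"] for doc in cursor]

    # $nin matches every document whose _id is NOT in the sampled set.
    result = collection.delete_many({"_id": {"$nin": keep_ids}})
    return result.deleted_count


# Usage (assumes a reachable server; "mydb" is a hypothetical database name):
# from pymongo import MongoClient
# coll = MongoClient()["mydb"]["collection"]
# print(delete_all_but_sample(coll, keep=10_000))
```

The delete still has to touch roughly 990k documents, so it remains a slow operation; the sketch only makes the filter manageable.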

CodePudding user response:

Deleting almost 1 million records will take some time. Since you want to keep 10k random documents, I assume this is test data. I would suggest creating a new collection with 10k documents and dropping the old one:

db.collection.renameCollection('collection-old')
db.getCollection('collection-old').aggregate([
   { $limit: 10000 },
   { $merge: { into: 'collection' } }
])
db.getCollection('collection-old').drop()

Note: don't forget to create your indexes on the new collection, if applicable.

Because $merge is used, the copied documents are merged with any documents already in the target collection, which matters if your application keeps inserting new documents in the meantime.
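
For readers working from Python rather than the shell, the same rename, copy and drop sequence might look roughly like the sketch below. It assumes db is a pymongo Database; the function name keep_first_n and the "-old" suffix parameter are made up for illustration. Like the shell version, $limit takes the first documents the server returns rather than a random selection.

```python
def keep_first_n(db, name, n=10_000, old_suffix="-old"):
    """Rename `name` aside, copy n documents back into a fresh `name`,
    then drop the renamed original. Returns the new document count.

    `db` is assumed to be a pymongo Database.
    """
    old_name = name + old_suffix

    # Move the full collection out of the way.
    db[name].rename(old_name)

    # $merge writes the pipeline output into `name`; the returned cursor
    # carries no documents, list() just drains it.
    list(db[old_name].aggregate([
        {"$limit": n},
        {"$merge": {"into": name}},
    ]))

    # Drop the renamed original, which still holds all the old documents.
    db.drop_collection(old_name)
    return db[name].count_documents({})


# Usage (assumes a reachable server; "mydb" is a hypothetical database name):
# from pymongo import MongoClient
# db = MongoClient()["mydb"]
# print(keep_first_n(db, "collection", n=10_000))
```

Swapping {"$limit": n} for {"$sample": {"size": n}} would be one way to get a random subset instead, at the cost of a slower copy.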
