Home > Software engineering >  How to restructure a collection in MongoDB
How to restructure a collection in MongoDB

Time:10-21

I'm looking to restructure my MongoDB collection and haven't been able to do so. I'm quite new to it and looking for some help. I'm struggling to access move the data within the "itemsList" field.

My collection documents are currently structured like this:

{
  "_id": 1,
  "pageName": "List of Fruit",
  "itemsList":[
    {
      "myID": 101,
      "itemName": "Apple"
    },
    {
      "myID": 102,
      "itemName": "Orange"
    }
  ]
},
{
  "_id": 2,
  "pageName": "List of Computers",
  "itemsList":[
    {
      "myID": 201,
      "itemName": "MacBook"
    },
    {
      "myID": 202,
      "itemName": "Desktop"
    }
  ]
}

The end result

But I would like the data to be restructured so that the value for "itemName" is it's own document.

I would also like to change the name of "myID" to "itemID".

And save the new documents to another collection.

{
  "_id": 1,
  "itemName": "Apple",
  "itemID": 101,
  "pageName": "List of Fruit"
},
{
  "_id": 2,
  "itemName": "Orange",
  "itemID": 102,
  "pageName": "List of Fruit"
},
{
  "_id": 3,
  "itemName": "MacBook",
  "itemID": 201,
  "pageName": "List of Computers"
},
{
  "_id": 4,
  "itemName": "Desktop",
  "itemID": 202,
  "pageName": "List of Computers"
}

What I've tried

I have tried using MongoDB's aggregate functionality, but because there are multiple "itemName" fields in each document, it will add both of them to one Array - instead of one in each document.

db.collection.aggregate([
  {$Project:{
    itemName: "$itemsList.itemName",
    itemID: "$itemsList.otherID",
    pageName: "$pageName"
  }},
  {$out: "myNewCollection"}
])

I've also tried using PyMongo 3.x to loop through the document's fields and save as a new document, but haven't been successful.

Ways to implement it

I'm open to using MongoDB's aggregate functionality, if it can move these items to their own documents, or a Python script (3.x) - or any other means you think can help.

Thanks in advance for your help!

CodePudding user response:

You just need a $unwind to "break" the array. Then you can do some data wrangling and output to your collection.

Note that as you didn't specify the exact requirement for the _id. You might need to take extra handling. Below demonstration use the native _id generation, which will auto assigned ObjectIds.

db.collection.aggregate([
  {
    "$unwind": "$itemsList"
  },
  {
    "$project": {
      "_id": 0,
      "itemName": "$itemsList.itemName",
      "itemID": "$itemsList.myID",
      "pageName": "$pageName"
    }
  },
  {
    $out: "myNewCollection"
  }
])

Here is the Mongo playground for your reference.

  • Related