I am new to Mongodb and I have been assigned a task to extract data from mongoDB and create a csv and load it to Oracle database. Below is the data in Mongodb
{{
"_id":"69ajdslsdfdksjfef9",
"col1":"456780",
"refNum":"ref001"
}
{
"clients":{
"CLI_1": "9876547390",
"CLI_2": "fsdfasl"
}
{"names":[
{
"first":"dfsakfj",
"middle":"hgfgas",
"last":"komdssdfsd"
},
{
"first":"dfskdajf",
"middle": "fgjfgjfl",
"last": "ghfghsdklfg"
}]}
second row from collection
{{
"_id":"69ajdslsdfdksjfef9",
"col1":"456780",
"refNum":"ref001"
}
{
"clients":{
"CLI_1": "9876547390",
"CLI_2": "fsdfasl"
}
{"names":[
{
"first":"dfsakfj",
"middle":"hgfgas",
"last":"komdssdfsd"
}]}
I am using pymongo utility to query and create a dataframe before generating the csv file. However, I am able to create a dataframe but not able parse "names" properly as it is inconsistent when compared between the rows. Could anyone please share how to parse and create a csv with fields _id, first, middle, last
Thank You
CodePudding user response:
There is mongodb aggregation pipeline named $unwind which can help you with the task to flatten the names array so it is suitable for csv output as follow:
db.collection.aggregate([
{
$unwind: "$names"
},
{
$project: {
_id: 1,
first: "$names.first",
middle: "$names.middle",
last: "$names.last"
}
}
])// _id, first, middle, last