Home > OS >  Import CSV with Array object in MongoDB 4.2
Import CSV with Array object in MongoDB 4.2

Time:06-06

I've got a CSV file where there a column fill with Array data, written as ['foo', 'bar', 'baz'] in the file.

Manufacturer;EngineCount;ImagesURL
Boeing,2,['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']
Airbus,4,['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']

I can't find a way to import it as an Array in MongoDB (4.2 ), it's always imported as string "['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']". Maybe it's not possible with .csv file.

So I found a solution on stackoverflow where we update the fields to parse them as an array using JSON.parse function. But I new to MongoDB and I can't success to update the field with old value in MongoDB :

I tried db.planes.updateMany({}, {$set: {ImagesURL:JSON.parse("$ImagesURLS")}}).
It's return error SyntaxError: Unexpected token $ in JSON at position 0 Where is my mistake ?

Thank you.

CodePudding user response:

Option 1: Transform the input abit and use the option --useArrayIndexFields as follow:

input.csv: ( remove the brackets and add index in the ImagesURL header per array field )

 Manufacturer,EngineCount,ImagesURL.0,ImagesURL.1,ImagesURL.2
 Boeing,2,'https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3'
 Airbus,4,'https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3'

And do:

 mongoimport --db=myDB --col=planes --type=csv --headerline --useArrayIndexFields --file=input.csv

And you will have the documents imported as:

{Manufacturer:"Boeing",EngineCount:2,ImagesURL:['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']}

{Manufacturer:"Airbus",EngineCount:4,ImageURL:['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']}

Option 2: Update after the import based on field type is string

// the document inserted as string: "["https://url.com/image1","https://url.com/image2"]"

   var ops = [];
   db.planes.find({ "ImagesURL": { "$type": 2} }).forEach(doc => {
   var ImagesURL = doc.ImagesURL.split(',').map( e => e.replace(/"|\[|\]|\\/gm,'').toString() );
   ops.push({
"updateOne": {
  "filter": { "_id": doc._id },
  "update": { "$set": { "ImagesURL": ImagesURL } }
  }
 });

     if ( ops.length >= 100 ) {
        db.planes.bulkWrite(ops);
        ops = [];
        }             
  });

if ( ops.length > 0 ) {
 db.planes.bulkWrite(ops);
 ops = [];
}
  • Related