Most efficient way to map two arrays of objects to prepare for 'upload'

Time:10-24

Sorry if the title is a bit confusing, I wasn't sure how to word it in a couple of words.

I'm currently dealing with a situation where the user uploads a .csv or Excel file, and the data must be mapped properly to prepare for a batch upload. It will make more sense as you read the code below!

First step: The user uploads the .csv/Excel file, and it's transformed into an array of arrays (one per row). Generally the first array holds the headers.

The data will look like the below (including headers). It can be anywhere from 100 items up to ~100,000 items:

const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
]

Once this is uploaded, the user will map out each field to the proper schema. This could either be all of the fields or only a select few.

For instance, say the user wants to exclude the address portion except for the zip code. We would get back the 'mapped fields' array, renamed to the proper schema names (i.e. First Name => firstName):

const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', /* empty */, /* empty */, /* empty */, 'zipCode']

I've made it so the indexes of the mapped fields will always match the 'headers', so any unmapped header will have an empty value.

So in this scenario we know to only upload the data (of DUMMY_DATA) with the indexes [0, 1, 2, 3, 4, 8].
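As a minimal sketch of how that index list could be derived, the snippet below walks the mapped fields once and records the position of every non-empty entry. It assumes unmapped headers are represented as null (a sparse array with holes would behave the same here, since forEach skips holes); the name mappedIndexes is illustrative and not part of the original code.

```javascript
// Assumption: null marks a header the user left unmapped.
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];

// Record the index of every mapped (truthy) field name.
const mappedIndexes = [];
MAPPED_FIELDS.forEach((key, idx) => {
  if (key) mappedIndexes.push(idx);
});

console.log(mappedIndexes); // [0, 1, 2, 3, 4, 8]
```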

We then get to the final part where we want to upload the proper fields for all the data, so we would have the properly mapped schemas from MAPPED_FIELDS matching the mapped values from DUMMY_DATA...

const firstObjectToBeUploaded = {
  firstName: 'Lambert',
  lastName: 'Beckhouse',
  company: 'StackOverflow',
  email: '[email protected]',
  phone: '512-555-1738',
  zipCode: '78721'
}

try {
  await uploadData(firstObjectToBeUploaded)
} catch (err) {
  console.log(err)
}

All the data will be sent to an AWS Lambda function written in Node.js to handle the upload / logic.

I'm struggling a bit on how to implement this efficiently as the data can get quite large.

CodePudding user response:

You can map each row of the DUMMY_DATA array (minus the headers) into an array of [key, value] pairs, where each pair holds

  1. the key from MAPPED_FIELDS and
  2. the corresponding DUMMY_DATA value with the same index

You can then filter out the pairs whose key is null and turn the remaining pairs into an object using Object.fromEntries:

const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
]

const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode']

const objectsToUpload = DUMMY_DATA.slice(1).map(data =>
  Object.fromEntries(MAPPED_FIELDS
    .map((key, idx) => [key, data[idx]])
    .filter(a => a[0])
  )
)

console.log(objectsToUpload)

CodePudding user response:

If you're looking for some performance gains at larger array sizes, you can apply the same logic as the previous answer but implement it with standard for loops. Here the entries() of the MAPPED_FIELDS array are collected once before the loop, to avoid regenerating the entries iterator for every row, and null keys are simply skipped rather than filtered out afterwards.

const DUMMY_DATA = [
  ['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
  ['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
  ['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];

const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
const MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];

const objectsToUpload = [];
for (const datum of DUMMY_DATA.slice(1)) {
  const obj = {};
  for (const [idx, key] of MAPPED_FIELDS_ENTRIES) {
    if (key !== null) {
      obj[key] = datum[idx];
    }
  }
  objectsToUpload.push(obj);
}

console.log(objectsToUpload);
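A possible further variation on the same idea, offered as a sketch rather than a measured improvement: drop the null keys once, before the row loop, so the inner loop only visits mapped columns, and iterate from index 1 instead of calling slice(1) to avoid copying a large array. The sample data is shortened for illustration and MAPPED_PAIRS is a hypothetical name; whether this is actually faster depends on the engine and the data, so it is worth timing on a realistic file.

```javascript
const DUMMY_DATA = [
  ['First Name', 'Last Name', 'City', 'Zip Code'],
  ['Lambert', 'Beckhouse', 'Austin', '78721'],
  ['Maryanna', 'Vassman', 'Fort Smith', '72916']
];

const MAPPED_FIELDS = ['firstName', 'lastName', null, 'zipCode'];

// Keep only [index, key] pairs for mapped columns; computed once, outside the row loop.
const MAPPED_PAIRS = [];
for (let idx = 0; idx < MAPPED_FIELDS.length; idx++) {
  if (MAPPED_FIELDS[idx]) MAPPED_PAIRS.push([idx, MAPPED_FIELDS[idx]]);
}

const objectsToUpload = [];
// Start at 1 to skip the header row without copying the array.
for (let row = 1; row < DUMMY_DATA.length; row++) {
  const datum = DUMMY_DATA[row];
  const obj = {};
  for (const [idx, key] of MAPPED_PAIRS) {
    obj[key] = datum[idx];
  }
  objectsToUpload.push(obj);
}

console.log(objectsToUpload);
```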
