Using JavaScript, I have an array of objects and I'm trying to find the duplicate entries so that I can eventually pass them to a separate function that removes them from a database.
My sample array could be:
const myArray = [
{ 'id': 111, 'lorem': 'ipsum' },
{ 'id': 222, 'lorem': 'dorem' },
{ 'id': 111, 'lorem': 'polus' },
{ 'id': 111, 'lorem': 'waifu' },
]
I'd want to return an array of all items that are duplicates by the key id. In this example, my returned array would be:
[
{ 'id': 111, 'lorem': 'ipsum' },
{ 'id': 111, 'lorem': 'polus' },
{ 'id': 111, 'lorem': 'waifu' },
]
Most of the online tutorials have me iterating over a short list of data, which is fine for such small examples. But my dataset is in the thousands and could reach the millions as my data grows, so I'm trying to find a smarter way of handling this logic.
I understand that I can use a Set(), but that doesn't actually give me the duplicate entries; it gives me a collection of non-duplicates. I need the duplicates themselves, not a new array of non-duplicate entries.
Without using a third party such as lodash or underscore, how would I ideally iterate over an array with unknown size, to eventually return the duplicate items for me to pass up the stream for processing?
CodePudding user response:
This may take a while on a large array, so I recommend you do this on the server.
// create an array with random IDs
const myArray = []
for (let i = 0; i < 10000; i++) {
  myArray.push({
    id: String(Math.floor(Math.random() * 10000)).padStart(3, "0"),
    "lorem": "ipsum"
  })
}
// examine them: ids holds each id once, dupes holds the id of every repeat occurrence
const ids = []
const dupes = []
myArray.forEach(({id}) => {
  // includes() scans the whole ids array, so this loop is quadratic overall
  if (ids.includes(id)) dupes.push(id);
  else ids.push(id)
})
console.log(ids.length, dupes, dupes.length)
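The loop above collects only the repeated ids, and each ids.includes call is a linear scan. If the goal is the full duplicate objects from the question, one possible variant is to count ids in a Map first and then filter. This is only a sketch: findDuplicatesById is a hypothetical helper name, and it assumes the ids are primitives (numbers or strings) so that Map key comparison behaves as expected.

```javascript
// Hypothetical helper: returns every item whose id occurs more than once.
function findDuplicatesById(items) {
  // First pass: count how many times each id appears.
  const counts = new Map();
  for (const { id } of items) {
    counts.set(id, (counts.get(id) ?? 0) + 1);
  }
  // Second pass: keep the items whose id was counted at least twice.
  return items.filter(({ id }) => counts.get(id) > 1);
}

// Sample data taken from the question.
const sample = [
  { id: 111, lorem: 'ipsum' },
  { id: 222, lorem: 'dorem' },
  { id: 111, lorem: 'polus' },
  { id: 111, lorem: 'waifu' },
];
const found = findDuplicatesById(sample);
console.log(found);
```

Both passes touch each item once, so the work grows roughly linearly with the array size, and the result keeps the original order of the input.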
CodePudding user response:
- Using Array#reduce, iterate over the array while updating a Map to group items by id
- Using Map#values, get the list of grouped arrays
- Using Array#filter, keep the arrays with more than one item
- Using Array#flat, return all arrays in one list
const myArray = [ { 'id': 111, 'lorem': 'ipsum' }, { 'id': 222, 'lorem': 'dorem' }, { 'id': 111, 'lorem': 'polus' }, { 'id': 111, 'lorem': 'waifu' } ];
const duplicates =
[...myArray.reduce((map, item) => // group items by id
map.set(item.id, [...(map.get(item.id) ?? []), item])
, new Map)
.values()] // get grouped arrays
.filter(list => list.length > 1) // keep duplicates
.flat(); // return one array
console.log(duplicates);
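The reduce version above builds a new array with spread every time it adds an item to a group, which can add up when one id has many entries. A possible alternative, sketched here with a plain loop, pushes into the existing group instead. The name groupedDuplicates and the sample data are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: groups items by id, then flattens the groups
// that contain more than one item.
function groupedDuplicates(items) {
  const groups = new Map();
  for (const item of items) {
    const group = groups.get(item.id);
    if (group) {
      group.push(item); // append in place instead of copying the group
    } else {
      groups.set(item.id, [item]);
    }
  }
  return [...groups.values()]
    .filter(group => group.length > 1) // keep ids seen more than once
    .flat();                           // merge the groups into one array
}

// Illustrative data with interleaved ids.
const interleaved = [
  { id: 1, v: 'a' },
  { id: 2, v: 'b' },
  { id: 1, v: 'c' },
  { id: 2, v: 'd' },
  { id: 3, v: 'e' },
];
const groupedResult = groupedDuplicates(interleaved);
console.log(groupedResult);
```

As with the reduce version, the output is ordered group by group (in the order each id was first seen) rather than in the original array order.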