I have to read a large .csv file line by line, take the first column (which contains country names), and count how many times each country occurs. For example, if the file contains:
USA
UK
USA
the output should be:
USA - 2
UK - 1
code:
const fs = require('fs')
const readline = require('readline')

const file = readline.createInterface({
  input: fs.createReadStream('file.csv'),
  output: process.stdout,
  terminal: false
})

file.on('line', line => {
  const country = line.split(",", 1)
  const number = ??? // don't know how to check duplicates
  const result = country + number
  if (lineCount >= 1 && country != `""`) {
    console.log(result)
  }
  lineCount++
})
CodePudding user response:
For starters, String.prototype.split returns an array. Since you limit the result to one element, what you actually want is the first value of that array. You can read about it here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split
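To make the point about split concrete, here is a small sketch. The sample row is made up for illustration and is not taken from the asker's file:

```javascript
// Hypothetical CSV row, used only to illustrate the return value of split
const line = 'USA,2020,42'

// With a limit of 1, split still returns an array (with at most one element)
const parts = line.split(',', 1)

// Destructuring pulls the first element out as a plain string
const [country] = parts

console.log(parts)   // [ 'USA' ]
console.log(country) // USA
```

This is why the value has to be unwrapped (by destructuring or with `[0]`) before it is compared or used as a key.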
Next, you can create a map of all the countries that stores how many times each one has been seen, and then log the results once the file has been read completely:
const countries = {}
let lineCount = 0

file.on('line', line => {
  // Destructure the array and grab the first value
  const [country] = line.split(",", 1)
  // Calling trim on the country should remove outer white space
  if (lineCount >= 1 && country.trim() !== "") {
    if (!countries[country]) {
      // If the country is not in the map yet, store it with a count of 1
      countries[country] = 1
    } else {
      // Otherwise increment the existing count
      countries[country]++
    }
  }
  lineCount++
})
// Add another event listener for when the file has finished being read
// You may access the country data here, since this callback function
// won't be called till the file has been read
// https://nodejs.org/api/readline.html#event-close
file.on('close', () => {
  for (const country in countries) {
    console.log(`${country} - ${countries[country]}`)
  }
})