How do I combine several files removing BOM using Windows command line?

Time:04-12

I have several very large CSV (technically TSV) files that I need to append together. I had used:

copy file1.txt + file2.txt + ... + fileN.txt combined.txt

but then discovered that each file has a BOM at the start (ï»¿), which then appears multiple times in the middle of the combined file.

However, the files are very big (30-40 million lines each), so I can't open them in Notepad and re-save them to remove the BOMs. I need a command-line solution (either cmd or PowerShell), ideally one that doesn't require downloading extra libraries.

To recap:

  • Files are too large to open in e.g. Notepad, so the solution needs to work from the command line
  • This is on Windows, not *nix

(In my case N=4, so I could cope with a solution that removes the BOM from an individual file, which I would then run on each file before combining.)

Edit: This may be a possible solution: "Batch script remove BOM (ï»¿) from file", but my knowledge of encodings and PowerShell/batch is so poor that I can't even tell whether it's applicable or not!
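
To make the goal concrete, here is a minimal sketch of the combine-and-strip step. It is written in Python 3 purely for illustration, which assumes Python is installed (something the question would ideally avoid). It also assumes the BOM is the UTF-8 one (bytes EF BB BF, which display as ï»¿). The function name `combine_without_bom` is hypothetical, not part of any existing tool. Files are copied in chunks, so nothing has to fit in memory.

```python
import codecs
import shutil


def combine_without_bom(input_paths, output_path, chunk_size=1024 * 1024):
    """Concatenate files byte-for-byte, dropping a leading UTF-8 BOM from each."""
    bom = codecs.BOM_UTF8  # b'\xef\xbb\xbf'
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                # Read only the first three bytes to check for a BOM.
                head = src.read(len(bom))
                if head != bom:
                    # No BOM: these bytes are real data, so keep them.
                    out.write(head)
                # Stream the rest of the file in fixed-size chunks.
                shutil.copyfileobj(src, out, chunk_size)
```

It could be called as `combine_without_bom(["file1.txt", "file2.txt"], "combined.txt")`, with the file names standing in for the real ones. The sketch does not add a line break between files, so each input is assumed to end with a newline.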
