I am trying to import a ~2.5 GiB .csv file containing 7 million records.
----- 2021-09-13 06:28 2745868408 thefile.txt
After 3 hours I stopped the following command. Task Manager was reporting memory utilization near 100% and CPU utilization on all cores ~90%.
$x = Import-Csv -Path '.\thefile.txt' -Delimiter '|'
Are there any known limits for Import-Csv? Must Get-Content | ForEach-Object be used instead?
PS C:\> $PSVersionTable.PSVersion.ToString()
7.1.4
CodePudding user response:
You might have more luck using Import-Csv inside a pipeline instead of assigning the entire output to a variable.
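For example, a minimal sketch (the 'Status' column and the output path are placeholders, not from your question): filtering while streaming means only the matching rows are kept in memory at any point.
# Stream records through the pipeline instead of holding all 7 million in a variable.
Import-Csv -Path '.\thefile.txt' -Delimiter '|' |
    Where-Object { $_.Status -eq 'ERROR' } |
    Export-Csv -Path '.\matches.csv' -NoTypeInformation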
However...
PowerShell, or scripting in general, is meant to make everyday tasks as easy as possible. That's why things like performance and memory consumption take lower priority than other considerations, such as simplicity and usability.
When you're faced with a high-load, performance-intensive task, a scripting tool is often no longer the ideal option.
Native PowerShell is fine for your everyday 1 KB CSV files, but for a file like this you should probably consider a third-party library. Of course, you can still use that library from inside PowerShell. It's .NET after all, which is why it's such a great tool IMHO.
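As a rough sketch of what "using .NET from PowerShell" can look like without any extra library, the following streams the file with a plain StreamReader; it assumes the fields contain no quoted or embedded '|' characters, which a real CSV library would handle for you.
$reader = [System.IO.StreamReader]::new((Resolve-Path '.\thefile.txt').Path)
try {
    $header = $reader.ReadLine().Split('|')   # first line holds the column names
    while (-not $reader.EndOfStream) {
        $fields = $reader.ReadLine().Split('|')
        # work with $fields here (count, match, aggregate) instead of
        # accumulating 7 million objects in a variable
    }
}
finally {
    $reader.Dispose()
}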
As has been commented, I don't think there is any hard-coded limitation in the cmdlet. The limit is only your hardware and the simple fact that the cmdlet wasn't designed to handle huge files efficiently, but to be easy to use for everyday cases.
CodePudding user response:
Fully agree with @marsze
Just a test you can try: if you only want to look for specific records in the .csv file, avoid loading the whole thing into memory and pipe it into a filter instead. I don't use this method with Import-Csv but with Get-Content, and it lets me find specific records in 2 GB log files with acceptable performance.
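As a sketch (the pattern and output path below are placeholders): something along these lines streams the file line by line instead of materializing it.
# Keep only the lines matching a pattern; nothing is held in memory beyond the current line.
Get-Content -Path '.\thefile.txt' |
    Where-Object { $_ -match 'some-value' } |
    Set-Content -Path '.\found.txt'
Select-String -Path '.\thefile.txt' -Pattern 'some-value' is another streaming option and is usually faster than Get-Content piped into Where-Object.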