PowerShell script using a lot of memory on the server when outputting a CSV file


I am currently trying to get the last modified date of all files in a 128.5 GB folder containing multiple sub-directories and files. However, whenever the script runs, it uses almost all of the memory on the server. (I assume this is because it is trying to fit all of the data into memory before writing it out to a .csv file.) Is there a way to output the data to a .csv file without using all the memory on the server? My script is below:

$results = Get-ChildItem -Force -Recurse -File -Path "C:\inetpub\wwwroot\" | Sort LastWriteTime -Descending | Select-Object FullName, LastWriteTime 

$results | Export-Csv "C:\Users\serveradmin\Documents\dates.csv" -notype 


CodePudding user response:

PowerShell can be memory intensive and slow... so I wrote you a script in Python. I tested it on my Mac and it works a charm. I've left notes in the script. Just amend the folder path to be scanned and where you want to save the CSV file. It will be faster and use less memory :o)

#Import Python Modules
import os,time
import pandas as pd

#Function to scan files in a directory
#(note: os.listdir only looks at the top level; it does not recurse into sub-directories)
def get_information(directory):
    file_list = []
    for i in os.listdir(directory):
        a = os.stat(os.path.join(directory, i))
        #[file name, last access time, last change time, last modification time]
        file_list.append([i, time.ctime(a.st_atime), time.ctime(a.st_ctime), time.ctime(a.st_mtime)])
    return file_list

#Enter Folder Path To Be Scanned
flist = get_information("/Users/username/FolderName1/FolderName2/data")
#print(flist)

#Build DataFrame Table
df = pd.DataFrame(flist)

#Insert DataFrame Table Columns
df.columns = ['file name', 'last access time', 'last change time', 'last modification time']

#Print output as test
#print(df)

#Build file path for output
src_path = "/Users/username/FolderName1/"
csvfilename = "output.csv"
csvfile = src_path + csvfilename

#Export to CSV
df.to_csv(csvfile, index=False)
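
For the use case in the question (a large tree with many sub-directories), a variant that walks the whole tree with os.walk and streams each row straight to the CSV with the standard csv module would also keep memory flat, since nothing is accumulated in a list or DataFrame. This is only a sketch along those lines, not part of the answer above; the paths are placeholders:

#Recursive, streaming variant: walk the tree and write each row immediately,
#so only one file's metadata is held in memory at a time.
import csv
import os
import time

scan_root = "/Users/username/FolderName1/FolderName2/data"    #placeholder path to scan
out_file = "/Users/username/FolderName1/output_recursive.csv"  #placeholder output path

with open(out_file, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["file name", "last access time", "last change time", "last modification time"])
    for root, dirs, files in os.walk(scan_root):
        for name in files:
            full_path = os.path.join(root, name)
            try:
                a = os.stat(full_path)
            except OSError:
                continue  #skip files that disappear or can't be read
            writer.writerow([full_path, time.ctime(a.st_atime), time.ctime(a.st_ctime), time.ctime(a.st_mtime)])

Sorting by last modified date would still have to happen afterwards (for example by reading the finished CSV back in), since sorting inherently needs the whole data set.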

CodePudding user response:

For what it's worth, I successfully ran this over all 1.8 million files on my hard drive in about 8 minutes.

# First pass (~5 min): stream each file's LastWriteTime and FullName straight to CSV
Get-ChildItem -Force -Recurse -File -ErrorAction SilentlyContinue |
    Select-Object @{n = 'LastWriteTime'; e = { $_.LastWriteTime.ToString('yyyy MM dd HH mm ss') }}, FullName |
    Export-Csv sort.csv

# Second pass (~3 min): sort the finished CSV by the string-formatted date
Import-Csv sort.csv | Sort-Object LastWriteTime -Descending | Export-Csv sort2.csv