Dynamic File Name based on pandas DataFrame Key Column

Time:10-25

I have the following template DataFrame:

import pandas as pd

df = pd.DataFrame({
    'File_name_Column': ['File1', 'File2', 'File3', 'File1', 'File2', 'File3'],
    'Column3': ['xxr', 'xxv', 'xxw', 'xxt', 'xxe', 'xxz'],
    'Column4': ['wer', 'cad', 'sder', 'dse', 'sdf', 'csd'],
    'Column5': ['xxr', 'xxv', 'xxw', 'xxt', 'xxe', 'xxz'],
    'Column6': ['xxr', 'xxv', 'xxw', 'xxt', 'xxe', 'xxz'],
})

I want to write several .txt files, one per distinct value in the column "File_name_Column", each named after that value.

I want to use something like this, but it's not working:


df.to_csv(f'{df_File_name_Column}.txt', sep='|', index=False, header=False)

Desired Output:
File1.txt
'xxr'|'wer'|'xxr'|'xxr'
'xxt'|'dse'|'xxt'|'xxt'

File2.txt
'xxv'|'cad'|'xxv'|'xxv'
'xxe'|'sdf'|'xxe'|'xxe'

File3.txt
'xxw'|'sder'|'xxw'|'xxw'
'xxz'|'csd'|'xxz'|'xxz'

Note¹: The DataFrame has millions of rows.

Note²: I cannot use the open() function, because I'm migrating this pipeline to a platform that doesn't support it.

CodePudding user response:

One approach is to combine groupby with to_csv, writing one file per group:

for key, group in df.groupby("File_name_Column"):
    # Use columns= here: the positional axis argument to drop() was removed in pandas 2.0.
    group.drop(columns="File_name_Column").to_csv(f"{key}.txt", sep='|', index=False, header=False)
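
For reference, here is a self-contained sketch of the same groupby approach that can be run end to end. Several things in it are assumptions rather than part of the question: the helper name write_groups is hypothetical, the files go to a temporary directory instead of the working directory, and it assumes the single quotes in the desired output are meant literally. If they are not, drop the quoting and quotechar arguments.

```python
import csv
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({
    "File_name_Column": ["File1", "File2", "File3", "File1", "File2", "File3"],
    "Column3": ["xxr", "xxv", "xxw", "xxt", "xxe", "xxz"],
    "Column4": ["wer", "cad", "sder", "dse", "sdf", "csd"],
    "Column5": ["xxr", "xxv", "xxw", "xxt", "xxe", "xxz"],
    "Column6": ["xxr", "xxv", "xxw", "xxt", "xxe", "xxz"],
})


def write_groups(frame, key_column, out_dir):
    """Write one pipe-delimited .txt file per distinct value of key_column.

    Hypothetical helper for illustration; returns the paths it wrote.
    """
    out_dir = Path(out_dir)
    written = []
    # sort=False keeps groups in order of first appearance and skips a sort.
    for key, group in frame.groupby(key_column, sort=False):
        path = out_dir / f"{key}.txt"
        group.drop(columns=key_column).to_csv(
            path,
            sep="|",
            index=False,
            header=False,
            quoting=csv.QUOTE_ALL,  # assumption: quote every field ...
            quotechar="'",          # ... with single quotes, as in the desired output
        )
        written.append(path)
    return written


# Assumption: a temporary directory stands in for the real output location.
out_dir = tempfile.mkdtemp()
paths = write_groups(df, "File_name_Column", out_dir)
```

No explicit open() call appears here; to_csv handles the file itself when given a path. Whether that satisfies the restriction on the target platform depends on how that platform exposes storage, so it is worth checking there.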