The following code reads SampleData.txt and produces Result1.txt. I want to create another file, Result2.txt, from the same data that contains only one column. I am new to pandas and can't figure out what needs to be modified to create Result2.txt.
import pandas as pd
from tabulate import tabulate
dl = []
with open('SampleData.txt', encoding='utf8', errors='ignore') as f:
    for line in f:
        parts = line.split()
        # Insert a placeholder amount when the row has no ($...) field
        if not parts[3][:2].startswith('($'):
            parts.insert(3,'0')
        # Re-join a CODE value that was split on whitespace
        if len(parts) > 5:
            temp = ' '.join(parts[4:])
            parts = parts[:4] + [temp]
        parts[1] = int(parts[1])
        parts[2] = float(parts[2].replace(',', ''))
        parts[3] = float(parts[3].strip('($)').replace(',', ''))
        dl.append(parts)
headers = ['ID', 'TRANS', 'VALUE', 'AMOUNT', 'CODE']
df = pd.DataFrame(dl,columns=headers)
pd.set_option('colheader_justify', 'center')
df = df.groupby(['ID','CODE']).sum().reset_index().round(2)
df = df.sort_values('TRANS',ascending=False)
df['AMOUNT'] = '($' + df['AMOUNT'].astype(str) + ')'
df = df[headers]
print (df.head(n=40).to_string(index=False))
print()
df.to_csv("Out1.txt", sep="\t", index=None, header=None)
SampleData.txt
0xdata1 1 2,200,000 test1(test1)
0xdata2 1 9,500,000,000 ($70.30) test2(test2)
0xdata3 1 4.6 ($14.08) test3(test3)
0xdata4 1 0.24632941 test4(test4)
0xdata5 1 880,000,000 ($1.94) test5(test5)
Result1.txt #-- Fine and working
0xdata1 1 2,200,000 test1(test1)
0xdata2 1 9,500,000,000 ($70.30) test2(test2)
0xdata3 1 4.6 ($14.08) test3(test3)
0xdata4 1 0.24632941 test4(test4)
0xdata5 1 880,000,000 ($1.94) test5(test5)
Result2.txt #-- Additional output needed and what I am trying to produce
0xdata1
0xdata2
0xdata3
0xdata4
0xdata5
CodePudding user response:
You can select just the column you want to save. In your case:
df['ID'].to_csv("Out_ID.txt", sep="\t", index=None, header=None)
This should solve your problem!
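For illustration, here is a minimal, self-contained sketch of the same idea. The small DataFrame is a hypothetical stand-in for the grouped `df` from the question, and the temporary directory is an assumption made only so that the sketch does not overwrite real files. It uses `index=False, header=False`, the boolean form of the arguments shown above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the grouped DataFrame built in the question.
df = pd.DataFrame(
    {
        "ID": ["0xdata1", "0xdata2", "0xdata3"],
        "TRANS": [1, 1, 1],
        "VALUE": [2200000.0, 9500000000.0, 4.6],
        "AMOUNT": ["($0.0)", "($70.3)", "($14.08)"],
        "CODE": ["test1(test1)", "test2(test2)", "test3(test3)"],
    }
)

# Assumption: write into a temporary directory instead of the working directory.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "Result2.txt")

# df["ID"] selects a single column (a Series); with the index and header
# suppressed, to_csv writes one value per line.
df["ID"].to_csv(out_path, index=False, header=False)

# Read the file back to check what was written.
with open(out_path, encoding="utf8") as f:
    written_ids = f.read().splitlines()

print(written_ids)
```

Because the question groups by both ID and CODE, the same ID could appear more than once in `df`. If that matters for Result2.txt, calling `df["ID"].drop_duplicates()` before `to_csv` is one way to keep each ID only once.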