I'm trying to convert a text file to a CSV to ease my workflow for other applications. The issue I'm having is that the provided text file has an excess amount of space in the file and reads into one column when exported to a CSV using pandas. I've tried specifying column names as an argument but it doesn't work.
The text file reads like this,
85350 5211 APPLE LN O
85805 201 ORANGE ST I
84412 1313 BANANA RD I
It is exported into a CSV like this,
85350 5211 APPLE LN O,
85805 201 ORANGE ST I,
84412 1313 BANANA RD I
I want the exported CSV to have columns and look something similar to this, with the columns being Number Address In_Out,
Number,Address,In_Out
85350,5211 APPLE LN,O
85805,201 ORANGE ST,I
84412,1313 BANANA RD,I
CodePudding user response:
pandas
has a method to read a fixed-width text file. There are additional parameters to indicate the width of columns if its default to infer columns isn't correct, but in this case it works:
import pandas as pd
df = pd.read_fwf('input.txt', header=None)
df.to_csv('output.csv', index=False, header=['Number','Address','In_Out'])
output.csv:
Number,Address,In_Out
85350,5211 APPLE LN,O
85805,201 ORANGE ST,I
84412,1313 BANANA RD,I
CodePudding user response:
The problem is that your file has spaces but the address also has spaces. But the first and last column don't so you can probably can do something like this:
import sys
for line in sys.stdin:
line = line.strip()
left, right = line.split( ' ', 1 )
mid, right = right.rsplit( ' ', 1 )
print( ",".join( [left,mid,right] ) )
which will give you this:
$ python test.py < data.file
85350, 5211 APPLE LN ,O
85805, 201 ORANGE ST ,I
84412, 1313 BANANA RD ,I
However you can also try pandas read_fwf as your file looks like fixed width.
>>> a = pandas.read_fwf( 'data.file', widths=[8,28,1], names=('Zip','Address','Status') )
>>> a
Zip Address Status
0 85350 5211 APPLE LN O
1 85805 201 ORANGE ST I
2 84412 1313 BANANA RD I
CodePudding user response:
rows=[]
#Open csv module create the file to write it to etc etc etc
with open('file.txt') as f:
row=[]
for line in f.readlines():#gets each and every line from the file
words=line.split()#splitting each word at space
row=[words[0],f"{words[1]} {words[2]} {words[3]}",words[4]]
rows.append(row)#appending to the rows list
csv.writerows(rows)