I have a file whose data is not easy to restructure into a form that can be loaded into a DataFrame.
SFE, 8924, 3,CONV,1,R5.0
1.267065000E-04 1.267065000E-04 1.267065000E-04 1.267065000E-04
SFE, 8924, 3,CONV,2,R5.0
761.000000 761.000000 761.000000 761.000000
SFE, 8925, 3,CONV,1,R5.0
1.289895000E-04 1.289895000E-04 1.289895000E-04 1.289895000E-04
SFE, 8925, 3,CONV,2,R5.0
761.000000 761.000000 761.000000 761.000000
The separators are a mix of single spaces, multiple spaces, commas and tabs. How can I merge the 1st and 2nd lines together (then the 3rd and 4th, and so on)?
Desired outcome:
SFE,8924,3,CONV,1,R5.0,1.267065000E-04,1.267065000E-04,1.267065000E-04,1.267065000E-04
SFE,8924,3,CONV,2,R5.0,761.000000,761.000000,761.000000,761.000000
SFE,8925,3,CONV,1,R5.0,1.289895000E-04,1.289895000E-04,1.289895000E-04,1.289895000E-04
SFE,8925,3,CONV,2,R5.0,761.000000,761.000000,761.000000,761.000000
and then pandas should have no problem building a df.
For now I have this code (the file has some text at the beginning, so I start reading at line 45):
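import pandas as pd

data = []
# 'r' (not 'r ') is the valid read mode; the with block closes the file itself
with open('7HA03_thermal_final_filled.txt', 'r') as f:
    # Skip the text at the beginning of the file
    lines = f.readlines()[45:]
for line in lines:
    data.append(line)
# Each raw line ends up as one string in a single column
df = pd.DataFrame(data)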
I tried working with odd and even lines, but I still end up with a single column of strings. I can share more unsuccessful code, but I believe there is an easier way to join the lines and clean them of the different separators.
CodePudding user response:
This should work:
import pandas as pd

data = []
with open("7HA03_thermal_final_filled.txt") as f:
    # Append [45:] here to skip the leading text described in the question
    content = f.readlines()
# Walk the lines in pairs: a header line followed by its values line
for i in range(1, len(content) + 1):
    if i % 2 == 0:
        # Header line: remove all whitespace, keeping the commas
        first_line = "".join(content[i - 2].split())
        # Values line: split on any whitespace run and rejoin with commas
        second_line = ",".join(content[i - 1].split())
        data.append(f"{first_line},{second_line}")
# One row per merged pair, one column per comma-separated field
df = pd.DataFrame([row.split(",") for row in data])
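As an alternative, here is a minimal sketch of the same pairing idea that slices the lines into even and odd positions and splits on all separators at once with a regular expression. The function name merge_line_pairs and the sample list are hypothetical, made up for illustration. It assumes every record is exactly one header line followed by one values line, and that no field is intentionally empty (consecutive separators are treated as one).

```python
import re

def merge_line_pairs(lines):
    """Merge each header line with the values line that follows it.

    Returns one list of fields per merged record.
    """
    # Drop blank lines so they cannot throw the pairing off
    cleaned = [line.strip() for line in lines if line.strip()]
    rows = []
    # Even-indexed lines are headers, odd-indexed lines are values
    for header, values in zip(cleaned[0::2], cleaned[1::2]):
        # Split on any run of commas, spaces, or tabs
        fields = re.split(r"[,\s]+", f"{header},{values}")
        rows.append(fields)
    return rows

# Hypothetical sample mirroring the data in the question
sample = [
    "SFE, 8924, 3,CONV,1,R5.0\n",
    "1.267065000E-04 1.267065000E-04 1.267065000E-04 1.267065000E-04\n",
    "SFE, 8924, 3,CONV,2,R5.0\n",
    "761.000000\t761.000000  761.000000 761.000000\n",
]
rows = merge_line_pairs(sample)
print(",".join(rows[0]))
# SFE,8924,3,CONV,1,R5.0,1.267065000E-04,1.267065000E-04,1.267065000E-04,1.267065000E-04
```

With the real file, the list returned by f.readlines()[45:] could be passed in place of sample, and the resulting rows handed to pd.DataFrame(rows). All values would still be strings at that point, so the numeric columns would need converting afterwards, for example with pd.to_numeric.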