Merge two lines into one and create Pandas DataFrame

Time:03-08

I have a data file whose structure makes it hard to load directly into a DataFrame.

SFE,     8924,   3,CONV,1,R5.0
 1.267065000E-04 1.267065000E-04 1.267065000E-04 1.267065000E-04
SFE,     8924,   3,CONV,2,R5.0
  761.000000      761.000000      761.000000      761.000000    
SFE,     8925,   3,CONV,1,R5.0
 1.289895000E-04 1.289895000E-04 1.289895000E-04 1.289895000E-04
SFE,     8925,   3,CONV,2,R5.0
  761.000000      761.000000      761.000000      761.000000

There are spaces, multiple spaces, commas and tabs. How can I merge the 1st and 2nd lines together (then the 3rd and 4th, and so on)?

Desired outcome:

SFE,8924,3,CONV,1,R5.0,1.267065000E-04,1.267065000E-04,1.267065000E-04,1.267065000E-04
SFE,8924,3,CONV,2,R5.0,761.000000,761.000000,761.000000,761.000000    
SFE,8925,3,CONV,1,R5.0,1.289895000E-04,1.289895000E-04,1.289895000E-04,1.289895000E-04
SFE,8925,3,CONV,2,R5.0,761.000000,761.000000,761.000000,761.000000

and then pandas should have no problem creating the df.

For now I have this code (the file has some text at the beginning, so I start reading at line 45):

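import pandas as pd

data = []
# The with statement closes the file automatically, so no explicit close() is needed.
with open('7HA03_thermal_final_filled.txt', 'r') as f:
    lines = f.readlines()[45:]  # skip the text at the beginning of the file
    for line in lines:
        data.append(line)
df = pd.DataFrame(data)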

I tried playing with odd and even lines, but I still end up with a single column of strings. I can share more unsuccessful code, but I believe there is an easier way to join the lines and clear them of the different separators.

CodePudding user response:

This should work:

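import pandas as pd

data = []
with open("7HA03_thermal_final_filled.txt") as f:
    content = f.readlines()  # use f.readlines()[45:] to skip the leading text, as in the question
    # At every even i, content[i-2] and content[i-1] form one pair of lines.
    for i in range(1, len(content) + 1):
        if i % 2 == 0:
            # Header line: drop all whitespace, keeping its commas.
            first_line = content[i-2].strip()
            first_line = "".join(first_line.split())
            # Value line: collapse whitespace runs, then join the values with commas.
            second_line = content[i-1].strip()
            second_line = " ".join(second_line.split())
            second_line_modified = ",".join(x for x in second_line.strip().split(' '))
            data.append(f'{first_line},{second_line_modified}')
df = pd.DataFrame([string.split(",") for string in data])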
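A possible variant of the same idea, as a minimal sketch: pair the lines with slicing and zip, and split on commas and whitespace in one pass with a regular expression. The helper name merge_line_pairs and the in-memory sample are only for illustration. The sketch assumes that every header line is followed by exactly one value line and that the last four fields are numeric, as in the sample from the question. For the real file you would pass f.readlines()[45:] instead of the sample.

```python
import io
import re

import pandas as pd


def merge_line_pairs(lines):
    """Join each header line with the value line that follows it (hypothetical helper)."""
    # Drop blank lines so stray newlines do not break the pairing.
    cleaned = [line.strip() for line in lines if line.strip()]
    rows = []
    # cleaned[0::2] are the "SFE,..." lines, cleaned[1::2] the numeric lines.
    for header, values in zip(cleaned[0::2], cleaned[1::2]):
        # Split on any run of commas, spaces or tabs.
        fields = [f for f in re.split(r"[,\s]+", f"{header} {values}") if f]
        rows.append(fields)
    return rows


# Sample taken from the question, used here in place of the real file.
sample = io.StringIO(
    "SFE,     8924,   3,CONV,1,R5.0\n"
    " 1.267065000E-04 1.267065000E-04 1.267065000E-04 1.267065000E-04\n"
    "SFE,     8924,   3,CONV,2,R5.0\n"
    "  761.000000      761.000000      761.000000      761.000000    \n"
)
df = pd.DataFrame(merge_line_pairs(sample.readlines()))

# Assumption: the last four columns hold the numeric values; convert them from strings.
for col in df.columns[-4:]:
    df[col] = pd.to_numeric(df[col])
print(df)
```

The remaining columns stay as strings here; convert them the same way if you need the IDs as integers.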