I'm working on converting a txt file into a CSV data table.
I have managed to convert the data into a CSV file and load it into a table, but I have a problem extracting the numbers. Each cell in the table contains the label text as well as the value (for example, intensity = 12345).
How do I only put the numerical values into the table?
I tried using regular expressions, but I couldn't get them to work. I would also like to delete all the lines that contain saturated, fragmented and merged. I initially wrote code that deletes every odd-numbered line, but this code will be used for several files, and the odd-numbered lines in other files might contain different data. How would I go about doing that?
This is the code that I currently have, plus a picture of what the output looks like.
import pandas as pd
parameters = pd.read_csv("ScanHeader1.txt", header=None)
parameters.columns = ['Packet Number', 'Intensity','Mass/Position']
parameters.to_csv('ScanHeader1.csv', index=None)
df = pd.read_csv('ScanHeader1.csv')
print(df)
I would really appreciate some tips or pointers on how I can do this. Thanks :)
CodePudding user response:
You can try this:
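def fun_eq(x):
    # Return the text after ' = ', e.g. 'intensity = 12345' -> '12345'
    x = x.split(' = ')
    return x[1]

def fun_hash(x):
    # Return the text after ' # '
    x = x.split(' # ')
    return x[1]

# Keep every second row; .copy() avoids pandas' SettingWithCopyWarning
df = df.iloc[::2].copy()
df['Intensity'] = df['Intensity'].apply(fun_eq)
df['Mass/Position'] = df['Mass/Position'].apply(fun_eq)
df['Packet Number'] = df['Packet Number'].apply(fun_hash)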
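
Note that df.iloc[::2] still depends on the unwanted rows sitting on every second line, which is what the question wanted to avoid. A possible alternative is to drop rows by keyword and pull the numbers out with a regular expression. The following is only a sketch: the sample text is made up, and it assumes every line has three comma-separated "label = value" style fields. The real file may look different.

```python
import io
import pandas as pd

# Hypothetical sample standing in for ScanHeader1.txt (layout is an assumption)
sample = io.StringIO(
    "Packet # 0, intensity = 1234.5, mass/position = 50.25\n"
    "saturated = 0, fragmented = 0, merged = 0\n"
    "Packet # 1, intensity = 987.0, mass/position = 51.5\n"
    "saturated = 0, fragmented = 0, merged = 0\n"
)

columns = ["Packet Number", "Intensity", "Mass/Position"]
df = pd.read_csv(sample, header=None, names=columns)

# Flag a row if any of its cells mentions one of the unwanted keywords,
# no matter which line of the file the row came from.
pattern = "saturated|fragmented|merged"
unwanted = df.apply(
    lambda col: col.str.contains(pattern, case=False, na=False)
).any(axis=1)
df = df[~unwanted].copy()

# Keep only the first plain decimal number found in each cell
# (scientific notation such as 1e5 is not handled by this pattern).
number = r"(-?\d+(?:\.\d+)?)"
for col in columns:
    df[col] = pd.to_numeric(df[col].str.extract(number, expand=False))

print(df)
```

To use it on the real data, replace sample with "ScanHeader1.txt" and finish with df.to_csv('ScanHeader1.csv', index=False).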