I have a file with many lines. Each line has the same structure.
The first three lines are:
0 MSG_201901010100.nc [98.22227, 0.00014308207] [3948.8948, 0.0057524233]
1 MSG_201901010200.nc [197.27554, 0.00028737469] [9986.71, 0.014547813]
2 MSG_201901010300.nc [218.46107, 0.00031823604] [12044.043, 0.017544765]
How can I read the file and assign its content to lists, arrays or a DataFrame?
As lists I would like to have:
a = [0,1,2]
b = [R10CH20P_MSG_201901010100.nc, R10CH20P_MSG_201901010200.nc, R10CH20P_MSG_201901010300.nc]
c1 = [98.22227, 197.27554, 218.46107]
c2 = [0.00014308207, 0.00028737469, 0.00031823604]
d1 = [3948.8948, 9986.71, 12044.043]
d2 = [0.0057524233, 0.014547813, 0.017544765]
I tried to read the file with Pandas:
import pandas as pd
df = pd.read_table(filename, sep=r'\s+', names=['a', 'b', 'c1', 'c2', 'd1', 'd2'])
But this produces wrong assignments:
print(df)
a b c1 c2 d1 d2
0 0 MSG_201901010100.nc [98.22227, 0.00014308207] [3948.8948, 0.0057524233]
1 1 MSG_201901010200.nc [197.27554, 0.00028737469] [9986.71, 0.014547813]
2 2 MSG_201901010300.nc [218.46107, 0.00031823604] [12044.043, 0.017544765]
For example, print(df['c1']) gives:
0 [98.22227,
1 [197.27554,
2 [218.46107,
Name: c1, dtype: object
and
print(df['c1'].values)
shows:
['[98.22227,' '[197.27554,' '[218.46107,']
Answer:
This should work for your use case:
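import pandas as pd

with open('data.txt', 'r') as f:
    lines = f.readlines()

data = []
for s in lines:
    # Strip brackets and commas so only whitespace-separated values remain
    s = s.replace('[', '')
    s = s.replace(']', '')
    s = s.replace(',', '')
    data.append(s.split())

# Note: every value in the DataFrame is still a string at this point
df = pd.DataFrame(data=data, columns=['a', 'b', 'c1', 'c2', 'd1', 'd2'])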
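To see why stripping the punctuation matters, here is a small sketch that tokenizes the first sample line from the question both ways. The variable names are only illustrative.

```python
line = "0 MSG_201901010100.nc [98.22227, 0.00014308207] [3948.8948, 0.0057524233]"

# Splitting on whitespace alone leaves brackets and commas glued to the numbers,
# which is what the read_table attempt in the question ran into.
raw_tokens = line.split()

# Removing the three punctuation characters first gives six clean tokens.
cleaned = line.replace('[', '').replace(']', '').replace(',', '')
clean_tokens = cleaned.split()

print(raw_tokens)
print(clean_tokens)
```

Both versions give six tokens per line, so the column count already matched in the question; only the token contents were wrong.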
And finally, pull the columns out as lists:
a = df['a'].tolist()
b = df['b'].tolist()
c1 = df['c1'].tolist()  # likewise for c2, d1 and d2
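Since the values come out of split() as strings, you will probably want to cast them before using the lists. Below is a minimal sketch of the same idea with the type conversion added. It assumes the three sample lines from the question (read from an in-memory buffer here instead of data.txt so it runs on its own), and it uses str.translate in place of the three replace calls, which is just a stylistic swap.

```python
import io
import pandas as pd

# Sample lines copied from the question; a real script would open the file instead.
sample = io.StringIO(
    "0 MSG_201901010100.nc [98.22227, 0.00014308207] [3948.8948, 0.0057524233]\n"
    "1 MSG_201901010200.nc [197.27554, 0.00028737469] [9986.71, 0.014547813]\n"
    "2 MSG_201901010300.nc [218.46107, 0.00031823604] [12044.043, 0.017544765]\n"
)

# Delete brackets and commas, then split each non-empty line on whitespace.
strip_chars = str.maketrans("", "", "[],")
rows = [line.translate(strip_chars).split() for line in sample if line.strip()]

df = pd.DataFrame(rows, columns=["a", "b", "c1", "c2", "d1", "d2"])

# Every field is still a string here, so cast the numeric columns explicitly.
df = df.astype({"a": int, "c1": float, "c2": float, "d1": float, "d2": float})

a = df["a"].tolist()
b = df["b"].tolist()
c1 = df["c1"].tolist()
c2 = df["c2"].tolist()
d1 = df["d1"].tolist()
d2 = df["d2"].tolist()
```

If you would rather have NumPy arrays than lists, df["c1"].to_numpy() returns the same column as an array.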