I have the following DataFrame in pandas:
import pandas as pd
a = ['[16.01319488 6.1095932 -0.14837995]',
'[16.10400501 6.23724404 -0.1727245 ]',
'[16.195107 6.36434895 -0.19695716]',
'[16.2864465 6.49178233 -0.22124142]',
'[16.37796913 6.62041857 -0.24574078]',
'[16.46962054 6.75113206 -0.27061875]',
'[16.56134636 6.88479719 -0.29603881]',
'[16.65309334 7.02229002 -0.32216479]',
'[16.74480491 7.16448166 -0.34915957]',
'[16.83642781 7.31224812 -0.37718693]',
'[16.92790769 7.46646379 -0.4064104 ]',
'[17.0190533 7.62784345 -0.4369622 ]',
'[17.10912594 7.79646343 -0.46884957]',
'[17.19725 7.97224045 -0.50204846]']
b = [0.0,
0.01999999989745438,
0.03999999979490875,
0.05999999969236312,
0.0799999995898175,
0.09999999948727188,
0.1199999993847262,
0.1399999992821806,
0.159999999179635,
0.1799999990770894,
0.1999999989745438,
0.2199999988719981,
0.2399999987694525,
0.2599999986669069]
dDictionary = {
'A':a,
'B': b
}
test = pd.DataFrame(dDictionary)
Each value in column 'A' is a string containing three numbers that I want to split into three separate columns. Is there a simple and robust way to do this?
CodePudding user response:
Use Series.str.strip with Series.str.split, then cast to floats:
test[['c','d','e']] = test.A.str.strip('[]').str.split(expand=True).astype(float)
print (test)
A B c d e
0 [16.01319488 6.1095932 -0.14837995] 0.00 16.013195 6.109593 -0.148380
1 [16.10400501 6.23724404 -0.1727245 ] 0.02 16.104005 6.237244 -0.172725
2 [16.195107 6.36434895 -0.19695716] 0.04 16.195107 6.364349 -0.196957
3 [16.2864465 6.49178233 -0.22124142] 0.06 16.286447 6.491782 -0.221241
4 [16.37796913 6.62041857 -0.24574078] 0.08 16.377969 6.620419 -0.245741
5 [16.46962054 6.75113206 -0.27061875] 0.10 16.469621 6.751132 -0.270619
6 [16.56134636 6.88479719 -0.29603881] 0.12 16.561346 6.884797 -0.296039
7 [16.65309334 7.02229002 -0.32216479] 0.14 16.653093 7.022290 -0.322165
8 [16.74480491 7.16448166 -0.34915957] 0.16 16.744805 7.164482 -0.349160
9 [16.83642781 7.31224812 -0.37718693] 0.18 16.836428 7.312248 -0.377187
10 [16.92790769 7.46646379 -0.4064104 ] 0.20 16.927908 7.466464 -0.406410
11 [17.0190533 7.62784345 -0.4369622 ] 0.22 17.019053 7.627843 -0.436962
12 [17.10912594 7.79646343 -0.46884957] 0.24 17.109126 7.796463 -0.468850
13 [17.19725 7.97224045 -0.50204846] 0.26 17.197250 7.972240 -0.502048
If you also need to remove column A, use DataFrame.pop:
test[['c','d','e']] = test.pop('A').str.strip('[]').str.split(expand=True).astype(float)
print (test)
B c d e
0 0.00 16.013195 6.109593 -0.148380
1 0.02 16.104005 6.237244 -0.172725
2 0.04 16.195107 6.364349 -0.196957
3 0.06 16.286447 6.491782 -0.221241
4 0.08 16.377969 6.620419 -0.245741
5 0.10 16.469621 6.751132 -0.270619
6 0.12 16.561346 6.884797 -0.296039
7 0.14 16.653093 7.022290 -0.322165
8 0.16 16.744805 7.164482 -0.349160
9 0.18 16.836428 7.312248 -0.377187
10 0.20 16.927908 7.466464 -0.406410
11 0.22 17.019053 7.627843 -0.436962
12 0.24 17.109126 7.796463 -0.468850
13 0.26 17.197250 7.972240 -0.502048
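Since the question also asks for robustness, here is a minimal sketch of a more defensive variant. It assumes that some rows might contain unparseable tokens; the sample data, the deliberately broken last row, and the column names x, y, z are made up for illustration. It uses pd.to_numeric with errors='coerce', so bad values become NaN instead of raising as astype(float) would:

```python
import pandas as pd

# Hypothetical sample data; the last row is deliberately malformed.
df = pd.DataFrame({
    'A': ['[16.01319488 6.1095932 -0.14837995]',
          '[16.10400501 6.23724404 -0.1727245 ]',
          '[17.19725 oops -0.50204846]'],
    'B': [0.0, 0.02, 0.04],
})

# Strip the brackets, then split on whitespace into separate columns.
parts = df['A'].str.strip('[]').str.split(expand=True)

# Convert column by column; unparseable tokens become NaN instead of raising.
parts = parts.apply(pd.to_numeric, errors='coerce')
parts.columns = ['x', 'y', 'z']  # illustrative names, not from the question

result = df.join(parts)
print(result)
```

Rows with NaN can then be inspected with result[result[['x', 'y', 'z']].isna().any(axis=1)] before deciding whether to drop or repair them.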
CodePudding user response:
Here is another approach that expands column 'A' dynamically using pandas.concat:
expanded_test = pd.concat(
    [test['A'].str.strip('[]').str.split(expand=True)],
    axis=1, keys=['A']  # one key per concatenated object
)
# Flatten the (key, position) MultiIndex into names like A_1, A_2, A_3
expanded_test.columns = expanded_test.columns.map(lambda x: '_'.join((x[0], str(x[1] + 1))))
out = test.join(expanded_test)
>>> print(out)
A B A_1 A_2 A_3
0 [16.01319488 6.1095932 -0.14837995] 0.00 16.01319488 6.1095932 -0.14837995
1 [16.10400501 6.23724404 -0.1727245 ] 0.02 16.10400501 6.23724404 -0.1727245
2 [16.195107 6.36434895 -0.19695716] 0.04 16.195107 6.36434895 -0.19695716
3 [16.2864465 6.49178233 -0.22124142] 0.06 16.2864465 6.49178233 -0.22124142
4 [16.37796913 6.62041857 -0.24574078] 0.08 16.37796913 6.62041857 -0.24574078
5 [16.46962054 6.75113206 -0.27061875] 0.10 16.46962054 6.75113206 -0.27061875
6 [16.56134636 6.88479719 -0.29603881] 0.12 16.56134636 6.88479719 -0.29603881
7 [16.65309334 7.02229002 -0.32216479] 0.14 16.65309334 7.02229002 -0.32216479
8 [16.74480491 7.16448166 -0.34915957] 0.16 16.74480491 7.16448166 -0.34915957
9 [16.83642781 7.31224812 -0.37718693] 0.18 16.83642781 7.31224812 -0.37718693
10 [16.92790769 7.46646379 -0.4064104 ] 0.20 16.92790769 7.46646379 -0.4064104
11 [17.0190533 7.62784345 -0.4369622 ] 0.22 17.0190533 7.62784345 -0.4369622
12 [17.10912594 7.79646343 -0.46884957] 0.24 17.10912594 7.79646343 -0.46884957
13 [17.19725 7.97224045 -0.50204846] 0.26 17.19725 7.97224045 -0.50204846