I have a Pandas dataframe that looks like this:
import pandas as pd
data = {
'a' : [['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69]],
'b' : [['YLK', 44.48, -79.55], ['HG76', 44.60, -65.76], ['DEF', 49.52, -113.99], ['YXZ', 47.96, -84.78]],
'c' : [1628.931942, 1949.748061, 2556.622213, 301.193418]
}
df = pd.DataFrame(data)
df
a b c
0 [Foo, 49.51, -120.69] [YLK, 44.48, -79.55] 1628.931942
1 [Foo, 49.51, -120.69] [HG76, 44.6, -65.76] 1949.748061
2 [Foo, 49.51, -120.69] [DEF, 49.52, -113.99] 2556.622213
3 [Foo, 49.51, -120.69] [YXZ, 47.96, -84.78] 301.193418
I would like to split out columns a and b such that their elements become their own columns, like this:
a b c d e f g
0 Foo 49.51 -120.69 YLK 44.48 -79.55 1628.931942
1 Foo 49.51 -120.69 HG76 44.6 -65.76 1949.748061
2 Foo 49.51 -120.69 DEF 49.52 -113.99 2556.622213
3 Foo 49.51 -120.69 YXZ 47.96 -84.78 301.193418
How would I do this?
Thanks!
CodePudding user response:
You can build a new DataFrame from the lists and join it back, shown here for column a:
df = df.join(pd.DataFrame(df.pop('a').tolist(),index = df.index))
Out[215]:
b c 0 1 2
0 [YLK, 44.48, -79.55] 1628.931942 Foo 49.51 -120.69
1 [HG76, 44.6, -65.76] 1949.748061 Foo 49.51 -120.69
2 [DEF, 49.52, -113.99] 2556.622213 Foo 49.51 -120.69
3 [YXZ, 47.96, -84.78] 301.193418 Foo 49.51 -120.69
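The answer above only expands column a and leaves integer column labels. A minimal sketch of the same idea applied to both list columns might look like the following. It assumes every list has exactly three elements, and the names a_parts, b_parts and result are made up for illustration; the final labels a through g are simply taken from the desired output in the question.

```python
import pandas as pd

data = {
    'a': [['Foo', 49.51, -120.69]] * 4,
    'b': [['YLK', 44.48, -79.55], ['HG76', 44.60, -65.76],
          ['DEF', 49.52, -113.99], ['YXZ', 47.96, -84.78]],
    'c': [1628.931942, 1949.748061, 2556.622213, 301.193418],
}
df = pd.DataFrame(data)

# Expand each list column into its own three-column frame,
# reusing the original index so the rows stay aligned.
a_parts = pd.DataFrame(df['a'].tolist(), index=df.index)
b_parts = pd.DataFrame(df['b'].tolist(), index=df.index)

# Put the pieces side by side, followed by the scalar column.
result = pd.concat([a_parts, b_parts, df[['c']]], axis=1)

# Relabel to match the layout asked for in the question.
result.columns = list('abcdefg')
print(result)
```

Because the pieces are built from the raw lists, the numeric columns should come out as floats rather than strings.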
CodePudding user response:
Here is one way to do it:
df[['a1','a2','a3']]=df['a'].apply(lambda x: ','.join(map(str, x)) ).str.split(',', expand=True)
df[['b1','b2','b3']]=df['b'].apply(lambda x: ','.join(map(str, x)) ).str.split(',', expand=True)
df.drop(columns=['a','b'], inplace=True)
df
c a1 a2 a3 b1 b2 b3
0 1628.931942 Foo 49.51 -120.69 YLK 44.48 -79.55
1 1949.748061 Foo 49.51 -120.69 HG76 44.6 -65.76
2 2556.622213 Foo 49.51 -120.69 DEF 49.52 -113.99
3 301.193418 Foo 49.51 -120.69 YXZ 47.96 -84.78
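One thing to watch with the join/split approach: the values pass through text, so the coordinates come back as strings, and a string element that itself contains a comma would be split in the wrong place. A small sketch of checking for this and converting back, assuming the second and third elements are always numeric (parts and was_text are names made up here):

```python
import pandas as pd

df = pd.DataFrame({'a': [['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69]]})

# Same join/split round trip as in the answer above.
parts = df['a'].apply(lambda x: ','.join(map(str, x))).str.split(',', expand=True)
parts.columns = ['a1', 'a2', 'a3']

# The coordinates are now text, not numbers.
was_text = isinstance(parts.loc[0, 'a2'], str)
print(was_text)

# Convert the numeric columns back explicitly.
for col in ['a2', 'a3']:
    parts[col] = pd.to_numeric(parts[col])

print(parts.dtypes)
```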
CodePudding user response:
A direct way is to use the .to_list() method of a pandas.Series and then assign the values to new columns, like this:
df[["a1", "a2", "a3"]] = df["a"].to_list()
df[["b1", "b2", "b3"]] = df["b"].to_list()
In one line:
df[["a1", "a2", "a3", "b1", "b2", "b3"]] = (df["a"] + df["b"]).to_list()
If you don't need a and b, you can drop them:
df.drop(columns=["a", "b"], inplace=True)
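For reference, the one-line form relies on the fact that adding two object columns of Python lists concatenates the lists row by row, so each row becomes a single six-element list. A self-contained sketch of the whole sequence, assuming all lists in a and b have three elements (new_cols and combined are names introduced here for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "a": [["Foo", 49.51, -120.69], ["Foo", 49.51, -120.69]],
    "b": [["YLK", 44.48, -79.55], ["HG76", 44.60, -65.76]],
    "c": [1628.931942, 1949.748061],
})

new_cols = ["a1", "a2", "a3", "b1", "b2", "b3"]

# list + list concatenates, applied element-wise across the two columns.
combined = (df["a"] + df["b"]).to_list()

# Each inner list has six items, one per new column.
df[new_cols] = combined

# Drop the original list columns without mutating in place.
df = df.drop(columns=["a", "b"])
print(df)
```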