If I have the following dataframe:
Index  Col1  Col2  Col3
1      10    x     40
2            y     50
3            z     60
4      20    a     30
I would like to merge each row that has a blank Col1 into the nearest preceding row whose Col1 is not blank.
Expected output:
Index  Col1  Col2   Col3
1      10    x,y,z  40,50,60
4      20    a      30
Is this possible?
Thanks
CodePudding user response:
Quite possible. What you need is a group key that increments at every value in Col1 that is not null.
In one go:
df.drop('Col1', axis=1).groupby(df['Col1'].notna().cumsum()).agg(list)
# P.S. if you really want strings, cast to str before grouping
# (a GroupBy object has no .astype):
# df.drop('Col1', axis=1).astype(str).groupby(
#     df['Col1'].notna().cumsum()).agg(','.join)
           Col2          Col3
Col1
1     [x, y, z]  [40, 50, 60]
2           [a]          [30]
The key here is the condition:
df[['Col1']].assign(con=df['Col1'].notna())
   Col1    con   # "con" for condition
0  10.0   True   <-- first group
1   NaN  False
2   NaN  False
3  20.0   True   <-- second group
Taking the cumulative sum of that condition gives you the grouper: every True starts a new group number, and the False rows that follow inherit it.
df[['Col1']].assign(con=df['Col1'].notna().cumsum())
   Col1  con
0  10.0    1
1   NaN    1
2   NaN    1
3  20.0    2
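Putting the pieces above together, here is a minimal end-to-end sketch. It assumes the blanks in Col1 are NaN (as the float output above suggests) and rebuilds the sample frame by hand. The names group_id and merged are only illustrative, and restoring Col1 with first() is one possible way to get the question's layout, not the only one.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blanks in Col1 are assumed to be NaN.
df = pd.DataFrame({
    "Col1": [10, np.nan, np.nan, 20],
    "Col2": ["x", "y", "z", "a"],
    "Col3": [40, 50, 60, 30],
})

# The group id goes up by one at every non-null Col1 value: 1, 1, 1, 2.
group_id = df["Col1"].notna().cumsum().rename("group")

merged = (
    df.drop(columns="Col1")
      .astype(str)          # str.join only accepts strings
      .groupby(group_id)
      .agg(",".join)
)

# Put back the Col1 value that opened each group (first() skips NaN).
merged.insert(0, "Col1", df.groupby(group_id)["Col1"].first())
merged = merged.reset_index(drop=True)
print(merged)
```

Because the group id is positional rather than value-based, two separate blocks that happen to share the same Col1 value stay separate.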
CodePudding user response:
We can forward-fill Col1 (after turning empty strings into NaN) and use it as the group key:
out = df.drop(labels='Col1', axis=1).astype(str).groupby(df['Col1'].mask(df['Col1'] == '').ffill()).agg(','.join).reset_index()
Out[85]:
   Col1   Col2      Col3
0  10.0  x,y,z  40,50,60
1  20.0      a        30
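For reference, a self-contained sketch of the forward-fill idea. It assumes a variant of the sample data where the blanks in Col1 are empty strings rather than NaN, which is the case the mask step is there for. The variable names key and out are illustrative.

```python
import pandas as pd

# Hypothetical variant of the sample data: blanks are empty strings.
df = pd.DataFrame({
    "Col1": ["10", "", "", "20"],
    "Col2": ["x", "y", "z", "a"],
    "Col3": [40, 50, 60, 30],
})

# Replace empty strings with NaN, then carry the last real value forward.
key = df["Col1"].mask(df["Col1"] == "").ffill()

out = (
    df.drop(columns="Col1")
      .astype(str)                 # str.join only accepts strings
      .groupby(key, sort=False)    # keep groups in order of appearance
      .agg(",".join)
      .reset_index()
)
print(out)
```

One thing to keep in mind: this groups by the filled value itself, so two non-adjacent blocks with the same Col1 value would end up merged into one row. If that matters, the cumulative-sum key from the other answer keeps them apart.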