Concatenate/merge rows for one column in Pandas DataFrame

Time:07-22

I have the following DataFrame in pandas:

C1    C2   C3   C4 
2021  4    1X   "First text "
2021  NaN  NaN  "continued"
2021  NaN  NaN  "still Continued"
2021  5    1Y   "second text"
2021  NaN  NaN  "continued"

I want to convert it to a DataFrame like this:

C1    C2   C3   C4 
2021  4    1X   "First text continued still continued"
2021  5    1Y   "second text continued" 

That is, I want to merge the C4 values of consecutive rows into a single row until a new value appears in the C2 and C3 columns. Is there an efficient way to do this? Thanks!

CodePudding user response:

Forward-fill the missing C2 and C3 values with ffill, then group by those columns and join the C4 strings:

out = df.ffill().groupby(['C1', 'C2', 'C3'], as_index=False)['C4'].agg(' '.join)
Out[49]: 
     C1   C2  C3                                        C4
0  2021  4.0  1X  "Firsttext" "continued" "stillContinued"
1  2021  5.0  1Y                  "secondtext" "continued"
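For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question, with the quotes and trailing spaces stripped from C4 (an assumption about the real data). The second part is a variant that is not in the answer above: it assumes a non-null C2 always marks the start of a new record, and groups on a running block id (the name `block` is just illustrative) so that two separate records that happen to share the same C2/C3 values are not merged together.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (C4 strings cleaned up by assumption).
df = pd.DataFrame({
    "C1": [2021, 2021, 2021, 2021, 2021],
    "C2": [4, np.nan, np.nan, 5, np.nan],
    "C3": ["1X", np.nan, np.nan, "1Y", np.nan],
    "C4": ["First text", "continued", "still continued",
           "second text", "continued"],
})

# Approach from the answer: forward-fill the keys, then group on them
# and join the text fragments with a space.
out = (
    df.ffill()
      .groupby(["C1", "C2", "C3"], as_index=False)["C4"]
      .agg(" ".join)
)

# Variant (assumption: a non-null C2 starts a new record).
# The cumulative sum gives every row of a record the same block id,
# so grouping no longer depends on the key values being unique.
block = df["C2"].notna().cumsum().rename("block")
out_blocks = (
    df.ffill()
      .groupby(block, sort=False)
      .agg({"C1": "first", "C2": "first", "C3": "first", "C4": " ".join})
      .reset_index(drop=True)
)

print(out)
print(out_blocks)
```

Note that C2 comes back as float (4.0, 5.0) because the column contained NaN. Also, `ffill()` fills every column, so if C4 itself can be missing you would likely want to forward-fill only the key columns instead.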