I have the following DataFrame in pandas:
C1 C2 C3 C4
2021 4 1X "First text "
2021 NaN NaN "continued"
2021 NaN NaN "still Continued"
2021 5 1Y "second text"
2021 NaN NaN "continued"
I want to convert it to a DataFrame like this:
C1 C2 C3 C4
2021 4 1X "First text continued still continued"
2021 5 1Y "second text continued"
That is, I want to merge the rows of the C4 column into a single row until a new value appears in the C2 and C3 columns. Is there an efficient way to do this? Thanks!
CodePudding user response:
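Use ffill before groupby. Forward-filling the NaN values in C2 and C3 gives every continuation row the keys of the row that started it, so you can then group on C1, C2 and C3 and join the C4 strings: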
out = df.ffill().groupby(['C1', 'C2', 'C3'], as_index=False)['C4'].agg(' '.join)
Out[49]:
C1 C2 C3 C4
0 2021 4.0 1X "Firsttext" "continued" "stillContinued"
1 2021 5.0 1Y "secondtext" "continued"
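For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same ffill-then-groupby idea. The frame is rebuilt by hand from the table in the question, so the exact dtypes and the unquoted C4 strings are assumptions. It also differs slightly from the one-liner above: it forward-fills only the key columns (so a genuine NaN in C4 would not be overwritten), and it casts C2 back to int, which is optional.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (an assumption about the real frame).
df = pd.DataFrame({
    "C1": [2021, 2021, 2021, 2021, 2021],
    "C2": [4, np.nan, np.nan, 5, np.nan],
    "C3": ["1X", np.nan, np.nan, "1Y", np.nan],
    "C4": ["First text", "continued", "still continued",
           "second text", "continued"],
})

keys = ["C1", "C2", "C3"]

# Forward-fill only the key columns so continuation rows inherit
# the keys of the row that started the record.
filled = df.copy()
filled[keys] = filled[keys].ffill()

# Join the text fragments of each record with a single space.
# sort=False keeps the groups in order of first appearance.
out = filled.groupby(keys, as_index=False, sort=False)["C4"].agg(" ".join)

# C2 became float because it held NaN; cast back once the gaps are filled.
out["C2"] = out["C2"].astype(int)

print(out)
```

Note that grouping on the key values merges any two records that share the same C1, C2 and C3, even if they are not adjacent. If that can happen in your data, grouping on a running record counter such as df["C2"].notna().cumsum() may be a safer choice.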