I have a dataframe like this:
time | text
01.01.2000 | None
None | abc
None | cde
None | def
01.02.2000 | None
None | abb
None | bbc
None | dde
01.03.2000 | None
None | 123
None | 278
None | 782
I now want to split this dataframe into multiple dataframes, each starting at a row where time is not None. Within each dataframe, the text values of the rows that follow should be joined into a single value, separated by a newline. That means the first dataframe should look like this:
df1
time | text
01.01.2000 | abc \n cde \n def
And the second dataframe like this:
df2
time | text
01.02.2000 | abb \n bbc \n dde
How can I do this? I would like to use a for loop.
CodePudding user response:
You can forward fill the time column, then group by the time column:
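# Copy each date down onto the rows that follow it
df['time'] = df['time'].ffill()
# Join the non-missing text values of each date with a newline
out = (df.groupby('time', as_index=False)
         ['text'].agg(lambda x: '\n'.join(x.dropna())))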
print(out)
         time           text
0  01.01.2000  abc\ncde\ndef
1  01.02.2000  abb\nbbc\ndde
2  01.03.2000  123\n278\n782
# Split the aggregated result into one single-row dataframe per time value
groups = [g for name, g in out.groupby('time')]
print(groups)
[         time           text
0  01.01.2000  abc\ncde\ndef,          time           text
1  01.02.2000  abb\nbbc\ndde,          time           text
2  01.03.2000  123\n278\n782]
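Since the question asks for a for loop, here is a minimal sketch of the same idea written as an explicit loop. It assumes the missing cells are real None/NaN values (not the string "None") and that text holds strings. The sample data is rebuilt from the question, and the dictionary name frames is only an illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; missing cells are None.
df = pd.DataFrame({
    "time": ["01.01.2000", None, None, None,
             "01.02.2000", None, None, None,
             "01.03.2000", None, None, None],
    "text": [None, "abc", "cde", "def",
             None, "abb", "bbc", "dde",
             None, "123", "278", "782"],
})

# Copy each date down onto the rows that belong to it.
df["time"] = df["time"].ffill()

# Build one single-row dataframe per date in a plain for loop.
# sort=False keeps the dates in their order of appearance.
frames = {}
for time_value, group in df.groupby("time", sort=False):
    joined = "\n".join(group["text"].dropna())
    frames[time_value] = pd.DataFrame({"time": [time_value], "text": [joined]})

for frame in frames.values():
    print(frame)
```

Keeping the results in a dictionary keyed by the date is usually easier to work with than separate variables such as df1 and df2, for example frames["01.02.2000"] returns the second dataframe.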