I have a dataframe which looks like this:
time text
01.01.1970 abc
01.01.1970 cde
01.01.1970 fgh
01.01.1980 abc
01.01.1980 xyz
I would like to join the content in text based on column time, with the joined values separated by \n. How can I do this to get a dataframe like the following?
time text
01.01.1970 abc\ncde\nfgh
01.01.1980 abc\nxyz
I tried the following, but instead of the expected result I get text\ntime for every row in text:
out = (df.groupby('time', as_index=False)['text']
         .agg(lambda x: '\n'.join(x.dropna())))
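A possible explanation for the text\ntime result, offered as a guess because the snippet above does select the text column: joining a whole DataFrame (or a group that is still a DataFrame) iterates over its column labels rather than its values. The sketch below rebuilds the sample frame from the question and shows the difference; the variable names joined_labels and joined_values are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": ["01.01.1970", "01.01.1970", "01.01.1970", "01.01.1980", "01.01.1980"],
    "text": ["abc", "cde", "fgh", "abc", "xyz"],
})

# Iterating over a DataFrame yields its column labels, in column order
joined_labels = "\n".join(df)

# Iterating over a Series yields its values
joined_values = "\n".join(df["text"])

print(repr(joined_labels))
print(repr(joined_values))
```

If the lambda ever receives a DataFrame instead of the text Series, for example when the column selection is dropped, the labels are what gets joined.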
CodePudding user response:
df.groupby('time')['text'].apply(lambda x: x.str.cat(sep='\n'))
Output (for the data in the question):
time text
01.01.1970 abc\ncde\nfgh
01.01.1980 abc\nxyz
CodePudding user response:
It's easier to drop the NaNs before grouping:
df.dropna().groupby('time')['text'].agg('\n'.join)
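Note that df.dropna() with no arguments drops a row if any column is NaN. A minimal sketch of a narrower variant, assuming only missing values in text should be skipped; the sample frame with one NaN is made up for illustration, and reset_index() is added to get a regular dataframe back:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the question's data plus one missing text value
df = pd.DataFrame({
    "time": ["01.01.1970", "01.01.1970", "01.01.1970",
             "01.01.1980", "01.01.1980", "01.01.1980"],
    "text": ["abc", "cde", "fgh", "abc", np.nan, "xyz"],
})

out = (
    df.dropna(subset=["text"])      # only rows with a missing text are dropped
      .groupby("time")["text"]
      .agg("\n".join)               # join each group's values with a newline
      .reset_index()                # turn the time index back into a column
)
print(out)
```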
CodePudding user response:
This answer is longer/uglier than the others but it at least gives you back a dataframe similar to your starting one.
rows = []
for x in df.time.unique():
    # Join the text of every row whose time equals x
    rows.append([x, "\n".join(df[df.time == x].text.values)])
pd.DataFrame(rows, columns=df.columns)
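For comparison, a rough sketch of how the same shape of result can be obtained without the explicit loop. It assumes the frame has exactly the two columns time and text; sort=False keeps the groups in order of first appearance, which matches what unique() does in the loop. The sample rows are reordered on purpose so that the ordering is visible.

```python
import pandas as pd

# Sample data from the question, with the 1980 rows placed first
df = pd.DataFrame({
    "time": ["01.01.1980", "01.01.1980", "01.01.1970", "01.01.1970", "01.01.1970"],
    "text": ["abc", "xyz", "abc", "cde", "fgh"],
})

# Loop version from the answer above
rows = []
for x in df.time.unique():
    rows.append([x, "\n".join(df[df.time == x].text.values)])
loop_out = pd.DataFrame(rows, columns=df.columns)

# groupby version; sort=False preserves first-appearance order of the keys
grouped_out = (
    df.groupby("time", sort=False)["text"]
      .agg("\n".join)
      .reset_index()
)

print(loop_out)
print(grouped_out)
```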