How to join rows in pandas dataframe based on column value?

Time:07-12

I have a dataframe which looks like this:

time text
01.01.1970 abc
01.01.1970 cde
01.01.1970 fgh
01.01.1980 abc
01.01.1980 xyz

I would like to join the content in text based on column time. I want to join them separated by \n. How can I do this in order to get such a dataframe?

time text
01.01.1970 abc\ncde\nfgh
01.01.1980 abc\nxyz

I tried the following, but I do not get the expected result. Instead, for every row in text I get: text\ntime.

out = (df.groupby('time', as_index=False)
       ['text'].agg(lambda x: '\n'.join(x.dropna())))

CodePudding user response:

df.groupby('time')['text'].apply(lambda x: x.str.cat(sep='\n'))

output:

time    text
01.01.1970  "abc\ncde\nfgh"
01.01.1980  "abc\nxyz"
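As a self-contained sketch of this approach, the snippet below rebuilds the sample frame from the question and applies the same `str.cat` call. The `reset_index()` step is an addition, not part of the answer: the groupby result is a Series indexed by time, and resetting the index turns it back into a two-column dataframe.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": ["01.01.1970", "01.01.1970", "01.01.1970", "01.01.1980", "01.01.1980"],
    "text": ["abc", "cde", "fgh", "abc", "xyz"],
})

# One joined string per time value; the result is a Series indexed by time
joined = df.groupby("time")["text"].apply(lambda x: x.str.cat(sep="\n"))

# Added step (assumption: a dataframe is wanted, as in the question)
out = joined.reset_index()
print(out)
```

With no `na_rep` given, `str.cat` skips missing values, so no separate `dropna` should be needed here.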

CodePudding user response:

It's easier to drop the NaNs before grouping:

df.dropna().groupby('time')['text'].agg('\n'.join)
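A minimal sketch of how this behaves when the text column really contains a missing value. The NaN row is made up for illustration. `subset=["text"]` is also an assumption added here, so that a NaN in some other column does not drop a row with usable text.

```python
import pandas as pd

# Hypothetical variant of the question's data with one missing text value
df_nan = pd.DataFrame({
    "time": ["01.01.1970", "01.01.1970", "01.01.1970", "01.01.1980", "01.01.1980"],
    "text": ["abc", None, "fgh", "abc", "xyz"],
})

# Drop rows whose text is missing, then join the remaining strings per time
out_nan = (
    df_nan.dropna(subset=["text"])
          .groupby("time")["text"]
          .agg("\n".join)
          .reset_index()
)
print(out_nan)
```

Note that a time value whose text entries are all NaN disappears from the result entirely, since its rows are dropped before grouping.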

CodePudding user response:

This answer is longer/uglier than the others but it at least gives you back a dataframe similar to your starting one.

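rows = []
for t in df['time'].unique():
    # join the text of every row that shares this time value
    rows.append([t, "\n".join(df.loc[df['time'] == t, 'text'])])
# assumes df has exactly the two columns time and text
pd.DataFrame(rows, columns=df.columns)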
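One thing the loop does that a default groupby does not: it keeps the time values in order of first appearance, because `unique()` does not sort. If that matters, a groupby with `sort=False` should give the same ordering. The sketch below uses made-up data where the later date comes first.

```python
import pandas as pd

# Hypothetical data in which the later date appears first
df_unsorted = pd.DataFrame({
    "time": ["01.01.1980", "01.01.1980", "01.01.1970"],
    "text": ["abc", "xyz", "cde"],
})

# sort=False keeps groups in order of first appearance, like the loop over unique()
out_unsorted = (
    df_unsorted.groupby("time", sort=False)["text"]
               .agg("\n".join)
               .reset_index()
)
print(out_unsorted)
```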