How to concat strings in rows groupby column and index in dataframe?

Time:12-07

I have a dataframe that looks like this:

    text
0   dog went to the store.    
0   cat is tall.
0   blue is red.
1   red is blue
1   blue is red

How do I concatenate the strings of all rows that share the same index value, so that the new df looks like this?

    text
0   dog went to the store. cat is tall. blue is red.  
1   red is blue blue is red

I've tried the following, but it seems to do nothing: it returns the same number of rows as the original. I'm not sure where to go from here:

 df[['text']].groupby(df.index)['text'].transform(lambda x: ','.join(x))

CodePudding user response:

import pandas as pd

# create a sample dataframe (the group key is stored in a column named 'index' here, not in the actual index)
df = pd.DataFrame({'text': ['dog went to the store.', 'cat is tall.', 'blue is red.', 'red is blue', 'blue is red'],
                   'index': [0, 0, 0, 1, 1]})

# concatenate the strings by row and group by index
df = df.groupby('index')['text'].apply(lambda x: ' '.join(x)).reset_index()

# print the resulting dataframe
print(df)
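
If the group key is the dataframe's actual index, as in the question, rather than a separate column, the same idea might be sketched as follows. This assumes the index holds the repeated labels 0 and 1 shown in the question; the name `result` is only illustrative.

```python
import pandas as pd

# Rebuild the question's frame with a repeated index (assumed from the question).
df = pd.DataFrame(
    {'text': ['dog went to the store.', 'cat is tall.', 'blue is red.',
              'red is blue', 'blue is red']},
    index=[0, 0, 0, 1, 1],
)

# Group on the index itself (level 0) and join each group's strings with a space.
# to_frame() turns the resulting Series back into a one-column dataframe.
result = df.groupby(level=0)['text'].agg(' '.join).to_frame()

print(result)
```

Grouping with `level=0` avoids having to copy the index into a column first.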

CodePudding user response:

A possible solution (just replace transform with agg, which returns one row per group instead of one row per original row):

df[['text']].groupby(df.index)['text'].agg(lambda x: ','.join(x))

Output:

0    dog went to the store.,cat is tall.,blue is red.
1                             red is blue,blue is red
Name: text, dtype: object
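
To see why the original attempt kept the same number of rows, here is a small side-by-side sketch of `transform` and `agg`. It assumes the question's data with a repeated index; the names `grouped`, `broadcast`, and `reduced` are made up for illustration.

```python
import pandas as pd

# Same data as in the question, with a repeated index (assumed).
df = pd.DataFrame(
    {'text': ['dog went to the store.', 'cat is tall.', 'blue is red.',
              'red is blue', 'blue is red']},
    index=[0, 0, 0, 1, 1],
)

grouped = df.groupby(df.index)['text']

# transform broadcasts each group's joined string back onto every original row,
# so the result has as many rows as df.
broadcast = grouped.transform(lambda x: ','.join(x))

# agg reduces each group to a single value, so the result has one row per group.
reduced = grouped.agg(lambda x: ','.join(x))

print(len(broadcast), len(reduced))
```

So `transform` was not doing nothing: it joined the strings, but repeated the joined value on each row of its group.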