I have a dataframe that looks like this:
text
0 dog went to the store.
0 cat is tall.
0 blue is red.
1 red is blue
1 blue is red
How do I concatenate the strings row by row, grouped by index, so that the new df looks like this?
text
0 dog went to the store. cat is tall. blue is red.
1 red is blue.blue is red
I've tried the following, but it doesn't collapse anything: it returns the same number of rows as the original. I'm not sure where to go from here:
df[['text']].groupby(df.index)['text'].transform(lambda x: ','.join(x))
CodePudding user response:
import pandas as pd
# create a sample dataframe
# (note: the group labels are stored in an 'index' column here, not in the DataFrame's own index)
df = pd.DataFrame({'text': ['dog went to the store.', 'cat is tall.', 'blue is red.', 'red is blue', 'blue is red'],
                   'index': [0, 0, 0, 1, 1]})
# group by the 'index' column and join each group's strings with a space
df = df.groupby('index')['text'].apply(lambda x: ' '.join(x)).reset_index()
# print the resulting dataframe
print(df)
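If the group labels live in the DataFrame's own index, as the question's printout suggests, rather than in a separate column, grouping on the index level should work too. A minimal sketch, assuming the index is unnamed and holds the values 0, 0, 0, 1, 1 (the sample data is reconstructed from the question, and `result` is just an illustrative name):

```python
import pandas as pd

# Sample data reconstructed from the question; the group labels are the index itself
df = pd.DataFrame(
    {"text": ["dog went to the store.", "cat is tall.", "blue is red.",
              "red is blue", "blue is red"]},
    index=[0, 0, 0, 1, 1],
)

# level=0 groups on the (only) index level; " ".join is applied to each group's strings
result = df.groupby(level=0)["text"].agg(" ".join).to_frame()
print(result)
```

This keeps the group labels as the index of the result, so no reset_index() is needed unless you want them back as a column.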
CodePudding user response:
A possible solution (just replace transform with agg):
df[['text']].groupby(df.index)['text'].agg(lambda x: ','.join(x))
Output:
0 dog went to the store.,cat is tall.,blue is red.
1 red is blue,blue is red
Name: text, dtype: object
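The reason the original attempt kept every row is that transform broadcasts each group's result back onto the group's original rows, while agg reduces each group to a single value. A small sketch on made-up data (the Series `s` and its values are purely illustrative) that shows the difference:

```python
import pandas as pd

# Illustrative data: two rows share index 0, one row has index 1
s = pd.Series(["a", "b", "c"], index=[0, 0, 1], name="text")

# transform: the joined string is repeated for every row of its group
transformed = s.groupby(level=0).transform(lambda x: ",".join(x))

# agg: one joined string per group
aggregated = s.groupby(level=0).agg(lambda x: ",".join(x))

print(len(transformed))  # 3 rows, same as the input
print(len(aggregated))   # 2 rows, one per group
```

To get the space-separated result asked for in the question, the separator would be ' ' instead of ','.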