I have a column of ids, and each id has its corresponding word:
id | word |
---|---|
1 | word1 |
1 | word2 |
1 | word3 |
2 | word4 |
2 | word5 |
and so on.
I want to group the words by id, so that id 1 gives [word1, word2, word3], id 2 gives [word4, word5], and so on, and then export the result to a CSV file.
I have this code:
df = pd.DataFrame(data)
d={"word":"first"}
df_new = df.groupby(df['id'], as_index=False).aggregate(d).reindex(columns=df['word'])
print (df_new)
df_new.to_csv('test.csv', sep='\t', encoding='utf-8', index=False)
What do I need to change in order for that to work?
Thank you in advance
CodePudding user response:
# Import Dependencies
import pandas as pd
# Create DataFrame
data = {'id': [1, 1, 1, 2, 2], 'word': ['word1', 'word2', 'word3', 'word4', 'word5']}
df = pd.DataFrame(data)
# Group by id and join each group's words into one comma-separated string
df = df.groupby('id', as_index=False).agg({'word' : ','.join})
# Result
id word
0 1 word1,word2,word3
1 2 word4,word5
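In the question's code, the "first" aggregation keeps only the first word of each group, and reindex(columns=df['word']) replaces the column labels with the word values, so neither step collects the words. Below is a minimal sketch of the full flow, including the CSV export the question asks about. It assumes the same sample data as above; the file name test.csv and the tab separator are taken from the question's code, and the variable names joined and as_lists are only illustrative. The list variant is an alternative in case actual Python lists are wanted rather than a joined string.

```python
import pandas as pd

# Sample data matching the table in the question
data = {'id': [1, 1, 1, 2, 2],
        'word': ['word1', 'word2', 'word3', 'word4', 'word5']}
df = pd.DataFrame(data)

# One row per id, with that id's words joined into a single string
joined = df.groupby('id', as_index=False).agg({'word': ','.join})

# Alternative: keep the words of each id as a Python list
as_lists = df.groupby('id')['word'].agg(list).reset_index()

# Write the joined result to a tab-separated file, as in the question
joined.to_csv('test.csv', sep='\t', encoding='utf-8', index=False)
```

The joined form is probably the easier one to read back from a file, since a list column is written to CSV as its string representation.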