I have a German-to-English dictionary with multiple entries for some words. I want to group those entries so that all English translations of the same German word end up in a single row, separated by commas.
I have the following dataframe:
Deutsch Englisch
spindeldürr spindly
Garn {n} [auch fig.] yarn
Schnur {f} twine
Naht {f} suture
zunähen to suture
Faden {m} strand [thread]
Faden {m} thread [also fig.: of conversation]
Flussbett {n} riverbed
Flussbett {n} channel [of a river]
streuen to strew
And I want to produce:
Deutsch Englisch
spindeldürr spindly
Garn {n} [auch fig.] yarn
Schnur {f} twine
Naht {f} suture
zunähen to suture
Faden {m} strand [thread], thread [also fig.: of conversation]
Flussbett {n} riverbed, channel [of a river]
streuen to strew
I created this dataframe from a .txt file using the following code:
import pandas as pd
# A multi-character delimiter is handled by the python parsing engine
df = pd.read_csv('test.txt', delimiter='::', engine='python')
df.columns = df.columns.str.strip()
How can I achieve this using Pandas or other common packages?
CodePudding user response:
Try groupby:
# Old versions of Pandas
>>> df.groupby('Deutsch', sort=False)['Englisch'].agg(', '.join).reset_index()
# Newer versions
>>> df.groupby('Deutsch', sort=False, as_index=False)['Englisch'].agg(', '.join)
Deutsch Englisch
0 spindeldürr spindly
1 Garn {n} [auch fig.] yarn
2 Schnur {f} twine
3 Naht {f} suture
4 zunähen to suture
5 Faden {m} strand [thread], thread [also fig.: of convers...
6 Flussbett {n} riverbed, channel [of a river]
7 streuen to strew
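For reference, here is a minimal end-to-end sketch of the same approach. It assumes the file uses ' :: ' between the two columns, so the inline sample below is a hypothetical stand-in for test.txt. It also strips the padding around the cell values, not only around the column names, because identical German words would otherwise fail to group if their whitespace differed.

```python
import io

import pandas as pd

# Hypothetical sample standing in for test.txt; the ' :: ' layout is an assumption.
raw = """Deutsch :: Englisch
spindeldürr :: spindly
Faden {m} :: strand [thread]
Faden {m} :: thread [also fig.: of conversation]
Flussbett {n} :: riverbed
Flussbett {n} :: channel [of a river]
streuen :: to strew
"""

# A separator longer than one character is interpreted as a regex,
# which requires the python parsing engine.
df = pd.read_csv(io.StringIO(raw), sep="::", engine="python")

# Strip the padding left around the headers and around every cell value.
df.columns = df.columns.str.strip()
df = df.apply(lambda col: col.str.strip())

# sort=False keeps the words in order of first appearance.
grouped = (
    df.groupby("Deutsch", sort=False)["Englisch"]
      .agg(", ".join)
      .reset_index()
)

# to_string renders the whole frame instead of the abbreviated repr.
print(grouped.to_string(index=False))
```

If the real file has more than two columns or uses a different separator, the sep argument and the column names would need adjusting.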