I have a DataFrame like this:
| Text | Label |
| -------- | -------------- |
| ........ | ........ |
| 65 | B-phone |
| 5454 | I-phone |
| 3485 | I-phone |
| ........ | ........ |
How can I merge the three rows above into a single row like this? Specifically, the row labeled B-phone is the anchor (its text, 65 here, can be any value and can appear at any index), and the rows below it with the label "I-phone" should be merged into that B-phone row:
| Text | Label |
| -------- | -------------- |
| ........ | ........ |
| 65 5454 3485 | B-phone |
| ........ | ........ |
CodePudding user response:
Adding an id attribute to assist in grouping the records:
import pandas as pd
df = pd.DataFrame([['65','B-phone'],['5454','I-phone'],['3485','I-phone']], columns=['Text','Label'])
df["id"] = 1  # constant id: every row falls into the same group
# join the texts of each group, and keep the first label of each group
df_text = df.groupby('id')['Text'].agg(' '.join).reset_index()
df_label = df.groupby('id')['Label'].first().reset_index()
out = pd.merge(df_text, df_label)  # merges on the shared 'id' column
Gives:
   id          Text    Label
0   1  65 5454 3485  B-phone
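Note that a constant id only works when the frame holds a single phone number. If the B-phone row can sit at any index among other rows, the id could be derived from the labels instead. The following is a minimal sketch, not a tested solution for your real data: it assumes every "I-phone" row belongs to the nearest non-"I-phone" row above it, and the sample rows (including the "O" label) and the name `group_id` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: two phone numbers with unrelated rows around them.
df = pd.DataFrame(
    [["call", "O"], ["65", "B-phone"], ["5454", "I-phone"], ["3485", "I-phone"],
     ["or", "O"], ["91", "B-phone"], ["2222", "I-phone"]],
    columns=["Text", "Label"],
)

# Start a new group at every row that is NOT an "I-phone" continuation,
# so each "I-phone" row inherits the id of the row above it.
group_id = (df["Label"] != "I-phone").cumsum().rename("group")

# Join the text within each group and keep the label of the group's first row.
out = (
    df.groupby(group_id)
      .agg({"Text": " ".join, "Label": "first"})
      .reset_index(drop=True)
)
print(out)
```

Rows that are not followed by an "I-phone" row pass through unchanged. One caveat: an "I-phone" row that directly follows a non-phone row would be merged into that row, so malformed label sequences may need to be checked first.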
CodePudding user response:
Strictly addressing your stated requirements, this should work:
pd.DataFrame({'ID':[1], 'Text': [' '.join(df['Text'].to_list())], 'Label': [df.iat[0,1]]})
Result:
   ID          Text    Label
0   1  65 5454 3485  B-phone