Pandas: Merging rows into one

Time:09-23

I have the following table:

Name  Age  Data_1  Data_2
Tom   10   Test
Tom   10           Foo
Anne  20           Bar

How can I merge these rows to get this output:

Name  Age  Data_1  Data_2
Tom   10   Test    Foo
Anne  20           Bar

I tried this code (and some related variations: agg, grouping by other fields, et cetera):

import pandas as pd
data = [['tom', 10, 'Test', ''], ['tom', 10, '', 'Foo'], ['Anne', 20, '', 'Bar']]

df = pd.DataFrame(data, columns=['Name', 'Age', 'Data_1', 'Data_2'])
df = df.groupby("Name").sum()

print(df)

But I only get something like this:

c2
Name
-------- --------------
Anne Foo
Tom Bar

CodePudding user response:

Use this if the empty cells are NaN:

(df.set_index(['Name', 'Age'])
   .stack()
   .groupby(level=[0, 1, 2])
   .apply(''.join)
   .unstack()
   .reset_index()
)

Otherwise, add the line df.replace('', np.nan, inplace=True) before the code above (this needs import numpy as np).

# Output
   Name  Age Data_1 Data_2
0  Anne   20    NaN    Bar
1   Tom   10   Test    Foo
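
For reference, here is a self-contained sketch of the approach above. The sample data is rebuilt from the question's table, and the name `merged` is only illustrative. The explicit `.dropna()` is an addition that is not in the snippet above: it assumes that newer pandas versions may keep NaN entries after `stack()`, in which case `''.join` would fail on them.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; empty cells start as empty strings.
data = [['Tom', 10, 'Test', ''], ['Tom', 10, '', 'Foo'], ['Anne', 20, '', 'Bar']]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Data_1', 'Data_2'])

# Turn empty strings into NaN so they can be dropped after stacking.
df = df.replace('', np.nan)

merged = (
    df.set_index(['Name', 'Age'])
      .stack()                      # long format: one entry per (Name, Age, column)
      .dropna()                     # assumption: be explicit in case stack() keeps NaN
      .groupby(level=[0, 1, 2])
      .apply(''.join)               # concatenate the values left in each cell
      .unstack()                    # back to wide format, one row per (Name, Age)
      .reset_index()
)

print(merged)
```

Cells that have no value in any of the merged rows stay NaN after `unstack()`.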

CodePudding user response:

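Just a groupby and a sum will do. This works when the empty cells are empty strings, as in the question's data: summing strings concatenates them, so '' contributes nothing.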

df.groupby(['Name','Age']).sum().reset_index()

   Name  Age Data_1 Data_2
0  Anne   20           Bar
1   tom   10   Test    Foo
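
A runnable sketch of the same idea, assuming the empty cells are empty strings and the data columns hold only strings (a stray number in `Data_1` or `Data_2` would break the concatenation). The sample data is rebuilt from the question's table, and `merged` is an illustrative name.

```python
import pandas as pd

# Sample data rebuilt from the question; empty cells are empty strings.
data = [['Tom', 10, 'Test', ''], ['Tom', 10, '', 'Foo'], ['Anne', 20, '', 'Bar']]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Data_1', 'Data_2'])

# Group on both key columns so Age is kept as a key instead of being summed.
# Summing the string columns concatenates the values within each group.
merged = df.groupby(['Name', 'Age']).sum().reset_index()

print(merged)
```

Grouping by `Name` alone, as in the question, would make `Age` one of the summed columns, which is why both keys are passed here.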