How to merge two rows with the same value in a given column

Time:12-07

Hey I have a DataFrame like this:

import pandas as pd

d = {"KEY": ["KEY2", "KEY2"], "String value": ["value 1", "value 2"], "list value": [["val1"], ["val2"]]}
df = pd.DataFrame(d)
df

The KEY column has the same value in both rows. I want to merge these two rows (or more, in my real DataFrame) into one row, so that the values from the given columns (all except KEY and possibly one more column) are added together. So in the end I want something like this:

d2 = {"KEY": ["KEY2"], "String value": ["value 1" + "value 2"], "list value": [["val1"] + ["val2"]]}
res = pd.DataFrame(d2)
res

How can I do that?

CodePudding user response:

Use groupby and sum.

You can process each column separately and leave some columns "unchanged" by keeping the first value, assuming it is the same for all rows with the same key.

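import pandas as pd

d = {"KEY": ["KEY2", "KEY2"], "String value": ["value 1", "value 2"], "list value": [["val1"], ["val2"]],
     "other value": [1, 1]}
df = pd.DataFrame(d)

# Concatenate the strings within each KEY group
df2 = pd.DataFrame(df.groupby('KEY')['String value'].apply(lambda x: ''.join(x)))
# Summing lists concatenates them
df2['list value'] = df.groupby('KEY')['list value'].sum()
# Keep the first value of a column that is identical within each group
df2['other value'] = df.groupby('KEY')['other value'].first()
# Turn KEY from the index back into a regular column
df2 = df2.reset_index()
df2
    KEY    String value    list value  other value
0  KEY2  value 1value 2  [val1, val2]            1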
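
The same per-column approach can also be written as a single groupby with one aggregation function per column. This is only a sketch of an alternative, not part of the original answer: the name `merged` and the list-flattening lambda are illustrative choices, and it assumes "other value" is identical within each KEY group.

```python
import pandas as pd

df = pd.DataFrame({
    "KEY": ["KEY2", "KEY2"],
    "String value": ["value 1", "value 2"],
    "list value": [["val1"], ["val2"]],
    "other value": [1, 1],
})

# One aggregation per column; KEY is the grouping column and,
# with as_index=False, stays a regular column in the result.
merged = df.groupby("KEY", as_index=False).agg({
    # concatenate the strings of each group
    "String value": lambda s: "".join(s),
    # flatten the lists of each group into one list
    "list value": lambda s: [item for lst in s for item in lst],
    # assumed identical within a group, so keep the first value
    "other value": "first",
})
print(merged)
```

Columns that are not listed in the dictionary are dropped from the result, so every column you want to keep needs an entry.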