How to merge two rows into one row in pandas?

Time:03-17

I have this DataFrame:

    id  type      value
0  104     0       7999
1  105     1  196193579
2  108     0     245744
3  108     1   93310128

I need to merge rows that share the same id and keep both values in a single row. The following example shows the result I need:

    id  type  value_0    value_1
0  104     0     7999          0
1  105     1        0  196193579
2  108     0   245744   93310128

I have the following code, which groups by id and builds the new values for each group:

def concat_rows(self, rows):
    row = rows.iloc[0]

    if len(rows) > 1:
        row1 = rows.iloc[0]
        row2 = rows.iloc[1]
        row['value_1'] = row1['value'] if row1['type'] == 1 else row2['value']
        row['value_0'] = row1['value'] if row1['type'] == 0 else row2['value']
    else:
        row['value_1'] = row['value'] if row['type'] == 1 else 0
        row['value_0'] = row['value'] if row['type'] == 0 else 0
    return row

df2 = df.groupby('id').apply(self.concat_rows).reset_index(drop=True)

But I get the following table, in which the numbers have been modified:

    id  value    type  value_1  value_0
0  104   7999     0        0     7999
1  105     99     1       99      399
2  108  10770     0    12118    10770

Data:

{'id': [104, 105, 108, 108],
 'type': [0, 1, 0, 1],
 'value': [7999, 196193579, 245744, 93310128]}

CodePudding user response:

It seems you want to keep the "type" column values as well. You could first use groupby to get the "type" column, then use pivot to get the remaining columns and merge the result with the "type" and "id" columns:

# pivot() only accepts keyword arguments in pandas 2.0 and later
out = (df.groupby('id')['type'].first().reset_index()
       .merge(df.pivot(index='id', columns='type', values='value')
              .add_prefix('value_')
              .fillna(0).reset_index(), on='id'))

Or use pivot with assign:

# pivot() only accepts keyword arguments in pandas 2.0 and later
out = (df.pivot(index='id', columns='type', values='value')
       .add_prefix('value_').fillna(0)
       .assign(type=df.groupby('id')['type'].first())
       .reset_index()
       [['id','type','value_0','value_1']]
       .rename_axis(columns=[None]))

Output:

    id  type   value_0      value_1
0  104     0    7999.0          0.0
1  105     1       0.0  196193579.0
2  108     0  245744.0   93310128.0
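
The values above come out as floats because the missing cells are NaN before fillna replaces them. If integer output like the question's example is wanted, one possible variation (a sketch, not part of the original answer) is to reshape with set_index and unstack, passing fill_value=0 so that no NaN is introduced. The names `data` and `wide` are illustrative, and the sketch assumes each (id, type) pair occurs at most once:

```python
import pandas as pd

# Sample data from the question
data = {'id': [104, 105, 108, 108],
        'type': [0, 1, 0, 1],
        'value': [7999, 196193579, 245744, 93310128]}
df = pd.DataFrame(data)

wide = (df.set_index(['id', 'type'])['value']
          .unstack(fill_value=0)           # one column per type, gaps filled with 0
          .add_prefix('value_')            # columns become value_0, value_1
          .assign(type=df.groupby('id')['type'].first())  # first type per id (aligned on id)
          .reset_index()
          [['id', 'type', 'value_0', 'value_1']]
          .rename_axis(columns=None))      # drop the leftover "type" columns label

print(wide)
```

Like pivot, unstack raises an error if the same (id, type) pair appears more than once, so duplicates would need to be aggregated first.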

CodePudding user response:

You can use:

df = df.pivot_table(index=['id'], columns=['type'], values=['value'], fill_value=0).reset_index()
df.columns = ['_'.join(map(str, col)).strip('_') for col in df.columns]

Output:

    id   value_0      value_1
0  104    7999.0          0.0
1  105       0.0  196193579.0
2  108  245744.0   93310128.0
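
Note that this output no longer has the "type" column, and that pivot_table aggregates duplicate (id, type) pairs with the mean by default. A hedged sketch of one way to put "type" back is shown here; it assumes the first type per id is the one to keep, passes aggfunc='first' so values are not averaged, and uses the illustrative names `wide` and `first_type`:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'id': [104, 105, 108, 108],
                   'type': [0, 1, 0, 1],
                   'value': [7999, 196193579, 245744, 93310128]})

# values is passed as a plain string, so the result has single-level columns 0 and 1
wide = df.pivot_table(index='id', columns='type', values='value',
                      fill_value=0, aggfunc='first')
wide.columns = [f'value_{c}' for c in wide.columns]
wide = wide.reset_index()

# Look up the first type seen for each id and insert it right after "id"
first_type = df.groupby('id')['type'].first()
wide.insert(1, 'type', wide['id'].map(first_type))

print(wide)
```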