Home > front end >  how to add a dataframe with another dataframe and add common columns values based on a column?
how to add a dataframe with another dataframe and add common columns values based on a column?

Time:05-28

my first data frame:

    import pandas as pd
    
    df1 = pd.DataFrame({'CONTRACT':['Tom', 'nick', 'krish', 'jack'],
            'buy_qty':[20, 40, 50, 60],
            'buy_val':[100,120,140,160],
            'factor':[15,20,25,30],
            })
    
        CONTRACT    buy_qty buy_val factor
    0   Tom          20      100    96
    1   nick         40      110    93
    2   krish        50      140    91
    3   jack         60      160    83
      

second data frame:

df2 = pd.DataFrame({'CONTRACT': ['Tom', 'nick', 'amit', 'joy'],
            'buy_qty': [5, 6, 15, 30],
            'buy_val': [15, 20, 35, 16],
            'factor': [67, 89, 76, 69]}

df2:
        CONTRACT    buy_qty buy_val factor
    0   Tom           5     15       67
    1   nick          6     20       89
    2   amit          15    35       76
    3   joy           30    16       69

I want dataframe Like this (all CONTRACT values of df2 and uncommon CONTRACT values of df1, by adding common CONTRACTS's buy_qty and buy_val and factor will be from df2(latest data frame) ...)-

        CONTRACT    buy_qty  buy_val  factor
    0   Tom         25*        115*      67
    1   nick        46*        130*      89
    2   krish       50         140       91
    4   jack        60         160       83
    2   amit        15         35        76
    3   joy         30         16        69
* denotes added values

I tried like this-

final = pd.concat([df2, df1]).add(df1['buy_qty','buy_val'],df2['buy_qty','buy_val']).drop_duplicates(subset=["CONTRACT"]).reset_index(drop=True)

but it's not working fine.

Can anyone suggest a better way?

CodePudding user response:

Unlike your previous question, here you have to groupby because you want to aggregate some values on your columns:

  • buy_qty: sum
  • buy_val: sum
  • factor: first*

* first because you want to keep the values from df2 and this is the first dataframe on pd.concat.

out = (pd.concat([df2, df1]).groupby('CONTRACT', as_index=False)
         .agg({'buy_qty': 'sum', 'buy_val': 'sum', 'factor': 'first'})
print(out)

# Output
  CONTRACT  buy_qty  buy_val  factor
0      Tom       25      115      67
1     amit       15       35      76
2     jack       60      160      83
3      joy       30       16      69
4    krish       50      140      91
5     nick       46      130      89

CodePudding user response:

You can use add with fill_value=0:

(df1
 .set_index('CONTRACT')
 .add(df2.set_index('CONTRACT'), fill_value=0)
 .reset_index()
)

output:

  CONTRACT  buy_qty  buy_val  factor
0      Tom     25.0    115.0   163.0
1     amit     15.0     35.0    76.0
2     jack     60.0    160.0    83.0
3      joy     30.0     16.0    69.0
4    krish     50.0    140.0    91.0
5     nick     46.0    130.0   182.0

CodePudding user response:

Try:

df3=pd.concat([df1,df2])
df3.groupby('CONTRACT').sum()

result into:

    buy_qty     buy_val     factor
CONTRACT            
Tom     40  200     30
amit    50  140     25
jack    60  160     30
joy     60  160     30
krish   50  140     25
nick    80  240     40
  • Related