Sum values of two Dataframe with Conditions-CodePudding

it's the situation: in a "for" loop i'm reading csv files as dataframe and add them to a main big df. if new data is not already in main df, it will be simply concat to main df. if new data has already a same value in main df, it has to sum with the pervious one.

an example:

main_df=
         building,  day,  kw/h
         1,         1,    50
         1,         2,    55
         2,         1,    30
         2,         2,    40

new_df_1=
         building,  day,  kw/h
         3,         1,    55
         3,         2,    58

new_df_2=

         building,  day,  kw/h
         2,         1,    15
         2,         2,    19
         2,         3,    14

new_df_2 will simply concat to main_df, but new_df_2 has to sum with existed data. So the desire answer will be:

         building,  day,  kw/h
         1,         1,    50
         1,         2,    55
         2,         1,    45
         2,         2,    59
         2,         3,    14
         3,         1,    55
         3,         2,    58

Question: 1. how can i check that, is there already same building and same day in main_df? 2. how can i sum the values of "KW/h" coulmns? i mean:

for csv in path_list:
read= pd.read_csv(csv)
   if ( **already exist** ):
       **sum with old values**
   else:
       main_df= pd.concat([main_df,read])

i was searching for some function like "merge" with indicator "on=" for automaticly find the right values and sum them. but i couldn't find any thing. Any idea would help.

CodePudding user response：

concat the three DF and then sum the kw/h on building and day

df=pd.concat([main_df, new_df1, new_df2])
df=df.groupby(['building','day'], as_index=False)['kw/h'].sum()
df

    building    day     kw/h
0          1      1     50
1          1      2     55
2          2      1     45
3          2      2     59
4          2      3     14
5          3      1     55
6          3      2     58