I have 2 dataframes as below, some of the index values could be common between the two and I would like to add the values across the two if same index is present. The output should have all the index values present (from 1 & 2) and their cumulative values.
Build
2.1.3.13 2
2.1.3.1 1
2.1.3.15 1
2.1.3.20 1
2.1.3.8 1
2.1.3.9 1
Ref_Build
2.1.3.13 2
2.1.3.10 1
2.1.3.14 1
2.1.3.17 1
2.1.3.18 1
2.1.3.22 1
For example in the above case 2.1.3.13 should show 4 and the remaining 11 of them with 1 each.
What's the efficient way to do this? I tried merge etc., but some of those options were giving me 'intersection' and not 'union'.
Your help is much appreciated.
CodePudding user response:
Use Series.add
and Series.fillna
df1['Build'].add(df2['Ref_Build']).fillna(df1['Build']).fillna(df2['Ref_Build'])
2.1.3.1 1.0
2.1.3.10 1.0
2.1.3.13 4.0
2.1.3.14 1.0
2.1.3.15 1.0
2.1.3.17 1.0
2.1.3.18 1.0
2.1.3.20 1.0
2.1.3.22 1.0
2.1.3.8 1.0
2.1.3.9 1.0
dtype: float64
Or:
pd.concat([df1['Build'], df2['Ref_Build']], axis=1).sum(axis=1)
2.1.3.13 4.0
2.1.3.1 1.0
2.1.3.15 1.0
2.1.3.20 1.0
2.1.3.8 1.0
2.1.3.9 1.0
2.1.3.10 1.0
2.1.3.14 1.0
2.1.3.17 1.0
2.1.3.18 1.0
2.1.3.22 1.0
dtype: float64
CodePudding user response:
You can try merge
with outer
option or concat
on columns
out = pd.merge(df1, df2, left_index=True, right_index=True, how='outer').fillna(0)
# or
out = pd.concat([df1, df2], axis=1).fillna(0)
out['sum'] = out['Build'] out['Ref_Build']
# or with `eval` in one line
out = pd.concat([df1, df2], axis=1).fillna(0).eval('sum = Build Ref_Build')
print(out)
Build Ref_Build sum
2.1.3.13 2.0 2.0 4.0
2.1.3.1 1.0 0.0 1.0
2.1.3.15 1.0 0.0 1.0
2.1.3.20 1.0 0.0 1.0
2.1.3.8 1.0 0.0 1.0
2.1.3.9 1.0 0.0 1.0
2.1.3.10 0.0 1.0 1.0
2.1.3.14 0.0 1.0 1.0
2.1.3.17 0.0 1.0 1.0
2.1.3.18 0.0 1.0 1.0
2.1.3.22 0.0 1.0 1.0