I'm trying to update the "qty" column of a DataFrame from the "qty" column of another DataFrame, but only for specific rows (those whose "type" is in a given list).
Here are my example DataFrames:
import pandas as pd
df = pd.DataFrame({'op': ['A', 'A', 'A', 'B', 'C'], 'type': ['X', 'Y', 'Z', 'X', 'Z'], 'qty': [3, 1, 8, 0, 4]})
df_xy = pd.DataFrame({'op': ['A', 'B', 'C'], 'qty': [10, 20, 30]})
print(df)
print(df_xy)
  op type  qty
0  A    X    3
1  A    Y    1
2  A    Z    8
3  B    X    0
4  C    Z    4
  op  qty
0  A   10
1  B   20
2  C   30
I tried to use loc to select the relevant rows and to look up the matching value in the other DataFrame through the shared "op" column, but without success:
# Select df rows where "type" is in "types" and set "qty" according to "qty" from df_xy
types = ['X', 'Y']
df.loc[df['type'].isin(types), 'qty'] = df_xy.loc[df_xy['op'] == df['op'], 'qty']
print(df)
I would like to end up with a DataFrame like this:
  op type  qty
0  A    X   10
1  A    Y   10
2  A    Z    8
3  B    X   20
4  C    Z    4
But I get an error saying that I cannot compare Series objects that are not labeled identically:
ValueError: Can only compare identically-labeled Series objects
Any help is much appreciated! Thanks in advance!
CodePudding user response:
Use Series.map, applied only to the filtered rows on both sides of the assignment, so that unmatched rows (here the Z rows) are not processed:
types = ['X', 'Y']
mask = df['type'].isin(types)
df.loc[mask, 'qty'] = df.loc[mask, 'op'].map(df_xy.set_index('op')['qty'])
print(df)
  op type  qty
0  A    X   10
1  A    Y   10
2  A    Z    8
3  B    X   20
4  C    Z    4
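For context, the original attempt fails because df_xy['op'] == df['op'] compares a 3-row Series with a 5-row one, and pandas refuses to compare Series whose indexes differ. The map approach avoids that comparison by looking each "op" up in a Series indexed by "op". Below is a minimal self-contained sketch of it. The helper name update_qty is hypothetical, and the fillna step is an addition of mine that keeps the original "qty" when an "op" has no match in df_xy. It assumes "op" values are unique in df_xy.

```python
import pandas as pd

df = pd.DataFrame({'op': ['A', 'A', 'A', 'B', 'C'],
                   'type': ['X', 'Y', 'Z', 'X', 'Z'],
                   'qty': [3, 1, 8, 0, 4]})
df_xy = pd.DataFrame({'op': ['A', 'B', 'C'], 'qty': [10, 20, 30]})


def update_qty(df, df_xy, types):
    """Hypothetical helper: return a copy of df whose qty is taken from
    df_xy (matched on op) for rows whose type is in types."""
    out = df.copy()
    mask = out['type'].isin(types)
    # Lookup Series indexed by op; assumes op is unique in df_xy.
    lookup = df_xy.set_index('op')['qty']
    mapped = out.loc[mask, 'op'].map(lookup)
    # Assumption: rows whose op is missing from df_xy keep their old qty.
    out.loc[mask, 'qty'] = mapped.fillna(out.loc[mask, 'qty'])
    return out


result = update_qty(df, df_xy, ['X', 'Y'])
print(result)
```

Returning a copy instead of modifying df in place is only a design choice for the sketch; the in-place version above works the same way.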
CodePudding user response:
You could combine loc and merge to align your two Series:
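# how='left' keeps every row of df, in order, so the merged result lines up with df's index
df.loc[df['type'].isin(types), 'qty'] = df[['op']].merge(df_xy, on='op', how='left')['qty']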
output:
  op type  qty
0  A    X   10
1  A    Y   10
2  A    Z    8
3  B    X   20
4  C    Z    4
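The merge approach works only if the merged result lines up row for row with df, because the assignment aligns on the index. Here is a hedged sketch that makes this explicit. It assumes "op" is unique in df_xy (duplicates would multiply rows), and it restores df's index on the merged frame, since merge returns a fresh 0..n-1 index.

```python
import pandas as pd

df = pd.DataFrame({'op': ['A', 'A', 'A', 'B', 'C'],
                   'type': ['X', 'Y', 'Z', 'X', 'Z'],
                   'qty': [3, 1, 8, 0, 4]})
df_xy = pd.DataFrame({'op': ['A', 'B', 'C'], 'qty': [10, 20, 30]})

types = ['X', 'Y']
mask = df['type'].isin(types)

# A left merge keeps one output row per row of df, in the same order,
# as long as op is unique in df_xy (an assumption of this sketch).
merged = df[['op']].merge(df_xy, on='op', how='left')

# merge resets the index, so copy df's index back before aligning.
merged.index = df.index

# Only the masked rows are overwritten; Z rows keep their original qty.
df.loc[mask, 'qty'] = merged.loc[mask, 'qty']
print(df)
```

With the default inner merge, any "op" missing from df_xy would drop rows and shift the remaining ones, so values could silently land on the wrong rows.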