I have a dataframe, called target, that I would like to fill, based on some rules, from an old dataframe, called source.
import pandas as pd
import numpy as np
target_d = {'ColA': ['a1', 'a1', 'a2', 'a2', 'a3', 'a3', 'a4'],
'ColB': ['X', 'Y', 'X', 'Y', 'X', 'Y', 'X'],
'ColC': ['c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7']
}
target = pd.DataFrame(data=target_d)
source_d = {'Col1': ['a1', 'a2', 'a3', 'a4'],
'Col2': ['e1', 'e2', 'e3', 'e4'],
'Col3': ['f1', 'f2', 'f3', 'f4'],
'Col4': ['g1', 'g2', 'g3', 'g4'],
}
source = pd.DataFrame(data=source_d)
source = source.set_index('Col1')
target.loc[target['ColB'] == 'X', 'ColC'] = source.loc['a1', 'Col4']
The above starts with our two input dataframes:
ColA ColB ColC
0 a1 X c1
1 a1 Y c2
2 a2 X c3
3 a2 Y c4
4 a3 X c5
5 a3 Y c6
6 a4 X c7
Col2 Col3 Col4
Col1
a1 e1 f1 g1
a2 e2 f2 g2
a3 e3 f3 g3
a4 e4 f4 g4
to then update target to provide:
ColA ColB ColC
0 a1 X g1
1 a1 Y c2
2 a2 X g1
3 a2 Y c4
4 a3 X g1
5 a3 Y c6
6 a4 X g1
however, what I would like is:
ColA ColB ColC
0 a1 X g1
1 a1 Y c2
2 a2 X g2
3 a2 Y c4
4 a3 X g3
5 a3 Y c6
6 a4 X g4
I want to condition on the ColB value being X and to then use the ColA value to find the Col4 value in the source, by using the index.
I somehow want the right hand side of the equality to read
source.loc[target.loc['ContractID'], 'Col4']
to use my indexing to find which value I want. I don't know if I'm completely barking up the wrong tree since the right hand side is normally a fixed value in everything that I've seen.
Any ideas of how I can produce the desired output?
CodePudding user response:
You can use create a dictionary
with keys
being the index
of source and respective values
being Col4 of source.
Then, map
map with loc
(to locate where ColB equals 'X') to update your target DF
:
d = dict(zip(source.index,source.Col4))
target.loc[target['ColB'].eq('X'), 'ColC'] = target['ColA'].map(d)
Prints:
>>> target
ColA ColB ColC
0 a1 X g1
1 a1 Y c2
2 a2 X g2
3 a2 Y c4
4 a3 X g3
5 a3 Y c6
6 a4 X g4
Setup:
>>> target.to_dict()
{'ColA': {0: 'a1', 1: 'a1', 2: 'a2', 3: 'a2', 4: 'a3', 5: 'a3', 6: 'a4'},
'ColB': {0: 'X', 1: 'Y', 2: 'X', 3: 'Y', 4: 'X', 5: 'Y', 6: 'X'},
'ColC': {0: 'c1', 1: 'c2', 2: 'c3', 3: 'c4', 4: 'c5', 5: 'c6', 6: 'c7'}}
>>> source.to_dict()
{'Col2': {'a1': 'e1', 'a2': 'e2', 'a3': 'e3', 'a4': 'e4'},
'Col3': {'a1': 'f1', 'a2': 'f2', 'a3': 'f3', 'a4': 'f4'},
'Col4': {'a1': 'g1', 'a2': 'g2', 'a3': 'g3', 'a4': 'g4'}}