I currently have a dataframe like this
colA colB colC colD
1 1a Yes Yes
1 1b Yes Yes
2 2a Yes Yes
2 2b Yes Yes
2 2c Yes Yes
3 3a Yes Yes
4 4a Yes Yes
Say I have a dictionary like this
filterDict = {
"1" : ["1a","1b"]
"2" : ["2a","2b"]
}
I want it so that if in colA a value exists as a key, check if colB in that same row exists in the value list of the dict for that key so that the resulting dataframe would look like this
colA colB colC colD
1 1a No No
1 1b No No
2 2a No No
2 2b No No
2 2c Yes Yes
3 3a Yes Yes
4 4a Yes Yes
I am relatively new to pandas and am unsure how to do this. What I previously had was something hardcoded like this
mask = ((df["colA"] == "1") & (df["colB"].isin(["1a","1b"])) |
((df["colB"] == "2") & (df["colB"].isin(["2a","2b"]))
df.loc[mask,["colC","colD"]] = ["No","No"]
But I would like this to be dynamic so that it could be put in the dict format
CodePudding user response:
You could try this
for index, row in df.iterrows():
if str(row['colA']) in filterDict:
if row['colB'] in filterDict[str(row['colA'])]:
df.loc[index, 'colC'] = 'No'
df.loc[index, 'colD'] = 'No'
continue
df.loc[index, 'colC'] = 'Yes'
df.loc[index, 'colD'] = 'Yes'
Based on your expected output, I've assumed 'No'
if a match is found, otherwise 'Yes'
.
CodePudding user response:
Flatten the fitersDict
into list of tuples then create a multindex from it, the use Multindex.isin
to test from the occurrence of multiindex in the columns colA
, colB
of given dataframe
i1 = df.set_index(['colA', 'colB']).index
i2 = pd.MultiIndex.from_tuples((a, b) for a, l in filterDict.items() for b in l)
mask = ~i1.isin(i2)
df['colC'], df['colD'] = mask, mask
colA colB colC colD
0 1 1a False False
1 1 1b False False
2 2 2a False False
3 2 2b False False
4 2 2c True True
5 3 3a True True
6 4 4a True True