So, I have a dictionary, let's call it map X, that looks like this:
{'a':(a_1, a_2, a_3),
'b':(b_1, b_2, b_3),
'c':(c_1, c_2, c_3)}
and a table that looks something like this:
Attribute 1 | Attribute 2 |
---|---|
a | a_1 |
a | a_2 |
c | c_3 |
b | a_1 |
a | a_2 |
b | b_3 |
We can see that the fourth record associates b and a_1. However, a_1 does not belong to the list of values associated with b in map X. Generally speaking, I want to flag when something like this happens.
Using Python (pandas, preferably, but plain Python is okay), how do I confirm that each Attribute 1 value is paired with a member of the collection of values associated with it in map X?
CodePudding user response:
First flatten the dictionary so that each value maps back to its key. Then map the second column with Series.map
and flag the rows where the result is not equal to the first column:
d = {x: k for k, v in X.items() for x in v}
df['test'] = df['Attribute 2'].map(d).ne(df['Attribute 1'])
print (df)
  Attribute 1 Attribute 2   test
0           a         a_1  False
1           a         a_2  False
2           c         c_3  False
3           b         a_1   True
4           a         a_2  False
5           b         b_3  False
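For reference, here is a self-contained sketch of the flatten-and-map idea. The string literals such as 'a_1' are assumed placeholders for the real data, and the approach assumes that every value belongs to exactly one key; if a value could appear under two keys, the flattened dictionary would keep only the last one.

```python
import pandas as pd

# Sample data; the string values are placeholders (an assumption),
# standing in for whatever the real tuples contain.
X = {
    "a": ("a_1", "a_2", "a_3"),
    "b": ("b_1", "b_2", "b_3"),
    "c": ("c_1", "c_2", "c_3"),
}
df = pd.DataFrame(
    {
        "Attribute 1": ["a", "a", "c", "b", "a", "b"],
        "Attribute 2": ["a_1", "a_2", "c_3", "a_1", "a_2", "b_3"],
    }
)

# Invert the mapping: each value points back to the key that owns it.
d = {value: key for key, values in X.items() for value in values}

# True marks a row whose Attribute 2 does not belong to its Attribute 1.
# Values missing from d map to NaN, which never equals a key, so they are flagged too.
df["test"] = df["Attribute 2"].map(d).ne(df["Attribute 1"])

# Show only the offending rows.
print(df[df["test"]])
```

Filtering with df[df["test"]] returns just the rows that need attention, which is often more useful than the full flag column.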
CodePudding user response:
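You can also use apply on rows and check membership against X directly (True marks a mismatch; the empty-tuple default covers keys that are missing from X):
df['isin'] = df.apply(lambda row: row['Attribute 2'] not in X.get(row['Attribute 1'], ()), axis=1)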
print(df)
  Attribute 1 Attribute 2   isin
0           a         a_1  False
1           a         a_2  False
2           c         c_3  False
3           b         a_1   True
4           a         a_2  False
5           b         b_3  False
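If a value may legitimately appear under more than one key, or a key may be missing from X, one alternative is to build a set of valid (key, value) pairs and test each row against it. This is only a sketch: the sample strings and the column name invalid are assumptions made for illustration.

```python
import pandas as pd

# Sample data; the string values and the "invalid" column name are assumptions.
X = {
    "a": ("a_1", "a_2", "a_3"),
    "b": ("b_1", "b_2", "b_3"),
    "c": ("c_1", "c_2", "c_3"),
}
df = pd.DataFrame(
    {
        "Attribute 1": ["a", "a", "c", "b", "a", "b"],
        "Attribute 2": ["a_1", "a_2", "c_3", "a_1", "a_2", "b_3"],
    }
)

# Every (key, value) combination that X allows.
valid_pairs = {(key, value) for key, values in X.items() for value in values}

# Plain-Python membership test per row; unknown keys are simply not in the set.
df["invalid"] = [
    pair not in valid_pairs
    for pair in zip(df["Attribute 1"], df["Attribute 2"])
]

print(df)
```

Because the check is a set lookup on the pair itself, it does not depend on values being unique across keys, and it avoids the per-row overhead of apply.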