I have dataframe named df which has two columns id1 and id2
I need to filter values based on some other df named as meta_df
meta_df has three columns id,name,text
df
id1 | id2 |
---|---|
12 | 34 |
99 | 42 |
metadf
id | name | text |
---|---|---|
12 | aa | lowerend |
42 | bb | upperend |
99 | cc | upper limit |
34 | dd | uppersome |
I need values from text which have lower and upper in string of text. e.g 12 and 34 I am trying the below code and stuck at getting text clumn
for row in df.itertuples():
print(row.Index, row.id1, row.id2)
print(meta_df[id['id']== row.id1])
print(meta_df[id['id']== row.id2])
Output Expected
id2 | id2 | flag |
---|---|---|
12 | 34 | yes |
99 | 42 | no |
CodePudding user response:
Melt df
and merge to metadf
, a bit of reshaping before getting the final value:
# keep the index with ignore_index
# it will be used when reshaping back to original form
reshaped = (df.melt(value_name = 'id', ignore_index = False)
.assign(ind=lambda df: df.index)
.merge(metadf, on='id', how = 'left')
.assign(text = lambda df: df.text.str.contains('lower'))
.drop(columns='name')
.pivot('ind', 'variable')
.rename_axis(columns=[None, None], index=None)
)
# if the row contains both lower(1) and upper(0)
# it will sum to 1, else 0, or 2(unlikely with the sample data shared)
flag = reshaped.loc(axis=1)['text'].sum(1)
reshaped.loc(axis=1)['id'].assign(flag = flag.map({1:'yes', 0:'no'}))
id1 id2 flag
0 12 34 yes
1 99 42 no