I have the following dict and list: brands_list = {"b1": {"name": "brand1"}, "b2": {"name": "brand2"}} and actual_brands = ["brand1"], and a Pandas dataframe with a column brand containing "b1", "b1", "b1", "b2", "b1". I want to set column is_brand_present to 1 if the element of brands_list keyed by the value in column brand is in actual_brands, and to 0 otherwise.
I tried the following using numpy's where:
brands_list = {"b1": "brand1", "b2": "brand2"}
actual_brands = ["brand1"]
data_frame["is_brand_present"] = np.where(
brands_list[data_frame["brand"]].isin(actual_brands), 1, 0
)
I expect the content of column is_brand_present to be 1, 1, 1, 0, 1, but I'm getting this error:
TypeError: unhashable type: 'Series'
How can I evaluate this condition for each row?
CodePudding user response:
IIUC, you are looking for this (map the 'brand' column through the dictionary and check whether the result is in actual_brands):
df['is_brand_present'] = df['brand'].map(brands_list).isin(actual_brands).astype(int)
print(df) gives:
  brand  is_brand_present
0    b1                 1
1    b1                 1
2    b1                 1
3    b2                 0
4    b1                 1
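For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the flat form of brands_list from the question's code snippet (codes mapped directly to names), and rebuilds the dataframe from the values listed in the question. The original attempt fails because indexing a dict hashes its key, and a Series is not hashable; Series.map performs the lookup one element at a time instead.

```python
import pandas as pd

# Flat mapping, as in the question's code snippet (assumption: not the nested form).
brands_list = {"b1": "brand1", "b2": "brand2"}
actual_brands = ["brand1"]

# Dataframe rebuilt from the values listed in the question.
df = pd.DataFrame({"brand": ["b1", "b1", "b1", "b2", "b1"]})

# Translate each code to its brand name, then test membership in actual_brands.
names = df["brand"].map(brands_list)
df["is_brand_present"] = names.isin(actual_brands).astype(int)

print(df["is_brand_present"].tolist())  # [1, 1, 1, 0, 1]
```

A code that is missing from brands_list maps to NaN, which isin treats as not present, so such rows would get 0.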
CodePudding user response:
We can just do this (using the nested brands_list from the start of the question):
l = [x for x, y in brands_list.items() if y['name'] in actual_brands]
df['is_brand_present'] = df.brand.isin(l).astype(int)
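As a runnable sketch of the same idea, assuming the nested brands_list given at the start of the question and a dataframe rebuilt from its listed values (the name present_keys is only illustrative):

```python
import pandas as pd

# Nested mapping, as given at the start of the question.
brands_list = {"b1": {"name": "brand1"}, "b2": {"name": "brand2"}}
actual_brands = ["brand1"]

df = pd.DataFrame({"brand": ["b1", "b1", "b1", "b2", "b1"]})

# Collect the codes whose brand name appears in actual_brands.
present_keys = [key for key, info in brands_list.items() if info["name"] in actual_brands]

# Flag rows whose code is one of those keys.
df["is_brand_present"] = df["brand"].isin(present_keys).astype(int)

print(df["is_brand_present"].tolist())  # [1, 1, 1, 0, 1]
```

This filters the dictionary once up front, so the dataframe step is a single isin call and no per-row dictionary lookup is needed.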