I am trying to format the values in one pandas column based on a condition on the value in another column.
I wrote this function to format the keys to a brand-specific standard:
brand_1 = ['coke', 'pepsi']
def washbox_brand_1(item):
    for item in range(len(df)):
        if df['LEVERANC'][item] in brand_1:
            df['ZOEKCODE'][item] = df['ZOEKCODE'][item].str.replace('-','')
df2 = df.apply(washbox_brand_1)
df2
The condition works fine, but setting the new value is a problem.
It gives this error:
AttributeError: 'str' object has no attribute 'str'
How can I replace the old values with the new format? Please keep it simple, because the code will be extended with many more format rules for each brand.
CodePudding user response:
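The error occurs because df['ZOEKCODE'][item] is a single cell value, which is a plain Python str, and the .str accessor only exists on a pandas Series. Call str.replace directly on the value, and check its type first in case the column also holds non-string values (NaN, for example):
# Only strings are cleaned; any non-string value is left unchanged.
if isinstance(df.loc[item, 'ZOEKCODE'], str):
    df.loc[item, 'ZOEKCODE'] = df.loc[item, 'ZOEKCODE'].replace('-', '')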
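To see where the AttributeError comes from, here is a minimal sketch comparing a whole column with a single cell. The sample data is made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names are from the question.
df = pd.DataFrame({
    "LEVERANC": ["coke", "fanta", "pepsi"],
    "ZOEKCODE": ["ab-12", "cd-34", "ef-56"],
})

column = df["ZOEKCODE"]     # a pandas Series: it has the .str accessor
cell = df["ZOEKCODE"][0]    # a plain Python str: it has no .str attribute

# On a Series, go through the .str accessor.
cleaned_column = column.str.replace("-", "", regex=False)

# On a single value, call the built-in str.replace directly.
cleaned_cell = cell.replace("-", "")
```

So `.str.replace` is the right call on a column, and plain `.replace` is the right call on one value from that column.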
CodePudding user response:
I don't think you need the for loop here either, as @rpanai said in the comments, since you are already using .apply, unless you have a specific reason for it. I think the function can be made even simpler.
Example:
def washbox_brand_1(row):
    brand_1 = ['coke', 'pepsi']
    if row["LEVERANC"] in brand_1:
        row["ZOEKCODE"] = row["ZOEKCODE"].replace('-', '')
    return row
df2 = df.apply(washbox_brand_1, axis=1)
You can also use .loc and .isin to get the same result. In this case you call the function directly, without .apply:
def washbox_brand_1(df):
    brand_1 = ['coke', 'pepsi']
    # Boolean mask of the rows whose brand is in the list
    mask = df["LEVERANC"].isin(brand_1)
    df.loc[mask, "ZOEKCODE"] = df.loc[mask, "ZOEKCODE"].str.replace('-', '')
    return df
df2 = washbox_brand_1(df)
This should produce the same result. If it does not, please provide a minimal reproducible example, as @rpanai mentioned in the comments.
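Since you mention that more format rules per brand will follow, one possible way to keep this simple is a small rule table combined with the .loc/.isin approach. This is only a sketch: FORMAT_RULES, strip_dashes, upper_case, apply_format_rules, the 'fanta' brand and the sample data are all made-up names for illustration, not something from your code.

```python
import pandas as pd

def strip_dashes(codes):
    # Remove every '-' from a Series of codes.
    return codes.str.replace("-", "", regex=False)

def upper_case(codes):
    # Hypothetical second rule, only to show how the table grows.
    return codes.str.upper()

# Hypothetical rule table: (list of brands, function applied to their codes).
FORMAT_RULES = [
    (["coke", "pepsi"], strip_dashes),
    (["fanta"], upper_case),
]

def apply_format_rules(df, rules=FORMAT_RULES):
    result = df.copy()  # leave the caller's DataFrame untouched
    for brands, rule in rules:
        mask = result["LEVERANC"].isin(brands)
        result.loc[mask, "ZOEKCODE"] = rule(result.loc[mask, "ZOEKCODE"])
    return result

# Made-up sample data using the column names from the question.
df = pd.DataFrame({
    "LEVERANC": ["coke", "fanta", "pepsi"],
    "ZOEKCODE": ["ab-12", "cd-34", "ef-56"],
})
df2 = apply_format_rules(df)
```

Adding a brand then means adding one entry to the table rather than another if branch. Each rule receives a Series, so it can use the .str accessor without looping over rows.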