Let's say I have this code (which is obviously wrong):
import numpy as np

def conditions(df):
    value = df["weight"]
    unit = df["weight_unit"]
    if(unit.lower() == "pound"):
        return value / 2.2
    elif(unit.lower() == "metric ton"):
        return value * 1000
    elif(unit.lower() == "long ton"):
        return value * 1016
    elif(unit.lower() in ("measurement ton", "short ton")):
        return value * 907

def convert_to_kilo(df):
    func = np.vectorize(conditions)
    to_kilo = func(df)
    df["weight"] = to_kilo
    df["weight_unit"] = "Kilograms"
I want to apply a condition like this to each value in one column (weight) based on another column (weight_unit). Is there an efficient way to do it? Preferably one that lets me pass in a function, so it is easy to modify.
CodePudding user response:
Don't use a function; this will be slow. numpy.vectorize does not vectorize at C speed, but rather "pseudo-vectorizes" using an internal Python-level loop. Use map instead:
units = {'pound': 1/2.2, 'metric ton': 1000, 'long ton': 1016,
         'measurement ton': 907, 'short ton': 907,
         }
df['weight'] *= df['weight_unit'].str.lower().map(units)
df['weight_unit'] = 'Kilograms'
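If you still want something you can pass around and modify easily, one option is to wrap the same map lookup in a function that takes the conversion table as an argument. This is only a sketch: the sample DataFrame and the name UNITS_TO_KG are made up for illustration, and leaving rows with an unrecognized unit untouched is my assumption, not something the question specifies. With a plain map, those rows would become NaN.

```python
import pandas as pd

# Conversion factors to kilograms, same values as in the dict above.
# The name UNITS_TO_KG is hypothetical.
UNITS_TO_KG = {
    "pound": 1 / 2.2,
    "metric ton": 1000,
    "long ton": 1016,
    "measurement ton": 907,
    "short ton": 907,
}


def convert_to_kilo(df, factors=UNITS_TO_KG):
    """Return a copy of df with 'weight' converted to kilograms.

    Assumption: rows whose unit is not in `factors` are left unchanged.
    """
    out = df.copy()
    # One factor per row; units missing from the table map to NaN.
    factor = out["weight_unit"].str.lower().map(factors)
    known = factor.notna()
    # Convert only the rows whose unit was found in the table.
    out.loc[known, "weight"] = out.loc[known, "weight"] * factor[known]
    out.loc[known, "weight_unit"] = "Kilograms"
    return out


# Hypothetical sample data; "stone" is deliberately not in the table.
df = pd.DataFrame({
    "weight": [22.0, 2.0, 1.0, 5.0],
    "weight_unit": ["Pound", "metric ton", "Short Ton", "stone"],
})
result = convert_to_kilo(df)
```

To change the behavior, pass a different dict as factors instead of editing a chain of if/elif branches. The lookup is still a single vectorized map, so it avoids the per-row Python calls of numpy.vectorize.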