I want to apply a function to a column of my dataframe and remove the rows for which the function raises an error. I also want to avoid for loops, as they are slow for big dataframes. An example dataframe could look like this:
import pandas as pd

d = {'a': [1, "wrong_element"], 'b': [1, 2]}
df = pd.DataFrame(data=d, index=[1, 2])
print(df)
Output:
               a  b
1              1  1
2  wrong_element  2
try:
    df['a'] = df['a'].apply(lambda x: x - 2)
except Exception:
    pass
Desired output:
   a  b
1 -1  1
CodePudding user response:
You can return NaN when the operation fails, then drop the rows that contain NaN:
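import numpy as np
import pandas as pd

def operation(value):
    try:
        return value - 2
    except Exception:
        # Mark the failed element so its row can be dropped afterwards
        return np.nan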
df = pd.DataFrame({'a': [1, "wrong_element"], 'b': [0, 2]}, [1, 2])
df['a'] = df['a'].apply(operation)
df = df.dropna()
Output:
     a  b
1 -1.0  0
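One caveat with the NaN approach is that dropna() also removes rows that already had NaN in any column, and it cannot tell a failure apart from a function that legitimately returns NaN. The following is only a sketch of one way around that, using a private sentinel object instead of NaN. The helper name apply_and_drop_failures and its parameters are made up for illustration and are not part of pandas. It still calls the function once per element, so it is no faster than apply.

```python
import pandas as pd

def apply_and_drop_failures(df, column, func):
    """Hypothetical helper: apply func to df[column] and drop rows where func raises."""
    failed = object()  # unique sentinel, cannot collide with a real result such as NaN

    def safe(value):
        try:
            return func(value)
        except Exception:
            return failed

    result = df[column].apply(safe)
    # Boolean mask: True where the function succeeded
    ok = result.map(lambda v: v is not failed)
    out = df.loc[ok].copy()
    # infer_objects() restores a numeric dtype if all remaining values are numeric
    out[column] = result[ok].infer_objects()
    return out

df = pd.DataFrame({'a': [1, "wrong_element"], 'b': [0, 2]}, index=[1, 2])
clean = apply_and_drop_failures(df, 'a', lambda x: x - 2)
print(clean)
```

The original df is left untouched; only the returned copy has the failing rows removed.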
CodePudding user response:
If the invalid value is known in advance, the solution can also be expressed without apply:
df = pd.DataFrame({'a': [1, "wrong_element"], 'b': [0, 2]}, [1, 2])
df["a"] = df["a"].replace("wrong_element", np.nan).sub(2)
df.dropna()
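If the failures come from non-numeric entries in general rather than one known string, a vectorized variant may be possible with pd.to_numeric and errors='coerce', which turns values that cannot be parsed as numbers into NaN. This is a sketch that assumes the operation is plain arithmetic on a numeric column; it does not cover arbitrary functions that can fail for other reasons.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, "wrong_element"], 'b': [0, 2]}, index=[1, 2])

# Non-numeric entries become NaN instead of raising an error
df['a'] = pd.to_numeric(df['a'], errors='coerce') - 2

# Drop only the rows where column 'a' could not be converted
df = df.dropna(subset=['a'])
print(df)
```

Passing subset=['a'] keeps rows that happen to have NaN in other columns. Column 'a' ends up as a float column because NaN was present before the drop.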