I have been trying to remove all rows from a DataFrame where a given column does not hold a numeric value (int, float, etc.).
This is what I tried, but it's not working:
city_df = city_df[city_df["city_latitude"].isnumeric()]
Does anyone know how to make it work?
Sample Data:
CodePudding user response:
You can try with a little tweak in the code:
city_df = city_df[city_df['city_latitude'].map(lambda x: str(x)).str.isnumeric().fillna(True)].dropna(subset=['city_latitude'])
Based on this data:
city_latitude
0 36.35665
1 39.174503
2 NaN
3 AB
4 1C
5 35.961561
6 38.798958
The following result is obtained:
city_latitude
0 36.35665
1 39.174503
5 35.961561
6 38.798958
EDIT:
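Given your comments and errors, I suggest a different approach. Note that str.isnumeric() returns False for any string containing a decimal point or a minus sign, so it is unreliable for float data; converting with pd.to_numeric avoids that problem: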
city_df['city_latitude'] = pd.to_numeric(city_df['city_latitude'],errors='coerce')
city_df = city_df.dropna(subset=['city_latitude'])
Or in a single line:
city_df = city_df[~pd.to_numeric(city_df['city_latitude'],errors='coerce').isna()]
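To make the pd.to_numeric approach easier to try out, here is a minimal, self-contained sketch. It assumes the column is an object column mixing floats, a missing value, and non-numeric strings, modelled on the sample data shown above; the variable names numeric and cleaned are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample column modelled on the data above: floats, one missing value,
# and two non-numeric strings (assumed to be stored as Python objects).
city_df = pd.DataFrame(
    {"city_latitude": [36.35665, 39.174503, np.nan, "AB", "1C", 35.961561, 38.798958]}
)

# errors="coerce" turns every value that cannot be parsed as a number into NaN.
numeric = pd.to_numeric(city_df["city_latitude"], errors="coerce")

# Keep only the rows whose value parsed successfully; the original index is preserved.
cleaned = city_df[numeric.notna()]
print(cleaned.index.tolist())
```

Filtering with a boolean mask leaves the column values untouched, so the column keeps its original dtype. If a float column is wanted afterwards, assigning the converted series back (as in the two-line version above) is probably the simpler choice.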
CodePudding user response:
Try this code:
city_df=city_df[city_df['city_latitude'].apply(lambda x: type(x) in [int, np.int64, float, np.float64])]
The following complete example shows how it works:
import pandas as pd
import numpy as np
data = {
"city_latitude": [420, 380, 390,'a',5]
}
df = pd.DataFrame(data)
df = df[df['city_latitude'].apply(lambda x: type(x) in [int, np.int64, float, np.float64])]
print(df)
output:
city_latitude
0 420
1 380
2 390
4 5
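One caveat with a pure type check: NaN is itself a float, so missing values pass the test, while numeric strings such as "39.1" are rejected. The sketch below is one possible variation, not part of the answer above; the helper name is_real_number and the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: ints, a float, a missing value, and two strings.
df = pd.DataFrame({"city_latitude": [420, 38.5, np.nan, "a", "39.1", 5]})

def is_real_number(value):
    """Return True for int/float values (including NumPy scalars) that are not NaN."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    # isinstance also accepts subclasses, unlike an exact type() comparison.
    if not isinstance(value, (int, float, np.integer, np.floating)):
        return False
    # NaN is a float, so it has to be filtered out separately.
    return not pd.isna(value)

filtered = df[df["city_latitude"].apply(is_real_number)]
print(filtered.index.tolist())
```

This keeps only values that are already stored as numbers. If strings like "39.1" should count as numeric too, pd.to_numeric with errors='coerce' is likely the better fit.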