Hello, I am trying to drop rows where a specific column contains a string that is not a year. For example, the last rows have year values that contain decimal points or '-'.
I tried converting the year column to a string and then dropping those rows using the code below, but it only removes the row with 2011-21; the ones with decimal points stay.
df.level_1=df.level_1.astype(str)
df.loc[
(~df.level_1.str.contains("."))
|~(df.level_1.str.contains("-")),
:]
Is there a way to fix this issue?
CodePudding user response:
You can filter out all rows where level_1 contains non-digit characters:
df[~df.level_1.str.contains(r'\D')]
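For context, the attempt in the question likely fails for two reasons: str.contains treats its pattern as a regular expression by default, so "." matches any character, and joining the two negated conditions with | keeps a row as soon as it lacks either character. A minimal sketch on made-up data (the sample values are assumptions, since the real frame is not shown):

```python
import pandas as pd

# Hypothetical sample data; the asker's actual frame is not shown.
df = pd.DataFrame({"level_1": ["2009", "2010", "2011-21", "2012.5", "2013.0"]})
df["level_1"] = df["level_1"].astype(str)

# Match "." and "-" literally (regex=False) and combine with &,
# so a row is kept only if it contains neither character.
has_dot = df["level_1"].str.contains(".", regex=False)
has_dash = df["level_1"].str.contains("-", regex=False)
fixed = df.loc[~has_dot & ~has_dash]

# Same result with the non-digit pattern: drop any row containing a non-digit.
digits_only = df[~df["level_1"].str.contains(r"\D")]
```

Both fixed and digits_only keep only the rows whose level_1 consists purely of digits.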
CodePudding user response:
You can use a regex that keeps only rows where the whole value is a four-digit year:
df['level_1']=df['level_1'].astype(str)
df = df[df['level_1'].str.contains(r'^\d{4}$',regex=True)]
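If the goal is to keep only plain four-digit years, a stricter variant is to require the entire value to match. The sketch below assumes pandas 1.1 or later (for Series.str.fullmatch) and uses made-up sample values:

```python
import pandas as pd

# Hypothetical sample data, including a short value to show the length check.
df = pd.DataFrame({"level_1": ["2009", "2010", "2011-21", "2012.5", "20"]})
df["level_1"] = df["level_1"].astype(str)

# fullmatch requires the entire string to be exactly four digits.
years_only = df[df["level_1"].str.fullmatch(r"\d{4}")]
```

Unlike a bare "no non-digits" check, this also rejects digit strings of the wrong length, such as "20".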