I have to delete rows from several pandas DataFrames according to certain conditions:
for index, row in df_A.iterrows():
    if name not in row["Name"].lower():
        df_A.drop(index, inplace=True)
for index, row in df_B.iterrows():
    if address != row["address"].split(":")[1]:
        df_B.drop(index, inplace=True)
for index, row in df_C.iterrows():
    name_given = name_dict[row["id"]]
    if name_given != name:
        df_C.drop(index, inplace=True)
The above code works fine, but is there a shorter way to do these operations in pandas without using iterrows?
CodePudding user response:
Use boolean indexing and assign the filtered result back:
df_A = df_A[df_A['Name'].str.lower().str.contains(name)]
df_B = df_B[df_B['address'].str.split(':').str[1].eq(address)]
df_C = df_C[df_C['id'].map(name_dict).eq(name)]
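To check that these masks keep the same rows as the loops in the question, here is a small sketch on made-up data. The sample frames and the values of name, address and name_dict are assumptions for illustration only. regex=False is added so that name is treated as a literal substring, which is closer to what `in` does in the original loop.

```python
import pandas as pd

# Hypothetical sample data; the question did not include any.
name = "alice"
address = "12 Main St"
name_dict = {1: "alice", 2: "bob", 3: "alice"}

df_A = pd.DataFrame({"Name": ["Alice Smith", "Bob Jones", "ALICE Brown"]})
df_B = pd.DataFrame({"address": ["home:12 Main St", "work:9 Side Rd"]})
df_C = pd.DataFrame({"id": [1, 2, 3]})

# Keep rows whose lowercased Name contains `name` as a literal substring.
df_A = df_A[df_A["Name"].str.lower().str.contains(name, regex=False)]

# Keep rows whose second ":"-separated field equals `address`.
df_B = df_B[df_B["address"].str.split(":").str[1].eq(address)]

# Keep rows whose id maps to `name` in name_dict.
df_C = df_C[df_C["id"].map(name_dict).eq(name)]
```

One difference from the loop version: map returns NaN for an id that is missing from name_dict, so such a row is simply filtered out, whereas name_dict[row["id"]] would raise a KeyError.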
CodePudding user response:
Since you did not share sample data, I could not test this on your dataset, but the following worked on my own.
These lines can replace the first of your three blocks without using iterrows:
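df_A = df_A.fillna('0')  # None/NaN values break the mask produced by Series.str.contains
# Drop the rows whose lowercased Name does NOT contain name, as in the original loop
df_A = df_A.drop(df_A[~df_A['Name'].str.lower().str.contains(name)].index)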
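If overwriting missing values with '0' is not desirable, one alternative is the na parameter of Series.str.contains, which sets the mask value used for missing entries. A minimal sketch, assuming a hypothetical df_A with some missing names:

```python
import numpy as np
import pandas as pd

name = "alice"  # hypothetical search term
df_A = pd.DataFrame({"Name": ["Alice Smith", np.nan, "Bob Jones", None]})

# na=False marks missing names as non-matching, so no fillna is needed.
mask = df_A["Name"].str.lower().str.contains(name, regex=False, na=False)
df_A = df_A[mask]
```

This leaves the other columns of df_A untouched, whereas fillna('0') on the whole frame replaces missing values everywhere.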