I am trying to remove all values in this pandas DataFrame that have a length less than 3, but only in some of the columns, not all of them:
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],'player': ['w', 'George', 'Roland'], 'hometown': ['Miami', 'Caracas', 'Mexico City'], 'current_city': ['New York', '-', 'New York']})
columns_to_add = ['player', 'hometown', 'current_city']
for column_name in columns_to_add:
    df.loc[(len(df[column_name]) < 3), column_name] = None
I tried the code above, but I get the following error:
KeyError("cannot use a single bool to index into setitem")
CodePudding user response:
You can use applymap (renamed to DataFrame.map in pandas 2.1) to calculate the length of each value, then np.where to update:
import numpy as np
df[columns_to_add] = np.where(df[columns_to_add].applymap(len) >= 3,
                              df[columns_to_add], None)
Output:
   id  player     hometown  current_city
0   1    None        Miami      New York
1   2  George      Caracas          None
2   3  Roland  Mexico City      New York
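As to why the loop in the question fails: `len(df[column_name])` is the number of rows in the column (a single integer), so comparing it with 3 gives one bool instead of a per-row mask, and `.loc` rejects that. A minimal sketch of the same loop using a per-row length via `.str.len()`, assuming every value in those columns is a string (the name `too_short` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'player': ['w', 'George', 'Roland'],
    'hometown': ['Miami', 'Caracas', 'Mexico City'],
    'current_city': ['New York', '-', 'New York'],
})
columns_to_add = ['player', 'hometown', 'current_city']

for column_name in columns_to_add:
    # .str.len() returns one length per row, so this is a boolean Series
    too_short = df[column_name].str.len() < 3
    # Blank out only the rows flagged by the mask, in this column only
    df.loc[too_short, column_name] = None
```

Whether the blanked cells show up as `None` or `NaN` may depend on the pandas version and column dtype; `isna()` treats both as missing.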
CodePudding user response:
Try this:
import numpy as np
df[df[columns_to_add].apply(lambda col: col.str.len() < 3)] = np.nan
Output:
>>> df
   id  player     hometown  current_city
0   1     NaN        Miami      New York
1   2  George      Caracas           NaN
2   3  Roland  Mexico City      New York
CodePudding user response:
You can use the DataFrame 'replace' function:
def find_string_less_lenth(list_of_values):
    return [i for i in list_of_values if len(i) < 3]

for column_name in columns_to_add:
    df[column_name] = \
        df[column_name].replace(find_string_less_lenth(df[column_name].values), 'none')
CodePudding user response:
I think the simplest solution might be:
new_df = df[columns_to_add]
# keep strings of length 3 or more; shorter ones show up as NaN
new_df[new_df.applymap(len) >= 3]
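Note that the expression above only displays a filtered copy and leaves `df` unchanged. One possible way to write the result back, sketched with `DataFrame.where` and assuming the selected columns hold only strings (`lengths` is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'player': ['w', 'George', 'Roland'],
    'hometown': ['Miami', 'Caracas', 'Mexico City'],
    'current_city': ['New York', '-', 'New York'],
})
columns_to_add = ['player', 'hometown', 'current_city']

# Per-element string lengths, computed for the selected columns only
lengths = df[columns_to_add].apply(lambda col: col.str.len())

# where() keeps values where the condition holds and puts NaN elsewhere
df[columns_to_add] = df[columns_to_add].where(lengths >= 3)
```

The `id` column is never touched, since only `columns_to_add` is selected and assigned.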