I've been looking for ways to do this natively for a little while now and can't find a solution.
I have a large dataframe where I would like to set the value in other_col to 'True' for all rows where one of a list of columns is empty.
This works for a single column page_title:
df.loc[df['page_title'].isna(), ['other_col']] = ''
But it does not work when I pass a list of columns:
df.loc[df[['page_title','brand','name']].isna(), ['other_col']] = ''
Any ideas on how I could do this without using NumPy or looping through all rows? Thanks
CodePudding user response:
Maybe this is what you are looking for:
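import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['1', '2', '3', np.nan],
    'B': ['10', np.nan, np.nan, '40'],
    'C': ['test', 'test', 'test', 'test']})
# isna() gives a boolean DataFrame; any(axis=1) reduces it to one True/False per row
df.loc[df[['A', 'B']].isna().any(axis=1), ['C']] = 'value'
print(df)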
Result:
     A    B      C
0    1   10   test
1    2  NaN  value
2    3  NaN  value
3  NaN   40  value
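The reason the list version in the question fails is most likely that isna() on several columns returns a boolean DataFrame (one column per input column), while .loc expects a one-dimensional row mask. Here is a minimal sketch of the same idea applied to the column names from the question. The sample values and the names cols, cell_mask and row_mask are made up for illustration, and None is used as the missing value so no NumPy import is needed:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'page_title': ['Home', None, 'About', 'Contact'],
    'brand': ['x', 'y', None, 'z'],
    'name': ['a', 'b', 'c', 'd'],
    'other_col': ['', '', '', ''],
})

cols = ['page_title', 'brand', 'name']

# isna() on a list of columns returns a boolean DataFrame (4 rows x 3 columns).
cell_mask = df[cols].isna()

# any(axis=1) collapses it to one boolean per row:
# True if at least one of the listed columns is missing in that row.
row_mask = cell_mask.any(axis=1)

# A one-dimensional mask is something .loc can use to select rows.
df.loc[row_mask, 'other_col'] = 'True'
print(df)
```

If you need rows where every listed column is empty instead, swapping any(axis=1) for all(axis=1) should do it.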
CodePudding user response:
This lets you choose which columns to check for np.nan and sets a True/False indicator in other_col:
import numpy as np
import pandas as pd

data = {
    'Column1': [1, 2, 3, np.nan],
    'Column2': [1, 2, 3, 4],
    'Column3': [1, 2, np.nan, 4]
}
df = pd.DataFrame(data)
# True where any of the three columns is NaN, otherwise False
df['other_col'] = np.where(df['Column1'].isna() | df['Column2'].isna() | df['Column3'].isna(), True, False)
print(df)
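Since the question asked for a solution without NumPy, the same indicator can probably be built with pandas alone, because isna().any(axis=1) already yields a True/False Series. This is a sketch on the same data as above. The name cols is illustrative, and float('nan') stands in for np.nan:

```python
import pandas as pd

nan = float('nan')  # stand-in for np.nan so the sketch needs no NumPy import

df = pd.DataFrame({
    'Column1': [1, 2, 3, nan],
    'Column2': [1, 2, 3, 4],
    'Column3': [1, 2, nan, 4],
})

# Columns to check for missing values; edit this list as needed.
cols = ['Column1', 'Column2', 'Column3']

# One boolean per row: True if any of the listed columns is NaN.
df['other_col'] = df[cols].isna().any(axis=1)
print(df)
```

Keeping the column names in a list means adding or removing a column to check is a one-line change, instead of another | term in the condition.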