I have a dataframe with columns as first ,col1,col2,col3,row1 import pandas as pd
df = pd.DataFrame({'first': ['1', '2', '3', '4', '5', '6', '7', '8'], 'col1': ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'no', 'no'], 'col2': ['yes', 'yes', 'no', 'no' ,'yes', 'yes' ,'no', 'no'], 'col3': ['yes', 'no', 'yes', 'no','yes', 'no','yes', 'no'], 'row1':['4','5','7','3','7','3','79','8']})
first col1 col2 col3 row1 1 yes yes yes 4 2 yes yes no 5 3 yes no yes 7 4 yes no no 3 5 no yes yes 7 6 no yes no 3 7 no no yes 79 8 no no no 8
i want to apply conditions as below and create a new column as new_one
df.loc[((df['col1']=='no') & (df['col2']=='no') & (df['col3'] == 'no')),'new_one'] = 'no'
df.loc[~((df['col1']=='no') & (df['col2']=='no') & (df['col3'] == 'no')),'new_one'] = 'yes`
Problem: number of columns in the dataframe is dynamic and it is not constant as 3 in the above data frame but it matches with a pattern , here it is 'col*'
i want to select columns based up on the pattern(here col*) and apply the condition ('== 'no'') to each column and '&' them, and create a new column as new_one
i implemented the below code but not working
col_list=[]
for name in df.columns:
if(re.search("col",name))
col_list.append(name)
count=0
cond=''
for col in col_list:
if(count==0):
cond ="(df[\'" col "\'] == 'no')"
else:
cond ="&(df[\'" col "\'] == 'no')"
count=count 1
df.loc[cond,'new_one'] = 'no'
df.loc[~cond,'new_one'] = 'no'
problem: cond is a str , i dont know how to use that as condition as df.loc expects cond to be series.i tried eval as well but no luck.please let me know any solution to this?
first col1 col2 col3 row1 new_one
1 yes yes yes 4 yes
2 yes yes no 5 yes
3 yes no yes 7 yes
4 yes no no 3 yes
5 no yes yes 7 yes
6 no yes no 3 yes
7 no no yes 79 yes
8 no no no 8 no
CodePudding user response:
Using lambda here you counting the number of time 'no' occurrence.
df['new'] = df[['col1','col2','col3']].T.apply(lambda r: 'no' if list(r).count('no')==3 else 'yes')
CodePudding user response:
df.loc[df[['col1', 'col2', 'col3']].eq('no').all(axis=1), 'new_one'] = 'no'
df.loc[df[['col1', 'col2', 'col3']].ne('no').any(axis=1), 'new_one'] = 'yes'
Or, for both yes/no in a single command, numpy.where
:
df['new_one'] = np.where(df[['col1', 'col2', 'col3']].eq('no').all(axis=1),
'no', 'yes')
Output:
first col1 col2 col3 row1 new_one
0 1 yes yes yes 4 yes
1 2 yes yes no 5 yes
2 3 yes no yes 7 yes
3 4 yes no no 3 yes
4 5 no yes yes 7 yes
5 6 no yes no 3 yes
6 7 no no yes 79 yes
7 8 no no no 8 no
dynamic number of columns:
mask = df.filter(like='col').eq('no').all(axis=1)
df['new_one'] = np.where(mask, 'no', 'yes')