How to apply multiple conditions to a dataframe with a loop

I have the following dataframe:

import pandas as pd

dict1 = {'x_math_lp': {'John': '0',
                       'Lisa': 1,
                       'Karyn': '2'},
         'o_math_lp': {'John': 0.005,
                       'Lisa': 0.001,
                       'Karyn': 0.9}}
df = pd.DataFrame(dict1)

I would like to apply a condition such that if a value in the first column is less than 1 and the value in the second column is >= 0.05, the value in the first column is replaced with 'NaN'.

The result should look like this:

       x_math_lp    o_math_lp
John    NaN          0.005
Lisa    1            0.001
Karyn   NaN          0.900

Note: I want to use a loop because my actual dataframe has 30 columns, and I want to do this for every column pair in the dataframe, essentially updating the entire dataframe.

CodePudding user response:

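You can use .loc to select the rows that meet your condition and assign NaN to the desired column. Because some values in x_math_lp are strings, convert the column with pd.to_numeric first. The two conditions are combined with | (or) because that is what reproduces the expected output in the question; use & instead if both conditions must hold at once. The code uses 0.005 as the threshold, while the question text says 0.05; both give the same result on this sample data.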

Try this:

>>> import numpy as np
>>> import pandas as pd
>>> # Convert the mixed str/int column to numbers before comparing
>>> df.x_math_lp = pd.to_numeric(df.x_math_lp, errors='coerce')
>>> df.loc[(df['x_math_lp'] < 1) | (df['o_math_lp'] >= 0.005), 'x_math_lp'] = np.nan
>>> df
       x_math_lp  o_math_lp
John         NaN      0.005
Lisa         1.0      0.001
Karyn        NaN      0.900
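
To see how the choice between | and & changes which rows are selected, here is a small sketch that builds both masks on the sample data. The variable names x, o, either and both are only illustrative and are not part of the answer above.

```python
import pandas as pd

# Same sample data as in the question, with an explicit index order
df = pd.DataFrame(
    {'x_math_lp': ['0', 1, '2'], 'o_math_lp': [0.005, 0.001, 0.9]},
    index=['John', 'Lisa', 'Karyn'],
)

# Coerce the mixed str/int column to numbers before comparing
x = pd.to_numeric(df['x_math_lp'], errors='coerce')
o = df['o_math_lp']

either = (x < 1) | (o >= 0.005)  # True if at least one condition holds
both = (x < 1) & (o >= 0.005)    # True only if both conditions hold

print(either.tolist())  # [True, False, True]
print(both.tolist())    # [True, False, False]
```

With |, John and Karyn are selected, which matches the expected output. With &, only John would be replaced.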

To apply the same condition to every column pair, you can loop over the columns two at a time:

>>> df = pd.DataFrame({
...     'x_math_lp': {'John': 0, 'Lisa': 1, 'Karyn': 2},
...     'o_math_lp': {'John': 0.005, 'Lisa': 0.001, 'Karyn': 0.9},
...     'y_math_lp': {'John': 0, 'Lisa': 1, 'Karyn': 2},
...     'p_math_lp': {'John': 0.005, 'Lisa': 0.001, 'Karyn': 0.9},
... })
>>> columns = df.columns
>>> # Pair each even-positioned column with the column to its right
>>> for a, b in zip(columns[::2], columns[1::2]):
...     df.loc[(df[a] < 1) | (df[b] >= 0.005), a] = np.nan
...
>>> df
       x_math_lp  o_math_lp  y_math_lp  p_math_lp
John         NaN      0.005        NaN      0.005
Lisa         1.0      0.001        1.0      0.001
Karyn        NaN      0.900        NaN      0.900
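
For a wider dataframe where some columns hold strings, the loop can be wrapped in a helper that coerces everything to numbers first. This is only a sketch: the function name mask_pairs and the parameters x_max and o_min are made up for illustration, and it assumes the columns are laid out as adjacent pairs (a value column followed by its companion column), as in the example above.

```python
import pandas as pd


def mask_pairs(df, x_max=1, o_min=0.005):
    """Return a copy of df where, for each (left, right) column pair,
    the left value becomes NaN if left < x_max OR right >= o_min."""
    # Coerce every column to float so strings like '0' compare correctly
    # and NaN can be stored without a later dtype change.
    out = df.apply(pd.to_numeric, errors='coerce').astype(float)
    cols = out.columns
    if len(cols) % 2:
        raise ValueError("expected an even number of columns")
    for a, b in zip(cols[::2], cols[1::2]):
        cond = (out[a] < x_max) | (out[b] >= o_min)
        out[a] = out[a].mask(cond)  # mask() puts NaN where cond is True
    return out


df = pd.DataFrame(
    {
        'x_math_lp': ['0', 1, '2'],
        'o_math_lp': [0.005, 0.001, 0.9],
        'y_math_lp': [0, 1, 2],
        'p_math_lp': [0.005, 0.001, 0.9],
    },
    index=['John', 'Lisa', 'Karyn'],
)
result = mask_pairs(df)
print(result)
```

The helper returns a new dataframe and leaves the input untouched, so the thresholds can be changed and the function rerun on the same data.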