How to apply a condition on the entire dataframe/ all columns

Time:06-01

I have a dataframe with 14 rows × 1500 columns that contains only numerical values. I want to apply a simple condition: if any value in the entire dataframe is above a certain number, say 25, replace it with 1, otherwise replace it with 0. I have found solutions that perform this operation, but they require specifying a column name; I couldn't find one that applies a single condition to the entire dataframe.

df[0.0] = df[0.0].apply(lambda x: 1 if x >=25 else 0)

This works for a specific column but

df = df[:,:].apply(lambda x: 1 if x >=25 else 0)

doesn't work. Could someone help?

CodePudding user response:

You can use pandas.DataFrame.applymap to apply a function to each element, as follows:

import pandas as pd
df = pd.DataFrame({'col1':[0,10,100],'col2':[0,50,500]})
df2 = df.applymap(lambda x: 1 if x >=25 else 0)
print(df2)

output

   col1  col2
0     0     0
1     0     1
2     1     1

In this particular case you can also get df2 another way:

df2 = (df >= 25).astype(int)

This first creates a pandas.DataFrame of booleans, then converts it to ints (False to 0, True to 1).
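
For a frame shaped like the one in the question (14 rows × 1500 columns), the whole-frame comparison can be checked against an element-wise version. This is only a minimal sketch, assuming random integer data and the threshold of 25 from the question; the names rng, wide, vectorised and elementwise are illustrative and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Illustrative data: 14 rows x 1500 columns of integers in [0, 50)
rng = np.random.default_rng(0)
wide = pd.DataFrame(rng.integers(0, 50, size=(14, 1500)))

# Whole-frame comparison: build a boolean frame, then cast it to ints
vectorised = (wide >= 25).astype(int)

# Element-wise reference: map a Python function over every column's values
elementwise = wide.apply(lambda col: col.map(lambda x: 1 if x >= 25 else 0))

# Both approaches should agree on every cell
print((vectorised.to_numpy() == elementwise.to_numpy()).all())
```

The comparison form avoids calling a Python function once per cell, which is likely to matter more as the frame grows.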

CodePudding user response:

You can try

import numpy as np
df = np.where(df >= 25, 1, 0)  # note: the result is a NumPy array, not a DataFrame

CodePudding user response:

This should let you find where x > 25 across the whole frame and build a new DataFrame with the same column names:

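import numpy as np
import pandas as pd

data = {
    'Column1' : [1, 26, 3, 27],
    'Column2' : [25, 26, 1, 1]
}

df = pd.DataFrame(data)
# np.where returns a plain array, so the column names are passed back in
df_new = pd.DataFrame(np.where(df > 25, 1, 0), columns = df.columns)
print(df_new)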
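
The snippet above keeps the column names but not the row index, because np.where returns a plain NumPy array. If the frame has a non-default index, it may be worth passing that back as well. A small sketch, assuming a hypothetical string index "a" to "d" that is not in the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default (string) index
df = pd.DataFrame(
    {"Column1": [1, 26, 3, 27], "Column2": [25, 26, 1, 1]},
    index=["a", "b", "c", "d"],
)

# Pass both the index and the columns so the labels survive the round trip
df_new = pd.DataFrame(np.where(df > 25, 1, 0), index=df.index, columns=df.columns)
print(df_new)
```

Note that this answer uses a strict > 25, so a value of exactly 25 maps to 0, whereas the >= 25 used elsewhere on this page would map it to 1.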

CodePudding user response:

applymap() applies its function to every element of a DataFrame, and it is defined for DataFrames only. apply() is defined for both Series and DataFrame objects, but on a DataFrame it passes each whole column (or each row, with axis=1) to the function as a Series rather than one element at a time. That is why a lambda that compares a single value fails there.

map() applies a function element-wise to a Series. pandas 2.1 and later also provide DataFrame.map, which replaces the now-deprecated applymap().

So, to apply the condition to every element, use applymap() (or DataFrame.map in newer pandas) instead of apply().
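
The difference can be seen on a small frame. The following is a sketch rather than a definitive recipe: to_flag is a hypothetical helper name, and the hasattr check is just one possible way to pick DataFrame.map on newer pandas and fall back to applymap on older versions.

```python
import pandas as pd

df = pd.DataFrame({"col1": [0, 10, 100], "col2": [0, 50, 500]})

def to_flag(x):
    """Return 1 for a value of at least 25, else 0 (hypothetical helper)."""
    return 1 if x >= 25 else 0

# DataFrame.apply hands each column to to_flag as a Series, so the scalar
# comparison in the conditional raises ValueError (ambiguous truth value).
try:
    df.apply(to_flag)
    apply_failed = False
except ValueError:
    apply_failed = True

# Prefer DataFrame.map where it exists (pandas 2.1+), else use applymap
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
flags = elementwise(to_flag)

print(apply_failed)
print(flags)
```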
