Pandas create boolean column based on equality of other columns

Time:04-09

Is there a simple way to dynamically (!) create a boolean column in a DataFrame by checking whether the values in the other columns are equal?

My DF:

import pandas as pd
df = pd.DataFrame({"column_1":[1,2,3,4,5], "column_2":[1,3,2,4,5]})

How it should look:

|column_1|column_2|columns_equal|
|:-------|--------|------------:|
|     1  |     1  |    True     |
|     2  |     3  |    False    |
|     3  |     2  |    False    |
|     4  |     4  |    True     |
|     5  |     5  |    True     |

Thank you in advance :)

CodePudding user response:

For the simple case of two columns you can do:

df["column_equal"] = df["column_1"] == df["column_2"]

If you have more columns instead, this works better:

df["column_equal"] = df.eq(df["column_1"], axis=0).all(axis=1)

df.eq(df["column_1"], axis=0) gives you a new DataFrame of booleans in which each element indicates whether it equals the value of column_1 in the same row. Then .all(axis=1) checks whether all elements in each row are True.
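As a minimal sketch of the multi-column case, the example below adds a hypothetical column_3 (not part of the question) and restricts the comparison to an explicit list of columns. The names cols and matches are illustrative assumptions.

```python
import pandas as pd

# Hypothetical data: column_3 is an assumption added for illustration.
df = pd.DataFrame({
    "column_1": [1, 2, 3, 4, 5],
    "column_2": [1, 3, 2, 4, 5],
    "column_3": [1, 2, 3, 0, 5],
})

# Columns to compare; chosen explicitly so the result column is never included.
cols = ["column_1", "column_2", "column_3"]

# Compare every selected column with the first one, row by row.
# axis=0 aligns the Series on the row index instead of the column labels.
matches = df[cols].eq(df[cols[0]], axis=0)

# A row counts as equal only if every comparison in it is True.
df["columns_equal"] = matches.all(axis=1)

print(df)
```

Selecting the columns first also keeps the line safe to run twice: without the selection, a second run would compare the already existing boolean result column with column_1 as well.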

CodePudding user response:

There are two very straightforward solutions:

df['columns_equal'] = df['column_1'].eq(df['column_2'])

or

df['columns_equal'] = df['column_1'] == df['column_2']

Edit

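A loop over all columns could look like this:

# Compare column_1 with every column (including itself);
# the results are stored in new columns named 0, 1, ...
for i, item in enumerate(df.columns):
    df[i] = df['column_1'].eq(df[item])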

CodePudding user response:

You can count the unique values per row using DataFrame.nunique with axis=1, and then check whether there is only one using Series.eq:

df["column_equal"] = df.nunique(axis=1).eq(1)
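One caveat worth sketching: nunique ignores missing values by default (dropna=True), so a row containing one value and a NaN still counts as a single unique value. The data below is a hypothetical example, and the names default and strict are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in the second row.
df = pd.DataFrame({
    "column_1": [1.0, 2.0],
    "column_2": [1.0, np.nan],
})
cols = ["column_1", "column_2"]

# Default behaviour: NaN is dropped before counting, so the row (2.0, NaN)
# has exactly one unique value and is reported as equal.
default = df[cols].nunique(axis=1).eq(1)

# With dropna=False, NaN is counted as its own value, so the same row
# has two unique values and is reported as not equal.
strict = df[cols].nunique(axis=1, dropna=False).eq(1)

print(pd.DataFrame({"default": default, "strict": strict}))
```

If the columns can contain missing values, it may be worth deciding up front which of the two behaviours is wanted.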