Is there a simple way to dynamically (!!!) create a boolean column in a DataFrame, based on the values of the other columns, by checking whether those values are equal?
My DF:
df = pd.DataFrame({"column_1":[1,2,3,4,5], "column_2":[1,3,2,4,5]})
How it should look:
|column_1|column_2|columns_equal|
|:-------|--------|------------:|
| 1 | 1 | True |
| 2 | 3 | False |
| 3 | 2 | False |
| 4 | 4 | True |
| 5 | 5 | True |
Thank you in advance :)
CodePudding user response:
For the simple case of two columns you can do:
df["column_equal"] = df["column_1"] == df["column_2"]
If you have more columns instead, this works better:
df["column_equal"] = df.eq(df["column_1"], axis=0).all(axis=1)
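df.eq(df["column_1"], axis=0) gives you a new DataFrame in which each column holds booleans indicating whether that element equals the one in column_1 on the same row. The axis=0 matters: without it, the Series is aligned against the column labels instead of the row index.
Then .all(axis=1) checks whether all elements in each row are True.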
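As a minimal sketch of the multi-column case, assuming a hypothetical third column, column_3, that is not in the question:

```python
import pandas as pd

# column_3 is a made-up extra column, added only to illustrate the multi-column case
df = pd.DataFrame({
    "column_1": [1, 2, 3, 4, 5],
    "column_2": [1, 3, 2, 4, 5],
    "column_3": [1, 2, 3, 4, 6],
})

# Compare every column to column_1 row by row (axis=0 aligns on the index)
matches = df.eq(df["column_1"], axis=0)

# A row is True only if every column matched column_1
df["column_equal"] = matches.all(axis=1)
print(df)
```

Here the last row comes out False because column_3 differs, even though column_1 and column_2 agree.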
CodePudding user response:
There are two very straightforward solutions:
df['columns_equal'] = df['column_1'].eq(df['column_2'])
or
df['columns_equal'] = df['column_1'] == df['column_2']
Edit
A loop could look like this:
for i, item in enumerate(df.columns):
    # compare each column to column_1; the result is stored under the integer label i
    df[i] = df['column_1'].eq(df[item])
CodePudding user response:
You can find the number of unique values per row using DataFrame.nunique with axis=1, and then check if there is only one using Series.eq:
df["column_equal"] = df.nunique(axis=1).eq(1)
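One thing to watch with this approach: nunique ignores NaN by default, so a row with one missing value can still count as "equal". The sketch below uses made-up data with a NaN (not from the question) to show the difference that dropna=False makes:

```python
import numpy as np
import pandas as pd

# Example data with a missing value; this is an illustration, not the asker's DataFrame
df = pd.DataFrame({"column_1": [1, 2, np.nan], "column_2": [1, 3, 3]})

# Default behaviour: NaN is dropped before counting unique values
by_default = df.nunique(axis=1).eq(1)

# dropna=False counts NaN as its own value, so the last row is no longer "equal"
strict = df.nunique(axis=1, dropna=False).eq(1)

print(by_default.tolist())
print(strict.tolist())
```

Which behaviour is right depends on how missing values should be treated in your data.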