pandas create new column based on boolean value

Time:12-29

I have a DataFrame of Boolean values like this:

black yellow orange
TRUE TRUE TRUE
FALSE TRUE FALSE
TRUE TRUE FALSE
FALSE FALSE TRUE

I want a separate column that lists, for each row, the names of the columns whose value is True. The new column would look like this:

summary
black, yellow, orange
yellow
black, yellow
orange

Any idea how to do this please? Thanks!

CodePudding user response:

You can use each row as a selection mask to filter the column names:

(
    df.astype("bool")
    .apply(lambda row: ", ".join(df.columns[row]), axis=1)
    .to_frame("summary")
)
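For reference, here is a self-contained sketch of the row-mask approach above. The sample frame is rebuilt from the question's table, and it assumes the values are real booleans rather than the strings "TRUE"/"FALSE" (if they were strings, `astype("bool")` would turn every non-empty string into True). The name `summary` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be real booleans).
df = pd.DataFrame({
    "black": [True, False, True, False],
    "yellow": [True, True, True, False],
    "orange": [True, False, False, True],
})

# Each row is a boolean mask over the column labels; join the selected labels.
summary = df.apply(
    lambda row: ", ".join(df.columns[row.to_numpy(dtype=bool)]),
    axis=1,
)

print(summary.tolist())
```

A row where every value is False simply ends up with an empty string.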

CodePudding user response:

Try this using pd.DataFrame.dot:

df_colors['summary'] = df_colors.dot(df_colors.columns + ', ').str.strip(', ')
df_colors

Output:

   black  yellow  orange                summary
0   True    True    True  black, yellow, orange
1  False    True   False                 yellow
2   True    True   False          black, yellow
3  False   False    True                 orange
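If the frame also holds non-boolean columns, or the line is run again after `summary` already exists, the dot product would pick those columns up too. A possible variant, as a sketch only, restricts the calculation to the boolean columns first with `select_dtypes`. The sample data is rebuilt from the question and `flags` is just an illustrative name.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be real booleans).
df_colors = pd.DataFrame({
    "black": [True, False, True, False],
    "yellow": [True, True, True, False],
    "orange": [True, False, False, True],
})

# Keep only the boolean columns so an existing "summary" column is ignored.
flags = df_colors.select_dtypes(include="bool")

# True * "black, " gives "black, " and False * "black, " gives "", so the
# dot product concatenates the names of the True columns in each row.
df_colors["summary"] = flags.dot(flags.columns + ", ").str.rstrip(", ")

print(df_colors)
```

Here `rstrip` is used instead of `strip` because only the trailing ", " needs removing.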