How to use when and otherwise statements on a Spark dataframe with boolean columns?

Time:01-09

I have a dataset with three columns: country (String), threshold_1 (bool), and threshold_2 (bool).

I am trying to create a new column with the logic below, but I am getting an error.

I am using a Palantir code workbook for this. Can anyone tell me what I am missing?

df = df.withColumn("Threshold_Filter", 
        when(df["country"]=="INDIA" & df["threshold_1"]==True | df["threshold_2 "]==True, "Ind_country"
     ).otherwise("Dif_country"))

CodePudding user response:

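You just need to put each comparison in parentheses. In Python, & and | bind more tightly than ==, so without parentheses the expression is not grouped the way it reads.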

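from pyspark.sql.functions import when

df = (
    df
    .withColumn(
        "Threshold_Filter",
        when(
            # & binds tighter than |, so this reads as
            # (country AND threshold_1) OR threshold_2
            (df["country"] == "INDIA") &
            (df["threshold_1"] == True) |
            (df["threshold_2"] == True),
            "Ind_country")
        .otherwise("Dif_country"))
)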
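
To see why the parentheses matter, here is a minimal sketch that only inspects how Python parses the two conditions. It uses the standard-library ast module, so no Spark session is needed; the plain names country, t1 and t2 are stand-ins for the dataframe columns and are assumptions made for illustration. It needs Python 3.9 or later for ast.unparse.

```python
import ast

# Plain names stand in for the dataframe columns (hypothetical, for illustration).
bare = 'country == "INDIA" & t1 == True | t2 == True'
wrapped = '(country == "INDIA") & (t1 == True) | (t2 == True)'

bare_tree = ast.parse(bare, mode="eval").body
wrapped_tree = ast.parse(wrapped, mode="eval").body

# Without parentheses, & and | are applied first, which leaves one chained
# comparison: country == ("INDIA" & t1) == (True | t2) == True
bare_operands = [ast.unparse(node) for node in [bare_tree.left, *bare_tree.comparators]]

# With parentheses, the top-level node is the | operation and its left side
# is the & operation, i.e. (A & B) | C.
wrapped_top = type(wrapped_tree.op).__name__
wrapped_left = type(wrapped_tree.left.op).__name__

print(bare_operands)
print(wrapped_top, wrapped_left)
```

Note that even with parentheses around each comparison, & is still evaluated before |. If the intended logic is country AND (threshold_1 OR threshold_2), wrap the two threshold checks in one more pair of parentheses.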