I am trying to add an if condition in F. when in a pyspark column my code:
df = df.withColumn("column_fruits",F.when(F.col('column_fruits') == "Berries"
if("fruit_color")== "red":
return "cherries"
elif("fruit_color") == "pink":
return "strawberries"
else:
return "balackberries").otherwise("column_fruits")
I want to first filter out berries and change fruit names according to color. And all the remaining fruits remain the same.
Can anyone tell me if this is a valid way of writing withColumn
code?
CodePudding user response:
This would work
df.withColumn("column_fruits", F.when((F.col('column_fruits') == "Berries") & (F.col('fruit_color') == "red"), "cherries")\
.when((F.col('column_fruits') == "Berries") & (F.col('fruit_color') == "pink"), "strawberries")\
.otherwise("blackberries"))\
.show()
Sample Input/Output: