I am trying to replace the null values with N/A. I have tried the following code, but neither attempt works:
df.withColumn("series_name", when($"series_name") === null,"n/a")
.otherwise($series_name)
and
df.withColumn("series_name", when(col("series_name") === null,"n/a")
What am I missing?
+--------------------+
|         series_name|
+--------------------+
|Families of the M...|
|                null|
|      Ridiculousness|
|                null|
|                null|
+--------------------+
CodePudding user response:
You could also use the .fillna() method:
df.fillna('N/A', subset=['series_name'])
CodePudding user response:
I prefer to use coalesce:
from pyspark.sql import functions as f
df.withColumn('series_name', f.expr("coalesce(series_name, 'n/a')"))
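As for why the `=== null` attempts in the question have no effect: Spark follows SQL null semantics, where an equality comparison against null evaluates to null rather than true, so the `when` branch is never taken. Here is a minimal sketch of that rule. It uses Python's built-in sqlite3 as a stand-in for Spark, and the table name `shows` and the sample rows are made up for illustration:

```python
import sqlite3

# In-memory SQLite table standing in for the DataFrame (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (series_name TEXT)")
conn.executemany(
    "INSERT INTO shows VALUES (?)",
    [("Ridiculousness",), (None,), (None,)],
)


def fetch(expr):
    """Evaluate a SQL expression for every row, in insertion order."""
    query = f"SELECT {expr} FROM shows ORDER BY rowid"
    return [row[0] for row in conn.execute(query)]


# `= NULL` evaluates to NULL (never true), so the WHEN branch is never taken
# and the nulls pass through unchanged.
equals_null = fetch("CASE WHEN series_name = NULL THEN 'n/a' ELSE series_name END")

# IS NULL is the actual null test, so the replacement happens.
is_null = fetch("CASE WHEN series_name IS NULL THEN 'n/a' ELSE series_name END")

# coalesce returns its first non-null argument, which gives the same result.
coalesced = fetch("coalesce(series_name, 'n/a')")

print(equals_null)
print(is_null)
print(coalesced)
```

In Spark itself the corresponding null test is the column's `isNull` method. For example, in Scala: `when(col("series_name").isNull, "n/a").otherwise(col("series_name"))`.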