PySpark dataframe with list having null values

Time:12-21

I have seen PySpark DataFrames with a column holding lists of values like [2,,3,,,4]. The entries between the commas look like nulls, but they are not shown as 'null' in the list. Could someone explain how this kind of list is generated?

Thanks, J

CodePudding user response:

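They are empty strings, not nulls. Splitting a comma-separated string on ',' leaves an empty string between each pair of consecutive commas: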

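import pyspark.sql.functions as F
from pyspark.sql import SparkSession

# ...... (setup elided in the original answer)
# Assumption: reuse the active session or create a default one.
spark = SparkSession.builder.getOrCreate()

# A single string column whose value has consecutive commas.
data = [
    ('2,,3,,,4',)
]
df = spark.createDataFrame(data, ['col'])

# split() turns the string into array<string>; the gaps become '' rather than null.
df = df.withColumn('col', F.split('col', ','))
df.printSchema()
df.show(truncate=False)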
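For comparison, here is a minimal sketch of the same effect in plain Python, with no Spark required. For this particular input, Python's str.split(',') gives the same elements as the Spark split above. The variable names (raw, parts, cleaned, numbers) are made up for illustration, and the conversion of '' to None is just one possible cleanup, not something the original answer prescribes.

```python
# Splitting on ',' keeps an empty string for every gap between consecutive commas.
raw = "2,,3,,,4"
parts = raw.split(",")
print(parts)  # ['2', '', '3', '', '', '4']

# One possible cleanup: turn the empty strings into real missing values.
cleaned = [p if p != "" else None for p in parts]
print(cleaned)  # ['2', None, '3', None, None, '4']

# Or drop the gaps entirely and parse what is left as integers.
numbers = [int(p) for p in parts if p != ""]
print(numbers)  # [2, 3, 4]
```

When such a list is printed without quotes, as df.show() does, the empty strings are invisible, which is why the output reads [2, , 3, , , 4].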