Pyspark find where the column is equal to empty list: .where('column = []')

Time:07-18

.where('column = []')

This operation throws the following error:

java.lang.RuntimeException: Unsupported literal type class java.util.ArrayList []

How can I work around this error to find the empty lists in my column?

CodePudding user response:

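You can use size() for this. The filter below keeps only the rows whose list is non-empty, i.e. it filters the empty lists out of the DataFrame. To find the rows that hold an empty list instead, change the comparison to == 0.

import pyspark.sql.functions as sf
# size() returns the number of elements in the array column
df.filter(sf.size('column_with_lists') > 0)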

Hope this helps.

CodePudding user response:

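Try this: .where("column = '[]'"). Note that this compares the column against the string '[]', so it is only likely to help if the lists are stored as strings rather than as an array type.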
