How can I filter a PySpark DataFrame and keep the result as a DataFrame?
I used this:
datalabel = datalabel.filter(datalabel.subs_no.isNotNull()).collect()
but afterwards datalabel is a list, not a DataFrame.
CodePudding user response:
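filter() already returns a DataFrame; it is the .collect() call that turns the result into a Python list of Row objects. Drop .collect() to stay in DataFrame format.
If you only need certain columns, add select, which also returns a DataFrame: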
from pyspark.sql import functions as F  # F is the conventional alias for pyspark.sql.functions
datalabel_subs_no = datalabel.filter(datalabel.subs_no.isNotNull()).select(F.col('subs_no'))