I have a simple Spark DataFrame with a column ID containing integer values 1, 2, etc.:
+---+-------+
| ID| Tags  |
+---+-------+
|  1| apple |
|  2| kiwi  |
|  3| pear  |
+---+-------+
I want to check whether a value like 2 appears in the column ID in any row. The filter method seems to be useful only for string columns. Any ideas?
UPDATE:
I was trying with:
df.filter(df.ID).contains(2)
In the end I need a boolean True or False as output.
CodePudding user response:
No, filter can handle other data types as well, not just strings.
dataDictionary = [
    (1, "APPLE"),
    (2, "KIWI"),
    (3, "PEAR")
]
df = spark.createDataFrame(data=dataDictionary, schema=["ID", "Tags"])
df.printSchema()
df.show(truncate=False)
df.filter("ID == 2").rdd.isEmpty()  # Returns a boolean: True when no row matches, so negate it to test presence.