Spark DataFrame contains specific integer value in column

Time:10-08

I have a simple Spark DataFrame with column ID with integer values 1, 2, etc.:

+---+-------+
| ID| Tags  |
+---+-------+
|  1| apple |
|  2| kiwi  |
|  3| pear  |
+---+-------+

I want to check whether a value like 2 appears in the ID column in any row. The filter method seems to be useful only for string columns. Any ideas?

UPDATE:

I tried: df.filter(df.ID).contains(2)

In the end I need a boolean True or False output.

CodePudding user response:

No, filter is not limited to string columns; it can filter other data types as well.

dataDictionary = [
    (1, "APPLE"),
    (2, "KIWI"),
    (3, "PEAR"),
]

df = spark.createDataFrame(data=dataDictionary, schema=["ID", "Tags"])
df.printSchema()
df.show(truncate=False)

# isEmpty() returns True when NO row matches, so negate it
# to get True when the value is present.
not df.filter("ID == 2").rdd.isEmpty()  # Returns a Python bool.

