How to use ILIKE in PySpark?

I have a query:

select *
from table
where (primary_product NOT IN ('No Technical Enforcement')
   or group_name ilike ('%stove%'))

I want to convert this query to PySpark SQL, but I can't because I don't know what the substitute for ILIKE is.

CodePudding user response：

ILIKE does exist in Spark SQL, starting with Spark 3.3. It is described in the Spark SQL built-in functions documentation.

Example:

SELECT ilike('Spark', '_Park');

Returns true.
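To make the matching rules concrete, here is a minimal pure-Python sketch of what ILIKE does with a plain pattern: % matches any run of characters, _ matches exactly one character, and letter case is ignored. The helper name ilike_match is hypothetical and is not part of Spark. The sketch also ignores escape characters, so treat it as an illustration of the semantics rather than of Spark's implementation.

```python
import re


def ilike_match(value: str, pattern: str) -> bool:
    """Emulate SQL ILIKE for plain patterns (no escape-character support)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches any run of characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    regex = "".join(parts)
    # fullmatch: the pattern must cover the whole value, as with LIKE/ILIKE
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None


print(ilike_match("Spark", "_Park"))
print(ilike_match("Gas Stove Top", "%stove%"))
print(ilike_match("Spark", "_ark"))
```

The first call mirrors the SQL example above: the leading _ consumes the "S", and "park" matches "Park" because case is ignored.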

CodePudding user response：

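On Spark versions older than 3.3, which have no ILIKE, you can use like instead. Note that like is case-sensitive, so lowercase the column first to get a case-insensitive match.

Import col and lower from the SQL functions module in PySpark:

from pyspark.sql.functions import col, lower

Filter with like on the lowercased column:

df.filter(lower(col("group_name")).like("%stove%")).show()

Or, using Spark SQL:

spark.sql("select * from table where lower(group_name) like '%stove%'").show()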

CodePudding user response：

ilike was added only in Spark 3.3 (released June 16, 2022), in both SQL and PySpark.

It is used the same way as like or rlike.

Example:

df = spark.createDataFrame([('Tom', 80), ('Alice', None)], ["name", "height"])
# +-----+------+
# | name|height|
# +-----+------+
# |  Tom|    80|
# |Alice|  null|
# +-----+------+

df = df.filter(df.name.ilike('%Ice'))
# +-----+------+
# | name|height|
# +-----+------+
# |Alice|  null|
# +-----+------+
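Since ilike is only available from Spark 3.3 onward, a query that also has to run on older clusters needs a fallback. The sketch below builds the WHERE clause from the question as a SQL string, using ILIKE on 3.3+ and lower(...) LIKE otherwise. The helper names parse_version, ci_like and build_query are hypothetical. The pattern is assumed to be a trusted literal containing no quote characters, and the table name "table" is the placeholder from the question.

```python
def parse_version(version: str) -> tuple:
    """Return (major, minor) as ints from a version string such as '3.3.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def ci_like(column: str, pattern: str, spark_version: str) -> str:
    """Build a case-insensitive LIKE predicate for the given Spark version."""
    if parse_version(spark_version) >= (3, 3):
        return f"{column} ILIKE '{pattern}'"
    # Fallback for older versions: lowercase both sides and use plain LIKE
    return f"lower({column}) LIKE '{pattern.lower()}'"


def build_query(spark_version: str) -> str:
    """Assemble the query from the question for the given Spark version."""
    predicate = ci_like("group_name", "%stove%", spark_version)
    return (
        "SELECT * FROM table "
        "WHERE primary_product NOT IN ('No Technical Enforcement') "
        f"OR {predicate}"
    )


print(build_query("3.3.0"))
print(build_query("3.2.1"))
```

With a live session, the version string can be taken from spark.version, so the call would look like spark.sql(build_query(spark.version)).show().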