How can I write Spark SQL queries that return fields with case-insensitive matching?
Example: Sample_DF below
--------
| name |
--------
| Johnny|
| Robert|
| ROBERT|
| robert|
--------
By default, Spark SQL appears to be case-sensitive with respect to the values you query for:
spark.sql("select name from Sample_DF where name like '%Robert%'").show
------
|name |
------
|Robert|
------
How can I make the query above case-insensitive so that it returns the result below, assuming there is a large list of Roberts in various lower/uppercase variations?
--------
| name |
--------
| Robert|
| ROBERT|
| robert|
--------
As I understand it, Spark SQL does not support the MSSQL COLLATE clause.
CodePudding user response:
You can lowercase both sides of the comparison before matching:
spark.sql("select name from Sample_DF where lower(name) like '%' || lower('Robert') || '%'").show
There is also a built-in function for this; see: How to change case of whole column to lowercase?
CodePudding user response:
If you want to look at all the names in the name column, you could use the lower function, which converts all characters to lowercase (here F is pyspark.sql.functions):
from pyspark.sql import functions as F
Sample_DF.select(F.lower('name')).show()