Home > Mobile >  Select rows from a table which contains at-least one alphabet in the column
Select rows from a table which contains at-least one alphabet in the column

Time:12-29

I have column called name under a table in Databricks.

I want to find a way to select only those rows from a table, which contains at-least one alphabet character in the name column.

Example values in the column:

12243
#123-(23)
$ank
ada124$%
()
!asd
122acs#
gmgd32

Expected: I need to pick only those values which contains at least one alphabet in it. Or in other words, I need a way to exclude all the rows which contains only numbers and special characters.

So the Expected output should be as below :

$ank
ada124$%
!asd
122acs#
gmgd32

because these contains at least one alphabet in them.

I am using pyspark-sql in data bricks.

CodePudding user response:

You can use rlike with regex:


import pyspark.sql.functions as F

df.filter(F.col("name").rlike(".*[a-zA-Z] .*")).show()

# -------- 
#|    name|
# -------- 
#|    $ank|
#|ada124$%|
#|    !asd|
#| 122acs#|
#|  gmgd32|
# -------- 

Spark SQL equivalent query:

SELECT * 
FROM   df
WHERE  name RLIKE '.*[a-zA-Z] .*'
  • Related