I have column called name
under a table in Databricks.
I want to find a way to select only those rows from a table, which contains at-least one alphabet character in the name
column.
Example values in the column:
12243
#123-(23)
$ank
ada124$%
()
!asd
122acs#
gmgd32
Expected: I need to pick only those values which contains at least one alphabet in it. Or in other words, I need a way to exclude all the rows which contains only numbers and special characters.
So the Expected output should be as below :
$ank
ada124$%
!asd
122acs#
gmgd32
because these contains at least one alphabet in them.
I am using pyspark-sql in data bricks.
CodePudding user response:
You can use rlike
with regex:
import pyspark.sql.functions as F
df.filter(F.col("name").rlike(".*[a-zA-Z] .*")).show()
# --------
#| name|
# --------
#| $ank|
#|ada124$%|
#| !asd|
#| 122acs#|
#| gmgd32|
# --------
Spark SQL equivalent query:
SELECT *
FROM df
WHERE name RLIKE '.*[a-zA-Z] .*'