I want to filter a DataFrame column so that it keeps only the numeric values (digit codes).
main_column |
---|
HKA1774348 |
null |
774970331205 |
160-27601033 |
SGSIN/62/898805 |
null |
LOCAL |
217-29062806 |
null |
176-07027893 |
724-22100374 |
297-00371663 |
217-11580074 |
I want to obtain this column:
main_column |
---|
774970331205 |
160-27601033 |
217-29062806 |
176-07027893 |
724-22100374 |
297-00371663 |
217-11580074 |
Answer:
You can use rlike with a regexp that matches only digits and hyphens:
df.where(df['main_column'].rlike(r'^[0-9-]+$')).show()
Output:
+------------+
| main_column|
+------------+
|774970331205|
|160-27601033|
|217-29062806|
|176-07027893|
|724-22100374|
|297-00371663|
|217-11580074|
+------------+
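To check which values a pattern like `^[0-9-]+$` keeps without starting a Spark session, here is a minimal sketch using Python's built-in `re` module instead of `rlike`. It assumes the nulls in the column are real nulls (represented here as `None`), and the helper name `keep_digit_codes` is made up for illustration. Spark's `rlike` uses Java regular expressions, which should behave the same way for a simple pattern like this one.

```python
import re

# One or more digits or hyphens, anchored at both ends of the string.
PATTERN = re.compile(r'^[0-9-]+$')

# Sample values copied from the question; None stands in for null.
values = [
    "HKA1774348", None, "774970331205", "160-27601033",
    "SGSIN/62/898805", None, "LOCAL", "217-29062806", None,
    "176-07027893", "724-22100374", "297-00371663", "217-11580074",
]


def keep_digit_codes(items):
    """Return only the non-null items made up entirely of digits and hyphens."""
    return [v for v in items if v is not None and PATTERN.search(v)]


filtered = keep_digit_codes(values)
print(filtered)
```

Note that this pattern also accepts a string made up of hyphens alone, such as "-". If that matters for your data, a stricter pattern such as `^[0-9]+(-[0-9]+)*$` might be a better fit.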