Home > Mobile >  How do i transfer a string that starts with the specified character into another column in sparksql
How do i transfer a string that starts with the specified character into another column in sparksql

Time:12-31

Given the dataframe below:

Row|Address                                                                 |Contact      |
 --- ------------------------------------------------------------------------ ------------- 
|1  |J. Borja cor. Guillermo St., Cagayan de Oro City 08822 722-922 / 726-667|null         |
|2  |Cruz Taal cor. Apolinar Velez St.,Cagayan de Oro City 08822 725-301     |null         |
|3  |R.N. Abejuela St., Cagayan de Oro City                                  |08822 727-864|

How do i convert it into this:

Row|Address                                                      |Contact                  |
 --- ------------------------------------------------------------------------ ------------- 
|1  |J. Borja cor. Guillermo St., Cagayan de Oro City            | 08822 722-922 / 726-667 |
|2  |Cruz Taal cor. Apolinar Velez St.,Cagayan de Oro City       |           08822 725-301 |
|3  |R.N. Abejuela St., Cagayan de Oro City                      |           08822 727-864 |

CodePudding user response:

First use regular expressions to extract Contact_tmp from Address, then erase it from Address, and finally use coalescefunction to merge Contact and Contact_tmp.

df = df.withColumn('Contact_tmp', F.regexp_extract('Address', '\\d \\s \\d -\\d \\s*/*\\s*\\d*-*\\d*', 0)).select(
    'Row',
    F.expr('replace(Address, Contact_tmp, "")').alias('Address'),
    F.coalesce('Contact', 'Contact_tmp').alias('Contact')
)
df.show(truncate=False)
  • Related