How to explode a string column based on specific delimiter in Spark Scala

Time:04-22

I want to explode a string column based on a specific delimiter (| in my case). I have a dataset like this:

+-----+--------------+
|Col_1|Col_2         |
+-----+--------------+
|  1  |  aa|bb       |
|  2  |  cc          |
|  3  |  dd|ee       |
|  4  |  ff          |
+-----+--------------+

I want an output like this:

+-----+--------------+
|Col_1|Col_2         |
+-----+--------------+
|  1  |  aa          |
|  1  |  bb          |
|  2  |  cc          |
|  3  |  dd          |
|  3  |  ee          |
|  4  |  ff          |
+-----+--------------+

CodePudding user response:

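Use the explode and split functions. Because split treats its pattern as a regular expression, escape the pipe as \\| (an unescaped | means alternation in a regex).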

import org.apache.spark.sql.functions.{col, explode, split}
val df1 = df.select(col("Col_1"), explode(split(col("Col_2"), "\\|")).as("Col_2"))
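
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same select. The local SparkSession settings and the app name are assumptions for illustration only; in spark-shell a session named spark already exists and getOrCreate simply returns it. The snippet is meant to be pasted into spark-shell or run as a Scala script, not compiled as a standalone class.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, split}

// Assumed local session, for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explode-example") // hypothetical app name
  .getOrCreate()
import spark.implicits._

// Sample data mirroring the question.
val df = Seq((1, "aa|bb"), (2, "cc"), (3, "dd|ee"), (4, "ff"))
  .toDF("Col_1", "Col_2")

// split takes a regex, so the pipe is escaped; it returns an array column.
// explode then emits one output row per array element.
val df1 = df.select(
  col("Col_1"),
  explode(split(col("Col_2"), "\\|")).as("Col_2")
)

df1.show()
```

Rows without a delimiter (cc and ff here) pass through unchanged, since split returns a one-element array for them.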