Spark - Scala: "error: not found: value transform"


When I run my code I get the following error:

error: not found: value transform
.withColumn("min_date", array_min(transform('min_date,
                                  ^

I have been unable to resolve this. I already have the following import statements:

import sqlContext.implicits._
import org.apache.spark.sql.functions.split
import org.apache.spark.sql.functions._

I'm using Apache Zeppelin to execute this.

Here is the full code for reference and the sample of the dataset I'm using:

1004,bb5469c5|2021-09-19 01:25:30,4f0d-bb6f-43cf552b9bc6|2021-09-25 05:12:32,1954f0f|2021-09-19 01:27:45,4395766ae|2021-09-19 01:29:13,
1018,36ba7a7|2021-09-19 01:33:00,
1020,23fe40-4796-ad3d-6d5499b|2021-09-19 01:38:59,77a90a1c97b|2021-09-19 01:34:53,
1022,3623fe40|2021-09-19 01:33:00,
1028,6c77d26c-6fb86|2021-09-19 01:50:50,f0ac93b3df|2021-09-19 01:51:11,
1032,ac55-4be82f28d|2021-09-19 01:54:20,82229689e9da|2021-09-23 01:19:47,
val users = sc.textFile("path to file")
  .map(x => x.replaceAll("\\(", ""))
  .map(x => x.replaceAll("\\)", ""))
  .map(x => x.replaceFirst(",", "*"))
  .toDF("column")
val tempDF = users.withColumn("_tmp", split($"column", "\\*")).select(
  $"_tmp".getItem(0).as("col1"),
  $"_tmp".getItem(1).as("col2")
)

val output = tempDF.withColumn("min_date", split('col2 , ","))
    .withColumn("min_date", array_min(transform('min_date,
      c => to_timestamp(regexp_extract(c, "\\|(.*)$", 1)))))
  .show(10,false)

CodePudding user response:

The `transform(column: Column, f: Column => Column)` overload was only added to `org.apache.spark.sql.functions` in Spark 3.0, so if the compiler cannot find it you are either running an older Spark version or importing the wrong object (note that `Dataset.transform` is a different, unrelated method).

CodePudding user response:

You are probably using a Spark version earlier than 3.x, where this Scala DataFrame API `transform` does not exist. With Spark 3.x your code works fine.

I could not get it to work with 2.4 via the DataFrame API. I did not have time to dig further, but have a look here: Higher Order functions in Spark SQL
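On Spark 2.4 the higher-order function is still reachable through the SQL parser via `expr`, even though the `Column`-based overload is missing from `functions`. A sketch of a replacement line under that assumption, plus a plain-Scala check of the regex the expression relies on:

```scala
// Spark 2.4 fallback (sketch): call the SQL higher-order function through expr,
// since the transform(Column, Column => Column) overload only exists from 3.0:
//
//   tempDF
//     .withColumn("min_date", split('col2, ","))
//     .withColumn("min_date",
//       expr("array_min(transform(min_date, " +
//            "c -> to_timestamp(regexp_extract(c, '\\\\|(.*)$', 1))))"))

// The regexp_extract pattern pulls everything after the '|' separator.
// The same pattern in plain Scala, applied to one entry of the sample data:
val pattern = """\|(.*)$""".r
val entry   = "bb5469c5|2021-09-19 01:25:30"
val ts      = pattern.findFirstMatchIn(entry).map(_.group(1))
println(ts) // Some(2021-09-19 01:25:30)
```

Note the doubled backslashes: the Scala string produces `'\\|(.*)$'` in the SQL text, which the SQL parser unescapes to the regex `\|(.*)$`.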
