I am writing a Spark program in Java and trying to add a new column using
qdf.withColumn("newColumn", functions.lit("newColumn_val"))
When I then try to select it with
qdf.withColumn("newColumn", functions.lit("newColumn_val")).select(qdf.col("xyz"),qdf.col("newColumn")).show();
it fails with "Cannot resolve column name newColumn". Can someone please help me do this in Java?
CodePudding user response:
qdf is the dataframe as it was before you added newColumn. withColumn returns a new
dataframe and leaves qdf unchanged, which is why you cannot select the new column
with qdf.col("newColumn").
To refer to the new column, use functions.col("newColumn"), which is not tied to qdf,
e.g.
qdf.withColumn("newColumn", functions.lit("newColumn_val"))
    .select(functions.col("xyz"), functions.col("newColumn"))
    .show();
Alternatively, store the dataframe returned by withColumn in a variable;
the new column is then accessible through that variable, e.g.
final var qdf2 = qdf.withColumn("newColumn", functions.lit("newColumn_val"));
qdf2.select(qdf2.col("xyz"), qdf2.col("newColumn")).show();
Or you can pass the column names as plain strings, as in Srinivas's answer.
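To see why qdf.col("newColumn") fails, here is a minimal plain-Java sketch of the same situation. Frame is a hypothetical stand-in class written for this illustration, not the Spark API; it only assumes what is described above, namely that withColumn returns a new object and that col looks a name up in the object it is called on.

```java
import java.util.ArrayList;
import java.util.List;

public class ImmutableFrameSketch {

    /** Hypothetical stand-in for a dataframe: an immutable list of column names. */
    public static final class Frame {
        private final List<String> columns;

        public Frame(List<String> columns) {
            this.columns = List.copyOf(columns);
        }

        /** Returns a NEW frame with the extra column; this frame is left unchanged. */
        public Frame withColumn(String name) {
            List<String> extended = new ArrayList<>(columns);
            extended.add(name);
            return new Frame(extended);
        }

        /** Looks the name up in THIS frame's columns only. */
        public String col(String name) {
            if (!columns.contains(name)) {
                throw new IllegalArgumentException("Cannot resolve column name " + name);
            }
            return name;
        }
    }

    public static void main(String[] args) {
        Frame qdf = new Frame(List.of("xyz"));
        Frame qdf2 = qdf.withColumn("newColumn");

        // The frame returned by withColumn knows about the new column.
        System.out.println(qdf2.col("newColumn"));

        // The original frame does not, so the lookup fails.
        try {
            qdf.col("newColumn");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second lookup fails for the same reason as in the question: the name is looked up in the original object, which never received the column.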
CodePudding user response:
Try the code below.
qdf.withColumn("newColumn", functions.lit("newColumn_val"))
    .select("xyz", "newColumn")
    .show();
CodePudding user response:
You don't actually need withColumn at all. (It also carries a very small performance cost compared with using select directly.)
Just put the lit directly in the select call and give it an alias.
qdf.select(
    qdf.col("xyz"),
    functions.lit("newColumn_val").alias("newColumn")
).show();