Spark Java: how to select a newly added column using withColumn

Time:02-11

I am writing a Java Spark program and trying to add a new column using

qdf.withColumn("newColumn", functions.lit("newColumn_val"))

but when I try to select it with

qdf.withColumn("newColumn", functions.lit("newColumn_val")).select(qdf.col("xyz"),qdf.col("newColumn")).show();

it says "Cannot resolve column name newColumn". Can someone please help me do this in Java?

CodePudding user response:

qdf is the DataFrame as it was before you added newColumn, which is why you cannot select the column with qdf.col("newColumn"). withColumn does not modify qdf; it returns a new DataFrame.

To get a handle on the new column, you can use functions.col("newColumn"), e.g.

qdf.withColumn("newColumn", functions.lit("newColumn_val"))
    .select(functions.col("xyz"),functions.col("newColumn"))
    .show();

Alternatively, you can store the DataFrame returned by withColumn and select the column through that reference, e.g.

final var qdf2 = qdf.withColumn("newColumn", functions.lit("newColumn_val"));

qdf2.select(qdf2.col("xyz"), qdf2.col("newColumn")).show();

Or you can use raw strings as in Srinivas's answer.
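
To make the point about the stale reference concrete, here is a toy model in plain Java. ToyFrame and its methods are hypothetical stand-ins written for this illustration and are not Spark's API; the sketch only assumes that withColumn returns a new object and that col resolves names against the object it is called on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an immutable DataFrame; this is NOT Spark's API.
final class ToyFrame {
    private final List<String> columns;

    ToyFrame(List<String> columns) {
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }

    // Mimics withColumn: returns a new frame and leaves this one untouched.
    ToyFrame withColumn(String name) {
        List<String> next = new ArrayList<>(columns);
        next.add(name);
        return new ToyFrame(next);
    }

    // Mimics col: resolves the name against this frame's own columns only.
    String col(String name) {
        if (!columns.contains(name)) {
            throw new IllegalArgumentException("Cannot resolve column name " + name);
        }
        return name;
    }

    boolean hasColumn(String name) {
        return columns.contains(name);
    }

    public static void main(String[] args) {
        ToyFrame qdf = new ToyFrame(Arrays.asList("xyz"));
        ToyFrame qdf2 = qdf.withColumn("newColumn");

        System.out.println(qdf.hasColumn("newColumn"));  // prints false
        System.out.println(qdf2.hasColumn("newColumn")); // prints true
    }
}
```

In this model, calling qdf.col("newColumn") throws, while qdf2.col("newColumn") succeeds, which mirrors the error in the question.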

CodePudding user response:

Try the code below.

qdf.withColumn("newColumn", functions.lit("newColumn_val"))
    .select("xyz", "newColumn")
    .show();

CodePudding user response:

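You don't actually need withColumn at all. (It adds a projection internally, so it carries a very small overhead compared with using select directly; this mostly matters when withColumn is called many times in a row.) Just reference the lit in the select statement and give it an alias.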

qdf.select(
    qdf.col("xyz"),
    functions.lit("newColumn_val").alias("newColumn")
).show();