I'm generating a report with Spark and need to add a column with a constant value to a Dataset that is built with Dataset.select()
and then written to a CSV file:
private static void buildReport(FileSystem fileSystem, Dataset<Row> joinedDs, String reportName) throws IOException {
    Path report = new Path(reportName);
    joinedDs.filter(aFlter)
        .select(
            joinedDs.col("AGREEMENT_ID"),
            //... here I need to insert a column with constant value
            joinedDs.col("ERROR_MESSAGE")
        )
        .write()
        .format("csv")
        .option("header", true)
        .option("sep", ",")
        .csv(reportName);
    fileSystem.copyToLocalFile(report, new Path(reportName + ".csv"));
}
I don't want to insert the column into the created CSV file manually; I'd like the column to be there when the file is written.
CodePudding user response:
You can add it with the lit function (org.apache.spark.sql.functions.lit) inside the select:
    .select(
        joinedDs.col("AGREEMENT_ID"),
        lit("YOUR_CONSTANT_VALUE").as("YOUR_COL_NAME"),
        joinedDs.col("ERROR_MESSAGE")
    )
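For completeness, here is a minimal self-contained sketch of the same idea, including the static import that lit needs. It assumes Spark SQL is on the classpath and runs against a local session; the class name, the helper withConstant, the sample rows, and the REPORT_TYPE / DAILY values are made up for illustration and are not from the question.

```java
import static org.apache.spark.sql.functions.lit;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ConstantColumnExample {

    // Hypothetical helper: selects the two source columns and places a
    // constant-valued column between them.
    static Dataset<Row> withConstant(Dataset<Row> ds, String name, String value) {
        return ds.select(
                ds.col("AGREEMENT_ID"),
                lit(value).as(name),
                ds.col("ERROR_MESSAGE"));
    }

    public static void main(String[] args) {
        // Local session, only for trying the snippet out.
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("constant-column-example")
                .getOrCreate();

        // Made-up sample data with the two columns used in the question.
        StructType schema = new StructType()
                .add("AGREEMENT_ID", DataTypes.StringType)
                .add("ERROR_MESSAGE", DataTypes.StringType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("A-1", "missing signature"),
                RowFactory.create("A-2", "expired"));
        Dataset<Row> ds = spark.createDataFrame(rows, schema);

        // Every row gets the same value in the new column.
        withConstant(ds, "REPORT_TYPE", "DAILY").show();

        spark.stop();
    }
}
```

Because the constant is part of the select, it ends up in the column position where you list it, so the CSV writer emits it like any other column. Dataset.withColumn would also add the column, but it appends it at the end rather than between the two existing ones.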