Add a column with constant value into Spark Dataset in Java?

Time:01-13

I'm generating a report with Spark, and I need to add a column with a constant value to a Dataset that is created with Dataset.select() and then written to a CSV file:

private static void buildReport(FileSystem fileSystem, Dataset<Row> joinedDs, String reportName) throws IOException {
    Path report = new Path(reportName);
    joinedDs.filter(aFlter)
            .select(
                    joinedDs.col("AGREEMENT_ID"),
                    //... here I need to insert a column with constant value
                    joinedDs.col("ERROR_MESSAGE")
            )
            .write()
            .format("csv")
            .option("header", true)
            .option("sep", ",")
            .csv(reportName);

    fileSystem.copyToLocalFile(report, new Path(reportName + ".csv"));
}

I don't want to insert the column manually into the generated CSV file; I'd like the column to be there when the file is created.

CodePudding user response:

You can add it with the lit function (org.apache.spark.sql.functions.lit) inside the select:

            .select(
                    joinedDs.col("AGREEMENT_ID"),
                    lit("YOUR_CONSTANT_VALUE").as("YOUR_COL_NAME"),
                    joinedDs.col("ERROR_MESSAGE")
            )
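
For anyone who wants to try this outside the report code, here is a minimal, self-contained sketch. It assumes Spark SQL is on the classpath and runs in local mode. The class name ConstantColumnExample, the helper withReportType, the column name REPORT_TYPE, and the sample rows are all made up for illustration and are not from the question.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lit;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ConstantColumnExample {

    // Selects the two source columns and places a constant column between them.
    // "REPORT_TYPE" is a hypothetical column name chosen for this sketch.
    static Dataset<Row> withReportType(Dataset<Row> ds, String reportType) {
        return ds.select(
                col("AGREEMENT_ID"),
                lit(reportType).as("REPORT_TYPE"),
                col("ERROR_MESSAGE"));
    }

    public static void main(String[] args) {
        // Local single-threaded session, only for trying the snippet out.
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("constant-column-example")
                .getOrCreate();

        // Invented sample data with the two columns named in the question.
        StructType schema = new StructType()
                .add("AGREEMENT_ID", DataTypes.StringType)
                .add("ERROR_MESSAGE", DataTypes.StringType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("A-1", "missing signature"),
                RowFactory.create("A-2", "expired"));
        Dataset<Row> ds = spark.createDataFrame(rows, schema);

        // Every row carries the same REPORT_TYPE value.
        withReportType(ds, "DAILY").show();

        spark.stop();
    }
}
```

If the position of the column does not matter, ds.withColumn("REPORT_TYPE", lit("DAILY")) should work as well; it appends the new column after the existing ones, so select is the better fit when the constant has to sit between two specific columns.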