How to assign new attributes to a Spark Dataset in Java

Time:01-12

Assume I have a Dataset<Person> named personList that contains a list of Person objects.

Person is defined as follows:

public class Person {
    String name;
    String gender;
}

I now have personList as a Dataset, but I need to backfill another attribute into Person, say age. So I can update my Person class to:

public class Person {
    String name;
    String gender;
    int age;
}

How do I loop through the Dataset and update the age value?

I tried this approach, but it didn't update anything:

    personList.foreach(person -> {
        person.setAge(12);
    });

I tried to give every Person in personList an age of 12, but when I read the Dataset back, the age value is still empty.

Why?

CodePudding user response:

You can add a column using .withColumn(colName, lit(colValue))

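// withColumn returns an untyped Dataset<Row>, so keep the result in a new variable.
// lit(12) passes an int literal, matching the int age field (lit("12") would add a string column).
Dataset<Row> personsWithAge = personList.withColumn("age", functions.lit(12));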
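
As for why the foreach had no effect: a Dataset is immutable, and foreach hands your lambda objects that were deserialized from the stored rows, so setAge only changes temporary copies that are then thrown away. Here is a rough plain-Java sketch of that behaviour, with no Spark involved. The class ForeachVsMap, its helper methods, and the sample data are all made up for illustration; only Person's fields come from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

public class ForeachVsMap {
    // Hypothetical stand-in for the Person class from the question.
    static class Person implements Serializable {
        String name;
        String gender;
        int age;

        Person(String name, String gender) {
            this.name = name;
            this.gender = gender;
        }
    }

    // Rows are kept in serialized form, loosely imitating how a Dataset stores data.
    static byte[] encode(Person p) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Person decode(byte[] row) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(row))) {
            return (Person) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Like foreach: the action sees a fresh copy of each row; nothing is written back.
    static void forEachRow(List<byte[]> rows, Consumer<Person> action) {
        for (byte[] row : rows) {
            action.accept(decode(row));
        }
    }

    // Like map: the function's result is re-encoded into a NEW list of rows.
    static List<byte[]> mapRows(List<byte[]> rows, UnaryOperator<Person> fn) {
        List<byte[]> result = new ArrayList<>();
        for (byte[] row : rows) {
            result.add(encode(fn.apply(decode(row))));
        }
        return result;
    }

    static int firstAge(List<byte[]> rows) {
        return decode(rows.get(0)).age;
    }

    public static void main(String[] args) {
        List<byte[]> people = List.of(encode(new Person("Ann", "F")));

        forEachRow(people, p -> p.age = 12);       // mutates a throwaway copy
        System.out.println(firstAge(people));      // prints 0

        List<byte[]> withAge = mapRows(people, p -> { p.age = 12; return p; });
        System.out.println(firstAge(withAge));     // prints 12
    }
}
```

The same idea applies in Spark: use a transformation such as withColumn (or map) and keep the Dataset it returns, instead of mutating objects inside foreach.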

CodePudding user response:

Either import the functions object, as you do now, and use it to access the method:

    import org.apache.spark.sql.functions;

    df.withColumn("foo", functions.lit(1));

Or use a static import and call the method directly:

import static org.apache.spark.sql.functions.lit;

df.withColumn("foo", lit(1));