For example, I have a Spark Row:
Row row = ...
I can evaluate the following expression in an interactive debugger session:
row.schema.fieldNamesSet.contains("title")
> true
However, I cannot write:
assertThat(row.schema.fieldNamesSet.contains("title"))
// or
assertThat(row.schema().fieldNamesSet.contains("title"))
// etc.
// this method path is not available because it has "private access"
(General question, or Y) How do I assert that a fieldName is not present in the row?
(Specific question, or X) How do I perform an in-line check whether a schema contains a fieldName?
CodePudding user response:
The schema of a Row is an instance of the StructType class, so you can refer to the JavaDoc of this class to find out all the public fields and methods that you can use. Note that you can use all the methods defined in the StructType class plus all methods inherited from its superclasses and interfaces.
In particular, to check whether the schema contains a given field name, you have several options:
exists method
Pass a predicate to the exists method. The predicate is evaluated for each field, and exists returns true if at least one field matches the condition. It is also useful if you want to evaluate other conditions besides the name.
row.schema().exists(f -> "title".equals(f.name()));
getFieldIndex method
The StructType.getFieldIndex method returns an Option containing the field's index if the field is present, or None if it is not.
row.schema().getFieldIndex("title").isDefined();
You can also access the fields or fieldNames arrays with the fields() and fieldNames() methods and process them in whatever way is most convenient for your use case.
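As a rough illustration of the array-based option, here is a minimal sketch that checks a plain String[] of field names. FieldNameCheck and hasField are hypothetical names made up for this example, not part of Spark, and the sketch assumes an exact, case-sensitive match is what you want.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Spark): works on a plain array of field names,
// such as the one returned by row.schema().fieldNames().
final class FieldNameCheck {
    private FieldNameCheck() {}

    // Returns true if `name` occurs in `fieldNames` (exact, case-sensitive match).
    static boolean hasField(String[] fieldNames, String name) {
        return Arrays.asList(fieldNames).contains(name);
    }
}
```

With a real row, the array would come from row.schema().fieldNames(), so !FieldNameCheck.hasField(row.schema().fieldNames(), "title") is a plain boolean that can be passed to whichever assertion library the test uses.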