I have a dataset, which has the following schema:
ds.schema()
0 = {StructField@21690} "StructField(col_name_1,NullType,true)"
1 = {StructField@21691} "StructField(col_name_2,StringType,true)"
2 = {StructField@21692} "StructField(col_name_3,ArrayType(StructType(StructField(person_name,StringType,true), StructField(person_surname,StringType,true)),true),true)"
I want to access the data type of each StructField. For example, if the data type of col_name_1 is NullType, print null.
How can I write this if check?
CodePudding user response:
You can pull this out of the schema:
for field in df.schema.fields: print(field.name + " , " + str(field.dataType))
CodePudding user response:
This worked fine for me:
ds.schema().apply(ColName).dataType().toString().equals("NullType")