Im reading a JSON file with spark.read.json
and it auotmatically gives me the dataframe with schema but is it possible to change the schema of exisiting Dataframe with the below schema?
schema = StructType([StructField("_links", MapType(StringType(), MapType(StringType(), StringType()))),
StructField("identifier", StringType()),
StructField("enabled", BooleanType()),
StructField("family", StringType()),
StructField("categories", ArrayType(StringType())),
StructField("groups", ArrayType(StringType())),
StructField("parent", StringType()),
StructField("values", MapType(StringType(), ArrayType(MapType(StringType(), StringType())))),
StructField("created", StringType()),
StructField("updated", StringType()),
StructField("associations", MapType(StringType(), MapType(StringType(), ArrayType(StringType())))),
StructField("quantified_associations", MapType(StringType(), IntegerType())),
StructField("metadata", MapType(StringType(), StringType()))])
CodePudding user response:
Once you have schema
defined (as in the answer), you can use it to read the data this way:
df = spark.read.json('path_to_json', schema=schema)