Store different PySpark schema in one column

Time:08-25

I am trying to extract different tables from a REST API in PySpark. I followed this link. I want to store a different schema for each table in one column of a PySpark DataFrame. Here is an example:

import pyspark.sql.functions as F
from pyspark.sql import Row
from pyspark.sql.types import *

A = [{"TableName": "Table1", "Schema": StructType([StructField("a", StringType()), StructField("b", IntegerType())])}
    , {"TableName": "Table2", "Schema": StructType([StructField("b", StringType()), StructField("c", IntegerType())])}]
df_A = spark.createDataFrame(A)

I get the following error:

ValueError: Some of types cannot be determined after inferring

Is it possible to achieve this result?

CodePudding user response:

Data types such as StructType or StringType in Spark describe what the values in a DataFrame or column look like. A type is a description, not a value, so Spark cannot infer a column type for it and you can't store it as a value inside a column.

If you really want to save the schemas of different tables, why not save each one as a string?

A = [{"TableName": "Table1", "Schema": """StructType([StructField("a", StringType()), StructField("b", IntegerType())])"""}
    , {"TableName": "Table2", "Schema": """StructType([StructField("b", StringType()), StructField("c", IntegerType())])"""}]
df_A = spark.createDataFrame(A)

+-----------------------------------------------------------------------------+---------+
|Schema                                                                       |TableName|
+-----------------------------------------------------------------------------+---------+
|StructType([StructField("a", StringType()), StructField("b", IntegerType())])|Table1   |
|StructType([StructField("b", StringType()), StructField("c", IntegerType())])|Table2   |
+-----------------------------------------------------------------------------+---------+

You can then parse the stored string back into a schema wherever you need it, for example inside your own UDF.
