I created a DataFrame and added some metadata to one of its columns:
import pandas as pd
from pyspark.sql.functions import col
# assumes an active SparkSession named `spark`
df = spark.createDataFrame(pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}))
# re-alias column a under the same name, attaching the metadata to it
df = df.withColumn('a', col('a').alias('a', metadata={'numClasses': 2}))
How can I access the metadata added to column a?
I tried looking at df.schema, but it does not seem to be updated with the metadata.
CodePudding user response:
Try reading the metadata from that column's field in the schema:
print(df.schema['a'].metadata['numClasses'])