How can a PySpark program use a string to refer to a Spark DataFrame?


df_source.printSchema()
# root
#  |-- year: integer (nullable = true)
#  |-- industry_code_ANZSIC: string (nullable = true)
#  |-- unit: string (nullable = true)
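
For reproducibility, a DataFrame with this schema can be constructed like so (a minimal sketch with made-up rows; the real data comes from my source file, and an active SparkSession is assumed):

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("year", IntegerType(), True),
    StructField("industry_code_ANZSIC", StringType(), True),
    StructField("unit", StringType(), True),
])

# Made-up rows just to reproduce the schema shown above
df_source = spark.createDataFrame([(2020, "A", "Dollars")], schema)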

If I try to call a DataFrame method through a string built from its name, it throws an error, because the object above is a DataFrame while the value below is just a string:

ls=['source','temp']
temp = "df_" + ls[0]
print(temp)
# df_source
temp.printSchema()
Traceback (most recent call last):  
  File "<stdin>", line 1, in <module>  
AttributeError: 'str' object has no attribute 'printSchema'

I need a solution that lets me refer to a PySpark DataFrame dynamically by its name as a string.

CodePudding user response:

You could use exec, but then you have to write the method call as a string too:

exec(f"{temp}.printSchema()")