How to get schema without loading table data in Databricks?


I am working on Databricks and I use Spark to load and publish data to a SQL database. One of the tasks I need to do is get the schema of a table in my database so I can see the data type of each column. The only way I have been able to do this so far is by loading the whole table and then extracting the schema.

df_tableA = spark.read.format("jdbc") \
        .option("url", datasource_url) \
        .option("dbtable", table_name) \
        .option("user", dbuser) \
        .option("password", dbpassword) \
        .option("driver", driver) \
        .load()
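
The schema is then presumably read from the loaded DataFrame, roughly like this (df_tableA as defined above):

df_tableA.printSchema()        # human-readable tree of column names and data types
schema = df_tableA.schema      # StructType object for programmatic use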

However, my goal is to get just the schema without loading the entire table, both to speed up the process and to avoid overloading memory.

Would you be able to suggest a smart and elegant way to achieve my goal?

CodePudding user response:

Normally, load does not pull the table into memory: the JDBC read is lazy, and at this point Spark only fetches the metadata needed to infer the schema. But if you want to be sure no rows are ever returned, you can pass a dummy query to dbtable, e.g. .option("dbtable", "(select * from table where 1 = 2) t").
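
A minimal sketch of that approach, assuming the same connection variables as in the question (datasource_url, table_name, dbuser, dbpassword, driver). The WHERE 1 = 2 predicate is pushed down to the database, so zero rows come back and only the schema is inferred:

# Fetch only the schema by pushing down a query that returns no rows
empty_df = spark.read.format("jdbc") \
        .option("url", datasource_url) \
        .option("dbtable", f"(SELECT * FROM {table_name} WHERE 1 = 2) t") \
        .option("user", dbuser) \
        .option("password", dbpassword) \
        .option("driver", driver) \
        .load()

schema = empty_df.schema       # StructType with column names and data types
empty_df.printSchema()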
