I want to run a repair job (MSCK REPAIR TABLE) in Azure Databricks, but I want to exclude 4 tables. What am I doing wrong?
database = "az_shffs"
tables = spark.catalog.listTables(database)
tables = tables.filter("tableName != 'exampletable1'").filter("tableName != 'exampletable2'").filter("tableName != 'exampletable3'").filter("tableName != 'exampletable4'")
for table in tables:
    spark.sql(f"MSCK REPAIR TABLE {database}.{table.name}")
I get the following error message:
AttributeError: 'list' object has no attribute 'filter'
CodePudding user response:
The command tables = spark.catalog.listTables(database) stores the result in the tables variable, but that result is a plain Python list, not a DataFrame, and a list has no filter attribute. That is why you get the AttributeError. If you still want to use filter, get the table names as a DataFrame first and then filter that.
You can use the following command. It returns the table list as a DataFrame, which you can then filter.
df = spark.sql("show tables in demo")
display(df)
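As a rough sketch of the filtering step, the four exclusions can be expressed as a single NOT IN condition instead of four chained filter calls. The helper names exclusion_condition and tables_to_repair are made up for illustration, and the sketch assumes the DataFrame returned by SHOW TABLES has a tableName column and that spark is the session Databricks predefines in a notebook.

```python
def exclusion_condition(excluded):
    """Return a SQL filter expression that drops the given table names."""
    quoted = ", ".join(f"'{name}'" for name in sorted(excluded))
    return f"tableName NOT IN ({quoted})"


def tables_to_repair(spark, database, excluded):
    """Return the SHOW TABLES DataFrame without the excluded tables.

    `spark` is assumed to be an active SparkSession (predefined on Databricks).
    """
    df = spark.sql(f"SHOW TABLES IN {database}")
    if not excluded:
        # Nothing to exclude, and an empty NOT IN () would be invalid SQL.
        return df
    return df.filter(exclusion_condition(excluded))


# The condition string can be inspected without a cluster.
condition = exclusion_condition({"exampletable1", "exampletable2"})
print(condition)
```

In a notebook you would call tables_to_repair(spark, "az_shffs", {...}) with the four names from the question and pass the result to display.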
To run the MSCK REPAIR TABLE command in a for loop, you can use the code below.
for i in tables.collect():
    spark.sql(f"MSCK REPAIR TABLE {database}.{i.tableName}")  # tables is the filtered DataFrame
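If you would rather keep spark.catalog.listTables, another option is to filter the Python list directly with a list comprehension, since each entry exposes a name attribute. The following is only a sketch: repair_statements and FakeTable are hypothetical names introduced here, and FakeTable merely stands in for the catalog entries so the snippet runs without a cluster.

```python
from collections import namedtuple

# Tables to skip (names taken from the question).
EXCLUDED = {"exampletable1", "exampletable2", "exampletable3", "exampletable4"}


def repair_statements(database, tables, excluded=EXCLUDED):
    """Build one MSCK REPAIR TABLE statement per table not in `excluded`.

    `tables` is any iterable of objects with a `.name` attribute, such as
    the list returned by spark.catalog.listTables(database).
    """
    return [
        f"MSCK REPAIR TABLE {database}.{t.name}"
        for t in tables
        if t.name not in excluded
    ]


# Stand-in for catalog entries so the sketch runs without a cluster.
FakeTable = namedtuple("FakeTable", ["name"])
sample = [FakeTable("exampletable1"), FakeTable("orders")]
statements = repair_statements("az_shffs", sample)
print(statements)

# On Databricks, where `spark` is predefined:
# for stmt in repair_statements(database, spark.catalog.listTables(database)):
#     spark.sql(stmt)
```

Building the statements first also makes it easy to print them and check which tables will be repaired before running anything.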
Please accept this answer if this works for you.