Azure table backup retrieve more than 1000 rows


I hope somebody can help me debug this issue.

I have the following script:


from azure.cosmosdb.table.tableservice import TableService, ListGenerator
from azure.storage.blob import BlobServiceClient
from datetime import date




def queryAndSaveAllDataBySize(tb_name,resp_data:ListGenerator ,table_out:TableService,table_in:TableService,query_size:int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("instet data:"   str(item)   "into table:"  tb_name)
        table_in.insert_or_replace_entity(tb_name,item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name,data,table_out,table_in,query_size)


tbs_out = table_service_out.list_tables()

for tb in tbs_out:
    #create table with same name in storage2
    table_service_in.create_table(table_name=tb.name, fail_on_exist=False)
    #first query
    data = table_service_out.query_entities(tb.name,num_results=query_size)
    queryAndSaveAllDataBySize(tb.name,data,table_service_out,table_service_in,query_size)

This code iterates over the tables in StorageA, creates a table with the same name in StorageB, and copies the rows across. Thanks to the marker (the x-ms-continuation token) it can keep paging when a table returns more than 1000 rows per request.

It goes without saying that this works just fine as it is.

But yesterday I was trying to make some changes to the code, as follows:

If in StorageA I have a table named TEST, in StorageB I want to create a table named TEST20210930, i.e. the table name from StorageA with today's date appended.
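To illustrate what I mean, the date suffix can be built like this (a minimal sketch; the variable names here are my own, not from the script above):

```python
from datetime import date

# Hypothetical sketch: append today's date (yyyymmdd) to the source table name.
source_table_name = "TEST"
today = date.today().strftime("%Y%m%d")
target_table_name = source_table_name + today
print(target_table_name)  # e.g. TEST20210930 on 2021-09-30
```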

This is where the code starts breaking down.


table_service_out = TableService(account_name='', account_key='')
table_service_in = TableService(account_name='', account_key='')
query_size = 100

#save data to storage2; if there is leftover data in the current table, recurse
def queryAndSaveAllDataBySize(tb_name,resp_data:ListGenerator ,table_out:TableService,table_in:TableService,query_size:int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("instet data:"   str(item)   "into table:"  tb_name)
        table_in.insert_or_replace_entity(tb_name,item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name,data,table_out,table_in,query_size)


tbs_out = table_service_out.list_tables()
print(tbs_out)

for tb in tbs_out:
    table = tb.name + today
    print(target_connection_string)
    #create table with same name in storage2
    table_service_in.create_table(table_name=table, fail_on_exist=False)

    #first query
    data = table_service_out.query_entities(tb.name,num_results=query_size)
    queryAndSaveAllDataBySize(table,data,table_service_out,table_service_in,query_size)

What happens here is that the code runs up to the query_size limit but then fails, saying that the table was not found.

I am a bit confused here, and maybe somebody can help me spot my error.

Please, if you need more info, just ask.

Thank you so so so much.

HOW TO REPRODUCE: In the Azure portal create 2 storage accounts: StorageA and StorageB.

In StorageA create a table and fill it with data, over 100 rows (based on the query_size). Set the connection endpoints: table_service_out = StorageA and table_service_in = StorageB.
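To get over 100 rows quickly, the table can be seeded with a loop like this sketch (the entity fields, row count, and table name are illustrative assumptions):

```python
def build_entities(count):
    # Every Table storage entity needs a PartitionKey and a RowKey;
    # the extra Value field is just sample payload.
    return [
        {"PartitionKey": "test", "RowKey": str(i).zfill(4), "Value": i}
        for i in range(count)
    ]

entities = build_entities(150)  # > query_size of 100, so a continuation token is returned
print(len(entities))  # prints 150

# Then insert with the same TableService pointed at StorageA, e.g.:
#   table_service_out.create_table('TEST', fail_on_exist=False)
#   for e in entities:
#       table_service_out.insert_or_replace_entity('TEST', e)
```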

CodePudding user response:

I believe the issue is with the following line of code:

data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)

If you notice, tb_name is the name of the table in your target account, which is obviously not present in your source account. Because you're querying a table that does not exist, you're getting this error.

To fix this, you should also pass the name of the source table to queryAndSaveAllDataBySize and use that when querying entities in that function.

UPDATE

Please take a look at the code below:

table_service_out = TableService(account_name='', account_key='')
table_service_in = TableService(account_name='', account_key='')
query_size = 100

#save data to storage2; if there is leftover data in the current table, recurse
def queryAndSaveAllDataBySize(source_table_name, target_table_name,resp_data:ListGenerator ,table_out:TableService,table_in:TableService,query_size:int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("instet data:"   str(item)   "into table:"  tb_name)
        table_in.insert_or_replace_entity(target_table_name,item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=source_table_name,num_results=query_size,marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(source_table_name, target_table_name, data,table_out,table_in,query_size)


tbs_out = table_service_out.list_tables()
print(tbs_out)

for tb in tbs_out:
    table = tb.name + today
    print(target_connection_string)
    #create table with same name in storage2
    table_service_in.create_table(table_name=table, fail_on_exist=False)

    #first query
    data = table_service_out.query_entities(tb.name,num_results=query_size)
    queryAndSaveAllDataBySize(tb.name, table,data,table_service_out,table_service_in,query_size)