I am trying to perform Dataset versioning where I read a CSV file into a pandas DataFrame and then create a new version of an Azure ML Dataset. I am running the below code in an Azure CLI job within Azure DevOps.
df = pd.read_csv(blob_sas_url)
At this line, I get a 404 Error. Error Message:
urllib.error.HTTPError: HTTP Error 404: The specified resource does not exist
When I run the same code locally, I can read the CSV file into a DataFrame without any problem. The SAS URL and token have not expired either.
How can I solve this issue?
Edit - Code
import argparse

import pandas as pd
from azureml.core import Run


class Pipeline:
    def __init__(self, args):
        self.args = args
        self.run = Run.get_context()
        self.workspace = self.run.experiment.workspace

    def get_Dataframe(self):
        # Print the URL exactly as the script received it, then load the CSV
        print(self.args.blob_sas_url)
        df = pd.read_csv(self.args.blob_sas_url)
        return df

    def create_pipeline(self):
        print("Creating Pipeline")
        print(self.args.blob_sas_url)
        # dataset_to_update() is not shown in this excerpt
        dataframe = self.dataset_to_update()
        # Rest of Code


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Azure ML Dataset Versioning pipeline')
    parser.add_argument('--blob_sas_url', type=str, help='SAS URL to the Data File in Blob Storage')
    args = parser.parse_args()
    ds_versioner = Pipeline(args)
    ds_versioner.create_pipeline()
In both places where the script prints the SAS URL with print(self.args.blob_sas_url), the URL is shortened. I was able to see this in the std_log.txt file.
Answer:
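Your input argument is shortened (technically, truncated) because the URL is not quoted, so bash splits the command at each &. In bash, & ends a command and runs it in the background, so only the part of the SAS URL before the first & reaches your script, and the rest is treated as separate commands. Apparently Azure inserts the variable's value into the command as plain text, which is why the shell sees the bare URL.
For example: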
python3 test_input.py --blob_sas_url "somepath/to/storage/account/file.txt?sv=2022-01-01&sr=b&sig=SOmethingwd21dd1"
>>> output: somepath/to/storage/account/file.txt?sv=2022-01-01&sr=b&sig=SOmethingwd21dd1
python3 test_input.py --blob_sas_url somepath/to/storage/account/file.txt?sv=2022-01-01&sr=b&sig=SOmethingwd21dd1
>>> output:
[1] 1961
[2] 1962
[2] Done sr=b
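The splitting shown above can also be reproduced without a shell. The following is only a rough sketch that uses Python's shlex module to approximate how a POSIX shell tokenizes the two command lines; shell_tokens is a hypothetical helper name, the URL is the placeholder from the example, and shlex is not a full bash parser.

```python
import shlex


def shell_tokens(command_line):
    """Approximate POSIX-shell tokenization, keeping operators such as & as separate tokens."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


# Placeholder URL modelled on the example above, not a real SAS URL.
url = "somepath/to/storage/account/file.txt?sv=2022-01-01&sr=b&sig=SOmethingwd21dd1"

unquoted = shell_tokens(f"python3 test_input.py --blob_sas_url {url}")
quoted = shell_tokens(f'python3 test_input.py --blob_sas_url "{url}"')

print(unquoted)  # the URL is broken up, with & appearing as its own token
print(quoted)    # the URL survives as a single argument
```

In the unquoted case the token that follows --blob_sas_url ends just before the first &, which matches the shortened URL seen in std_log.txt.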
So you just need to quote the Azure variable in your step command, as follows:
python3 your_python_script.py --blob_sas_url "$(azml.sasURL)"
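As an extra safeguard, the script itself could check the URL before handing it to pd.read_csv, so that a truncated argument fails with a clear message instead of a 404. This is a minimal sketch, not part of the original code: missing_sas_params and REQUIRED_SAS_PARAMS are hypothetical names, and the list of required query parameters is an assumption that should be adjusted to the kind of SAS token you generate.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: a usable SAS URL carries at least a version (sv) and a signature (sig).
REQUIRED_SAS_PARAMS = ("sv", "sig")


def missing_sas_params(url, required=REQUIRED_SAS_PARAMS):
    """Return the required query parameters that are absent from the URL."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return [name for name in required if name not in query]


# Placeholder values modelled on the example above, not real SAS URLs.
truncated = "somepath/to/storage/account/file.txt?sv=2022-01-01"
complete = truncated + "&sr=b&sig=SOmethingwd21dd1"

print(missing_sas_params(truncated))  # ['sig']
print(missing_sas_params(complete))   # []
```

Calling this at the start of get_Dataframe and raising an error when the returned list is not empty would point directly at the quoting problem.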