My main problem is that I want to check whether an object in GCP exists or not. Here is what I tried:
from google.cloud import storage
client = storage.Client()
path_exists = False
for blob in client.list_blobs('models', prefix='trainedModels/mddeep256_sarim'):
    path_exists = True
    break
It worked fine for me. But now the problem is that I don't know the model name, which is mddeep256, but I do know the trailing part, _sarim.
So, I want to use something like
for blob in client.list_blobs('models', prefix='trainedModels/*_sarim'):
I want to use the * wildcard. How can I do that?
CodePudding user response:
In short: you can't!
You can only filter on the prefix. If you want to filter on the suffix (as you do here), start by filtering on the longest prefix that you can with the API, and then iterate in your code over the object names and keep those that match your pattern.
There is no built-in solution for that...
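A minimal sketch of that approach, assuming a shell-style pattern such as 'trainedModels/*_sarim'. The helper names (literal_prefix, any_name_matches, object_exists) are made up for illustration and are not part of the google-cloud-storage library; only storage.Client() and client.list_blobs(..., prefix=...) come from the question. Note that fnmatch's * also matches '/', so it can cross "folder" boundaries.

```python
from fnmatch import fnmatchcase


def literal_prefix(pattern):
    """Return the part of a glob pattern before its first wildcard character."""
    for i, ch in enumerate(pattern):
        if ch in "*?[":
            return pattern[:i]
    return pattern


def any_name_matches(names, pattern):
    """True if at least one name matches the glob pattern (stops at the first hit)."""
    return any(fnmatchcase(name, pattern) for name in names)


def object_exists(bucket_name, pattern):
    """Hypothetical helper: narrow by prefix on the server, then match client-side."""
    # Imported here so the two helpers above can be used without the library installed.
    from google.cloud import storage

    client = storage.Client()
    # Only the literal part of the pattern can be sent to the API as a prefix.
    blobs = client.list_blobs(bucket_name, prefix=literal_prefix(pattern))
    return any_name_matches((blob.name for blob in blobs), pattern)
```

With the names from the question, the call would look like object_exists('models', 'trainedModels/*_sarim'). The server only returns objects under trainedModels/, and the suffix check happens locally.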
CodePudding user response:
list_blobs doesn't support regex in prefix. You need to filter by yourself, as mentioned by Guilaume.
The following should work.
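def is_object_exist(bucket_name, object_pattern):
    from google.cloud import storage
    import re
    client = storage.Client()
    # No prefix is passed, so this lists every object in the bucket
    all_blobs = client.list_blobs(bucket_name)
    regex = re.compile(object_pattern)
    # regex.match() anchors at the start of the name, so the pattern must
    # cover the leading part too, e.g. r'trainedModels/.*_sarim'
    # any() stops at the first matching blob instead of building a full list
    return any(regex.match(b.name) for b in all_blobs)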
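To build a pattern for a function like the one above, one option is to escape the known parts and join them with .*; this is only a sketch, and suffix_pattern is a hypothetical helper name. The object names in the list are sample strings, not real bucket contents.

```python
import re


def suffix_pattern(prefix, suffix):
    """Build a regex for names that start with `prefix` and end with `suffix`."""
    # re.escape keeps characters such as '.' or '+' in the inputs literal.
    return re.escape(prefix) + r".*" + re.escape(suffix) + r"$"


pattern = suffix_pattern("trainedModels/", "_sarim")

# Sample names for illustration only.
names = [
    "trainedModels/mddeep256_sarim",
    "trainedModels/mddeep256_other",
    "logs/x_sarim",
]

# re.match anchors at the start, so "logs/x_sarim" is rejected.
matches = [n for n in names if re.match(pattern, n)]
print(matches)
```

The resulting pattern can be passed as object_pattern; the trailing $ makes sure _sarim is really the end of the name.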