I am trying to read a JSON file in Databricks with the following code:
import json

with open('/dbfs/mnt/bronze/categories/20221006/data_10.json') as f:
    d = json.load(f)
This works perfectly, but I would like to use wildcards, since there are multiple folders and files. Preferably, I want to make the code below work:
with open('/dbfs/mnt/bronze/categories/**/*.json') as f:
    d = json.load(f)
When I read the JSON files using Spark, wildcards work perfectly, but I would prefer the option above:
df = spark.read.json('/mnt/bronze/AKENEO/categories/**/*.json')
CodePudding user response:
You can create a quick script that goes through the folders using os.walk.
Python's built-in open() does not expand wildcards, so this approach avoids them entirely, at the cost of a little more code.
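A minimal sketch of the os.walk approach, assuming every file ending in .json under the root folder should be loaded. The function name load_json_files and the choice to return a dict keyed by file path are illustrative assumptions, not something from the question:

```python
import json
import os


def load_json_files(root_dir):
    """Walk root_dir recursively and parse every .json file found.

    Hypothetical helper: returns a dict mapping each file path
    to its parsed JSON content.
    """
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            # Skip anything that is not a JSON file.
            if not name.endswith(".json"):
                continue
            path = os.path.join(dirpath, name)
            with open(path) as f:
                results[path] = json.load(f)
    return results
```

With the paths from the question, this would be called as load_json_files('/dbfs/mnt/bronze/categories'). If you would rather keep the wildcard pattern, glob.glob('/dbfs/mnt/bronze/categories/**/*.json', recursive=True) from the standard library returns the list of matching paths, which can then be opened one at a time in a loop.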