Some of the documents in my blob storage are empty. The resulting Azure Search document looks like this:
"value": [
{
"@search.score": 19.593246,
"@search.highlights": {
"original_title": [
"<em>Divine</em> <em>Secrets</em> of the Ya-Ya Sisterhood"
],
"title": [
"<em>Divine</em> <em>Secrets</em> of the Ya-Ya Sisterhood"
]
},
"original_title": "Divine Secrets of the Ya-Ya Sisterhood",
"title": "Divine Secrets of the Ya-Ya Sisterhood",
"content": "" // this is empty
}
]
As you can see, content is empty because there is nothing in the "Divine Secrets of the Ya-Ya Sisterhood" blob file.
My question is: how can I ignore empty blob files, either in the data source or during the indexing process, and is it possible to raise an indexing error (or any error) when an empty file appears?
CodePudding user response:
Looking at the documentation, I would say it is not possible either to exclude zero-sized blobs from indexing or to throw an error during indexing for such blobs.
One possible solution would be to set a metadata property (AzureSearch_Skip) with the value true when uploading blobs that do not have any content (for existing blobs, you would need to update the metadata manually). The presence of this metadata property instructs Azure Search to skip such blobs during indexing.