Azure Search how to ignore empty blob files (without content)?

Time:10-06

My blob storage contains some documents that are empty. The resulting Azure Search document looks like this:

"value": [
    {
        "@search.score": 19.593246,
        "@search.highlights": {
            "original_title": [
                "<em>Divine</em> <em>Secrets</em> of the Ya-Ya Sisterhood"
            ],
            "title": [
                "<em>Divine</em> <em>Secrets</em> of the Ya-Ya Sisterhood"
            ]
        },
        "original_title": "Divine Secrets of the Ya-Ya Sisterhood",
        "title": "Divine Secrets of the Ya-Ya Sisterhood",
        "content": "" // this is empty
    }
]

As you can see, content is empty because the "Divine Secrets of the Ya-Ya Sisterhood" blob file has nothing in it.

My question is: how can I ignore empty blob files in the data source or during indexing, and is it possible to raise an indexing error (or any error) when an empty file appears?

CodePudding user response:

Based on the documentation, I would say it is not possible either to exclude zero-sized blobs from indexing or to throw an error during indexing for such blobs.

One possible solution is to set a metadata property (AzureSearch_Skip) with the value true when uploading blobs that have no content (for existing blobs, you would need to update the metadata manually). The presence of this metadata property instructs Azure Search to skip such blobs during indexing.
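
For existing blobs, the manual metadata update could be scripted. Below is a minimal sketch, assuming the azure-storage-blob Python package (v12); the helper names with_skip_flag and mark_empty_blobs are made up for illustration, and the connection string and container name in the usage comment are placeholders.

```python
from typing import Dict, List, Optional

SKIP_KEY = "AzureSearch_Skip"  # metadata property named in the answer above


def with_skip_flag(size: int, metadata: Optional[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return updated metadata for an empty blob, or None if nothing should change.

    Hypothetical helper: existing metadata is copied so that other keys survive.
    """
    current = dict(metadata or {})
    if size != 0 or current.get(SKIP_KEY, "").lower() == "true":
        return None  # blob has content, or is already flagged
    current[SKIP_KEY] = "true"
    return current


def mark_empty_blobs(container) -> List[str]:
    """Flag every zero-length blob in `container`; return the names that were updated.

    `container` is assumed to behave like azure.storage.blob.ContainerClient.
    """
    updated = []
    # include=["metadata"] asks the listing to return each blob's metadata too.
    for blob in container.list_blobs(include=["metadata"]):
        new_metadata = with_skip_flag(blob.size, blob.metadata)
        if new_metadata is None:
            continue
        # Setting metadata replaces the blob's whole metadata set,
        # which is why the existing keys were merged in above.
        container.get_blob_client(blob.name).set_blob_metadata(metadata=new_metadata)
        updated.append(blob.name)
    return updated


# Usage (placeholders, not real values):
#   from azure.storage.blob import BlobServiceClient
#   service = BlobServiceClient.from_connection_string("<connection-string>")
#   print(mark_empty_blobs(service.get_container_client("<container-name>")))
```

The size check is kept in a separate function so the decision of which blobs to flag can be tested without touching a storage account.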
