What's the best way of storing tags into elasticsearch

Time:11-30

I have an index 'product' in Elasticsearch, and I want to add tags such as 'environmental', 'energy-saving', 'recyclable', and 'medical-grade' to each item. After some searching I found three approaches: array, nested, and bit.

1. Use an array.
{
    "mappings": {
        "properties": {
            "tags": {
                "type": "keyword"
            }
        }
    }
}

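It stores the tag names directly. Query for items that carry both 'environmental' and 'medical-grade' (a single terms clause would match items with either tag, so each tag gets its own term clause):

{
    "query": {
        "bool": {
            "must": [
                { "term": { "tags": "environmental" } },
                { "term": { "tags": "medical-grade" } }
            ]
        }
    }
}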
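
When the tag list is only known at run time, a body that requires every tag can be generated with one `term` clause per tag (a single `terms` clause matches documents containing any of its values). The following is a minimal Python sketch, not a definitive implementation: the helper name `all_tags_query` is hypothetical, and it only builds the request body as a dict, leaving the choice of client to you.

```python
def all_tags_query(tags, field="tags"):
    """Build a search body matching documents that carry every tag in `tags`.

    `all_tags_query` is a hypothetical helper; `field` is assumed to be a
    keyword field as in the mapping from the question.
    """
    # One term clause per tag; clauses listed under bool.must are ANDed.
    clauses = [{"term": {field: tag}} for tag in tags]
    return {"query": {"bool": {"must": clauses}}}


body = all_tags_query(["environmental", "medical-grade"])
```

Swapping `must` for `filter` would select the same documents without computing relevance scores, which is usually all a tag filter needs.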

2. Use nested.
{
    "mappings": {
        "properties": {
            "tags": {
                "type": "nested",
                "properties": {
                    "name": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
}

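It can also store the tag names directly, along with an id or other fields.

Query for items that carry both 'environmental' and 'medical-grade' (nested fields have to be searched through a nested query, one per tag because each tag is a separate nested object):

{
    "query": {
        "bool": {
            "must": [
                {
                    "nested": {
                        "path": "tags",
                        "query": { "term": { "tags.name": "environmental" } }
                    }
                },
                {
                    "nested": {
                        "path": "tags",
                        "query": { "term": { "tags.name": "medical-grade" } }
                    }
                }
            ]
        }
    }
}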

3. Use bit flags.
{
    "mappings": {
        "properties": {
            "tags": {
                "type": "long"
            }
        }
    }
}

This stores the tags indirectly: each tag has to be assigned a bit position.

Suppose the n-th bit represents the n-th tag, counting from the least significant bit: 0 -> 'environmental', 1 -> 'energy-saving', 2 -> 'recyclable', 3 -> 'medical-grade'. Then 1001 in binary (9 in decimal) means the item has 'environmental' and 'medical-grade'.
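
The bit arithmetic can be checked outside Elasticsearch. Here is a small Python sketch under the bit assignment given in the text; the names `TAG_BITS`, `encode_tags`, `decode_tags`, and `has_all` are made up for illustration.

```python
# Bit positions follow the example in the text: bit n stands for the n-th tag.
TAG_BITS = {
    "environmental": 0,
    "energy-saving": 1,
    "recyclable": 2,
    "medical-grade": 3,
}


def encode_tags(tags):
    """Return the integer whose set bits correspond to the given tags."""
    mask = 0
    for tag in tags:
        mask |= 1 << TAG_BITS[tag]
    return mask


def decode_tags(value):
    """Return the tag names whose bits are set in `value`."""
    return [tag for tag, bit in TAG_BITS.items() if value & (1 << bit)]


def has_all(value, mask):
    """True when every bit of `mask` is also set in `value`."""
    return (value & mask) == mask


mask = encode_tags(["environmental", "medical-grade"])
print(mask, bin(mask))     # 9 0b1001
print(decode_tags(11))     # ['environmental', 'energy-saving', 'medical-grade']
print(has_all(11, mask))   # True
print(has_all(3, mask))    # False
```

Since a `long` field is a signed 64-bit integer, this scheme tops out at 64 distinct tags (63 if the sign bit is avoided), and the tag-to-bit table has to be maintained outside the index.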

Query for items that carry both 'environmental' and 'medical-grade':

{
    "query": {
        "bool": {
            "must": {
                "script": {
                    "script": "doc['tags'].size() != 0 && (doc['tags'].value&9)==9"
                }
            }
        }
    }
}

I don't know how they perform, but I actually like the third way. Please give me some advice or suggest a better approach.

CodePudding user response:

My suggestion is to go with option 1 and use an array. It is easy to query, and the field can also be used in aggregations.

Option 2 would work, but I don't think it is the best fit for your case: you don't have nested or parent-child data, so storing the tags as nested is unnecessary.

Option 3 I would not suggest, because it needs a script at query time and that will hurt performance.
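
To illustrate the aggregation point for option 1: with `tags` mapped as `keyword`, a terms aggregation can count how many products carry each tag. The sketch below only builds the request body in Python; the helper name `tag_counts_request` and the aggregation name `tag_counts` are hypothetical.

```python
def tag_counts_request(field="tags", size=10):
    """Build a search body that counts documents per tag value.

    Hypothetical helper; `field` is assumed to be the keyword field
    from option 1.
    """
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            "tag_counts": {  # aggregation name chosen for illustration
                "terms": {"field": field, "size": size}
            }
        },
    }


body = tag_counts_request()
```

The bit-flag field from option 3 offers no comparable shortcut, because a terms aggregation would bucket by the combined integer rather than by individual tag.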
