I have been using the following query to rename a field:
POST http://localhost:9200/INDEX_NAME/_update_by_query
{
  "query": {
    "exists": {
      "field": "NEW_FIELD_NAME"
    }
  },
  "script": {
    "inline": "ctx._source.NEW_FIELD_NAME = ctx._source.OLD_FIELD_NAME; ctx._source.remove(\"OLD_FIELD_NAME\");"
  }
}
But with more than 4.2 million documents, it takes about 2-3 minutes.
Is there any way to reduce the duration?
The Elasticsearch version is 5.6.4.
CodePudding user response:
David's answer on Discuss is the hard way. If you're looking for an easy way, and if your index has more than 1 primary shard, you can use slicing in order to parallelize the work:
POST http://localhost:9200/INDEX_NAME/_update_by_query?slices=auto&wait_for_completion=false
{
...
}
But David is right, you should upgrade at some point ;-)
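To make the sliced request above a bit more concrete, here is a minimal Python sketch that only assembles the request (it does not send anything). The helper names `sliced_update_request` and `task_status_url` are made up for illustration, and so is the task id placeholder. I'm passing an explicit slice count rather than `slices=auto`, on the assumption that a number is the safer choice on 5.6.x; check the docs for your version. The body is copied from the question as-is.

```python
import json
from urllib.parse import urlencode


def sliced_update_request(base_url, index, body, slices):
    """Build (method, url, payload) for an asynchronous, sliced _update_by_query.

    Hypothetical helper: it only assembles the request and does not send it.
    """
    # Assumption: an explicit slice count instead of slices=auto.
    params = urlencode({"slices": slices, "wait_for_completion": "false"})
    url = f"{base_url}/{index}/_update_by_query?{params}"
    return "POST", url, json.dumps(body)


def task_status_url(base_url, task_id):
    """URL to poll with GET for the task id returned by the async request."""
    return f"{base_url}/_tasks/{task_id}"


# Request body copied from the question, unchanged.
body = {
    "query": {"exists": {"field": "NEW_FIELD_NAME"}},
    "script": {
        "inline": "ctx._source.NEW_FIELD_NAME = ctx._source.OLD_FIELD_NAME; "
                  'ctx._source.remove("OLD_FIELD_NAME");'
    },
}

method, url, payload = sliced_update_request(
    "http://localhost:9200", "INDEX_NAME", body, slices=5
)
print(method, url)
# "NODE_ID:12345" is a placeholder for the "task" value in the response.
print("GET", task_status_url("http://localhost:9200", "NODE_ID:12345"))
```

With `wait_for_completion=false` the call returns immediately with a task id, so you poll the tasks endpoint instead of holding the HTTP connection open. A slice count equal to the number of primary shards is usually a sensible starting point.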
CodePudding user response:
A completely different approach, which ensures zero downtime, is to first create a new index whose mapping uses the updated field name.
After that, you can use the reindex API to copy the data into this new index: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
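As a rough sketch of what the reindex request body could look like, the Python snippet below builds the JSON you would POST to `_reindex`. The function name `reindex_rename_body` and the index name `NEW_INDEX_NAME` are placeholders I made up; the script key is `inline` to match the 5.6 request in the question (newer versions call it `source`). It assumes simple, non-dotted field names.

```python
import json


def reindex_rename_body(source_index, dest_index, old_field, new_field):
    """Build a _reindex body that renames one field while copying documents.

    Hypothetical helper; assumes simple field names without dots.
    """
    # remove() returns the removed value, so this moves old_field to new_field.
    script = f'ctx._source.{new_field} = ctx._source.remove("{old_field}")'
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", "inline": script},
    }


reindex_body = reindex_rename_body(
    "INDEX_NAME", "NEW_INDEX_NAME", "OLD_FIELD_NAME", "NEW_FIELD_NAME"
)
# POST this JSON to http://localhost:9200/_reindex
print(json.dumps(reindex_body, indent=2))
```

Note that a document without the old field would end up with the new field set to null, so you may want to add a query to the `source` section to skip those.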
Assuming you are using aliases, you can point the alias at the new index once reindexing is finished. This ensures there is no downtime for the application using this data. The reindex API also supports various configuration parameters to make the indexing process faster.
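The alias switch can be done in a single `_aliases` request, so the application never sees a moment where the alias points at nothing. Here is a small Python sketch of that request body; `alias_swap_body`, `ALIAS_NAME` and `NEW_INDEX_NAME` are hypothetical names, not something from your setup.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build an _aliases body that moves an alias from old_index to new_index.

    Hypothetical helper; both actions go in one request so the swap is atomic.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap_body = alias_swap_body("ALIAS_NAME", "INDEX_NAME", "NEW_INDEX_NAME")
# POST this JSON to http://localhost:9200/_aliases
print(json.dumps(swap_body, indent=2))
```

Once the alias points at the new index and you have verified the data, the old index can be deleted.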