I would like to include synonyms in Elasticsearch using the R package elastic, preferably at search time only. I can't get this working. Hope someone can help me out. Thanks!
Here is one example, assuming that brain, mind, and smart are synonyms.
My code in R...
library(elastic)
connection <- connect()
#index_delete(connection,"test")
index_create(connection, "test")
properties <-
'{
"properties": {
"sentence": {
"type": "text",
"position_increment_gap": 100
}
}
}'
mapping_create(connection, "test", body = properties)
sentences <- data.frame(sentence = c("This is a brain","This is a mind","This is fun","This is smart"))
document <- cbind(1,sentences)
colnames(document)[1] <- "document"
docs_bulk(connection,document,"test")
emptyBody <-
'{
"query": {
"match_phrase": {
"sentence": {
"query": "this mind",
"slop": 100
}
}
}
}'
Search(connection,"test",body=emptyBody)
... returns...
"This is a mind"
But I want...
"This is a brain"
"This is a mind"
"This is smart"
Settings?...
Based on the documentation of the R package elastic and some general searches, I experimented with the following code block, placing it before the 'properties' code block, but it had no effect. :(
settings <- '{
"analysis": {
"analyzer": {
"synonym_analyzer": {
"tokenizer": "standard",
"filter": ["lowercase", "synonym_filter"]
}
},
"filter": {
"synonym_filter": {
"type": "synonym_graph",
"synonyms": [
"brain, mind, smart"
]
}
}
}
}'
index_analyze(connection, "test", body = settings)
CodePudding user response:
Are you using the synonym analyzer in the field mapping?
"mappings": {
"properties": {
"name": {
"type": "text",
"search_analyzer": "synonym_analyzer"
}
}
}
CodePudding user response:
I found the solution.
I had to create the index with specific settings (instead of using the index_analyze function).
settings <- '
{
"settings": {
"index": {
"analysis": {
"filter": {
"my_graph_synonyms": {
"type": "synonym_graph",
"synonyms": [
"mind, brain",
"brain storm, brainstorm, envisage"
]
}
},
"analyzer": {
"my_index_time_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"stemmer"
]
},
"my_search_time_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"stemmer",
"my_graph_synonyms"
]
}
}
}
}
},
"mappings": {
"properties": {
"sentence": {
"type": "text",
"analyzer": "my_index_time_analyzer",
"search_analyzer": "my_search_time_analyzer"
}
}
}
}'
index_create(connection, "test", body = settings)
Using the example shared by Alexander Marquardt.
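For completeness, here is a rough end-to-end sketch of how the settings above might be combined with the indexing and search steps from the question. It is only a sketch under some assumptions: the helper name run_synonym_demo, the variable names, the use of jsonlite to build the body, the addition of smart to the synonym list, and the two-second pause before searching are all mine and not part of the original example. It also assumes a local Elasticsearch instance reachable through connect().

```r
library(elastic)
library(jsonlite)

# Assumption: "smart" is added so that all three sentences from the question can match.
synonym_rules <- c("mind, brain, smart")

# Same structure as the settings above, built from R lists and serialized once.
# I() keeps the one-element synonym vector as a JSON array despite auto_unbox.
index_body <- as.character(toJSON(list(
  settings = list(index = list(analysis = list(
    filter = list(
      my_graph_synonyms = list(type = "synonym_graph", synonyms = I(synonym_rules))
    ),
    analyzer = list(
      my_index_time_analyzer = list(
        tokenizer = "standard",
        filter = c("lowercase", "stemmer")
      ),
      my_search_time_analyzer = list(
        tokenizer = "standard",
        filter = c("lowercase", "stemmer", "my_graph_synonyms")
      )
    )
  ))),
  mappings = list(properties = list(
    sentence = list(
      type = "text",
      analyzer = "my_index_time_analyzer",
      search_analyzer = "my_search_time_analyzer"
    )
  ))
), auto_unbox = TRUE))

# Hypothetical helper: recreates the index, loads the sentences, runs the phrase query.
run_synonym_demo <- function(connection, index = "test") {
  # Analysis settings have to be supplied when the index is created, so start fresh.
  if (index_exists(connection, index)) index_delete(connection, index)
  index_create(connection, index, body = index_body)

  docs <- data.frame(
    sentence = c("This is a brain", "This is a mind", "This is fun", "This is smart"),
    stringsAsFactors = FALSE
  )
  docs_bulk(connection, docs, index)

  # Crude wait so the new documents become searchable (assumes default refresh interval).
  Sys.sleep(2)

  query <- '{
    "query": {
      "match_phrase": {
        "sentence": { "query": "this mind", "slop": 100 }
      }
    }
  }'
  res <- Search(connection, index, body = query, asdf = TRUE)

  # With asdf = TRUE the hits come back as a flattened data frame.
  res$hits$hits[["_source.sentence"]]
}
```

Calling run_synonym_demo(connect()) should then return the sentences matched by the query. Because the synonyms are only applied by the search analyzer, the synonym list can be changed without reindexing the documents, although the index settings themselves still need to be updated.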