How to make Neptune Search more lenient

Time:12-16

I have some entries inside the graph that I am searching (e.g. hello_world, foo_bar_baz) and I want to be able to search "hello" and get hello_world back.

Currently, I only get a result if I search for the entire string (i.e. searching for hello_world or foo_bar_baz).

This seems to be due to Elasticsearch's standard analyzer, which keeps underscore-joined words such as hello_world as a single token, but I don't know how to change this behaviour with Neptune.

with neptune_graph() as g:
    my_query = " OR ".join(
        f"predicates.{field}.value:({query})" for field in ['names', 'spaces']
    )

    search_results = (
        g.withSideEffect(
            "Neptune#fts.endpoint", f"https://{neptuneSearchURL}"
        )
        .withSideEffect("Neptune#fts.queryType", "query_string")
        .withSideEffect("Neptune#fts.sortOrder", "DESC")
        .V()
        .hasLabel("doc")
        .has(
            "*",
            f"Neptune#fts entity_type:table AND ({my_query})",
        )
    )

CodePudding user response:

One way is to use a wildcard in the search term.

Given:

g.addV('search-test').property('name','Hello_World')

v[0ebedfda-a9bd-e320-041a-6e98da9b1379]

Assuming the search integration is all in place, after the search index has been updated, the following will find the vertex:

g.withSideEffect("Neptune#fts.endpoint",
                 "https://vpc-neptune-xxx-abc123.us-east-1.es.amazonaws.com").
  withSideEffect('Neptune#fts.queryType', 'query_string').
  V().
  has('name','Neptune#fts hello*').
  elementMap().
  unfold()

Which yields

{<T.id: 1>: '0ebedfda-a9bd-e320-041a-6e98da9b1379'}
{<T.label: 4>: 'search-test'}
{'name': 'Hello_World'}
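
If the search term comes from user input, the wildcard can be appended in code before the query is handed to Neptune. The following is only a sketch: prefix_term and build_fts_query are hypothetical helper names, the field names (names, spaces) and the entity_type:table filter are taken from the question, and the escaped characters are the reserved characters of query_string syntax.

```python
import re

# Characters that carry special meaning in query_string syntax and can be escaped.
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def prefix_term(word):
    """Escape one word and append '*' so it matches as a prefix."""
    word = re.sub(r"[<>]", "", word)  # '<' and '>' cannot be escaped, so drop them
    return _RESERVED.sub(r"\\\1", word) + "*"


def build_fts_query(user_input, fields=("names", "spaces")):
    """Build the Neptune#fts string used in has("*", ...) in the question."""
    terms = " ".join(prefix_term(w) for w in user_input.split())
    clauses = " OR ".join(f"predicates.{f}.value:({terms})" for f in fields)
    return f"Neptune#fts entity_type:table AND ({clauses})"


print(build_fts_query("hello"))
```

The resulting string can be passed as the second argument of has("*", ...) in the traversal from the question. Escaping matters because a raw user term such as foo:bar would otherwise be parsed as a field reference.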

CodePudding user response:

The problem I was having was indeed the analyzer, except I didn't understand how to fix it until now.
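
To illustrate why the analyzer matters, here is a rough pure-Python approximation of the two tokenization behaviours. It is not Elasticsearch's implementation: standard_like and lowercase_like are made-up names, and the regular expressions only mimic how the standard analyzer and the lowercase tokenizer treat underscores.

```python
import re


def standard_like(text):
    """Approximate the standard analyzer: '_' does not break a word."""
    return re.findall(r"\w+", text.lower())


def lowercase_like(text):
    """Approximate the lowercase tokenizer: split on anything that is not a letter."""
    return re.findall(r"[^\W\d_]+", text.lower())


print(standard_like("hello_world"))   # one token, so "hello" alone does not match
print(lowercase_like("hello_world"))  # two tokens, so "hello" matches
```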

When the Elasticsearch index is first created, you need to specify the analysis settings you want; the default analyzer cannot simply be swapped out later.

The solution was to create the index with a custom analyzer:

with neptune_search() as es:
    # Custom default analyzer: the "lowercase" tokenizer splits on every
    # non-letter character (including "_") and lowercases each token.
    # Keys and values must be quoted strings.
    index_body = {
        "settings": {"analysis": {"analyzer": {"default": {
            "type": "custom",
            "tokenizer": "lowercase",
        }}}}
    }
    es.indices.create(index="my_index", body=index_body)

    es.index(index="my_index", ... other stuff)
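
To check that the new analyzer is in effect, the analyze API can be asked which tokens it produces for a sample string. A minimal sketch, assuming a 7.x-style elasticsearch-py client that accepts the request as body (as the create call above does); analyzed_tokens is a hypothetical helper name.

```python
def analyzed_tokens(es, index, text):
    """Return the tokens the index's default analyzer produces for `text`."""
    # With no analyzer or field given, the analyze API falls back to the
    # default analyzer of the named index.
    response = es.indices.analyze(index=index, body={"text": text})
    return [t["token"] for t in response["tokens"]]
```

With the custom analyzer from above, analyzed_tokens(es, "my_index", "hello_world") should list hello and world as separate tokens, which is what allows a search for hello to match.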