Manticore search fails indexing empty set on csvpipe/tsvpipe

Time:07-23

I am using the Manticore Search engine (forked from Sphinx). I am setting up a pair of indexes implementing the main+delta approach. The delta index is updated using tsvpipe.

source postings_source_delta
{
  type = tsvpipe
  tsvpipe_command = bash /opt/get-delta.sh 2>/var/log/manticore/delta_index_error.log
  tsvpipe_field = content
  tsvpipe_attr_string = mongoId
}

The get-delta.sh script outputs a TSV with the items most recently added to the database. The problem is that when there are no new items, the TSV output is empty, and in that case indexer fails with this error:

ERROR: index 'postings_index_delta': source 'postings_source_delta': read error 'Inappropriate ioctl for device'.

This makes indexing with tsv/csv unreliable. Is there a way to solve this problem?

Answer:

In general (for all source types), Manticore doesn't allow creating empty plain indexes, but there's a trick: you can do it using a mysql source whose query returns no rows:

source min {
    type = mysql
    sql_host = localhost
    sql_user = test
    sql_pass =
    sql_db = test
    sql_query = select 1, 'dog' Doc, 1 group_id, 'red' color, 3.5 size from t where 1=0
    sql_field_string = doc
    sql_attr_uint = group_id
    sql_attr_string = color
    sql_attr_float = size
}

will give you:

Manticore 5.0.2 348514c@220530 dev (columnar 1.15.4 2fef34e@220522)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2022, Manticore Software LTD (https://manticoresearch.com)

using config file 'min_sql_empty.conf'...
indexing index 'idx'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.137 sec, 0 bytes/sec, 0.00 docs/sec
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 10 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
rotating indices: successfully sent SIGHUP to searchd (pid=1742606).

So you could check whether your TSV command returns anything and, if it doesn't, use this trick.
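A minimal sketch of that check, under a few assumptions that are not part of the setup above: the delta is first dumped to a file (DELTA_TSV, a hypothetical path), and there are two hypothetical config files that both define postings_index_delta, one using tsvpipe with "tsvpipe_command = cat" on that file and one using the empty-result mysql source. Adjust the names and paths to your installation.

```shell
#!/usr/bin/env bash
# Sketch only: all three paths below are hypothetical.
DELTA_TSV="/var/tmp/postings_delta.tsv"             # dump of get-delta.sh
TSV_CONF="/etc/manticoresearch/delta_tsv.conf"      # tsvpipe-backed index
EMPTY_CONF="/etc/manticoresearch/delta_empty.conf"  # mysql "where 1=0" index

# Print the config to use for the given TSV file: the tsvpipe config if the
# file exists and is non-empty, otherwise the empty-index config.
pick_config() {
  tsv_file="$1"
  if [ -s "$tsv_file" ]; then
    printf '%s\n' "$TSV_CONF"
  else
    printf '%s\n' "$EMPTY_CONF"
  fi
}

# Dump the delta once, then rebuild the delta index with the chosen config.
# The tsvpipe config is assumed to read DELTA_TSV rather than re-run the script.
reindex_delta() {
  bash /opt/get-delta.sh >"$DELTA_TSV" 2>/var/log/manticore/delta_index_error.log
  indexer -c "$(pick_config "$DELTA_TSV")" postings_index_delta --rotate
}
```

Calling reindex_delta from cron in place of the plain indexer call would then build the index from the TSV when there are rows and fall back to the empty index otherwise. Dumping to a file first means the script runs only once per cycle, so the emptiness check and the indexing see the same data.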

It's recommended to use an RT index instead.

UPDATE

xmlpipe2 can also build an empty plain index, e.g.

snikolaev@dev:~$ cat xml_empty.conf
source min {
  type = xmlpipe2
  xmlpipe_command = cat xml_empty
}

index idx {
  path = idx/xml_empty
  source = min
}

searchd {
    listen = 9315:mysql41
    log = manticore.log
    pid_file = 9315.pid
    binlog_path =
}

snikolaev@dev:~$ cat xml_empty
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset xmlns:sphinx="http://sphinxsearch.com/">
<sphinx:schema>
    <sphinx:attr name="a" type="int" />
    <sphinx:field name="f" />
</sphinx:schema>
</sphinx:docset>

will give:

snikolaev@dev:~$ indexer -c xml_empty.conf --all
Manticore 5.0.2 348514c@220530 dev (columnar 1.15.4 2fef34e@220522)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2022, Manticore Software LTD (https://manticoresearch.com)

using config file 'xml_empty.conf'...
indexing index 'idx'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.112 sec, 0 bytes/sec, 0.00 docs/sec
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 8 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg