Question: the data volume is 40 GB, which is too big; holding the index in memory is unrealistic. Right now the index is written to local disk first and then synchronized to HDFS, and at query time it is first pulled from HDFS back to local disk, so the efficiency is certainly not high. Begging the experts for any good solution.
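For reference, the sync-and-pull cycle described above looks roughly like the following minimal sketch, assuming the Hadoop FileSystem Java API and a recent Lucene; every path in it is a placeholder, not taken from the original post.

import java.nio.file.Paths;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.FSDirectory;

public class IndexSync {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // 1. Push the locally built Lucene index directory up to HDFS.
        fs.copyFromLocalFile(new Path("/data/lucene/index"),
                             new Path("/backup/lucene/index"));

        // 2. At query time, pull the index back down to local disk ...
        fs.copyToLocalFile(new Path("/backup/lucene/index"),
                           new Path("/data/lucene/index-copy"));

        // 3. ... and open it through an ordinary on-disk Directory.
        try (DirectoryReader reader = DirectoryReader.open(
                FSDirectory.open(Paths.get("/data/lucene/index-copy")))) {
            System.out.println("docs in index: " + reader.numDocs());
        }
    }
}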
CodePudding user response:
I wrote code for this before: the Lucene index files were generated by MapReduce and then served through Solr. Solr can use HDFS as its storage.
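A sketch of that MapReduce-side piece, in case it helps: each reducer builds a Lucene index shard on its local disk and ships it to HDFS in cleanup(). The class, field, and path names here are illustrative assumptions, not the original code.

import java.io.IOException;
import java.nio.file.Files;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class LuceneShardReducer extends Reducer<Text, Text, NullWritable, NullWritable> {
    private java.nio.file.Path localDir; // scratch space on the task's local disk
    private IndexWriter writer;

    @Override
    protected void setup(Context ctx) throws IOException {
        localDir = Files.createTempDirectory("lucene-shard");
        writer = new IndexWriter(FSDirectory.open(localDir),
                new IndexWriterConfig(new StandardAnalyzer()));
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx) throws IOException {
        for (Text value : values) {
            Document doc = new Document();
            doc.add(new StringField("id", key.toString(), Field.Store.YES));
            doc.add(new TextField("text", value.toString(), Field.Store.NO));
            writer.addDocument(doc);
        }
    }

    @Override
    protected void cleanup(Context ctx) throws IOException {
        writer.close(); // flush and commit the shard
        // Ship the finished shard to HDFS; one directory per reducer.
        FileSystem fs = FileSystem.get(ctx.getConfiguration());
        fs.copyFromLocalFile(new Path(localDir.toString()),
                new Path("/indexes/shard-" + ctx.getTaskAttemptID().getTaskID().getId()));
    }
}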
CodePudding user response:
You could also look into Elasticsearch; its distributed clustering is not as troublesome as Hadoop's. It is another third-party open-source solution built on Lucene and is worth trying. As I said, I have generated Lucene indexes with MR before; contact me if you need details (2012).
CodePudding user response:
Thank you very much for providing a solution. I still have to study the implementation slowly, but at least I have a direction now. THX
CodePudding user response:
Can Solr access index files on HDFS directly?
CodePudding user response:
I don't know whether this solves the same problem, but I write the index to HDFS first and then pull it to local disk. Oddly, writing directly to HDFS produces fewer segment files than writing directly to local disk; I don't know why that is.
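If you want to skip the local-disk step entirely, one option (an assumption on my part, not something tested here) is to open an HDFS-backed Lucene Directory such as Solr's org.apache.solr.store.hdfs.HdfsDirectory and let IndexWriter write straight to HDFS. The two-argument constructor and the namenode URL below are guesses, so check the signature in your Solr version.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.solr.store.hdfs.HdfsDirectory;

public class DirectHdfsWrite {
    public static void main(String[] args) throws Exception {
        // Directory implementation that reads and writes HDFS directly;
        // the path and namenode address are placeholders.
        HdfsDirectory dir = new HdfsDirectory(
                new Path("hdfs://namenode:8020/indexes/direct"),
                new Configuration());
        try (IndexWriter writer = new IndexWriter(dir,
                new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new TextField("text", "written straight to HDFS", Field.Store.NO));
            writer.addDocument(doc);
        }
        dir.close();
    }
}

Solr itself wires this up through HdfsDirectoryFactory in solrconfig.xml, which may also answer the earlier question about Solr reading HDFS directly.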