Question about how to implement string matching

Time:09-16

I'd like to ask the experts here how to implement the following operation:

For example, suppose several groups of words make up a word library:
Color category: red, black, white
Size category: long, short, wide
Quality category: solid, shaking

I have a statement: "I want a red table, wide and solid."
Running this statement against the keyword library should return the word categories it hits: color, size, and quality.

What I want to achieve:
For example, I have a text file containing thousands of statements and a keyword library of dozens of words grouped into categories. I want to run the statements against the library and end up with the word frequency distribution per category.

Can this be implemented?
If the goal is the highest efficiency (using only a text file and a thesaurus file), how should it be implemented?

CodePudding user response:

Efficiency is certainly achievable. The key techniques are:
1. String replacement: remove a keyword from the text and compare the length before and after. The length difference, divided by the keyword's length, is the frequency of that item within its category.
2. Regular expressions: search for the keyword, and the length of the resulting list of matches is its frequency.
3. jieba word segmentation, although it will not necessarily produce the items in your categories as tokens.
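
Points 1 and 2 above could be sketched roughly as follows. This is only a minimal illustration, not the poster's own code: the function names `count_by_replace` and `count_by_findall` and the sample text are made up for the example. Note that both variants count plain substrings, so a keyword is also counted when it appears inside a longer word.

```python
import re

def count_by_replace(text, word):
    # Point 1: deleting every occurrence shortens the text by
    # len(word) per occurrence, so the length difference gives the count.
    return (len(text) - len(text.replace(word, ""))) // len(word)

def count_by_findall(text, word):
    # Point 2: re.findall returns a list of non-overlapping matches;
    # its length is the count. re.escape guards against regex metacharacters.
    return len(re.findall(re.escape(word), text))

# Hypothetical sample text for illustration only.
text = "red table, red chair, white wall"
red_by_replace = count_by_replace(text, "red")
red_by_findall = count_by_findall(text, "red")
```

Both approaches should agree on simple inputs like this one; the regex variant is easier to extend later (for example with word boundaries or several keywords in one pattern).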

CodePudding user response:

For point 3, you can try adding a custom dictionary.

CodePudding user response:

This can be done easily with re + Counter: use re to match each keyword, then use Counter to tally the word frequencies. Combined with map + asynchronous processing, it should run fast.
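
A minimal sketch of the re + Counter idea, under some assumptions: the thesaurus is held in a dict named `THESAURUS` (a hypothetical name, filled with the categories from the question), all keywords are combined into one compiled pattern so each line is scanned once, and `\b` word boundaries are used because the sample is English. For Chinese text the `\b` anchors would probably need to be dropped. The map + asynchronous part is not shown here.

```python
import re
from collections import Counter

# Hypothetical thesaurus: category -> keywords (from the question's example).
THESAURUS = {
    "color": ["red", "black", "white"],
    "size": ["long", "short", "wide"],
    "quality": ["solid", "shaking"],
}

def build_matcher(thesaurus):
    # Reverse lookup: keyword -> category.
    word_to_category = {w: cat for cat, words in thesaurus.items() for w in words}
    # Longest keywords first, so a longer keyword wins over a shorter prefix.
    ordered = sorted(word_to_category, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    # Assumption: whole-word matching is wanted (suits English text).
    pattern = re.compile(r"\b(?:" + alternation + r")\b")
    return pattern, word_to_category

def category_frequencies(lines, thesaurus):
    pattern, word_to_category = build_matcher(thesaurus)
    counts = Counter()
    for line in lines:
        # Map every matched keyword to its category and tally it.
        counts.update(word_to_category[m] for m in pattern.findall(line))
    return counts

# Hypothetical sample statements for illustration only.
statements = [
    "I want a red table, wide and solid",
    "a black chair with short legs",
]
result = category_frequencies(statements, THESAURUS)
```

Because `category_frequencies` accepts any iterable of lines, an open text file can be passed in directly, so thousands of statements are processed line by line without loading the whole file into memory.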