Data structure for indirect symmetrical synonym

Time:02-15

I have a set of synonym pairs, for example:

big large
large huge
small little
apple banana

This means big is a synonym for large, large is a synonym for huge, small for little, and apple for banana, and each relation also holds in reverse (large is a synonym for big, etc.). In addition, "big" is a synonym for "huge" and "huge" is a synonym for "big" because of the indirect relationship via "large".

This should be something like a thesaurus, but I'm not sure what the data structure should look like.

CodePudding user response:

"Many different aspects of language have a natural representation as graphs. Graphs can also be used to describe how words relate to one another semantically. Within each word class, words are grouped into sets of synonyms, so-called synsets." - according to this article.

So, for example, the synset for the word 'banana' is (elongated crescent-shaped yellow fruit with soft sweet flesh) according to WordNet. Synsets are linked to one another by semantic relationships, so you can find a semantically similar synset for the word 'apple' (fruit with red or yellow or green skin and sweet to tart crisp whitish flesh).

You can use this Ruby gem to build a graph from the WordNet database.

CodePudding user response:

One simple option would be an array of arrays like:

[
  ['big', 'large', 'huge'],
  ['small', 'little']
]
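
If the input arrives as pairs, as in the question, the groups can be built rather than typed by hand. Here is a minimal sketch using a disjoint-set (union-find) structure; the class name SynonymSets and its methods are made up for illustration, not taken from any library:

```ruby
# Hypothetical helper: merges word pairs into synonym groups (union-find).
class SynonymSets
  def initialize
    @parent = {} # word => parent word; a root points to itself
  end

  # Returns the representative word of the group containing `word`,
  # registering the word as its own group if it is new.
  def find(word)
    @parent[word] ||= word
    word = @parent[word] until @parent[word] == word
    word
  end

  # Declares a and b synonyms by merging their groups.
  def add(a, b)
    @parent[find(a)] = find(b)
  end

  # True if the two words ended up in the same group,
  # whether directly or through a chain of pairs.
  def synonyms?(a, b)
    find(a) == find(b)
  end

  # All groups as an array of arrays.
  def groups
    @parent.keys.group_by { |w| find(w) }.values
  end
end

sets = SynonymSets.new
pairs = [%w[big large], %w[large huge], %w[small little], %w[apple banana]]
pairs.each { |a, b| sets.add(a, b) }

puts sets.synonyms?('big', 'huge')  # indirect, via 'large'
puts sets.synonyms?('big', 'small')
p sets.groups
```

Because merging is symmetric and transitive by construction, "big" and "huge" land in the same group without the pair ever being listed explicitly.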

Alternatively, if e.g. huge is not a synonym of big in your model, you might want a hash like:

{
  big: ['large'],
  large: ['big', 'huge'],
  huge: ['large'],
  small: ['little', 'tiny'],
  little: ['small'],
  ...
}
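
With the hash form, indirect synonyms can still be recovered on demand by walking the links. A rough sketch, assuming string keys and only the entries shown above (the constant DIRECT and the method all_synonyms are illustrative names):

```ruby
require 'set'

# Assumed sample data: direct links only, limited to the entries listed above.
DIRECT = {
  'big'    => ['large'],
  'large'  => ['big', 'huge'],
  'huge'   => ['large'],
  'small'  => ['little', 'tiny'],
  'little' => ['small']
}.freeze

# Breadth-first walk over the direct links, collecting every word
# reachable from `word` (the word itself is excluded from the result).
def all_synonyms(graph, word)
  seen = Set[word]
  queue = [word]
  until queue.empty?
    current = queue.shift
    graph.fetch(current, []).each do |neighbor|
      # Set#add? returns nil when the element is already present
      queue << neighbor if seen.add?(neighbor)
    end
  end
  seen.delete(word).to_a
end

p all_synonyms(DIRECT, 'big')
p all_synonyms(DIRECT, 'small')
```

This keeps direct and indirect relationships distinguishable: the hash stores only direct links, and the walk decides how far to follow them.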

It really depends on what you plan to do with it.
