Find multiple longest common prefixes from a list of strings

I'm trying to find all possible prefixes from a list of strings. A trailing "/" or ":" can be removed from the prefixes to make them more readable.

input = ["item1", "item2", "product1", "product2", "variant:123", "variant:789"]

Expected output

item
product
variant

CodePudding user response：

The key here is to identify your delimiters. Judging by your input, they are digits, ":" and "/". You should be able to map over the array and use those delimiters in a regex to strip everything but the prefix. You also have the option to check that each result actually occurs as a prefix in the array (so you truly know that it's a prefix), but I didn't include that in my answer.

input = ["item1", "item2", "product1", "product2", "variant:123", "variant:789"]

prefixes = input.map {|word| word.gsub(/:?\/?[0-9]/, '')}.uniq

=> ["item", "product", "variant"]

If you have more delimiters, you can append them to the regex pattern. It's worth doing a little reading about regex wildcards :-)
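
As a rough sketch of how the delimiter list and the optional "is it really shared?" check could be combined: the names DELIMITERS and shared_prefixes below are hypothetical, not part of the answer above, and the sketch assumes that a prefix only counts when at least two strings reduce to it. It also assumes Ruby 2.7 or later for Enumerable#tally, and it strips only a trailing run of digits rather than every digit in the word.

```ruby
# Assumed delimiter set; add more single-character delimiters as needed.
DELIMITERS = [":", "/"].freeze

# Hypothetical helper: strips a trailing run of digits (and one optional
# delimiter in front of it), then keeps only the prefixes that at least
# two of the input strings reduce to.
def shared_prefixes(words, delimiters = DELIMITERS)
  pattern = /(?:#{Regexp.union(delimiters)})?\d+\z/
  words.map { |word| word.sub(pattern, "") }
       .tally                             # prefix => number of occurrences
       .select { |_, count| count > 1 }   # drop prefixes seen only once
       .keys
end

input = ["item1", "item2", "product1", "product2",
         "variant:123", "variant:789", "single9"]

p shared_prefixes(input)
  #=> ["item", "product", "variant"]
```

Here "single9" reduces to "single", which occurs only once and is therefore dropped. Whether that filtering is wanted depends on what "prefix" should mean for your data.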

Hope this answers your question!

CodePudding user response：

I assume the order in which the prefixes are returned is unimportant. Also, I have disregarded the treatment of the characters "/" and ":" because that is straightforward and a distraction from the central problem.

Let's first create a helper method whose sole argument is an array of words that begin with the same letter.

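def longest_prefix(arr)
  a0, *a = arr    # a0 is the first word, a holds the remaining words
  return a0 if a0.size == 1 || arr.size == 1
  # n is the first index at which some other word differs from a0
  n = (1..a0.size-1).find do |i|
    c = a0[i]
    a.any? { |w| w[i] != c }
  end
  # no mismatch found means a0 itself is the common prefix
  n.nil? ? a0 : a0[0,n]
end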

For example,

arr = ["dog1", "doggie", "dog2", "doggy"] 
longest_prefix arr
  #=> "dog"

We now merely need to group the words by their first letters, then pass each group (the value of each key-value pair) to the helper method and collect the results.

def prefixes(arr)
  arr.group_by { |w| w[0] }.map { |_,a| longest_prefix(a) }
end

Suppose, for example,

arr = ["dog1", "eagles", "eagle", "doggie", "dog2", "catsup",
       "cats", "elephant", "cat", "doggy", "caustic"]

Then

prefixes arr
  #=> ["dog", "e", "ca"]

Note that

arr.group_by { |w| w[0] }
  #=> { "d"=>["dog1", "doggie", "dog2", "doggy"],
  #     "e"=>["eagles", "eagle", "elephant"],
  #     "c"=>["catsup", "cats", "cat", "caustic"] }

See Enumerable#group_by.
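
For completeness, here is a minimal sketch of how this approach might be applied to the input from the question, including the "/" and ":" cleanup that was set aside above. The wrapper name readable_prefixes is hypothetical, and the sketch assumes the delimiters only need to be removed from the end of each prefix. The two methods from the answer are repeated so the example runs on its own.

```ruby
# Helper from the answer above: longest common prefix of words that
# share their first letter.
def longest_prefix(arr)
  a0, *a = arr
  return a0 if a0.size == 1 || arr.size == 1
  n = (1..a0.size - 1).find do |i|
    c = a0[i]
    a.any? { |w| w[i] != c }
  end
  n.nil? ? a0 : a0[0, n]
end

def prefixes(arr)
  arr.group_by { |w| w[0] }.map { |_, a| longest_prefix(a) }
end

# Hypothetical wrapper: removes any trailing ":" or "/" from each prefix.
def readable_prefixes(arr)
  prefixes(arr).map { |prefix| prefix.sub(%r{[:/]+\z}, "") }.uniq
end

input = ["item1", "item2", "product1", "product2", "variant:123", "variant:789"]

p prefixes(input)
  #=> ["item", "product", "variant:"]
p readable_prefixes(input)
  #=> ["item", "product", "variant"]
```

Note that because the words are grouped by first letter, any two words starting with the same letter end up in the same group, which is why "eagles", "eagle" and "elephant" collapse to "e" in the earlier example.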