Ruby: Unique array of hashes with respecting the highest version

Time:10-30

I'm trying to create a new array of hashes that contains only unique entries, keeping the highest version whenever a hash is repeated. The array looks like the following:

old_hash = [
{"dependency"=>"websocket", "version"=>"2.8.0", "repo"=>"repo1"},
{"dependency"=>"rails", "version"=>"6.2.0", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.0.3.5", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.0.2", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
{"dependency"=>"rails", "version"=>"6.1.0", "repo"=>"repo3"},
{"dependency"=>"metasploit", "version"=>"2.8.0", "repo"=>"repo3"}
]

As you can see, the third, fourth, and fifth hashes have the same value for the key dependency (httparty) and the same repo (repo2), but the fifth hash has the highest version of the three. Therefore, I'd like to create a unique array that contains the first, second, fifth, sixth, and seventh hashes. The result I'm trying to get should look like this:

unique_hash = [
{"dependency"=>"websocket", "version"=>"2.8.0", "repo"=>"repo1"},
{"dependency"=>"rails", "version"=>"6.2.0", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
{"dependency"=>"rails", "version"=>"6.1.0", "repo"=>"repo3"},
{"dependency"=>"metasploit", "version"=>"2.8.0", "repo"=>"repo3"}
]

Regarding the version comparison, I'm thinking of using this method so that the versions are compared correctly:

def version_greater?(version1, version2)
  Gem::Version.new(version1) > Gem::Version.new(version2)
end

which returns true in case version1 is greater than version2.
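
A minimal sketch of why Gem::Version is a safer choice here than comparing the raw strings. The version strings "6.10.0" and "6.9.0" are made up for illustration and do not appear in the data above.

```ruby
require "rubygems" # provides Gem::Version; usually loaded by default

def version_greater?(version1, version2)
  Gem::Version.new(version1) > Gem::Version.new(version2)
end

# Hypothetical versions chosen to show where the two comparisons disagree.
# String comparison is character by character, so "1" < "9" decides it.
string_result = "6.10.0" > "6.9.0"

# Gem::Version compares segment by segment as numbers, so 10 > 9.
version_result = version_greater?("6.10.0", "6.9.0")

puts string_result   # false
puts version_result  # true
```

For the versions in the question both comparisons happen to agree, because no segment has more than one digit.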

I would appreciate any suggestions that help solve this problem.

CodePudding user response:

One approach is to use the form of Hash#update (aka merge!) that takes a block to determine the value of each key that is present in both hashes being merged. Here the block keeps whichever of the two hashes has the higher version.

old_hash = [
  {"dependency"=>"websocket",  "version"=>"2.8.0",   "repo"=>"repo1"},
  {"dependency"=>"rails",      "version"=>"6.2.0",   "repo"=>"repo2"},
  {"dependency"=>"httparty",   "version"=>"6.0.3.5", "repo"=>"repo2"},
  {"dependency"=>"httparty",   "version"=>"6.1.0.2", "repo"=>"repo2"},
  {"dependency"=>"httparty",   "version"=>"6.1.3.2", "repo"=>"repo2"},
  {"dependency"=>"rails",      "version"=>"6.1.0",   "repo"=>"repo3"},
  {"dependency"=>"metasploit", "version"=>"2.8.0",   "repo"=>"repo3"},
  {"dependency"=>"rails",      "version"=>"6.1.9",   "repo"=>"repo2"}
]

Note that I've added a hash (the last one) to the old_hash shown in the question. (Incidentally, "old_hash" is perhaps not the best name for an array.)

old_hash.each_with_object({}) do |g,h|
  h.update([g["dependency"],g["repo"]]=>g) do |_,o,n|
    # Compare as versions rather than as strings, so that a version
    # such as "6.10.0" would rank above "6.9.0".
    Gem::Version.new(n["version"]) > Gem::Version.new(o["version"]) ? n : o
  end
end.values
  #=> [{"dependency"=>"websocket",  "version"=>"2.8.0",   "repo"=>"repo1"},
  #    {"dependency"=>"rails",      "version"=>"6.2.0",   "repo"=>"repo2"},
  #    {"dependency"=>"httparty",   "version"=>"6.1.3.2", "repo"=>"repo2"},
  #    {"dependency"=>"rails",      "version"=>"6.1.0",   "repo"=>"repo3"},
  #    {"dependency"=>"metasploit", "version"=>"2.8.0",   "repo"=>"repo3"}]

The receiver of values, that is, the hash built by each_with_object, can be seen to be the following.

  {["websocket", "repo1"] =>{"dependency"=>"websocket",  "version"=>  "2.8.0", "repo"=>"repo1"},
   ["rails", "repo2"]     =>{"dependency"=>"rails",      "version"=>  "6.2.0", "repo"=>"repo2"},
   ["httparty", "repo2"]  =>{"dependency"=>"httparty",   "version"=>"6.1.3.2", "repo"=>"repo2"},
   ["rails", "repo3"]     =>{"dependency"=>"rails",      "version"=>  "6.1.0", "repo"=>"repo3"},
   ["metasploit", "repo3"]=>{"dependency"=>"metasploit", "version"=>  "2.8.0", "repo"=>"repo3"}}

Consult the documentation for Hash#update for descriptions of the three block variables: _ (the common key, here an underscore to signal that it is not used in the block calculation), o, the value of the common key in the hash being constructed (think "old"), and n, the value of the common key in the hash being merged (think "new").
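
The role of the three block variables may be easier to see on a toy hash. This is only a sketch: the totals hash and its item names are invented for illustration and are unrelated to the dependency data.

```ruby
# Illustrative data only: counts keyed by item name.
totals = { "apples" => 3 }

# No collision: "pears" is a new key, so the block is never called
# and the value 5 is stored as is.
totals.update("pears" => 5) { |_key, old_val, new_val| old_val + new_val }

# Collision on "apples": the block receives the key, the value already
# in the receiver (3) and the value being merged in (4). Its return
# value becomes the stored value.
totals.update("apples" => 4) { |_key, old_val, new_val| old_val + new_val }

puts totals["apples"]  # 7
puts totals["pears"]   # 5
```

In the answer above the block plays the same role, except that it picks one of the two hashes instead of adding numbers.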

CodePudding user response:

The problem was solved by using:

old_hash.group_by {|h| h.values_at("dependency","repo")}.map {|_,v| v.max_by {|h| Gem::Version.new(h["version"])}}

Thanks to @engineersmnky.
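
For readers who find the one-liner dense, here is the same group_by / max_by idea split into two steps. It is a sketch on a shortened copy of the data; the variable names deps, groups, and unique are chosen here for illustration.

```ruby
require "rubygems" # provides Gem::Version; usually loaded by default

# Shortened, illustrative copy of the array from the question.
deps = [
  {"dependency"=>"httparty", "version"=>"6.0.3.5", "repo"=>"repo2"},
  {"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
  {"dependency"=>"rails",    "version"=>"6.1.0",   "repo"=>"repo3"}
]

# Step 1: bucket the hashes by their [dependency, repo] pair.
# The result has two keys: ["httparty", "repo2"] and ["rails", "repo3"].
groups = deps.group_by { |h| h.values_at("dependency", "repo") }

# Step 2: from each bucket keep the hash with the highest version,
# comparing with Gem::Version rather than as plain strings.
unique = groups.map do |_key, hashes|
  hashes.max_by { |h| Gem::Version.new(h["version"]) }
end

unique.each { |h| puts h.values_at("dependency", "version", "repo").join(" ") }
# prints:
# httparty 6.1.3.2 repo2
# rails 6.1.0 repo3
```

Because group_by keeps keys in the order they are first seen, the result follows the order in which each dependency/repo pair first appears in the input.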
