I'm trying to create a new array of hashes with unique values, keeping only the highest version among repeated hashes. The array looks like the following:
old_hash = [
{"dependency"=>"websocket", "version"=>"2.8.0", "repo"=>"repo1"},
{"dependency"=>"rails", "version"=>"6.2.0", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.0.3.5", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.0.2", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
{"dependency"=>"rails", "version"=>"6.1.0", "repo"=>"repo3"},
{"dependency"=>"metasploit", "version"=>"2.8.0", "repo"=>"repo3"}
]
As you can see, the third, fourth, and fifth hashes have the same value for the key dependency (httparty) and also for repo (repo2), but the fifth hash has the highest version of the three. Therefore, I'd like to create a unique array that contains the first, second, fifth, sixth, and seventh hashes. The result I'm trying to get should look like this:
unique_hash = [
{"dependency"=>"websocket", "version"=>"2.8.0", "repo"=>"repo1"},
{"dependency"=>"rails", "version"=>"6.2.0", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
{"dependency"=>"rails", "version"=>"6.1.0", "repo"=>"repo3"},
{"dependency"=>"metasploit", "version"=>"2.8.0", "repo"=>"repo3"}
]
For the version comparison, I'm thinking of using this method:
def version_greater?(version1, version2)
  Gem::Version.new(version1) > Gem::Version.new(version2)
end
which returns true if version1 is greater than version2.
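To illustrate why Gem::Version seems like the safer choice here, below is a minimal sketch comparing it with plain string comparison. The method string_greater? is a hypothetical name added only for contrast, and the version numbers are made up for the example.

```ruby
require "rubygems" # provides Gem::Version; normally already loaded

# The method from the question: compares versions segment by segment.
def version_greater?(version1, version2)
  Gem::Version.new(version1) > Gem::Version.new(version2)
end

# Hypothetical counterpart for contrast: compares the raw strings,
# which is character-by-character (lexicographic).
def string_greater?(version1, version2)
  version1 > version2
end

puts version_greater?("6.10.0", "6.9.0") # true:  10 > 9 as numbers
puts string_greater?("6.10.0", "6.9.0")  # false: "1" sorts before "9"
```

The two agree on the sample data above, but they can disagree as soon as one segment has more digits than the other.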
I would appreciate any suggestions that help solve this problem.
CodePudding user response:
One approach is to use the form of Hash#update (aka merge!) that takes a block to determine the values of keys that are present in both hashes being merged. The block used here keeps whichever of the two hashes has the higher version.
old_hash = [
{"dependency"=>"websocket", "version"=>"2.8.0", "repo"=>"repo1"},
{"dependency"=>"rails", "version"=>"6.2.0", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.0.3.5", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.0.2", "repo"=>"repo2"},
{"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
{"dependency"=>"rails", "version"=>"6.1.0", "repo"=>"repo3"},
{"dependency"=>"metasploit", "version"=>"2.8.0", "repo"=>"repo3"},
{"dependency"=>"rails", "version"=>"6.1.9", "repo"=>"repo2"}
]
Note that I've added a hash to the old_hash shown in the question. (Incidentally, "old_hash" is perhaps not the best name for an array.)
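old_hash.each_with_object({}) do |g,h|
  # Key each hash by its [dependency, repo] pair.
  h.update([g["dependency"],g["repo"]]=>g) do |_,o,n|
    # Compare as versions, not as strings, so that "6.10.0" ranks above "6.9.0".
    Gem::Version.new(n["version"]) > Gem::Version.new(o["version"]) ? n : o
  end
end.values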
#=> [{"dependency"=>"websocket", "version"=>"2.8.0", "repo"=>"repo1"},
# {"dependency"=>"rails", "version"=>"6.2.0", "repo"=>"repo2"},
# {"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
# {"dependency"=>"rails", "version"=>"6.1.0", "repo"=>"repo3"},
# {"dependency"=>"metasploit", "version"=>"2.8.0", "repo"=>"repo3"}]
The receiver of values can be seen to be the following.
{["websocket", "repo1"] =>{"dependency"=>"websocket", "version"=> "2.8.0", "repo"=>"repo1"},
["rails", "repo2"] =>{"dependency"=>"rails", "version"=> "6.2.0", "repo"=>"repo2"},
["httparty", "repo2"] =>{"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
["rails", "repo3"] =>{"dependency"=>"rails", "version"=> "6.1.0", "repo"=>"repo3"},
["metasploit", "repo3"]=>{"dependency"=>"metasploit", "version"=> "2.8.0", "repo"=>"repo3"}}
Consult the doc for descriptions of the three block variables: _ (the common key, here an underscore to signal that it is not used in the block calculation), o, the value of the common key in the hash being constructed (think "old"), and n, the value of the common key in the hash being merged (think "new").
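As a smaller illustration of how the block form of Hash#update behaves, here is a sketch on a toy input. The helper name keep_larger and the sample pairs are hypothetical and only serve to show when the block runs.

```ruby
# Hypothetical helper, not part of the answer above: merges [key, value]
# pairs into one hash, keeping the larger value whenever a key repeats.
def keep_larger(pairs)
  pairs.each_with_object({}) do |(key, value), acc|
    acc.update(key => value) do |_key, old, new|
      # This block runs only when `key` is already present in acc.
      # `old` is the value already stored, `new` is the incoming one.
      new > old ? new : old
    end
  end
end

merged = keep_larger([["a", 1], ["b", 5], ["a", 3], ["b", 2]])
puts merged["a"] # 3, because the later 3 replaced the earlier 1
puts merged["b"] # 5, because the later 2 did not beat the stored 5
```

The block is never called the first time a key is seen, which is why the first hash for each [dependency, repo] pair is stored as-is.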
CodePudding user response:
The problem was solved by using:
old_hash.group_by {|h| h.values_at("dependency","repo")}.map {|_,v| v.max_by {|h| Gem::Version.new(h["version"])}}
Thanks to @engineersmnky.
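For readers who find the one-liner dense, here is the same idea split into two steps. This is only a sketch: the method name highest_versions and the shortened sample array are assumptions made for the example.

```ruby
require "rubygems" # provides Gem::Version; normally already loaded

# Hypothetical wrapper name; the body is the one-liner above in two steps.
def highest_versions(deps)
  # Step 1: bucket the hashes by their [dependency, repo] pair.
  groups = deps.group_by { |h| h.values_at("dependency", "repo") }

  # Step 2: from each bucket keep the hash with the highest version.
  groups.map do |_key, same_dep|
    same_dep.max_by { |h| Gem::Version.new(h["version"]) }
  end
end

sample = [
  {"dependency"=>"httparty", "version"=>"6.0.3.5", "repo"=>"repo2"},
  {"dependency"=>"httparty", "version"=>"6.1.3.2", "repo"=>"repo2"},
  {"dependency"=>"rails",    "version"=>"6.1.0",   "repo"=>"repo3"}
]

highest_versions(sample).each do |h|
  puts "#{h["dependency"]} #{h["version"]} (#{h["repo"]})"
end
```

Since group_by keeps groups in the order their keys first appear, the result follows the order of the input array.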