I'm trying to create a hash with one key for each file extension found in a directory. For every key I would like to store two values: the number of files with that extension and the total size of all the files with that extension.
Something similar to this:
{".md" => {"ext_reps" => 6, "ext_size_sum" => 2350}, ".txt" => {"ext_reps" => 3, "ext_size_sum" => 1300}}
But I'm stuck on this step:
hash = Hash.new{|hsh,key| hsh[key] = {}}
ext_reps = 0
ext_size_sum = 0
Dir.glob("/home/computer/Desktop/**/*.*").each do |file|
hash[File.extname(file)].store "ext_reps", ext_reps
hash[File.extname(file)].store "ext_size_sum", ext_size_sum
end
p hash
With this result:
{".md" => {"ext_reps" => 0, "ext_size_sum" => 0}, ".txt" => {"ext_reps" => 0, "ext_size_sum" => 0}}
And I can't find a way to increment ext_reps and ext_size_sum.
Thanks
CodePudding user response:
Suppose the file extensions and sizes are as follows.
files = [{ ext: 'a', size: 10 },
{ ext: 'b', size: 20 },
{ ext: 'a', size: 30 },
{ ext: 'c', size: 40 },
{ ext: 'b', size: 50 },
{ ext: 'a', size: 60 }]
You can use Enumerable#group_by and Hash#transform_values.
files.group_by { |h| h[:ext] }.
transform_values do |arr|
{ "ext_reps"=>arr.size, "ext_size_sum"=>arr.sum { |h| h[:size] } }
end
#=> {"a"=>{"ext_reps"=>3, "ext_size_sum"=>100},
# "b"=>{"ext_reps"=>2, "ext_size_sum"=>70},
# "c"=>{"ext_reps"=>1, "ext_size_sum"=>40}}
Note that the first calculation is as follows.
files.group_by { |h| h[:ext] }
#=> {"a"=>[{:ext=>"a", :size=>10}, {:ext=>"a", :size=>30},
# {:ext=>"a", :size=>60}],
# "b"=>[{:ext=>"b", :size=>20}, {:ext=>"b", :size=>50}],
# "c"=>[{:ext=>"c", :size=>40}]}
Another way is to use the forms of Hash#update (aka Hash#merge!) and Hash#merge that employ a block to compute the values of keys that are present in both hashes being merged. (Ruby does not consult that block when a key-value pair with key k is being merged into the hash being built (h) and h does not have a key k.)
See the docs for an explanation of the three parameters of the block that returns the values of common keys of hashes being merged.
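As a small illustration of those three block parameters, here is a minimal sketch using two made-up hashes (old_counts and new_counts are hypothetical names, not part of the problem). The block is called only for the key the two hashes share.

```ruby
# Two hypothetical hashes that share only the key :a.
old_counts = { a: 1, b: 2 }
new_counts = { a: 10, c: 30 }

# The block receives the common key, the value from the receiver and the
# value from the argument. It is not called for :b or :c.
merged = old_counts.merge(new_counts) { |_key, old_val, new_val| old_val + new_val }

# merged == { a: 11, b: 2, c: 30 }
p merged
```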
files.each_with_object({}) do |g,h|
  h.update(g[:ext]=>{"ext_reps"=>1, "ext_size_sum"=>g[:size]}) do |_k,o,n|
    o.merge(n) { |_kk, oo, nn| oo + nn }
  end
end
#=> {"a"=>{"ext_reps"=>3, "ext_size_sum"=>100},
# "b"=>{"ext_reps"=>2, "ext_size_sum"=>70},
# "c"=>{"ext_reps"=>1, "ext_size_sum"=>40}}
I've chosen names for the common keys of the "outer" and "inner" hashes (_k and _kk, respectively) that begin with an underscore to signal to the reader that they are not used in the block calculation. This is common practice.
Note that this approach avoids the creation of a temporary hash similar to that created by group_by and therefore tends to use less memory than the first approach.
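The sample files array above could be built from a real directory roughly as follows. This is only a sketch: file_records is a hypothetical helper name, and the demo runs against a temporary directory with made-up files rather than the Desktop path from the question.

```ruby
require "tmpdir"

# Hypothetical helper: one { ext:, size: } hash per regular file under dir.
def file_records(dir)
  Dir.glob(File.join(dir, "**", "*.*"))
     .select { |path| File.file?(path) }
     .map { |path| { ext: File.extname(path), size: File.size(path) } }
end

# Demo on a throwaway directory so the sketch does not depend on any real path.
summary = Dir.mktmpdir do |dir|
  File.write(File.join(dir, "a.md"), "12345")
  File.write(File.join(dir, "b.md"), "123")
  File.write(File.join(dir, "c.txt"), "12")

  file_records(dir)
    .group_by { |h| h[:ext] }
    .transform_values do |arr|
      { "ext_reps" => arr.size, "ext_size_sum" => arr.sum { |h| h[:size] } }
    end
end

p summary
```

Note that the size comes from File.size(path), which reads the size on disk. Calling size on the path string would return the length of the file name instead.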
CodePudding user response:
Here is a solution inspired by the answers given by Cary Swoveland and BenFenner:
hash = {}
Dir.glob("/home/computer/Desktop/**/*.*").each do |file|
  # File.size reads the size on disk; file.size would be the length of the path string.
  (hash[File.extname(file)] ||= []) << File.size(file)
end
hash.transform_values! { |sizes| { "ext_reps" => sizes.count, "ext_size_sum" => sizes.sum } }
CodePudding user response:
It's not the most "Ruby-like" solution, but going along with your provided example this is probably what you'd ultimately end up with as a solution. Your main problem was that you were never incrementing the ext_reps value, nor were you ever accumulating the ext_size_sum value.
hash = {}
Dir.glob('/home/computer/Desktop/**/*.*').each do |file|
  file_extension = File.extname(file)
  if hash[file_extension].nil?
    # This is the first time this file extension has been seen, so initialize things for it.
    hash[file_extension] = {}
    hash[file_extension]['ext_reps'] = 0
    hash[file_extension]['ext_size_sum'] = 0
  end
  # Increment/accumulate values.
  hash[file_extension]['ext_reps'] += 1
  # File.size gives the size on disk; file.size would be the length of the path string.
  hash[file_extension]['ext_size_sum'] += File.size(file)
end
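If you would rather keep the Hash.new block from the question, one possible variation is to let the default block create the zeroed counters, so the explicit nil check is not needed. This is a sketch, not the only way to do it: extension_stats is a hypothetical helper name, and the demo uses a temporary directory with made-up files.

```ruby
require "tmpdir"

# Hypothetical helper (not from the question): per-extension count and total bytes.
def extension_stats(dir)
  # The default block stores a fresh counter hash the first time an extension is seen.
  stats = Hash.new { |hsh, key| hsh[key] = { "ext_reps" => 0, "ext_size_sum" => 0 } }
  Dir.glob(File.join(dir, "**", "*.*")).each do |path|
    next unless File.file?(path) # skip directories whose names contain a dot
    entry = stats[File.extname(path)]
    entry["ext_reps"] += 1
    entry["ext_size_sum"] += File.size(path)
  end
  stats
end

# Demo on a throwaway directory so the sketch does not depend on any real path.
stats = Dir.mktmpdir do |dir|
  File.write(File.join(dir, "notes.md"), "abcd")
  File.write(File.join(dir, "todo.md"), "ab")
  File.write(File.join(dir, "log.txt"), "abc")
  extension_stats(dir)
end

p stats
```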
CodePudding user response:
With each_with_object and a nested Hash.new:
files = [{ ext: 'a', size: 10 },
{ ext: 'b', size: 20 },
{ ext: 'a', size: 30 },
{ ext: 'c', size: 40 },
{ ext: 'b', size: 50 },
{ ext: 'a', size: 60 }]
files.each_with_object(Hash.new(Hash.new(0))) do |el, hash|
  h = hash[el[:ext]]
  hash[el[:ext]] =
    { "ext_reps" => h["ext_reps"] + 1, "ext_size_sum" => h["ext_size_sum"] + el[:size] }
end
#=> {"a"=>{"ext_reps"=>3, "ext_size_sum"=>100},
# "b"=>{"ext_reps"=>2, "ext_size_sum"=>70},
# "c"=>{"ext_reps"=>1, "ext_size_sum"=>40}}
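The code above assigns a new inner hash on every iteration instead of updating the default in place. The reason is that Hash.new(obj) hands back the same default object for every missing key and does not store it. A minimal sketch of the difference (the variable names shared and safe are only for illustration):

```ruby
# In-place mutation changes the single shared default and stores no key.
shared = Hash.new(Hash.new(0))
shared["a"]["ext_reps"] += 1
shared_keys = shared.keys              # no key was ever assigned
leaked = shared["b"]["ext_reps"]       # the count for "a" shows up under "b"

# Reading the default and assigning a fresh hash leaves the default untouched.
safe = Hash.new(Hash.new(0))
h = safe["a"]
safe["a"] = { "ext_reps" => h["ext_reps"] + 1 }
untouched = safe["b"]["ext_reps"]      # the default still reports 0

p shared_keys, leaked, safe.keys, untouched
```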