Map two arrays by index

Time:11-17

I have the following arrays:

arr1 = [1, 2, 3, 4]
arr2 = [a, b, a, c]

and I would like the following output:

out = {'a' => [1, 3], 'b'=> [2], 'c' => [4]}

Is there a shorthand way of doing this in Ruby? Currently, I am using a loop and an index to build the hash.

CodePudding user response:

Assuming you meant

arr1 = [1, 2, 3, 4]
arr2 = %w[a b a c] # ["a", "b", "a", "c"]

so that your second array is an array of strings rather than variables.


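You can chain group_by and each_with_index. Called without a block, group_by returns an enumerator; each_with_index then passes each element's index to the block, and that index is used to look up the grouping key in the second array: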

arr1.group_by.each_with_index { |_, index| arr2[index] }
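
As a quick check, here is a minimal sketch of what that one-liner returns, next to the equivalent spelling with Enumerator#with_index. The variable names by_each_with_index and by_with_index are made up for this illustration.

```ruby
arr1 = [1, 2, 3, 4]
arr2 = %w[a b a c]

# group_by without a block returns an enumerator. Chaining each_with_index
# passes (element, index) to the block, and the block's return value
# becomes the grouping key for that element.
by_each_with_index = arr1.group_by.each_with_index { |_, index| arr2[index] }

# Enumerator#with_index behaves the same way here.
by_with_index = arr1.group_by.with_index { |_, index| arr2[index] }

# Both hashes equal {"a"=>[1, 3], "b"=>[2], "c"=>[4]}
```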

CodePudding user response:

You can write the following.

arr1 = [1, 2, 3, 4]
arr2 = ['a', 'b', 'a', 'c']
arr1.zip(arr2).each_with_object(Hash.new { |h,k| h[k] = [] }) do |(n,c),h|
  h[c] << n
end
  #=> {"a"=>[1, 3], "b"=>[2], "c"=>[4]}

Let me explain this expression by starting with a straightforward procedural approach and then improving the code in several steps.


Start by creating an empty hash that will become your desired return value:

h = {}

We can then write the following:

(0..arr1.size - 1).each do |i|
  n = arr1[i]
  c = arr2[i]
  h[c] = [] unless h.key?(c)
  h[c] << n
end
h #=> {"a"=>[1, 3], "b"=>[2], "c"=>[4]}

It's more Ruby-like, however, to iterate over corresponding pairs of values from arr1 and arr2, namely [1, 'a'], [2, 'b'], and so on. To do that we use the method Array#zip:

pairs = arr1.zip(arr2)
  #=> [[1, "a"], [2, "b"], [3, "a"], [4, "c"]]

then

h = {}
pairs.each do |pair| 
  n = pair.first
  c = pair.last
  h[c] = [] unless h.key?(c)
  h[c] << n
end
h #=> {"a"=>[1, 3], "b"=>[2], "c"=>[4]}
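
One thing worth keeping in mind with zip, in case the two arrays might differ in length (the question does not say they can, so this is only a precaution): the result always has as many pairs as the receiver. A small sketch, with the names short and long chosen just for this example:

```ruby
# Receiver longer than the argument: missing values are padded with nil.
short = [1, 2, 3].zip(%w[a b])
  #=> [[1, "a"], [2, "b"], [3, nil]]

# Argument longer than the receiver: the extra values are dropped.
long = [1, 2].zip(%w[a b c])
  #=> [[1, "a"], [2, "b"]]
```

In the first case the loop above would collect 3 under the key nil, so you may want to check the sizes first if that matters.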

One small improvement we can make is to apply array decomposition to pair:

h = {}
pairs.each do |n,c| 
  h[c] = [] unless h.key?(c)
  h[c] << n
end
h #=> {"a"=>[1, 3], "b"=>[2], "c"=>[4]}

The next improvement is to replace each with Enumerable#each_with_object to avoid the need for h = {} at the beginning and h at the end:

pairs.each_with_object({}) do |(n,c),h| 
  h[c] = [] unless h.key?(c)
  h[c] << n
end
  #=> {"a"=>[1, 3], "b"=>[2], "c"=>[4]}

Notice how I have written the block variables, with h holding the object that is returned (an initially-empty hash). This is another use of array decomposition. For more on that subject, see this article.
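
To make the two uses of decomposition concrete, here is a small sketch on a shortened pairs array. The names flat, nested and acc are only for illustration.

```ruby
pairs = [[1, "a"], [2, "b"]]

# A block with two parameters splits each two-element array automatically.
flat = pairs.map { |n, c| "#{c}#{n}" }
  #=> ["a1", "b2"]

# each_with_object yields two values: the element and the memo object.
# The parentheses split the element (the pair) into n and c, while acc
# receives the memo object.
nested = pairs.each_with_object([]) { |(n, c), acc| acc << [c, n] }
  #=> [["a", 1], ["b", 2]]
```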


The previous expression is fine, and reads well, but the following tweak is often seen:

pairs.each_with_object({}) do |(n,c),h| 
  (h[c] ||= []) << n
end
  #=> {"a"=>[1, 3], "b"=>[2], "c"=>[4]}

If h does not have a key c, h[c] returns nil, so h[c] ||= [], or h[c] = h[c] || [], becomes h[c] = nil || [], ergo h[c] = [], after which h[c] << n is executed.
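
The same steps can be followed one at a time on a fresh hash. This is just a sketch of the idiom in isolation; the name before is made up for the example.

```ruby
h = {}

before = h["a"]          # key is missing, so this is nil

(h["a"] ||= []) << 1     # h["a"] is nil: assign [], then append 1
(h["a"] ||= []) << 3     # h["a"] is already [1]: reuse it, append 3

h #=> {"a"=>[1, 3]}
```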


No better or worse than the previous expression, you may also see the code I presented at the beginning:

arr1.zip(arr2).each_with_object(Hash.new { |h,k| h[k] = [] }) do |(n,c),h|
  h[c] << n
end

Here the block variable h is initialized to an empty hash defined as follows:

h = Hash.new { |h,k| h[k] = [] }

This employs the form of Hash::new that takes a block and no argument. When a hash h is defined in this way, if h does not have a key c, executing h[c] causes h[c] = [] to be executed before h[c] << n is executed.
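
A short sketch of that behaviour in isolation may help. It also contrasts the block form with Hash.new([]), which looks similar but hands back one shared default array without storing it under the key; that comparison is my addition, not part of the answer above, and the names with_block and shared are illustrative.

```ruby
with_block = Hash.new { |hash, key| hash[key] = [] }

with_block.key?("a")   #=> false
with_block["a"] << 1   # missing key: the block stores [] under "a" first
with_block["a"] << 3   # key now exists: the stored array is reused
with_block             #=> {"a"=>[1, 3]}

# With a default *argument* instead of a block, every missing key
# returns the same array object and nothing is stored in the hash.
shared = Hash.new([])
shared["a"] << 1
shared                 #=> {}
shared["b"]            #=> [1]
```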
