Using group_by for only certain attributes

Time:10-02

I have an array roads of objects that have many attributes. What I would like to do is find which district_code (one attribute) has more than one state (another attribute).

Omitting the other attributes for simplicity, e.g.:

roads = [['1004', 'VIC'], ['1004', 'BRI'], ['1011', 'VIC'], ['1010', 'ACT'], ['1000', 'ACT'], ['1019', 'VIC'], ['1004', 'VIC']]

If I use roads.group_by { |ro| ro[0] }, I get the result:

=> {"1004"=>[["1004", "VIC"], ["1004", "BRI"], ["1004", "VIC"]], "1011"=>[["1011", "VIC"]], "1010"=>[["1010", "ACT"]], "1000"=>[["1000", "ACT"]], "1019"=>[["1019", "VIC"]]}

What I want is a hash that only includes the district codes with more than one unique value for state, like so:

=> {"1004"=>["VIC", "BRI"]}

Any ideas on how to group_by or map by the number of values, or by a specific attribute within a value?

Thanks!

CodePudding user response:

If you already can get:

{"1004"=>[["1004", "VIC"], ["1004", "BRI"], ["1004", "VIC"]], "1011"=>[["1011", "VIC"]], "1010"=>[["1010", "ACT"]], "1000"=>[["1000", "ACT"]], "1019"=>[["1019", "VIC"]]}

With:

roads.group_by { |ro| ro[0] }

Then you just need to select the entries with length greater than 1.

roads.group_by { |ro| ro[0] }.select { |k, v| v.length > 1 }

And I get:

{"1004"=>[["1004", "VIC"], ["1004", "BRI"], ["1004", "VIC"]]}

Then we can map that down to just the states. It could be one line, but it is split up for demonstration.

roads.group_by { |r| r[0] }                  \
     .select { |k, v| v.length > 1 }         \
     .map { |k, v| [k, v.map { |x| x[1] }] } \
     .to_h

And the result is:

{"1004"=>["VIC", "BRI", "VIC"]}
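Note that this result still lists "VIC" twice, and the length check counts rows per district rather than distinct states, so a district with the same state repeated would also pass the filter. A minimal sketch of one way to count unique states instead, assuming the same two-element rows as in the question (the variable name multi_state is only for illustration):

```ruby
roads = [['1004', 'VIC'], ['1004', 'BRI'], ['1011', 'VIC'], ['1010', 'ACT'],
         ['1000', 'ACT'], ['1019', 'VIC'], ['1004', 'VIC']]

# Group rows by district code, reduce each group to its distinct states,
# then keep only the districts that have more than one distinct state.
multi_state = roads.group_by { |r| r[0] }
                   .transform_values { |rows| rows.map { |r| r[1] }.uniq }
                   .select { |_code, states| states.length > 1 }

p multi_state
```

With the sample data only "1004" remains, mapped to ["VIC", "BRI"]. Reducing to unique states before filtering is what makes the length check mean "more than one state".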

CodePudding user response:

Input

roads = [['1004', 'VIC'], ['1004', 'BRI'], ['1011', 'VIC'], ['1010', 'ACT'], ['1000', 'ACT'], ['1019', 'VIC'], ['1004', 'VIC']]

Code

p Hash[roads.group_by(&:first)
            .transform_values(&:uniq)
            .filter_map { |k, v| [k, v.map(&:last)] if v.length > 1 }]

Output

{"1004"=>["VIC", "BRI"]}
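Since the question mentions objects with many attributes rather than plain arrays, here is a rough sketch of the same idea applied to objects in a single pass. The Road struct and its name attribute are hypothetical stand-ins for the real class; only district_code and state come from the question.

```ruby
# Hypothetical stand-in for the real objects; only district_code and
# state are taken from the question, name is an invented extra attribute.
Road = Struct.new(:district_code, :state, :name)

road_objects = [
  Road.new('1004', 'VIC', 'a'), Road.new('1004', 'BRI', 'b'),
  Road.new('1011', 'VIC', 'c'), Road.new('1010', 'ACT', 'd'),
  Road.new('1000', 'ACT', 'e'), Road.new('1019', 'VIC', 'f'),
  Road.new('1004', 'VIC', 'g')
]

# Collect the distinct states per district code in one pass.
# Array#| is a set union, so a repeated state is not added twice.
states_by_district = road_objects.each_with_object({}) do |road, acc|
  acc[road.district_code] = (acc[road.district_code] || []) | [road.state]
end

# Keep only the districts that ended up with more than one state.
result = states_by_district.select { |_code, states| states.size > 1 }

p result
```

This avoids building the intermediate groups of full objects, which may matter if the array is large, at the cost of being a little less declarative than the group_by chain.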