Ruby: split string in hash

Time:02-01

I have a string

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

Expected result: I want to split this into a hash like this:

hash = {
   race_1 => [650, 215, 265, 315],
   race_2 => [165, 215, 265, 315]
}

Can someone please guide me in a direction to create the matching hash?

CodePudding user response:

When the input always follows the same pattern, then I would use String#scan with a Regexp to extract the significant values.

string = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
regexp = /(race_\d+).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m))/

string.scan(regexp)
#=> [["race_1", "650", "215", "265", "315"], ["race_2", "165", "215", "265", "315"]]

This nested array of values can then be transformed into a hash like this:

string.scan(regexp).to_h { |values| [values[0], values[1..-1]] }
#=> {"race_1"=>["650", "215", "265", "315"], "race_2"=>["165", "215", "265", "315"]}

And because you want the numbers in the array to be integers:

string.scan(regexp).to_h { |values| [values[0], values[1..-1].map(&:to_i)] }
#=> {"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}
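The regexp above is fixed at exactly four distances per race. As a rough sketch for lines with a varying number of distances, one option is to split each line at the first colon and scan the remainder for digit runs. This assumes every line contains a colon and that the race name itself has none; the variable names are only illustrative, and the block form of to_h needs Ruby 2.6 or later.

```ruby
string = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m."

# Split each line at the first colon, then pull every run of digits
# out of the remainder and convert it to an integer.
result = string.each_line.to_h do |line|
  name, distances = line.split(':', 2)
  [name, distances.scan(/\d+/).map(&:to_i)]
end

p result
```

Because only digit runs are extracted, the trailing "m", the "\r\n" and the final "." need no special handling.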

CodePudding user response:

You can write it like this.

Input

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

Code

Split each line at the colon, then strip the trailing m from each distance.

hash = str.scan(/(race_\d+): (.*)/).each_with_object({}) do |(race, distances), hash|
  hash[race] = distances.split(', ').map { |d| d.sub(/m$/, '').to_i }
end
p hash

Output

{"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}

CodePudding user response:

The following allows any number of races and for each race to have any number of associated distances (in str below there are four).

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m"
str.gsub(/(\w+): ((?:\d+m, *)*\d+)/).with_object({}) do |_s,h|
  h[$1] = $2.split(',').map(&:to_i)
end
  #=> {"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}

This employs a little-used (and greatly undervalued) form of String#gsub that takes a single argument but no block, and returns an enumerator. The enumerator merely generates matches of gsub's argument and therefore has nothing to do with string replacement.
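As a small illustration of that form (using a made-up string named text, not the question's input), the enumerator yields each full match in turn, which for a pattern without capture groups is the same list that String#scan returns:

```ruby
text = "race_1: 650m, 215m"

# With one argument and no block, gsub returns an Enumerator over the matches.
enum = text.gsub(/\d+/)
matches = enum.to_a              # each element is the full matched substring
same_as_scan = text.scan(/\d+/)  # scan without groups also returns full matches

p matches
p matches == same_as_scan
p text                           # the receiver is left unchanged
```

The difference shows up once the pattern has capture groups: scan then returns the groups, whereas the gsub enumerator keeps yielding the whole match.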

The regular expression that is gsub's argument can be expressed in free-spacing mode to make it self-documenting.

/
(            # begin capture group 1
  \w+        # match >= 1 word characters
)            # end capture group 1
:            # match a colon
[ ]          # match a space
(            # begin capture group 2
  (?:        # begin non-capture group
    \d+      # match >= 1 digits
    m,[ ]*   # match "m," followed by >= 0 spaces
  )          # end non-capture group
  *          # execute preceding non-capture group >= 0 times
  \d+        # match >= 1 digits
)            # end capture group 2
/x           # invoke free-spacing regex definition mode

Note that in free-spacing mode spaces that are part of the expression must be protected. There are various ways of doing that. I have enclosed each space in a character class ([ ]).
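A minimal sketch of why the protection matters, using two throwaway patterns (the names unprotected and in_class are only for illustration):

```ruby
# In free-spacing mode (/x) a bare space in the pattern is ignored,
# so it no longer matches a space in the subject string.
unprotected = /a b/x
in_class    = /a[ ]b/x   # the space inside a character class is kept

p unprotected.match?("a b")
p unprotected.match?("ab")
p in_class.match?("a b")
p in_class.match?("ab")
```

The unprotected pattern behaves like /ab/, while the character-class version still requires a literal space.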


In the example above we compute the following enumerator.

enum = str.gsub(/(\w+): ((?:\d+m, *)*\d+)/)
  #=> #<Enumerator: "race_1: 650m, 215m, 265m, 315m\r\n
  #     race_2: 165m, 215m, 265m, 315m":
  #     gsub(/(\w+): ((?:\d+m, *)*\d+)/)>

The elements it will generate are as follows.

enum.next
  #=> "race_1: 650m, 215m, 265m, 315"
enum.next
  #=> "race_2: 165m, 215m, 265m, 315"
enum.next
  #=> StopIteration: iteration reached an end

Note also that

arr = "650m, 215m, 265m, 315".split(',')
  #=> ["650m", " 215m", " 265m", " 315"]

arr.map(&:to_i)
  #=> [650, 215, 265, 315]

CodePudding user response:

Could you try the code below?

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
rows = str.delete('.').split("\r\n") # => ["race_1: 650m, 215m, 265m, 315m", "race_2: 165m, 215m, 265m, 315m"] 
hash_result = {}
rows.each do |row|
  key = row.split(':').first # => race_1
  value = row.split(':').last.split('m, ').map(&:to_i) # => [650, 215, 265, 315]
  hash_result[key.to_sym] = value
end
# hash_result = {:race_1=>[650, 215, 265, 315], :race_2=>[165, 215, 265, 315]}

P.S. I think you should try it yourself first; that is how you improve.

CodePudding user response:

So long as your records are separated by new lines, you could do something like below.

String#lines splits the string at newlines. Each line is then split on whitespace (the default delimiter) to get an array. The last step converts each line array to a one-entry hash: the first item is taken as the key and the remaining items are converted to integers and collected into a new array.

There are many intermediate arrays so this isn't the most efficient solution.

input = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

input.lines
     .map(&:split).map do |arr|
      { arr.first.delete(':').to_sym  => arr[1..-1].map(&:to_i) }
    end
#=> [{:race_1=>[650, 215, 265, 315]}, {:race_2=>[165, 215, 265, 315]}] 
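The code above returns an array of one-entry hashes rather than the single hash the question asks for. One possible adjustment, sketched here on the assumption that Ruby 2.6 or later is available (for the block form of Array#to_h), is to return key/value pairs and let to_h build one hash; the name single_hash is only illustrative.

```ruby
input = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

# Each line becomes a [key, values] pair; to_h collects the pairs into one hash.
single_hash = input.lines.map(&:split).to_h do |arr|
  [arr.first.delete(':').to_sym, arr[1..-1].map(&:to_i)]
end

p single_hash
```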