How do I convert a string to an array with spaces preserved in Ruby?

Time:02-15

How do I convert a String: 'Hello world!' to an array: ['Hello', ' ', 'world!'] with all spaces preserved?

I tried to convert the string using the split method with different parameters, but I didn't find the right solution.

Also I didn't find any other method in the documentation (Class: String (Ruby 3.1.0)) suitable for solving this problem.

CodePudding user response:

It just occurred to me that you could use scan. Assuming that your string is stored in the variable s, and you want to separate runs of spaces from runs of non-space characters, you could use

s.scan(/[ ]+|[^ ]+/)

which in your case would yield

["Hello", "   ", "world!"]
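As a minimal sketch of the idea above, the scan call can be wrapped in a small helper. The method name split_keeping_spaces is hypothetical and not part of Ruby; the pattern only treats the literal space character as whitespace.

```ruby
# Hypothetical helper (not a built-in): splits a string into alternating
# runs of spaces and non-spaces, keeping every character.
def split_keeping_spaces(s)
  s.scan(/ +|[^ ]+/)
end

tokens = split_keeping_spaces('Hello   world!')
p tokens
#=> ["Hello", "   ", "world!"]

# Nothing is dropped, so joining the pieces restores the original string.
p tokens.join == 'Hello   world!'
#=> true
```

Because every character lands in exactly one token, joining the result is a quick way to check that no spaces were lost.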

CodePudding user response:

Use String#scan Instead of String#split

You don't want to use String#split here because it discards the separators it splits on (unless the pattern contains a capture group), so your spaces won't be preserved. Use String#scan or String#partition instead. Using Unicode character properties, you can scan for matches with:

'Hello   world!'.scan(/[\p{Alnum}\p{Punct}]+|\p{Space}+/)
#=> ["Hello", "   ", "world!"]

You can also use POSIX character classes (called "bracket expressions" in Ruby) to do the same thing if you prefer. For example:

'Hello   world!'.scan(/[[:alnum:][:punct:]]+|[[:space:]]+/)
#=> ["Hello", "   ", "world!"]

Either of these options will be more robust than solutions that rely on ASCII-only characters or literal whitespace atoms, but if you know your strings won't include other types of characters or encodings then those solutions will work too.
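To illustrate the robustness point, here is a small sketch comparing a literal-space pattern with the bracket-expression pattern on a string that contains a tab. The sample string is made up for the comparison.

```ruby
s = "Hello\t world!"   # a tab followed by a single space

# Literal-space pattern: the tab is not a space, so it ends up inside a word token.
p s.scan(/ +|[^ ]+/)
#=> ["Hello\t", " ", "world!"]

# Bracket-expression pattern: the tab and the space form one whitespace token.
p s.scan(/[[:alnum:][:punct:]]+|[[:space:]]+/)
#=> ["Hello", "\t ", "world!"]
```

Which result is right depends on whether you want to preserve only literal spaces or all whitespace.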

Using String#partition

For the very simple use case in your original example, you only have two words separated by whitespace. That means you can also use String#partition to split on the first run of whitespace. It always returns exactly three elements: the text before the match, the match itself, and the text after it, so the whitespace between the words is preserved. For example:

'Hello   world!'.partition(/\s+/)
#=> ["Hello", "   ", "world!"]

While simpler, the partitioning approach won't work as well with longer strings such as:

'Goodbye   cruel world!'.partition(/\s+/)
#=> ["Goodbye", "   ", "cruel world!"]

so String#scan is going to be a better and more flexible approach for the general use case. However, anytime you want to split a string into three elements, or to preserve the partitioning element itself, #partition can be very handy.
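A short side-by-side sketch of the two methods on the longer string may help. It assumes ASCII whitespace only, since \s and \S do not cover Unicode spaces in Ruby.

```ruby
# partition always returns [before, match, after] and stops at the first match.
p 'Goodbye   cruel world!'.partition(/\s+/)
#=> ["Goodbye", "   ", "cruel world!"]

# With no match, the whole string comes first, followed by two empty strings.
p 'Hello'.partition(/\s+/)
#=> ["Hello", "", ""]

# scan keeps going, so every run of whitespace becomes its own element.
tokens = 'Goodbye   cruel world!'.scan(/\S+|\s+/)
p tokens
#=> ["Goodbye", "   ", "cruel", " ", "world!"]
```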

Tags: ruby