Ruby method gsub with string '+'

Time:12-04

I've found an interesting thing in Ruby. Does anybody know why it behaves this way?

'+'.gsub!('+', '/+')

I tried '+'.gsub!('+', '\+') and expected "\\+", but got "" (an empty string).

CodePudding user response:

gsub is implemented, after some indirection, as str_gsub in C, which calls rb_reg_regsub to expand the replacement string.

Now, gsub is supposed to allow the replacement string to contain backreferences. That is, if you pass a regular expression as the first argument and that regex defines a capture group, then your replacement string can include \1 to indicate that that capture group should be placed at that position.

That behavior evidently still happens if you pass an ordinary, non-regex string as the pattern. Your verbatim string obviously won't have any capture groups, so it's a bit silly in this case. But trying to replace, for instance, with \1 in the string will give the empty string, since \1 says to go get the first capture group, which doesn't exist and hence is vacuously "".
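A short sketch of that behavior, using made-up sample strings that are not from the question: with a regex pattern, \1 and \2 pick up the capture groups, while with a plain string pattern there is nothing for \1 to refer to.

```ruby
# With a regex pattern, \1 and \2 in the replacement refer to capture groups.
with_group = 'john smith'.gsub(/(\w+) (\w+)/, '\2, \1')
p with_group  # prints "smith, john"

# With a plain string pattern there are no capture groups,
# so \1 expands to the empty string.
no_group = 'a-b'.gsub('-', '<\1>')
p no_group    # prints "a<>b"
```

The single quotes matter here: in a single-quoted Ruby literal, \1 reaches gsub as a backslash followed by a digit.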

Now, you might be thinking: + isn't a number. And you'd be right. You're replacing with \+. There are several other backreferences allowed in your replacement string. I couldn't find any official documentation where these are written down, but the source code is readable enough. To summarize the code:

  • Digits \1 through \9 refer to numbered capture groups.
  • \k<...> refers to a named capture group, with the name in angled brackets.
  • \0 or \& refer to the whole substring that was matched, so (\0) as a replacement string would enclose the match in parentheses.
  • \` (a backslash followed by a backtick) refers to the entire string up to the match.
  • \' refers to the entire string following the match.
  • \+ refers to the final capture group, i.e. the one with the highest number.
  • \\ is a literal backslash.

(Most of these are based on Perl variables of a similar name)
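To make the list concrete, here is a small sketch that exercises each escape once. The sample string and variable names are invented for illustration, and sub is used instead of gsub so that only the first match is replaced.

```ruby
str = 'foo-bar-baz'

# \1, \2: numbered capture groups
numbered = str.sub(/(\w+)-(\w+)/, '\2-\1')      # => "bar-foo-baz"

# \k<name>: named capture group
named = str.sub(/(?<first>\w+)-/, '[\k<first>]') # => "[foo]bar-baz"

# \0 and \&: the whole match
whole = str.sub('bar', '(\0)')                   # => "foo-(bar)-baz"
amp   = str.sub('bar', '(\&)')                   # => "foo-(bar)-baz"

# \` : everything before the match ("foo-")
pre = str.sub('bar', '\`')                       # => "foo-foo--baz"

# \' : everything after the match ("-baz"); written with double quotes
# because \' is itself an escape inside a single-quoted literal
post = str.sub('bar', "\\'")                     # => "foo--baz-baz"

# \+ : the last capture group ("bar" here)
last = str.sub(/(\w+)-(\w+)/, '\+')              # => "bar-baz"

# \\ : a literal backslash; '\\\\' is two backslashes at runtime
literal = str.sub('-', '\\\\')                   # => "foo\\bar-baz" (one backslash)

p numbered, named, whole, amp, pre, post, last, literal
```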

So, in your examples,

  • \+ as the replacement string says "take the last capture group". There is no capture group, so you get the empty string.
  • \- is not a valid backreference, so it's replaced verbatim.
  • \ok is, likewise, not a backreference, so it's replaced verbatim.
  • In \\+, Ruby eats the first backslash sequence, so the actual string at runtime is \+, equivalent to the first example.
  • For \\\+, Ruby processes the first backslash sequence, so we get \\+ by the time the replacement function sees it. \\ is a literal backslash, and the + is no longer part of an escape sequence, so we get \+.
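The cases above can be checked with a quick script. This is only a sketch: it uses gsub rather than gsub! so that it also runs when string literals are frozen, and the variable names are made up. The last line shows one possible workaround, the block form of gsub, whose return value is inserted as-is without backreference expansion.

```ruby
slash_plus  = '+'.gsub('+', '/+')     # no backslash involved
last_group  = '+'.gsub('+', '\+')     # \+ means "last capture group", none exists
dash        = '+'.gsub('+', '\-')     # \- is not a backreference
ok          = '+'.gsub('+', '\ok')    # \o is not a backreference
two_slashes = '+'.gsub('+', '\\+')    # literal is \+ at runtime, same as above
three       = '+'.gsub('+', '\\\+')   # literal is \\+ at runtime

p slash_plus   # prints "/+"
p last_group   # prints ""
p dash         # prints "\\-"
p ok           # prints "\\ok"
p two_slashes  # prints ""
p three        # prints "\\+"

# Block form: the block's result is used verbatim.
via_block = '+'.gsub('+') { '\+' }
p via_block    # prints "\\+"
```

Note that p shows strings in inspect form, so "\\+" above is a single backslash followed by a plus sign.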