How to remove n repeated identical characters from a string in R

I have a string in R where the words are separated by a random number of \n characters:

mystring = c("hello\n\ni\n\n\n\n\nam\na\n\n\n\n\n\n\ndog")

I want to replace each run of repeated \n characters so that there is only a single space between words. I can currently do this as follows, but I want a tidier solution:

library(magrittr)  # provides the %>% pipe

mystring %>% 
    gsub("\n\n", "\n", .) %>% 
    gsub("\n\n", "\n", .) %>% 
    gsub("\n\n", "\n", .) %>% 
    gsub("\n", " ", .)

[1] "hello i am a dog"

What is the best way to achieve this?

CodePudding user response：

We can use + to signify one or more repetitions:

gsub("\n+", " ", mystring)
[1] "hello i am a dog"
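
A run of one or more newlines can be matched with the + quantifier, so a single call is enough no matter how long each run is. The following is a minimal base-R sketch of that idea; collapse_newlines is a hypothetical helper name used only for illustration, not something from the original answer.

```r
# Hypothetical helper: replace each run of one or more newlines with one space.
collapse_newlines <- function(x) {
  gsub("\n+", " ", x)
}

mystring <- "hello\n\ni\n\n\n\n\nam\na\n\n\n\n\n\n\ndog"

# A single pass handles runs of any length, unlike chained "\n\n" replacements.
result <- collapse_newlines(mystring)
print(result)
# [1] "hello i am a dog"
```

Unlike the chained approach in the question, this does not depend on the longest run being short enough for a fixed number of passes.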

CodePudding user response：

We could use the same logic as akrun with str_replace_all:

library(stringr)
str_replace_all(mystring, '\n+', ' ')

[1] "hello i am a dog"

CodePudding user response：

In this case, you might find str_squish() convenient. This is intended to solve this exact problem, while the other solutions show good ways to solve the more general case.

library(stringr)

mystring = c("hello\n\ni\n\n\n\n\nam\na\n\n\n\n\n\n\ndog")

str_squish(mystring)
# [1] "hello i am a dog"

If you look at the code of str_squish(), it is basically a wrapper around str_replace_all():

str_squish
function (string) 
{
    stri_trim_both(str_replace_all(string, "\\s+", " "))
}
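
If stringr is not available, the same two steps (collapse whitespace runs, then trim both ends) can be approximated in base R. This is only a sketch under that assumption; squish_base is a hypothetical name, and its handling of unusual Unicode whitespace may differ from str_squish().

```r
# Hypothetical base-R approximation of str_squish():
# 1. collapse every run of whitespace (spaces, tabs, newlines) into one space,
# 2. strip leading and trailing whitespace.
squish_base <- function(x) {
  trimws(gsub("\\s+", " ", x))
}

squished <- squish_base("hello\n\ni\n\n\n\n\nam\na\n\n\n\n\n\n\ndog")
print(squished)
# [1] "hello i am a dog"

# Leading and trailing whitespace is removed as well.
padded <- squish_base("  hello\n\n i \t am\n")
print(padded)
# [1] "hello i am"
```

Note that, unlike gsub("\n+", " ", x), this also collapses spaces and tabs and trims the ends, which may or may not be wanted.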

CodePudding user response：

Another possible solution, based on stringr::str_squish:

library(stringr)

str_squish(mystring)

#> [1] "hello i am a dog"