My dataset:
id data
1 C H I C A G O  I L
2 M A D I S O N  W I
3 N E W  Y O R K  N Y
There is one blank character between the letters of a word and two blank characters between words. I need to remove the single blanks and reduce each double blank to one, so that the result looks like this:
id data
1 CHICAGO IL
2 MADISON WI
3 NEW YORK NY
CodePudding user response:
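We can first remove every single space that sits between two non-space characters, then collapse the remaining whitespace between words to one space:
library(stringr)
library(dplyr)
df1 %>%
  mutate(data = str_replace_all(
    str_remove_all(data, "(?<=\\S)\\s{1}(?=\\S)"),  # drop the space between letters
    "\\s+", " "))                                   # collapse the gap between words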
Output
id data
1 1 CHICAGO IL
2 2 MADISON WI
3 3 NEW YORK NY
Data
df1 <- structure(list(id = 1:3, data = c("C H I C A G O  I L", "M A D I S O N  W I",
"N E W  Y O R K  N Y")), class = "data.frame", row.names = c(NA,
-3L))
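To make the two steps visible separately, here is a minimal base R sketch of the same idea on a plain character vector. The vector `x` and the names `step1` and `step2` are made up for illustration, and `gsub(..., perl = TRUE)` stands in for the stringr calls above.

```r
# Hypothetical input: one space between letters, two spaces between words
x <- c("C H I C A G O  I L", "N E W  Y O R K  N Y")

# Step 1: delete each space that has a non-space character on both sides
step1 <- gsub("(?<=\\S)\\s(?=\\S)", "", x, perl = TRUE)

# Step 2: collapse whatever whitespace runs remain into a single space
step2 <- gsub("\\s+", " ", step1)

print(step1)
print(step2)
```

After step 1 the words are joined but still separated by two spaces, which is why the second replacement is needed.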
CodePudding user response:
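Using gsub to remove any space that is directly followed by a capital letter. Of the two spaces between words, only the second is followed by a letter, so exactly one space remains (df is the data frame from the question):
df$data <- gsub("\\s(?=[A-Z])", "", df$data, perl = TRUE)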
Output
id data
1 1 CHICAGO IL
2 2 MADISON WI
3 3 NEW YORK NY
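If the text is not guaranteed to be upper case, the same lookahead idea can be loosened. The following is only a sketch and not part of the answer above: it removes any whitespace character that is directly followed by a non-whitespace character, and the sample vector `x` is invented.

```r
# Hypothetical input mixing upper- and lower-case rows
x <- c("C H I C A G O  I L", "n e w  y o r k  n y")

# Remove a whitespace character only when the next character is not whitespace;
# the first of two consecutive spaces is followed by a space, so it survives
res <- gsub("\\s(?=\\S)", "", x, perl = TRUE)

print(res)
```

Note that this variant would also strip a single leading space, since that space is followed by a letter.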
CodePudding user response:
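This is more verbose than necessary, but it spells out the logic step by step:
library(dplyr)
library(stringr)
df %>%
  mutate(data = str_replace_all(data, " ", "0"),   # mark every space with a placeholder
         data = str_replace_all(data, "00", " "),  # two placeholders in a row were a word gap
         data = str_replace_all(data, "0", ""))    # remaining placeholders were letter gaps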
id data
1 1 CHICAGO IL
2 2 MADISON WI
3 3 NEW YORK NY
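The placeholder trick relies on the placeholder never occurring in the data; with "0" as the marker, any digit 0 in the text would be mangled. A hedged base R sketch of the same three steps with a guard is shown below. The function name `unspace`, its `placeholder` argument, and the sample input are all hypothetical.

```r
# Hypothetical helper: same three-step logic, with a configurable placeholder
unspace <- function(x, placeholder = "~") {
  # Guard: the approach breaks if the placeholder already appears in the data
  stopifnot(!any(grepl(placeholder, x, fixed = TRUE)))
  x <- gsub(" ", placeholder, x, fixed = TRUE)                        # every space -> placeholder
  x <- gsub(paste0(placeholder, placeholder), " ", x, fixed = TRUE)   # doubled placeholder -> one space
  gsub(placeholder, "", x, fixed = TRUE)                              # leftover placeholders -> removed
}

out <- unspace(c("C H I C A G O  I L", "R O O M  1 0 1"))
print(out)
```

Using `fixed = TRUE` treats the placeholder as a literal string, so a character with a special meaning in regular expressions could also be chosen.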