How to remove part of a string without disturbing the rest of a data frame?

Time:12-26

I have a data frame that looks like this, but much bigger:

df <- structure(list(names = c("bests-1", "trible-1", "crazy-1", "cool-1", 
"nonsense-1", "Mean-1", "Lose-1", "Trye-1", "Trified-1"), Col = c(1L, 
2L, NA, 4L, 47L, 294L, 2L, 1L, 3L), col2 = c(2L, 4L, 5L, 7L, 
9L, 9L, 0L, 2L, 3L)), class = "data.frame", row.names = c(NA, 
-9L))

As an example, I am trying to remove -1 from every string in the first column.

I can do this with:

as.data.frame(str_remove_all(df$names, "-1"))

The problem is that the result contains only the cleaned names; all the other columns are dropped.

I don't want to split the data and merge it again, because I am afraid of mismatching rows.

Is there any way to get rid of specific strings in one column without disturbing the rest of the data frame?

For instance, the output should look like this:

     names Col col2
     bests   1    2
    trible   2    4
     crazy  NA    5
      cool   4    7
  nonsense  47    9
      Mean 294    9
      Lose   2    0
      Trye   1    2
   Trified   3    3

CodePudding user response:

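Use gsub() and anchor the pattern with $ so that -1 is removed only at the end of each string. The hyphen is escaped here as \\-, although it is not special outside a character class, so '-1$' works as well. transform() returns the whole data frame with only names changed; assign the result back to df to keep it.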

transform(df, names=gsub('\\-1$', '', names))
#      names Col col2
# 1    bests   1    2
# 2   trible   2    4
# 3    crazy  NA    5
# 4     cool   4    7
# 5 nonsense  47    9
# 6     Mean 294    9
# 7     Lose   2    0
# 8     Trye   1    2
# 9  Trified   3    3
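
To show why the $ anchor matters, here is a minimal base R sketch. The vector x and its values are hypothetical, chosen only to show where an unanchored pattern removes too much; they are not from the question's data.

x <- c("item-10", "x-1-1", "bests-1")  # hypothetical names, for illustration only

# Unanchored: every "-1" is removed, wherever it occurs in the string
unanchored <- gsub("-1", "", x)

# Anchored: only a "-1" at the very end of the string is removed
anchored <- sub("-1$", "", x)

print(unanchored)  # "item0"   "x"   "bests"
print(anchored)    # "item-10" "x-1" "bests"

Since an anchored pattern can match at most once per string, sub() and gsub() give the same result here.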

CodePudding user response:

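Using the stringr package, assign the result back to the column so the rest of the data frame is left untouched:

library(stringr)
df$names <- str_remove_all(df$names, "-1")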

     names Col col2
1    bests   1    2
2   trible   2    4
3    crazy  NA    5
4     cool   4    7
5 nonsense  47    9
6     Mean 294    9
7     Lose   2    0
8     Trye   1    2
9  Trified   3    3
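
On the worry about mismatched rows: assigning back to a single column is safe because the replacement is vectorised and returns one value per row, in the same order. A minimal base R sketch that checks this, assuming a small hypothetical data frame toy (a subset of the question's data) and using sub() with fixed = TRUE as a literal-match stand-in for the stringr call:

toy <- data.frame(names = c("bests-1", "crazy-1", "Mean-1"),
                  Col   = c(1L, NA, 294L),
                  col2  = c(2L, 5L, 9L))
before <- toy  # copy kept for comparison

# Overwrite only the names column; fixed = TRUE treats "-1" as plain text
toy$names <- sub("-1", "", toy$names, fixed = TRUE)

# Row count and the other columns are unchanged
stopifnot(nrow(toy) == nrow(before))
stopifnot(identical(toy[c("Col", "col2")], before[c("Col", "col2")]))
print(toy)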

CodePudding user response:

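We could use trimws from base R (the whitespace argument requires R 3.6.0 or later); which = "right" limits the stripping to the end of each string:

df$names <- trimws(df$names, which = "right", whitespace = "-\\d+")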

Output:

> df
     names Col col2
1    bests   1    2
2   trible   2    4
3    crazy  NA    5
4     cool   4    7
5 nonsense  47    9
6     Mean 294    9
7     Lose   2    0
8     Trye   1    2
9  Trified   3    3
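
As a quick cross-check, the base R approaches in this thread can be compared on the names from the question. This is only a sketch: it assumes R 3.6.0 or later for the whitespace argument, and nms, via_gsub and via_trimws are names introduced here for illustration.

nms <- c("bests-1", "trible-1", "crazy-1", "cool-1", "nonsense-1",
         "Mean-1", "Lose-1", "Trye-1", "Trified-1")

via_gsub   <- gsub("-1$", "", nms)                                 # anchored regex
via_trimws <- trimws(nms, which = "right", whitespace = "-\\d+")   # strips a trailing -<digits>

# Both approaches agree on this data
stopifnot(identical(via_gsub, via_trimws))
print(via_gsub)

Note that the trimws pattern strips any trailing hyphen followed by digits (for example -2 or -15), not only -1, so the two calls can differ on other data.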