How to remove n number of characters of a string in R after a specific character?

Time:02-16

My data frame is:

df <- data.frame(player = c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0"), goals = c(2,5,7))

I want to remove everything from the "/" onward in the "player" column, ideally ending up with:

df <- data.frame(player = c("Taiwo Awoniyi", "Jacob Bruun Larsen", "Andi Zeqiri"), goals = c(2,5,7))

I am unsure how to approach this, since player names vary greatly in length and the trailing IDs differ from row to row.

CodePudding user response:

We could use separate() from tidyr with extra = 'drop' (many thanks to Onyambu), which splits at the "/" and discards the pieces after the first one:

library(dplyr)
library(tidyr)

df %>% 
  separate(player, "player", sep="/", extra = 'drop')
              player goals
1      Taiwo Awoniyi     2
2 Jacob Bruun Larsen     5
3        Andi Zeqiri     7
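
For readers who prefer not to load tidyr, the same split-and-keep-the-first-piece idea can be sketched in base R. This is only an illustration, assuming player is a character column (hence stringsAsFactors = FALSE, which matters on R versions before 4.0); the variable name parts is introduced here and is not part of the answer above.

```r
# Same data as in the question; keep strings as character, not factor.
df <- data.frame(
  player = c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0"),
  goals = c(2, 5, 7),
  stringsAsFactors = FALSE
)

# Split each value at the literal "/" (fixed = TRUE means no regex is involved).
parts <- strsplit(df$player, "/", fixed = TRUE)

# Keep only the first piece of every split, i.e. the player name.
df$player <- vapply(parts, `[`, character(1), 1)
```

Like extra = 'drop', this keeps the first piece only, so any additional "/" later in a value would also be discarded.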

CodePudding user response:

Using dplyr for the pipe and mutate(), we can use gsub() to delete the / and everything after it:

df %>% 
  mutate(player = gsub("\\/.*", "", player))
              player goals
1      Taiwo Awoniyi     2
2 Jacob Bruun Larsen     5
3        Andi Zeqiri     7
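
As a small base R sketch of the same pattern: / is not a special character in R regular expressions, so the escape is optional, and because .* already runs to the end of the string, a single sub() call should be enough. The vector players below is made up for illustration; its last element has no / and is expected to pass through unchanged.

```r
# Illustrative input; the last element contains no "/" at all.
players <- c(
  "Taiwo Awoniyi/e5478b87",
  "Jacob Bruun Larsen/4e204552",
  "Andi Zeqiri/d01231f0",
  "No Slash Player"
)

# Delete the first "/" and everything that follows it.
cleaned <- sub("/.*", "", players)
```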

CodePudding user response:

You can use a backreference to keep only the substring you want, captured with a negated character class that matches any characters except the /:

df %>%
  mutate(player = sub("([^/]+).*", "\\1", player))
              player goals
1      Taiwo Awoniyi     2
2 Jacob Bruun Larsen     5
3        Andi Zeqiri     7

Alternatively, if the ID is always made of hexadecimal characters (0-9, a-f), as in this data, you can remove just the / and the ID that follows it:

df %>%
  mutate(player = sub("/[0-9a-f]+$", "", player))

In base R syntax:

df$player <- sub("/[0-9a-f]+$", "", df$player)
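
If the goal is literally to remove a fixed number n of characters after the "/", as the title asks, a counted quantifier is one possible approach. The sketch below assumes the delimiter is a literal "/" and that it should be removed along with the n characters; the helper name remove_n_after_slash and the " (FW)" suffix in the sample data are hypothetical and only there to show that text beyond the n characters is kept.

```r
# Hypothetical helper: drop a literal "/" and the n characters right after it.
remove_n_after_slash <- function(x, n) {
  pattern <- paste0("/.{", n, "}")  # e.g. "/.{8}" for n = 8
  sub(pattern, "", x)
}

# The IDs in the question are 8 characters long.
x <- c("Taiwo Awoniyi/e5478b87", "Andi Zeqiri/d01231f0 (FW)")
result <- remove_n_after_slash(x, 8)
```

Strings with fewer than n characters after the "/" do not match the pattern and are left as they are.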