Extract data frame variable

Time:12-15

I have a list of proteins in a column, but they all end with _l. How do I remove the trailing _l from the protein names? Example:

nr2_l , nr1_l , pkca_l , p_lr1_l , p_ln_l

I tried:

separate(protein_exp, c("protein", "l"), "_l")

but this splits the protein **p_lr1_l** into:

| protein | l |
| ------- | -- |
| p | r1 |

I want it to be:

| protein | l |
| ------- | -- |
| p_lr1 | _l |

Thanks

CodePudding user response:

str_remove() from the stringr package can do this.

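library(tidyverse)  # loads stringr (str_remove) and dplyr (mutate)
# on a single string
str_remove("test_l", pattern = "_l$")
# or on the whole column (a character vector)
str_remove(protein_exp, pattern = "_l$")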

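The $ at the end of the pattern is a regex anchor meaning "only match _l at the end of the string". Without it, str_remove() removes the first _l it finds, so p_lr1_l would become pr1_l.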
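
To make the effect of the anchor concrete, here is a small sketch comparing the two patterns on the problem protein from the question. The variable names are only illustrative.

```r
library(stringr)

x <- "p_lr1_l"

# Without the anchor, the first "_l" in the string is removed
without_anchor <- str_remove(x, pattern = "_l")

# With the anchor, only a "_l" at the very end is removed
with_anchor <- str_remove(x, pattern = "_l$")

print(without_anchor)  # "pr1_l"
print(with_anchor)     # "p_lr1"
```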

If your data is in a data frame called df, you can use mutate():

df %>% mutate(protein = str_remove(protein_exp, pattern="_l$"))
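
If you also want the suffix kept in its own column, as in the table from the question, one possible sketch is to pair str_remove() with str_extract(). The data frame df below is made up from the example names in the question, and the column names protein and l are assumptions taken from the desired output.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame built from the names in the question
df <- tibble(protein_exp = c("nr2_l", "nr1_l", "pkca_l", "p_lr1_l", "p_ln_l"))

result <- df %>%
  mutate(
    protein = str_remove(protein_exp, pattern = "_l$"),  # name without the trailing _l
    l       = str_extract(protein_exp, pattern = "_l$")  # the trailing _l itself (NA if absent)
  )

print(result)
```

Because both calls use the anchored pattern, an _l in the middle of a name such as p_lr1_l is left alone.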