Get a vector containing the string before a specific character from each element in a data frame

I have a dataframe that looks like this:

df = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

I would like to be able to add a column V3 to this dataframe such that the column V3 contains the string before the first hyphen ("-") in column V2. So it should look like this:

dfDesired = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937"), V3 = c("abc", "def","ghi"))

When I try using strsplit, it gives me a list of vectors, each holding the three parts of one string.

CodePudding user response：

You can drop everything from the first "-" onward with sub.

In base R,

transform(df, V3 = sub('-.*', '', V2))

#  V1        V2  V3
#1  1  abc-1-10 abc
#2  2  def-2-19 def
#3  3 ghi-3-937 ghi
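
Note that transform returns a modified copy, so the result has to be assigned if V3 should stay in df. The sketch below shows that, and also how the strsplit attempt from the question could be reduced to the first piece of each string. The name first_piece is only illustrative, and as.character is there on the assumption that V2 might be a factor in older R versions.

```r
df <- data.frame(V1 = 1:3, V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

# transform() does not modify df in place, so assign the result back
df <- transform(df, V3 = sub("-.*", "", V2))

# strsplit() returns a list with one character vector per element;
# vapply() with `[` picks the first piece of each vector
pieces <- strsplit(as.character(df$V2), "-", fixed = TRUE)
first_piece <- vapply(pieces, `[`, character(1), 1)
```

Both approaches return a string unchanged when it contains no hyphen.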

CodePudding user response：

If you would rather not write a regex pattern, tidyr::separate is an alternative:

tidyr::separate(df, V2, into = "V3", extra = "drop", remove = FALSE, sep = "-")

#  V1        V2  V3
#1  1  abc-1-10 abc
#2  2  def-2-19 def
#3  3 ghi-3-937 ghi
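
For a base R route that involves no regex at all, one possible sketch is to locate the first literal hyphen with regexpr(fixed = TRUE) and cut the string there with substr. The helper variables v2 and pos are illustrative names, not part of the answers above.

```r
df <- data.frame(V1 = 1:3, V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

v2 <- as.character(df$V2)

# Position of the first literal "-" in each string (-1 when there is none);
# as.integer() drops the extra attributes that regexpr() attaches
pos <- as.integer(regexpr("-", v2, fixed = TRUE))

# Keep the text before the hyphen; leave hyphen-free strings unchanged
df$V3 <- ifelse(pos > 0, substr(v2, 1, pos - 1), v2)
```

Because fixed = TRUE treats the separator as plain text, the same pattern should also work for characters such as "." or "|" that would need escaping in a regex.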