I have a dataframe that looks like this:
df = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))
I would like to be able to add a column V3 to this dataframe such that the column V3 contains the string before the first hyphen ("-") in column V2. So it should look like this:
dfDesired = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937"), V3 = c("abc", "def","ghi"))
When I try using strsplit, it gives me a list of vectors containing the three parts of each string.
CodePudding user response:
You can use sub to drop the first hyphen and everything after it.
In base R,
transform(df, V3 = sub('-.*', '', V2))
#  V1        V2  V3
#1  1  abc-1-10 abc
#2  2  def-2-19 def
#3  3 ghi-3-937 ghi
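If you would rather stay with strsplit, as in the question, here is a minimal sketch that keeps only the first piece of each split. It assumes every value in V2 contains at least one hyphen. The as.character() call is a precaution in case V2 was read in as a factor, and the name parts is just an illustrative helper.

```r
df <- data.frame(V1 = 1:3, V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

# strsplit() returns a list with one character vector per row.
# fixed = TRUE treats "-" as a literal string rather than a regex.
parts <- strsplit(as.character(df$V2), "-", fixed = TRUE)

# Take the first piece of each vector; vapply() guarantees a character result.
df$V3 <- vapply(parts, function(p) p[1], character(1))
df
```

Compared with sub, this splits the whole string even though only the first piece is used, so it mainly makes sense if you also need the other pieces later.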
CodePudding user response:
A way that needs no hand-written regex is tidyr::separate (sep is still interpreted as a regular expression, but a plain "-" matches literally):
tidyr::separate(df, V2, into = "V3", sep = "-", extra = "drop", remove = FALSE)
  V1        V2  V3
1  1  abc-1-10 abc
2  2  def-2-19 def
3  3 ghi-3-937 ghi