R: Extracting After First Space

Time:07-08

I am working with the R programming language. I found this question, which extracts everything to the LEFT of the first space (that is, everything before it):

#https://stackoverflow.com/questions/15895050/using-gsub-to-extract-character-string-before-white-space-in-r

dob <- c("9/9/43 12:00 AM/PM", "9/17/88 12:00 AM/PM", "11/21/48 12:00 AM/PM")

gsub( " .*$", "", dob )
# [1] "9/9/43"   "9/17/88"  "11/21/48"

Is it possible to adapt this code to extract after the first space?

# option 1

12:00 AM/PM, 12:00 AM/PM, 12:00 AM/PM

# option 2 : part 1

12:00, 12:00, 12:00

# option 2: part 2

AM/PM, AM/PM, AM/PM

# then, concatenate option 2 : part 1 and option 2 : part 2

I thought that maybe switching the syntax of the "gsub" command might accomplish this:

 gsub( "$*. ", "", dob )
 gsub( "*$. ", "", dob )

But I don't think I am doing this correctly.

Can someone please show me how to do this (option 1 and [option 2 part 1, option 2 part 2])?

Thanks!

Note: Normally, I do this with "Text to Columns" in Microsoft Excel - but I would like to learn how to do this in R!

CodePudding user response:

Do you mean the following?

dob <- c("9/9/43 12:00 AM/PM", "9/17/88 12:00 AM/PM", "11/21/48 12:00 AM/PM", "red1 23 g")

gsub("^\\S+ ", "", dob)

#> [1] "12:00 AM/PM" "12:00 AM/PM" "12:00 AM/PM" "23 g"
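
For what it's worth, here is a small sketch of why that pattern does the job. `^` anchors the match at the start of the string, `\\S+` consumes the leading run of non-whitespace characters, and the trailing space consumes the first space. Because the pattern is anchored, it can match at most once per string, so `sub()` should behave the same as `gsub()` here. The variable name `after_first_space` is only illustrative.

```r
dob <- c("9/9/43 12:00 AM/PM", "9/17/88 12:00 AM/PM",
         "11/21/48 12:00 AM/PM", "red1 23 g")

# Remove the leading non-space run and the first space that follows it.
# The ^ anchor means only the start of each string can match,
# so replacing the first match (sub) is sufficient.
after_first_space <- sub("^\\S+ ", "", dob)

print(after_first_space)
```

Note that a string with no space at all would be returned unchanged, since the pattern would not match.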

CodePudding user response:

Option 1: Remove the first space and everything before it?

sub(".*? ", "", dob)
# [1] "12:00 AM/PM" "12:00 AM/PM" "12:00 AM/PM"

Option 2: Remove the last space and everything before it?

sub(".* ", "", dob)
# [1] "AM/PM" "AM/PM" "AM/PM"

Option 3: Remove the first/last space and everything before/after it?

gsub(" [^ ]+$|^.*? ", "", dob)
# [1] "12:00" "12:00" "12:00"
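
The question also asks to concatenate the two parts of option 2, which the options above do not show. A minimal sketch, assuming every element has exactly three space-separated fields as in the example data (the names `rest`, `time_part`, `ampm_part`, `combined` and `parts` are made up for illustration), might look like this. It uses only simple anchored `sub()` calls plus `paste()`, and then a `strsplit()` variant that is roughly what "Text to Columns" does in Excel.

```r
dob <- c("9/9/43 12:00 AM/PM", "9/17/88 12:00 AM/PM", "11/21/48 12:00 AM/PM")

# Option 1: everything after the first space.
rest <- sub("^\\S+ ", "", dob)

# Option 2, part 1: from that remainder, keep what precedes its first space.
time_part <- sub(" .*$", "", rest)

# Option 2, part 2: everything after the last space of the original string.
ampm_part <- sub("^.* ", "", dob)

# Concatenate part 1 and part 2, separated by a single space.
combined <- paste(time_part, ampm_part)

# "Text to Columns" style: split on a literal space and bind into a matrix.
# This assumes each string splits into the same number of fields.
parts <- do.call(rbind, strsplit(dob, " ", fixed = TRUE))

print(combined)
print(parts)
```

With this data, `combined` should match the option 1 result, and the columns of `parts` hold the date, the time and the AM/PM field respectively.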