Getting to grips with regular expressions in grep

Time:09-15

I am struggling to understand how to use grep and similar functions to extract the strings I need. My strings follow the pattern "1899-12-31 17:20:00 UTC". I want to remove everything up to and including the first space, and also remove the trailing " UTC". For this example the output should be 17:20:00. How do I do that with base R functions like grep or gsub?

CodePudding user response:

Here is a base R option using gsub

> gsub("^(\\S+\\s)|(\\s\\S+)$", "", "1899-12-31 17:20:00 UTC")
[1] "17:20:00"

where

  • ^(\\S+\\s) matches the run of non-space characters at the head of the string together with the first space
  • (\\s\\S+)$ matches the last space together with the run of non-space characters up to the tail of the string
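
To see what each alternative removes, the two halves of the pattern can be applied one at a time. This is only a sketch: the vector name x and its second element are made up for illustration, and it also shows that the same call works element-wise on a character vector.

```r
# Hypothetical input; the second element is invented for illustration
x <- c("1899-12-31 17:20:00 UTC", "1899-12-31 08:05:30 UTC")

# Step 1: drop the leading non-space run plus the first space
no_date <- sub("^\\S+\\s", "", x)

# Step 2: drop the last space plus the trailing non-space run
times <- sub("\\s\\S+$", "", no_date)

# Same result in one call, using the alternation from the answer
times_one_call <- gsub("^(\\S+\\s)|(\\s\\S+)$", "", x)

print(times)
```

Splitting the pattern this way is mostly useful for debugging; the single gsub call does the same work.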

Another option is using scan

> scan(text = "1899-12-31 17:20:00 UTC", what = "", quiet = TRUE)[2]
[1] "17:20:00"
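
Note that scan returns all fields of its input as one flat vector, so the [2] index only picks the second field overall. If you have several strings, a sketch along the following lines with strsplit may be easier to work with. The vector x and its second element are assumptions for illustration.

```r
# Hypothetical input; the second element is invented for illustration
x <- c("1899-12-31 17:20:00 UTC", "1899-12-31 08:05:30 UTC")

# Split each string on single spaces; the result is a list with one
# character vector of fields per input string
parts <- strsplit(x, " ", fixed = TRUE)

# Take the second field (the time) from every element
times <- vapply(parts, function(p) p[2], character(1))

print(times)
```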

A neat idea from @akrun's comment is to parse the string with as.POSIXct and format only the time part

> format(as.POSIXct("1899-12-31 17:20:00 UTC"), "%H:%M:%S")
[1] "17:20:00"
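
As far as I can tell, the trailing " UTC" in the string is not interpreted when parsing, so the call above works in the session's local time zone. If you would rather not depend on that, a variant that passes the time zone explicitly could look like this (a sketch, assuming the timestamps really are in UTC):

```r
x <- "1899-12-31 17:20:00 UTC"

# Parse as UTC explicitly instead of relying on the session time zone
parsed <- as.POSIXct(x, tz = "UTC")

# Format only the time part, again in UTC
time_only <- format(parsed, "%H:%M:%S", tz = "UTC")

print(time_only)
```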