I did a OSM-extract and have the column "osm_openin" for the opening hours for each object in R. It has the following structure:
I would love to have new columns for each day of the week, with a symbol "X" - if it is not open all day - or the according opening hours for the day "07:00 - 21:00".
My solution:
Firstly, I am thinking of using representative values for the week days "Mo = 1", "Tu = 2"..."Su = 7". It is important, if the day/value itself is not explicitly mentioned, but is exisiting in an intervall.
For each value, I am searching its existence in the column. If it finds the value, I'll take the opening hours following directly after (don't know which R command to use for that) If not, then the value has to be in an intervall. For example "2" (Tuesday) is not existing, then the script needs to realize Tuesday is between Mo-Sa. (don't know which method to use for that).
Public Holiday is not important.
Any suggestion for a solution?
Thanks.
CodePudding user response:
I don't know the best way, but may be I can help you. Firstly we need to create array of weekdays:
wdays <- c("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")
Now let's write code for converting text from "Mo,We-Fr"
to vector c(1, 3, 4, 5)
. Algorithm:
- Delete information about holidays (
"PH", "SH"
); - Replace name of weekday with number (
"Mo"
-->1
,"Tu"
-->2
, etc.); - Replace
-
with:
. For example,3-5
will be3:5
and it is R-style code; - Add
c(
to the beginning and)
to the end. For example,1,3:5
will bec(1, 3:5)
; c(1, 3:5)
is R-style vector and we can create vector by text (eval(parse(text = "c(1, 3:5)"))
).
Full code:
GetWDays <- function(x, wdays) {
holi <- c("PH", "SH")
x <- gsub(paste0("(,|^)", holi, collapse = "|"), "", x) #delete holidays
for (i in 1:7) {
x <- gsub(wdays[i], i, x)
}
x <- gsub("-", ":", x)
x <- paste0("c(", x, ")")
wday_idx <- eval(parse(text = x))
return(wday_idx)
}
Let's create function that has opening hours (like "Mo-Fr 6:30-19:00;Sa 09:00-17:00;Su,PH 09:00-15:00"
) as input and returns data.frame with 7 columns (for each weekday). Algorithm:
- Split text by
;
; Now we will work with one part of text (for example,"Mo-Fr 6:30-19:00"
); - Split text by
"Mo-Fr 6:30-19:00"
-->"Mo-Fr"
and"6:30-19:00"
- First part (
"Mo-Fr"
) we put intoGetWDays
and we make vector from second part (it's size will be like as first part size). Example:"Mo-Fr"
-->c(1,2,3,4,5)
,"6:30-19:00"
-->rep("6:30-19:00", 5)
; - Make data.frame from 2 vectors (
Day
andTime
); - Use
bind_rows
for each part from first step. Now we have big data.frame, but some weekdays may be missing, and some weekdays may have"Off"
in columnTime
; - So add rows for missing weekdays (by
merge
) and replace "Off
" andNA
with"X"
(as you want); - Transpose data.frame and return
Full code:
GetTimetable <- function(x) {
wdays <- c("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")
tmp <- strsplit(strsplit(x, ";")[[1]], " ")
tmp <- lapply(tmp, function(x) {Day <- GetWDays(x[1], wdays); data.frame(Day, Time = rep(x[2], length(Day)))})
tmp <- bind_rows(tmp) %>% arrange(Day) %>% as.data.frame()
tmp <- merge(data.frame(Day = 1:7), tmp, all.x = T, by = "Day")
tmp$Time[is.na(tmp$Time) | tmp$Time == "Off"] = "X"
tmp <- tmp %>% t() %>% "["(2, ) %>% as.list() %>% setNames(wdays) %>% bind_cols()
return(tmp)
}
If you want to apply GetTimetable
for each row you can use this code:
df_time <- df$osm_openning %>% lapply(GetTimetable) %>% bind_rows()
And if you want to add this data.frame to your data you can do something like this:
df <- bind_cols(df, df_time)