I have a dataframe imported from excel with read_excel
that looks like this:
The main task is to handle the different formats of dates:
I would like to implement it into a custom function (and I am not good at all in creating functions):
df <- structure(list(date = c("40574", "40861", "40870", "40990", "07.03.2022",
"14.03.2022", "16.03.2022", "27.03.2022", "24.03.2022", "24.03.2022"
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
# A tibble: 10 x 1
date
<chr>
1 40574
2 40861
3 40870
4 40990
5 07.03.2022
6 14.03.2022
7 16.03.2022
8 27.03.2022
9 24.03.2022
10 24.03.2022
I solved this task with this code:
library(tidyverse)
library(janitor)
library(lubridate)
df %>%
mutate(new_col = excel_numeric_to_date(as.numeric(as.character(date)), date_system = "modern"), .before=1) %>%
mutate(date = ifelse(!str_detect(date, '\\.'), NA_character_, date)) %>%
mutate(date = dmy(date)) %>%
mutate(date = coalesce(date, new_col), .keep="unused")
From this code I would like to make a custom function with this code:
mixed_dateColumn_excel <- function(df, x) {
x <- {{x}}
df %>%
mutate(new_col = excel_numeric_to_date(as.numeric(as.character(x)), date_system = "modern"), .before=1) %>%
mutate(x = ifelse(!str_detect(x, '\\.'), NA_character_, x)) %>%
mutate(x = dmy(x)) %>%
mutate(x = coalesce(x, new_col), .keep="unused")
}
I would like to know why:
This works not:
mixed_dateColumn_excel(df, "date")
This works not also:
mixed_dateColumn_excel(df, date)
And this works:
mixed_dateColumn_excel(df, df$date)
CodePudding user response:
You can't unquote in this way using the curly-curly operator TarJae. It has to be done directly inside the tidyverse functions.
In your version, the line:
x <- {{x}}
doesn't really do anything - you would get the same results if you removed this line altogether. Your third version where you use df$date
doesn't really work, since it creates a column called x
, with the calculations inside the mutate
functions simply working on the passed vector df$date
. The version with "date"
doesn't work because you are passing a literal character string for the calculations, and date
doesn't work because without using data masking properly, R cannot find this variable.
Perhaps the easiest way to do this, since you also want to use the unquoted x
for the column name inside mutate
is to use rlang::ensym
, but you still need to unquote with the bang-bang operator, and when you are assigning columns you need the assignment operator, :=
mixed_dateColumn_excel <- function(df, x) {
x <- rlang::ensym(x)
df %>%
mutate(new_col = suppressWarnings(janitor::excel_numeric_to_date(
as.numeric(as.character(!!x)),
date_system = "modern")), .before = 1) %>%
mutate(!!x := ifelse(!str_detect(!!x, '\\.'), NA_character_, !!x)) %>%
mutate(!!x := lubridate::dmy(!!x)) %>%
mutate(!!x := coalesce(!!x, new_col), .keep="unused")
}
mixed_dateColumn_excel(df, date)
#> # A tibble: 10 x 1
#> date
#> <date>
#> 1 2011-01-31
#> 2 2011-11-14
#> 3 2011-11-23
#> 4 2012-03-22
#> 5 2022-03-07
#> 6 2022-03-14
#> 7 2022-03-16
#> 8 2022-03-27
#> 9 2022-03-24
#> 10 2022-03-24
Created on 2022-04-12 by the reprex package (v2.0.1)
CodePudding user response:
{{
is not a real operator, tidyverse functions are designed to recognise it in arguments and do some magic.
{{
can be used in names too so no need for ensym()
here, you can do :
mixed_dateColumn_excel <- function(df, x) {
df %>%
mutate(
new_col = excel_numeric_to_date(
as.numeric(as.character({{ x }})),
date_system = "modern"
),
.before = 1
) %>%
mutate("{{ x }}" := ifelse(!str_detect({{ x }}, '\\.'), NA_character_, {{ x }})) %>%
mutate("{{ x }}" := lubridate::dmy({{ x }})) %>%
mutate("{{ x }}" := coalesce({{ x }}, new_col), .keep="unused")
}