Using a function to perform a repetitive task in R

Time:01-19

I have a very large dataset full of dates in 2015. Every single day is listed in this dataset, but some dates are repeated multiple times depending on how many incidents happened that day. I want a separate dataset for every day.

So far, I have done:

df2 <- split(df1, df1$date) 

This gives me a list, "df2", with one element per date. I can view each day by typing

View(df2[["2015-01-01"]])
View(df2[["2015-01-02"]])
Etc.

Is there a way to have a function perform the following assignments for every day:

jan1 <- df2[["2015-01-01"]]
jan2 <- df2[["2015-01-02"]]
Etc.

CodePudding user response:

This looks to be an XY problem, but here is a potential solution to the question asked:

df1 <- data.frame(date = c("2015-01-01", "2015-01-02"),
                  val1 = c(1,2),
                  val2 = c(10, 20))
df2 <- split(df1, df1$date) 

df2[["2015-01-01"]]
#>         date val1 val2
#> 1 2015-01-01    1   10
df2[["2015-01-02"]]
#>         date val1 val2
#> 2 2015-01-02    2   20

list2env(df2, envir = .GlobalEnv)
#> <environment: R_GlobalEnv>

ls()
#> [1] "2015-01-01" "2015-01-02" "df1"        "df2"

# so, there are now 4 objects in your global environment:
# the original dataframe "df1", the list "df2", and the two
# per-day dataframes "2015-01-01" and "2015-01-02".
# Because these names are not syntactic, access each 'day' using backticks,
# e.g. View(`2015-01-01`)

`2015-01-01`
#>         date val1 val2
#> 1 2015-01-01    1   10

Created on 2023-01-19 with reprex v2.0.2
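
If the per-day data frames are only needed as inputs to the same operation, it may be simpler to leave them in the list and iterate over it rather than creating one global variable per day. The following is a minimal sketch under that assumption; the toy data and the row-count summary are purely illustrative and are not part of the original question:

```r
# Toy data with the same shape as the example above (one repeated date)
df1 <- data.frame(date = c("2015-01-01", "2015-01-02", "2015-01-02"),
                  val1 = c(1, 2, 3),
                  val2 = c(10, 20, 30))
df2 <- split(df1, df1$date)

# Apply the same function to every day's data frame;
# the result is a named integer vector, one entry per date
rows_per_day <- vapply(df2, nrow, integer(1))
rows_per_day

# Individual days remain accessible by name
df2[["2015-01-02"]]
```

Any function that takes a data frame can be substituted for nrow here (with lapply() if the result is not a single value per day).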

CodePudding user response:

I agree with some of the comments that there might be better ways to manipulate this data for use in ggplot2.

However, expanding on jared_mamrot's answer, you can also rename those dataframe variables using lubridate functions:

library(dplyr)
df <- data.frame(date = c("2023-01-18", "2023-01-18", "2023-01-18",
                          "2021-02-05", "2021-02-05", "2020-07-21")) %>%
  tibble::rowid_to_column() %>%
  mutate(newnames = paste0(lubridate::month(date, label = TRUE),
                           lubridate::day(date)))

# one data frame per distinct value of newnames is created in the global environment
split(df, df$newnames) %>% list2env(envir = .GlobalEnv)


This will give you variable names similar to what you specified, but note that this won't work as intended if the same month-and-day pair occurs in different years. For example, "2023-01-18" and "2020-01-18" would both be labelled "Jan18", so split() would put their rows into a single data frame instead of keeping the two days separate.
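
To keep the same day in different years apart, the year can be included in the generated name. The following is a minimal sketch using base R's format(); the name pattern (month, day, underscore, year) and the object names df and by_day are assumptions chosen for illustration, and the month abbreviation produced by %b depends on the session locale:

```r
df <- data.frame(date = c("2023-01-18", "2023-01-18", "2020-01-18"))

# Build names that include the year, so 18 January 2023 and
# 18 January 2020 end up under different names
df$newnames <- format(as.Date(df$date), "%b%d_%Y")

by_day <- split(df, df$newnames)
names(by_day)

# Optionally copy each element into the global environment:
# list2env(by_day, envir = .GlobalEnv)
```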
