It's easier to explain what I want to do if you look at the code first, but essentially I think I want to apply lapply conditionally, and I wasn't able to do it.
library("tidyverse")
names <- rep(c("City A", "City B"), each = 11)
year <- rep(c(2010:2020), times = 2)
col_1 <- c(1, 17, 34, 788, 3, 4, 78, 98, 650, 45, 20,
           23, 45, 56, 877, 54, 12, 109, 167, 12, 19, 908)
col_2 <- c(3, 4, 23, 433, 2, 45, 34, 123, 98, 76, 342,
           760, 123, 145, 892, 23, 5, 90, 40, 12, 67, 98)
df <- as.data.frame(cbind(names, year, col_1, col_2))
df <- df %>%
  mutate(col_1 = as.numeric(col_1),
         col_2 = as.numeric(col_2))
I want every numeric column, for the year 2018 and later, to be rounded to a multiple of three with plyr::round_any(x, 3). What I tried is this:
df_2018 <- df %>%
  filter(year >= 2018)
df <- df %>%
  filter(!(year >= 2018))
df_2018[, c(3:4)] <- lapply(df_2018[, c(3:4)], plyr::round_any, 3)
df <- rbind(df, df_2018)
In reality, there are about 50 numeric columns and a large number of rows. What I tried works, but I would like to achieve it with less and cleaner code. I am new to lapply, and I failed when trying to combine it with ifelse, because I don't want it to change my year column.
Thank you to everyone who takes time out of their day to look at this :)
CodePudding user response:
Using dplyr::across and if_else you could do:
library(dplyr)
df |>
  mutate(across(-c(names, year), ~ if_else(year >= 2018, plyr::round_any(.x, 3), .x)))
#> names year col_1 col_2
#> 1 City A 2010 1 3
#> 2 City A 2011 17 4
#> 3 City A 2012 34 23
#> 4 City A 2013 788 433
#> 5 City A 2014 3 2
#> 6 City A 2015 4 45
#> 7 City A 2016 78 34
#> 8 City A 2017 98 123
#> 9 City A 2018 651 99
#> 10 City A 2019 45 75
#> 11 City A 2020 21 342
#> 12 City B 2010 23 760
#> 13 City B 2011 45 123
#> 14 City B 2012 56 145
#> 15 City B 2013 877 892
#> 16 City B 2014 54 23
#> 17 City B 2015 12 5
#> 18 City B 2016 109 90
#> 19 City B 2017 167 40
#> 20 City B 2018 12 12
#> 21 City B 2019 18 66
#> 22 City B 2020 909 99
CodePudding user response:
Using data.table:
library(data.table)
# Select the columns named col_<digits>; `+` matches one or more digits
cols <- grep("^col_[0-9]+$", names(df), value = TRUE)
setDT(df)[year >= 2018, (cols) := round(.SD / 3) * 3, .SDcols = cols]
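Since the question asks about lapply specifically, the same conditional update can also be sketched in base R. This is only a minimal sketch on a smaller, made-up data frame, and it assumes year is stored as a number (unlike in the question, where cbind turns every column into character). The helper names num_cols and rows are illustrative. round(x / 3) * 3 stands in for plyr::round_any(x, 3), which computes the same thing with its default rounding function.

```r
# Small example data; year is kept numeric here (an assumption)
df <- data.frame(
  names = rep(c("City A", "City B"), each = 3),
  year  = rep(2017:2019, times = 2),
  col_1 = c(98, 650, 45, 167, 12, 19),
  col_2 = c(123, 98, 76, 40, 12, 67)
)

# All numeric columns except year, so year is never rounded
num_cols <- setdiff(names(df)[sapply(df, is.numeric)], "year")

# Logical index of the rows to change
rows <- df$year >= 2018

# Round only the selected rows of the selected columns to a multiple of 3
df[rows, num_cols] <- lapply(
  df[rows, num_cols, drop = FALSE],
  function(x) round(x / 3) * 3
)
```

Because the assignment targets only the rows where year >= 2018, there is no need to split the data frame and rbind it back together, and the original row order is preserved.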