At the beginning, I have a dataframe:
year
----
1998
1998
2002
I would like to mutate a group of variables whose column names contain the month:
"001", "002", "003", "004", "005" ... "010", "011", "012", along with some other character strings.
The values of variables are mutated with conditions:
if year<=2000, "ks"month = "ks"month"_v1",
if year>2000, "ks"month = "ks"month"_v2",
As a result, it should look like:
year ks001 ks002 ... ks012
-------------------------------------
1998 ks001_v1 ks002_v1 ... ks012_v1
1998 ks001_v1 ks002_v1 ... ks012_v1
2002 ks001_v2 ks002_v2 ... ks012_v2
...
This is what I have in mind, but I'm wondering if there's a way (for example, a for loop) to avoid repeating this line 12 times.
data <- data %>% mutate(ks0xx=ifelse(year<=2000,paste0(ks0xx,"_v1"),paste0(ks0xx,"_v2")))
Thanks in advance.
CodePudding user response:
I suspect that the X in _vX refers to the millennium. If so, this will do the trick (I created some data for the example):
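library(dplyr)

# Column names ks001 ... ks012
cnames <- c(paste0("ks00", 1:9),
            paste0("ks0", 10:12))

# Example data: seven random years, sorted
df <- data.frame(year = sample(1990:2015, 7, replace = TRUE)) %>%
  arrange(year)

# Suffix is 1 for years up to 2000 and 2 afterwards (TRUE + 1 == 2)
for (i in seq_along(cnames)) {
  df[, cnames[i]] <- paste0(cnames[i], "_v", (df[, "year"] > 2000) + 1)
}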
Outputs:
> df
year ks001 ks002 ks003 ks004 ks005 ks006 ks007 ks008 ks009 ks010 ks011 ks012
1 1998 ks001_v1 ks002_v1 ks003_v1 ks004_v1 ks005_v1 ks006_v1 ks007_v1 ks008_v1 ks009_v1 ks010_v1 ks011_v1 ks012_v1
2 2005 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
3 2007 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
4 2009 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
5 2012 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
6 2014 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
7 2014 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
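If the explicit loop feels heavy, the same idea can probably be written with lapply() in base R. This is only a sketch: the years below are made up for illustration, and sprintf() is swapped in as a shorter way to build the same column names.

```r
# Column names ks001 ... ks012 (sprintf pads the month to three digits)
cnames <- sprintf("ks%03d", 1:12)

# Hypothetical example data, not taken from the question
df <- data.frame(year = c(1998, 2000, 2005))

# One suffix per row: "_v1" up to and including 2000, "_v2" afterwards
suffix <- ifelse(df$year <= 2000, "_v1", "_v2")

# Build all 12 columns at once; each list element becomes one column
df[cnames] <- lapply(cnames, function(nm) paste0(nm, suffix))
```

Computing the suffix once and reusing it keeps the year condition in a single place.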
CodePudding user response:
I would do this without any loops. You can build a long data frame with one row per combination of year and column name, define the value you want with ifelse(), and then pivot the data frame to wide format. I added an id column so that rows with the same year stay distinct.
library(tidyverse)

my_dat <- tibble(year = c(1998, 1999, 2001))

my_dat |>
  # one list-column holding all 12 column names, then one row per name
  mutate(id = row_number(),
         p_col = list(paste0("ks", c(paste0("00", 1:9), paste0("0", 10:12))))) |>
  unnest(p_col) |>
  # years up to 2000 get _v1, later years get _v2
  mutate(value = ifelse(year <= 2000, paste0(p_col, "_v1"), paste0(p_col, "_v2"))) |>
  pivot_wider(names_from = p_col, values_from = value)
#> # A tibble: 3 x 14
#> year id ks001 ks002 ks003 ks004 ks005 ks006 ks007 ks008 ks009 ks010 ks011
#> <dbl> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1998 1 ks001~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks01~ ks01~
#> 2 1999 2 ks001~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks01~ ks01~
#> 3 2001 3 ks001~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks01~ ks01~
#> # ... with 1 more variable: ks012 <chr>
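If the ks columns already exist in the data, as the mutate() line in the question suggests, dplyr's across() may be the most direct way to apply that one line to all 12 columns. The following is a sketch under that assumption; the example data and the name result are made up for illustration.

```r
library(dplyr)

# Column names ks001 ... ks012
cnames <- sprintf("ks%03d", 1:12)

# Hypothetical starting data: each ks column initially holds its own name
data <- data.frame(year = c(1998, 1998, 2002))
data[cnames] <- lapply(cnames, rep, times = nrow(data))

# Apply the question's ifelse() to every ks column in one mutate() call;
# .x is the current column, year is looked up in the same data frame
result <- data %>%
  mutate(across(all_of(cnames),
                ~ ifelse(year <= 2000, paste0(.x, "_v1"), paste0(.x, "_v2"))))
```

all_of(cnames) selects the columns by exact name; a helper such as starts_with("ks") would also work if no other column shares that prefix.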