Mutate variables with a condition in R

Time:12-20

At the beginning, I have a dataframe:

year
----
1998
1998
2002

I would like to mutate a group of variables whose column names consist of a month code ("001", "002", "003", "004", "005" ... "010", "011", "012") and some other character strings.
The values of the variables are set according to these conditions:

if year<=2000, "ks"month = "ks"month"_v1",
if year>2000, "ks"month = "ks"month"_v2",

As a result, it should look like:

year  ks001    ks002    ...  ks012
-------------------------------------
1998  ks001_v1 ks002_v1 ...  ks012_v1
1998  ks001_v1 ks002_v1 ...  ks012_v1
2002  ks001_v2 ks002_v2 ...  ks012_v2
...

This is what I have in mind, but I'm wondering if there's a way (for example, a for loop) to avoid repeating this line 12 times.

data <- data %>% mutate(ks0xx=ifelse(year<=2000,paste0(ks0xx,"_v1"),paste0(ks0xx,"_v2")))

Thanks in advance.

CodePudding user response:

I suspect that X in _vX means the millennium. If so, this will do the trick (I created some data for the example):

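library(dplyr)
# Zero-padded column names: ks001 ... ks012
cnames <- c(paste0("ks00", 1:9),
            paste0("ks0", 10:12))
# Example data: seven random years, sorted
df <- data.frame(year = sample(1990:2015, 7, replace = TRUE)) %>%
  arrange(year)

# (year > 2000) is FALSE/TRUE, so adding 1 gives version 1 or 2
for (i in seq_along(cnames)) {
  df[, cnames[i]] <- paste0(cnames[i], "_v", (df[, "year"] > 2000) + 1)
}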

Outputs:

> df
  year    ks001    ks002    ks003    ks004    ks005    ks006    ks007    ks008    ks009    ks010    ks011    ks012
1 1998 ks001_v1 ks002_v1 ks003_v1 ks004_v1 ks005_v1 ks006_v1 ks007_v1 ks008_v1 ks009_v1 ks010_v1 ks011_v1 ks012_v1
2 2005 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
3 2007 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
4 2009 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
5 2012 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
6 2014 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
7 2014 ks001_v2 ks002_v2 ks003_v2 ks004_v2 ks005_v2 ks006_v2 ks007_v2 ks008_v2 ks009_v2 ks010_v2 ks011_v2 ks012_v2
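
The loop relies on R coercing a logical vector to 0/1 in arithmetic, so (year > 2000) + 1 gives 1 or 2 for each row. Here is a minimal sketch of the same idea using the three years from the question instead of a random sample, so the result is reproducible. The sprintf() call is swapped in as a shorter way to build the zero-padded names, and the helper variable version is introduced only for illustration:

```r
# Fixed years from the question instead of sample(), so the result is reproducible
df <- data.frame(year = c(1998, 1998, 2002))

# Zero-padded column names: "ks001" ... "ks012"
cnames <- sprintf("ks%03d", 1:12)

# A logical vector becomes 0/1 in arithmetic, so this is 1 or 2 per row
version <- (df$year > 2000) + 1

# Create one column per name; paste0() is vectorised over the rows
for (nm in cnames) {
  df[[nm]] <- paste0(nm, "_v", version)
}

df[, c("year", "ks001", "ks012")]
```

Note that a year of exactly 2000 gets _v1 here, which matches the year<=2000 branch in the question.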

CodePudding user response:

I think that I would do this without any loops. You can make a long dataframe with all the columns you want at once, define the variable that you want with the ifelse statement, and then pivot the dataframe to wide format. I added an id column in case there are multiple years that are the same.

library(tidyverse)

my_dat <- tibble (year = c(1998, 1999, 2001)) 

my_dat |>
  mutate(id = row_number(),
         p_col = list(paste0("ks", c(paste0("00", 1:9), paste0("0", 10:12))))) |>
  unnest(p_col) |>
  mutate(value = ifelse(year <= 2000, paste0(p_col, "_v1"), paste0(p_col, "_v2"))) |>
  pivot_wider(names_from = p_col, values_from = value)
#> # A tibble: 3 x 14
#>    year    id ks001  ks002 ks003 ks004 ks005 ks006 ks007 ks008 ks009 ks010 ks011
#>   <dbl> <int> <chr>  <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1  1998     1 ks001~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks01~ ks01~
#> 2  1999     2 ks001~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks01~ ks01~
#> 3  2001     3 ks001~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks00~ ks01~ ks01~
#> # ... with 1 more variable: ks012 <chr>
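
If the ks columns already exist in the data, as the mutate() line in the question assumes, dplyr's across() can apply the same ifelse() to all twelve columns in one call. This is a minimal sketch, assuming dplyr 1.0 or later. The starting data is made up for illustration: each column is assumed to hold its own name as a string.

```r
library(dplyr)

# Hypothetical starting data: each ks column holds its own name as text
cnames <- sprintf("ks%03d", 1:12)
data <- data.frame(year = c(1998, 1998, 2002))
for (nm in cnames) data[[nm]] <- nm

# Apply the same condition to every column whose name starts with "ks";
# .x is the current column, year is looked up in the data frame
data <- data %>%
  mutate(across(starts_with("ks"),
                ~ ifelse(year <= 2000, paste0(.x, "_v1"), paste0(.x, "_v2"))))

data[, c("year", "ks001", "ks012")]
```

Using starts_with("ks") keeps the year column untouched. If other columns also begin with "ks", all_of(cnames) would be a stricter selector.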