Iterating and Adding a column with user defined Calculations in R

Time:09-28

My data looks like this:

structure(list(hhid = c(5668, 5595, 4724, 4756, 4856, 4730), 
    me = c(1L, 1L, 0L, 0L, 0L, 0L), ls = c(0L, 0L, 1L, 1L, 1L, 
    1L), lsn = c("", "", "Goatry", "Goatry", "Goatry", "Goatry"
    ), ag = c(0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 6L), class = "data.frame")

For every hhid, I want to add a new column named mt. If there is a 1 in the me column, mt should be 1000. If me is 0, the check should move on to the next column, ls.

If ls is 1 and lsn is "Goatry", mt should be 150; if lsn is "Poultry", mt should be 300. If ls is 0, the check should move on to the next column, ag: if ag is 1, mt should be 600. So basically I want to go through the columns in turn for every hhid and add a new column holding the resulting value, as described above.
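
Written out as code, the rules above amount to an ordered chain of checks. A minimal base R sketch, assuming the first matching rule wins and that rows matching no rule should get NA (the question does not say what should happen in that case):

```r
# Sample data copied from the question
df <- structure(list(hhid = c(5668, 5595, 4724, 4756, 4856, 4730),
    me = c(1L, 1L, 0L, 0L, 0L, 0L), ls = c(0L, 0L, 1L, 1L, 1L, 1L),
    lsn = c("", "", "Goatry", "Goatry", "Goatry", "Goatry"),
    ag = c(0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 6L), class = "data.frame")

# Rules are checked in order; each nested ifelse() is only reached
# when the conditions above it are FALSE for that row.
df$mt <- with(df,
  ifelse(me == 1, 1000,
  ifelse(ls == 1 & lsn == "Goatry", 150,
  ifelse(ls == 1 & lsn == "Poultry", 300,
  ifelse(ls == 0 & ag == 1, 600, NA_real_)))))  # NA is an assumed fallback

df
```

Nested ifelse() gets hard to read beyond a few rules, which is one reason a helper such as case_when() may be preferable.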

The expected output in the OP's comment is

df2 <-
structure(list(hhid = c(2L, 3L, 1347L, 642L), 
me = c(1L, 0L, 0L, 0L), ls = c(0L, 1L, 1L, 0L), 
lsn = c("", "Goatry", "Poultry", ""), 
ag = c(0L, 0L, 0L, 1L), mt = c(1000L, 150L, 300L, 600L)), 
row.names = c(NA, 4L), class = "data.frame")

CodePudding user response:

You could try mutate() with case_when():

library(dplyr)
df %>% 
  mutate(mt = case_when(
    me==1 ~ 1000L,
    me==0 & ls ==1 & lsn=="Goatry" ~ 150L,
    me==0 & ls ==1 & lsn=="Poultry" ~ 300L,
    ls==0 & ag==1 ~ 600L,
    TRUE ~ NA_integer_
  ))

  hhid me ls    lsn ag   mt
1 5668  1  0         0 1000
2 5595  1  0         0 1000
3 4724  0  1 Goatry  0  150
4 4756  0  1 Goatry  0  150
5 4856  0  1 Goatry  0  150
6 4730  0  1 Goatry  0  150

CodePudding user response:

We may use unite

library(dplyr)
library(tidyr)
df1 %>%
    # build one key per row from me, ls and lsn, e.g. "0_1_Goatry"
    unite(mt, me, ls, lsn, remove = FALSE) %>%
    mutate(mt = factor(mt, levels = c("1_0_", "0_1_Goatry", "0_1_Poultry"), 
       labels = c("1000", "150", "300")))

-output

  hhid   mt me ls    lsn ag
1 5668 1000  1  0         0
2 5595 1000  1  0         0
3 4724  150  0  1 Goatry  0
4 4756  150  0  1 Goatry  0
5 4856  150  0  1 Goatry  0
6 4730  150  0  1 Goatry  0
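
Note that factor() returns a factor, so mt above prints like numbers but is not numeric. A rough base R sketch of the same key-lookup idea that yields numbers directly; the "0_1_Poultry" key is an assumption based on the Poultry rule in the question, and the names key and lookup are made up for illustration:

```r
# Sample data copied from the question
df1 <- structure(list(hhid = c(5668, 5595, 4724, 4756, 4856, 4730),
    me = c(1L, 1L, 0L, 0L, 0L, 0L), ls = c(0L, 0L, 1L, 1L, 1L, 1L),
    lsn = c("", "", "Goatry", "Goatry", "Goatry", "Goatry"),
    ag = c(0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 6L), class = "data.frame")

# One key per row built from the three deciding columns, e.g. "0_1_Goatry"
key <- paste(df1$me, df1$ls, df1$lsn, sep = "_")

# Named numeric vector mapping each key to its value (hypothetical helper)
lookup <- c("1_0_" = 1000, "0_1_Goatry" = 150, "0_1_Poultry" = 300)

# Keys missing from the lookup come back as NA; unname() drops the key labels
df1$mt <- unname(lookup[key])

df1
```

As with the unite version, the ag rule is not covered here; it would need ag added to the key.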

CodePudding user response:

I suggest that dplyr::case_when (or data.table::fcase) is the best option here.

Your first and second datasets are not the same, so I'll demonstrate on both.

library(dplyr)
dat %>%
  mutate(mt2 = case_when(
    me == 1                    ~ 1000, 
    ls == 1 & lsn == "Goatry"  ~ 150, 
    ls == 1 & lsn == "Poultry" ~ 300, 
    ls == 0 & ag == 1          ~ 600)
  )
#   hhid me ls    lsn ag  mt2
# 1 5668  1  0         0 1000
# 2 5595  1  0         0 1000
# 3 4724  0  1 Goatry  0  150
# 4 4756  0  1 Goatry  0  150
# 5 4856  0  1 Goatry  0  150
# 6 4730  0  1 Goatry  0  150

Your second dataset (storing the result in mt2 for comparison with the existing mt):

df2 %>%
  mutate(mt2 = case_when(
    me == 1                    ~ 1000, 
    ls == 1 & lsn == "Goatry"  ~ 150, 
    ls == 1 & lsn == "Poultry" ~ 300, 
    ls == 0 & ag == 1          ~ 600)
  )
#   hhid me ls     lsn ag   mt  mt2
# 1    2  1  0          0 1000 1000
# 2    3  0  1  Goatry  0  150  150
# 3 1347  0  1 Poultry  0  300  300
# 4  642  0  0          1  600  600
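
For the data.table::fcase alternative mentioned at the top of this answer, a minimal sketch on the second dataset might look like the following. It assumes the data.table package is installed; dt is just an illustrative name:

```r
library(data.table)

# Second dataset copied from the question
df2 <- structure(list(hhid = c(2L, 3L, 1347L, 642L),
    me = c(1L, 0L, 0L, 0L), ls = c(0L, 1L, 1L, 0L),
    lsn = c("", "Goatry", "Poultry", ""),
    ag = c(0L, 0L, 0L, 1L), mt = c(1000L, 150L, 300L, 600L)),
    row.names = c(NA, 4L), class = "data.frame")

dt <- as.data.table(df2)

# fcase() takes alternating condition/value pairs; the first TRUE condition
# wins, and rows matching no condition get the default (NA).
dt[, mt2 := fcase(
  me == 1,                    1000,
  ls == 1 & lsn == "Goatry",  150,
  ls == 1 & lsn == "Poultry", 300,
  ls == 0 & ag == 1,          600
)]

dt
```

All values passed to fcase() need to be of the same type, which is why they are written as plain doubles here rather than mixing 1000L with 150.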