Fill the data for a string variable depending on other variable in r?

Time:10-26

df_input is the data frame which needs to be transformed into df_output.

For instance, 2001-2003 is assembly = 1, and there was a winner in 2001. That winner should be carried forward to the following years as long as the assembly doesn't change. Similarly, there is a string variable called "party", which should be carried forward in the same way as long as the assembly is the same.

    df_input <- data.frame(winner  = c(1,0,0,0,2,0,0,0,1,0,0,0,0),
                           party = c("A",0,0,0,"B",0,0,0,"C",0,0,0,0), 
                           assembly= c(1,1,1,2,2,2,3,3,3,3,4,4,4), 
                           year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))    
    
    df_output <- data.frame(winner  = c(1,1,1,0,2,2,0,0,1,1,0,0,0),
                            party = c("A","A","A",0,"B","B",0,0,"C","C",0,0,0),
                            assembly= c(1,1,1,2,2,2,3,3,3,3,4,4,4), 
                            year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))    
    

The code below works fine for the numeric variable (winner). How can I do the same for the additional string variable, "party"?

I get the following error after implementing this code:

    df_output <- df_input %>%
      mutate(winner = if_else(winner > 0, winner, NA_real_)) %>% 
      group_by(assembly) %>% 
      fill(winner) %>% 
      ungroup() %>% 
      replace_na(list(winner = 0)) #working fine

    df_output <- df_input %>%
      mutate(party = ifelse(party>0, party, NA)) %>%
      group_by(assembly) %>%
      fill(party) %>%
      ungroup() %>%
      replace_na(list(party = 0))    

Error:

    Error in `vec_assign()`:
    ! Can't convert `replace$party` <double> to match type of `data$party` <character>.

CodePudding user response:

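You have to pay attention to the data types. Since party is a character column, the replacement value in replace_na must also be a character, so use "0" instead of 0. For the same reason, use NA_character_ as the missing value inside if_else: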

    library(dplyr)
    library(tidyr)

    df_input %>%
      mutate(winner = if_else(winner > 0, winner, NA_real_),
             party = if_else(party != "0", party, NA_character_)) %>% 
      group_by(assembly) %>% 
      fill(winner, party) %>% 
      ungroup() %>% 
      replace_na(list(winner = 0, party = "0"))
    #> # A tibble: 13 × 4
    #>    winner party assembly  year
    #>     <dbl> <chr>    <dbl> <dbl>
    #>  1      1 A            1  2001
    #>  2      1 A            1  2002
    #>  3      1 A            1  2003
    #>  4      0 0            2  2004
    #>  5      2 B            2  2005
    #>  6      2 B            2  2006
    #>  7      0 0            3  2007
    #>  8      0 0            3  2008
    #>  9      1 C            3  2009
    #> 10      1 C            3  2010
    #> 11      0 0            4  2011
    #> 12      0 0            4  2012
    #> 13      0 0            4  2013
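As a possible variation on the same idea, dplyr's na_if() can turn the placeholder values into NA without spelling out a typed NA constant, since it keeps the type of the column it is given. The following is only a sketch: it rebuilds the question's df_input (with year written as 2001:2013 for brevity, so it is an integer column here), and the name df_filled is just an illustrative choice.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's input. Note that c("A", 0, ...) coerces the zeros
# to the string "0", which is why party is a character column.
df_input <- data.frame(
  winner   = c(1, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0),
  party    = c("A", 0, 0, 0, "B", 0, 0, 0, "C", 0, 0, 0, 0),
  assembly = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4),
  year     = 2001:2013
)

df_filled <- df_input %>%
  # Replace the placeholders with NA; each column keeps its own type.
  mutate(winner = na_if(winner, 0),
         party  = na_if(party, "0")) %>%
  # Carry the last known value downwards, but only within one assembly.
  group_by(assembly) %>%
  fill(winner, party) %>%
  ungroup() %>%
  # Put the placeholders back, matching each column's type.
  replace_na(list(winner = 0, party = "0"))
```

The replacement values in replace_na() still have to match the column types (0 for the numeric column, "0" for the character column), which is the point the error message in the question is making.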