Overwriting Dataframe Column Values Based on List

Time:09-30

I'm having trouble trying to conceptualize how I would approach this issue.

df <- data.frame(id=c(1,2,3,4,5,6,7,8),
                 freq=c(1,4,7,8,13,12,5,3))
list <- data.frame(id=c(2,4,5,6),
                   freq=c(1,1,1,1))

I have a data frame and a "list" (really a second data frame) that resemble the ones above. I'm trying to clean df by replacing the freq column with a constant (in this case, 1), but only for the rows whose ids appear in list. How do I make sure that only those rows are changed? The actual data frame and list I'm working with are significantly longer, so manually writing a line of code for each specified id would be inefficient.

CodePudding user response:

Objects:

df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7, 8), 
                 freq = c(1, 4, 7, 8, 13, 12, 5, 3))

replacements <- data.frame(id = c(2, 4, 5, 6), 
                           freq = 1)

There are two options:

  1. Using package dplyr:
library(dplyr)

new_df1 <- df %>% 
  mutate(freq = if_else(condition = id %in% replacements$id, 
                        true = 1, 
                        false = freq))
  2. Using base R:
new_df2 <- df
new_df2$freq[which(new_df2$id %in% replacements$id)] <- 1
new_df2
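Both options above write the same constant into every matched row. If each id needed its own replacement value, a base R sketch using match() might look like the following. The replacement values 10, 20, 30 and 40 and the name new_df3 are made up for illustration and are not part of the question.

```r
df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7, 8),
                 freq = c(1, 4, 7, 8, 13, 12, 5, 3))

# Hypothetical per-id replacement values (assumption, not from the question)
replacements <- data.frame(id = c(2, 4, 5, 6),
                           freq = c(10, 20, 30, 40))

# Position of each df$id within replacements$id; NA where there is no match
idx <- match(df$id, replacements$id)
matched <- !is.na(idx)

new_df3 <- df
# Overwrite only the matched rows, taking the value from the matching row
new_df3$freq[matched] <- replacements$freq[idx[matched]]
new_df3
```

Rows whose id is not in replacements keep their original freq, because only the positions where match() found a partner are assigned.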

CodePudding user response:

If you just want to replace freq with 1, a plain ifelse() call is enough. For anything more complex, such as replacement values that differ per id, we can use dplyr::rows_update():

library(dplyr)

df %>% 
  mutate(freq = ifelse(id %in% list$id, 1, freq))
#>   id freq
#> 1  1    1
#> 2  2    1
#> 3  3    7
#> 4  4    1
#> 5  5    1
#> 6  6    1
#> 7  7    5
#> 8  8    3

library(dplyr)

df %>% 
  rows_update(list, by = "id")
#>   id freq
#> 1  1    1
#> 2  2    1
#> 3  3    7
#> 4  4    1
#> 5  5    1
#> 6  6    1
#> 7  7    5
#> 8  8    3

Created on 2022-09-29 by the reprex package (v0.3.0)
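As far as I know, rows_update() expects the key values in the update table to be unique and, by default, to exist in the table being updated. A minimal base R sketch of checking both conditions up front could look like this; the names updates, dup_ids and missing_ids are hypothetical, and updates holds the same data as list in the question.

```r
df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7, 8),
                 freq = c(1, 4, 7, 8, 13, 12, 5, 3))

# Same data as `list` in the question, renamed here (assumption) to avoid
# confusion with base::list()
updates <- data.frame(id = c(2, 4, 5, 6),
                      freq = c(1, 1, 1, 1))

# ids that appear more than once in the update table
dup_ids <- unique(updates$id[duplicated(updates$id)])

# ids in the update table that have no matching row in df
missing_ids <- setdiff(updates$id, df$id)

# Stop early with an error if either check fails
stopifnot(length(dup_ids) == 0, length(missing_ids) == 0)
```

With the example data both checks pass, so the update can proceed. On a longer real-world table, a check like this makes it easier to see which ids are the problem before any values are overwritten.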
