How to complete a column based on a vector using tidyr

Time:10-20

I have the following data frame:

library(tidyverse)

dat <- structure(list(res = c("A", "R", "D", "H", "I", "L", "K", "F", 
"T", "V"), nof_res = structure(c(1L, 3L, 3L, 4L, 1L, 1L, 4L, 
1L, 1L, 1L), .Dim = 10L, .Dimnames = structure(list(NULL), .Names = ""), class = "table")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -10L))

It looks like this:

   res   nof_res
   <chr> <table>
 1 A     1      
 2 R     3      
 3 D     3      
 4 H     4      
 5 I     1      
 6 L     1      
 7 K     4      
 8 F     1      
 9 T     1      
10 V     1   

I'd like to complete the res column of that data frame using the vector below, so that every value in the vector gets a row, and set nof_res to 0 for the values that are missing from the data.

   full_aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I", "L", "K", 
             "M", "F", "P", "S", "T", "W", "Y", "V")

The final desired result is this:

===  =======
res  nof_res
===  =======
A          1
R          3
N          0
D          3
C          0
E          0
Q          0
G          0
H          4
I          1
L          1
K          4
M          0
F          1
P          0
S          0
T          1
W          0
Y          0
V          1
===  =======

So the final data frame must have as many rows as length(full_aa), which is 20. How can I achieve that?

CodePudding user response:

Use complete() from tidyr:

library(tidyr)
library(dplyr)
dat %>% 
  mutate(nof_res = as.numeric(nof_res)) %>% 
  complete(res = full_aa, fill = list(nof_res = 0))

Output (rows are sorted alphabetically by res, not in the order of full_aa):

   res nof_res
1    A       1
2    C       0
3    D       3
4    E       0
5    F       1
6    G       0
7    H       4
8    I       1
9    K       4
10   L       1
11   M       0
12   N       0
13   P       0
14   Q       0
15   R       3
16   S       0
17   T       1
18   V       1
19   W       0
20   Y       0
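
If the rows should follow the order of full_aa rather than alphabetical order, one option is to turn res into a factor whose levels are full_aa before calling complete(). This is only a minimal sketch: the `dat` below is a plain-numeric stand-in for the data in the question (an assumption, so that the block runs on its own), and `out` is just an illustrative name.

```r
library(dplyr)
library(tidyr)

full_aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I", "L", "K",
             "M", "F", "P", "S", "T", "W", "Y", "V")

# Stand-in for the question's data, with nof_res already a plain numeric column
dat <- tibble(
  res     = c("A", "R", "D", "H", "I", "L", "K", "F", "T", "V"),
  nof_res = c(1, 3, 3, 4, 1, 1, 4, 1, 1, 1)
)

out <- dat %>%
  # A factor carries all 20 levels, in the order given by full_aa
  mutate(res = factor(res, levels = full_aa)) %>%
  # complete() expands a factor to all of its levels, including unobserved ones
  complete(res, fill = list(nof_res = 0)) %>%
  # Back to character, as in the original data
  mutate(res = as.character(res))

out
```

The result should have 20 rows, starting A, R, N, D as in the desired table.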

CodePudding user response:

I believe you need enframe and a join:

enframe(full_aa, name = NULL, value = "res") %>%
  left_join(dat, by = "res") %>%
  mutate(nof_res = replace_na(nof_res, 0.))
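
For reference, a self-contained sketch of the join approach, using the data from the question. It converts nof_res with as.numeric() before joining, as the other answer does; that step is an assumption on my part, so the code does not rely on how replace_na() handles a column of class "table". The name `joined` is only illustrative.

```r
library(tibble)
library(dplyr)
library(tidyr)

full_aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I", "L", "K",
             "M", "F", "P", "S", "T", "W", "Y", "V")

# Data from the question; nof_res has class "table"
dat <- structure(list(res = c("A", "R", "D", "H", "I", "L", "K", "F",
"T", "V"), nof_res = structure(c(1L, 3L, 3L, 4L, 1L, 1L, 4L,
1L, 1L, 1L), .Dim = 10L, .Dimnames = structure(list(NULL), .Names = ""), class = "table")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))

joined <- enframe(full_aa, name = NULL, value = "res") %>%
  # left_join() keeps every row of the left table, in its original order
  left_join(mutate(dat, nof_res = as.numeric(nof_res)), by = "res") %>%
  # Values of full_aa with no match in dat get NA, which becomes 0
  mutate(nof_res = replace_na(nof_res, 0))

joined
```

Because the left table is built from full_aa, the rows come out in the order of full_aa without any extra sorting.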