How to return the names of a list where the indexes of two lists match? (R)

Time:06-04

I have a data frame column (df$source_index) with 100 observations whose values are indexes from 1 to 17, and a named list (list_df) whose elements sit at the same positions 1 to 17. How do I create (mutate) a column in this data frame that, wherever the value in df$source_index equals the position of an element in list_df, returns that element's name?

df

dput(df)

structure(list(value = c(A,B,C,D,E,F,....N), source_index = c(1,1,1,2,2,3..17)), .Names = c("value ", 
"source_index"), row.names = c(NA, 17L), class = "data.frame")

    #   value source_index    
    # 1  A        1    
    # 2  B        1     
    # 3  C        1   
    # 4  D        2   
    # 5  E        2    
    # 6  F        3
    # .  . .
    # 17  N       17
     

Large named list list_df that has index 1-17 with 17 elements:

list[17]
# name    Type    Value 
  apple char[5]   apple is red..
  orange char[8]  orange is sweet..
  ....
1st element: list_df[["apple"]]
2nd element: list_df[["orange"]]
...
17th element: list_df[["bana"]]

Desired output with a new column in df:

    #   value source_index new col 
    # 1  A       1        apple  
    # 2  B       1        apple
    # 3  C       1        apple 
    # 4  D       2        orange
    # 5  E       2        orange
    # 6  F       3        orange
    # .  . . 
    # 17  N 17            bana

I've tried match(df$source_index, names(list_df)) or seq_along(list_df), but it returned all NAs.

Thanks in advance!

CodePudding user response:

You can do this with dplyr, indexing the names of list_df by source_index:

df %>% mutate(new_col = names(list_df)[source_index])

Output (first 10 rows):

   value source_index new_col
1      W            1       a
2      V            1       a
3      M            1       a
4      H            1       a
5      Z            1       a
6      V            2       b
7      J            2       b
8      F            2       b
9      Y            3       c
10     B            3       c
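
The one-liner works because names(list_df) is a plain character vector, and subsetting it with the integer column source_index picks the name at each position, repeating names wherever an index repeats. A minimal sketch of just that step, assuming a hypothetical three-element list (the names apple, orange and bana come from the question; the values and the index vector are invented for illustration):

```r
# Hypothetical small list: names from the question, values invented
list_df <- list(
  apple  = "apple is red",
  orange = "orange is sweet",
  bana   = "placeholder text"
)

# Hypothetical index column, standing in for df$source_index
source_index <- c(1, 1, 2, 3, 3)

# Positional subsetting of the character vector of names
new_col <- names(list_df)[source_index]
print(new_col)
# [1] "apple"  "apple"  "orange" "bana"   "bana"
```

This only holds if the values in source_index are positions in list_df; an index larger than length(list_df) would give NA.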

Input:

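library(dplyr)  # provides %>%, mutate() and arrange()

set.seed(123)

# 100 random values with indexes 1-17, sorted by index
df = data.frame(value = sample(LETTERS, 100, replace = TRUE), source_index = sample(1:17, 100, replace = TRUE)) %>% arrange(source_index)

# Named list with 17 elements, named a to q
list_df = as.list(1:17)
names(list_df) <- letters[1:17]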
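
As for the attempt in the question: match(df$source_index, names(list_df)) looks for the index values among the names themselves, so the number 1 is compared with strings such as "a" or "apple" and nothing matches. The sketch below reuses the toy data above (without the sorting step) to show that, plus a base R version of the same lookup for cases where dplyr is not loaded. The column names new_col and new_col2 are just illustrative.

```r
set.seed(123)

# Toy data as in the answer: 100 rows, indexes 1-17
df <- data.frame(
  value        = sample(LETTERS, 100, replace = TRUE),
  source_index = sample(1:17, 100, replace = TRUE)
)

list_df <- as.list(1:17)
names(list_df) <- letters[1:17]

# Matching index values against the names finds nothing, hence all NAs
all_na <- all(is.na(match(df$source_index, names(list_df))))
print(all_na)

# Base R equivalent of the mutate() call: subset the names by position
df$new_col <- names(list_df)[df$source_index]

# match() does work when the indexes are matched against positions
df$new_col2 <- names(list_df)[match(df$source_index, seq_along(list_df))]

head(df)
```

Both new columns hold the same values here; the match() form is mainly useful if the index values are not guaranteed to be valid positions, since unmatched values then become NA.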
