Turning two columns of data in R into a series of lists as long as the columns

Time:07-04

I have two columns of data holding international abbreviations for countries (their ISO 2-Code and ISO 3-Code). I am going to use them as a "dictionary" to replace the 3-Codes in another table with their 2-Codes using the gsubfn() function. I had to create the dictionary list by entering it manually, as follows:

d_list<-list('ABW'='AW','AFG'='AF','AGO'='AO','AIA'='AI','ALA'='AX','ALB'='AL','AND'='AD',
          'ARE'='AE','ARG'='AR','ARM'='AM','ASM'='AS','ATG'='AG','AUS'='AU','AUT'='AT',
          'AZE'='AZ','BAH'='AZ','BDI'='BI','BEL'='BE','BEN'='BJ','BFA'='BF','BGD'='BD',
          'BGR'='BG','BHR'='BH','BIH'='BA','BLM'='BL','BLR'='BY','BLZ'='BZ','BMU'='BM',
          'BOL'='BO','BRA'='BR','BRB'='BB','BRN'='BN')

The full d_list has 212 elements, one per "=" pair; only the first few are shown above.

Is there a way to create d_list from a table, or from multiple tables, that hold the 2-Code and 3-Code in columns? That would make this process much faster and less error-prone.

CodePudding user response:

First, some test data:

df <- data.frame(code3 = c("ABW","AFG"), code2 = c("AW", "AF"))

If every 3-Code value is unique, split the 2-Codes by the 3-Codes to get one list element per code:

split(df$code2, df$code3)
#$ABW
#[1] "AW"
#
#$AFG
#[1] "AF"

If not, name the 2-Codes with setNames() and convert the result with as.list(), which keeps one element per row even when a 3-Code repeats:

as.list(setNames(df$code2, df$code3))
#$ABW
#[1] "AW"
#
#$AFG
#[1] "AF"
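
To make the difference between the two approaches concrete, here is a minimal sketch using base R only. The data frame df_dup and its "XX" value are made up for illustration; they are not real ISO codes.

```r
# Hypothetical test data in which the 3-Code "ABW" appears twice
df_dup <- data.frame(code3 = c("ABW", "AFG", "ABW"),
                     code2 = c("AW", "AF", "XX"),
                     stringsAsFactors = FALSE)

# split() groups the values by name: one element per distinct 3-Code,
# so the two "ABW" rows end up in a single character vector
by_split <- split(df_dup$code2, df_dup$code3)

# as.list(setNames()) keeps one element per row, so the name "ABW"
# occurs twice in the resulting list
by_names <- as.list(setNames(df_dup$code2, df_dup$code3))

length(by_split)
length(by_names)
names(by_names)
```

Note that split() also orders its result by the sorted 3-Codes, while as.list(setNames()) keeps the original row order.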

CodePudding user response:

A loop over the columns can build the list. The code below first extracts the two columns from the existing d_list, then rebuilds the list from them. This code (or something similar) could be wrapped in a function for convenience.

#Written in R version 4.2.1
#create column 1 (the 3-Codes) from the names of d_list
col1 = names(d_list)
col1
# [1] "ABW" "AFG" "AGO" "AIA" "ALA" "ALB" "AND" "ARE" "ARG" "ARM" "ASM"
#[12] "ATG" "AUS" "AUT" "AZE" "BAH" "BDI" "BEL" "BEN" "BFA" "BGD" "BGR"
#[23] "BHR" "BIH" "BLM" "BLR" "BLZ" "BMU" "BOL" "BRA" "BRB" "BRN"
 
#create column 2 (the 2-Codes) from the values of d_list
col2 = NULL
#seq_along() is safe even if d_list is empty, unlike 1:length(d_list)
for(i in seq_along(d_list)){
  col2[i] = d_list[[i]]
}
col2
# [1] "AW" "AF" "AO" "AI" "AX" "AL" "AD" "AE" "AR" "AM" "AS" "AG" "AU"
#[14] "AT" "AZ" "AZ" "BI" "BE" "BJ" "BF" "BD" "BG" "BH" "BA" "BL" "BY"
#[27] "BZ" "BM" "BO" "BR" "BB" "BN"

#reconstruct d_list from columns, called d_list2 here
d_list2 = list()
for(i in seq_along(col1)){
  d_list2[[i]] = col2[i]         #value: the 2-Code
  names(d_list2)[i] = col1[i]    #name: the 3-Code
}
#check that the rebuilt list matches the hand-typed one
all.equal(d_list,d_list2)
#[1] TRUE
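
As a sketch of the "written into a function" idea, the helper below builds the dictionary directly from two columns of a data frame. The name make_dict and its arguments are hypothetical (not part of base R or any package), and it assumes duplicate keys should be treated as an error.

```r
# make_dict() is a hypothetical helper written for illustration
make_dict <- function(data, key_col, value_col) {
  keys <- as.character(data[[key_col]])
  values <- as.character(data[[value_col]])
  # Assumption: each key may map to only one value, so duplicates are rejected
  if (anyDuplicated(keys) > 0) {
    stop("duplicate keys in column '", key_col, "'")
  }
  # Named character vector -> named list, one element per row
  as.list(setNames(values, keys))
}

# Small made-up table with the two code columns
codes <- data.frame(code3 = c("ABW", "AFG", "AGO"),
                    code2 = c("AW", "AF", "AO"),
                    stringsAsFactors = FALSE)

d_list_auto <- make_dict(codes, "code3", "code2")
str(d_list_auto)
```

The same call would work on a table read from a file, as long as the column names passed to the helper match the columns in that table.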