Home > OS >  Assign values from array to dataframe in R
Assign values from array to dataframe in R

Time:01-03

I have a dataframe with data about the US States.

One of the columns in the df is "Division", which tells the location where each state belongs to ("East North Central", "East South Central", "Middle Atlantic", "Mountain", "New England", "Pacific", "South Atlantic", "West North Central", "West South Central").

I created an array with the average expectancy life for each division, using an existing column called "Life Exp:

avg.life.exp = tapply(df[["Life Exp"]], df$Division, mean, na.rm=TRUE)

Which returns the following:

East North Central East South Central    Middle Atlantic 
      70.99000           69.33750           70.63667 
      Mountain        New England            Pacific 
      70.94750           71.57833           71.69400 
South Atlantic West North Central West South Central 
      69.52625           72.32143           70.43500 

Now I would like to add a new column to the df, with the average life expectancy of each Division. So basically I would like to do a Left Join, where if the state belonged to the East Noth Central, it would return 70.99000, and so on.

I need to do this without using packages.

Thank you in advance for any help you can provide!

CodePudding user response:

One option would be to use merge:

merge(df, data.frame(Division = names(avg.life.exp), avg.life.exp), all.x = TRUE)

A second option would be to use match

df$avg.life.exp <- avg.life.exp[match(df$Division, names(avg.life.exp))]

Using the gapminder dataset as example data:

library(gapminder)

# Example data
df <- gapminder[gapminder$year == 2007, c("country", "continent", "lifeExp")]

avg.life.exp <- tapply(df[["lifeExp"]], df$continent, mean, na.rm=TRUE)

avg.life.exp
#>   Africa Americas     Asia   Europe  Oceania 
#> 54.80604 73.60812 70.72848 77.64860 80.71950

# Using merge
df1 <- merge(df, data.frame(continent = names(avg.life.exp), avg.life.exp), all.x = TRUE)

head(df1)
#>   continent           country lifeExp avg.life.exp
#> 1    Africa           Reunion  76.442     54.80604
#> 2    Africa           Eritrea  58.040     54.80604
#> 3    Africa           Algeria  72.301     54.80604
#> 4    Africa       Congo, Rep.  55.322     54.80604
#> 5    Africa Equatorial Guinea  51.579     54.80604
#> 6    Africa            Malawi  48.303     54.80604

# Using match
df$avg.life.exp <- avg.life.exp[match(df$continent, names(avg.life.exp))]

head(df)
#> # A tibble: 6 × 4
#>   country     continent lifeExp avg.life.exp
#>   <fct>       <fct>       <dbl>        <dbl>
#> 1 Afghanistan Asia         43.8         70.7
#> 2 Albania     Europe       76.4         77.6
#> 3 Algeria     Africa       72.3         54.8
#> 4 Angola      Africa       42.7         54.8
#> 5 Argentina   Americas     75.3         73.6
#> 6 Australia   Oceania      81.2         80.7
  • Related