Home > Software design >  How to add a column in a dataframe using information from another dataframe with different lengths i
How to add a column in a dataframe using information from another dataframe with different lengths i

Time:11-19

I have two dataframes with different columns and rows lengths. I want to add two columns in the dataframe A from the dataframe B (columns X and Y) using the "region" column that is present in both dataframes. The figure shows how it looks like and how I would like to get (dataframe C, but can be add to dataframe A as well). I am just showing you the head, but it has a lot of species and the match should be done for all species and the region with x and Y columns. Could someone please help me? enter image description here

CodePudding user response:

You can use dplyr::left_join to add the columns based on region and then use select to select only the columns you need.

library(dplyr)
dataframe_a %>%
  left_join(dataframe_b, by = "region") %>%
  select(species, region, X, Y)

There are a lot of ways to join two tables. I'm only showing you one, but you can find more material here

CodePudding user response:

If you left_join you won`t get the desired output:

data.frame(species=dfA$species, region=dfA$region, X=dfB$X, Y=dfB$Y)
species <- (rep("Aa achalensis",6))
region <- c("ABT", "AFG", "AGE", "AGS", "AGW", "ALA")
dfA <- data.frame(species, region)

region <- c("ABT", "AGE", "AGS", "AGW", "ALA", "ALB")
LEVEL3_NAM <- c("Alberta", "Argentina Northeast", "Argentina South", "Argentina Northwest", "Alabama", "Albania")
LEVEL2_COD <- c(71, 85, 85, 85,78, 13)
LEVEL1_COD <- c(7,8,8,8,7,1)
X <- c(1,2,3,4,5,6)
Y <- c(7,8,9,10,11,12)
dfB <- data.frame(region, LEVEL3_NAM, LEVEL2_COD, LEVEL1_COD, X, Y)

left_join(dfA, dfB, by="region") %>% 
  select(species, region, X, Y)

data.frame(species=dfA$species, region=dfA$region, X=dfB$X, Y=dfB$Y)

With left_join

       species region  X  Y
1 Aa achalensis    ABT  1  7
2 Aa achalensis    AFG NA NA
3 Aa achalensis    AGE  2  8
4 Aa achalensis    AGS  3  9
5 Aa achalensis    AGW  4 10
6 Aa achalensis    ALA  5 11

With data.frame(species=dfA$species, region=dfA$region, X=dfB$X, Y=dfB$Y)

        species region X  Y
1 Aa achalensis    ABT 1  7
2 Aa achalensis    AFG 2  8
3 Aa achalensis    AGE 3  9
4 Aa achalensis    AGS 4 10
5 Aa achalensis    AGW 5 11
6 Aa achalensis    ALA 6 12
  •  Tags:  
  • r
  • Related