Home > Back-end >  In R how can I multiply values of certain columns from two data frames where the columns have matchi
In R how can I multiply values of certain columns from two data frames where the columns have matchi

Time:07-01

I have two data frames. Both data frames have the same column names. The Period and Code column values match (ie. the combination of them is a key).

For rows where the keys match I want to multiply the value of column a from dataframe1 with column a value from dataframe2, and the same for column c. I'm thinking the solution might involve a left join.

Here are the dataframes:

dataframe1:

Period Code a c
2014-01-13 ABC 1 8
2015-01-13 ABC 2 9
2015-01-13 LMZ 3 10
2016-01-13 LMZ 4 11

dataframe2:

Period Code a c
2014-01-13 ABC 0.1 0.2
2015-01-13 ABC 0.5 0.9
2015-01-13 EKZ 0.9 0.3
2015-01-13 LMZ 0.4 0.2
2016-01-13 LMZ 0.3 0.4

Code to re-create data frames:

dataframe1 <- data.frame(Period=c(as.Date("2014-01-13"), as.Date("2015-01-13"),as.Date("2015-01-13"),as.Date("2016-01-13") ),Code = c("ABC","ABC","LMZ","LMZ"), a = 1:4, c=8:11)

dataframe2 <- data.frame(Period=c(as.Date("2014-01-13"),as.Date("2015-01-13"),as.Date("2015-01-13"),as.Date("2015-01-13"),as.Date("2016-01-13")),Code = c("ABC","ABC","EKZ","LMZ","LMZ"), a = c(0.1,0.5,0.9,0.4,0.3), c=c(0.2,0.9,0.3,0.2,0.4))

This is what I am hoping to get:

newDataframe:

Period Code a c
2014-01-13 ABC 0.1 1.6
2015-01-13 ABC 1 8.1
2015-01-13 LMZ 1.2 2
2016-01-13 LMZ 1.2 4.4

For example, here the numeric values from row 1 in dataframe1 have been multiplied by the matching values from the row with the matching key in dataframe2.

CodePudding user response:

You can use a left_join (dplyr::left_join), as you indicate, and then deselect the no-longer-needed columns:

library(dplry)

left_join(dataframe1, dataframe2, by=c("Period", "Code")) %>% 
  mutate(a=a.x*a.y, c=c.x*c.y) %>% 
  select(-c(a.x:c.y))

Output

      Period Code   a   c
1 2014-01-13  ABC 0.1 1.6
2 2015-01-13  ABC 1.0 8.1
3 2015-01-13  LMZ 1.2 2.0
4 2016-01-13  LMZ 1.2 4.4

A data.table solution is here:

library(data.table)

setDT(dataframe2)[
  setDT(dataframe1),on=.(Period, Code),.(Period, Code, a = a*i.a, c=c*i.c)
]

Output:


       Period Code   a   c
1: 2014-01-13  ABC 0.1 1.6
2: 2015-01-13  ABC 1.0 8.1
3: 2015-01-13  LMZ 1.2 2.0
4: 2016-01-13  LMZ 1.2 4.4
  • Related