Home > Back-end >  Adding values in a new column based on column name of other dataset
Adding values in a new column based on column name of other dataset

Time:02-18

Currently I have two datasets. One with locations within the company and the amount of products that are being produced at each location on each day. The other with the locations and the capacity of each location on each day. I have illustrated both of them below.

Dataset 1 (Amount of products product on certain days of the week):

Location Day of the week Produced
Location A Monday 20
Location B Monday 30
Location C Monday 55
Location A Tuesday 25
Location B Tuesday 24
Location C Tuesday 70

Dataset 2 (The capacity of each location on each day of the week)

Location Monday Tuesday
Location A 50 50
Location B 60 50
Location C 80 80

What I want to do is add a new column to the first dataset. I want that column to be named 'Capacity' and I want it to kinda look like this:

Location Day of the week Produced Capacity
Location A Monday 20 50
Location A Tuesday 25 50

I've tried multiple ifelse statements, but failed miserably and are working on it for quite some time now. Is there someone who might know a fix?

CodePudding user response:

You can pivot_longer df2 and then merge df1 and df2

library(tidyr)

merge(df1, 
      pivot_longer(df2, colnames(df2[,2:3]), 
                   names_to="Day of the week", values_to="Capacity"),
      by=c("Location","Day of the week"))
    Location Day of the week Produced Capacity
1 Location A          Monday       20       50
2 Location A         Tuesday       25       50
3 Location B          Monday       30       60
4 Location B         Tuesday       24       50
5 Location C          Monday       55       80
6 Location C         Tuesday       70       80

Data

df1 <- structure(list(Location = c("Location A", "Location B", "Location C", 
"Location A", "Location B", "Location C"), `Day of the week` = c("Monday", 
"Monday", "Monday", "Tuesday", "Tuesday", "Tuesday"), Produced = c(20L, 
30L, 55L, 25L, 24L, 70L)), class = "data.frame", row.names = c(NA, 
-6L))

df2 <- structure(list(Location = c("Location A", "Location B", "Location C"
), Monday = c(50L, 60L, 80L), Tuesday = c(50L, 50L, 80L)), class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response:

You need to put your Capacity data into long format and then merge. Here I use tidyr::pivot_longer() for the former and then dplyr::left_join() for the latter.

library(tidyverse)

d1 <- structure(list(Location = c("Location A", "Location B", "Location C", "Location A", "Location B", "Location C"), Day_of_week = c("Monday", "Monday", "Monday", "Tuesday", "Tuesday", "Tuesday"), Produced = c(20L, 30L, 55L, 25L, 24L, 70L)), class = "data.frame", row.names = c(NA, -6L))

d2 <- structure(list(Location = c("Location A", "Location B", "Location C"), Monday = c(50L, 60L, 80L), Tuesday = c(50L, 50L, 80L)), class = "data.frame", row.names = c(NA, -3L))

d2 %>% 
  pivot_longer(-Location, names_to = "Day_of_week", values_to = "Capacity") %>% 
  left_join(d1, .)
#> Joining, by = c("Location", "Day_of_week")
#>     Location Day_of_week Produced Capacity
#> 1 Location A      Monday       20       50
#> 2 Location B      Monday       30       60
#> 3 Location C      Monday       55       80
#> 4 Location A     Tuesday       25       50
#> 5 Location B     Tuesday       24       50
#> 6 Location C     Tuesday       70       80

Created on 2022-02-17 by the reprex package (v2.0.1)

  •  Tags:  
  • r
  • Related