I have a dataframe in R of this form currently
city product inv pla
city1 product1 0 2
city1 product2 1 2
city1 product3 2 2
city1 product4 0 1
city1 product5 1 1
city1 product6 1 1
I need to duplicate the rows when pla = 2, but put a value in the inv column according to the following:
When inv = 0, both rows must be equal to 0. When inv = 1, the first row must be equal to 1, and the second row must be equal to 0. When inv=2, both rows must equal 1.
How do I create a dataframe like this?:
city product inv pla
city1 product1 0 2
city1 product1 0 2
city1 product2 1 2
city1 product2 0 2
city1 product3 1 2
city1 product3 1 2
city1 product4 0 1
city1 product5 1 1
city1 product6 1 1
CodePudding user response:
Here I pull out the rows that need duplicating, modify them as necessary, and stick them back on the original data:
df %>%
filter(pla == 2) %>%
mutate(inv = case_when(inv == 1 ~ 0L, TRUE ~ inv)) %>%
bind_rows(df) %>%
mutate(inv = case_when(inv == 2 & pla == 2 ~ 1L, TRUE ~ inv)) %>%
arrange(city, product, inv)
# city product inv pla
# 1 city1 product1 0 2
# 2 city1 product1 0 2
# 3 city1 product2 0 2
# 4 city1 product2 1 2
# 5 city1 product3 1 2
# 6 city1 product3 1 2
# 7 city1 product4 0 1
# 8 city1 product5 1 1
# 9 city1 product6 1 1
CodePudding user response:
First create your "first and second row" values, putting them in inv
and inv2
, respectively; then pivot them into new rows using tidyr::pivot_longer()
.
library(dplyr)
library(tidyr)
mydata %>%
mutate(
inv2 = case_when(
pla == 1 ~ NA_real_,
inv < 2 ~ 0,
inv == 2 ~ 1
),
inv = if_else(inv == 2, 1, inv)
) %>%
pivot_longer(
c(inv, inv2),
names_to = NULL,
values_to = "inv",
values_drop_na = TRUE
)
#> # A tibble: 9 x 4
#> city product pla inv
#> <chr> <chr> <dbl> <dbl>
#> 1 city1 product1 2 0
#> 2 city1 product1 2 0
#> 3 city1 product2 2 1
#> 4 city1 product2 2 0
#> 5 city1 product3 2 1
#> 6 city1 product3 2 1
#> 7 city1 product4 1 0
#> 8 city1 product5 1 1
#> 9 city1 product6 1 1
Created on 2022-03-16 by the reprex package (v2.0.1)
CodePudding user response:
base R solution
data <- read.table(text = "city product inv pla
city1 product1 0 2
city1 product2 1 2
city1 product3 2 2
city1 product4 0 1
city1 product5 1 1
city1 product6 1 1", header = TRUE)
data_final <- data[rep(seq_len(nrow(data)), data$pla), ]
which_rep <- data_final$pla == 2
data_final$inv[which_rep] <- data_final$inv[which_rep] / 2
data_final$inv[which_rep & data_final$inv == 0.5] <- c(1, 0)
rownames(data_final) <- NULL
data_final
CodePudding user response:
Create a function f
to make the vector of values you need for each group, and then use tidyr::separate_rows
.
library(tidyverse)
f <- function(x, n){
sapply(x, function(y) paste(c(rep(1, y), rep(0, length.out = n - y)), collapse = " "))
}
dat %>%
mutate(inv = ifelse(pla == 2, f(inv, 2), inv)) %>%
separate_rows(inv)
output
# A tibble: 9 x 4
city product inv pla
<chr> <chr> <chr> <int>
1 city1 product1 0 2
2 city1 product1 0 2
3 city1 product2 1 2
4 city1 product2 0 2
5 city1 product3 1 2
6 city1 product3 1 2
7 city1 product4 0 1
8 city1 product5 1 1
9 city1 product6 1 1