I have a dataframe
A 2
B 4
C 3
and I would like to make a data frame with the following
A 0
A 1
B 0
B 0
B 0
B 1
C 0
C 0
C 1.
So for B, I want to make 4 rows and each one is 0 except for the last one which is 1. Similarly, for A, I'll have 2 rows and the first one has a 0 and the second one has a 1.
In general, if I have a row in the original table with X n
, I want to return n rows in the new table with n-1 of them being X 0
and the final one as X 1
.
Is there a way to do this in R? Or Python or SQL?
CodePudding user response:
In R
, we may use uncount
to replicate the rows from the second column and replace the second column with binary by converting the first to logical column (duplicated
)
library(tidyr)
library(dplyr)
df1 %>%
uncount(v2) %>%
mutate(v2 = (!duplicated(v1, fromLast = TRUE)))
-output
v1 v2
1 A 0
2 A 1
3 B 0
4 B 0
5 B 0
6 B 1
7 C 0
8 C 0
9 C 1
data
df1 <- structure(list(v1 = c("A", "B", "C"), v2 = c(2L, 4L, 3L)),
class = "data.frame", row.names = c(NA,
-3L))
CodePudding user response:
# load package
library(data.table)
# set as data table
setDT(df)
# work
df1 <- df[rep(seq(.N), b), ][, c := 1:.N, a]
df1[, d := 0][b == c, d := 1][, b := d][, c('c', 'd') := NULL]