Home > Blockchain >  how to transform R dataframe to rows of indicator values
how to transform R dataframe to rows of indicator values

Time:02-18

I have a dataframe

A 2
B 4
C 3

and I would like to make a data frame with the following

A 0
A 1
B 0
B 0
B 0
B 1
C 0
C 0
C 1.

So for B, I want to make 4 rows and each one is 0 except for the last one which is 1. Similarly, for A, I'll have 2 rows and the first one has a 0 and the second one has a 1.

In general, if I have a row in the original table with X n, I want to return n rows in the new table with n-1 of them being X 0 and the final one as X 1.

Is there a way to do this in R? Or Python or SQL?

CodePudding user response:

In R, we may use uncount to replicate the rows from the second column and replace the second column with binary by converting the first to logical column (duplicated)

library(tidyr)
library(dplyr)
df1 %>% 
 uncount(v2) %>%
 mutate(v2 =  (!duplicated(v1, fromLast = TRUE)))

-output

 v1 v2
1  A  0
2  A  1
3  B  0
4  B  0
5  B  0
6  B  1
7  C  0
8  C  0
9  C  1

data

df1 <- structure(list(v1 = c("A", "B", "C"), v2 = c(2L, 4L, 3L)), 
class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response:

# load package
library(data.table)

# set as data table
setDT(df)

# work
df1 <- df[rep(seq(.N), b), ][, c := 1:.N, a]

df1[, d := 0][b == c, d := 1][, b := d][, c('c', 'd') := NULL]
  • Related