I have the head() of the dataframe displayed:
Input Dataframe:
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 1
3 1 0 1 2
4 0 1 0 3
5 1 0 0 1
Building has 4 unique values (0, 1, 2, 3), and I want to output 4 new dataframes, each with a flag column for one of those values. How can I do this in R?
Expected Output:
DF 1 (flags building 0 or not building 0)
isCool isTall isWide Building_0
1 0 0 1 1
2 1 1 0 0
3 1 0 1 0
4 0 1 0 0
5 1 0 0 0
DF2 (flags building 1 or not building 1)
isCool isTall isWide Building_1
1 0 0 1 0
2 1 1 0 1
3 1 0 1 0
4 0 1 0 0
5 1 0 0 1
DF3 (flags building 2 or not building 2)
isCool isTall isWide Building_2
1 0 0 1 0
2 1 1 0 0
3 1 0 1 1
4 0 1 0 0
5 1 0 0 0
DF4 (flags building 3 or not building 3)
isCool isTall isWide Building_3
1 0 0 1 0
2 1 1 0 0
3 1 0 1 0
4 0 1 0 1
5 1 0 0 0
EDIT:
The Building column in the input determines the 4 output dataframes. For example, DF1 has a flag column Building_0, which flags whether the observation is within building 0. Likewise, DF2 has a flag column Building_1, which flags whether the observation is within building 1. Each output dataframe has the same number of rows as the input dataframe.
EDIT 2:
I've created this function based on Vinícius Félix's solution. However, I repeat the call 4 times, once per unique value of Building. Is there a way to call the function just once and generate all 4 DFs?
library(dplyr)

# Flag rows where `colname` equals `num`; the flag is stored as <colname>_<num>
flag_df <- function(df, colname, num) {
df %>%
mutate("{colname}_{num}" := if_else(.data[[colname]] == num, 1, 0)) %>%
dplyr::select(-all_of(colname))
}
d_1 <- flag_df(test_df, "Building", 0)
d_2 <- flag_df(test_df, "Building", 1)
d_3 <- flag_df(test_df, "Building", 2)
d_4 <- flag_df(test_df, "Building", 3)
CodePudding user response:
Loop over the unique, sorted values of 'Building' and create each new dataset by appending to the first 3 columns a newly created 'Building' flag column, obtained by an elementwise comparison (==) with the looped value
fn1 <- function(dat, colnm) {
un1 <- sort(unique(dat[[colnm]]))
lst1 <- lapply(un1, function(i) {
tmp <- dat[setdiff(names(dat), colnm)]
tmp[[paste0(colnm, "_", i)]] <- +(dat[[colnm]] == i)
tmp
})
names(lst1) <- paste("DF_", seq_along(lst1))
lst1
}
-output
> fn1(df1, "Building")
$`DF_ 1`
isCool isTall isWide Building_0
1 0 0 1 1
2 1 1 0 0
3 1 0 1 0
4 0 1 0 0
5 1 0 0 0
$`DF_ 2`
isCool isTall isWide Building_1
1 0 0 1 0
2 1 1 0 1
3 1 0 1 0
4 0 1 0 0
5 1 0 0 1
$`DF_ 3`
isCool isTall isWide Building_2
1 0 0 1 0
2 1 1 0 0
3 1 0 1 1
4 0 1 0 0
5 1 0 0 0
$`DF_ 4`
isCool isTall isWide Building_3
1 0 0 1 0
2 1 1 0 0
3 1 0 1 0
4 0 1 0 1
5 1 0 0 0
It is better to keep the datasets in a list, but if we need to create multiple objects in the global environment, use list2env (not recommended)
list2env(fn1(df1, "Building"), .GlobalEnv)
Or this can be done in an easier way with model.matrix
Map(cbind, list(df1[1:3]), Building =
asplit(model.matrix(~ factor(df1$Building)-1), 2))
-output
[[1]]
isCool isTall isWide Building
1 0 0 1 1
2 1 1 0 0
3 1 0 1 0
4 0 1 0 0
5 1 0 0 0
[[2]]
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 1
3 1 0 1 0
4 0 1 0 0
5 1 0 0 1
[[3]]
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 0
3 1 0 1 1
4 0 1 0 0
5 1 0 0 0
[[4]]
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 0
3 1 0 1 0
4 0 1 0 1
5 1 0 0 0
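To see why the model.matrix approach works, it may help to look at the intermediate indicator matrix on its own. The snippet below is only a small sketch: the vector `building` is a stand-in for df1$Building, and `mm` and `cols` are names chosen for illustration.

```r
# Stand-in for df1$Building (same values as in the example data)
building <- c(0L, 1L, 2L, 3L, 1L)

# "- 1" drops the intercept, so every factor level gets its own 0/1 column
mm <- model.matrix(~ factor(building) - 1)
dim(mm)        # 5 rows, 4 columns (one per building)

# asplit(mm, 2) splits the matrix by column into a list of 4 indicator vectors
cols <- asplit(mm, 2)
length(cols)

# Each row belongs to exactly one building, so every row sums to 1
rowSums(mm)
```

Each element of `cols` is what Map then binds onto the first three columns as the new Building column.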
data
df1 <- structure(list(isCool = c(0L, 1L, 1L, 0L, 1L), isTall = c(0L,
1L, 0L, 1L, 0L), isWide = c(1L, 0L, 1L, 0L, 0L), Building = c(0L,
1L, 2L, 3L, 1L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5"))
CodePudding user response:
Perhaps we can try the code below
lapply(
sort(unique(df$Building)),
function(x) {
transform(
df,
Building = +(Building == x)
)
}
)
which gives
[[1]]
isCool isTall isWide Building
1 0 0 1 1
2 1 1 0 0
3 1 0 1 0
4 0 1 0 0
5 1 0 0 0
[[2]]
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 1
3 1 0 1 0
4 0 1 0 0
5 1 0 0 1
[[3]]
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 0
3 1 0 1 1
4 0 1 0 0
5 1 0 0 0
[[4]]
isCool isTall isWide Building
1 0 0 1 0
2 1 1 0 0
3 1 0 1 0
4 0 1 0 1
5 1 0 0 0
CodePudding user response:
There is a way to do that, but assign is a very "dangerous" function, so be careful!
library(dplyr)
library(purrr)
flag_building <- function(num){
df %>%
mutate(Building = if_else(Building == num,1,0)) %>%
rename_with(.fn = ~paste0("Building_",num),.cols = "Building") %>%
assign(value = .,x = paste0("df_",num),envir = globalenv() )
}
map(unique(df$Building),.f = flag_building)
ls()
[1] "df" "df_0" "df_1" "df_2"
[5] "df_3" "flag_building"
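If writing into the global environment feels too risky, one possible variation is to return the dataframes in a named list instead. This is only a sketch under a few assumptions: `df` is rebuilt here from the question's example data, and the names `vals` and `dfs` are made up for illustration.

```r
library(dplyr)
library(purrr)

# Example data from the question, rebuilt here so the snippet is self-contained
df <- data.frame(
  isCool   = c(0L, 1L, 1L, 0L, 1L),
  isTall   = c(0L, 1L, 0L, 1L, 0L),
  isWide   = c(1L, 0L, 1L, 0L, 0L),
  Building = c(0L, 1L, 2L, 3L, 1L)
)

# Same flagging logic as above, but the result is returned instead of assigned
flag_building <- function(num) {
  df %>%
    mutate(Building = if_else(Building == num, 1, 0)) %>%
    rename_with(.fn = ~ paste0("Building_", num), .cols = "Building")
}

# Name the input values so that map() returns a list named df_0, df_1, ...
vals <- sort(unique(df$Building))
dfs <- map(set_names(vals, paste0("df_", vals)), flag_building)

names(dfs)
dfs$df_1
```

Individual results can then be accessed as dfs$df_0, dfs$df_1 and so on, without creating extra objects in the workspace.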