creating dataframe from vectors

Time:02-10

I have the following vectors:

bid = c(1,5,10,20,30,40,50)
n = c(31,29,27,25,23,21,19)
yes = c(0,3,6,7,9,13,17)
no = n - yes

I have two questions that I haven't been able to solve, and I would appreciate any help.

Q1: I want to write R code to create a two-column data frame df. Column 1 holds bid, with each bid value repeated n times; column 2 holds c(rep(1, yes), rep(0, no)) for each bid.

Q2: Then, given the data frame df, I want to write R code that regenerates the vectors bid, n, yes, and no from df.

CodePudding user response:

It is a bit unclear what you actually want. It is easier if you provide the desired result. Would this fit your Q1:

library(tidyverse)
bid = c(1,5,10,20,30,40,50)
n = c(31,29,27,25,23,21,19)
yes = c(0,3,6,7,9,13,17)
no = n - yes

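# Q1: one row per respondent, with yesno = 1 for "yes" and 0 for "no"
df <- tibble(bid, yes, n, no = n - yes) %>%
  dplyr::select(-n) %>%
  pivot_longer(cols = c(yes, no)) %>%   # one row per (bid, yes/no) count
  uncount(value) %>%                    # repeat each row `value` times
  mutate(yesno = ifelse(name == "yes", 1, 0)) %>%
  dplyr::select(-name)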


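# Q2: count the 0s and 1s per bid, then rebuild the vectors
df2 <- df %>%
  table() %>%
  as.data.frame() %>%
  pivot_wider(id_cols = bid, names_from = yesno, values_from = Freq) %>%
  rename(no = `0`, yes = `1`) %>%       # rename before using yes and no
  mutate(bid = as.numeric(as.character(bid)),  # table() turns bid into a factor
         n = yes + no)

bid <- df2$bid
n <- df2$n
yes <- df2$yes
no <- df2$no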
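
If you would rather avoid the tidyverse, here is a minimal base R sketch of both steps. It is not taken from the answer above. The names df_base, bid_back, n_back, yes_back and no_back are made up for illustration, and it assumes every bid value is unique.

```r
bid <- c(1, 5, 10, 20, 30, 40, 50)
n   <- c(31, 29, 27, 25, 23, 21, 19)
yes <- c(0, 3, 6, 7, 9, 13, 17)
no  <- n - yes

# Q1: repeat each bid n times; within each bid, `yes` ones followed by `no` zeros
df_base <- data.frame(
  bid   = rep(bid, times = n),
  yesno = unlist(Map(function(y, z) rep(c(1, 0), times = c(y, z)), yes, no))
)

# Q2: recover the vectors by aggregating per bid
# (tapply orders groups by sorted bid value, matching sort(unique(...)))
bid_back <- sort(unique(df_base$bid))
n_back   <- as.vector(tapply(df_base$yesno, df_base$bid, length))
yes_back <- as.vector(tapply(df_base$yesno, df_base$bid, sum))
no_back  <- n_back - yes_back
```

Comparing with all.equal(), for example all.equal(n, n_back), is safer than identical() here, because tapply() with length returns integers while the original n is double.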

CodePudding user response:

I don't know what you mean for Q2, but for Q1 you could do this:

library(tidyverse)

pmap_dfr(list(bid, n, yes, no),
         \(V1, V2, V3, V4) tibble(col1 = rep(V1, V2),
                                  col2 = c(rep(1,V3),rep(0,V4))))
#> # A tibble: 175 x 2
#>     col1  col2
#>    <dbl> <dbl>
#>  1     1     0
#>  2     1     0
#>  3     1     0
#>  4     1     0
#>  5     1     0
#>  6     1     0
#>  7     1     0
#>  8     1     0
#>  9     1     0
#> 10     1     0
#> # ... with 165 more rows

EDIT: For Q2, you can follow this:

library(tidyverse)

df <- pmap_dfr(list(bid, n, yes, no),
         \(V1, V2, V3, V4) tibble(col1 = rep(V1, V2),
                                  col2 = c(rep(1,V3),rep(0,V4))))

df2 <- df |>
  count(col1, col2) |>
  group_by(col1) |>
  summarise(yes = sum(n[col2==1]),
            n = sum(n))

bid2 <- df2$col1
n2 <- df2$n
yes2 <- df2$yes
no2 <- n2 - yes2

all.equal(c(bid, n, yes, no), c(bid2, n2, yes2, no2))
#> [1] TRUE
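
One detail worth checking is the bid with yes = 0 (bid 1 here). For that bid, count() returns no row with col2 == 1, so n[col2 == 1] is empty. The small base R sketch below mimics that single group with made-up names (cnt, yes_bid1, n_bid1) to show why the summarise() call still gives the right answer:

```r
# Hypothetical stand-in for the bid = 1 group after count(col1, col2):
# only the col2 == 0 row exists, because there are no "yes" answers.
cnt  <- c(31)
col2 <- c(0)

yes_bid1 <- sum(cnt[col2 == 1])  # sum over an empty selection is 0
n_bid1   <- sum(cnt)             # total respondents for this bid
```

Because sum() of an empty vector is 0, yes comes out as 0 rather than NA, and no special handling is needed for bids where nobody said yes.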