Here is a video I made in case the text is confusing: https://youtu.be/wkVg4dJQHgc
I have two dataframes named df1 and df2. Both contain different types of questions for educational tests; each question measures something (the Measures column) and has a specific format (the Format column).
df1<-structure(list(Measures = c("space and shape", "space and shape",
"space and shape", "space and shape", "space and shape", "change and relationships",
"change and relationships", "change and relationships", "change and relationships",
"change and relationships", "space and shape", "space and shape",
"space and shape", "space and shape", "uncertainty and data",
"quantity", "uncertainty and data", "uncertainty and data", "uncertainty and data",
"quantity", "change and relationships", "change and relationships",
"space and shape", "space and shape", "space and shape", "quantity",
"quantity", "quantity", "quantity", "quantity", "uncertainty and data",
"change and relationships", "quantity", "quantity", "uncertainty and data",
"change and relationships", "uncertainty and data", "quantity",
"change and relationships", "change and relationships", "quantity",
"quantity", "quantity", "quantity", "quantity", "quantity", "change and relationships",
"uncertainty and data", "change and relationships", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "quantity", "quantity",
"quantity", "space and shape", "change and relationships", "quantity",
"space and shape", "space and shape", "change and relationships",
"change and relationships", "uncertainty and data", "uncertainty and data",
"quantity", "change and relationships", "quantity", "change and relationships",
"space and shape", "quantity", "quantity", "quantity", "space and shape",
"space and shape", "space and shape", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "change and relationships",
"change and relationships", "change and relationships", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "change and relationships",
"change and relationships", "change and relationships", "change and relationships",
"change and relationships", "uncertainty and data", "space and shape",
"space and shape", "uncertainty and data", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "uncertainty and data",
"quantity", "quantity", "space and shape", "space and shape",
"space and shape", "space and shape", "change and relationships",
"space and shape", "space and shape", "quantity", "change and relationships",
"change and relationships"), Format = c("Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Auto-coded",
"Constructed Response Expert", "Constructed Response Expert",
"Constructed Response Expert", "Complex Multiple Choice", "Complex Multiple Choice",
"Complex Multiple Choice", "Simple Multiple Choice", "Constructed Response Expert",
"Constructed Response Expert", "Complex Multiple Choice", "Constructed Response Manual",
"Simple Multiple Choice", "Complex Multiple Choice", "Simple Multiple Choice",
"Constructed Response Manual", "Constructed Response Manual",
"Constructed Response Expert", "Simple Multiple Choice", "Constructed Response Expert",
"Constructed Response Auto-coded", "Constructed Response Manual",
"Complex Multiple Choice", "Constructed Response Manual", "Simple Multiple Choice",
"Simple Multiple Choice", "Simple Multiple Choice", "Simple Multiple Choice",
"Complex Multiple Choice", "Simple Multiple Choice", "Constructed Response Auto-coded",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Manual", "Constructed Response Expert",
"Constructed Response Manual", "Complex Multiple Choice", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Expert", "Constructed Response Manual",
"Simple Multiple Choice", "Constructed Response Expert", "Simple Multiple Choice",
"Constructed Response Manual", "Simple Multiple Choice", "Simple Multiple Choice",
"Simple Multiple Choice", "Constructed Response Manual", "Constructed Response Manual",
"Simple Multiple Choice", "Simple Multiple Choice", "Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Manual",
"Simple Multiple Choice", "Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Simple Multiple Choice", "Simple Multiple Choice",
"Constructed Response Expert", "Constructed Response Manual",
"Simple Multiple Choice", "Constructed Response Expert", "Simple Multiple Choice",
"Constructed Response Manual", "Constructed Response Expert",
"Complex Multiple Choice", "Complex Multiple Choice", "Constructed Response Expert",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Expert", "Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Expert",
"Constructed Response Expert", "Simple Multiple Choice", "Simple Multiple Choice",
"Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Expert", "Constructed Response Manual",
"Complex Multiple Choice", "Constructed Response Manual", "Constructed Response Manual",
"Complex Multiple Choice", "Simple Multiple Choice", "Simple Multiple Choice",
"Simple Multiple Choice", "Constructed Response Manual", "Simple Multiple Choice",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Manual", "Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Manual", "Complex Multiple Choice"
)), row.names = c(NA, -109L), class = "data.frame")
df2<- structure(list(Measures = c("space and shape", "space and shape",
"space and shape", "space and shape"), Format = c("Constructed Response Expert",
"Constructed Response Manual", "Simple Multiple Choice", "Constructed Response Auto-coded"
)), row.names = c(1L, 2L, 4L, 5L), class = "data.frame")
I use this code to keep all the rows of the first dataframe that are also present in the second dataframe:
library(tidyverse)
inner_join(df1, df2)
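To make the join explicit, here is a minimal sketch on toy data. The names `items` and `wanted` are hypothetical stand-ins for df1 and df2, and spelling out `by` is just one way to write it; when `by` is omitted, dplyr joins on all columns the two dataframes share.

```r
library(dplyr)

# Hypothetical toy data standing in for df1 and df2
items <- data.frame(
  Measures = c("space and shape", "space and shape", "quantity"),
  Format   = c("Simple Multiple Choice", "Simple Multiple Choice",
               "Complex Multiple Choice")
)
wanted <- data.frame(
  Measures = "space and shape",
  Format   = "Simple Multiple Choice"
)

# Keep only the rows of items whose (Measures, Format) pair appears in wanted
matched <- inner_join(items, wanted, by = c("Measures", "Format"))
matched
```

Every matching row of `items` is returned, which is why the plain join gives back all questions of a type rather than a chosen number of them.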
But suppose that my df2 actually looks like this, with an extra column at the end (Number) that tells me how many rows of each kind I want from df1. For example, the first row says that I want 2 questions which measure space and shape and whose format is Constructed Response Expert.
df2<- structure(list(Measures = c("space and shape", "space and shape",
"space and shape", "space and shape", "asdaf"), Format = c("Constructed Response Expert",
"Constructed Response Manual", "Simple Multiple Choice", "Constructed Response Auto-coded",
"asfas"), Number = c(2, 1, 2, 1, 0)), row.names = c("1", "2",
"4", "5", "6"), class = "data.frame")
Before adding this column I got 9 questions of this type, but I want only 2 of them.
CodePudding user response:
A possible solution:
library(tidyverse)
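inner_join(df1, df2) %>%
  group_by(Measures, Format) %>%
  # keep the first `Number` rows of each Measures/Format group;
  # the index is passed unnamed, and seq_len() is empty when Number is 0
  slice(seq_len(min(Number))) %>%
  ungroup()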
#> Joining, by = c("Measures", "Format")
#> # A tibble: 6 × 3
#>   Measures        Format                          Number
#>   <chr>           <chr>                            <dbl>
#> 1 space and shape Constructed Response Auto-coded      1
#> 2 space and shape Constructed Response Expert          2
#> 3 space and shape Constructed Response Expert          2
#> 4 space and shape Constructed Response Manual          1
#> 5 space and shape Simple Multiple Choice               2
#> 6 space and shape Simple Multiple Choice               2
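As an alternative sketch (not part of the solution above), the same "first Number rows per group" idea can be written with `filter(row_number() <= Number)`. The toy dataframes `pool` and `quota` below are hypothetical stand-ins for df1 and df2. This form also covers a quota of 0, whereas `1:0` in R counts down to `c(1, 0)` instead of giving an empty range.

```r
library(dplyr)

# Hypothetical toy data: a pool of questions and a per-type quota
pool <- data.frame(
  Measures = rep("space and shape", 5),
  Format   = c("Simple Multiple Choice", "Simple Multiple Choice",
               "Simple Multiple Choice", "Constructed Response Expert",
               "Constructed Response Expert")
)
quota <- data.frame(
  Measures = c("space and shape", "space and shape"),
  Format   = c("Simple Multiple Choice", "Constructed Response Expert"),
  Number   = c(2, 0)
)

picked <- inner_join(pool, quota, by = c("Measures", "Format")) %>%
  group_by(Measures, Format) %>%
  # row_number() restarts at 1 in each group, so this keeps at most
  # `Number` rows per Measures/Format pair (none when Number is 0)
  filter(row_number() <= Number) %>%
  ungroup()

picked
```

This always takes the first rows of each group in their original order. If a random selection is wanted instead, the rows could be shuffled before grouping, which is left out here.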