Extract rows from a range that meet criteria in another column of the same df

Time:11-13

I have the following data frame:

df <- data.frame(value = c(1,2,3,4,5,6,7,8,9,10), win=c(1,1,1,2,2,3,4,4,5,5))

> df
   value win
1      1   1
2      2   1
3      3   1
4      4   2
5      5   2
6      6   3
7      7   4
8      8   4
9      9   5
10    10   5

I want to keep only the rows whose win value appears in at least 3 rows. So if I look at

> table(df$win)

1 2 3 4 5 
3 2 1 2 2 

I know that here I only want to keep the rows where win == 1. But how do I do that for a big data frame?

I was thinking of having a vector which would give me the unique values of df$win

xx <- unique(df$win)

> xx
[1] 1 2 3 4 5

and then somehow write a loop that counts, for each value in xx, how many rows have df$win equal to it, and extracts only the rows for values that occur often enough. I wasn't able to get that working, so I would be very thankful for any help!
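The counting idea described above does not need an explicit loop. A minimal base R sketch, assuming the threshold is "at least 3 rows" (the names counts, keep and new_df are only illustrative):

```r
df <- data.frame(value = c(1,2,3,4,5,6,7,8,9,10), win = c(1,1,1,2,2,3,4,4,5,5))

# Count how many rows each distinct value of win has
counts <- table(df$win)

# Names of the win values that occur in at least 3 rows
# (table names are character strings, e.g. "1")
keep <- names(counts)[counts >= 3]

# %in% coerces win to character for the comparison, so 1 matches "1";
# this assumes win holds simple whole-number or character labels
new_df <- df[df$win %in% keep, ]
new_df
```

For this example only win == 1 passes the threshold, so new_df should contain the first three rows of df.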

Edit

Expected output (for this example only; subset(df, win == "1") is not useful in general, because I don't know in advance which win values will appear in at least 3 rows):

> new_df
  value win
1     1   1
2     2   1
3     3   1

CodePudding user response:

If you have a big dataset, use data.table:

library(data.table)
# setDT() converts df to a data.table by reference; .N is the row count of each win group
setDT(df)[, if (.N >= 3) .SD, by = win]

Output:

   win value
1:   1     1
2:   1     2
3:   1     3
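If adding a package is not an option, a similar filter can be sketched in base R with ave(), which returns the group size for every row. This is an alternative under the same "at least 3 rows" assumption, not part of the answer above; n_per_win and new_df are illustrative names. Unlike the data.table call, it keeps the original column order (value, win).

```r
df <- data.frame(value = c(1,2,3,4,5,6,7,8,9,10), win = c(1,1,1,2,2,3,4,4,5,5))

# For each row, compute the number of rows sharing its win value
n_per_win <- ave(seq_along(df$win), df$win, FUN = length)

# Keep only rows whose win group has at least 3 members
new_df <- df[n_per_win >= 3, ]
new_df
```

Changing the 3 in the comparison adjusts the threshold without touching the rest of the code.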