I suspect some of my data is bad, and I want to remove every fruit that has any price below 0. How can I do that?
fruit year price
apple 2021 2
apple 2020 -9
apple 2019 3
banana 2021 9
banana 2020 7
banana 2019 5
orange 2021 7
orange 2020 2
orange 2019 -3
Expected result:
fruit year price
banana 2021 9
banana 2020 7
banana 2019 5
CodePudding user response:
There are several possible solutions; here are three:
base R
dat[!dat$fruit %in% unique(dat[dat$price < 0, "fruit"]),]
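To see why the one-liner works, it can be split into two steps. This is only a sketch: it assumes dat is a plain data.frame with the columns shown in the question, and the names bad_fruits and clean are introduced here purely for illustration.

```r
# Sample data from the question (assumed to be a plain data.frame named dat)
dat <- data.frame(
  fruit = rep(c("apple", "banana", "orange"), each = 3),
  year  = rep(c(2021, 2020, 2019), times = 3),
  price = c(2, -9, 3, 9, 7, 5, 7, 2, -3)
)

# Step 1: collect the fruits that have at least one negative price
bad_fruits <- unique(dat$fruit[dat$price < 0])

# Step 2: keep only the rows whose fruit is not in that set
clean <- dat[!dat$fruit %in% bad_fruits, ]
clean
```

Indexing the column vector directly (dat$fruit[...]) always yields a vector, so this variant does not depend on how [ drops dimensions, which differs between data frames and tibbles.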
dplyr
With all():
library(dplyr)
dat %>%
group_by(fruit) %>%
filter(all(price >= 0))
Or, with any():
dat %>%
group_by(fruit) %>%
filter(!any(price < 0))
output
# A tibble: 3 x 3
# Groups: fruit [1]
fruit year price
<chr> <int> <int>
1 banana 2021 9
2 banana 2020 7
3 banana 2019 5
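Note that the result is still grouped by fruit, as the "# Groups" line shows. If the grouping is not needed afterwards, ungroup() removes it. The following sketch rebuilds the question's data (assumed to be a data.frame named dat), runs both variants, and checks that they agree; the names with_all and with_any are illustrative only.

```r
library(dplyr)

# Sample data from the question (assumed to be a data.frame named dat)
dat <- data.frame(
  fruit = rep(c("apple", "banana", "orange"), each = 3),
  year  = rep(c(2021, 2020, 2019), times = 3),
  price = c(2, -9, 3, 9, 7, 5, 7, 2, -3)
)

# Keep groups in which every price is non-negative, then drop the grouping
with_all <- dat %>%
  group_by(fruit) %>%
  filter(all(price >= 0)) %>%
  ungroup()

# Same filter expressed as "no price is negative"
with_any <- dat %>%
  group_by(fruit) %>%
  filter(!any(price < 0)) %>%
  ungroup()

# Compare the two results as plain data frames
all.equal(as.data.frame(with_all), as.data.frame(with_any))
```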
CodePudding user response:
First, your data, df:
fruit year price
1 apple 2021 2
2 apple 2020 -9
3 apple 2019 3
4 banana 2021 9
5 banana 2020 7
6 banana 2019 5
7 orange 2021 7
8 orange 2020 2
9 orange 2019 -3
You can use the following code to remove every row of any group that contains a negative price:
df <- df[with(df, ave(price >= 0, fruit, FUN = all)), ]
df
Output:
fruit year price
4 banana 2021 9
5 banana 2020 7
6 banana 2019 5
As you can see, only banana remains, because it is the only fruit with no negative prices.
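The ave() call is the part that does the work: it applies FUN within each group and returns a vector as long as its input, so every row receives its own group's result. A minimal sketch of the intermediate values, rebuilding the same df so that it runs on its own (per_group and keep are names made up for this illustration):

```r
# Same data as in the Data section of this answer
df <- data.frame(
  fruit = rep(c("apple", "banana", "orange"), each = 3),
  year  = rep(c(2021, 2020, 2019), times = 3),
  price = c(2, -9, 3, 9, 7, 5, 7, 2, -3)
)

# One logical value per fruit: are all of its prices non-negative?
per_group <- tapply(df$price >= 0, df$fruit, all)
per_group

# ave() repeats that per-group value on every row of the group
keep <- ave(df$price >= 0, df$fruit, FUN = all)
keep

# Logical row index: only rows from fully non-negative groups survive
df[keep, ]
```

Because the first argument to ave() is already logical (price >= 0), the result stays logical and can be used directly as a row index.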
Data
df <- data.frame(fruit = c("apple", "apple", "apple", "banana", "banana", "banana", "orange", "orange", "orange"),
year = c(2021, 2020, 2019, 2021, 2020, 2019, 2021, 2020, 2019),
price = c(2, -9, 3, 9, 7, 5, 7, 2, -3))