I have a table containing the history of products bought and sold. I would like to extract the items that were only bought (never sold). The bought/sale information is contained in the same column. I found a solution but i would like to know if it was possible to do it in one step
s: sell b: bought
library(dplyr)
df <- data.frame("Date" = paste0("Date", 1:5),
"Product" = c("A","B","B","C","D"),
"Action" = c("b", "b", "s", "b", "b"))
# df1 <- df %>%
# filter(Action == "b") %>%
# select(Product)
#
# df2 <- df %>%
# filter(Action == "s") %>%
# select(Product)
#
# anti_join(df1, df2)
CodePudding user response:
You may try
df %>%
group_by(Product) %>%
filter(!("s" %in% Action))
Date Product Action
<chr> <chr> <chr>
1 Date1 A b
2 Date4 C b
3 Date5 D b
CodePudding user response:
We can also do this in one line in data.table, by checking the number of unique actions per product and filtering down to those that are only ever bought:
library(data.table)
df <- setDT(df)
df <- df[,unique_actions:=uniqueN(Action), by = Product][Action == "b" & unique_actions == 1][,.(Date, Product,Action)]
> df
Date Product Action
1: Date1 A b
2: Date4 C b
3: Date5 D b
CodePudding user response:
You may remove all the Product
that have 'sell' Action
in them.
In base R -
subset(df, !Product %in% Product[Action == 's'])
# Date Product Action
#1 Date1 A b
#2 Date4 C b
#3 Date5 D b
If you prefer dplyr
the same can be written as -
library(dplyr)
df %>% filter(!Product %in% Product[Action == 's'])