What I have:
DF1
App | Proc | Rej | Old |
---|---|---|---|
2021-05-11 | 2021-05-11 | NA | NA |
2021-05-11 | 2021-05-11 | NA | NA |
2021-05-11 | NA | NA | NA |
2021-05-11 | 2021-05-11 | NA | NA |
2021-05-12 | 2021-05-12 | NA | 2021-09-12 |
2021-05-12 | 2021-05-12 | 2021-10-01 | NA |
2021-05-13 | 2021-05-13 | NA | 2021-09-12 |
2021-05-13 | 2021-05-13 | 2021-10-01 | NA |
2021-05-13 | 2021-05-13 | 2021-10-01 | NA |
2021-05-13 | NA | NA | NA |
2021-05-13 | 2021-05-13 | NA | 2021-09-12 |
2021-05-13 | NA | NA | NA |
2021-05-13 | 2021-05-13 | NA | NA |
What I want: A new table with count of App and New by Dates. Ignoring Old and Rej
DFdesired:
Date | App | Proc |
---|---|---|
2021-05-11 | 4 | 3 |
2021-05-12 | 2 | 2 |
2021-05-13 | 7 | 5 |
CodePudding user response:
Try this
library(dplyr , warn.conflicts = F)
DF1 |> group_by(App) |> summarise(app = n()) -> df2
DF1 |> group_by(Proc) |> summarise(proc =n()) -> df3
df2 |> left_join(df3 , by = c("App" = "Proc")) -> DFdesired
Output
DFdesired
# A tibble: 3 × 3
App app proc
<chr> <int> <int>
1 2021-05-11 4 3
2 2021-05-12 2 2
3 2021-05-13 7 5
CodePudding user response:
Here is another approach using pivot_longer
and pivot_wider
from tidyr
. You can select the App
and Proc
columns for pivoting, and remove rows where NA
is the value.
library(tidyverse)
df1 %>%
pivot_longer(cols = c(App, Proc), values_drop_na = T) %>%
count(value, name) %>%
pivot_wider(id_cols = value, names_from = name, values_from = n)
Output
value App Proc
<chr> <int> <int>
1 2021-05-11 4 3
2 2021-05-12 2 2
3 2021-05-13 7 5