Count of dates by a certain variables in R-CodePudding

What I have:

DF1

App	Proc	Rej	Old
2021-05-11	2021-05-11	NA	NA
2021-05-11	2021-05-11	NA	NA
2021-05-11	NA	NA	NA
2021-05-11	2021-05-11	NA	NA
2021-05-12	2021-05-12	NA	2021-09-12
2021-05-12	2021-05-12	2021-10-01	NA
2021-05-13	2021-05-13	NA	2021-09-12
2021-05-13	2021-05-13	2021-10-01	NA
2021-05-13	2021-05-13	2021-10-01	NA
2021-05-13	NA	NA	NA
2021-05-13	2021-05-13	NA	2021-09-12
2021-05-13	NA	NA	NA
2021-05-13	2021-05-13	NA	NA

What I want: A new table with count of App and New by Dates. Ignoring Old and Rej

DFdesired:

Date	App	Proc
2021-05-11	4	3
2021-05-12	2	2
2021-05-13	7	5

CodePudding user response：

Try this

library(dplyr , warn.conflicts = F)

DF1 |> group_by(App) |> summarise(app = n()) -> df2
DF1 |> group_by(Proc) |> summarise(proc =n()) -> df3

df2  |>  left_join(df3 , by = c("App" = "Proc")) -> DFdesired

Output

DFdesired

# A tibble: 3 × 3
  App          app  proc
  <chr>      <int> <int>
1 2021-05-11     4     3
2 2021-05-12     2     2
3 2021-05-13     7     5

CodePudding user response：

Here is another approach using pivot_longer and pivot_wider from tidyr. You can select the App and Proc columns for pivoting, and remove rows where NA is the value.

library(tidyverse)

df1 %>%
  pivot_longer(cols = c(App, Proc), values_drop_na = T) %>%
  count(value, name) %>%
  pivot_wider(id_cols = value, names_from = name, values_from = n)

Output

  value        App  Proc
  <chr>      <int> <int>
1 2021-05-11     4     3
2 2021-05-12     2     2
3 2021-05-13     7     5