Home > Net >  Count of dates by a certain variables in R
Count of dates by a certain variables in R

Time:06-25

What I have:

DF1

App Proc Rej Old
2021-05-11 2021-05-11 NA NA
2021-05-11 2021-05-11 NA NA
2021-05-11 NA NA NA
2021-05-11 2021-05-11 NA NA
2021-05-12 2021-05-12 NA 2021-09-12
2021-05-12 2021-05-12 2021-10-01 NA
2021-05-13 2021-05-13 NA 2021-09-12
2021-05-13 2021-05-13 2021-10-01 NA
2021-05-13 2021-05-13 2021-10-01 NA
2021-05-13 NA NA NA
2021-05-13 2021-05-13 NA 2021-09-12
2021-05-13 NA NA NA
2021-05-13 2021-05-13 NA NA

What I want: A new table with count of App and New by Dates. Ignoring Old and Rej

DFdesired:

Date App Proc
2021-05-11 4 3
2021-05-12 2 2
2021-05-13 7 5

CodePudding user response:

Try this

library(dplyr , warn.conflicts = F)

DF1 |> group_by(App) |> summarise(app = n()) -> df2
DF1 |> group_by(Proc) |> summarise(proc =n()) -> df3

df2  |>  left_join(df3 , by = c("App" = "Proc")) -> DFdesired

Output

DFdesired

# A tibble: 3 × 3
  App          app  proc
  <chr>      <int> <int>
1 2021-05-11     4     3
2 2021-05-12     2     2
3 2021-05-13     7     5

CodePudding user response:

Here is another approach using pivot_longer and pivot_wider from tidyr. You can select the App and Proc columns for pivoting, and remove rows where NA is the value.

library(tidyverse)

df1 %>%
  pivot_longer(cols = c(App, Proc), values_drop_na = T) %>%
  count(value, name) %>%
  pivot_wider(id_cols = value, names_from = name, values_from = n)

Output

  value        App  Proc
  <chr>      <int> <int>
1 2021-05-11     4     3
2 2021-05-12     2     2
3 2021-05-13     7     5
  • Related