group by one column and get the min returns NA

Time:06-25

Here is the data:

    rawdata <- data.frame(
         OIL = c(435L,1386L,3019L,2690L,2516L,1977L,
                 2016L,1846L,1852L,1608L,1375L,1531L,1309L,1275L,1275L,
                 1114L),
         GAS = c(0L,1923L,3224L,3475L,2382L,1691L,
                 2706L,2393L,2066L,2194L,1916L,2127L,1723L,1532L,2012L,
                 1951L),
     PROPNUM = as.factor(c("49-005-27619",
                           "49-005-27619","49-005-27619","49-005-27619",
                           "49-005-27619","49-005-27619","49-005-27619","49-005-27619",
                           "49-005-27619","49-005-27619","49-005-27619",
                           "49-005-27619","49-005-27619","49-005-27619",
                           "49-005-27619","49-005-27619")),
      P_DATE = as.factor(c("9/30/1984",
                           "10/31/1984","11/30/1984","12/31/1984","1/31/1985",
                           "2/28/1985","3/31/1985","4/30/1985","5/31/1985",
                           "6/30/1985","7/31/1985","8/31/1985","9/30/1985",
                           "10/31/1985","11/30/1985","12/31/1985"))
)

I want to group by PROPNUM and get the minimum date. Here's my code:

avg <- rawdata %>%
  group_by(PROPNUM) %>%
  summarise(initial_date = min(as.Date(P_DATE, format = "%m.%d.%Y")))

But I'm only getting NAs.

How can I fix it?

CodePudding user response:

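The issue is that the format string doesn't match your dates. Your dates are separated by / (e.g., 12/31/1985) rather than ., so the format passed to as.Date must use the same separator (i.e., format = "%m/%d/%Y"). When the format doesn't match the text, as.Date returns NA for every value, and the min of those is NA as well.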

library(tidyverse)

avg <- rawdata %>%
  group_by(PROPNUM) %>%
  summarise(initial_date = min(as.Date(P_DATE, format = "%m/%d/%Y"), na.rm = TRUE))

Output

  PROPNUM      initial_date
  <fct>        <date>      
1 49-005-27619 1984-09-30 
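
To see the cause in isolation, here is a minimal base R sketch that parses a few dates with both format strings. The `dates`, `wrong`, and `right` names are made up for illustration, and the three values are just a small sample in the same month/day/year layout as P_DATE.

```r
# Illustrative sample in the same layout as P_DATE (stored as a factor, as in the question)
dates <- factor(c("9/30/1984", "10/31/1984", "12/31/1985"))

# "." in the format does not match the "/" in the data, so every value fails to parse
wrong <- as.Date(dates, format = "%m.%d.%Y")

# "/" in the format matches the data, so parsing succeeds
right <- as.Date(dates, format = "%m/%d/%Y")

all(is.na(wrong))  # TRUE: nothing was parsed
min(right)         # "1984-09-30"
```

A quick check like `all(is.na(...))` on the parsed column is a cheap way to tell a format mismatch apart from genuinely missing data before summarising.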