How to Select Individual Rows by First Day of Month

Time:03-14

I have a data frame like this:

# A tibble: 6 × 4
  Entity  Code  Day        stringency_index
  <chr>   <chr> <date>                <dbl>
1 Germany DEU   2020-01-21             0   
2 Germany DEU   2020-01-22             0   
3 Germany DEU   2020-01-23             0   
4 Germany DEU   2020-01-24             5.56
5 Germany DEU   2020-01-25             5.56
6 Germany DEU   2020-01-26             5.56

I want to select only the rows that fall on the first day of a month, e.g. 2020-02-01. How can I do this?

CodePudding user response:

In base R, you can extract the day of a date object using the format function, and filter using bracket notation. It's always best to provide sample data using dput() when asking questions on SO, but here is a simple example.

df <- data.frame(d = seq(as.Date("2020-01-01"), as.Date("2020-03-01"), by = 1))

df[format(df$d, format="%d") == "01",]

[1] "2020-01-01" "2020-02-01" "2020-03-01"
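
Because the example frame above has a single column, df[rows, ] drops to a plain vector of dates. As a minimal sketch, assuming a multi-column frame shaped like the one in the question (the name covid and its values are made up for illustration), the same test can keep whole rows:

```r
# Hypothetical data shaped like the question's tibble (values are illustrative).
covid <- data.frame(
  Entity = "Germany",
  Code = "DEU",
  Day = seq(as.Date("2020-01-30"), as.Date("2020-02-02"), by = "day"),
  stringency_index = c(5.56, 5.56, 5.56, 5.56)
)

# format(x, "%d") returns the zero-padded day of the month as a string.
is_first <- format(covid$Day, "%d") == "01"

# drop = FALSE guarantees a data frame even when only one column remains.
first_rows <- covid[is_first, , drop = FALSE]
print(first_rows)
```

Only the 2020-02-01 row should survive here, with all four columns intact.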

CodePudding user response:

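Here is a combination of as.yearmon() from the zoo package with dplyr and lubridate. Note that it returns the earliest date present in each month, which is the first of the month only if that date exists in the data: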

library(zoo)
library(dplyr)
library(lubridate)

df %>%
  mutate(Day = ymd(Day)) %>% 
  group_by(yearMon = as.yearmon(Day)) %>% 
  arrange(Day) %>% 
  summarise(FirstDay = first(Day))

  yearMon   FirstDay  
  <yearmon> <date>    
1 Jan 2020  2020-01-01
2 Feb 2020  2020-02-01
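
The summarise() call above returns only the month and the date. If the whole row is wanted for the earliest observed day of each month, one possible approach is sketched below in base R (chosen here only to avoid extra dependencies; it is not part of the original answer). It rebuilds this answer's sample data with Day already converted to Date, and the helper names month_key, earliest and first_obs are made up for illustration:

```r
# Same sample data as in this answer, with Day stored as Date.
df <- data.frame(
  Entity = rep("Germany", 6),
  Code = rep("DEU", 6),
  Day = as.Date(c("2020-01-01", "2020-01-22", "2020-02-23",
                  "2020-02-24", "2020-02-01", "2020-01-26")),
  stringency_index = c(0, 0, 0, 5.56, 5.56, 5.56)
)

# Label each row with its year-month, e.g. "2020-01".
month_key <- format(df$Day, "%Y-%m")

# ave() computes the minimum date within each month, aligned with the rows.
earliest <- ave(as.numeric(df$Day), month_key, FUN = min)

# Keep the full rows whose date equals the earliest date in their month.
first_obs <- df[as.numeric(df$Day) == earliest, , drop = FALSE]
first_obs <- first_obs[order(first_obs$Day), ]
print(first_obs)
```

This keeps the first available day per month rather than strictly the 1st, which may matter for data like the question's, where the series starts on 2020-01-21.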

data:

structure(list(Entity = c("Germany", "Germany", "Germany", "Germany", 
"Germany", "Germany"), Code = c("DEU", "DEU", "DEU", "DEU", "DEU", 
"DEU"), Day = c("2020-01-01", "2020-01-22", "2020-02-23", "2020-02-24", 
"2020-02-01", "2020-01-26"), stringency_index = c(0, 0, 0, 5.56, 
5.56, 5.56)), class = "data.frame", row.names = c("1", "2", "3", 
"4", "5", "6"))

CodePudding user response:

You can use the day() function from the lubridate package. First, create some example data:

library(tidyverse)
library(lubridate)
df <- data.frame(Entity = c("Germany", "Germany", "Germany", "Germany", "Germany", "Germany"),
                 Code = c("DEU", "DEU", "DEU", "DEU", "DEU", "DEU"), 
                 Day = seq(as.Date("2020-01-01"), as.Date("2020-01-06"), by = 1),
                 stringency_index = c(0, 0, 0, 5.56, 5.56, 5.56))

Data:

   Entity Code        Day stringency_index
1 Germany  DEU 2020-01-01             0.00
2 Germany  DEU 2020-01-02             0.00
3 Germany  DEU 2020-01-03             0.00
4 Germany  DEU 2020-01-04             5.56
5 Germany  DEU 2020-01-05             5.56
6 Germany  DEU 2020-01-06             5.56

Then keep only the rows where the day of the month is 1:

df %>%
  filter(day(Day) == 1) 

Output:

   Entity Code        Day stringency_index
1 Germany  DEU 2020-01-01                0