How to change "year.month" format into "Year-Month" format in R

Time:10-26

I have a data frame that looks like this:

Month           GSI
1993.01     -0.57567056
1993.02     -1.15549239
1993.03     -1.00353071
1993.04     -0.10698880
1993.05     -0.31903591
1993.06      0.30361638
1993.07      1.24528915
1993.08      0.85104370
1993.09      1.24680092
1993.10      1.42521406

As you can see, the "Month" column is meant to be a date in "year.month" format. I would like to reformat this column to the usual "%Y-%m" format so that the data frame looks like this:

  Date          GSI
1993-01     -0.57567056
1993-02     -1.15549239
1993-03     -1.00353071
1993-04     -0.10698880
1993-05     -0.31903591
1993-06      0.30361638
1993-07      1.24528915
1993-08      0.85104370
1993-09      1.24680092
1993-10      1.42521406

How can I go about changing the format of this column to be recognizable as a date column? Currently, the class of the "Month" column is numeric.

CodePudding user response:

Use the lubridate package.

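library(dplyr)
library(lubridate)

# Month is numeric, so 1993.10 turns into "1993.1" as text; pad to two decimals before parsing
df <- transmute(df, date = ym(sprintf("%.2f", Month)), GSI)

# if you don't know dplyr, use this instead:
df$date <- ym(sprintf("%.2f", df$Month))

Note that ym() returns a Date object (class "Date"), not POSIXct. transmute() keeps only the columns you name, so the Month variable is dropped and GSI has to be listed explicitly to be kept.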

lubridate is the gold standard package for working with date (and time) data in R. Find the cheatsheet here.
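
For comparison, here is a minimal base R sketch of the same conversion, with no packages needed. It assumes the column is numeric as in the question; month_num is a made-up stand-in for df$Month, and the day "01" is appended only because a Date needs a day.

```r
# Hypothetical stand-in for df$Month (numeric year.month values)
month_num <- c(1993.01, 1993.09, 1993.10)

# Plain text conversion loses the trailing zero of October
october_raw <- as.character(month_num[3])   # "1993.1"

# Force two decimals so every value reads as "YYYY.MM"
month_chr <- sprintf("%.2f", month_num)

# Append a day so as.Date() can build a real Date object
month_date <- as.Date(paste0(month_chr, ".01"), format = "%Y.%m.%d")

# Format back to "Year-Month" text when it is needed for display
month_label <- format(month_date, "%Y-%m")
```

Keeping month_date as a Date is useful for sorting and plotting; month_label is a character vector and is only meant for display.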

CodePudding user response:

You can use sub, with capturing groups in the regular expression:

df$Month <- sub("^(\\d{4})\\.(\\d{2})$", "\\1-\\2", format(df$Month, nsmall = 2))

df
#>      Month        GSI
#> 1  1993-01 -0.5756706
#> 2  1993-02 -1.1554924
#> 3  1993-03 -1.0035307
#> 4  1993-04 -0.1069888
#> 5  1993-05 -0.3190359
#> 6  1993-06  0.3036164
#> 7  1993-07  1.2452892
#> 8  1993-08  0.8510437
#> 9  1993-09  1.2468009
#> 10 1993-10  1.4252141
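
If you would rather avoid the regular expression, a rough alternative is to split the number arithmetically. This is only a sketch under the assumption that every value really is year plus month/100; month_num is again a made-up stand-in for df$Month.

```r
# Hypothetical stand-in for df$Month (numeric year.month values)
month_num <- c(1993.01, 1993.09, 1993.10)

# Integer part is the year
year <- floor(month_num)

# Fractional part times 100 is the month; round() absorbs floating-point error
month <- round((month_num - year) * 100)

# Zero-pad the month to two digits and join with a hyphen
month_label <- sprintf("%d-%02d", as.integer(year), as.integer(month))
```

Like the sub() approach, this produces a character column such as "1993-10", not a Date.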

Input Data

df <- structure(list(Month = c(1993.01, 1993.02, 1993.03, 1993.04, 
1993.05, 1993.06, 1993.07, 1993.08, 1993.09, 1993.1), GSI = c(-0.57567056, 
-1.15549239, -1.00353071, -0.1069888, -0.31903591, 0.30361638, 
1.24528915, 0.8510437, 1.24680092, 1.42521406)), class = "data.frame", row.names = c(NA, 
-10L))

df
#>      Month        GSI
#> 1  1993.01 -0.5756706
#> 2  1993.02 -1.1554924
#> 3  1993.03 -1.0035307
#> 4  1993.04 -0.1069888
#> 5  1993.05 -0.3190359
#> 6  1993.06  0.3036164
#> 7  1993.07  1.2452892
#> 8  1993.08  0.8510437
#> 9  1993.09  1.2468009
#> 10 1993.10  1.4252141