Issue loading ONI data using tidyverse R

Time:10-23

I'm trying to extract and load the ONI data from NOAA, but I'm running into this issue:

ONI <- read_table("https://psl.noaa.gov/data/correlation/oni.data",
       skip = 1, 
       n_max = 74,
       col_names = FALSE) %>%
set_names(c('year', month.abb)) %>%
pivot_longer(-year, 
           names_to = 'month',
           values_to = 'ONI')

head(ONI)

The error shown is: Error: Can't combine `Jan` <character> and `Feb` <double>.

How can I solve this?

I want to build a time series that looks like this:

YearMonth  ONI
1950 jan   -1.53
1950 feb   -1.34
1950 mar   -1.16
...         ...
2021 dec   -99.90

Thanks a lot for your contribution, but I found another issue:

[Image: my code in RMarkdown]

The value for the first month of 1950 should be -1.53, not -1.34 (see https://psl.noaa.gov/data/correlation/oni.data).

How can I fix it?

CodePudding user response:

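The reason is that read_table() guessed the column boundaries incorrectly, so the first two columns are read as character (each holds two values in one string) while the rest are numeric:

> ONI <- read_table("https://psl.noaa.gov/data/correlation/oni.data",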
        skip = 1, 
        n_max = 74,
        col_names = FALSE) 

> str(ONI)
spec_tbl_df [74 × 11] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ X1 : chr [1:74] "1950  -1.53" "1951  -0.82" "1952   0.53" "1953   0.40" ...
 $ X2 : chr [1:74] "-1.34  -1.16" "-0.54  -0.17" "0.37   0.34" "0.60   0.63" ...
 $ X3 : num [1:74] -1.18 0.18 0.29 0.66 -0.41 -0.8 -0.54 0.72 0.93 0.33 ...
 $ X4 : num [1:74] -1.07 0.36 0.2 0.75 -0.54 -0.79 -0.52 0.92 0.74 0.2 ...
 $ X5 : num [1:74] -0.85 0.58 0 0.77 -0.5 -0.72 -0.51 1.11 0.64 -0.07 ...
 $ X6 : num [1:74] -0.54 0.7 -0.08 0.75 -0.64 -0.68 -0.57 1.25 0.57 -0.18 ...
 $ X7 : num [1:74] -0.42 0.89 0 0.73 -0.84 -0.75 -0.55 1.32 0.43 -0.28 ...
 $ X8 : num [1:74] -0.39 0.99 0.15 0.78 -0.9 -1.09 -0.46 1.33 0.39 -0.09 ...
 $ X9 : num [1:74] -0.44 1.15 0.1 0.84 -0.77 -1.42 -0.42 1.39 0.44 -0.03 ...
 $ X10: num [1:74] -0.6 1.04 0.04 0.84 -0.73 -1.67 -0.43 1.53 0.5 0.05 ...
 $ X11: num [1:74] -0.8 0.81 0.15 0.81 -0.66 -1.47 -0.43 1.74 0.61 -0.04 ...
...

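If we convert the second column to numeric, pivot_longer() no longer raises the type error. Note the NA in the output, though: each X2 entry holds two values (e.g. "-1.34  -1.16"), so as.numeric() cannot parse it, and X1 still holds the year together with the January value.

> ONI$X2 <- as.numeric(ONI$X2)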
> pivot_longer(ONI, cols = -X1, names_to = 'month', values_to = 'ONI')
# A tibble: 740 × 3
   X1          month   ONI
   <chr>       <chr> <dbl>
 1 1950  -1.53 X2    NA   
 2 1950  -1.53 X3    -1.18
 3 1950  -1.53 X4    -1.07
 4 1950  -1.53 X5    -0.85
 5 1950  -1.53 X6    -0.54
 6 1950  -1.53 X7    -0.42
 7 1950  -1.53 X8    -0.39
 8 1950  -1.53 X9    -0.44
 9 1950  -1.53 X10   -0.6 
10 1950  -1.53 X11   -0.8 
# … with 730 more rows
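
To avoid the mis-split columns in the first place, one option (a sketch, not part of the original answer) is to parse the file on whitespace with base R's read.table() instead of read_table(). The three sample rows below are rebuilt from the values printed in the str() output above, so the example runs without a network connection; the names oni_text, oni_wide and oni_long are illustrative.

```r
library(tidyr)

# Three rows rebuilt from the str() output above, standing in for the file.
oni_text <- "
1950  -1.53  -1.34  -1.16  -1.18  -1.07  -0.85  -0.54  -0.42  -0.39  -0.44  -0.60  -0.80
1951  -0.82  -0.54  -0.17   0.18   0.36   0.58   0.70   0.89   0.99   1.15   1.04   0.81
1952   0.53   0.37   0.34   0.29   0.20   0.00  -0.08   0.00   0.15   0.10   0.04   0.15
"

# With the default sep = "", read.table() splits on any run of whitespace,
# so every month lands in its own numeric column.
oni_wide <- read.table(text = oni_text, col.names = c("year", month.abb))

# Reshape to one row per year/month pair.
oni_long <- pivot_longer(oni_wide, cols = -year,
                         names_to = "month", values_to = "ONI")

head(oni_long)
```

For the real file, the same call would presumably take the URL in place of text = oni_text, together with skip = 1 and nrows = 74 (the values used in the question); that part is untested here.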

CodePudding user response:

Akrun is correct. There is a second issue: your column X1 contains two pieces of information, the year and the January value. You can use stringr to tidy them up. Example:

 library(dplyr)
 library(stringr)

 data <- data.frame(
   X1 = c("1950   -1.53", "1951  -0.82", "1952 0.53")
 ) 

 editedData <- data %>%
   # collapse repeated spaces, then split each string into its two parts
   mutate(Col1 = str_split(str_squish(X1), " ")) %>%
   rowwise() %>%
   # Y keeps the year (as character); X1 becomes the numeric January value
   mutate(Y = Col1[1],
          X1 = as.numeric(Col1[2]))
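
As an alternative to the str_split() / rowwise() approach, the same split can be sketched with tidyr::separate(), which handles the splitting and the type conversion in one call. The output column names year and Jan and the name split_data are chosen here for illustration only.

```r
library(tidyr)

data <- data.frame(
  X1 = c("1950   -1.53", "1951  -0.82", "1952 0.53")
)

# Split on one or more whitespace characters; convert = TRUE turns the
# resulting strings into integer and double columns.
split_data <- separate(data, X1, into = c("year", "Jan"),
                       sep = "\\s+", convert = TRUE)

str(split_data)
```

The same call could be repeated for X2, which holds the February and March values in one string.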