I'm trying to extract and load the ONI data from NOAA, but I'm running into this issue:
ONI <- read_table("https://psl.noaa.gov/data/correlation/oni.data",
                  skip = 1,
                  n_max = 74,
                  col_names = FALSE) %>%
  set_names(c('year', month.abb)) %>%
  pivot_longer(-year,
               names_to = 'month',
               values_to = 'ONI')
head(ONI)
The error shown is: Error: Can't combine `Jan` <character> and `Feb` <double>.
How can I solve this?
I want to build a time series that looks like this:
Year Month ONI
1950 jan -1.53
1950 feb -1.34
1950 mar -1.16
... ...
2021 dec -99.90
Thanks a lot for your contribution, but I found another issue:
The first month of 1950 should be -1.53, not -1.34 (see https://psl.noaa.gov/data/correlation/oni.data).
How can I fix it?
CodePudding user response:
The reason is that the second column (X2, which becomes `Jan` after renaming) is read as character, while the following columns are numeric:
> ONI <- read_table("https://psl.noaa.gov/data/correlation/oni.data",
                    skip = 1,
                    n_max = 74,
                    col_names = FALSE)
> str(ONI)
spec_tbl_df [74 × 11] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ X1 : chr [1:74] "1950 -1.53" "1951 -0.82" "1952 0.53" "1953 0.40" ...
$ X2 : chr [1:74] "-1.34 -1.16" "-0.54 -0.17" "0.37 0.34" "0.60 0.63" ...
$ X3 : num [1:74] -1.18 0.18 0.29 0.66 -0.41 -0.8 -0.54 0.72 0.93 0.33 ...
$ X4 : num [1:74] -1.07 0.36 0.2 0.75 -0.54 -0.79 -0.52 0.92 0.74 0.2 ...
$ X5 : num [1:74] -0.85 0.58 0 0.77 -0.5 -0.72 -0.51 1.11 0.64 -0.07 ...
$ X6 : num [1:74] -0.54 0.7 -0.08 0.75 -0.64 -0.68 -0.57 1.25 0.57 -0.18 ...
$ X7 : num [1:74] -0.42 0.89 0 0.73 -0.84 -0.75 -0.55 1.32 0.43 -0.28 ...
$ X8 : num [1:74] -0.39 0.99 0.15 0.78 -0.9 -1.09 -0.46 1.33 0.39 -0.09 ...
$ X9 : num [1:74] -0.44 1.15 0.1 0.84 -0.77 -1.42 -0.42 1.39 0.44 -0.03 ...
$ X10: num [1:74] -0.6 1.04 0.04 0.84 -0.73 -1.67 -0.43 1.53 0.5 0.05 ...
$ X11: num [1:74] -0.8 0.81 0.15 0.81 -0.66 -1.47 -0.43 1.74 0.61 -0.04 ...
...
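If we convert the second column to numeric, pivot_longer runs without error. Note that each X2 cell holds two numbers, so as.numeric() turns those values into NA:
> ONI$X2 <- as.numeric(ONI$X2)
> pivot_longer(ONI, cols = -X1, names_to = 'month', values_to = 'ONI')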
# A tibble: 740 × 3
X1 month ONI
<chr> <chr> <dbl>
1 1950 -1.53 X2 NA
2 1950 -1.53 X3 -1.18
3 1950 -1.53 X4 -1.07
4 1950 -1.53 X5 -0.85
5 1950 -1.53 X6 -0.54
6 1950 -1.53 X7 -0.42
7 1950 -1.53 X8 -0.39
8 1950 -1.53 X9 -0.44
9 1950 -1.53 X10 -0.6
10 1950 -1.53 X11 -0.8
# … with 730 more rows
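As the str() output shows, the parser merged the year with January and February with March. One possible way around that (a sketch, not part of the original answer) is base R's read.table(), which by default splits on any run of whitespace. The two sample rows below are copied from the str() output above; `sample_text`, `oni_wide` and `oni_long` are names chosen here for illustration.

```r
library(tidyr)

# Two rows reconstructed from the str() output above (1950 and 1951).
sample_text <- "
1950 -1.53 -1.34 -1.16 -1.18 -1.07 -0.85 -0.54 -0.42 -0.39 -0.44 -0.60 -0.80
1951 -0.82 -0.54 -0.17  0.18  0.36  0.58  0.70  0.89  0.99  1.15  1.04  0.81
"

# read.table() uses sep = "" by default, i.e. any amount of whitespace,
# so every value ends up in its own numeric column.
oni_wide <- read.table(text = sample_text,
                       col.names = c("year", month.abb))

# Reshape to one row per year/month pair.
oni_long <- pivot_longer(oni_wide, -year,
                         names_to = "month",
                         values_to = "ONI")
head(oni_long, 3)
```

For the real file, the same call would presumably take the URL as its first argument together with skip = 1 and nrows = 74, mirroring the skip and n_max values in the question; that part is an assumption and was not run here.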
CodePudding user response:
Akrun is correct. There is a second issue: your column X1 contains two pieces of information, the year and the January value. You can use stringr to split them. Example:
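library(dplyr)
library(stringr)

data <- data.frame(
  X1 = c("1950 -1.53", "1951 -0.82", "1952 0.53")
)
editedData <- data %>%
  # squash repeated whitespace, then split each string into c(year, value)
  mutate(Col1 = str_split(str_squish(X1), " ")) %>%
  # rowwise() makes Col1 refer to the character vector of the current row
  rowwise() %>%
  mutate(Y = Col1[1],               # year (still a character string)
         X1 = as.numeric(Col1[2]))  # January value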
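The same splitting idea could also be applied to X2, which holds February and March together. Below is a minimal sketch using tidyr::separate() instead of stringr (a swapped-in technique, not the answer's own method). The X2 values are taken from the str() output earlier in the thread, and `fixed` is a name chosen here for illustration.

```r
library(tidyr)

data <- data.frame(
  X1 = c("1950 -1.53", "1951 -0.82", "1952 0.53"),
  X2 = c("-1.34 -1.16", "-0.54 -0.17", "0.37 0.34")
)

# Split on whitespace; sep must be given explicitly because the default
# pattern would also split on "-" and ".".
# convert = TRUE turns the resulting strings into integer/numeric columns.
fixed <- separate(data, X1, into = c("year", "Jan"),
                  sep = "\\s+", convert = TRUE)
fixed <- separate(fixed, X2, into = c("Feb", "Mar"),
                  sep = "\\s+", convert = TRUE)
str(fixed)
```

After this step all month columns are numeric, so a subsequent pivot_longer() should no longer hit the character/double mismatch.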