I understand transition from Excel to R comes with some learning. Here, I would like to do update column 'Intensity' based on if else function. In Excel this will something read as if Time =100 then Intensity will be same as the value in Rainfall for anything else it will be the value in the next cell minus the cell value above (example =IF(C2=1000,D2,D2-D1). I tried some simple codes with my limited knowledge but couldn't get it through. Further, I would like to change the values under the column Time to a proper time value. May be this should be dealt before the if else. Below is the code that I managed to progress so far. Appreciate all inputs.
setwd ("C:/Users/")
files <- list.files(path="C:/Users/", full.names = T, recursive = F)
listcsv <- lapply(files, function(x) read.csv(paste0(x)))
data <- do.call(rbind, lapply
(files, read.csv, as.is=T, skip = 6, header = TRUE))
keeps <- data[c("Date", "Time", "Rainfall")]
keeps$Date_Formatted <- as.Date(keeps$Date, format = "%d/%m/%Y")
dateorder <- keeps[order(keeps$Date_Formatted,decreasing = FALSE) , ]
dateorder['Intensity'] <- NA
dput(dateorder[1:10, ])
CodePudding user response:
This is best done in a grouping method.
dplyr
library(dplyr)
dateorder %>%
group_by(grp = cumsum(Time == 1000)) %>%
mutate(Intensity = c(Rainfall[1], diff(Rainfall))) %>%
ungroup() %>%
select(-grp)
# # A tibble: 10 x 5
# Date Time Rainfall Date_Formatted Intensity
# <chr> <int> <dbl> <date> <dbl>
# 1 30/04/2021 1000 0.4 2021-04-30 0.4
# 2 30/04/2021 1100 0.4 2021-04-30 0
# 3 30/04/2021 1200 0.6 2021-04-30 0.200
# 4 30/04/2021 1300 0.8 2021-04-30 0.2
# 5 30/04/2021 1400 0.8 2021-04-30 0
# 6 30/04/2021 1500 1 2021-04-30 0.200
# 7 30/04/2021 1600 0 2021-04-30 -1
# 8 30/04/2021 1700 0 2021-04-30 0
# 9 30/04/2021 1800 0 2021-04-30 0
# 10 30/04/2021 1900 0 2021-04-30 0
base R
grp <- cumsum(dateorder$Time == 1000)
dateorder$Intensity <- ave(dateorder$Rainfall, grp, FUN = function(z) c(z[1], diff(z)))
dateorder
# Date Time Rainfall Date_Formatted Intensity
# 1 30/04/2021 1000 0.4 2021-04-30 0.4
# 2 30/04/2021 1100 0.4 2021-04-30 0.0
# 3 30/04/2021 1200 0.6 2021-04-30 0.2
# 4 30/04/2021 1300 0.8 2021-04-30 0.2
# 5 30/04/2021 1400 0.8 2021-04-30 0.0
# 6 30/04/2021 1500 1.0 2021-04-30 0.2
# 7 30/04/2021 1600 0.0 2021-04-30 -1.0
# 8 30/04/2021 1700 0.0 2021-04-30 0.0
# 9 30/04/2021 1800 0.0 2021-04-30 0.0
# 10 30/04/2021 1900 0.0 2021-04-30 0.0
Data
dateorder <- structure(list(Date = c("30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021"), Time = c(1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 1900L), Rainfall = c(0.4, 0.4, 0.6, 0.8, 0.8, 1, 0, 0, 0, 0), Date_Formatted = structure(c(18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747 ), class = "Date"), Intensity = c(0.4, 0, 0.2, 0.2, 0, 0.2, -1, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame")