Home > Enterprise >  If else function to update column in R
If else function to update column in R

Time:12-18

I understand transition from Excel to R comes with some learning. Here, I would like to do update column 'Intensity' based on if else function. In Excel this will something read as if Time =100 then Intensity will be same as the value in Rainfall for anything else it will be the value in the next cell minus the cell value above (example =IF(C2=1000,D2,D2-D1). I tried some simple codes with my limited knowledge but couldn't get it through. Further, I would like to change the values under the column Time to a proper time value. May be this should be dealt before the if else. Below is the code that I managed to progress so far. Appreciate all inputs.

setwd ("C:/Users/")
files <- list.files(path="C:/Users/", full.names = T, recursive = F)

listcsv <- lapply(files, function(x) read.csv(paste0(x)))
data <- do.call(rbind, lapply
                (files, read.csv, as.is=T, skip = 6, header = TRUE))
keeps <- data[c("Date", "Time", "Rainfall")]
keeps$Date_Formatted <- as.Date(keeps$Date, format = "%d/%m/%Y")
dateorder <- keeps[order(keeps$Date_Formatted,decreasing = FALSE) , ]
dateorder['Intensity'] <- NA
dput(dateorder[1:10, ])

CodePudding user response:

This is best done in a grouping method.

dplyr

library(dplyr)
dateorder %>%
  group_by(grp = cumsum(Time == 1000)) %>%
  mutate(Intensity = c(Rainfall[1], diff(Rainfall))) %>%
  ungroup() %>%
  select(-grp)
# # A tibble: 10 x 5
#    Date        Time Rainfall Date_Formatted Intensity
#    <chr>      <int>    <dbl> <date>             <dbl>
#  1 30/04/2021  1000      0.4 2021-04-30         0.4  
#  2 30/04/2021  1100      0.4 2021-04-30         0    
#  3 30/04/2021  1200      0.6 2021-04-30         0.200
#  4 30/04/2021  1300      0.8 2021-04-30         0.2  
#  5 30/04/2021  1400      0.8 2021-04-30         0    
#  6 30/04/2021  1500      1   2021-04-30         0.200
#  7 30/04/2021  1600      0   2021-04-30        -1    
#  8 30/04/2021  1700      0   2021-04-30         0    
#  9 30/04/2021  1800      0   2021-04-30         0    
# 10 30/04/2021  1900      0   2021-04-30         0    

base R

grp <- cumsum(dateorder$Time == 1000)
dateorder$Intensity <- ave(dateorder$Rainfall, grp, FUN = function(z) c(z[1], diff(z)))
dateorder
#          Date Time Rainfall Date_Formatted Intensity
# 1  30/04/2021 1000      0.4     2021-04-30       0.4
# 2  30/04/2021 1100      0.4     2021-04-30       0.0
# 3  30/04/2021 1200      0.6     2021-04-30       0.2
# 4  30/04/2021 1300      0.8     2021-04-30       0.2
# 5  30/04/2021 1400      0.8     2021-04-30       0.0
# 6  30/04/2021 1500      1.0     2021-04-30       0.2
# 7  30/04/2021 1600      0.0     2021-04-30      -1.0
# 8  30/04/2021 1700      0.0     2021-04-30       0.0
# 9  30/04/2021 1800      0.0     2021-04-30       0.0
# 10 30/04/2021 1900      0.0     2021-04-30       0.0

Data

dateorder <- structure(list(Date = c("30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021", "30/04/2021"), Time = c(1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 1900L), Rainfall = c(0.4, 0.4, 0.6, 0.8, 0.8, 1, 0, 0, 0, 0), Date_Formatted = structure(c(18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747 ), class = "Date"), Intensity = c(0.4, 0, 0.2, 0.2, 0, 0.2, -1, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame") 
  • Related