Cond. formula if the sequence of the last five lines is > 0.5 then "El Niño" and if the

Time:10-03

I'm trying to write a formula with dplyr::mutate in R that shows, for each row of my data frame, whether El Niño or La Niña had occurred just before it.

The decision rule is:

If $TMA_{t-1} > 0.5 \,\, \mbox{and} \,\, TMA_{t-2} > 0.5 \,\, \mbox{and} \,\, TMA_{t-3} > 0.5 \,\, \mbox{and} \,\, TMA_{t-4} > 0.5 \,\, \mbox{and} \,\, TMA_{t-5} > 0.5 \,\, \mbox{then} \,\, \mbox{"El Niño"}$

else if

$TMA_{t-1} < -0.5 \,\, \mbox{and} \,\, TMA_{t-2} < -0.5 \,\, \mbox{and} \,\, TMA_{t-3} < -0.5 \,\, \mbox{and} \,\, TMA_{t-4} < -0.5 \,\, \mbox{and} \,\, TMA_{t-5} < -0.5 \,\, \mbox{then} \,\, \mbox{"La Niña"}$

otherwise, if neither condition holds, leave the value blank.

More specifically:

If the five most recent consecutive TMA values are all > 0.5, then "El Niño"; otherwise, if the five most recent consecutive TMA values are all < -0.5, then "La Niña". If neither condition holds, leave the value blank (NA or NULL, for example).
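
The same rule can be restated compactly with a minimum and a maximum over the five preceding values. This is only a restatement, assuming $C_t$ (a symbol introduced here, not in the original post) denotes the label assigned to month $t$:

$$C_t = \begin{cases} \mbox{"El Niño"} & \mbox{if } \min_{1 \le k \le 5} TMA_{t-k} > 0.5 \\ \mbox{"La Niña"} & \mbox{if } \max_{1 \le k \le 5} TMA_{t-k} < -0.5 \\ \mbox{NA} & \mbox{otherwise} \end{cases}$$

All five values exceed 0.5 exactly when the smallest of them does, and all five are below -0.5 exactly when the largest of them is, so the two forms agree.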

Here is a small view of the problem, which I have already solved in a spreadsheet:

[Image: Excel formula for the decision rule]

In the Portuguese version of Excel, =SE(E means =IF(AND ...

In R, the data frame can be built like this:

library(dplyr)
library(fpp3)

dates <- yearmonth(c(
       "2018-02", 
       "2018-03",
       "2018-04",
       "2018-05", 
       "2018-06",
       "2018-07",      
       "2018-08", 
       "2018-09",
       "2018-10",
       "2018-11", 
       "2018-12",
       "2019-01",  
       "2019-02", 
       "2019-03",
       "2019-04",
       "2019-05", 
       "2018-06"
        ))

TMA <- c(
  -0.85,
  -0.69,
  -0.50,
  -0.22,
  -0.01,
   0.09,
   0.23,
   0.49,
   0.76,
   0.90,
   0.82,
   0.75,
   0.73,
   0.72,
   0.66,
   0.54,
   0.45 
  )

df <- data.frame(dates, TMA)

df <- df %>%
  mutate(
    `Climatic Condition` =
      # The conditional statement described above goes here (this is the part I need help with)
  )

How can I complete the `Climatic Condition` column inside dplyr::mutate in R?

CodePudding user response:

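You can use a rolling operation from zoo. rollapplyr() checks whether all five values in each right-aligned window satisfy the condition, and lag() shifts the result down one row so that row t reflects the values at t-1 through t-5.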

library(dplyr)
library(zoo)

df %>%
  mutate(climatic_condition = lag(case_when(
             rollapplyr(TMA < -0.5, 5, all, fill = FALSE) ~ "La Niña", 
             rollapplyr(TMA >  0.5, 5, all, fill = FALSE) ~ "El Niño")
         ))

#      dates   TMA climatic_condition
#1  2018 Feb -0.85               <NA>
#2  2018 Mar -0.69               <NA>
#3  2018 Apr -0.50               <NA>
#4  2018 May -0.22               <NA>
#5  2018 Jun -0.01               <NA>
#6  2018 Jul  0.09               <NA>
#7  2018 Aug  0.23               <NA>
#8  2018 Sep  0.49               <NA>
#9  2018 Oct  0.76               <NA>
#10 2018 Nov  0.90               <NA>
#11 2018 Dec  0.82               <NA>
#12 2019 Jan  0.75               <NA>
#13 2019 Feb  0.73               <NA>
#14 2019 Mar  0.72            El Niño
#15 2019 Apr  0.66            El Niño
#16 2019 May  0.54            El Niño
#17 2018 Jun  0.45            El Niño

CodePudding user response:

You could spell out the five lags explicitly:

library(dplyr)

df %>% 
  mutate(condition = case_when(
    lag(TMA) > 0.5 & lag(TMA, 2) > 0.5 & lag(TMA, 3) > 0.5 & lag(TMA, 4) > 0.5 & lag(TMA, 5) > 0.5 ~ "El Niño",
    lag(TMA) < -0.5 & lag(TMA, 2) < -0.5 & lag(TMA, 3) < -0.5 & lag(TMA, 4) < -0.5 & lag(TMA, 5) < -0.5 ~ "La Niña")
    )

This returns

        dates   TMA condition
1  2018-02-01 -0.85      <NA>
2  2018-03-01 -0.69      <NA>
3  2018-04-01 -0.50      <NA>
4  2018-05-01 -0.22      <NA>
5  2018-06-01 -0.01      <NA>
6  2018-07-01  0.09      <NA>
7  2018-08-01  0.23      <NA>
8  2018-09-01  0.49      <NA>
9  2018-10-01  0.76      <NA>
10 2018-11-01  0.90      <NA>
11 2018-12-01  0.82      <NA>
12 2019-01-01  0.75      <NA>
13 2019-02-01  0.73      <NA>
14 2019-03-01  0.72   El Niño
15 2019-04-01  0.66   El Niño
16 2019-05-01  0.54   El Niño
17 2018-06-01  0.45   El Niño

There are more sophisticated ways but this is an easy approach.
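
One possible way to avoid writing each lag by hand is sketched below. It assumes a helper named prev_n_all (hypothetical, not part of dplyr or of the answers above) that combines lagged copies of a logical vector with Reduce(), so the window length becomes a parameter. The TMA values are the ones from the question.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): for each position t, TRUE when the
# n values before t (t-1, ..., t-n) are all TRUE in `hits`.
# Positions with fewer than n predecessors come out as NA or FALSE,
# and case_when() treats both as "no match".
prev_n_all <- function(hits, n = 5) {
  Reduce(`&`, lapply(seq_len(n), function(k) dplyr::lag(hits, k)))
}

TMA <- c(-0.85, -0.69, -0.50, -0.22, -0.01, 0.09, 0.23, 0.49, 0.76,
         0.90, 0.82, 0.75, 0.73, 0.72, 0.66, 0.54, 0.45)
df <- data.frame(TMA = TMA)

res <- df %>%
  mutate(condition = case_when(
    prev_n_all(TMA > 0.5)  ~ "El Niño",
    prev_n_all(TMA < -0.5) ~ "La Niña"
  ))
```

With n = 5 this should label the same rows as the explicit version above; changing n adjusts the window without touching the case_when() call.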
