case when for multiple conditions not working in R

Time:06-12

Here's my code:

    breakover <- function(correlation, growth_factor, wells_per_section, max_breakover = 16){
           case_when(correlation[wells_per_section == max_breakover] > 0.999 ~ max_breakover,
                all(growth_factor) > 0.8 ~ max_breakover,
                T ~ wells_per_section[min(which(growth_factor < 0.8))]-1)
    }
  

If the correlation between section_eur and wells_per_section at the 16th row is greater than 0.999, record the breakover spacing as 16.

If growth_factor is always > 0.8 (from the first row to the last row for the same reservoir_id), record the breakover spacing as 16.

If neither of the above is true, find the first row where growth_factor is < 0.8, take the row above it (the last value > 0.8), and record its wells_per_section. Without the line "all(growth_factor) > 0.8 ~ max_breakover," the function works fine; once I add it, the result is no longer what I want, but I do need that condition.

And here's the partial dataset if it helps to understand my problem:

sim_breakover <- tibble::tribble(
  ~reservoir_id, ~Wells_per_section,        ~well_eur,     ~section_eur,         ~incr_eur,      ~growth_factor,      ~correlation,
            187,                 1L, 23175.4846595876, 23175.4846595876,  23175.4846595876,                   1,                 1,
            187,                 2L, 23174.6488110432, 46349.2976220863,  23173.8129624988,   0.999963932633836,                 1,
            187,                 3L, 21927.6466708696, 65782.9400126088,  19433.6423905225,   0.886262109300568, 0.998718186856097,
            187,                 4L, 21972.0295210873, 87888.1180843492,  22105.1780717404,    1.00605991133069, 0.999487163006646,
            187,                 5L, 21442.6200328201,   107213.1001641,  19324.9820797511,   0.901241641654437, 0.999555673988887,
            187,                 6L,   21551.16613454,  129306.99680724,    22093.89664314,    1.02518334763009, 0.999743621689024,
            187,                 7L, 21230.3367379342, 148612.357165539,   19305.360358299,   0.909328975635338, 0.999784843928536,
            187,                 8L, 21011.4717674484, 168091.774139587,  19479.4169740481,   0.927084841540046, 0.999775907235973,
            187,                 9L, 18924.7167504405, 170322.450753965,  2230.67661437762,   0.917871070082235, 0.994486053177733,
            187,                10L, 17213.1637348062, 172131.637348062,    1809.186594097,   0.905104826862171, 0.984338205567282,
            187,                11L, 15809.6446592214, 173906.091251436,  1774.45390337371,   0.912238696164414, 0.971685031469237,
            187,                12L, 14650.6926441877, 175808.311730252,  1902.22047881651,   0.929838262600586, 0.958247380885512,
            187,                13L, 13690.7787061749, 177980.123180273,  2171.81145002131,   0.958633157151373, 0.945189752321453,
            187,                14L, 12885.4775455834, 180396.685638168,  2416.56245789409,   0.887541552056981, 0.933169233384585,
            187,                15L, 12194.1529755015, 182912.294632523,  2515.60899435563,   0.906296329020111, 0.922411528068806,
            187,                16L, 11594.1228761792, 185505.966018868,  2593.67138634462,    0.92370570107321,  0.91298990632797,
            188,                 1L, 53229.0340971704, 53229.0340971704,  53229.0340971704,                   1,                 1,
            188,                 2L, 48718.6235189102, 97437.2470378205,  44208.2129406501,   0.907419170483964,                 1,
            188,                 3L, 38155.0101507641, 114465.030452292,  17027.7834144718,   0.446279095384563, 0.968697765077586,
            188,                 4L, 29014.2865349962, 116057.146139985,  1592.11568769233,  0.0548735081171812, 0.906000812378873,
            188,                 5L, 23238.4800580768, 116192.400290384,  135.254150399196, 0.00582026664657817, 0.845608263691773,
            188,                 6L,  19401.171882342, 116407.031294052,  214.631003667935,  0.0110627855353048, 0.794560011850183,
            188,                 7L,  16624.245273923, 116369.716917461, -37.3143765906134, 0.00224457567701706, 0.750178915017314,
            188,                 8L, 14543.4618771122, 116347.695016898, -22.0219005634426, 0.00151421310479726, 0.711758427551842,
            188,                 9L, 12940.9039085222,   116468.1351767,  120.440159802267, 0.00930693564017202, 0.679233692144686,
            188,                10L,  11683.489800248,  116834.89800248,  366.762825780053,  0.0313915475641763, 0.653015744709973,
            188,                11L, 10541.6057608886, 115957.663369774, -877.234632705688,   0.083216414330386, 0.623584083166208,
            188,                12L, 9680.48240644707, 116165.788877365,  208.125507590506,  0.0214994975304018, 0.599187130726719,
            188,                13L, 9030.53175628351, 117396.912831686,  1231.12395432078,   0.136329065391321, 0.585029584969854,
            188,                14L, 8394.90504051518, 117528.670567212,  131.757735526859,  0.0156949643731495, 0.572171242633441,
            188,                15L, 7771.90067510843, 116578.510126626, -950.160440586042,   0.122255865110215, 0.554230322803739,
            188,                16L, 7335.24970926487, 117363.995348238,  785.485221611496,   0.107083637605326, 0.542352922481248
  )

I'll then apply this function to get a new data frame:

breakover_spacing <- sim_breakover %>%
  dplyr::group_by(reservoir_id)%>%
  dplyr::summarise(breakover = breakover(correlation, growth_factor, Wells_per_section))

This is the output I expect:

  reservoir_id, breakover
       187         16
       188          2
    

I think the problem is the case_when condition "all(growth_factor) > 0.8 ~ max_breakover".

CodePudding user response:

There are a couple of issues in the breakover code. First, all(growth_factor) > 0.8 should be all(growth_factor > 0.8): all(growth_factor) coerces the numbers to logical and returns TRUE whenever no value is 0, and TRUE > 0.8 is then always TRUE, so the comparison with 0.8 has to happen inside all(). Second, the first condition should be wrapped in any() so that it yields a single TRUE or FALSE. Also, since this is a summarisation that returns a single value per group, we can use if/else instead of case_when:
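To make the effect of the misplaced parenthesis concrete, here is a small base R sketch. The vector `gf` is made-up illustration data, not taken from the question.

```r
# Hypothetical growth factors: all nonzero, but two are well below 0.8
gf <- c(1, 0.9, 0.45, 0.05)

# all(gf) coerces the numbers to logical (nonzero -> TRUE, with a warning),
# so this ends up comparing TRUE (i.e. 1) with 0.8
wrong <- suppressWarnings(all(gf)) > 0.8

# Compare each element with 0.8 first, then reduce with all()
right <- all(gf > 0.8)

print(wrong)  # TRUE
print(right)  # FALSE
```

Because `wrong` is TRUE for any vector without zeros, the second case_when branch in the question matches for almost every group and returns max_breakover.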

library(dplyr)
sim_breakover %>%
  group_by(reservoir_id) %>%
  summarise(
    breakover = if (any(correlation > 0.999 & Wells_per_section == 16) ||
                    all(growth_factor > 0.8)) {
      16
    } else {
      first(Wells_per_section[growth_factor < 0.8]) - 1
    }
  )

-output

# A tibble: 2 × 2
  reservoir_id breakover
         <dbl>     <dbl>
1          187        16
2          188         2
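If you would rather keep a standalone helper, the same logic could be wrapped back into breakover() roughly as below. This is only a sketch in base R: it assumes there are no NA values, that the spacings increase in steps of 1, and that no growth_factor is exactly 0.8 (in that case the else branch would return NA). The two calls at the end use made-up vectors, not the question's data.

```r
# Sketch of a corrected breakover(); argument names follow the question
breakover <- function(correlation, growth_factor, wells_per_section,
                      max_breakover = 16) {
  # Condition 1: correlation at the maximum spacing exceeds 0.999
  high_corr <- any(correlation[wells_per_section == max_breakover] > 0.999)
  # Condition 2: every growth factor is above 0.8
  always_growing <- all(growth_factor > 0.8)

  if (high_corr || always_growing) {
    max_breakover
  } else {
    # Spacing at the first growth_factor below 0.8, minus one
    # (assumes spacings increase in steps of 1)
    wells_per_section[which(growth_factor < 0.8)[1]] - 1
  }
}

# Hypothetical inputs with a maximum spacing of 4
res_drop <- breakover(c(1, 1, 0.97, 0.91), c(1, 0.91, 0.45, 0.05), 1:4,
                      max_breakover = 4)
res_flat <- breakover(c(1, 1, 0.99, 0.98), c(1, 0.95, 0.9, 0.85), 1:4,
                      max_breakover = 4)
print(res_drop)  # 2
print(res_flat)  # 4
```

It can then be called inside summarise() exactly as in the question, since it returns a single value per group.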