How to calculate values for the first row that meets a certain condition?

Time:06-23

I have the following dummy dataframe:

t <- data.frame(
           a= c(0,0,2,4,5),
           b= c(0,0,4,6,5))
a   b
0   0
0   0
2   4
4   6
5   5

I want to replace just the first value that is not zero in column b. Suppose the row that meets this criterion is i. I want to replace t$b[i] with t$b[i + 2] + t$b[i + 1], and the rest of t$b should remain the same. So the output would be:

a   b
0   0
0   0
2  11
4   6
5   5

In practice the dataset is dynamic, so I cannot point to a specific row directly; the row has to be found as the first row where column b is not equal to zero. How can I create this new t$b?

CodePudding user response:

Here is a straightforward solution in base R:

t <- data.frame(
  a= c(0,0,2,4,5),
  b= c(0,0,4,6,5))


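# index of the first row where b is not zero
ind <- which(t$b != 0)[1L]
# replace that value with the sum of the two values that follow it
t$b[ind] <- t$b[ind + 2L] + t$b[ind + 1L]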
t
  a  b
1 0  0
2 0  0
3 2 11
4 4  6
5 5  5
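
The same idea can be wrapped in a small function that also covers two cases the question does not specify: a column with no non-zero value, and a first non-zero value with fewer than two rows after it. This is only a sketch. The name replace_first_nonzero is made up for illustration, and leaving the vector unchanged in those two cases is an assumption, not something the question asks for. The data frame is called df here to avoid masking base R's t().

```r
# Hypothetical helper (not part of the answer above): replace the first
# non-zero element of x with the sum of the two elements that follow it.
replace_first_nonzero <- function(x) {
  ind <- which(x != 0)[1L]  # NA when x has no non-zero element
  # Assumed behaviour: return x unchanged when there is no non-zero value
  # or when fewer than two elements follow it.
  if (is.na(ind) || ind + 2L > length(x)) {
    return(x)
  }
  x[ind] <- x[ind + 1L] + x[ind + 2L]
  x
}

df <- data.frame(a = c(0, 0, 2, 4, 5), b = c(0, 0, 4, 6, 5))
df$b <- replace_first_nonzero(df$b)
df$b
```

Without the guard, a first non-zero value in one of the last two rows would be replaced by NA, because indexing past the end of a vector returns NA.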

CodePudding user response:

Here is a roundabout way of getting there with a combination of group_by() and mutate():

library(tidyverse)

t %>%
  mutate(
    b_cond = b != 0, 
    row_number = row_number()
  ) %>%
  group_by(b_cond) %>%
  mutate(
    min_row_number = row_number == min(row_number),
    b = if_else(b_cond & min_row_number, lead(b, 1) + lead(b, 2), b)
  ) %>%
  ungroup() %>%
  select(a, b) # optional, to get back to original columns

# A tibble: 5 × 2
      a     b
  <dbl> <dbl>
1     0     0
2     0     0
3     2    11
4     4     6
5     5     5
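
One thing to keep in mind with the grouped version: inside group_by(), lead() looks ahead only within the current group, so zero rows that come after the first non-zero row are skipped when the two following values are picked. If the literal next two rows are wanted instead, an ungrouped variant along these lines might work. It is a sketch, not the answerer's method: the column name is_first is made up, and the data frame is called df rather than t.

```r
library(dplyr)

df <- data.frame(a = c(0, 0, 2, 4, 5), b = c(0, 0, 4, 6, 5))

result <- df %>%
  mutate(
    # TRUE only on the first row where b is non-zero;
    # all FALSE when no such row exists (%in% never returns NA)
    is_first = row_number() %in% which(b != 0)[1],
    # lead() here is ungrouped, so it reads the next two rows as they appear
    b = if_else(is_first, lead(b, 1) + lead(b, 2), b)
  ) %>%
  select(a, b)

result
```

On the example data both versions give the same result, since every row after the first non-zero value is itself non-zero.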