I am trying to generate values in column D that appears every 3 rows. The first value in D are the sum of the first three values in A divided by the sum of the first three values in B. The second value in D should be in the fourth row that is the sum of rows 4-6 in A divided by sum of rows 4-6 in B and so on.
A B D
1 2 1 3 4/2 2 3
3 2
4 3
4 4 4 5 6/4 3 2
5 3
6 2
CodePudding user response:
You may try using dplyr
library(dplyr)
df <- read.table(text = "
A B
1 2
3 2
4 3
4 4
5 3
6 2", header = T)
df %>%
mutate(k = floor((1:n()-1)/3),
n = 1:n()) %>%
group_by(k) %>%
mutate(D = ifelse(n %% 3 == 1, sum(A)/sum(B), NA)) %>%
ungroup %>%
select(-k, -n)
A B D
<int> <int> <dbl>
1 1 2 1.14
2 3 2 NA
3 4 3 NA
4 4 4 1.67
5 5 3 NA
6 6 2 NA
Update
Thanks to @Jean-ClaudeArbaut
df <- read.table(text = "
A B
1 2
3 2
4 3
4 4
5 3
6 2
1 'data suppressed'
1 3
'data suppressed' 3", header = T)
df %>%
mutate(n = row_number(),
k = ceiling(n/3),
) %>%
group_by(k) %>%
mutate(D = ifelse(n %% 3 == 1, sum(as.numeric(A), na.rm = T)/sum(as.numeric(B), na.rm = T), NA)) %>%
ungroup %>%
select(-k, -n)
A B D
<chr> <chr> <dbl>
1 1 2 1.14
2 3 2 NA
3 4 3 NA
4 4 4 1.67
5 5 3 NA
6 6 2 NA
7 1 data suppressed 0.333
8 1 3 NA
9 data suppressed 3 NA
CodePudding user response:
Here is a quick solution in R.
dat <- data.frame(A = c(1,3,4,4,5,6),
B = c(2,2,3,4,3,2))
n <- nrow(dat)
L_idx <- 3*c(1:(n/3)) - 2
U_idx <- 3*c(1:(n/3))
D <- rep(NA, n)
for (i in 1:(n/3)) {
D[L_idx[i]] <- sum(dat$A[L_idx[i]:U_idx[i]])/sum(dat$B[L_idx[i]:U_idx[i]])
}
dat <- data.frame(dat, D)
> dat
A B D
1 1 2 1.142857
2 3 2 NA
3 4 3 NA
4 4 4 1.666667
5 5 3 NA
6 6 2 NA