I have a data frame like this:
x1<- c(0,1,1,1,1,0)
df<-data.frame(x1)
I want to add another column (x2) that holds the sum of each consecutive pair of rows, with that sum stored in both rows of the pair.
For example, the sum of the first two rows is 1, so the first two rows of x2 are both 1. Likewise, the sum of the third and fourth rows goes into the third and fourth rows of x2. Can anyone help?
CodePudding user response:
Here is a way using dplyr, where I create an auxiliary column to group by:
library(dplyr)
x1<- c(0,1,1,1,1,0)
df <- data.frame(x1)
len_df <- nrow(df)
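# One group id per pair of rows; ceiling() and length.out also cover an odd row count
aux <- rep(seq_len(ceiling(len_df / 2)), each = 2, length.out = len_df)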
df %>%
mutate(aux = aux) %>%
group_by(aux) %>%
mutate(x2 = sum(x1)) %>%
ungroup() %>%
select(-aux)
# A tibble: 6 x 2
x1 x2
<dbl> <dbl>
1 0 1
2 1 1
3 1 2
4 1 2
5 1 1
6 0 1
CodePudding user response:
You can define the groups using floor division and then simply obtain the grouped sum:
library(dplyr)
df %>%
mutate(group = (row_number() - 1) %/% 2) %>%
group_by(group) %>%
mutate(x2 = sum(x1)) %>%
ungroup() %>%
select(-group)
# # A tibble: 6 × 2
# x1 x2
# <dbl> <dbl>
# 1 0 1
# 2 1 1
# 3 1 2
# 4 1 2
# 5 1 1
# 6 0 1
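To see what the floor-division index looks like on its own, here is a quick base R check. It is only a sketch: `seq_len(n)` stands in for dplyr's `row_number()`, and the variable names are made up for illustration.

```r
# Rows 1,2 -> group 0; rows 3,4 -> group 1; rows 5,6 -> group 2
n <- 6
group_even <- (seq_len(n) - 1) %/% 2
group_even
# [1] 0 0 1 1 2 2

# With an odd number of rows, the last row simply gets a group of its own
group_odd <- (seq_len(5) - 1) %/% 2
group_odd
# [1] 0 0 1 1 2
```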
CodePudding user response:
Create an index with gl for every 2 rows and compute the sum after grouping:
library(dplyr)
df <- df %>%
group_by(grp = as.integer(gl(n(), 2, n()))) %>%
mutate(x2 = sum(x1)) %>%
ungroup %>%
select(-grp)
-output
df
# A tibble: 6 × 2
x1 x2
<dbl> <dbl>
1 0 1
2 1 1
3 1 2
4 1 2
5 1 1
6 0 1
Or using collapse/data.table
library(data.table)
library(collapse)
setDT(df)[, x2 := fsum(x1, g = rep(.I, each = 2, length.out = .N), TRA = 1)]
-output
> df
x1 x2
<num> <num>
1: 0 1
2: 1 1
3: 1 2
4: 1 2
5: 1 1
6: 0 1
CodePudding user response:
You can use ave and ceiling (both are base R functions). ave returns the group mean by default, so multiplying by 2 gives the sum of each pair:
> transform(df, x2 = ave(x1, ceiling(seq_along(x1) / 2)) * 2)
x1 x2
1 0 1
2 1 1
3 1 2
4 1 2
5 1 1
6 0 1
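One thing to keep in mind with the mean-times-two approach is that it relies on every group having exactly two rows. A minimal sketch with a made-up odd-length vector (`x_odd` is hypothetical, not from the question) shows the difference from passing `FUN = sum` to ave directly:

```r
x_odd <- c(0, 1, 1, 1, 1)                 # hypothetical input with 5 rows
grp   <- ceiling(seq_along(x_odd) / 2)    # 1 1 2 2 3

# Mean times two: the last, unpaired value gets doubled
mean_times_two <- ave(x_odd, grp) * 2
mean_times_two
# [1] 1 1 2 2 2

# Summing within each group leaves the unpaired value as it is
group_sum <- ave(x_odd, grp, FUN = sum)
group_sum
# [1] 1 1 2 2 1
```

With an even number of rows, as in the question, both versions give the same result.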
CodePudding user response:
First, a way of making the data.frame without the intermediate variable.
Then split x1 into groups of 2, sum each group, and repeat each sum twice to fill the new variable.
df <- data.frame(x1 = c(0, 1, 1, 1, 1, 0))
# sapply() returns a numeric vector, so x2 is a plain numeric column (lapply() would give a list column)
df$x2 <- rep(unname(sapply(split(df$x1, rep(1:3, each = 2)), sum)), each = 2)
# x1 x2
#1 0 1
#2 1 1
#3 1 2
#4 1 2
#5 1 1
#6 0 1
CodePudding user response:
In base R you could do:
transform(df, x2 = ave(x1, gl(nrow(df) / 2, 2), FUN = sum))
x1 x2
1 0 1
2 1 1
3 1 2
4 1 2
5 1 1
6 0 1
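If the block size might change later, the same idea can be wrapped in a small function. This is only a sketch: `block_sum` and its argument `k` are hypothetical names, not part of base R or of any answer above.

```r
# Hypothetical helper: sum every k consecutive elements of x and repeat
# each sum on all elements of its block. A trailing partial block is
# summed over the elements it has.
block_sum <- function(x, k = 2) {
  grp <- ceiling(seq_along(x) / k)   # block id for each element
  ave(x, grp, FUN = sum)
}

df <- data.frame(x1 = c(0, 1, 1, 1, 1, 0))
df$x2 <- block_sum(df$x1)
df$x2
# [1] 1 1 2 2 1 1

# Blocks of three, with one leftover element at the end
blocks_of_three <- block_sum(1:7, k = 3)
blocks_of_three
# [1]  6  6  6 15 15 15  7
```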