Sum column with similar name-CodePudding

I have a Dataset in R like this (my real dataset has more rows and columns):

AB1	AB3	AB4	XB1	XB3	XB4
12	34	0	5	3	7

I need to sum the column similar like

AB1 XB1 AB3 XB3 AB4 XB4

What is the code I can use?

CodePudding user response：

You could use

library(dplyr)
df %>% 
  mutate(across(starts_with("AB"),
                ~.x   df[[gsub("AB", "XB", cur_column())]],
                .names = "sum_{.col}"))

This returns

# A tibble: 1 x 9
    AB1   AB3   AB4   XB1   XB3   XB4 sum_AB1 sum_AB3 sum_AB4
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>
1    12    34     0     5     3     7      17      37       7

We use across and mutate in this approach.
First we select all columns starting with AB. The desired sums are always ABn XB2, so we can use this pattern.
Next we replace AB in the name of the current selected column with XB and sum those two columns. These sums are stored in a new column prefixed with sum_.

CodePudding user response：

Assuming it is the first character that changes and the others are used to group

df=read.table(text="
AB1 AB3 AB4 XB1 XB3 XB4
12  34  0   5   3   7
11  35  1   7   2   8",h=T)

sapply(
  unique(substr(colnames(df),2,100)),
  function(x){
    rowSums(df[,grepl(x,colnames(df))])
  }
)

     B1 B3 B4
[1,] 17 37  7
[2,] 18 37  9

CodePudding user response：

Try this:

library(tidyverse)

tribble(
  ~AB1, ~AB3, ~AB4, ~XB1, ~XB3, ~XB4,
  12, 34, 0, 5, 3, 7
) |> 
  pivot_longer(everything(), names_pattern = "(\\w\\w)(\\d)", names_to = c("prefix", "suffix")) |> 
  pivot_wider(names_from = prefix) |> 
  rowwise() |> 
  mutate(sum = sum(c_across(- suffix)))
#> # A tibble: 3 × 4
#> # Rowwise: 
#>   suffix    AB    XB   sum
#>   <chr>  <dbl> <dbl> <dbl>
#> 1 1         12     5    17
#> 2 3         34     3    37
#> 3 4          0     7     7

^{Created on 2022-05-11 by the reprex package (v2.0.1)}

CodePudding user response：

If you know that structure is consistent (an "A" and "X" pair for everything), then this should work.

cols <- unique(substring(names(df), 2))
df[paste0("A", cols)]   df[paste0("X", cols)]

CodePudding user response：

Using the 2 row DF2 in the Note as input calculate the suffixes (s), unique suffixes (u) and perform the indicated matrix multiplication giving (m). Finally convert that back to a data frame and set the names. No packages are used.

s <- substring(names(DF2), 2)
u <- unique(s)
m <- as.matrix(DF2) %*% outer(s, u, `==`)
sums <- setNames(as.data.frame(m), u); sums
##   B1 B3 B4
## 1 17 37  7
## 2 17 37  7

If it is desired to append these as columns to DF2 then:

data.frame(DF2, sum = sums)
##   AB1 AB3 AB4 XB1 XB3 XB4 sum.B1 sum.B3 sum.B4
## 1  12  34   0   5   3   7     17     37      7
## 2  12  34   0   5   3   7     17     37      7

Note

DF <- structure(list(AB1 = 12L, AB3 = 34L, AB4 = 0L, XB1 = 5L, XB3 = 3L, 
    XB4 = 7L), class = "data.frame", row.names = c(NA, -1L))
DF2 <- rbind(DF, DF)

DF2
##   AB1 AB3 AB4 XB1 XB3 XB4
## 1  12  34   0   5   3   7
## 2  12  34   0   5   3   7

CodePudding user response：

We can try the code below

cbind(
    df,
    list2DF(lapply(
        split.default(df, gsub("\\D ", "", names(df))),
        rowSums
    ))
)

which gives

  AB1 AB3 AB4 XB1 XB3 XB4  1  3 4
1  12  34   0   5   3   7 17 37 7