Home > Mobile >  Sum column with similar name
Sum column with similar name

Time:05-11

I have a Dataset in R like this (my real dataset has more rows and columns):

AB1 AB3 AB4 XB1 XB3 XB4
12 34 0 5 3 7

I need to sum the column similar like

AB1 XB1 AB3 XB3 AB4 XB4

What is the code I can use?

CodePudding user response:

You could use

library(dplyr)
df %>% 
  mutate(across(starts_with("AB"),
                ~.x   df[[gsub("AB", "XB", cur_column())]],
                .names = "sum_{.col}"))

This returns

# A tibble: 1 x 9
    AB1   AB3   AB4   XB1   XB3   XB4 sum_AB1 sum_AB3 sum_AB4
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>
1    12    34     0     5     3     7      17      37       7
  • We use across and mutate in this approach.
  • First we select all columns starting with AB. The desired sums are always ABn XB2, so we can use this pattern.
  • Next we replace AB in the name of the current selected column with XB and sum those two columns. These sums are stored in a new column prefixed with sum_.

CodePudding user response:

Assuming it is the first character that changes and the others are used to group

df=read.table(text="
AB1 AB3 AB4 XB1 XB3 XB4
12  34  0   5   3   7
11  35  1   7   2   8",h=T)

sapply(
  unique(substr(colnames(df),2,100)),
  function(x){
    rowSums(df[,grepl(x,colnames(df))])
  }
)

     B1 B3 B4
[1,] 17 37  7
[2,] 18 37  9

CodePudding user response:

Try this:

library(tidyverse)

tribble(
  ~AB1, ~AB3, ~AB4, ~XB1, ~XB3, ~XB4,
  12, 34, 0, 5, 3, 7
) |> 
  pivot_longer(everything(), names_pattern = "(\\w\\w)(\\d)", names_to = c("prefix", "suffix")) |> 
  pivot_wider(names_from = prefix) |> 
  rowwise() |> 
  mutate(sum = sum(c_across(- suffix)))
#> # A tibble: 3 × 4
#> # Rowwise: 
#>   suffix    AB    XB   sum
#>   <chr>  <dbl> <dbl> <dbl>
#> 1 1         12     5    17
#> 2 3         34     3    37
#> 3 4          0     7     7

Created on 2022-05-11 by the reprex package (v2.0.1)

CodePudding user response:

If you know that structure is consistent (an "A" and "X" pair for everything), then this should work.

cols <- unique(substring(names(df), 2))
df[paste0("A", cols)]   df[paste0("X", cols)]

CodePudding user response:

Using the 2 row DF2 in the Note as input calculate the suffixes (s), unique suffixes (u) and perform the indicated matrix multiplication giving (m). Finally convert that back to a data frame and set the names. No packages are used.

s <- substring(names(DF2), 2)
u <- unique(s)
m <- as.matrix(DF2) %*% outer(s, u, `==`)
sums <- setNames(as.data.frame(m), u); sums
##   B1 B3 B4
## 1 17 37  7
## 2 17 37  7

If it is desired to append these as columns to DF2 then:

data.frame(DF2, sum = sums)
##   AB1 AB3 AB4 XB1 XB3 XB4 sum.B1 sum.B3 sum.B4
## 1  12  34   0   5   3   7     17     37      7
## 2  12  34   0   5   3   7     17     37      7

Note

DF <- structure(list(AB1 = 12L, AB3 = 34L, AB4 = 0L, XB1 = 5L, XB3 = 3L, 
    XB4 = 7L), class = "data.frame", row.names = c(NA, -1L))
DF2 <- rbind(DF, DF)

DF2
##   AB1 AB3 AB4 XB1 XB3 XB4
## 1  12  34   0   5   3   7
## 2  12  34   0   5   3   7

CodePudding user response:

We can try the code below

cbind(
    df,
    list2DF(lapply(
        split.default(df, gsub("\\D ", "", names(df))),
        rowSums
    ))
)

which gives

  AB1 AB3 AB4 XB1 XB3 XB4  1  3 4
1  12  34   0   5   3   7 17 37 7
  •  Tags:  
  • r sum
  • Related