How to remove columns with specific sum but ignore others?

Time:10-12

I have a data frame that combines species abundance values with metadata. I use the code below to delete every species whose total abundance is less than 1, but I can't figure out how to make it ignore the metadata columns: the longitude column, for example, also sums to less than 1. I would like to keep the metadata and apply the filter only to the abundance values (in the real data frame, the first 5 columns are metadata that should be left untouched and the last 10 columns are abundances).

Here is an example of my dataframe (mini version):

site <- c("S1","S2","S3")
lat <- c(30,30,30.1)
long <- c(-43.11,-42.23,-42.10)
sp1 <- c(0,0,0)
sp2 <- c(10,4,9)
sp3 <- c(1,1,2)

x <- data.frame(site,lat,long,sp1,sp2,sp3)
  site  lat   long sp1 sp2 sp3
1   S1 30.0 -43.11   0  10   1
2   S2 30.0 -42.23   0   4   1
3   S3 30.1 -42.10   0   9   2

I just need to grab all columns for species abundances that sum up to 0 and remove them. I used:

x <- x[,colSums(x[,4:ncol(x)]) > 0]
x
   lat   long sp2 sp3
1 30.0 -43.11  10   1
2 30.0 -42.23   4   1
3 30.1 -42.10   9   2

But I can't get it to return the "site" column with this...probably because it is a character column that I need to keep. Is there no way to have the table just drop the columns I am subsetting but leave everything else alone?

My goal is to return the following:

  site  lat   long sp2 sp3
1   S1 30.0 -43.11  10   1
2   S2 30.0 -42.23   4   1
3   S3 30.1 -42.10   9   2

Only sp1 is dropped because its sum was 0.

CodePudding user response:

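We may use select from dplyr: keep the first three (metadata) columns by name, then add only the numeric columns whose sum is greater than 0.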

library(dplyr)
x %>%
    select(all_of(names(.)[1:3]), where(~ is.numeric(.) && 
       sum(., na.rm = TRUE) > 0))

-output

   site  lat   long sp2 sp3
1   S1 30.0 -43.11  10   1
2   S2 30.0 -42.23   4   1
3   S3 30.1 -42.10   9   2
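
If the metadata columns are easier to identify by name than by position, the same idea can be sketched in base R. This is only a minimal sketch: the vector meta_cols is an assumption about which columns count as metadata (it is not part of the original question), and every other column is treated as an abundance column.

```r
# Example data from the question
x <- data.frame(
  site = c("S1", "S2", "S3"),
  lat  = c(30, 30, 30.1),
  long = c(-43.11, -42.23, -42.10),
  sp1  = c(0, 0, 0),
  sp2  = c(10, 4, 9),
  sp3  = c(1, 1, 2)
)

# Assumption: these are the metadata columns to leave untouched
meta_cols <- c("site", "lat", "long")

# Everything that is not metadata is treated as an abundance column
abund_cols <- setdiff(names(x), meta_cols)

# Keep only abundance columns whose total is greater than 0
keep_abund <- abund_cols[colSums(x[abund_cols], na.rm = TRUE) > 0]

# Metadata first, then the surviving abundance columns
result <- x[c(meta_cols, keep_abund)]
result
```

Because the sum test is applied only to abund_cols, a metadata column such as long is never evaluated, whatever its sum.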

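In the OP's code, just prepend three TRUE values (one per metadata column) to the logical vector produced by colSums(...) > 0, so that the index has one entry per column of x. Without them, the length-3 logical vector is recycled across all six columns, which is why site was dropped (and not because it is a character column).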

 x[c(rep(TRUE, 3), colSums(x[,4:ncol(x)]) > 0)]
  site  lat   long sp2 sp3
1   S1 30.0 -43.11  10   1
2   S2 30.0 -42.23   4   1
3   S3 30.1 -42.10   9   2
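
To see why the original attempt dropped site, it may help to look at the index itself. The sketch below rebuilds the example data and imitates the recycling with rep_len; the names n_meta, recycled, and full_keep are illustrative only and do not come from the original post.

```r
# Example data from the question
x <- data.frame(
  site = c("S1", "S2", "S3"),
  lat  = c(30, 30, 30.1),
  long = c(-43.11, -42.23, -42.10),
  sp1  = c(0, 0, 0),
  sp2  = c(10, 4, 9),
  sp3  = c(1, 1, 2)
)

# The test only covers the abundance columns, so it has 3 entries
keep <- colSums(x[, 4:ncol(x)]) > 0
length(keep)  # 3
ncol(x)       # 6

# A too-short logical index is recycled; imitate that explicitly
recycled <- rep_len(unname(keep), ncol(x))
names(x)[recycled]  # the columns the original code ended up keeping

# Pad the index so it lines up with every column of x
n_meta <- 3  # number of leading metadata columns (use 5 for the full data)
full_keep <- c(rep(TRUE, n_meta), keep)
result <- x[, full_keep, drop = FALSE]
result
```

For the full data frame described in the question, n_meta would be 5, and the 4 in 4:ncol(x) would become n_meta + 1.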