Keep columns of a dataframe according to a logical vector in R

Time:07-20

I have a dataframe with features in the columns, and a named logical vector with one entry per column that says whether the column fulfills a certain condition: TRUE if it does, FALSE if it doesn't.

I just want to keep the columns that fulfill the condition, that is, those with TRUE in the logical vector.

This is the vector:

c(Adipocytes = TRUE, B.cells = TRUE, Basophils = FALSE, CD4..memory.T.cells = TRUE, 
CD4..naive.T.cells = TRUE, CD4..T.cells = FALSE, CD4..Tcm = TRUE, 
CD4..Tem = TRUE, CD8..naive.T.cells = FALSE, CD8..T.cells = TRUE, 
CD8..Tcm = TRUE, Class.switched.memory.B.cells = TRUE, DC = TRUE, 
Endothelial.cells = FALSE, Eosinophils = TRUE, Epithelial.cells = FALSE, 
Fibroblasts = TRUE, Hepatocytes = TRUE, ly.Endothelial.cells = TRUE, 
Macrophages = FALSE, Macrophages.M1 = TRUE, Macrophages.M2 = TRUE, 
Mast.cells = TRUE, Melanocytes = TRUE, Memory.B.cells = TRUE, 
Monocytes = FALSE, mv.Endothelial.cells = TRUE, naive.B.cells = TRUE, 
Neutrophils = TRUE, NK.cells = TRUE, pDC = TRUE, Pericytes = TRUE, 
Plasma.cells = TRUE, pro.B.cells = TRUE, Tgd.cells = TRUE, Th1.cells = FALSE, 
Th2.cells = TRUE, Tregs = FALSE)

and the dataframe:

structure(list(response = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L), .Label = c("NoResponse", "Response"), class = "factor"), 
    Adipocytes = c(0.0147750026478847, 0.0147750026478847, 0.0147750026478847, 
    0.0212089093123119, 0.0710538305419062, 0.0147750026478847, 
    0.0147750026478847), B.cells = c(0.0771628600461939, 0.0348430059727367, 
    0.166843196901955, 0.029029065282152, 0.0289092657550142, 
    0.0091727181061411, 0.0545426703298063), Basophils = c(0.123412405513611, 
    0.229980789620818, 0.710327302827785, 0.110317426229336, 
    0.593651555059932, 0.171672288243263, 0.16957000805118), 
    CD4..memory.T.cells = c(0.053430101401256, 0.00519336879475164, 
    0.00519336879475164, 0.0135894934563294, 0.00519336879475164, 
    0.00519336879475164, 0.0187854542183181), CD4..naive.T.cells = c(0.0198799983437911, 
    0.0198799983437911, 0.0198799983437911, 0.0198799983437911, 
    0.0198799983437911, 0.0198799983437911, 0.0198799983437911
    ), CD4..T.cells = c(0.0308423390003806, -0.000280618433405845, 
    -0.000280618433405845, -0.000280618433405845, -0.000280618433405845, 
    -0.000280618433405845, 0.0119164432522563)), row.names = c("Pt1", 
"Pt10", "Pt103", "Pt106", "Pt11", "Pt17", "Pt2"), class = "data.frame")

CodePudding user response:

You can first extract the names of the columns you wish to keep and then use those names to subset the columns of the original dataframe. In the following line, cols is your vector and df is the dataframe.

df[, names(cols)[cols]]
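As a minimal sketch of this approach, the snippet below uses a small toy dataframe and vector (the values are made up and only stand in for the objects in the question). It assumes every name in cols exists as a column of df; otherwise R stops with an "undefined columns selected" error.

```r
# Toy data standing in for the question's objects (values are made up).
df <- data.frame(
  Adipocytes = c(0.01, 0.02),
  B.cells    = c(0.07, 0.03),
  Basophils  = c(0.12, 0.23),
  row.names  = c("Pt1", "Pt10")
)
cols <- c(Adipocytes = TRUE, B.cells = TRUE, Basophils = FALSE)

# Names of the entries that are TRUE.
keep <- names(cols)[cols]

# drop = FALSE keeps the result a data frame even if only one column survives.
kept <- df[, keep, drop = FALSE]
print(kept)
```

Using drop = FALSE is an optional safeguard: without it, selecting a single column with df[, keep] returns a plain vector instead of a dataframe.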

CodePudding user response:

Try this

df[intersect(names(vec)[vec], colnames(df))]

where df is your data.frame and vec is your vector.
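The sketch below illustrates two details of the intersect() approach on toy data (made-up values, with a hypothetical Tregs entry that has no matching column). First, taking names() of a comparison such as vec == TRUE returns every name, so the names have to be indexed by the logical vector itself. Second, columns that the vector does not mention, such as response in the question, are dropped unless they are added back explicitly.

```r
# Toy data standing in for the question's objects (values are made up).
df <- data.frame(
  response   = factor(c("NoResponse", "Response")),
  Adipocytes = c(0.01, 0.02),
  Basophils  = c(0.12, 0.23)
)
# Tregs has no matching column in df.
vec <- c(Adipocytes = TRUE, Basophils = FALSE, Tregs = TRUE)

# A comparison keeps the names attribute, so this returns all three names.
all_names <- names(vec == TRUE)

# Indexing the names by the logical vector keeps only the TRUE entries.
true_names <- names(vec)[vec]

# intersect() discards names without a matching column, avoiding an
# "undefined columns selected" error.
kept <- df[intersect(true_names, colnames(df))]

# Optionally keep columns the vector does not mention (here: response).
kept_with_response <- df[c(setdiff(colnames(df), names(vec)), colnames(kept))]
print(kept_with_response)
```

Whether to keep unmentioned columns such as response is a choice that depends on the analysis; the last line is only one way to do it.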
