Filtering a list based on string match

Time:04-02

I have a list of data frames, built from multiple files, each with three columns. A small subset of the first element is shown below.

My objective is to filter every data frame in the list so that only rows where UP_DOWN is either UP or DOWN are kept; rows where it is NS should be removed.

head(result_abd[[1]],10)
   Symbol log2FoldChange UP_DOWN
1  Gnai3      0.07417434      NS
2  Cdc45      0.30915286    DOWN
3    H19      1.80655193      UP
4  Scml2     -0.99676631      NS
5   Narf     -0.66709244    DOWN
6   Cav2      0.14435672      NS
7   Klf6     -0.08168849    DOWN
8  Scmh1     -0.31652589    DOWN
9  Cox5a      0.26321581      NS
10 Wnt9a     -0.50397731      NS

The structure of the full list is:

str(result_abd)
List of 6
 $ C1_ref_vs_C2:'data.frame':   15188 obs. of  3 variables:
  ..$ Symbol        : Factor w/ 15187 levels "0610009O20Rik ",..: 4384 1845 4623 11902 6754 1614 5483 11901 2353 14551 ...
  ..$ log2FoldChange: num [1:15188] 0.0742 0.3092 1.8066 -0.9968 -0.6671 ...
  ..$ UP_DOWN       : chr [1:15188] "NS" "DOWN" "UP" "NS" ...
 $ C1_ref_vs_C3:'data.frame':   15188 obs. of  3 variables:
  ..$ Symbol        : Factor w/ 15187 levels "0610009O20Rik ",..: 4384 1845 4623 11902 6754 1614 5483 11901 2353 14551 ...
  ..$ log2FoldChange: num [1:15188] 0.0555 0.1878 0.8393 -0.5337 -0.8647 ...
  ..$ UP_DOWN       : chr [1:15188] "NS" "DOWN" "NS" "NS" ...
 $ C2_ref_vs_C1:'data.frame':   15188 obs. of  3 variables:
  ..$ Symbol        : Factor w/ 15187 levels "0610009O20Rik ",..: 4384 1845 4623 11902 6754 1614 5483 11901 2353 14551 ...
  ..$ log2FoldChange: num [1:15188] -0.0742 -0.3092 -1.8066 0.9968 0.6671 ...
  ..$ UP_DOWN       : chr [1:15188] "NS" "DOWN" "UP" "NS" ...
 $ C2_ref_vs_C3:'data.frame':   15188 obs. of  3 variables:
  ..$ Symbol        : Factor w/ 15187 levels "0610009O20Rik ",..: 4384 1845 4623 11902 6754 1614 5483 11901 2353 14551 ...
  ..$ log2FoldChange: num [1:15188] -0.0187 -0.1214 -0.9673 0.4631 -0.1976 ...
  ..$ UP_DOWN       : chr [1:15188] "NS" "DOWN" "NS" "NS" ...
 $ C3_ref_vs_C1:'data.frame':   15188 obs. of  3 variables:
  ..$ Symbol        : Factor w/ 15187 levels "0610009O20Rik ",..: 4384 1845 4623 11902 6754 1614 5483 11901 2353 14551 ...
  ..$ log2FoldChange: num [1:15188] -0.0555 -0.1878 -0.8393 0.5337 0.8647 ...
  ..$ UP_DOWN       : chr [1:15188] "NS" "DOWN" "NS" "NS" ...
 $ C3_ref_vs_C2:'data.frame':   15188 obs. of  3 variables:
  ..$ Symbol        : Factor w/ 15187 levels "0610009O20Rik ",..: 4384 1845 4623 11902 6754 1614 5483 11901 2353 14551 ...
  ..$ log2FoldChange: num [1:15188] 0.0187 0.1214 0.9673 -0.4631 0.1976 ...
  ..$ UP_DOWN       : chr [1:15188] "NS" "DOWN" "NS" "NS" ...

For a single data frame I know how to filter, but for a list I'm not sure how to go about it. Any suggestion or help would be really appreciated.

CodePudding user response:

One option would be to use lapply to filter each data frame in your list, like so:

lapply(result_abd, function(x) x[x$UP_DOWN %in% c("UP", "DOWN"), ])
#> $C1_ref_vs_C2
#>   Symbol log2FoldChange UP_DOWN
#> 2  Cdc45     0.30915286    DOWN
#> 3    H19     1.80655193      UP
#> 5   Narf    -0.66709244    DOWN
#> 7   Klf6    -0.08168849    DOWN
#> 8  Scmh1    -0.31652589    DOWN
#> 
#> $C1_ref_vs_C3
#>   Symbol log2FoldChange UP_DOWN
#> 2  Cdc45     0.30915286    DOWN
#> 3    H19     1.80655193      UP
#> 5   Narf    -0.66709244    DOWN
#> 7   Klf6    -0.08168849    DOWN
#> 8  Scmh1    -0.31652589    DOWN

DATA


df <- structure(list(Symbol = c(
  "Gnai3", "Cdc45", "H19", "Scml2", "Narf",
  "Cav2", "Klf6", "Scmh1", "Cox5a", "Wnt9a"
), log2FoldChange = c(
  0.07417434,
  0.30915286, 1.80655193, -0.99676631, -0.66709244, 0.14435672,
  -0.08168849, -0.31652589, 0.26321581, -0.50397731
), UP_DOWN = c(
  "NS",
  "DOWN", "UP", "NS", "DOWN", "NS", "DOWN", "DOWN", "NS", "NS"
)), class = "data.frame", row.names = c(
  "1",
  "2", "3", "4", "5", "6", "7", "8", "9", "10"
))

result_abd <- list(C1_ref_vs_C2 = df, C1_ref_vs_C3 = df)
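
Note that lapply returns a new list and leaves result_abd untouched, so the result has to be assigned if you want to keep it. Here is a minimal sketch of that, assuming a cut-down version of the sample data with Symbol stored as a factor (as in the str() output of the question). The name result_filtered is my own, and the droplevels step is an optional extra, not something the question asked for:

```r
# Cut-down sample data; Symbol is a factor to mirror the question's str() output
df <- data.frame(
  Symbol = factor(c("Gnai3", "Cdc45", "H19", "Scml2")),
  log2FoldChange = c(0.07417434, 0.30915286, 1.80655193, -0.99676631),
  UP_DOWN = c("NS", "DOWN", "UP", "NS"),
  stringsAsFactors = FALSE
)
result_abd <- list(C1_ref_vs_C2 = df, C1_ref_vs_C3 = df)

# lapply returns a new list, so assign it to keep the filtered data frames
result_filtered <- lapply(
  result_abd,
  function(x) x[x$UP_DOWN %in% c("UP", "DOWN"), ]
)

# Number of rows left in each comparison
rows_left <- sapply(result_filtered, nrow)

# Optional: drop factor levels of Symbol that no longer occur after filtering
result_filtered <- lapply(result_filtered, droplevels)
```

The names of the list elements are preserved, so result_filtered$C1_ref_vs_C2 still refers to the same comparison as before.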

CodePudding user response:

Another possible solution, based on purrr::map (I thank @stefan for the data I am using):

library(tidyverse)

map(result_abd, ~ filter(.x, UP_DOWN != "NS"))

#> $C1_ref_vs_C2
#>   Symbol log2FoldChange UP_DOWN
#> 2  Cdc45     0.30915286    DOWN
#> 3    H19     1.80655193      UP
#> 5   Narf    -0.66709244    DOWN
#> 7   Klf6    -0.08168849    DOWN
#> 8  Scmh1    -0.31652589    DOWN
#> 
#> $C1_ref_vs_C3
#>   Symbol log2FoldChange UP_DOWN
#> 2  Cdc45     0.30915286    DOWN
#> 3    H19     1.80655193      UP
#> 5   Narf    -0.66709244    DOWN
#> 7   Klf6    -0.08168849    DOWN
#> 8  Scmh1    -0.31652589    DOWN
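
One thing worth knowing if UP_DOWN can contain missing values: the != "NS" condition behaves differently in dplyr::filter and in base bracket indexing. The sketch below uses a small made-up data frame (df_na is hypothetical, the question's data shows no NA values) to illustrate the difference:

```r
library(dplyr)

# Hypothetical data with one missing label
df_na <- data.frame(
  Symbol = c("Cdc45", "H19", "Gnai3", "Cav2"),
  UP_DOWN = c("DOWN", "UP", "NS", NA),
  stringsAsFactors = FALSE
)

# dplyr::filter keeps only rows where the condition is TRUE,
# so the row with NA is dropped along with the NS row
kept_dplyr <- filter(df_na, UP_DOWN != "NS")

# Base bracket indexing with the same condition returns an
# all-NA row in place of the row whose label is NA
kept_base_ne <- df_na[df_na$UP_DOWN != "NS", ]

# %in% returns FALSE for NA, so the NA row is dropped here too
kept_base_in <- df_na[df_na$UP_DOWN %in% c("UP", "DOWN"), ]
```

So with filter() the two conditions give the same rows here, while in base R the %in% form is the safer one if NAs might be present.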

CodePudding user response:

Data from stefan (many thanks)

Here is a combination of purrr and stringr (dplyr is loaded as well, since filter() comes from it):

library(dplyr)
library(purrr)
library(stringr)

map(result_abd, ~filter(.x, str_detect(UP_DOWN, paste(c("UP", "DOWN"), collapse = "|"))))
#> $C1_ref_vs_C2
#>   Symbol log2FoldChange UP_DOWN
#> 2  Cdc45     0.30915286    DOWN
#> 3    H19     1.80655193      UP
#> 5   Narf    -0.66709244    DOWN
#> 7   Klf6    -0.08168849    DOWN
#> 8  Scmh1    -0.31652589    DOWN
#> 
#> $C1_ref_vs_C3
#>   Symbol log2FoldChange UP_DOWN
#> 2  Cdc45     0.30915286    DOWN
#> 3    H19     1.80655193      UP
#> 5   Narf    -0.66709244    DOWN
#> 7   Klf6    -0.08168849    DOWN
#> 8  Scmh1    -0.31652589    DOWN
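
A caveat on the regex approach: the pattern "UP|DOWN" matches anywhere inside the string, so it would also keep labels that merely contain UP or DOWN. With only UP, DOWN and NS in the column this makes no difference, but if other labels could appear, anchoring the pattern is safer. A small sketch, where the labels "UPSTREAM" and "NOT_DOWN" are made up purely for illustration:

```r
library(stringr)

# "UPSTREAM" and "NOT_DOWN" are hypothetical labels, not from the question
labels <- c("UP", "DOWN", "NS", "UPSTREAM", "NOT_DOWN")

pattern_loose <- paste(c("UP", "DOWN"), collapse = "|")  # "UP|DOWN"
pattern_exact <- paste0("^(", pattern_loose, ")$")       # "^(UP|DOWN)$"

# Unanchored: matches any label containing UP or DOWN
loose <- str_detect(labels, pattern_loose)
#> TRUE TRUE FALSE TRUE TRUE

# Anchored: matches only labels that are exactly UP or DOWN
exact <- str_detect(labels, pattern_exact)
#> TRUE TRUE FALSE FALSE FALSE
```

For exact matches like these, %in% c("UP", "DOWN") gives the same result as the anchored pattern without needing a regex at all.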