I have a list of tibbles that looks like this:
$WT_top_markers
# A tibble: 128 × 2
# Groups: cluster [26]
cluster gene
<fct> <chr>
1 0 Abi3bp
2 0 Apoe
3 0 Apoc1
4 0 Tgm2
5 0 Bcam
6 1 Aqp3
7 1 Sult1d1
8 1 Dapl1
9 1 Fxyd3
10 1 Pir
# … with 118 more rows
$F7KO_top_markers
# A tibble: 125 × 2
# Groups: cluster [25]
cluster gene
<fct> <chr>
1 0 Abi3bp
2 0 Apoe
3 0 Apoc1
4 0 Dapl1
5 0 Tgm2
6 1 Scgb3a1
7 1 Sftpa1
8 1 Reg3g
9 1 Bpifb1
10 1 Itln1
# … with 115 more rows
$F8HET_top_markers
# A tibble: 147 × 2
# Groups: cluster [30]
cluster gene
<fct> <chr>
1 0 Abi3bp
2 0 Apoe
3 0 Apoc1
4 0 1600014C10Rik
5 0 Bcam
6 1 Krt14
7 1 Krt17
8 1 Krt5
9 1 Bcam
10 1 Cav1
# … with 137 more rows
I want to pull out the genes from the first tibble where cluster = 20. I have tried:
features_to_plot <- unlist(top_markers[[1]][[which(top_markers[[1]]$cluster == 20)]])
but am getting an error:
! Must extract column with a single valid subscript.
✖ Subscript which(top_markers[[1]]$cluster == 20) has size 5 but must be size 1.
Can anyone tell me how to do this properly?
Thanks, Stacy
Answer:
We can use lapply to loop over the list and subset the rows where the 'cluster' value is 20:
lapply(top_markers, \(x) subset(x, cluster == 20))
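To see what that call returns, here is a minimal sketch on toy data. The list below is a hypothetical stand-in for top_markers: it uses plain data frames rather than grouped tibbles, and the names GeneA, GeneB and GeneC are made up for illustration. It assumes R 4.1 or later for the \(x) lambda syntax.

```r
# Hypothetical stand-in for top_markers (plain data frames, invented cluster-20 genes)
top_markers <- list(
  WT_top_markers = data.frame(
    cluster = factor(c(0, 0, 20, 20)),
    gene    = c("Abi3bp", "Apoe", "GeneA", "GeneB")
  ),
  F7KO_top_markers = data.frame(
    cluster = factor(c(0, 20)),
    gene    = c("Apoe", "GeneC")
  )
)

# Keep only the cluster-20 rows in every list element
cluster20 <- lapply(top_markers, \(x) subset(x, cluster == 20))

# Reduce each element to a plain character vector of gene names
genes20 <- lapply(cluster20, \(x) x$gene)
```

The result is still a list with one entry per sample, so genes20$WT_top_markers (or genes20[[1]]) gives the genes for the first element only.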
The error in the OP's code comes from using '[[' to extract more than one element: on a tibble, '[[' accepts only a single subscript, and here which() returns five. Use '[' with a comma instead. top_markers[[1]] is the first list element, which is a tibble, and which(top_markers[[1]]$cluster == 20) returns the matching row indices. To subset rows, the index has the form [rowindex, columnindex], so here we need [rowindex, ] with the column index left empty. Without the comma, a single index on a data.frame or tibble is treated as a column index. For example, tibble(col1 = 1:5)[1:2, ] returns the first two rows, whereas tibble(col1 = 1:5)[1:2] returns an error because there is only one column and we asked for two. The corrected call is:
top_markers[[1]][which(top_markers[[1]]$cluster == 20),]
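Since the question asks for the genes themselves, the row subset can be followed by a column extraction. The sketch below uses a hypothetical data frame named markers in place of top_markers[[1]], with invented gene names GeneA and GeneB for cluster 20. It is one possible way to build features_to_plot, not the only one.

```r
# Hypothetical stand-in for top_markers[[1]] (invented cluster-20 genes)
markers <- data.frame(
  cluster = factor(c(0, 20, 20, 1)),
  gene    = c("Apoe", "GeneA", "GeneB", "Aqp3")
)

# Row indices where cluster is 20
rows <- which(markers$cluster == 20)

# Subset rows with [rowindex, ], then take the gene column as a character vector
features_to_plot <- markers[rows, ]$gene

# Equivalent shortcut: index the gene column directly with the logical condition
features_alt <- markers$gene[markers$cluster == 20]
```

Both forms return an ordinary character vector, which is what the unlist() call in the question was trying to produce.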