This works:
CHROM = "3R"
g = allel.GenotypeDaskArray(callset[CHROM]['calldata/GT']).compute()
g_parents = g[:, [x in PARENT_SAMPLES for x in callset_all_sample_ids]]
g_parents
But this doesn't:
CHROM = "3R"
g = allel.GenotypeDaskArray(callset[CHROM]['calldata/GT']).compute()
g_parents = g[:, [x for x in callset_all_sample_ids]]
g_parents
It gives this error:
IndexError Traceback (most recent call last)
<ipython-input-47-a5b9bc429c1d> in <module>
1 CHROM = "3R"
2 g = allel.GenotypeDaskArray(callset[CHROM]['calldata/GT']).compute()
----> 3 g_parents = g[:, [x for x in callset_all_sample_ids]]
4 g_parents
/share/lanzarolab/opt/conda/vgl/lib/python3.6/site-packages/allel/model/ndarray.py in __getitem__(self, item)
1478 def __getitem__(self, item):
1479 return index_genotype_array(self, item, array_cls=type(self),
-> 1480 vector_cls=GenotypeVector)
1481
1482 @property
/share/lanzarolab/opt/conda/vgl/lib/python3.6/site-packages/allel/model/generic.py in index_genotype_array(g, item, array_cls, vector_cls)
36
37 # apply indexing operation to underlying values
---> 38 out = g.values[item]
39
40 # decide whether to wrap the output, if so how
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
I don't understand why the list comprehension doesn't work in the 2nd case. callset_all_sample_ids is a list as follows:
['F1foc13m',
'F1foc12m',
'F1foc11m',
'F1foc08m',
'F1foc06f',
'F1foc02f',
'F1foc05f',
'F1bait17m',
'F1bait08m',
'F1bait02f',
'F0male1',
'F1bait01f',
'F1foc01f',
'F1bait10m',
'F1foc03f',
'F1bait03f',
'F0female',
'F1foc10m',
'F1bait12m',
'F1bait13f',
'F1bait15m',
'F1bait04f',
'F1bait05f',
'F1bait06f',
'F1foc09m',
'F1bait14m',
'F1foc04f',
'F1bait07f',
'F1bait16m',
'F1bait09m',
'F1bait11m']
and PARENT_SAMPLES is a list as follows:
['F0female',
'F0male1']
In the 1st case I get the expected array with information about F0female and F0male1. In the 2nd case I expect information about all items in the callset_all_sample_ids list. What am I doing wrong?!
CodePudding user response:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
The error message explains why your first example works: that comprehension produces a list of booleans, which is one of the valid index types. The second comprehension produces a list of strings, which is not a valid index type, so you get the error.
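To see the difference in isolation, here is a minimal sketch using a plain NumPy array as a stand-in for the genotype data. The traceback shows that indexing is passed through to `g.values[item]`, so the same rules should apply there. The array contents and the sample names `s0` to `s3` are made up for illustration.

```python
import numpy as np

# Toy stand-in for the genotype values: 3 variants x 4 samples (made-up data).
values = np.arange(12).reshape(3, 4)
sample_ids = ["s0", "s1", "s2", "s3"]  # hypothetical sample names
wanted = ["s1", "s3"]                  # hypothetical subset to keep

# A list of booleans, one per column, is a valid index.
mask = [s in wanted for s in sample_ids]
subset = values[:, mask]

# A list of strings is not a valid index, so NumPy raises IndexError.
try:
    values[:, sample_ids]
    string_index_failed = False
except IndexError:
    string_index_failed = True
```

Here `subset` holds only the columns for `s1` and `s3`, while the string-indexed lookup fails in the same way as in the question.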
CodePudding user response:
Didn't you look at the comprehensions by themselves?
In [84]: [x in PARENT_SAMPLES for x in callset_all_sample_ids]
Out[84]:
[False,
False,
False,
False,
False,
...
False,
False]
In [85]: [x for x in callset_all_sample_ids]
Out[85]:
['F1foc13m',
'F1foc12m',
'F1foc11m',
'F1foc08m',
...
'F1bait16m',
'F1bait09m',
'F1bait11m']
One is a list of booleans, which becomes a boolean array when used as an index. That works as long as its length matches the size of the dimension being indexed.
The second is a list of strings. Strings, individually or in a list, can't be used as an array index. That's what the error is telling you.
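If the goal is to select columns by sample name, the names have to be translated into integer positions or a boolean mask first. The sketch below shows both on a plain NumPy array; the data and the names `s0` to `s3` are made up, and I am assuming the genotype array accepts the same index types because it forwards the index to its underlying values.

```python
import numpy as np

# Toy stand-in for the genotype values: 3 variants x 4 samples (made-up data).
values = np.arange(12).reshape(3, 4)
sample_ids = ["s0", "s1", "s2", "s3"]  # hypothetical sample names
wanted = ["s3", "s1"]                  # hypothetical subset, in a chosen order

# Option 1: map names to integer positions; columns follow the order of `wanted`.
idx = [sample_ids.index(s) for s in wanted]
by_position = values[:, idx]

# Option 2: build a boolean mask; columns keep their original order.
mask = np.isin(sample_ids, wanted)
by_mask = values[:, mask]

# To keep every sample, a plain slice is enough; no name lookup is needed.
all_samples = values[:, :]
```

Integer positions let you reorder columns, whereas a boolean mask always preserves the original column order. For the 2nd case in the question, where every sample is wanted, the full slice (or simply the array itself) gives the expected result.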