Why is Python list comprehension failing for me?

Time:10-27

This works:

CHROM = "3R"
g = allel.GenotypeDaskArray(callset[CHROM]['calldata/GT']).compute()
g_parents = g[:, [x in PARENT_SAMPLES for x in callset_all_sample_ids]]
g_parents

But this doesn't:

CHROM = "3R"
g = allel.GenotypeDaskArray(callset[CHROM]['calldata/GT']).compute()
g_parents = g[:, [x for x in callset_all_sample_ids]]
g_parents

It gives this error:

IndexError                                Traceback (most recent call last)
<ipython-input-47-a5b9bc429c1d> in <module>
      1 CHROM = "3R"
      2 g = allel.GenotypeDaskArray(callset[CHROM]['calldata/GT']).compute()
----> 3 g_parents = g[:, [x for x in callset_all_sample_ids]]
      4 g_parents

/share/lanzarolab/opt/conda/vgl/lib/python3.6/site-packages/allel/model/ndarray.py in __getitem__(self, item)
   1478     def __getitem__(self, item):
   1479         return index_genotype_array(self, item, array_cls=type(self),
-> 1480                                     vector_cls=GenotypeVector)
   1481 
   1482     @property

/share/lanzarolab/opt/conda/vgl/lib/python3.6/site-packages/allel/model/generic.py in index_genotype_array(g, item, array_cls, vector_cls)
     36 
     37     # apply indexing operation to underlying values
---> 38     out = g.values[item]
     39 
     40     # decide whether to wrap the output, if so how

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

I don't understand why list comprehension doesn't work in the 2nd case. callset_all_sample_ids is a list as follows:

['F1foc13m',
 'F1foc12m',
 'F1foc11m',
 'F1foc08m',
 'F1foc06f',
 'F1foc02f',
 'F1foc05f',
 'F1bait17m',
 'F1bait08m',
 'F1bait02f',
 'F0male1',
 'F1bait01f',
 'F1foc01f',
 'F1bait10m',
 'F1foc03f',
 'F1bait03f',
 'F0female',
 'F1foc10m',
 'F1bait12m',
 'F1bait13f',
 'F1bait15m',
 'F1bait04f',
 'F1bait05f',
 'F1bait06f',
 'F1foc09m',
 'F1bait14m',
 'F1foc04f',
 'F1bait07f',
 'F1bait16m',
 'F1bait09m',
 'F1bait11m']

and PARENT_SAMPLES is a list as follows:

['F0female',
 'F0male1']

In the 1st case I get the expected array with information about F0female and F0male1. In the 2nd case I expect information about all items in the callset_all_sample_ids list. What am I doing wrong?

CodePudding user response:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

This explains why your first example works: the comprehension produces a list of booleans, which is one of the valid index types. The second comprehension produces a list of strings, which is not a valid index type, so you get the error.
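Here is a minimal sketch of the difference using a plain NumPy array. The traceback shows that the genotype array hands the index straight to g.values[item], so NumPy's indexing rules apply. The array contents and the sample names "a", "b", "c" are made up for illustration.

```python
import numpy as np

# Toy stand-in for the genotype values: 2 variants x 3 samples.
values = np.arange(6).reshape(2, 3)
sample_ids = ["a", "b", "c"]   # hypothetical sample names
wanted = ["b"]                 # hypothetical subset to keep

# A list of booleans acts as a mask over the sample axis.
mask = [s in wanted for s in sample_ids]
selected = values[:, mask]

# A list of strings is not a valid index, so NumPy raises IndexError.
try:
    values[:, [s for s in sample_ids]]
    string_index_failed = False
except IndexError:
    string_index_failed = True
```

The mask must have one entry per sample, which is why building it from the full list of sample IDs works.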

CodePudding user response:

Have you looked at what each comprehension produces on its own?

In [84]: [x in PARENT_SAMPLES for x in callset_all_sample_ids]
Out[84]: 
[False,
 False,
 False,
 False,
 False,
 ...
 False,
 False]
In [85]: [x for x in callset_all_sample_ids]
Out[85]: 
['F1foc13m',
 'F1foc12m',
 'F1foc11m',
 'F1foc08m',
  ...
 'F1bait16m',
 'F1bait09m',
 'F1bait11m']

The first is a list of booleans, which is treated as a boolean mask when used as an index. That works as long as its length matches the size of the axis being indexed.

The second is a list of strings. Strings, individually or in a list, can't be used as an array index. That's what the error is telling you.
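If the goal is to select columns by sample name, one option is to translate the names into integer positions first. The following is a sketch on a toy NumPy array, assuming the columns are in the same order as the list of sample IDs. The names "s1" to "s4" are hypothetical.

```python
import numpy as np

# Toy stand-in for the genotype values: 3 variants x 4 samples.
values = np.arange(12).reshape(3, 4)
sample_ids = ["s1", "s2", "s3", "s4"]   # hypothetical sample names
wanted = ["s4", "s2"]                   # hypothetical subset, in the desired order

# All samples: a plain slice is enough, no comprehension needed.
all_samples = values[:, :]

# Named subset: look up each name's column position, then index with integers.
positions = [sample_ids.index(name) for name in wanted]
subset = values[:, positions]
```

Unlike a boolean mask, a list of integer positions also controls the order of the returned columns.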
