Why does this work:
def hamming_distance(dna_1,dna_2):
    hamming_distance = sum(1 for a, b in zip(dna_1, dna_2) if a != b)
    return hamming_distance
As opposed to this:
def hamming_distance(dna_1,dna_2):
    hamming_distance = sum(for a, b in zip(dna_1, dna_2) if a != b)
    return hamming_distance
I get this error:
Input In [90]
hamming_distance = sum(for a, b in zip(dna_1, dna_2) if a != b)
^
SyntaxError: invalid syntax
I expected the function to work without the 1 right after sum(.
CodePudding user response:
You wrote a generator expression. Generator expressions must produce a value (some expression to the left of the first for). Without it, you're saying "please sum all the lack-of-values not-produced by this generator expression".
Ask yourself:
- What does a genexpr that produces nothing even mean?
- What is sum summing when it's being passed a series of absolute nothing?
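To make that concrete, here is a rough sketch of what the working genexpr corresponds to when written as a generator function. The name ones_for_mismatches and the sample strings are made up for illustration. The yield 1 line is the part the leading 1 supplies; take it away and there is nothing to yield, so there is nothing for sum to add.

```python
def ones_for_mismatches(dna_1, dna_2):
    # Hypothetical helper: roughly equivalent to
    # (1 for a, b in zip(dna_1, dna_2) if a != b)
    for a, b in zip(dna_1, dna_2):
        if a != b:
            yield 1  # the expression to the left of the first "for"


# Made-up sample inputs that differ in two positions.
print(sum(ones_for_mismatches("GATTACA", "GACTATA")))  # 2
```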
You could write a shorter genexpr with the same effect with:
hamming_distance = sum(a != b for a, b in zip(dna_1, dna_2))
since bools have integer values of 1 (for True) and 0 (for False), so it would still work, but it would be slower than sum(1 for a, b in zip(dna_1, dna_2) if a != b) (which produces fewer values for sum to work on and, at least on some versions of Python, allows sum to operate faster, since it has a fast path for summing small exact int types that bool breaks).
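If you want to check both claims yourself, a small sketch like the one below should do. The function names and the sample strings are made up for illustration, and the timings depend on your Python version and machine, so treat them as a rough comparison only.

```python
import timeit

# Made-up sample inputs, repeated to give the timing something to chew on.
dna_1 = "GATTACA" * 1000
dna_2 = "GACTATA" * 1000

# bool is a subclass of int, so True and False act as 1 and 0 in arithmetic.
assert isinstance(True, int)
assert True + True == 2 and False + 0 == 0


def count_with_filter(x, y):
    # Yields a 1 only for mismatching pairs.
    return sum(1 for a, b in zip(x, y) if a != b)


def count_with_bools(x, y):
    # Yields a bool for every pair, matching or not.
    return sum(a != b for a, b in zip(x, y))


# Both forms give the same count.
assert count_with_filter(dna_1, dna_2) == count_with_bools(dna_1, dna_2)

# Rough timing comparison; results vary between Python versions.
print(timeit.timeit(lambda: count_with_filter(dna_1, dna_2), number=100))
print(timeit.timeit(lambda: count_with_bools(dna_1, dna_2), number=100))
```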
CodePudding user response:
The working expression can be unrolled into something like this:
hamming_distance = 0
for a, b in zip(dna_1, dna_2):
    if a != b:
        hamming_distance += 1
Without a number after +=, what should Python add? It doesn't know, and neither do we.
If this "unrolled" form, or how your code relates to it, is new to you, start by reading up on list comprehensions, which generalize into generator expressions (which is what you have).
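As a quick illustration of that relationship (the sample strings are made up), the same clause works in square brackets as a list comprehension and in parentheses as a generator expression. When a genexpr is the only argument to a function call, as in sum(...), the extra pair of parentheses can be left out, which is why your working version has none.

```python
# Made-up sample inputs that differ in two positions.
dna_1 = "GATTACA"
dna_2 = "GACTATA"

# List comprehension: builds the whole list of 1s up front.
mismatches_list = [1 for a, b in zip(dna_1, dna_2) if a != b]
print(mismatches_list)  # [1, 1]

# Generator expression: same clause, but values are produced lazily.
mismatches_gen = (1 for a, b in zip(dna_1, dna_2) if a != b)

# Either one can be handed to sum.
print(sum(mismatches_list), sum(mismatches_gen))  # 2 2
```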