I have this code:
def PCP(seq, feature):
    l = []
    mean = st.mean(list(feature.values()))
    std = np.std(list(feature.values()))
    for elem in seq:
        for aa in elem:
            if aa in feature:
                l.append((feature[aa]-mean)/std)
            else:
                l.append(0)
    return l
seq is a list of strings with sequences, and feature is a dict with amino acids as keys and certain values as values. I want the function to go through every aa in each string in the list (seq), take the corresponding value from the dict, and do the calculation.
Here is a piece of the data I pass as seq, and the feature dict:
['KISKDLSIAVQMMKRIHSLLERYPEIL', 'SGRVEKSPHEQEIKFFAKILLPLINQY', 'IDQLIVFGEQLIQKSEPLDAVLIEDEL', ..], pI= {"K":9.47, "P":6.3, "R":10.76, "T":5.6, "A":6.11, "C":5.02, "D":2.98, "E":3.08, "F":5.91, "G":6.06, "H":7.64......}
Maybe there is something wrong with them.
When I run the function I get this error:
Traceback (most recent call last)
<ipython-input-13-89a7be7ecda7> in <module>
----> 1 test_hydrofobowosc= PCP(test_data_neg, hydrofobowosc)
<ipython-input-12-2f318f9cb908> in PCP(seq, feature)
14 mean = st.mean(list(feature.values()))
15 std = np.std(list(feature.values()))
---> 16 for elem in seq:
17 for aa in elem:
18 if aa in feature:
TypeError: 'NoneType' object is not iterable
What's wrong?
CodePudding user response:
'range' produces a sequence of numbers, so 'elem' is a number, and a number is not iterable.
Did you mean 'for elem in seq'?
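To illustrate the difference, here is a minimal sketch with hypothetical sample data (the names `seqs`, `iterate_by_index`, and `iterate_directly` are made up for this example and are not from the question). Looping over `range(len(...))` yields integers, and trying to loop over an integer fails, while looping over the list yields the strings themselves.

```python
seqs = ["KIS", "SGR"]  # hypothetical sample data, not the asker's real input


def iterate_by_index(seqs):
    """Loop over indices: each elem is an int, so the inner loop fails."""
    out = []
    for elem in range(len(seqs)):
        for aa in elem:  # raises TypeError: an int cannot be iterated
            out.append(aa)
    return out


def iterate_directly(seqs):
    """Loop over the list itself: each elem is a string of characters."""
    out = []
    for elem in seqs:
        for aa in elem:
            out.append(aa)
    return out


try:
    iterate_by_index(seqs)
    index_error = None
except TypeError as exc:
    index_error = str(exc)

letters = iterate_directly(seqs)
```

After running this, `index_error` holds the message of the TypeError raised by the index-based version, and `letters` holds every character of every sequence in order.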
CodePudding user response:
Here's a modified version of your code that should work:
def PCP(seq, feature):
    l = []
    mean = st.mean(list(feature.values()))
    std = np.std(list(feature.values()))
    for elem in seq:
        for aa in elem:
            if aa in feature:
                l.append((feature[aa]-mean)/std)
            else:
                l.append(0)
    return l
The main changes I made were:
- Instead of using "range(len(seq))" to iterate through the list of sequences, I used "seq" directly. This way, "elem" will be the actual string in the list, instead of the index of the string.
- Inside the nested loop, I used "aa" as the variable to iterate through the characters in the string.
- When checking whether the current character is in the feature dictionary, I used the "in" keyword instead of "keys[s]". "keys()" is a method that returns a view of the dictionary's keys, so it has to be called and cannot be subscripted with "[]". Since "aa" is a string and not an index, you can just check if it's in the dictionary directly.
- I moved the "return l" statement outside of the nested loop, so that it only happens once after all the calculations are done.
With these changes, the code should work as expected: for each amino acid in the sequences it appends the calculated value to the list l if the amino acid has a corresponding value in the feature dictionary, and 0 otherwise.
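One more thing worth checking: the traceback says 'NoneType' object is not iterable at the line "for elem in seq", which means the value passed as seq (test_data_neg in the call) was None at that point. The sketch below is one possible way to make that failure easier to spot, not a definitive fix. It assumes "st" in the question is the standard statistics module, and it swaps np.std for statistics.pstdev (both compute the population standard deviation by default) so the example needs no third-party imports. The name pcp, the explicit None check, and the tiny input are illustrative assumptions.

```python
import statistics


def pcp(seq, feature):
    """Return z-scored feature values for every character in every sequence."""
    if seq is None:
        # Assumption: fail early with a clearer message than the default TypeError.
        raise ValueError("seq is None; check the code that builds the sequence list")
    values = list(feature.values())
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std, like np.std's default
    scores = []
    for elem in seq:
        for aa in elem:
            if aa in feature:
                scores.append((feature[aa] - mean) / std)
            else:
                scores.append(0)  # unknown characters get 0, as in the question
    return scores


# Two pI values taken from the question; "X" stands in for an unknown residue.
pI = {"K": 9.47, "D": 2.98}
result = pcp(["KDX"], pI)
```

If the check fires, the thing to look at is whatever produces test_data_neg. A common cause is a function that builds or modifies a list but has no return statement, so its result is None.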