What should I add or change in my code below to get a function that finds the mean length of reads? I have to write a function, mean_length, that takes one argument: a dictionary in which keys are read names and values are read sequences. The function must return a float, which is the average length of the sequence reads. Hope someone can help me :D I am very new to coding in Python.
read_map = {'Read1': 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC',
'Read3': 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT',
'Read2': 'CTTTACCCGGAAGAGCGGGACGCTGCCCTGCGCGATTCCAGGCTCCCCACGGG',
'Read5': 'CGATTCCAGGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTC',
'Read4': 'TGCGAGGGAAGTGAAGTATTTGACCCTTTACCCGGAAGAGCG',
'Read6': 'TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGT'}
def mean_lenght (read_map):
    print('keys : ',read_map.values())
    for key in read_map.keys():
        print(key)
    #result = sum(...?)/len(read_map)
    return result
print(mean_lenght(read_map))
CodePudding user response:
A simple one-line solution could be:
def mean_length(read_map):
    return sum([len(v) for v in read_map.values()]) / len(read_map)
Basically, you construct a list in which each element stores the length of one entry of read_map. Then you sum up all those lengths and divide by the number of entries in your dict.
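To make the intermediate steps concrete, here is a minimal sketch on a tiny made-up dictionary (toy_reads is hypothetical, not the data from the question), so the numbers are easy to check by hand:

```python
# Hypothetical toy data, chosen so the arithmetic is easy to follow.
toy_reads = {"r1": "ACGT", "r2": "AC", "r3": "ACGTAC"}

# Step 1: one length per sequence, in insertion order.
lengths = [len(v) for v in toy_reads.values()]

# Step 2: total number of bases across all reads.
total = sum(lengths)

# Step 3: divide by the number of reads; "/" always gives a float.
mean = total / len(toy_reads)

print(lengths)  # [4, 2, 6]
print(total)    # 12
print(mean)     # 4.0
```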
If your dictionary is very big, then constructing a list might not be the most memory-efficient way. In this case:
def mean_length(read_map):
    mean = 0
    for v in read_map.values():
        mean += len(v)  # accumulate the total length
    mean /= len(read_map)  # divide by the number of reads
    return mean
In this way, you do not build any intermediate list.
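A generator expression is another way to skip the intermediate list while keeping the one-line form. This is a sketch of that alternative, not something the original code requires; the toy dictionary in the usage line is made up:

```python
def mean_length(read_map):
    # No square brackets: the lengths are produced one at a time
    # and consumed by sum(), so no list is ever stored.
    return sum(len(v) for v in read_map.values()) / len(read_map)


# Hypothetical toy data for a quick check.
print(mean_length({"r1": "ACGT", "r2": "AC", "r3": "ACGTAC"}))  # 4.0
```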
CodePudding user response:
The mean is the sum of lengths divided by the number of values, so let's just do this:
sum(map(len, read_map.values()))/len(read_map)
output: 55.333
Breakdown:
# "…" denotes the output of the previous line
read_map.values() -> returns the values of the dictionary
map(len, …) -> computes the length of each sequence
sum(…) -> get the total length
sum(…)/len(read_map) -> divide the total length by the number of sequences = mean
As a function:
def mean_length(d):
    return sum(map(len, d.values()))/len(d)
>>> mean_length(read_map)
55.333333333333336
CodePudding user response:
"get a function that finds the mean length of reads???"
Python's built-in statistics module has what you are looking for: statistics.mean. Naturally, you need to compute the lengths before feeding the data into that function, for which the built-in len function is useful.
import statistics
read_map = {'Read1': 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC',
'Read3': 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT',
'Read2': 'CTTTACCCGGAAGAGCGGGACGCTGCCCTGCGCGATTCCAGGCTCCCCACGGG',
'Read5': 'CGATTCCAGGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTC',
'Read4': 'TGCGAGGGAAGTGAAGTATTTGACCCTTTACCCGGAAGAGCG',
'Read6': 'TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGT'}
print(statistics.mean(len(v) for v in read_map.values()))
output
55.333333333333336
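Since the question asks for a function that returns a float, the same idea can be wrapped up as below. This is a sketch under one assumption worth flagging: as far as I know, statistics.mean can hand back an int when the data are ints and the mean happens to be a whole number, so the float() call is there to make the return type explicit. The toy dictionary is made up for illustration.

```python
import statistics


def mean_length(read_map):
    # Feed the lengths to statistics.mean lazily via a generator expression.
    # float() guards against an int result for whole-number means.
    return float(statistics.mean(len(seq) for seq in read_map.values()))


# Hypothetical toy data: lengths 4, 2 and 6.
print(mean_length({"r1": "ACGT", "r2": "AC", "r3": "ACGTAC"}))  # 4.0
```

Note that statistics.mean raises statistics.StatisticsError if the dictionary is empty, rather than ZeroDivisionError.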
CodePudding user response:
def mean_length(read_map):
    total_chars = 0
    for seq in read_map.values():  # iterate over the sequences, not the keys
        total_chars = total_chars + len(seq)
    result = total_chars / len(read_map)
    return result
I think this is the most intuitive code for a beginner.
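One case the loop does not cover is an empty dictionary: len(read_map) is then 0 and the division raises ZeroDivisionError. Here is a minimal sketch of a guarded variant. The name mean_length_safe is made up, and returning 0.0 for empty input is only an assumption, since the question does not say what should happen in that case:

```python
def mean_length_safe(read_map):
    # Assumption: an empty dictionary should give 0.0 instead of an error.
    if not read_map:
        return 0.0
    total_chars = 0
    for seq in read_map.values():
        total_chars += len(seq)
    return total_chars / len(read_map)


print(mean_length_safe({}))                          # 0.0
print(mean_length_safe({"r1": "ACGT", "r2": "AC"}))  # 3.0
```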