Assume the following text file:
Some lines here
and here
chr1 111 222 More_data_I_dont_need More_data_I_dont_need ...
chr2 333 444 More_data_I_dont_need. More_data_I_dont_need ...
chr3 555 666 More_data_I_dont_need More_data_I_dont_need ...
(data separated by \t)
I have been trying to convert this data into a Python dictionary in which each key is the value from the first column and each value is a list holding the second and third columns, e.g.:
{"chr1": [111, 222], "chr2": [333,444]}
The code I have written so far is:
d = {}
star_end_list = []
with open("file.txt") as f:
    for line in f:
        if line.startswith('chr'):
            chr, start,end = line.split()
            star_end_list.append() = start
            star_end_list.append() = end
            # Then with this data I can create a dictionary. Something like
            d[chr] = star_end_list
But I don't know how to ignore the values in the 4th, 5th, and later columns.
Answer:
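You can keep only the first three items of the list returned by line.split() by slicing it with [:3], as seen below. Your code also had a few other errors: the indentation, the way you call the append function (the item is passed as an argument; you cannot assign to the call), and the fact that you initialize star_end_list outside of the loop, which would make every key share the same ever-growing list. They are addressed below too.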
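d = {}
with open("file.txt") as f:
    for line in f:
        if line.startswith('chr'):
            # Start a fresh list for each matching line so keys do not share one list
            star_end_list = []
            # Keep only the first three columns and ignore the rest
            chr, start, end = line.split()[:3]
            # start and end are strings here; wrap them in int() if numbers are needed
            star_end_list.append(start)
            star_end_list.append(end)
            d[chr] = star_end_list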
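If the values should be integers, as in the desired output shown in the question, a variant along these lines may help. It is only a sketch: the function name read_regions, the variable names, and the sample data written to a temporary file are made up for illustration, and it assumes the columns really are tab-separated and that the second and third columns always hold whole numbers.

```python
import os
import tempfile


def read_regions(path):
    """Map each chromosome name to [start, end] as integers."""
    regions = {}
    with open(path) as f:
        for line in f:
            if not line.startswith("chr"):
                continue  # skip the leading non-data lines
            # maxsplit=3 stops after the third tab, so all remaining
            # columns stay in one trailing string that [:3] discards
            chrom, start, end = line.rstrip("\n").split("\t", 3)[:3]
            regions[chrom] = [int(start), int(end)]
    return regions


# Hypothetical sample mirroring the layout described in the question
sample = (
    "Some lines here\n"
    "and here\n"
    "chr1\t111\t222\tMore_data\tMore_data\n"
    "chr2\t333\t444\tMore_data\tMore_data\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
try:
    result = read_regions(tmp.name)
finally:
    os.remove(tmp.name)

print(result)  # {'chr1': [111, 222], 'chr2': [333, 444]}
```

Using the name chrom instead of chr also avoids shadowing Python's built-in chr function. Note that if the same chromosome name appears on more than one line, the last line wins, just as in the loop above.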