Python - creating a dictionary of csv file that collects all values with the same key without using

Time:03-20

Hello, I took a Python course for biologists and had homework about using CSV files. I got stuck on the final question: creating a dictionary from a CSV file where the keys are tuples of (c, b) and the values are lists of tuples of all the (first, finish, fix/ext) values that share the same c, b values. The CSV looks like this:

N,c,g,b,first,finish,fix/ext
1000,1.0,0.02,3.0,602,1661,0
1000,0.8,0.02,3.0,42,911,0
1000,1.0,0.02,3.0,945,2164,0
1000,1.0,0.02,3.0,141,954,0

The result needs to look like this:

: {(1, 3) : [(602, 1661, 0), (945, 2164, 1), (687, 716, 1)], (1, 1.5) : [(35, 287, 0), (803, 1402, 0), …], …}

So far I have managed to do this:

def read_results(full_file_path):
    csv_dict = {}
    with open(full_file_path,'r') as t:
        table = t.readlines()[1:]
        for line in table:
            line = line.replace('\n', '')
            line = line.split(',')
            csv_dict[line[0]] = line[1:]
            print(csv_dict)

which gives me this:

{'1000': ['1.0', '0.02', '3.0', '602', '1661', '0']}

But I can't seem to work out how to do it with the keys that I need, and how to put the lists of all corresponding values into tuples.

Oh, and I can't use import csv or any other imports.

CodePudding user response:

You are almost done. The only thing left is to decide what your keys and your values are going to be. Here is code that finishes it:

def read_results(full_file_path):
    csv_dict = {}
    with open(full_file_path, 'r') as t:
        table = t.readlines()[1:]  # skip the header row
        for line in table:
            line = line.strip()
            if not line:
                continue  # skip blank lines, e.g. at the end of the file
            line = line.split(',')
            line = list(map(float, line))  # optional, if you want to have numbers as floats and not strings
            # Columns: N, c, g, b, first, finish, fix/ext
            # line[1] - c
            # line[3] - b
            # KEY: (c, b) - tuple of c and b
            # VALUE: (line[4], line[5], line[6]) - tuple of first, finish, fix/ext
            key = (line[1], line[3])
            value = (line[4], line[5], line[6])
            if key in csv_dict:
                # append a new value if the key is already in the dict
                csv_dict[key].append(value)
            else:
                # create a new list for the key if it is not in the dict yet
                csv_dict[key] = [value]
    return csv_dict

# named "results" rather than "dict" so the built-in dict type is not shadowed
results = read_results("file.csv")
print(results)
# try to get the values of the dict by some key:
key = (1.0, 3.0)  # key = ("1.0", "3.0") if strings were not converted to floats
print(results[key])

I hope the comments make it understandable. If you have more questions, write them down and I will try to answer them.
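If you want the values to come out as whole numbers, like in the expected result from the question, here is one possible variation. It is only a sketch under some assumptions: the helper name group_rows is made up for illustration, the column order is taken from the header shown in the question, and the last three columns are assumed to always hold whole numbers. It uses dict.setdefault in place of the if/else and still needs no imports.

```python
def group_rows(lines):
    """Group data rows by (c, b); each value is a list of (first, finish, fix/ext)."""
    grouped = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines, e.g. a trailing newline at the end of the file
        fields = line.split(',')
        # Column order assumed from the header: N, c, g, b, first, finish, fix/ext
        key = (float(fields[1]), float(fields[3]))
        # Assumption: the last three columns always hold whole numbers
        value = (int(fields[4]), int(fields[5]), int(fields[6]))
        # setdefault returns the list already stored for key,
        # or stores a new empty list and returns that
        grouped.setdefault(key, []).append(value)
    return grouped


def read_results(full_file_path):
    with open(full_file_path, 'r') as t:
        next(t, None)  # skip the header row
        return group_rows(t)


# The four data rows from the question, without the header
sample = [
    "1000,1.0,0.02,3.0,602,1661,0",
    "1000,0.8,0.02,3.0,42,911,0",
    "1000,1.0,0.02,3.0,945,2164,0",
    "1000,1.0,0.02,3.0,141,954,0",
]
print(group_rows(sample))
```

For the four sample rows this prints {(1.0, 3.0): [(602, 1661, 0), (945, 2164, 0), (141, 954, 0)], (0.8, 3.0): [(42, 911, 0)]}. Since 1.0 == 1 in Python and both hash the same, looking up the key (1, 3) finds the same entry as (1.0, 3.0).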
