CSV to Python Dict

I have this CSV:

ref,columns,start,end,content
1,Column1,1234,5654,text11
1,Column2,2454,3223,text12
1,Column3,7565,8765,text13
1,Column4,8524,5456,text14
1,Column5,1235,5245,text15
2,Column1,7856,2356,text21
2,Column2,4568,3545,text22
2,Column3,5756,4563,text23
2,Column4,9856,2135,text24
2,Column5,4535,5456,text25
3,Column1,5756,9856,text31
3,Column2,5764,8778,text32
3,Column3,1536,8768,text33
3,Column4,6861,5654,text34
3,Column5,6875,6586,text35

How do I change it into a dictionary like this? I want 'ref' to become a key as well, and 'start' and 'end' left out of the output.

thisdict = {
  "ref": [1,2,3],
  "Column1": ['text11','text21','text31'],
  "Column2": ['text12','text22','text32'],
  "Column3": ['text13','text23','text33'],
  "Column4": ['text14','text24','text34'],
  "Column5": ['text15','text25','text35'],
}

This is what I've tried:

    import csv

    Columns = "Column1","Column2","Column3","Column4","Column5"
    cols = col1,col2,col3,col4,col5 = [],[],[],[],[]

    with open(csvfile,'r',encoding="utf8") as f:
        reader = list((csv.reader(f)))
        for line in reader:
            for i,col in enumerate(Columns):
                if col == line[1]:
                    cols[i].append(line[4])
    for col in cols:
        print(col)

Output:

['text11', 'text21', 'text31']
['text12', 'text22', 'text32']
['text13', 'text23', 'text33']
['text14', 'text24', 'text34']
['text15', 'text25', 'text35']

Thank you for your help.

CodePudding user response：

import pandas as pd

path_to_file = "xxxxxx"
df = pd.read_csv(path_to_file)
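# Header names are case-sensitive: the CSV uses 'start' and 'end'
df = df.drop(columns=['start', 'end'])
# One row per ref, one column per distinct value of 'columns'
df = df.pivot(index='ref', columns='columns', values='content').reset_index()

# orient='list' yields {column name: list of values}
this_dict = df.to_dict(orient='list')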

CodePudding user response：

Do something simple like this:

thisdict = {
    "ref": [],
}
with open("your.csv") as csv_fp:
    for line in csv_fp:
        if line.startswith("ref"):
            # skip first line
            continue

        ref, col, _, _, content = line.strip().split(",")

        if int(ref) not in thisdict["ref"]:
            thisdict["ref"].append(int(ref))

        if col not in thisdict:
            thisdict[col] = [content]
        else:
            thisdict[col].append(content)

print(thisdict)

CodePudding user response：

I would change your code a little bit. Try this:

from collections import defaultdict
import csv

col_dict = defaultdict(list)

with open(csvfile,'r',encoding="utf8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        col_dict[row['columns']].append(row['content'])
        if int(row['ref']) not in col_dict['ref']:
            col_dict['ref'].append(int(row['ref']))

print(col_dict)

csv.DictReader reads each row of your CSV as a dictionary keyed by the header names. This works here because the first line of your CSV is a header row.
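
To make that concrete, here is a minimal sketch of what DictReader yields for the first two data rows of the question's CSV. The io.StringIO object is only a stand-in for the real file, and the name sample is made up for this illustration.

```python
import csv
import io

# In-memory stand-in for the CSV file (header plus the first two data rows)
sample = (
    "ref,columns,start,end,content\n"
    "1,Column1,1234,5654,text11\n"
    "1,Column2,2454,3223,text12\n"
)

reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)

# Each row is a dict keyed by the header names; every value is a string
print(rows[0]['columns'], rows[0]['content'])  # Column1 text11
print(reader.fieldnames)  # ['ref', 'columns', 'start', 'end', 'content']
```

Because every value comes back as a string, the code above converts row['ref'] with int() before storing it.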

defaultdict creates a dictionary in which a missing key is filled in automatically by calling the factory function you pass to it. In this case the factory is list, so every new key starts out as an empty list.
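
A small sketch of that behaviour on its own, with the name groups chosen only for this illustration:

```python
from collections import defaultdict

groups = defaultdict(list)

# Accessing a missing key calls list() and stores the new empty list,
# so append works without checking whether the key exists first
groups['Column1'].append('text11')
groups['Column1'].append('text21')

# Even a plain lookup creates the key
groups['ref']

print(dict(groups))  # {'Column1': ['text11', 'text21'], 'ref': []}
```

This is also why the membership test against col_dict['ref'] in the answer is safe on the very first row: the lookup creates an empty list rather than raising KeyError.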