'open' and 'csv.DictReader' cannot read file correctly

Time:09-28

I have a simple .csv file encoded in windows-1250. It has two columns of key-value pairs, separated by a semicolon. I would like to create a dictionary from this data. I used this solution: How to read a .csv file into a dictionary in Python. Code below:

import os
import csv

strpath = r"C:\Project Folder"
filename = "to_dictionary.csv"

os.chdir(strpath)

test_csv = open(filename, mode="r", encoding="windows-1250")
dctreader = csv.DictReader(test_csv)
ordereddct = list(dctreader)[0]
finaldct = dict(ordereddct)

print(finaldct)

First of all, this file has 370 rows but I receive only two. Second, Python reads the whole first row as a key and the next row as a value (and then stops, as I mentioned).

# source data
# a;A
# b;B
# c;C
# ... up to 370 rows

# what I need (example; there should be 368 pairs more of course)
finaldct = {"a": "A", "b": "B"}

# what I receive
finaldct = {"a;A": "b;B"}

I have no idea why this happens and couldn't find any working solution.

Note: I would like to avoid using pandas because it seems to work slower in this case.

CodePudding user response:

file has 370 rows but I receive only two

This might be caused by problems with newlines (they differ between systems; see the Wikipedia entry on Newline if you want to know more). The csv module docs suggest opening the file with newline='', i.e. in your case:

test_csv = open(filename, newline='', mode="r", encoding="windows-1250")
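That said, the symptom can be reproduced in memory, which suggests the default comma delimiter (together with the [0] index) is a more likely cause than newlines. A minimal sketch, assuming the three sample rows from the question; io.StringIO is only a stand-in for the real file:

```python
import csv
import io

# Sample rows from the question; StringIO stands in for the real file (assumption).
sample = "a;A\nb;B\nc;C\n"

# The default delimiter is a comma, so each line is parsed as a single field.
# DictReader uses the first line as the header and the following lines as data rows.
rows = list(csv.DictReader(io.StringIO(sample)))

first_row = dict(rows[0])  # [0] keeps only the first data row
print(first_row)           # {'a;A': 'b;B'}
print(len(rows))           # 2, so the remaining rows were read but then discarded
```

So the file is probably read in full; the rows are simply not split on the semicolon, and only the first one is kept.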

CodePudding user response:

If you have a file with just two columns (assuming they're unquoted, etc.), you don't need to use the csv module at all.

dct = {}
with open("file.txt", encoding="windows-1250") as f:
    for line in f:
        key, _, value = line.rstrip("\r\n").partition(";")
        dct[key] = value
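A quick in-memory check of the same loop, as a sketch: io.StringIO replaces the file, and the last row is a made-up case with a second semicolon to show that partition splits only on the first one.

```python
import io

# First two rows follow the question's sample; the third is a hypothetical edge case.
sample = "a;A\nb;B\r\nc;C;extra\n"

dct = {}
for line in io.StringIO(sample):
    # Strip either line-ending style, then split on the first ';' only.
    key, _, value = line.rstrip("\r\n").partition(";")
    dct[key] = value

print(dct)  # {'a': 'A', 'b': 'B', 'c': 'C;extra'}
```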

CodePudding user response:

You can try the code below:

data = dict()
with open('test.csv') as f:
    for line in f:
        temp = line.strip()
        if temp:  # skip blank lines
            # maxsplit=1 so a value that itself contains ';' does not break unpacking
            k, v = temp.split(';', 1)
            data[k] = v
print(data)

test.csv

1;2
3;5
78;8
6;0

output

{'1': '2', '3': '5', '78': '8', '6': '0'}

CodePudding user response:

Thank you all, but I finally managed to do it (and without looping)! The solution from Kite misled me a little. Here is my code:

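import os
import csv

strpath = r"C:\Project Folder"
filename = "to_dictionary.csv"

os.chdir(strpath)

# newline="" is what the csv docs recommend; the with block closes the file afterwards
with open(filename, mode="r", encoding="windows-1250", newline="") as test_csv:
    csvreader = csv.reader(test_csv, delimiter=";")
    # each row is a [key, value] list, so dict() can consume the reader directly
    finaldct = dict(csvreader)

print(finaldct)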

So, first, I needed to specify the delimiter, but in a plain reader. Second, there's no need to use DictReader. Passing the reader to dict() suffices, because each row is a two-element list that dict() treats as a key-value pair.
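The same idea can be checked without the real file. A minimal sketch, assuming the sample rows from the question, with io.StringIO in place of the file:

```python
import csv
import io

# Sample rows from the question; StringIO stands in for the real file (assumption).
sample = "a;A\r\nb;B\r\nc;C\r\n"

# With delimiter=";" every row comes back as a two-element list.
reader = csv.reader(io.StringIO(sample, newline=""), delimiter=";")

# dict() accepts any iterable of two-item sequences, so no explicit loop is needed.
finaldct = dict(reader)
print(finaldct)  # {'a': 'A', 'b': 'B', 'c': 'C'}
```

Note that dict() expects exactly two fields per row, so a blank line or a row with a third column would raise a ValueError.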
