I am trying to read a .txt file that has the format
"Field1:Field2:Field3:Field4"
"a:b:c:d"
"e:f:g:h"
into a dictionary with the format
{Field1: [a, e], Field2: [b, f], Field3: [c, g], Field4: [d, h]}
and my current code looks like
with open("data.txt", 'r') as filestream:
    lines = [line.strip().split(":") for line in filestream]
fields = lines[0]
d = dict.fromkeys(fields, [])
for i in range(1, len(lines)):
    for j in range(len(fields)):
        d[fields[j]].append(lines[i][j])
What I'm trying to do is convert each line in the file into a split and cleaned list, store that in a bigger list of lists, and then use a double for loop to match the key of the dictionary with the correct value in the smaller/current list. However, what the code ends up doing is adding to the dictionary in a way that looks like:
{Field1: [a], Field2: [a], Field3: [a], Field4: [a]}
{Field1: [a,b], Field2: [a,b], Field3: [a,b], Field4: [a,b]}
I want to add to the dictionary in the following manner:
{Field1: [a], Field2: [], Field3: [], Field4: []}
{Field1: [a], Field2: [b], Field3: [], Field4: []}
and so forth.
Can anyone help me figure out where my code is going wrong?
CodePudding user response:
Try:
out = {}
with open("data.txt", "r") as f_in:
    i = (line.strip().split(":") for line in f_in)
    fields = next(i)
    for line in i:
        for k, v in zip(fields, line):
            out.setdefault(k, []).append(v)
print(out)
Prints:
{
"Field1": ["a", "e"],
"Field2": ["b", "f"],
"Field3": ["c", "g"],
"Field4": ["d", "h"],
}
CodePudding user response:
The issue that you're having comes from the line:
d = dict.fromkeys(fields, [])
More specifically, from the [] argument. This line creates a new dictionary with the fields as keys and the SAME empty list as the value for every field. Field1, Field2, Field3 and Field4 therefore all share one list, so an append made through any key shows up under every key, which is why you're seeing this problem.
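To see the sharing directly, here is a small sketch that compares the two ways of building the dictionary. The two-element fields list is made up for illustration; it is not read from your file.

```python
fields = ["Field1", "Field2"]  # illustrative stand-in for the header row

# dict.fromkeys evaluates [] once, so every key refers to the same list object.
shared = dict.fromkeys(fields, [])
shared["Field1"].append("a")
print(shared)                                # {'Field1': ['a'], 'Field2': ['a']}
print(shared["Field1"] is shared["Field2"])  # True

# A dict comprehension evaluates [] once per key, so each key gets its own list.
separate = {field: [] for field in fields}
separate["Field1"].append("a")
print(separate)                              # {'Field1': ['a'], 'Field2': []}
```

The identity check with "is" is the giveaway: both keys point at one list, so there is only one list to append to.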
Your issue can be fixed through a single line change, from:
d = dict.fromkeys(fields, [])
to:
d = {field: [] for field in fields}
Meaning that your source code would become:
with open("data.txt", 'r') as filestream:
    lines = [line.strip().split(":") for line in filestream]
fields = lines[0]
d = {field: [] for field in fields}
for i in range(1, len(lines)):
    for j in range(len(fields)):
        d[fields[j]].append(lines[i][j])
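If you want to check the fix without touching data.txt, the same logic can be run against an in-memory copy of the sample from the question. This is only a sketch: the helper name read_columns and its sep parameter are made up here, and skipping blank lines is an added assumption rather than something your original code does.

```python
import io


def read_columns(stream, sep=":"):
    """Hypothetical helper: map each header field to the list of values in its column."""
    # Assumption: blank lines are skipped; the stream must contain a header row.
    rows = [line.strip().split(sep) for line in stream if line.strip()]
    fields, records = rows[0], rows[1:]
    columns = {field: [] for field in fields}  # one independent list per field
    for record in records:
        for field, value in zip(fields, record):
            columns[field].append(value)
    return columns


# In-memory stand-in for data.txt, using the sample lines from the question.
sample = io.StringIO("Field1:Field2:Field3:Field4\na:b:c:d\ne:f:g:h\n")
result = read_columns(sample)
print(result)
# {'Field1': ['a', 'e'], 'Field2': ['b', 'f'], 'Field3': ['c', 'g'], 'Field4': ['d', 'h']}
```

Passing an open file object instead of the StringIO works the same way, since both yield lines when iterated.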