Why is only half my data being passed into my dictionary?

Time:11-07

When I run this script, I can verify that it loops through all of the values, but not all of them end up in my dictionary.

import re

import PyPDF2

file = open('path', 'rb')
readFile = PyPDF2.PdfFileReader(file)

lineData = {}

totalPages = readFile.numPages

# Matches a date such as "Jan 05" at the start of a line
newTrans = re.compile(r'Jan \d{2,}')

for i in range(totalPages):
    pageObj = readFile.getPage(i)
    pageText = pageObj.extractText()
    for line in pageText.split('\n'):
        if newTrans.match(line):
            # Everything except the date becomes the value
            newValue = re.split(r'Jan \d{2,}', line)
            newValueStr = ' '.join(newValue)
            # The date itself becomes the key
            newKey = newTrans.findall(line)
            newKeyStr = ' '.join(newKey)
            print(newKeyStr + newValueStr)
            lineData[newKeyStr] = newValueStr
print(len(lineData))

There are 80 data pairs, but when I run this the dict only ends up with 37 entries.

CodePudding user response:

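Well, duplicate keys, maybe? A dict keeps only one value per key, so if several lines start with the same date (say, two "Jan 05" transactions), each new one overwrites the previous one. Try making lineData = [] and appending to it instead: lineData.append({newKeyStr: newValueStr}), and then check how many records you get.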
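To illustrate the duplicate-key idea without needing the PDF, here is a minimal sketch. The sample lines are made up (they stand in for whatever extractText returns), and the names as_dict, as_list and grouped are only for illustration. It compares a plain dict, a list of pairs, and a defaultdict(list) that keeps every value under its date.

```python
import re
from collections import defaultdict

new_trans = re.compile(r'Jan \d{2,}')

# Hypothetical sample lines standing in for the extracted PDF text.
sample_lines = [
    'Jan 05 Coffee 3.50',
    'Jan 05 Groceries 42.10',
    'Jan 06 Fuel 30.00',
    'Balance carried forward',
]

as_dict = {}                 # one value per key: duplicates are overwritten
as_list = []                 # keeps every (key, value) pair
grouped = defaultdict(list)  # keeps every value, grouped by key

for line in sample_lines:
    m = new_trans.match(line)
    if not m:
        continue  # skip lines that do not start with a date
    key = m.group(0)
    value = line[m.end():].strip()
    as_dict[key] = value
    as_list.append((key, value))
    grouped[key].append(value)

print(len(as_dict))   # number of distinct dates
print(len(as_list))   # number of matched lines
print(dict(grouped))
```

If the list length matches the 80 pairs you expect while the dict stays at 37, the dates are repeating, and grouping the values per date (or using a list) would keep them all.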
