Data not writing in csv file

Time:11-07

I am trying to write a program that creates two CSV files from an XML file. However, my data isn't being written to the CSV files; only the header rows appear. The two files should answer:

  1. How many books were published each year
  2. How many times is each subject heading mentioned

Here is a sample of the XML I am using:

<records>
<rec resultID="1">
  <header shortDbName="cat01806a" longDbName="Simmons Library Catalog" uiTerm="sim.b2083905">
    <controlInfo>
      <bkinfo>
        <btl>Android programming [electronic resource] : pushing the limits / Erik Hellman.</btl>
        <isbn type="print">9781118717301</isbn>
        <isbn type="print">9781118717356</isbn>
      </bkinfo>
      <jinfo />
      <pubinfo>
        <dt year="2014" month="01" day="01"></dt>
      </pubinfo>
      <artinfo>
        <tig>
          <atl>Android programming. [electronic resource] : pushing the limits.</atl>
        </tig>
        <aug>
          <au>Hellman, Erik</au>
        </aug>
        <sug>
          <subj type="unclass">Android (Electronic resource)</subj>
          <subj type="unclass">Application software -- Development</subj>
          <subj type="unclass">Smartphones -- Programming</subj>
          <subj type="unclass">Tablet computers -- Programming</subj>
        </sug>
        <pubtype>eBook</pubtype>
        <pubtype>Book</pubtype>
        <doctype>Bibliographies</doctype>
        <formats />
      </artinfo>
      <language>English</language>
    </controlInfo>
    <displayInfo>
      <pLink>
        <url>http://ezproxy.simmons.edu:2048/login?url=https://search.ebscohost.com/login.aspx?direct=true&amp;db=cat01806a&amp;AN=sim.b2083905&amp;site=eds-live&amp;scope=site</url>
      </pLink>
    </displayInfo>
  </header>
</rec>

Here is the code I have so far:


#import libraries 
import csv, xml
import xml.etree.ElementTree as ET

#read and open
base = ET.parse('simmons_program_books.xml')
detail = base.getroot()

#frequency count for dictionary
def count(dictionary, key):
    if key in dictionary:
        dictionary[key] += 1

    else:
        dictionary[key] = 1

#empty dictionary variables
year_count = {}
subhead_count = {}

for year in detail.iter('dt year'):
    #variable
    count(year_count, year.text)

for subhead in detail.iter('subj type'):
    count(subhead_count, subhead.text)

#to a csv (year)
year = open("year.csv", mode ='w', newline = '', encoding="utf-8")
write = csv.writer(year)

write.writerow(['year', '# books'])
for x, z in year_count.items():
    write.writerow([x, z])

#close
year.close()


#to a csv (subhead)
subhead = open("subhead.csv", mode = 'w', newline = '', encoding ="utf-8")
write = csv.writer(subhead)

write.writerow(['subheading', '# amt mentioned'])
for x, z in subhead_count.items():
    write.writerow([x, z])

#close
subhead.close()

I'm not sure what's wrong.

CodePudding user response:

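  1. Your iter() calls are looking for non-existent tags, 'dt year' and 'subj type'. iter() matches on the tag name only, and year and type are attributes, so the calls should look for 'dt' and 'subj' instead.

  2. The year is stored in an attribute, not in the element's text. To populate the dictionary, use year.get('year') instead of year.text.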
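
A minimal sketch of both points, using an inline XML string trimmed down from the sample in the question (the SAMPLE string and the variable names are only for illustration):

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the question's file; only the relevant tags are kept.
SAMPLE = """<records>
  <rec resultID="1">
    <pubinfo><dt year="2014" month="01" day="01"></dt></pubinfo>
    <sug>
      <subj type="unclass">Android (Electronic resource)</subj>
      <subj type="unclass">Smartphones -- Programming</subj>
    </sug>
  </rec>
</records>"""

root = ET.fromstring(SAMPLE)

# iter() compares against the tag name only, so 'dt year' matches nothing.
no_match = list(root.iter('dt year'))

# The year is an attribute of <dt>, so read it with get(); dt.text is None here.
years = [dt.get('year') for dt in root.iter('dt')]

# Subject headings are element text, so .text is the right accessor for <subj>.
subjects = [subj.text for subj in root.iter('subj')]

print(no_match, years, subjects)
```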

CodePudding user response:

First, detail.iter('dt year') won't work, because iter() takes a tag name. Iterate over the dt elements and read the year attribute from each one.

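Second, your count function can return the dictionary so that the result can be reassigned. This is optional, since the dictionary is also updated in place.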

#frequency count for dictionary
def count(dictionary, key):
    if key in dictionary:
        dictionary[key] += 1

    else:
        dictionary[key] = 1
    return dictionary

#empty dictionary variables
year_count = {}
subhead_count = {}

for dt in detail.iter('dt'):
    #variable
    year_count=count(year_count, dt.attrib['year'])
    print('year', dt, dt.attrib['year'])

for subhead in detail.iter('subj'):
    subhead_count=count(subhead_count, subhead.text)
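
As an alternative sketch, collections.Counter from the standard library can stand in for the hand-written count function, and a with block closes each CSV file automatically. The helper names tally, write_counts and main, and the out_dir parameter, are hypothetical and not part of the original code; the file names and header rows are taken from the question.

```python
import csv
import os
import xml.etree.ElementTree as ET
from collections import Counter


def tally(root):
    """Return (year_count, subhead_count) for a parsed <records> tree."""
    # The year is an attribute of <dt>; the subject heading is the text of <subj>.
    year_count = Counter(dt.get('year') for dt in root.iter('dt'))
    subhead_count = Counter(subj.text for subj in root.iter('subj'))
    return year_count, subhead_count


def write_counts(path, header, counts):
    """Write a header row, then one (key, count) row per entry."""
    with open(path, mode='w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(counts.items())


def main(xml_path, out_dir='.'):
    """Parse xml_path and write year.csv and subhead.csv into out_dir."""
    root = ET.parse(xml_path).getroot()
    year_count, subhead_count = tally(root)
    write_counts(os.path.join(out_dir, 'year.csv'),
                 ['year', '# books'], year_count)
    write_counts(os.path.join(out_dir, 'subhead.csv'),
                 ['subheading', '# amt mentioned'], subhead_count)
```

Calling main('simmons_program_books.xml') would then write both files to the current directory, assuming the XML file is well formed.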