I am parsing XML files in a folder using Python's SAX parser and writing the output to a CSV file using pandas, but the CSV only contains the data from the last file.
I am new to Python, and this is my first time trying SAX parsing.
File read:
for dirpath, dirs, files in os.walk(fp1):
    for filename in files:
        print(files)
        fname = os.path.join(dirpath,filename)
        if fname.endswith('.xml'):
            print(fname)
            #for count in files:
            parser.parse(fname)
def characters(self, content):
    rows = []
    cols = ["ReporterCite","DecisionDate","CaseName","FileNum","CourtLocation","CourtName","CourtAbbrv","Judge","CaseLength","CourtCite","ParallelCite","CitedCount","UCN"]
    #ReporteCite, DecisionDate, CaseName, FileNum, CourtLocation, CourtName, CourtAbbrv, Judge, CaseLength, CourtCite, ParallelCite, CitedCount, UCN
    rows.append({"ReporterCite":self.rc,
                 "DecisionDate": self.dd,
                 "CaseName": self.can,
                 "FileNum": self.fn,
                 "CourtLocation": self.loc,
                 "CourtName": self.cn,
                 "CourtAbbrv": self.ca,
                 "Judge": self.j,
                 "CaseLength": self.cl,
                 "CourtCite": self.cc,
                 "ParallelCite": self.pc,
                 "CitedCount": self.cd,
                 "UCN": self.rn})
    #print(rows)
    df = pd.DataFrame(rows, columns=cols)
    df.to_csv(fp2,index=False)
CodePudding user response:
I assume you are overwriting your previous result on every write. This is a pandas question, not a SAX question. You would like to append to the existing CSV, right? If so, pass mode='a', like this:
df.to_csv('filename.csv', mode='a')
For more options, see the documentation:
- 'w' open for writing, truncating the file first (default)
- 'x' open for exclusive creation, failing if file already exists
- 'a' open for writing, appending to the end of file if it exists
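
One thing to watch with mode='a': by default to_csv writes the header row on every call, so the column names would be repeated for each file. Here is a minimal sketch of one way around that, assuming a hypothetical helper named append_rows and a shortened column list (neither is from the original code):

```python
import os

import pandas as pd

# Shortened column list for illustration; the real script has 13 columns.
COLS = ["ReporterCite", "DecisionDate", "CaseName"]


def append_rows(rows, csv_path, cols=COLS):
    """Append rows to csv_path, writing the header only if the file is new or empty.

    append_rows is a hypothetical helper, not part of pandas.
    """
    df = pd.DataFrame(rows, columns=cols)
    # Write the header only when there is nothing in the file yet.
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    df.to_csv(csv_path, mode="a", header=write_header, index=False)
```

You could call append_rows(rows, fp2) wherever the script currently calls df.to_csv(fp2, index=False). Since the file is never truncated, you would probably want to delete an old output file before a fresh run. Another option is to keep a single rows list across all files and call to_csv once after the os.walk loop has finished.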