How to batch-strip the tags from all XML files in a multi-level folder and convert them to TXT format?

Time:09-16

How can I batch-strip the tags from all the XML files in a multi-level folder and convert them to TXT format? Ideally the result would end up in a single TXT file. (Note: every file in every folder under the root directory is in XML format, as shown.)

I am currently writing my graduation thesis and have run into this problem while building a monolingual corpus. Time is short, so I would appreciate your advice. Thank you very much!

CodePudding user response:

(1) Search Baidu for "VC file search" or "VC directory traversal" to find all the .xml files.

(2) Search Baidu for "VC read and write XML file": read each XML file and write its content (excluding the tags) to a TXT file.

CodePudding user response:

This involves two things:
1. folder traversal,
2. file I/O.

For the traversal, if stack overflow is not a concern you can simply recurse. If you want to avoid a stack overflow, use a container instead: put the path of each folder you find into the container, and take a folder path out of the container each time you traverse one.

As for removing the tags, you need to learn how to recognize the tags in a string and the content they enclose, and then write only that content to the TXT file.
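A rough sketch of that idea is below. The names `strip_tags` and `convert_to_txt` are made up for this example, and the scan is deliberately naive: it assumes that anything between `<` and `>` is a tag, so it does not decode entities such as `&amp;` and does not handle comments or CDATA sections that themselves contain `>`.

```cpp
#include <filesystem>
#include <fstream>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: keep only the characters that are outside <...> pairs.
std::string strip_tags(const std::string& xml) {
    std::string text;
    bool inside_tag = false;
    for (char c : xml) {
        if (inside_tag) {
            if (c == '>') inside_tag = false;  // the tag ends here
        } else if (c == '<') {
            inside_tag = true;                 // a tag starts here
        } else {
            text += c;                         // ordinary content
        }
    }
    return text;
}

// Hypothetical helper: read one XML file and write its tag-free content
// next to it with a .txt extension. Bytes are copied unchanged, so the
// output keeps the encoding of the input file.
bool convert_to_txt(const fs::path& xml_path) {
    std::ifstream in(xml_path, std::ios::binary);
    if (!in) return false;
    std::ostringstream buffer;
    buffer << in.rdbuf();

    fs::path txt_path = xml_path;
    txt_path.replace_extension(".txt");
    std::ofstream out(txt_path, std::ios::binary);
    if (!out) return false;
    out << strip_tags(buffer.str());
    return static_cast<bool>(out);
}
```

To merge everything into one TXT file instead, the same `strip_tags` result could be appended to a single output stream opened once before the loop over the files.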

CodePudding user response:

Referring to the second-floor reply from Magic, xu:

There are also specialized tools for reading XML files.