I have read a large XML file into R as a character vector and wondered whether it is possible to search for records that have specific words associated with them. Most of the information I have found assumes the data is in a data frame, but mine is many rows of characters, each row being one entry in the XML file.
Unfortunately, I am unsure how to post XML text here: when I type in XML format, the formatting disappears, and I know that I should not post images. Suppose a row of characters contains, among other things, a category called models, as in the below example, and I want to search only for models that are Sud. How would I do this? I am relatively new to R.
Answer:
I would parse it with a dedicated XML package; see the post "How to parse an XML file to an R data frame?".
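As a rough sketch of the parsing route, assuming the xml2 package is installed: the XML below is made up, and the tag names record, name and models are only guesses at what the real file looks like, so the XPath expression would need adjusting.

```r
# Assumes xml2 is installed: install.packages("xml2")
library(xml2)

# Hypothetical XML; the real file's tag names will differ
xml_input <- "<records>
  <record><name>A</name><models>Sud</models></record>
  <record><name>B</name><models>Nord</models></record>
  <record><name>C</name><models>Sud</models></record>
</records>"

doc <- read_xml(xml_input)

# XPath: every <record> whose <models> child has the text 'Sud'
sud_records <- xml_find_all(doc, "//record[models='Sud']")

# Text of the <name> child of each matching record
sud_names <- xml_text(xml_find_first(sud_records, "./name"))
```

Selecting by XPath ties the match to the models element itself, so a Sud that appears in some other field of the record is not picked up.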
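Alternatively, you could use grepl to keep only the elements of the vector that contain the literal text >Sud< (fixed = TRUE switches off regular-expression matching):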
myVector[ grepl(">Sud<", myVector, fixed = TRUE) ]
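To see what that does, here is a small self-contained sketch in base R. The vector contents and the tag names (record, name, region, models) are invented for illustration, since the original XML was not posted. It also shows that matching on ">Sud<" alone can pick up Sud in other fields, and that including the assumed tag name in the pattern narrows it down.

```r
# Hypothetical vector: one XML record per element
myVector <- c(
  "<record><name>A</name><models>Sud</models></record>",
  "<record><name>B</name><region>Sud</region><models>Nord</models></record>",
  "<record><name>C</name><models>Sud</models></record>"
)

# Literal match anywhere: also keeps record B, where Sud is a region
anywhere <- myVector[grepl(">Sud<", myVector, fixed = TRUE)]

# Literal match restricted to the (assumed) <models> tag
models_only <- myVector[grepl("<models>Sud</models>", myVector, fixed = TRUE)]

length(anywhere)
length(models_only)
```

If the models element carries attributes or extra whitespace in the real file, the literal pattern will miss it, and parsing the XML properly is likely the safer option.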