Delete and replace strings matching a pattern in an XML file

Time:12-29

Team, I have an XML file that I need to modify for a personal hack project. I need help combining the steps below into fewer operations.

Everything below works on my Mac, but I would like to see whether I can combine the commands into a single one-line command. For example, is there an OR operation in sed syntax that I can use for similar operations? Something like sed -i '' '/classes|lines>|package|source|<source/d' coverage.xml

Second, an answer below suggests using xmlstarlet: xmlstarlet edit -d '//package' coverage.xml. That approach deletes the entire block, whereas I only want to delete the line, not the block, because I am reformatting the file. Note how I rename class to file and then make that node the parent node.

sed -i '' '/classes/d' coverage.xml
sed -i '' '/lines>/d' coverage.xml
sed -i '' '/<method/d' coverage.xml
sed -i '' '/package/d' coverage.xml
sed -i '' '/source/d' coverage.xml
sed -i '' '/<source/d' coverage.xml

replace

sed -i '' 's/class/file/g' coverage.xml

replace on the line containing <coverage

sed -i '' '/<coverage/s/version="2.0.3"/version="1"/g' coverage.xml 
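On the OR question: sed's extended-regex mode (-E, available in both BSD and GNU sed) accepts | as alternation, and several -e expressions can be chained in one invocation, so something like sed -E -i '' -e '/classes|lines>|<method|package/d' -e 's/class/file/g' coverage.xml should cover most of the steps above. The same single-pass idea can be sketched in Python as below. The function name convert_lines is made up for illustration, and the sketch deliberately narrows the bare source pattern to </?source, because a plain source also matches the sourceforge.net URL on the DOCTYPE line and would delete it.

```python
import re

# One alternation replaces the separate delete commands.
# Assumption: "source" is narrowed to "</?source" so that the DOCTYPE line
# (which contains "sourceforge") is not deleted by accident.
DELETE = re.compile(r"classes|lines>|<method|package|</?source")


def convert_lines(text: str) -> str:
    """Apply the delete and replace steps in a single pass over the lines."""
    out = []
    for line in text.splitlines(keepends=True):
        if DELETE.search(line):
            continue  # same effect as sed '/.../d'
        if "<coverage" in line:
            line = line.replace('version="2.0.3"', 'version="1"')
        out.append(line.replace("class", "file"))  # same as sed 's/class/file/g'
    return "".join(out)
```

Reading coverage.xml into a string, passing it through convert_lines, and writing the result back gives the same effect as running the individual commands one after another.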

Sample output to be converted. Note that this output is sent to a SonarQube server, which does not accept this format, so I am manually modifying it into what SonarQube accepts, and that works fine. In this question I am asking how I can achieve that conversion.

<?xml version="1.0" ?>
<!DOCTYPE coverage
SYSTEM 'http://cobertura.sourceforge.net/xml/coverage-04.dtd'>
<coverage branch-rate="0.0" branches-covered="0" branches-valid="0" complexity="0" line-rate="0.5936254980079682" lines-covered="447" lines-valid="753" timestamp="1672197709" version="2.0.3">
    <packages>
        <package line-rate="0.7614678899082569" branch-rate="0.0" name="src.services.secrets.keys" complexity="0">
            <classes>
                <class branch-rate="0.0" complexity="0" filename="src/services/guava/keys/keys.go" line-rate="0.7614678899082569" name="src.services.guava.keys.keys.go">
                    <methods/>
                    <lines>
                        <line branch="false" hits="0" number="109"/>
                        <line branch="false" hits="0" number="123"/>
                    </lines>
                </class>
            </classes>
        </package>
        <package line-rate="0.5944055944055944" branch-rate="0.0" name="src.services.guava.vault" complexity="0">
            <classes>
                <class branch-rate="0.0" complexity="0" filename="src/services/secrets/vault/vault.go" line-rate="0.5944055944055944" name="src.services.guava.vault.vault.go">
                    <methods/>
                    <lines>
                        <line branch="false" hits="1" number="251"/>
                        <line branch="false" hits="1" number="253"/>
                    </lines>
                </class>
            </classes>
        </package>
    </packages>
</coverage>

Expected output below, which works with SonarQube:

<?xml version="1.0" ?>
<!DOCTYPE coverage
  SYSTEM 'http://cobertura.sourceforge.net/xml/coverage-04.dtd'>
<coverage branch-rate="0.0" branches-covered="true" branches-valid="0" complexity="0" lineToCover-rate="0.5936254980079682" lineToCovers-covered="447" lineToCovers-valid="753" timestamp="1672173715" version="1">
        <file branch-rate="0.0" complexity="0" path="src/services/guava/keys/keys.go" lineToCover-rate="0.7614678899082569" name="src.services.guava.keys.keys.go">
                <lineToCover branch="false" covered="false" lineNumber="16"/>
                <lineToCover branch="false" covered="false" lineNumber="17"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/guava/server/server.go" lineToCover-rate="0.744" name="src.services.guava.server.server.go">
                <lineToCover branch="false" covered="false" lineNumber="153"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/secrets/tools/keyrotate/cmd_get.go" lineToCover-rate="0.0" name="src.services.guava.tools.keyrotate.cmd_get.go">
                <lineToCover branch="false" covered="true" lineNumber="16"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/secrets/tools/keyrotate/cmd_rotate.go" lineToCover-rate="0.72" name="src.services.guava.tools.keyrotate.cmd_rotate.go">
                <lineToCover branch="false" covered="false" lineNumber="75"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/secrets/tools/keyrotate/main.go" lineToCover-rate="0.0" name="src.services.guava.tools.keyrotate.main.go">
                <lineToCover branch="false" covered="true" lineNumber="80"/>
                <lineToCover branch="false" covered="true" lineNumber="81"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/guava/vault/vault.go" lineToCover-rate="0.5944055944055944" name="src.services.guava.vault.vault.go">
                <lineToCover branch="false" covered="false" lineNumber="251"/>
                <lineToCover branch="false" covered="false" lineNumber="253"/>
        </file>
</coverage>

CodePudding user response:

To edit an XML file, there are dedicated tools such as xmlstarlet that let you edit the XML properly. Please avoid using sed and regular expressions to parse XML.

To delete a node (using xmlstarlet):

xmlstarlet edit -d '//node' file.xml

To edit the @version attribute of a node:

xmlstarlet ed -Lu '/coverage/@version' -v '1' file.xml

This updates the version in line with good practice.

The -L option is the equivalent of sed -i '': it edits the file in place.

To install on Mac:

brew install xmlstarlet
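For the concern that deleting //package removes the whole block: a parser-based alternative is to build a new flat tree from the class nodes instead of deleting wrapper lines. The sketch below uses Python's standard xml.etree.ElementTree rather than xmlstarlet. The function name cobertura_to_generic is hypothetical, the rule "covered when hits > 0" is an assumption, and only the path, lineNumber and covered attributes are emitted (the other attributes shown in the expected output are dropped), so treat it as a starting point rather than a finished converter.

```python
import xml.etree.ElementTree as ET


def cobertura_to_generic(xml_text: str) -> str:
    """Build a flat <coverage version="1"> document from Cobertura <class> nodes."""
    src = ET.fromstring(xml_text)
    dst = ET.Element("coverage", {"version": "1"})
    # iter() reaches every <class> no matter how many wrapper elements
    # (<packages>, <package>, <classes>) sit above it.
    for cls in src.iter("class"):
        file_el = ET.SubElement(dst, "file", {"path": cls.get("filename", "")})
        # Only the class's own <lines>/<line> children, not lines nested in <methods>.
        for line in cls.findall("lines/line"):
            hits = int(line.get("hits", "0"))
            ET.SubElement(file_el, "lineToCover", {
                "lineNumber": line.get("number", ""),
                # Assumption: a line counts as covered when hits > 0.
                "covered": "true" if hits > 0 else "false",
            })
    return ET.tostring(dst, encoding="unicode")
```

Because the output tree is built from scratch, the wrapper elements never need to be deleted, and the result stays well-formed regardless of how the input is indented.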

CodePudding user response:

I hope this works for you:

import re

with open('./stackoverflow_xml_replace.txt', 'r') as f:
    lines = f.readlines()

output = []
for line in lines:
    # Skip <source> lines rather than calling lines.remove() while iterating,
    # which would make the loop jump over the line after each removal.
    if '<source' in line:
        continue
    if '<coverage' in line:
        line = line.replace('version="2.0.3"', 'version="1"')
    # \b limits each rename to the whole word, so "classes" is left alone.
    line = re.sub(r"\bclass\b", "file", line)
    line = re.sub(r"\bfilename\b", "path", line)
    if '<line' in line:
        line = line.replace('line', 'lineToCover')
    line = re.sub(r"\bnumber\b", "lineNumber", line)
    output.append(line)

with open('./stackoverflow_xml_replace_output.txt', 'w') as f:
    f.writelines(output)
print(output)
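The \b word boundaries in the script above are what separate it from a plain s/class/file/g. A small illustration of the difference follows; rename_word is a helper name invented for this example.

```python
import re


def rename_word(line: str, old: str, new: str) -> str:
    """Replace `old` only where it stands as a whole word."""
    return re.sub(rf"\b{re.escape(old)}\b", new, line)


print(rename_word('<class name="a.go">', "class", "file"))  # <file name="a.go">
print(rename_word("</classes>", "class", "file"))           # </classes> (unchanged)
print("</classes>".replace("class", "file"))                # </filees>
```

Note that a hyphen still counts as a word boundary, so renaming line this way would also turn line-rate into lineToCover-rate.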