Delete multiline comments in xml - powershell

Time:01-16

I want to get the content of an XML file without its comments. This partially works, but one multi-line comment is not removed.

Source XML:

<add key="Service.Cat" value="xxxxxx" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
    <counters />
</add>
<!--
<add key="XXXX" value="SQL;Persist Security Info=False;User ID=xxxxx;Password=;Initial Catalog=xxxxx;Data Source=xxxxxx" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
    <counters />
</add>
<add key="LB" value="SQL;Initial Catalog=FTU;Data Source=xxx.xxx.xxx;Integrated Security=xxxx" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
    <counters />
</add>-->
<add key="xxxxxx" value="Initial Catalog=xzxxx;Data Source=xxxxxx;Trusted_Connection=yes;" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
    <counters />

PowerShell code:

Get-Content -Encoding UTF8 -Path 'C:\config1.xml'| % {$_ -replace ("<!--([\s\S] ?)-->","") } | out-file C:\config_temp.xml

But the multi-line comment remains, while the single-line comments are removed as expected.
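A likely explanation, sketched under the assumption that the pattern was meant to use a lazy quantifier such as `[\s\S]*?`: without `-Raw`, `Get-Content` sends the file down the pipeline one line at a time, so `-replace` only ever sees a single line and cannot match a comment whose `<!--` and `-->` sit on different lines. The sample lines and variable names below are hypothetical and exist only to show the difference.

```powershell
# Hypothetical sample: one single-line comment and one multi-line comment.
$lines = @(
    '<config>'
    '  <!-- single -->'
    '  <!--'
    '  <add key="old" />'
    '  -->'
    '  <add key="new" />'
    '</config>'
)

# Line by line (how Get-Content feeds the pipeline by default):
# each line is matched separately, so a comment spanning lines never matches.
$perLine = ($lines | ForEach-Object { $_ -replace '<!--[\s\S]*?-->', '' }) -join "`n"

# One string (what Get-Content -Raw would return): the match can span lines.
$whole = ($lines -join "`n") -replace '<!--[\s\S]*?-->', ''

$perLine -match 'key="old"'   # True: the multi-line comment survived
$whole   -match 'key="old"'   # False: the multi-line comment was removed
```

Reading the file with `Get-Content -Raw` would therefore let the regex reach across line breaks, although a regex still knows nothing about XML structure.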

CodePudding user response:

You should not treat XML as ordinary text, but instead use PowerShell's XML capabilities for this.

If your XML file looks anything like this:

<config>
    <add key="Service.Cat" value="xxxxxx" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
        <counters />
    </add>
    <!--    
        <add key="XXXX" value="SQL;Persist Security Info=False;User ID=xxxxx;Password=;Initial Catalog=xxxxx;Data Source=xxxxxx" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
            <counters /></add><add key="LB" value="SQL;Initial Catalog=FTU;Data Source=xxx.xxx.xxx;Integrated Security=xxxx" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss"><counters />
        </add>
    -->
    <add key="xxxxxx" value="Initial Catalog=xzxxx;Data Source=xxxxxx;Trusted_Connection=yes;" provider-name="SQL" date-format="yyyy-MM-dd HH:mm:ss">
        <counters />
    </add>
    <!-- This is a single line comment -->
</config>

and you want to remove ALL comments from it, do this:

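# Load the file through XmlDocument so the encoding declared in the file is honored
$xml = [System.Xml.XmlDocument]::new()
$xml.Load('X:\FullPath\To\TheFile.xml')

# Select every comment node; @() copies them into an array so that
# removing nodes does not disturb the list being iterated
$commentNodes = @($xml.SelectNodes("//comment()"))
foreach ($node in $commentNodes) {
    [void]$node.ParentNode.RemoveChild($node)
}
# Overwrite the original file with the comment-free document
$xml.Save('X:\FullPath\To\TheFile.xml')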

If you want to remove only the multiline comments and leave the single line comments intact, do this:

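# Load the file through XmlDocument so the encoding declared in the file is honored
$xml = [System.Xml.XmlDocument]::new()
$xml.Load('X:\FullPath\To\TheFile.xml')

# Select every comment node; @() copies them into an array so that
# removing nodes does not disturb the list being iterated
$commentNodes = @($xml.SelectNodes("//comment()"))
foreach ($node in $commentNodes) {
    # A line break inside the comment text marks it as a multi-line comment
    if ($node.InnerText -match '\r?\n') {
        [void]$node.ParentNode.RemoveChild($node)
    }
}
# Overwrite the original file; single-line comments are kept
$xml.Save('X:\FullPath\To\TheFile.xml')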
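To try both variants without touching a file on disk, the same logic can be wrapped in a small function that works on an XML string. This is only a sketch: the function name `Remove-XmlComment`, its `-MultilineOnly` switch, and the sample string are hypothetical and not part of the code above.

```powershell
# Hypothetical helper: removes comments from an XML string and returns the result.
function Remove-XmlComment {
    param(
        [Parameter(Mandatory)] [string] $Xml,
        [switch] $MultilineOnly   # when set, single-line comments are kept
    )
    $doc = [System.Xml.XmlDocument]::new()
    $doc.LoadXml($Xml)
    # @(...) copies the matched nodes into an array before the document changes.
    foreach ($node in @($doc.SelectNodes('//comment()'))) {
        if ((-not $MultilineOnly) -or ($node.InnerText -match '\n')) {
            [void]$node.ParentNode.RemoveChild($node)
        }
    }
    $doc.OuterXml
}

# Sample with one single-line and one multi-line comment.
$sample = "<config><!-- one line --><!--`n<add key='old' />`n--><add key='new' /></config>"

Remove-XmlComment -Xml $sample                 # both comments removed
Remove-XmlComment -Xml $sample -MultilineOnly  # only the multi-line comment removed
```

Because the comment is a node in the parsed document, its content (including any markup-like text inside it) never has to be matched by a pattern.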