I have a GPX file, which is just XML, and want to run a PowerShell script to delete the <time> node.
<trkpt lat="-33.483478" lon="150.159805">
<name> p2 </name>
<time>2021-02-23T00:00:12Z</time>
</trkpt>
<trkpt lat="-33.483852" lon="150.158309">
<name> p3 </name>
<time>2021-02-23T00:00:56Z</time>
</trkpt>
<trkpt lat="-33.483943" lon="150.157897">
<name> p4 </name>
<time>2021-02-23T00:01:07Z</time>
</trkpt>
<trkpt lat="-33.484066" lon="150.157592">
<name> p5 </name>
<time>2021-02-23T00:01:17Z</time>
</trkpt>
Each line ends in just LF (\n). I want to delete the <time> node, including the newline.
I know I have the correct newline (EOL) because I can see it clearly in Notepad, and the regex <time>(.*?)</time>\n works perfectly there.
So I use PowerShell with this code:
(gc test.gpx) -replace '<time>(.*?)</time>`n', '' | Out-File -encoding ASCII processed1.gpx
All my research shows that the newline for PowerShell is `n (not \n). I have also tried `r`n and double quotes, "`n" or "`r`n", just in case, and it's just not working. I have searched similar questions and their answers don't seem to work for me.
Help appreciated!!
Ben
CodePudding user response:
Note: For robustness, it is always preferable to use a dedicated XML parser to manipulate XML, such as the .NET [xml] (System.Xml.XmlDocument) type; see the bottom section.
As for what you tried:
- Get-Content (gc) reads files line by line by default, and since the resulting lines have any trailing newline removed from them, the -replace operator by definition cannot find any newlines to match; and, because an array of strings (lines) is provided as the input, -replace operates on each line.
  - Add the -Raw switch to read the entire file in full, as a single, multi-line string instead.
- While it is true that you need the escape sequence `n to represent a newline (LF) character in PowerShell, that only works in expandable (double-quoted) strings ("...").
  - While you could change the quoting of the regex to "...", the better approach is to use a verbatim (single-quoted) string ('...') and use the regex escape sequence, \n, to represent a newline (which PowerShell simply passes through to the .NET regex engine that underlies its regex features, such as the -replace operator).
  - Additionally, you may want to use \r?\n in order to handle both Windows-format CRLF and Unix-format LF-only newlines.
Therefore (note that omitting the replacement string is the same as passing ''):
(gc -Raw test.gpx) -replace '<time>(.*?)</time>\r?\n'
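To see both points in isolation, here is a minimal sketch that works on an in-memory sample rather than on test.gpx. The variable names ($lines, $text, $lineWise, $raw) are made up for illustration; $lines stands in for what Get-Content returns by default, and $text for what Get-Content -Raw returns.

```powershell
# Hypothetical sample: one track point, split into lines.
$lines = '<trkpt lat="-33.483478" lon="150.159805">',
         '<name> p2 </name>',
         '<time>2021-02-23T00:00:12Z</time>',
         '</trkpt>'

# The same content as one multi-line string with LF-only newlines.
$text = ($lines -join "`n") + "`n"

# Line-wise input: no element contains a newline, so nothing matches
# and the <time> line survives unchanged.
$lineWise = $lines -replace '<time>(.*?)</time>\r?\n'

# Single string: the regex escape \n matches the LF, so the whole line is removed.
$raw = $text -replace '<time>(.*?)</time>\r?\n'

# `n is only an escape sequence inside double quotes.
$singleQuotedLength = ('`n').Length   # backtick + n, two literal characters
$doubleQuotedLength = ("`n").Length   # one LF character
```

If the result is written back to a file, something like Set-Content -NoNewline processed1.gpx avoids appending an extra trailing newline to the already complete string.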
XML-parsing solution:
# Sample input, wrapped in a <xml> element.
# To load from a file, use .Load() with a *full file path*:
# ($xml = [xml]::new()).Load("$PWD/test.gpx")
($xml = [xml]::new()).LoadXml(@'
<xml>
<trkpt lat="-33.483478" lon="150.159805">
<name> p2 </name>
<time>2021-02-23T00:00:12Z</time>
</trkpt>
<trkpt lat="-33.483852" lon="150.158309">
<name> p3 </name>
<time>2021-02-23T00:00:56Z</time>
</trkpt>
<trkpt lat="-33.483943" lon="150.157897">
<name> p4 </name>
<time>2021-02-23T00:01:07Z</time>
</trkpt>
<trkpt lat="-33.484066" lon="150.157592">
<name> p5 </name>
<time>2021-02-23T00:01:17Z</time>
</trkpt>
</xml>
'@)
$xml.xml.ChildNodes.ForEach({
  $parent = $_
  $null = $parent.ChildNodes.
    Where({ $_.name -eq 'time' }).
    ForEach({ $parent.RemoveChild($_) })
})
# Use $xml.Save - with a full output file path - to save the modified XML:
# $xml.Save("$PWD/processed1.gpx")
Note: The above doesn't create pretty-printed XML output, i.e. any original pretty-printing is lost. If pretty-printing is desired, there are two options:
- Set .PreserveWhitespace to $true on the [xml] instance before calling .Load() / .LoadXml(); however, that may leave a blank line for each element removed.
- Re-perform pretty-printing on saving; see the bottom section of this answer.
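As a rough illustration of the second option, the following sketch removes the <time> elements and then re-indents the output through a System.Xml.XmlWriter. It is one possible approach under stated assumptions, not necessarily what the linked answer does: the one-line sample document and the variable names are hypothetical, and the local-name() XPath test is used so that the lookup would also work if the file declared a default namespace.

```powershell
# Hypothetical one-line sample; for a real file, call .Load() with a full path instead.
$xml = [xml]::new()
$xml.LoadXml('<xml><trkpt lat="-33.483478" lon="150.159805"><name> p2 </name><time>2021-02-23T00:00:12Z</time></trkpt></xml>')

# Collect the matching nodes into an array first, then remove them,
# so the document is not modified while the node list is being enumerated.
foreach ($node in @($xml.SelectNodes("//*[local-name()='time']"))) {
    $null = $node.ParentNode.RemoveChild($node)
}

# Ask the writer to indent the output.
$settings = [System.Xml.XmlWriterSettings]::new()
$settings.Indent = $true
$settings.OmitXmlDeclaration = $true

# Write to a string here; passing a full file path to XmlWriter.Create() would write a file instead.
$stringWriter = [System.IO.StringWriter]::new()
$writer = [System.Xml.XmlWriter]::Create($stringWriter, $settings)
$xml.Save($writer)
$writer.Close()

$pretty = $stringWriter.ToString()
```

$pretty then holds the indented XML without any <time> elements; the text inside <name> is left untouched.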
CodePudding user response:
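Open a PowerShell window in the directory of your GPX file.
Use this regex: (?<=beginningstringname)(.*\n?)(?=endstringname)
Run this command (the lookarounds match only the text between the tags, so the tags themselves and the newline are left in place):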
get-content test.gpx | %{$_ -replace "(?<=<time>)(.*?)(?=<\/time>)",""}
Then
get-content test.gpx | %{$_ -replace "findText","replaceText"}