powershell replace newline not working: `n

Time:12-25

I have a GPX file, which is just XML, and I want to run a PowerShell script to delete the <time> node.

    <trkpt lat="-33.483478" lon="150.159805">
    <name> p2 </name>
    <time>2021-02-23T00:00:12Z</time>
    </trkpt>
    <trkpt lat="-33.483852" lon="150.158309">
    <name> p3 </name>
    <time>2021-02-23T00:00:56Z</time>
    </trkpt>
    <trkpt lat="-33.483943" lon="150.157897">
    <name> p4 </name>
    <time>2021-02-23T00:01:07Z</time>
    </trkpt>
    <trkpt lat="-33.484066" lon="150.157592">
    <name> p5 </name>
    <time>2021-02-23T00:01:17Z</time>
    </trkpt>

Each line ends in just LF (\n). I want to delete the <time> node, including the newline.

I know I have the correct newline (EOL) because I can see it clearly in Notepad, and this regex works perfectly there: <time>(.*?)</time>\n

So I use PowerShell with this code:

(gc test.gpx) -replace '<time>(.*?)</time>`n', '' | Out-File -encoding ASCII processed1.gpx

All my research shows that the newline escape in PowerShell is `n (not \n). I have also tried `r`n, and double quotes ("`n" or "`r`n") just in case, and it's just not working. I have searched similar questions, and their answers don't seem to work for me.

Help appreciated!!

Ben

CodePudding user response:

Note: For robustness, it is always preferable to use a dedicated XML parser to manipulate XML, such as the .NET [xml] (System.Xml.XmlDocument) type - see the bottom section.

As for what you tried:

  • Get-Content (gc) reads files line by line by default, and since the resulting lines have any trailing newline removed from them, the -replace operator by definition cannot find any newlines to match - and, because an array of strings (lines) is provided as the input, -replace operates on each line separately.

    • Add the -Raw switch to read the entire file in full, as a single, multi-line string instead.
  • While it is true that you need escape sequence `n to represent a newline (LF) character in PowerShell, that only works in expandable (double-quoted) strings ("...").

    • While you could change the quoting of the regex to "...", the better approach is to use a verbatim (single-quoted) string ('...') and use the regex escape sequence, \n, to represent a newline (which PowerShell simply passes through to the .NET regex engine that underlies its regex features, such as the -replace operator).
    • Additionally, you may want to use \r?\n in order to handle both Windows-format CRLF and Unix-format LF-only newlines.

Therefore (note that omitting the replacement string is the same as passing ''):

(gc -Raw test.gpx) -replace '<time>(.*?)</time>\r?\n'
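
To make the two points above concrete, here is a minimal, self-contained sketch that uses an in-memory string in place of test.gpx. The variable names ($raw, $lines, $unchanged, $fixed) are illustrative and not from the question, and splitting on LF only approximates what Get-Content returns without -Raw.

```powershell
# Sample content with LF-only newlines, standing in for the file (illustrative).
$raw = @(
  '<trkpt lat="-33.483478" lon="150.159805">'
  '<name> p2 </name>'
  '<time>2021-02-23T00:00:12Z</time>'
  '</trkpt>'
  ''
) -join "`n"

# Approximates Get-Content without -Raw: an array of lines,
# none of which contains a newline character.
$lines = $raw -split "`n"

# In a single-quoted string, `n is a literal backtick followed by "n",
# so this pattern never matches and the lines come back unchanged.
$unchanged = $lines -replace '<time>(.*?)</time>`n', ''

# On the whole string, the regex escape \n matches the LF,
# and the optional \r also covers CRLF input.
$fixed = $raw -replace '<time>(.*?)</time>\r?\n'
$fixed
```

To write the result to a file as in the question, the expression can be piped to Out-File or Set-Content. Both append a trailing newline by default, so with -Raw input you may want the -NoNewline switch (available in PowerShell 5.0 and later).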

XML-parsing solution:

# Sample input, wrapped in a <xml> element.
# To load from a file, use .Load() with a *full file path*:
#   ($xml = [xml]::new()).Load("$PWD/test.gpx")
($xml = [xml]::new()).LoadXml(@'
<xml>
  <trkpt lat="-33.483478" lon="150.159805">
    <name> p2 </name>
    <time>2021-02-23T00:00:12Z</time>
  </trkpt>
  <trkpt lat="-33.483852" lon="150.158309">
    <name> p3 </name>
    <time>2021-02-23T00:00:56Z</time>
  </trkpt>
  <trkpt lat="-33.483943" lon="150.157897">
    <name> p4 </name>
    <time>2021-02-23T00:01:07Z</time>
  </trkpt>
  <trkpt lat="-33.484066" lon="150.157592">
    <name> p5 </name>
    <time>2021-02-23T00:01:17Z</time>
  </trkpt>
</xml>
'@)

$xml.xml.ChildNodes.ForEach({ 
  $parent = $_
  $null = $parent.ChildNodes.
    Where({ $_.name -eq 'time' }).
    ForEach({ $parent.RemoveChild($_) }) 
})

# Use $xml.Save - with a full output file path - to save the modified XML:
#    $xml.Save("$PWD/processed1.gpx")

Note: The above doesn't create pretty-printed XML output, i.e. any original pretty-printing is lost. If pretty-printing is desired, there are two options:

  • Set .PreserveWhitespace to $true on the [xml] instance before calling .Load() / .LoadXml() - however, that may leave a blank line for each element removed.

  • Re-perform pretty printing on saving - see the bottom section of this answer.
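
As a rough sketch of the first option, the following shows one possible way to avoid the leftover blank line: remove the whitespace node that precedes each <time> element along with the element itself. This is an assumption about how one might handle it, not part of the answer above, and it uses a shortened in-memory sample rather than the real file.

```powershell
# Shortened sample with LF newlines and indentation (illustrative).
$text = @(
  '<xml>'
  '  <trkpt lat="-33.483478" lon="150.159805">'
  '    <name> p2 </name>'
  '    <time>2021-02-23T00:00:12Z</time>'
  '  </trkpt>'
  '</xml>'
) -join "`n"

$xml = [xml]::new()
$xml.PreserveWhitespace = $true   # must be set before loading
$xml.LoadXml($text)

# @(...) takes a snapshot of the live node list so that removing
# nodes does not disturb the enumeration.
foreach ($time in @($xml.GetElementsByTagName('time'))) {
  $parent = $time.ParentNode
  $prev = $time.PreviousSibling
  # Drop the newline-plus-indentation node in front of the element, if present.
  if ($prev -is [System.Xml.XmlWhitespace]) { $null = $parent.RemoveChild($prev) }
  $null = $parent.RemoveChild($time)
}

$xml.OuterXml
```

Removing the preceding whitespace node rather than the following one keeps the indentation of the parent's closing tag intact.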

CodePudding user response:

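Open a PowerShell session in the directory of your GPX file.

Use this regex pattern: (?<=beginningstringname)(.*\n?)(?=endstringname)

Run this command (note that it empties the content of each <time> element but leaves the <time></time> tags and the line itself in place):

    get-content test.gpx | %{$_ -replace "(?<=<time>)(.*?)(?=<\/time>)",""}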

The general form is:

get-content test.gpx | %{$_ -replace "findText","replaceText"}
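
If the aim is to drop the whole <time> line while still processing the file line by line, one hedged alternative (not part of this answer) is to filter lines instead of replacing text. The sketch below compares the two on an in-memory array; $lines, $emptied and $kept are illustrative names, and it assumes each <time> element sits alone on its line, as in the question's sample.

```powershell
# Lines as Get-Content would return them (illustrative sample).
$lines = @(
  '<trkpt lat="-33.483478" lon="150.159805">'
  '<name> p2 </name>'
  '<time>2021-02-23T00:00:12Z</time>'
  '</trkpt>'
)

# Lookaround replacement: only the text between the tags is removed.
$emptied = $lines -replace '(?<=<time>)(.*?)(?=</time>)', ''

# Filtering: lines that consist solely of a <time> element are dropped.
$kept = $lines | Where-Object { $_ -notmatch '^\s*<time>.*?</time>\s*$' }
$kept
```

Because the filtered lines never contained newline characters in the first place, no newline pattern is needed with this approach.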
