I need to detect any newline characters in a file using PowerShell or a batch file and, if any are present, remove them.
CodePudding user response:
I am afraid I don't really understand what you want. You didn't post an input file, nor did you specify the output you want from such an input. Anyway, I hope this code can help:
@echo off
setlocal EnableDelayedExpansion
rem Create a test file
set LF=^
%don't remove%
%these lines%
(
echo Line One: CR LF
set /P "=Line Two: LF!LF!"
echo Line Three: CR LF
) > test.txt < NUL
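rem Read the file. findstr /O prefixes each line with its byte offset, so the
rem gap between two offsets is the previous line's length plus its terminator.
set "acum=0"
(for /F "tokens=1* delims=:" %%a in ('findstr /O "^" test.txt') do (
    if not defined line (
        set "line=%%b"
    ) else (
        rem len is the line length if it ended in CR LF, one less if it ended in LF
        set /A "len=%%a-acum-2, acum=%%a"
        for %%n in (!len!) do if "!line:~%%n!" equ "" (
            echo !line!
        ) else (
            set /P "=!line!"
        )
        set "line=%%b"
    )
)) < NUL
rem Process the last line, using the file size as its end offset
for %%a in (test.txt) do set /A "len=%%~Za-acum-2"
(for %%n in (!len!) do if "!line:~%%n!" equ "" (
    echo !line!
) else (
    set /P "=!line!"
)) < NUL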
Output:
Line One: CR LF
Line Two: LFLine Three: CR LF
This example first creates a file with three lines, the second of which ends in LF instead of CR LF. The program then identifies how each line ends and removes the lone LFs.
The method is based on the findstr /O switch, which reports the byte offset of the start of each line, counted from the beginning of the file.
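For comparison, the same detection can be sketched in PowerShell by matching the terminators with a regex instead of computing offsets. This is only a sketch: it assumes the file fits in memory, and the sample path in the temp folder is a hypothetical stand-in for the real file.

```powershell
# Build the same three-line sample: the second line ends in a lone LF.
# (The path is a hypothetical sample location, not part of the original answer.)
$path = Join-Path ([IO.Path]::GetTempPath()) 'test.txt'
[IO.File]::WriteAllText($path, "Line One: CR LF`r`nLine Two: LF`nLine Three: CR LF`r`n")

$text = [IO.File]::ReadAllText($path)

# Count CR LF pairs, and LFs that are not preceded by a CR.
$crlfCount   = [regex]::Matches($text, '\r\n').Count
$loneLfCount = [regex]::Matches($text, '(?<!\r)\n').Count
"CR LF terminators: $crlfCount, lone LFs: $loneLfCount"

# Remove only the lone LFs, leaving the CR LF terminators untouched.
$fixed = $text -replace '(?<!\r)\n', ''
$fixed
```

The negative lookbehind `(?<!\r)` is what separates a lone LF from the LF inside a CR LF pair, which is the same distinction the batch file derives from the offsets.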
CodePudding user response:
In a comment you state:
each record starts with DTL
It sounds like the way to fix your file is to remove any newlines that aren't followed by verbatim DTL|:
# Create sample file.
@'
DTL|foo1
DTL|foo2
result of an unwanted
newline or two
DTL|foo3
'@ > test.txt
# Replace all newlines not directly followed by verbatim 'DTL|'
# with a space (remove `, ' '` if you simply want to remove the newlines).
# Pipe to Set-Content in order to save to a file as needed.
(Get-Content -Raw test.txt) -replace '\r?\n(?!DTL\|)', ' '
Output:
DTL|foo1
DTL|foo2 result of an unwanted newline or two
DTL|foo3
For an explanation of the regex used with the -replace operator above and the ability to experiment with it, see this regex101.com page.
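To save the result, the output of the -replace operation can be piped to Set-Content. The following is a minimal sketch, assuming a hypothetical output file name fixed.txt. Note that the file's final newline is not followed by DTL| either, so the regex turns it into a trailing space; the sketch trims that before writing.

```powershell
# Recreate the sample input (Set-Content writes one line per array element).
'DTL|foo1', 'DTL|foo2', 'result of an unwanted', 'newline or two', 'DTL|foo3' |
    Set-Content test.txt

# Join continuation lines onto the record they belong to.
$fixed = (Get-Content -Raw test.txt) -replace '\r?\n(?!DTL\|)', ' '

# The last newline became a trailing space; trim it, then let Set-Content
# append a single newline when writing the (hypothetical) output file.
$fixed.TrimEnd() | Set-Content fixed.txt

Get-Content fixed.txt
```

Writing to a separate file rather than overwriting test.txt keeps the original available for comparison while the pattern is being tuned.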