Batch - findstr with an "inverse class" regex does not work properly

Time:11-12

I have the following problem: I try to pass a string containing some kind of separator to a function, e.g. string=This.string.contains.dots.as.separators

In the case above I can see that the separators are dots, but at runtime someone should also be able to pass a string like: string=This-string-contains-hyphens-as-separators

My challenge is to detect the separator. To do that, I loop through the given string (such as the two examples above) character by character; any character that is not in the set a-z (the letters of the alphabet) must be the separator.

I try to find the separator using an "inverse class" with findstr. The odd thing is that the findstr command works if I paste it directly into the Windows cmd shell. For example, the following works:

echo .|findstr "[^a-z]"

It finds the dot and returns errorlevel 0 if the input contains anything other than a-z (which must then be the separator). To recap: I walk through the string character by character and at some point reach the separator.

Here is my code snippet:

@echo off
SETLOCAL ENABLEDELAYEDEXPANSION

:MAIN
setlocal
REM Call a function with an string and it shall return the separator in the string
call :GET_THE_SEPARATOR "This.string.is.separated.by.dots" "return_value"
echo The separator is = !return_value!
exit /b 0

:GET_THE_SEPARATOR
setlocal
set "separated_string=%~1"
REM Iterate over the first 10 chars of the string
for /l %%c in (0 1 10) do (
    set "act_char=!separated_string:~%%c,1!"
    REM Search for every char not in the alphabet (means to get i.e.: . , -)
    REM Unfortunately the findstr SYNTAX here seems to be broken, but works if you
    REM directly paste it into a console.: i.e.: echo .|findstr "[^a-z]"
    echo.!act_char! | findstr ""[^^a-z]"" >NUL && (
        set "separator=!act_char!
    )
)
REM Return the value
(endlocal 
    if "%~2" neq "" (set "%~2=%separator%")
)
exit /b 0

CodePudding user response:

@ECHO OFF
SETLOCAL

SET "alphabet=a b c d e f g h i j k l m n o p q r s t u v w x y z"

CALL :getsep "This.string.is.separated.by.dots" separators
ECHO %separators% IN %originalstring%
CALL :getsep "This-string-is-separated-by-dashes" separators
ECHO %separators% IN %originalstring%
CALL :getsep "This string is separated by spaces" separators
ECHO %separators% IN %originalstring%
CALL :getsep "This*string*is*separated*by*stars" separators
ECHO %separators% IN %originalstring%
GOTO :eof

:getsep
SET "originalstring=%~1"
SET "seps=%~1"
SETLOCAL ENABLEDELAYEDEXPANSION
FOR %%z IN (%alphabet%) DO SET "seps=!seps:%%z=!"
endlocal&SET "%2=%seps%"
GOTO :eof

Simply replace each alphabetic character in the original string with nothing; whatever remains is the separator characters.

If the separator characters are homogeneous, the first character of the returned string is the separator; otherwise it is just a matter of removing duplicate characters.
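
As a rough sketch of that last step, the snippet below takes the first remaining character and checks whether every other remaining character matches it. It assumes seps already holds the value returned by :getsep (hard-coded here for illustration); first, rest and mixed are hypothetical helper names that do not appear in the answer's script.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION

REM Assumption: seps holds what :getsep returned, hard-coded here for illustration
SET "seps=.-.."

REM first, rest and mixed are hypothetical helper names, not part of the answer
SET "first=!seps:~0,1!"
SET "rest=!seps!"
SET "mixed="

:scan
IF NOT DEFINED rest GOTO report
REM Flag the string as mixed when any character differs from the first one
IF NOT "!rest:~0,1!"=="!first!" SET "mixed=1"
REM Drop the character just examined; an empty result undefines rest
SET "rest=!rest:~1!"
GOTO scan

:report
IF DEFINED mixed (
    ECHO More than one kind of separator in "!seps!"
) ELSE (
    ECHO The separator is "!first!"
)
ENDLOCAL
```

The comparison goes through delayed expansion, so characters such as * are compared literally rather than being interpreted by the parser.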

CodePudding user response:

A different approach that can report multiple separators and their positions in the string:

@Echo off

Set "ExampleString=test-String-example^|three^&four^>five*six^<seven=eight:nine_ten^^b~eleven!twelve!"

Setlocal EnableDelayedExpansion
Set "TestString=!%~1!"

If not defined TestString (
    Endlocal
    Call %~n0 "ExampleString"
    Goto:Eof
)

Set /A "[#]=1","Pos[#]=-1"
REM start count of character position at 0 index for potential use in substring modification
REM Set TestString

REM The parsing method below splits a string into its component characters.
For /f "usebackq Delims=" %%G in (`%Systemroot%\System32\cmd.exe /u /c ^"Echo(!TestString!^"^|%Systemroot%\System32\find.exe /v "[false_match_%~n0]"^|%Systemroot%\System32\findstr.exe "^^"`)Do (
    Set Char=^^^%%~G
    For /f "UseBackQ Delims=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" %%c in (`"Echo("!Char!""`) Do (
        Set /A Pos[#]+=1
      If not "^%%~c" == "^^" (
            If "^^^%%~c" == "^^^!"  (
                Set Delim.![#]!.@Char.!Pos[#]!=^^^%%~c
            ) Else Set Delim.![#]!.@Char.!Pos[#]!=^%%~c
        )
        If not "^%%~c" == "^^" Set /A [#]+=1
    ) 
)

Set Delim
Pause

REM // perform any actions required with the delim information here or modify the script to return the values across the endlocal.
Endlocal & Goto:eof