Regex: how to make a space optional while keeping it out of the replacement output

Time:08-03

I have tried many variations, but none of them works completely; the match is about 99% right.

I want the spaces to be optional, and I want groups 1 to 5 in the replacement to contain no trailing space: (.sys), not (.sys ).

regex search:

(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<file>.+(?=\.)|.+)(?<type>(?:\..*)?)\s*\|\s*(?<path>(?i:C|D):.*\\)

regex replace:

(\1)(\2)(\3)(\4)(\5)

Text:

3.9 GB pagefile.sys | C:\
3.9 GB pagefile.sys |C:\
3.9 GB pagefile.sys| C:\
3.9 GB pagefile.sys|C:\

3.9 GB pagefile.sys | C:\
3.9 GBpagefile.sys | C:\
3.9GB pagefile.sys | C:\
3.9GBpagefile.sys | C:\

expected behavior:

(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)

(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)

actual behavior:

(3.9)(GB)(pagefile)(.sys )(C:\)
(3.9)(GB)(pagefile)(.sys )(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)
(3.9)(GB)(pagefile)(.sys)(C:\)

(3.9)(GB)(pagefile)(.sys )(C:\)
(3.9)(GB)(pagefile)(.sys )(C:\)
(3.9)(GB)(pagefile)(.sys )(C:\)
(3.9)(GB)(pagefile)(.sys )(C:\)

See the demo on regex101.com.

Can anyone help?

CodePudding user response:

You see an extra space in the replacement because the .* in the type group (?<type>(?:\..*)?) can also match a space. When the engine backtracks to find the pipe, it stops as soon as \s*\| can match, which leaves the space inside the group.

You could restrict it by using \S* instead, which matches optional non-whitespace chars after the dot, so the group can no longer swallow the space.

The alternation for the size_type can also be written using character classes, (?<size_type>(?i)[gm]b|[mg]), and the same goes for the path: (?<path>(?i:[CD]):.*\\)
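To see the difference in isolation, here is a minimal Python sketch that keeps only the file, type, and pipe part of the pattern. It is an illustration, not the original demo: Python's `re` module is assumed instead of the PCRE flavor used in the question, so named groups are written `(?P<name>...)`, and the variable names are made up for this example.

```python
import re

# Tail of a sample line: file name, extension, pipe, path.
tail = "pagefile.sys | C:\\"

# Original type group: ".*" may also consume whitespace.
greedy = re.compile(r"(?P<file>.+(?=\.)|.+)(?P<type>(?:\..*)?)\s*\|")
# Restricted type group: "\S*" stops at the first whitespace char.
strict = re.compile(r"(?P<file>.+(?=\.)|.+)(?P<type>(?:\.\S*)?)\s*\|")

greedy_type = greedy.search(tail).group("type")
strict_type = strict.search(tail).group("type")

print(repr(greedy_type))  # '.sys '  (trailing space captured)
print(repr(strict_type))  # '.sys'
```

With `.*`, the engine gives back characters only until `\s*\|` can match, so the space before the pipe stays in the group. With `\S*`, the space has to be consumed by `\s*` outside the group.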

The whole pattern could look like:

(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)[gm]b|[mg])[\t\x20]*(?<file>.+(?=\.)|.+)(?<type>(?:\.\S*)?)\s*\|\s*(?<path>(?i:[CD]):.*\\)
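As a quick check, the pattern can be run against the sample lines from the question. The sketch below is a Python port, which is an assumption on my part since the question uses a PCRE-style flavor: `(?<name>...)` becomes `(?P<name>...)`, the inline `(?i)` becomes a scoped `(?i:...)` group, and the quantifiers are written as `+`. The names `PATTERN`, `SAMPLES`, and `rewrite` are hypothetical.

```python
import re

# Python port of the pattern, split over several lines for readability.
PATTERN = re.compile(
    r"(?P<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))"
    r"[\t\x20]*(?P<size_type>(?i:[gm]b|[mg]))"
    r"[\t\x20]*(?P<file>.+(?=\.)|.+)"
    r"(?P<type>(?:\.\S*)?)"          # \S* keeps whitespace out of the group
    r"\s*\|\s*(?P<path>(?i:[CD]):.*\\)"
)

# Sample lines taken from the question.
SAMPLES = [
    "3.9 GB pagefile.sys | C:\\",
    "3.9 GB pagefile.sys |C:\\",
    "3.9 GB pagefile.sys| C:\\",
    "3.9 GB pagefile.sys|C:\\",
    "3.9 GBpagefile.sys | C:\\",
    "3.9GB pagefile.sys | C:\\",
    "3.9GBpagefile.sys | C:\\",
]

def rewrite(line: str) -> str:
    """Apply the question's replacement (\\1)(\\2)(\\3)(\\4)(\\5)."""
    return PATTERN.sub(r"(\1)(\2)(\3)(\4)(\5)", line)

for sample in SAMPLES:
    print(rewrite(sample))  # (3.9)(GB)(pagefile)(.sys)(C:\) for every line
```

Every sample line should come out as `(3.9)(GB)(pagefile)(.sys)(C:\)`, with no space inside the fourth group.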


If there is always a pipe char and a single char C or D followed by :\, another option could be to exclude whitespace and the pipe from the file and type groups altogether:

(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<file>[^\s|]+)(?<type>\.[^|\s]+)[\t\x20]*\|[\t\x20]*(?<path>(?i:[CD]):\\)
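This variant is stricter, which has a trade-off worth seeing. The Python sketch below is again a port under the same assumptions as before (`(?P<name>...)` groups, scoped `(?i:...)`, `+` quantifiers), and the second input line is a hypothetical one that does not appear in the question.

```python
import re

# Python port of the stricter variant: file and type may not contain
# whitespace or a pipe, and the path is exactly a drive root.
STRICT = re.compile(
    r"(?P<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))"
    r"[\t\x20]*(?P<size_type>(?i:gb|mb|m|g))"
    r"[\t\x20]*(?P<file>[^\s|]+)(?P<type>\.[^|\s]+)"
    r"[\t\x20]*\|[\t\x20]*(?P<path>(?i:[CD]):\\)"
)

ok = STRICT.search("3.9GB pagefile.sys |C:\\")
print(ok.group("file"), ok.group("type"), ok.group("path"))

# Hypothetical input: a file name containing a space is rejected,
# because [^\s|]+ cannot cross the space to reach the dot.
rejected = STRICT.search("3.9 GB page file.sys | C:\\")
print(rejected)  # None
```

So the negated character classes keep the groups clean without any lookahead, at the cost of not matching file names that contain spaces.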
