What would be the regex to remove line breaks and all following spaces

Time:10-10

I would like to assert that two strings are equal, ignoring formatting:

        $expected =
            '<div class="input select">
                <label for="countries">User Country</label>
                <select name="countries" id="countries">
                    <option value="">Select a country</option>
                    <option value="ger">Germany</option>
                    <option value="fra">France</option>
                    <option value="rus" selected="selected">Russia</option>
                </select>
            </div>';
        $actual = '<div class="input select"><label for="countries">User Country</label><select name="countries" id="countries"><option value="">Select a country</option><option value="ger">Germany</option><option value="fra">France</option><option value="rus" selected="selected">Russia</option></select></div>';
        $this->assertTextEquals($expected, $actual);

So far I got to this:

    public function assertTextEquals($expected, $actual)
    {
        $actualNoLineBreaks = preg_replace("/\r|\n/", "", $actual);
        $actualNoLongSpaces = preg_replace('!\s+!', ' ', $actualNoLineBreaks);
        $expectedNoLineBreaks = preg_replace("/\r|\n/", "", $expected);
        $expectedNoLongSpaces = preg_replace('!\s+!', ' ', $expectedNoLineBreaks);
        TestCase::assertEquals($expectedNoLongSpaces, $actualNoLongSpaces);
    }

The problem is that a space remains wherever a line break used to be, for example: select"> <label.

So what would be the regex to strip line breaks and all following spaces (until the first non-space character)?

Of course, I could strip all spaces from both strings, but that would make error messages hard to read, and I am looking for an elegant solution.

CodePudding user response:

Check this:

(( *\n+|\A) *)

\A matches at the start of the string, so the pattern also removes any leading spaces there.
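
As a rough sketch of how this pattern could be applied with preg_replace, assuming it is meant to read ( *\n+|\A) * (optional spaces, one or more line feeds or the start of the string, then any following spaces). The function name stripBreaksAndIndent is hypothetical and used only for illustration:

```php
<?php
// Hypothetical helper name, used only for this illustration.
function stripBreaksAndIndent(string $text): string
{
    // " *\n+" : optional spaces, then one or more line feeds
    // "\A"    : start of the string (covers leading spaces)
    // " *"    : the spaces that follow either of the above
    return preg_replace('/(?: *\n+|\A) */', '', $text);
}

$formatted = "  <ul>\n    <li>One</li>\n\n    <li>Two</li>\n</ul>";
echo stripBreaksAndIndent($formatted); // <ul><li>One</li><li>Two</li></ul>
```

Note that this variant only handles literal spaces and \n. Tabs and \r characters (Windows line endings) would be left in place.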

CodePudding user response:

You can use

trim(preg_replace('~\h*\R\s*~', '', $expected))

Details:

  • \h* - zero or more horizontal whitespace characters (tabs, spaces, ...)
  • \R - any line break sequence (\n, \r\n, \r, ...)
  • \s* - zero or more whitespace characters of any kind, including further line breaks.

The trim() function removes leading/trailing whitespace.
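
To illustrate, here is a minimal sketch that wraps the expression in a helper and compares a formatted string with its single-line form. The name stripLineBreakIndentation is an assumption made for this example and is not part of PHPUnit:

```php
<?php
// Hypothetical helper name; not part of PHPUnit or the original code.
function stripLineBreakIndentation(string $text): string
{
    // \h* : horizontal whitespace before the line break
    // \R  : any line break sequence (\n, \r\n, \r, ...)
    // \s* : indentation and blank lines after the break
    return trim(preg_replace('~\h*\R\s*~', '', $text));
}

$expected = '<div class="input select">
    <label for="countries">User Country</label>
</div>';
$actual = '<div class="input select"><label for="countries">User Country</label></div>';

var_dump(stripLineBreakIndentation($expected) === $actual); // bool(true)
```

Spaces that are not adjacent to a line break, such as the one in "User Country", are left untouched, so failure messages stay readable. In the assertTextEquals method from the question, both arguments would presumably go through such a helper before being passed to assertEquals.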
