Why in JavaScript are regex matches in Arabic reversed?

I have some text I am trying to use regexes on. I need to match numbers with a % sign that are surrounded by Arabic text. I have some regexes that look like this:

const re1 = new RegExp('\\d*,\\d*%');
const re2 = new RegExp('%\\d*,\\d*');

I have some text that looks like this:

%9,2 ملمول/مول 77 أو

I would expect the second regex to match the text, but it is the first one that matches. I'm sure there is a good reason for it, but I did not see anything in the documentation about this. Why does it do this? What is the correct way to match a number with a percent sign that is embedded in Arabic text?

CodePudding user response：

"I would expect the second regex to match the text"

This is exactly what happens:

const re1 = new RegExp('\\d*,\\d*%');
const re2 = new RegExp('%\\d*,\\d*');

const str = '%9,2 ملمول/مول 77 أو'

console.log(re1, str.match(re1))
console.log(re2, str.match(re2))
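
To check which order the regex engine actually sees, it can help to look at the string in logical (storage) order rather than as it is displayed, since bidirectional rendering may place characters on screen differently from how they are stored. A minimal sketch, assuming the text is stored exactly as in the literal above (the names sample, firstFour, percentFirst and percentLast are made up for illustration):

```javascript
// Assumption: this literal is stored in the same order as the one in the answer above.
const sample = '%9,2 ملمول/مول 77 أو';

// List the first four code points in storage order, with their hex values.
const firstFour = Array.from(sample)
  .slice(0, 4)
  .map((ch) => `${ch} U+${ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')}`);
console.log(firstFour);

// The regex engine scans the string in that same storage order.
const percentFirst = /%\d*,\d*/.exec(sample);
const percentLast = /\d*,\d*%/.exec(sample);

console.log(percentFirst && percentFirst[0]); // %9,2
console.log(percentLast);                     // null
```

If the % shows up after the digits in this listing for your real data, then the first regex is the one that fits the stored text, whatever the display suggests.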

CodePudding user response：

This works on my machine (macOS, US locale), running Node.js v14.19.3:

const corpus = '%9,2 ملمول/مول 77 أو ';

const rx = /%(\d+),(\d+)/;

const m = rx.exec(corpus);

if (!m) {
    console.log("no match found");
} else {
    const [ match, value1, value2 ] = m;
    console.log(`entire match: «${match}»`  );
    console.log(`1st value:    «${value1}»` );
    console.log(`2nd value:    «${value2}»` );
}

Running the above yields:

entire match: «%9,2»
1st value:    «9»
2nd value:    «2»
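
If the percent sign can sit on either side of the number in the stored text, one option is to accept both orders in a single pattern. This is only a sketch, not a definitive solution: findPercent is a hypothetical helper, and it assumes ASCII digits, an ASCII comma as the decimal separator, and the ASCII % sign.

```javascript
// Hypothetical helper: find a decimal-comma number with a % sign on either side.
function findPercent(text) {
  // First alternative: % before the number. Second alternative: % after it.
  const rx = /%(\d+),(\d+)|(\d+),(\d+)%/;
  const m = rx.exec(text);
  if (!m) {
    return null;
  }
  // Groups 1-2 are filled when % leads, groups 3-4 when % trails.
  const whole = m[1] ?? m[3];
  const fraction = m[2] ?? m[4];
  return { match: m[0], whole, fraction };
}

console.log(findPercent('%9,2 ملمول/مول 77 أو'));
console.log(findPercent('9,2% ملمول/مول 77 أو'));
console.log(findPercent('ملمول/مول 77 أو')); // null
```

Because the pattern works on the stored order of characters, it does not matter how the text is rendered on screen.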