Search for regex terms independent of symbol

Time:09-03

So I have this string right here, it's a registration number:

12.325.767/0001-90

I want to build a regex so that typing "23" returns "2.3", typing "700" returns "7/00", and typing "19" returns "1-9".

In other words, I want to match the digits even when there is a symbol between them.

I'm trying this in JavaScript. Here is what I have:

const cnpj = "12.325.767/0001-90";
const search = "12";

const regex = new RegExp(`(${search})`, "i");

const result = cnpj.split(regex);

output:

[ '', '12', '.325.767/0001-90' ]

This output is correct because the number I searched for has no symbol inside it.

But when I search for a number that has a symbol in the middle, the string is not split.

CodePudding user response:

I used your explanation to adapt my code to what I need. Here is the case I wanted to cover:

12.322.362/3002-32

In this number, "23" appears in several places, each time with a symbol between the two digits. So I did this:

const cnpj = "12.322.362/3002-32";
const search = "23";

const symbols = ".,/:;-";
const reInput = search.split("").join(`[${symbols}]?`);
const regex = new RegExp(`(${reInput})`, "i");

const result = cnpj.split(regex);

The output is exactly what I wanted, because it shows every "23" in the string:

result: ['1', '2.3', '2','2.3', '6',   '2/3', '00',  '2-3', '2']

Even if I type a longer run of digits, it returns the correct result. For example, if I type "123223", this is the output:

[ '', '12.322.3', '62/3002-32' ]
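To make it easier to see what pattern the join produces, and to guard against a search term that contains regex metacharacters, something like the following could be used. The names `escapeRegExp` and `buildLoosePattern` are made up for this sketch, and it assumes the symbol list is a fixed, trusted string with the hyphen kept last so that it stays literal inside the character class.

```javascript
// Hypothetical helpers, not part of the original code.
const SYMBOLS = ".,/:;-"; // hyphen last, so it is literal inside [...]

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Allow zero or one symbol between every pair of typed characters.
function buildLoosePattern(search, symbols = SYMBOLS) {
  const separator = `[${symbols}]?`;
  return search
    .split("")
    .map((char) => escapeRegExp(char))
    .join(separator);
}

console.log(buildLoosePattern("23")); // 2[.,/:;-]?3
```

For digit-only input the escaping changes nothing, so the pattern is the same one the code above builds by hand.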

Finally, with the split result I can check which entries of the array are matches:

const cnpj = "12.322.362/3002-32";
const search = "23";

const symbols = ".,/:;-";
const reInput = search.split("").join(`[${symbols}]?`);
const regex = new RegExp(`(${reInput})`, "i");

const data = cnpj.split(regex);

// Report, for each piece of the split, whether it is a matched part.
data.forEach((item) => {
  if (regex.test(item)) {
    console.log("match");
  } else {
    console.log("doesn't match");
  }
});

Output:

doesn't match
match
doesn't match
match
doesn't match
match
doesn't match
match
doesn't match
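The alternating output above is not a coincidence: when the regex passed to `split` has exactly one capturing group, the matched text always lands at the odd indexes of the result. A minimal sketch that relies on this instead of re-running the regex (the `tagged` and `matches` names are only for illustration):

```javascript
const cnpj = "12.322.362/3002-32";
const search = "23";
const symbols = ".,/:;-";

const pattern = search.split("").join(`[${symbols}]?`);
const parts = cnpj.split(new RegExp(`(${pattern})`));

// With one capturing group, split() alternates
// [non-match, match, non-match, match, ...].
const tagged = parts.map((text, index) => ({ text, isMatch: index % 2 === 1 }));

// Keep only the matched pieces.
const matches = tagged.filter((part) => part.isMatch).map((part) => part.text);
console.log(matches); // [ '2.3', '2.3', '2/3', '2-3' ]
```

This only holds while the pattern contains a single capturing group; adding more groups would insert extra entries into the array.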

So, thank you eventHandler for your amazing explanation. The logic works fine this way.

CodePudding user response:

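You could convert the user input into a RegExp that allows a symbol between each pair of digits. For example, 12 becomes 1[.,/:;]?2, so the search accepts zero or one of those symbols between the digits. Add - at the end of the list if hyphens should be allowed too, as in the "1-9" case from the question.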

If you used a function like this, you would get the result you want:

function underline(input, text) {
    const symbols = '.,/:;';
    const reInput = input.split('').join(`[${symbols}]?`);
    const re = new RegExp(reInput);

    return re.exec(text);
}

For the user input 23 and your example text 12.325.767/0001-90, this function returns the following result:

[ '2.3', index: 1, input: '12.325.767/0001-90', groups: undefined ]

You can substring from that, using the index and the result length, or make a more complex RegExp, like this:

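function underline(input, text) {
    const symbols = '.,/:;';
    const reInput = input.split('').join(`[${symbols}]?`);
    // Capture the text before the match, the match itself, and the text after it.
    const reText = '(.*?)(' + reInput + ')(.*)';
    const re = new RegExp(reText);
    const match = re.exec(text);

    // exec() returns null when nothing matches; otherwise drop the full-match entry.
    return match === null ? [] : match.slice(1);
}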

This returns the array you expected.

[ '1', '2.3', '25.767/0001-90' ]
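The substring approach mentioned earlier, using the match index and length, could look roughly like this. `splitAroundMatch` is a made-up name, and the symbol list here includes a trailing hyphen (an assumption, it is not in the list above) so that the "19" case from the question matches as well.

```javascript
// Hypothetical helper: returns [before, match, after], or null if nothing matches.
function splitAroundMatch(input, text) {
  const symbols = ".,/:;-"; // assumption: hyphen added, kept last so it is literal
  const re = new RegExp(input.split("").join(`[${symbols}]?`));
  const found = re.exec(text);
  if (found === null) return null;

  // Cut the text at the start and end of the first match.
  const start = found.index;
  const end = start + found[0].length;
  return [text.slice(0, start), found[0], text.slice(end)];
}

console.log(splitAroundMatch("23", "12.325.767/0001-90"));
// [ '1', '2.3', '25.767/0001-90' ]
console.log(splitAroundMatch("19", "12.325.767/0001-90"));
// [ '12.325.767/000', '1-9', '0' ]
```

Like the version with three capturing groups, this only reports the first occurrence.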