Home > Back-end >  Regex works for form validation but the same regex used in a query returns records that do not match
Regex works for form validation but the same regex used in a query returns records that do not match

Time:11-08

I want to prevent my Client records from having a number or special character in it's name or last_name attributes.

I've set up a format validation in my Client model like so:

validates :name,
          format: { with: /\A[a-zÀ-ú\s] \z/i, message: :wrong_format }
validates :last_name,
          format: { with: /\A[a-zÀ-ú\s] \z/i, message: :wrong_format }

To my knowledge, this will allow for a string with alphabetic characters including accentuation marks like Á (a-zÀ-ú), plus any white space character (\s), and 0 or more of that combination ( ). The /i at the end being for case insensitive.

The validation works fine, preventing form submissions with something like P3ter.

The problem is: using the same regex to query my Postgres DB for records that match it returns records like P3ter. And of course, negating that regex in a query to search for records that do not match it returns no records like P3ter (when I do have them).

select * from clients
where name !~* '[a-zÀ-ú\s] '
or last_name !~* '[a-zÀ-ú\s] '

// no results

I've tried a different regex like [[:alpha:][:space:]] with the same satisfactory results for form validation but the same weird result when used in a DB query.

How is it that the regex is working for format validations to prevent form submissions with a string like P3ter but the same regex returns records like P3ter when used in a query?

CodePudding user response:

The problem is:

You didn't add ^ at the beginning of the regex pattern, and $ the end. The regex pattern [a-zÀ-ú\s] matches ter of P3ter.

^ matches the begin of each line. while $ matches the end of each line.

[a-zÀ-ú\s] this means that match only one character from a to z or from À to ú or \s which is a whitespace character (equivalent to [\r\n\t\f\v ]).

[a-zÀ-ú\s] this part matches ter of P3ter. The sign means match the previous token between one and unlimited times which basically means match one or more, at least match one character.

The solution:

select * from clients
where name !~* '^[a-zÀ-ú\s] $'
or last_name !~* '^[a-zÀ-ú\s] $'
  • Related