Writing Capybara expectations to verify phone numbers

Time:12-16

I'm using AWS Textract to pull information from PDF documents. After the scanned text is returned from AWS and stored in a variable, I'm doing this:

phone_number = '(555) 123-4567'

scanned_pdf_text.should have_text phone_number

But this fails about 20% of the time because of the non-deterministic way that AWS is returning the scanned PDF text. On occasion, the phone numbers can appear either of these two ways:

(555)123-4567 or (555) 123-4567

Some of this scanned text is very large, and I'd prefer not to go through the exercise of sanitizing the text coming back if I can avoid it (I'm also not good at regex usage). I also think using or logic to handle both cases seems to be a little heavy handed just to check text that is so similar (and clearly near-identical to the human eye).

Is there an RSpec matcher that'll allow me to check this text? I'm also using Capybara.default_normalize_ws = true, but that doesn't seem to help in this case.

CodePudding user response:

Assuming scanned_pdf_text is a string and the only differences you're seeing are in whitespace, you can strip the whitespace and compare:

scanned_pdf_text.gsub(/\s+/, '').should eq('(555)123-4567') # exact

scanned_pdf_text.gsub(/\s+/, '').should include('(555)123-4567') # partial (match would treat the parentheses as a regex group)

scanned_pdf_text.gsub(/\s+/, '').should have_text('(555)123-4567') # partial
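If you'd rather not modify the scanned text at all, another option is to leave it alone and build a pattern from the expected number that tolerates optional whitespace between characters. This is only a minimal sketch in plain Ruby: the helper name whitespace_tolerant_pattern and the sample strings are made up for illustration, and it assumes whitespace is the only thing that varies.

```ruby
# Hypothetical helper (not part of RSpec or Capybara): turn an expected
# string into a Regexp that allows any amount of whitespace, including
# none, between each of its non-whitespace characters.
def whitespace_tolerant_pattern(expected)
  chars = expected.gsub(/\s+/, '').chars.map { |c| Regexp.escape(c) }
  Regexp.new(chars.join('\s*'))
end

phone_pattern = whitespace_tolerant_pattern('(555) 123-4567')

# Made-up samples standing in for the two forms Textract returns.
sample_a = 'Call us at (555)123-4567 today'
sample_b = 'Call us at (555) 123-4567 today'

puts sample_a.match?(phone_pattern) # prints true
puts sample_b.match?(phone_pattern) # prints true
```

In a spec, the same pattern should work with the regular matcher, e.g. scanned_pdf_text.should match(phone_pattern), so the large scanned text never has to be copied or sanitized.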