Home > database >  How Many Numbers Are In A BigQuery Field That Is A String
How Many Numbers Are In A BigQuery Field That Is A String

Time:08-25

I want to simply count how many numbers are in a string inside BigQuery.

I can obviously count how long the string is using LENGTH():

DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT
  str,
  LENGTH(str) AS length;

But how do I count how many numbers are in the string?

CodePudding user response:

Extract only numeric values from the string using regex and count them

DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT
  str,
  LENGTH(str) AS length, array_length(regexp_extract_all(str, r'[\d]'))

enter image description here

reference:

enter image description here

CodePudding user response:

Ok, you guys got me :o)
Joining this thread with below simple and cheap option

DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT str, LENGTH(str) length, COUNT(*) digits
FROM UNNEST(SPLIT(str, '')) char
WHERE char BETWEEN '0' AND '9'    

with output

enter image description here

CodePudding user response:

The trick is to remove numbers (by replacing numbers with empty strings using REGEXP_REPLACE) and then find the difference in the original string length and the reduced string length. Search the re2 documentation for available regex syntax. You can use the Perl shorthand [\d] for digits (or [0-9] works too).

DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT
  str,
  LENGTH(str) AS length,
  LENGTH(str) - LENGTH(REGEXP_REPLACE(str, r'[\d]', '')) AS numbers,
  LENGTH(str) - LENGTH(REGEXP_REPLACE(str, r'[[:alpha:]]', '')) AS non_alpha_characters;

(note: you can find the non-alpha characters (including spaces and punctuation) by matching and removing [[:alpha:]] (or equivalently [A-Za-z]) characters.

  • Related