I want to simply count how many numbers are in a string inside BigQuery.
I can obviously count how long the string is using LENGTH():
DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT
str,
LENGTH(str) AS length;
But how do I count how many numbers are in the string?
CodePudding user response:
Extract only numeric values from the string using regex and count them
DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT
str,
LENGTH(str) AS length, array_length(regexp_extract_all(str, r'[\d]'))
reference:
CodePudding user response:
Ok, you guys got me :o)
Joining this thread with below simple and cheap option
DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT str, LENGTH(str) length, COUNT(*) digits
FROM UNNEST(SPLIT(str, '')) char
WHERE char BETWEEN '0' AND '9'
with output
CodePudding user response:
The trick is to remove numbers (by replacing numbers with empty strings using REGEXP_REPLACE) and then find the difference in the original string length and the reduced string length. Search the re2 documentation for available regex syntax. You can use the Perl shorthand [\d]
for digits (or [0-9]
works too).
DECLARE str STRING DEFAULT 'h0w many numb3rs?';
SELECT
str,
LENGTH(str) AS length,
LENGTH(str) - LENGTH(REGEXP_REPLACE(str, r'[\d]', '')) AS numbers,
LENGTH(str) - LENGTH(REGEXP_REPLACE(str, r'[[:alpha:]]', '')) AS non_alpha_characters;
(note: you can find the non-alpha characters (including spaces and punctuation) by matching and removing [[:alpha:]]
(or equivalently [A-Za-z]
) characters.