Home > OS >  I'm unable to cast numbers as float64 using bigquery, keep getting bad double value errors
I'm unable to cast numbers as float64 using bigquery, keep getting bad double value errors

Time:11-18

I'm trying to extract numbers as substrings from between certain characters before casting them as float64 using BigQuery, but I'm getting a Bad Double Value error for some values. I tried using safe_cast to identify which values cause the error and they return as null (as expected) but I can't seem to figure out why these values can't be casted as float64 since they are in fact numbers. The only thing in common with the anomalies is that the first number extracted in a row is 0 but there are other values which do this and are casted fine.

This is an example of the string which numbers are extracted from: AOS-1545902(NCP)*0@84‬#475 which is object_text stored in a table named tr. So in this case, the first number that is extracted is 0 from between the '*' and '@', the second is 84 from between '@' and '#', and the last one would be 475 after the '#'.

This is the query that I am using to extract the numbers and cast them to float64:

cast(substr(tr.object_text, strpos(tr.object_text,'*') 1, (strpos(tr.object_text,'@')-(strpos(tr.object_text,'*') 1))) as float64) AS FP_Share,
safe_cast(substr(tr.object_text, strpos(tr.object_text,'@') 1, (strpos(tr.object_text,'#')-(strpos(tr.object_text,'@') 1))) as float64) AS V_Share,

cast(substr(tr.object_text, strpos(tr.object_text,'#') 1) as float64) as Cust_Price,

From these, V_Share(the number between the '@' and '#') is the one that has these anomalies and when I extract the number without casting it to float64 using this query:

substr(tr.object_text, strpos(tr.object_text,'@') 1, (strpos(tr.object_text,'#')-(strpos(tr.object_text,'@') 1))) AS noCast_V_Share,

There are a total of 8 of these anomalies as you can see in the snippet below: Results Snippet

Hoping someone could help me out with this!

CodePudding user response:

Since you seem to have unprintable characters causing problems in your extraction, you could use REGEXP_EXTRACT to extract only the numbers (and possibly period if required) which should result in a valid conversion, something like;

CAST(
  REGEXP_EXTRACT(
    SUBSTR(tr.object_text, 
      STRPOS(tr.object_text,'@') 1, 
      (STRPOS(tr.object_text,'#')-(STRPOS(tr.object_text,'@') 1))), 
    '[0-9.] '
  ) AS FLOAT64
) AS V_Share
  • Related