Home > OS >  SQL - regexp_extract JSON string
SQL - regexp_extract JSON string

Time:11-30

I'm trying to extract from the following string the locationID

{"type":"player","topic_id":"555","topic_name":"sfd","userId":116,"userLocation":{"countryCode":"BR","locationId":21,"locationCity":"Rio de Janeiro"}}

I'm able to extract for example the topic_id using the following safe_cast(regexp_extract(h.events.label,r'"topic_id":"([a-zA-Z0-9-_. ] )"') as int64 but this doesn't work for locationId. I'm guessing it's because of the nested dict? But not sure how to get around that.

CodePudding user response:

You'd better using a json function rather than a regexp function.

WITH sample_data AS (
  SELECT '{"type":"player","topic_id":"555","topic_name":"sfd","userId":116,"userLocation":{"countryCode":"BR","locationId":21,"locationCity":"Rio de Janeiro"}}' json 
)
SELECT CAST(JSON_VALUE(json, '$.userLocation.locationId') AS INT64) AS locationId 
  FROM sample_data;

 ------------ 
| locationId |
 ------------ 
|         21 |
 ------------ 

this doesn't work for locationId. I'm guessing it's because of the nested dict?

  • I guess it's because a value of topic_id is a string "555" and a value of locationId is an integer 21.
  • r'"locationId":([a-zA-Z0-9-_. ] )' will work for locationId but more simple regular expression would be r'"locationId":(\d )'
  • Related