Home > Back-end >  BigQuery: SELECT AS STRUCT doesn't named struct fields as expected
BigQuery: SELECT AS STRUCT doesn't named struct fields as expected

Time:03-31

I've defined the following UDF to convert a string to a list of word counts:

CREATE TEMP FUNCTION get_bow_vector(document STRING) RETURNS ARRAY<STRUCT<STRING, INT>> AS ((
  SELECT
  ARRAY(
    SELECT AS STRUCT word AS word, COUNT(1) AS freq
    FROM UNNEST(REGEXP_EXTRACT_ALL(document, r'[\w_] ')) AS word
    GROUP BY word
    ORDER BY freq DESC
  )
));

This UDF mostly works as expected: it turns a string into an array of STRUCTs, where each struct contains a word and a count. However the field of the struct are not named word and freq as expected. Instead they're given the anonymous field names: _field_1 and _field_2.

I've got two questions:

  1. How can I name the fields of the struct?
  2. If I dont' name the fields of the struct, how do I access them anyway? When I try using _field_1 I get the following error Field name _field_1 does not exist in STRUCT<STRING, INT64>

CodePudding user response:

Use below instead

CREATE TEMP FUNCTION get_bow_vector(document STRING) 
RETURNS ARRAY<STRUCT<word STRING, freq INT>> AS ((
  SELECT
  ARRAY(
    SELECT AS STRUCT word AS word, COUNT(1) AS freq
    FROM UNNEST(REGEXP_EXTRACT_ALL(document, r'[\w_] ')) AS word
    GROUP BY word
    ORDER BY freq DESC
  )
));
  • Related