Home > Enterprise >  How can I de-concatenate a string in BigQuery, removing the initial substring that exists in another
How can I de-concatenate a string in BigQuery, removing the initial substring that exists in another

Time:08-10

Someone sent me data that looks like this

WITH

foo as (
  SELECT "Field Outfield" AS Zone, "Field Outfield 109" AS Section
  UNION ALL SELECT "Bleachers" AS Zone, "Bleachers 202" AS Section
  UNION ALL SELECT "Delta Suites" AS Zone, "Delta Suite 218 A" AS Section
  UNION ALL SELECT "Jim Beam Suites" AS Zone, "Terrace 317" AS Section
)

SELECT * FROM foo
|Zone            |Section            |
|:---------------|:------------------|
|Field Outfield  |Field Outfield 109 |
|Bleachers       |Bleachers 202      |
|Delta Suites    |Delta Suite 218 A  |
|Jim Beam Suites |Terrace 317        |

As you can see, the Section values are usually prefixed by the Zone name. In other words, someone seems to have concatenated the Zone values with the true Section values. I would like to transform the data to its original state, which should look like this

|Zone            |Section            |
|:---------------|:------------------|
|Field Outfield  |109                |
|Bleachers       |202                |
|Delta Suites    |218 A              |
|Jim Beam Suites |Terrace 317        |

I would use REGEXP_REPLACE(), but I don't think you can do it in a vectorized fashion.

CodePudding user response:

You simply use REPLACE, it will only remove Zone, if it is there, the TRIM will removes Spaces leading and trailing.

WITH

foo as (
  SELECT "Field Outfield" AS Zone, "Field Outfield 109" AS Section
  UNION ALL SELECT "Bleachers" AS Zone, "Bleachers 202" AS Section
  UNION ALL SELECT "Delta Suites" AS Zone, "Delta Suite 218 A" AS Section
  UNION ALL SELECT "Jim Beam Suites" AS Zone, "Terrace 317" AS Section
)

SELECT Zone, TRIM(REPLACE(Section, Zone, '')) FROM foo
  • Related