Home > Software engineering >  REGEXP_EXTRACT in google data studio
REGEXP_EXTRACT in google data studio

Time:11-18

I'm trying to generate 3 new calculated fields in Google Data Studio / Looker Studio using the REGEXP_EXTRACT function.

Here is the sample data that I have on a Google sheet:

Sample data
Serviços de Impressão > Impressoras > Falha na impressão > Troca de Tonner
Aplicativos e Softwares > Avaliação de Aplicativos e Software > Pacote Office
Computadores e Periféricos > Manutenção > Teclado / Mouse > Aquisição de equipamento
Acessos > Certificado Digital

Each > represents a division and ideally what I would like to do is extract the first three fields and disregard the rest, something like this:

Calculated field 1 Calculated field 2 Calculated field 3
Serviços de Impressão Impressoras Falha na impressão
Aplicativos e Softwares Avaliação de Aplicativos e Software Pacote Office
Computadores e Periféricos Manutenção Teclado / Mouse
Acessos Certificado Digital null

I managed to generate a code to extract the first calculated field using

REGEXP_EXTRACT(Sample data,'^(. ?)>')

but in the second I didn't know how to do it, since I can always have one or more separators > as in the example of the last line.

How can I formulate the codes for calculated fields 2 and 3 please?

CodePudding user response:

The second field can be extracted with

 > (.*?)(?: > |$)

The third field - if any - can be obtained with

 > .*? > (.*?)(?: > |$)

The (?: > |$) non-capturing group matches either space > space, or end of string.

CodePudding user response:

you can try:

REGEXP_EXTRACT(test1,'^(?:[^>] >){0}([^>] )')

the above formula is for the first field. change the Zero in parentheses to 1 for second field and 2 for third field.

enter image description here

  • Related