How to remove adjacent duplicates in a string in BigQuery?

I have a table in BigQuery where one of the columns has string values that look like this:

Page A >> Page A >> Page B >> Page B >> Page A >> Page A >> Page C

I am trying to find a way to de-duplicate adjacent values only, so that the string becomes this:

Page A >> Page B >> Page A >> Page C

I have tried converting the string to an array and then using DISTINCT to remove duplicates, then converting back to a string, but I end up with this, which is wrong:

Page A >> Page B >> Page C

Can anyone help please? Maybe there's a regex solution? Or was I on the right track with arrays?

CodePudding user response:

You might consider the answers below, which were given for a similar question.

- regexp101.com explanation
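
For the array route you were on, here is a minimal sketch in BigQuery Standard SQL. It is not necessarily what the linked answers do. The names `sample`, `path`, `page`, `pos`, `prev_page` and `deduped_path` are made up for illustration, and it assumes the separator is always exactly ' >> '. Instead of DISTINCT, it compares each element with the one just before it and keeps only the elements that differ. Note that BigQuery's regex functions use RE2, which does not support backreferences in the pattern, so a pure REGEXP_REPLACE with something like \1 is unlikely to work here.

```sql
-- Hypothetical sample data; swap in your own table and column.
WITH sample AS (
  SELECT 'Page A >> Page A >> Page B >> Page B >> Page A >> Page A >> Page C' AS path
)
SELECT
  path,
  (
    -- Rebuild the string in the original order, keeping only elements
    -- that differ from the element immediately before them.
    SELECT STRING_AGG(page, ' >> ' ORDER BY pos)
    FROM (
      SELECT
        page,
        pos,
        -- Previous element in the original order (NULL for the first one).
        LAG(page) OVER (ORDER BY pos) AS prev_page
      FROM UNNEST(SPLIT(path, ' >> ')) AS page WITH OFFSET AS pos
    )
    WHERE prev_page IS NULL OR page != prev_page
  ) AS deduped_path
FROM sample;
```

With the sample string from the question, deduped_path should be Page A >> Page B >> Page A >> Page C. The WITH OFFSET clause matters because UNNEST does not guarantee element order on its own; ordering by pos in both LAG and STRING_AGG preserves the original sequence.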