How do I extract URLs within HTML from a Google Sheets cell?

Time:09-10

I have a list of cells that each include raw HTML. I want to extract all the URLs from each cell (some cells contain multiple URLs).

So from a cell that has:

<p>This is <a href="https://www.test.com">only</a> a test! Do not <a href="https://www.afraid.com">be</a> afraid!</p>

I want it to just say https://www.test.com and https://www.afraid.com in separate adjacent cells in the same row.

I think I can use REGEXEXTRACT, but I'm a complete novice when it comes to regex, so any help is appreciated!

CodePudding user response:

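Use this formula:

=ArrayFormula(IF(A2:A="",,TRANSPOSE(QUERY(TRANSPOSE(IFNA(
                          REGEXEXTRACT(SPLIT(A2:A, """>", 0),
                          "https?://[^""]+"),"")), " where Col1 is not null"))))

SPLIT cuts each cell at every "> sequence, and REGEXEXTRACT then pulls the URL out of each piece. The pattern has no capture group, so the whole match is returned, including the https:// prefix.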
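If the formula gets hard to maintain, the same idea can be sketched as an Apps Script custom function. This is only a rough sketch under a few assumptions: EXTRACT_URLS is a made-up name (not a built-in), it expects a single cell rather than a range, and it only picks up URLs that sit inside a double-quoted href attribute.

```javascript
/**
 * Hypothetical custom function (not a Sheets built-in).
 * Returns every http/https URL found in a double-quoted href attribute.
 * Assumes a single cell is passed in, e.g. =EXTRACT_URLS(A2)
 */
function EXTRACT_URLS(html) {
  // Global flag so exec() walks through every match in the string.
  var pattern = /href="(https?:\/\/[^"]+)"/g;
  var urls = [];
  var match;
  while ((match = pattern.exec(String(html))) !== null) {
    urls.push(match[1]); // capture group 1 is the URL without the quotes
  }
  // Return a single row so the URLs spill into adjacent cells to the right.
  // An empty string keeps the cell blank when nothing matches.
  return [urls.length ? urls : [""]];
}
```

After pasting this into Extensions > Apps Script, entering =EXTRACT_URLS(A2) in B2 should put each URL from A2 in its own cell along row 2. A regex is not a full HTML parser, so unusual markup (single-quoted or unquoted attributes, for instance) would need a different pattern.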

