Home > other >  Remove email from an Excel cell
Remove email from an Excel cell

Time:06-21

I'm working through a data cleansing exercise. I'm needing to remove any e-mail from a column, but retain all other text.

  • It is assumed that any e-mail is any text surrounding the @ symbol contained within spaces
  • The email isn't in a fixed position within the cell (and not always contained in the cell)
  • Email can be the only text in a cell

Sample Data:

Raw Data Formatted
24B Nowhere Street, Levin 5510 24B Nowhere Street, Levin 5510
30 Bartholomew Road, Ashtown 5510, [email protected], 021 123 4567 - James 30 Bartholomew Road, Ashtown 5510, 021 132 4567 - James
[email protected], 021 987 6543, M Sinclair 021 987 6543, M Sinclair
[email protected]

TIA

CodePudding user response:

FILTERXML() can give you desired output.

=IFERROR(TEXTJOIN(" ",TRUE,FILTERXML("<t><s>"&SUBSTITUTE(A2," ","</s><s>")&"</s></t>","//s[not(contains(., '@'))]")),"")

If your version of excel support LAMBDA() function then you can use below function for one go.

=IFERROR(BYROW(A2:A5,LAMBDA(a,TEXTJOIN(" ",TRUE,FILTERXML("<t><s>"&SUBSTITUTE(a," ","</s><s>")&"</s></t>","//s[not(contains(., '@'))]")))),"")

enter image description here

CodePudding user response:

This Tutorial is also helpful, https://www.youtube.com/watch?v=6_Ysi1SB9cY

After going through the tutorial, please read this: when you write the if condition, you can find the cells containing '@' symbol and replace them with '@' or something and then continue.

CodePudding user response:

Alternatively, if TEXTSPLIT() is available:

=LET(X,TEXTSPLIT(A1,", "),TEXTJOIN(", ",,FILTER(X,ISERROR(FIND("@",X)),"")))

Note: I splitted the text on the ", " (comma-space combo) instead to prevent possible trailing spaces. The samples show that should work fine.

  • Related