Very simple formulas following but I am missing some understanding and it seems frustratingly simple.
Very simple text extraction:
MID(A1,Start Num, Num of Chars)
A simple formula text finding formula,
SEARCH(Find_text, within_text, start_num)
Combined these two formulas can find and extract text from a field between two text characters, for instance 'underscores', 'parentheses' 'commas'
So for example to extract
text to extract >>> Jimbo Jones
from a cell containing parentheses an easy formula would be;
Sample text A1 = Incident Report No.1234, user (Jimbo Jones) Status- pending
formula;
=MID(A1, SEARCH("(", A1) 1, SEARCH(")", A1) - SEARCH("(", A1) -1)
Extracted text = Jimbo Jones
The logic is simple
1.Identify cell containing text
2.Find the start number by nominating a first searchable character
3.Find the end number of the text being extracted by searching for the second searchable character
4.Subtracting the Start Number from the End number gives the number of characters to extract
Without using Search formula the code is;
MID=(A1,32,11) = Jimbo Jones
But if i want to extract text between commas or other identical characters (such as quotation marks, apostrophes, asterisk ) I need to use this formula following (which I found suggested)
=MID(A1, SEARCH(",", A1) 1, SEARCH(",", A1, SEARCH(",", A1) 1) - SEARCH(",",A1) -1)
Sample text A1 Incident Report No.1234 user, Jimbo Jones, Status- pending
Extracted text = Jimbo Jones
But I how do i nominate other boundaries, such as text between 3rd and 4th comma for example?
Sample text A1 Incident Report, No.1234, user, Jimbo Jones, Status- pending
The reason for my confusion is in the above formula excel finds the second iteration of the comma no matter where they are in the text yet the actual formula being used is identical to the formula finding the first comma, the count of characters seems to automatically assume somehow that I want the second comma not the first, how do i instruct the formula find subsequent iterations of commas, such as 3rd 4th or 9th?
And what am i not understanding in why the formula finds the 2nd comma?
Cheers!
CodePudding user response:
To explain what you are confused about:
At first sight it looks that it uses same formula to find 1st and 2nd searched symbol. But at second look you might notice that there is and argument start_num which tells for a formula where to start looking from. If you give first symbol location 1 (SEARCH(",", A1) 1)
)as that argument, formula will start looking for first search symbol in this part: ' No.1234, user, Jimbo Jones, Status- pending' and will give answer 42. You got 1st occasion place with first formula and then second occasion with formula above. Just find length by substracting and thats it.
Possible solutions:
If you have Office 365, use TEXTAFTER()
and TEXTBEFORE()
as others have stated where you can pass instance number as an argument:
=TEXTAFTER(TEXTBEFORE(A1,",",4),",",3)
Then you can use TRIM()
to get rid of unwanted spaces in begining and end.
If you use older version of Office you can use SUBSTITUTE()
as workaround as it lets you to change nth occasion of specific symbol in text.
Choose a symbol that does not appear in your text and change 3th and 4th occasions of your searched symbol to it. Then look for them (in this example we will substitute ,
to #
:
=MID(A1,SEARCH("#",SUBSTITUTE(A1,",","#",3)) 1,SEARCH("#",SUBSTITUTE(A1,",","#",4))-(SEARCH("#",SUBSTITUTE(A1,",","#",3)) 1))
Formulas used in explanation column C:
C |
---|
=SUBSTITUTE(A1,",","#",3) |
=SUBSTITUTE(A1,",","#",4) |
=SEARCH("#",B1) |
=SEARCH("#",B2) |
=MID(A1,B3 1,B4-(B3 1)) |
Full formula: |
=MID(A1,SEARCH("#",SUBSTITUTE(A1,",","#",3)) 1,SEARCH("#",SUBSTITUTE(A1,",","#",4))-(SEARCH("#",SUBSTITUTE(A1,",","#",3)) 1)) |
Trimmed: |
=TRIM(MID(A1,SEARCH("#",SUBSTITUTE(A1,",","#",3)) 1,SEARCH("#",SUBSTITUTE(A1,",","#",4))-(SEARCH("#",SUBSTITUTE(A1,",","#",3)) 1))) |
CodePudding user response:
Thanks for the responses all, I grinded through using the two Formulas I asked about (MID and SEARCH) and I have a result.
It's not pretty nor elegant but it extracts the data as per requirement. I will benefit from the tips left here in response to my question as simpler options are available.
Requirement: Extract text between 3rd and 4th Commas using MID and SEARCH
Sample text A15
Incident Report (ammended), No.12545424234, user, Jimbo Jones, Status- pending
MID(A15,(SEARCH(",",A15,(1 (SEARCH(",",A15,SEARCH(",",A15) 1))))) 2,(SEARCH(",",A15,(SEARCH(",",A15,SEARCH(",",A15) 1) (SEARCH(",",A15)))-(SEARCH(",",A15,SEARCH(",",A15) 1)-(SEARCH(",",A15))) 1)-(SEARCH(",",A15,(1 (SEARCH(",",A15,SEARCH(",",A15) 1))))))-2)
Test Extracted
Jimbo Jones
Obviously this solution works on other text, but it's also obviously not easy to quickly amend for other text locations.
Anyway, cheers again for the pointers...