I have a set of numbers (15 digits) and need to get the value of the 10th - 11th digit. I am working on using RegEx but so far have only made it to select the first 9 digits and can’t go further from there.
^[1-9\][0-9\][0-9\][0-9\][0-9\][0-9\][0-9\][0-9\][0-9\]
Sample digits are:
188202143002267.50 (need to get only 002)
188202143-002267.50 (need to get only 002 —- already resolved this)
Or if there is a better way to do this? I am working on Excel VBA.
CodePudding user response:
Using regexp.replace and Mid()
:
Sub Tester2()
Debug.Print Mid(DigitsOnly("188202143002267.50"), 10, 3)
Debug.Print Mid(DigitsOnly("188202143-002267.50"), 10, 3)
Debug.Print Mid(DigitsOnly("XXX188/202-143-002267.50"), 10, 3)
End Sub
Function DigitsOnly(v)
Static re As Object
If re Is Nothing Then
Set re = CreateObject("vbscript.RegExp")
re.Global = True
re.Pattern = "\D"
End If
DigitsOnly = re.replace(v, "")
End Function
CodePudding user response:
Use Replace
and Mid
:
Value = "188202143002267.50"
' or:
' Value = "188202143-002267.50"
Digit3 = Mid(Replace(Value, "-", ""), 10, 3)
' Digit3 -> "002"
CodePudding user response:
Alternative via FilterXML()
My intention is not to show a better or more elegant way by this alternative to RegEx
,
but to point out how one can use FilterXML
(available after all since 2013 ) in VBA
to return filtered results.
Since the handling of XML is still seen as a bit cryptic, I would like to contribute with some hints to a better understanding of how the FilterXML function works.
What does FilterXML:
- The function returns specific data
- from a (wellformed) XML content string (roughly comparable to a html-tag structure with individual, but case sensitive node names)
- by using a specified XPath expression string (c.f. @JvdV 's excellent examples at Extract substrings ..)
Syntax - see section 2
FILTERXML(xml, xpath)
Argument 1: XML content - see section 1
The XML standard allows to design use a customized markup language by your own. If you regard each character of an assumed input "A123" as node element, you could
- name the starting DocumentElement e.g.
r
(symbolizing root) and - the individual nodes containing the input characters e.g.
i
- (and take care of correctly set closing tags
This results in a simple well formed hierarchy like
<r>
<i>A</i>
<i>1</i>
<i>2</i>
<i>3</i>
</r>
Note that the conversion to a valid xml content in this example is executed by a tricky splitting into single characters plus eventual joining together to insert the wanted node names (~ "tags") into the final xml content string.
Argument 2: XPath expression - see section 2
The XML hierarchy allows XPath expressions to search within any indicated node and sub-node references. Thus such an XPath expression string to isolate and return digits only like in the posted question could be
"//i[.*0=0]"
where //i
indicates to search occurrencies of i
at any hierarchy level,
the brackets []
enclose the filter condition to return only digits, expressed by the multiplication of the referenced node content .
with zero if resulting in zero .*0=0
, thus excluding other elements than digits.
In contrast to XMLDoM (Document Object Model) coding, filtering via Application.FilterXML 's XPath expressions does not return node objects (or defined node types), but nothing other than the "assigned" node content values (textual child elements in XMLDoM),
which Excel for this special question would interpret automatically as numbers (here even: Double
!)
*) XMLDoM is a cross-platform and language-independent interface treating the document as a tree structure and allowing programmatic access to the tree
Returned results - see section 3
Results are returned as follows:
- a) in the case of several results as a vertical 2-dim generally variant array (in tabular Excel 2019 as dynamic spill range),
- b) if there is only one result: as a single value,
- c) if there is no result (or an incorrect node query): as an error value (the worksheet function would return
#VALUE!
, the late-boundApplication.FilterXML()
functionError 2015
)
Before transforming these different filtering results to any wanted output (e.g. by transposing or type transformations) you would check them via VarType()
- see section 2) in the following code example:
Function GetDigitsOnly()
Example call due to OP: Mid$(DigitsOnly(s), 10, 2)
where s represents a string input
Function DigitsOnly(ByVal s As String) As String
If Not Len(s) Then Exit Function ' escape if empty string
'1) Get XML content (as FilterXML's first argument)
'a) make string splitable via vbNullChar delimiter
s = StrConv(s, vbUnicode)
'b) atomize via Split()
Dim tmp
tmp = Split(Left(s, Len(s) - 1), vbNullChar, Len(s) \ 2 1)
'c) get wellformed (html-like) xml content
tmp = "<r><i>" & Join(tmp, "</i><i>") & "</i></r>"
'2) apply FilterXML({content},{XPath}) to get digits only
tmp = Application.FilterXML(tmp, "//i[.*0=0]")
'3) return string result
Select Case VarType(tmp)
Case vbError ' not any digit found
Exit Function
Case vbArray vbVariant ' vertical digits array
DigitsOnly = Join(Application.Transpose(tmp), vbNullString)
Case Else ' single (here e.g.: vbDouble) result
DigitsOnly = tmp
End Select
End Function