Home > Software design >  VBA and regexp - identify numbers stored as text
VBA and regexp - identify numbers stored as text

Time:12-29

I have some data that looks like this (more than 400 columns) :

year ID fake_num1 fake_num2 text1
2019 11 36 000 10'000 text, 1
2020 12 -1 275 1 000,00 text 2

Columns fake_num1 and fake_num2 are stored as text. What I'm trying to achieve is

  1. Identify those fake numbers columns
  2. Clean the data (e.g. remove space, columns, replace comma by points) with a for loop

I need some help with step 1. I have to identify columns fake_num1 and fake_num2, while avoiding columns like text1. I was thinking of going with regexp but maybe there is another solution.

I used part of the code here: SO regexp, however I am not sure how to proceed from there.

Dim strPattern as String: strPattern = "^[0-9]$"

will find anything that starts and ends with a number, and only has numbers (if my comprehension is correct). What's the best way to manage the cases listed in the table above ?

CodePudding user response:

Please, try the next code, It considers "fake numbers columns" as ones where replacing the necessary characters makes from string a number:

Sub testMakeNumbers()
 Dim sh As Worksheet, lastR As Long, lastCol As Long, i As Long, rngCol As Range
 
 Set sh = ActiveSheet 'you can use here the necessary sheet
 lastR = sh.Range("A" & sh.rows.Count).End(xlUp).row
 lastCol = sh.cells(1, Columns.Count).End(xlToLeft).Column
 
 'determine the problematic columns:
 For i = 1 To lastCol
    If Not IsNumeric(sh.cells(2, i).Value) And _
            IsNumeric(Replace(Replace(Replace(sh.cells(2, i).Value, " ", ""), "'", ""), ",", ".")) Then
        If rngCol Is Nothing Then
            Set rngCol = sh.cells(2, i)
        Else
            Set rngCol = Union(rngCol, sh.cells(2, i))
        End If
    End If
 Next
 'replace the characters making the string as number:
 With Intersect(rngCol.EntireColumn, sh.Range("A2", sh.cells(lastR, lastCol)))
       .Replace ",", "."
       .Replace Chr(160), ""
       .Replace " ", ""
       .Replace "'", ""
  End With
End Sub
  • Related