Replace double commas with last group of alphanumeric characters in the line

Time:11-23

Long story... I have a CSV that looks, symbolically, something like this:

Source CSV.

I have been trying to get Excel to fill each blank cell in each column with the value above it or below it. For example, I want all of column A to contain 'a' and column E to contain the following values: 2,2,2,2,e,e,e,e,e. Here is the desired result:

Goal CSV.

The real CSV has over 10,000 rows, so doing this by hand would take days.

What I thought to do was transpose the spreadsheet data, put it into Notepad and try with regex.

Here is my expression (hit 'Tests' and then 'Text' again if you are getting an error), but it's not giving me what I want.

In the text:

a,,,,,,1,,,,2,,,,,,,,,,,5,,,,,,,,,4
,b,,,2,,,,,,,,,,,,4,,,,,,,,5,,,,,,5
,,c,,,,,3,,,,,,,,,,,,,,,,,,,,,,,
,,,d,,,,,,,,6,,,,,,,,,,,,,,,,,,,
,,,,e,,,,,,,,,,,,6,,,,,3,,,,,,,,,1
,,,,,f,,,,,,2,,,,,,,,,,,,,,,,,,,
,,,,,,g,,,,,,,,,2,,,,,,,,,7,,,,,,
,,,,,,,h,,,,,,,,,,,,,,,,,,,,,,,2
,,,,,,,,f,,,0,,,,,,,,,,,,,9,,,,2,,

Replace:

([A-z0-9]),|,,

With:

$1

Result:

a,1,254
,b2,4,5,5
c3
,d,6
e,631
,f,2
g27,
,h2
f09,2,

Desired result:

a,a,a,a,a,a,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,4
b,b,b,b,2,2,2,2,2,2,2,2,2,2,2,2,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5
c,c,c,c,c,c,c,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3
d,d,d,d,d,d,d,d,d,d,d,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6
e,e,e,e,e,e,e,e,e,e,e,e,e,e,e,e,6,6,6,6,6,3,3,3,3,3,3,3,3,3,1
f,f,f,f,f,f,f,f,f,f,f,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
g,g,g,g,g,g,g,g,g,g,g,g,g,g,g,2,2,2,2,2,2,2,2,2,7,7,7,7,7,7,7
h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,2
f,f,f,f,f,f,f,f,f,f,f,0,0,0,0,0,0,0,0,0,0,0,0,0,9,9,9,9,2,2,2

In other words:

  • Search the line for the first character (or string) between two commas;
  • Put this character in any gap (in between two consecutive commas) until you reach a new character or string;
  • If the line starts with a comma, find the first character or string in the line and then replace any preceding gaps with this character or string.
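The three steps above can also be sketched without a regex. This is a minimal Python illustration, assuming each line is a plain comma-separated record with no quoted fields; the name `fill_line` is made up for this example.

```python
def fill_line(line: str) -> str:
    """Fill empty fields with the nearest value to their left.

    Leading gaps (before any value has been seen) are filled with the
    first value that appears in the line.
    """
    fields = line.split(",")

    # The first non-empty field fills any gaps at the start of the line.
    current = next((f for f in fields if f != ""), "")

    filled = []
    for field in fields:
        if field != "":
            current = field  # a new value starts a new run
        filled.append(current)
    return ",".join(filled)


print(fill_line(",b,,,2,,4"))  # prints: b,b,b,b,2,2,4
```

Splitting on commas keeps every field position, so the output always has the same number of fields as the input.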

As far as I can tell, I'm telling it to find any alphanumeric character followed by a comma or a comma followed by a comma and to replace each instance with the first token (alpha). But what it seems to be doing is treating alpha comma and every comma comma thereafter as one token group. How can I make it so that each alpha comma or comma comma is its own group and is replaced with the token (alpha) and a comma?
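A short experiment may help explain the result. `$1` refers to the capture group in the first alternative only. When the `,,` alternative matches, that group has captured nothing, so the replacement is empty and both commas disappear. Each match is also replaced on its own, so no value is carried over from an earlier match. The sketch below reproduces this with Python's `re` module (3.5 or later). That is an assumption, since the original test ran in a different regex tool. It also spells the class as `[A-Za-z0-9]`, because the range `A-z` additionally covers a few punctuation characters such as `[` and `_`.

```python
import re

# The pattern from the question, with the character class spelled out.
PATTERN = re.compile(r"([A-Za-z0-9]),|,,")


def collapse(line: str) -> str:
    # \1 is Python's spelling of $1. It is empty whenever the ",,"
    # alternative matched, so those two commas are simply deleted.
    return PATTERN.sub(r"\1", line)


print(collapse("a,,,1,,b"))  # prints: a1,b
```

A single substitution like this cannot insert the previous value into each gap, because a match on `,,` has no access to what an earlier match captured.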

CodePudding user response:

This isn't the most efficient macro but it worked for me. It was kind of fun to try and solve too, as sad as that may sound. Feel free to use it or go with your other approach.

The assumption being made is that it should look up before it looks down.

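Public Sub DoSmartFillDownAndUp()
    Dim objCell As Range, bFinished As Boolean
    
    ' Suppress redrawing while the cells are being filled.
    Application.ScreenUpdating = False
    
    ' Repeat passes until no blank cell remains in the selection.
    Do While bFinished = False
        For Each objCell In Selection
            ' Look up first: copy the value from the cell above.
            If objCell.Value = "" Then
                If objCell.Row > 1 Then
                    objCell.Value = objCell.Offset(-1, 0).Value
                End If
            End If
             
            ' Still blank, so look down: copy the value from the cell below.
            If objCell.Value = "" Then
                objCell.Value = objCell.Offset(1, 0).Value
            End If
        Next
        
        bFinished = True
        
        ' Another pass is needed if any selected cell is still blank.
        For Each objCell In Selection
            If objCell.Value = "" Then
                bFinished = False
                Exit For
            End If
        Next
    Loop
    
    ' Turn redrawing back on.
    Application.ScreenUpdating = True
End Sub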

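Put that into a new module, select the range you want to fill, and then run the macro. Make sure every selected column contains at least one value, because the macro keeps looping until no selected cell is blank.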

CodePudding user response:

Suppose part of your spreadsheet looks like the following.

   A B C D E F G
   -------------
 1|a   1     2  
 2|  b   3     4
 3|    c   5
 4|      d   6 7
 5|  8     e 
 6|    9     f
 7|      0     g

As I understand, you wish to compute the following, where row 11 corresponds to row 1, row 12 corresponds to row 2 and so on.

   A B C D E F G
   -------------
11|a b 1 3 5 2 4
12|a b 1 3 5 2 4
13|a b c 3 5 2 4
14|a b c d 5 6 7 
15|a 8 c d e 6 7
16|a 8 9 d e f 7
17|a 8 9 0 e f g

Cell A11 equals the value in cell Ar, where r, 1 <= r <= 7, is the smallest row number such that cell Ar is non-empty. (I haven't used Excel in years, but my recollection is that the row is determined using an "index" formula.) B11, C11 and so on are computed analogously.

Cell A12 equals the contents of A2 if A2 is non-empty; else it equals the value in A11. Cells B12-G12 are computed similarly.

Cell A13 is similarly computed from cells A3 and A12. All remaining cells are computed in the same way: cell cr is computed from cells c(r-10) and c(r-1).
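The recurrence described above can be sketched outside Excel as well. This is a rough Python illustration, assuming the sheet is held as a list of equal-length rows with empty strings for blank cells; `fill_columns` is a hypothetical name, not an Excel function.

```python
def fill_columns(rows):
    """Return a filled copy of `rows` (a list of equal-length lists).

    The first output row takes, per column, the first non-empty value
    in that column. Each later row takes its own value if present,
    otherwise the value from the output row above it.
    """
    if not rows:
        return []
    width = len(rows[0])

    # First output row: first non-empty value found in each column.
    first = [next((r[c] for r in rows if r[c] != ""), "") for c in range(width)]

    result = [first]
    for row in rows[1:]:
        above = result[-1]
        result.append([row[c] if row[c] != "" else above[c] for c in range(width)])
    return result


# Rows 1-7 of the example sheet, written as comma-separated text.
sheet = [line.split(",") for line in [
    "a,,1,,,2,",
    ",b,,3,,,4",
    ",,c,,5,,",
    ",,,d,,6,7",
    ",8,,,e,,",
    ",,9,,,f,",
    ",,,0,,,g",
]]

for filled_row in fill_columns(sheet):
    print(" ".join(filled_row))
```

Run on the example sheet, this should reproduce rows 11 to 17 shown above.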

CodePudding user response:

Have you tried this regex in Notepad? It comes fairly close to your desired output, and it is based on the sample text you gave.

Find: ,*(\w),*
Replace All: $1,

See the result in the image.
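To make clear what this find/replace pair does, here is a small Python sketch. Python writes `$1` as `\1`, and Notepad's regex engine may differ in edge cases, so treat this as an approximation. The name `squeeze` is made up for the example.

```python
import re


def squeeze(line: str) -> str:
    # Any commas, one word character, any commas -> that character plus ","
    return re.sub(r",*(\w),*", r"\1,", line)


print(squeeze("a,,,1,,b"))  # prints: a,1,b,
```

Each run of commas around a value collapses into a single comma, so the line gets shorter instead of having its gaps filled with the neighbouring value.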
