Home > front end >  Ignore merged cells in rows when reading from a Word table
Ignore merged cells in rows when reading from a Word table

Time:01-31

I am a new VBA user, and wanting to copy data from a table in Word into Excel. With support of lovely online folk, I have managed to get the following which seems to work:

Sub extractData()
    Dim wd As New Word.Application
    Dim doc As Word.Document
    Dim sh As Worksheet
    Dim Tbls As Word.Tables, r As Variant, cell As Variant

    wd.Visible = True
    Set doc = wd.Documents.Open("\\file.docx", ReadOnly:=True)

    Set Tbls = doc.Tables

    Set sh = ActiveSheet
        

    For Each r In Tbls(1).Rows
        
        For Each cell In r.Cells
                    sh.Cells(cell.Row.Index, cell.Column.Index).Value = Application.WorksheetFunction.Clean(cell.Range.Text)
        Next cell
    Next r


End Sub

However, the table in Word has horizontal cell merging, and VBA can't work with mixed cell widths. I do not need these rows to be copied, so is there a way to count how many columns (should be 6), and if less than 6 to then skip the row?

I have tried looking online, but cannot find how to get it to count the cells in each row, and then skip if less than 6. I feel like it should be pretty straightforward, but I'm very new to this! Any suggestions? Thanks!

CodePudding user response:

a) I am surprised that you state that "VBA can't work with mixed cell widths" - I did a test and your code should work without issues on tables where you have (horizontally) merged cells. There is a problem with the code when you have vertically merged cells because in that case you cannot access the individual rows of the table

b) if you want to loop over all cells of a table no matter if it has merged cells or not, you can loop over all cells of the Range of the table:

Dim rng As Word.Range
Dim tbl As Word.Table
Set tbl = doc.Tables(1)
Set rng = tbl.Range
For Each cell In rng.Cells
    Debug.Print cell.RowIndex, cell.ColumnIndex, cell.Range.Text
Next

The properties RowIndex and ColumnIndex give you the row resp. column number of the cell - if it is a merged cell, it's the first (top left) cell.

c) To get the number of cells in a row or column, you can use the cells.Count property. As mentioned, this will work for rows only if there are no vertically merged cells, similarly, it will for columns only if there are no horizontally merged cells.

Dim row As Word.row, col As Word.Column
For Each r In tbl.Rows
    r.Cells.Count
Next r
Dim c As Word.Column
For Each c In Tbls(1).Columns
    c.Cells.Count
Next

CodePudding user response:

A row has a collection of cells, and collections always have a Count property.

Sub extractData()
    Dim wd As New Word.Application
    Dim doc As Word.Document
    Dim sh As Worksheet
    Dim Tbls As Word.Tables, r As Word.Row, cell As Word.Cell

    wd.Visible = True
    Set doc = wd.Documents.Open("\\file.docx", ReadOnly:=True)

    Set Tbls = doc.Tables

    Set sh = ActiveSheet

    For Each r In Tbls(1).Rows
        If r.Cells.Count = 6 Then
            For Each cell In r.Cells
                    sh.Cells(cell.Row.Index, cell.Column.Index).Value = Application.WorksheetFunction.Clean(cell.Range.Text)
            Next cell
        End if
    Next r    

End Sub
  • Related