I'm creating a macro that is formatting a collection of files and a step in this process is to delete columns that aren't required, keeping a specific set of columns.
I know I can delete columns based on their location and I have this approach implemented already ie 1,3,7 etc or A, C, G etc. But I'm conscious that the report being used might change layout or add extra columns over time and I want to ensure the required columns are kept.
Ideally this code would cycle through each column header starting at A1 until the last column and delete an entire column if the header value isn't found in a list. This list will be an array captured from a range in one of the sheets in the workbook.
List = {Blue, Green, Orange}
Original Table
Blue | Red | Green | Orange | Black |
---|---|---|---|---|
row | row | row | row | row |
Formatted Table
Blue | Green | Orange |
---|---|---|
row | row | row |
Does anyone have any suggestions on the approach I could take to get this working or if it's even possible? Any help would be greatly appreciated
CodePudding user response:
You might profit from the following approach by reordering a datafield array via Application.Index
which allows even to move the existing columns to any new position.
Note: this flexible solution can be time consuming for greater data sets, where I would prefer other ways you can find in a lot of answers at SO.
Sub ReorderColumns()
Const headerList As String = "Blue,green,Orange"
'a) define source range
Dim src As Range
Set src = Tabelle3.Range("A1:E100")
'b) define headers
Dim allHeaders: allHeaders = src.Resize(1).Value2
Dim newHeaders: newHeaders = Split(headerList, ",")
'c) get column positions in old headers
Dim cols
cols = getCols(newHeaders, allHeaders)
'd) define data
Dim data As Variant
data = src.Value2
'e) reorder data based on found column positions
data = Application.Index(data, Evaluate("row(1:" & UBound(data) & ")"), cols)
'f) overwrite source data
src = vbNullString ' clear
src.Resize(UBound(data), UBound(data, 2)) = data
End Sub
Help function getCols()
Function getCols(individualHeaders, allHeaders)
'Purp: get 1-based column numbers of found headers via Match
getCols = Application.Match(individualHeaders, allHeaders, 0) ' 1-based
End Function
CodePudding user response:
Dynamic Named Range
I think a dynamic named range is an excellent choice for storing and retrieving your required columns. Please see the link I provided from https://exceljet.net/ to setup your dynamic named range.
Generic formula
=$A$2:INDEX($A:$A,COUNTA($A:$A))
Regular Expression Approach
After reading in your named range, one approach for testing your columns is using regular expressions. To use this you will need to set a library reference to Microsoft VBScript Regular Expressions 5.5
. The pipe character |
represents an or
statement, so we can join our array using that delimiter.
Deleting Ranges in loops
When deleting columns or rows within a loop, the best approach I have found is to union
the ranges together in a variable and execute the deletion in one go. This helps performance and it prevents errors from deleting ranges the loop is working on.
I do this so often that I created a custom function for this UnionRange
' Helper function that allows
' concatinating ranges together
Public Function UnionRange( _
ByRef accumulator As Range, _
ByRef nextRange As Range _
)
If accumulator Is Nothing Then
Set UnionRange = nextRange
Else
Set UnionRange = Union(accumulator, nextRange)
End If
End Function
Putting it all together
Below is my implementation of what your code could look like, just make sure to first:
- Create a Dynamic Named Range and populate with your required headers
- Add
Microsoft VBScript Regular Expressions 5.5
reference - Update
Sheet1
to whatever sheet your table exists (possibly change logic for finding header row based on your needs)
' Need Regular Expressions Referenced in order to work!
' @libraryReference {Microsoft VBScript Regular Expressions 5.5}
Public Sub DemoDeletingNonRequiredColumns()
' Make sure to create a named range
' otherwise this section will fail. In this
' example the named range is `RequiredColumns`
Dim requiredColumns() As Variant
requiredColumns = Application.WorksheetFunction.Transpose( _
Range("RequiredColumns").Value2 _
)
' To test if the column is in the required
' columns this method uses regular expressions.
With New RegExp
.IgnoreCase = True
' The pipe charactor is `or` in testing.
.Pattern = Join(requiredColumns, "|")
Dim headerRow As Range
' This example uses `Sheet1`, but update to
' the actual sheet you are using.
With Sheet1
Set headerRow = .Range("A1", .Cells(1, Columns.Count).End(xlToLeft))
End With
Dim column As Range
For Each column In headerRow
' If the column name doesn't match the
' pattern, then concatinate it to the
' toDelete range.
If Not .Test(column.Value2) Then
Dim toDelete As Range
Set toDelete = UnionRange(toDelete, column.EntireColumn)
End If
Next
End With
' toDelete is used as it provides better performance
' and it also prevents errors when deleting columns
' while looping.
If Not toDelete Is Nothing Then
toDelete.Delete
End If
End Sub