Home > Software design >  VBA File Collections need to only add specific files
VBA File Collections need to only add specific files

Time:01-19

I have never worked with file collections before, but I was able to find the code below (https://danwagner.co/how-to-combine-multiple-excel-workbooks-into-one-worksheet-with-vba/). I have a file location that could have over 120 files. I needed the sub to browse to that file location, loop through the files and copy/append data to a new workbook. And that parts works perfectly. My issue is that I don't need it to add all the files to the collection. Each filename begins with a 4 digit year, i.e. 2019_M05 (meaning May of 2019). I only need it to look at the past 7 years files. Ive tried using an if on the strFile name, but it locks my excel every time. Unfortunately, they need all the data in one file and it could be over 500k lines. Any suggestions would be appreciated.

Public Sub Create820Accumulatorfile()
    Dim wb1 As Workbook
    Dim strDirContainingFiles As String, strFile As String, _
        strFilePath As String, stryears As String
    Dim wbkDst As Workbook, wbkSrc As Workbook
    Dim wksDst As Worksheet, wksSrc As Worksheet
    Dim lngIdx As Long, lngSrcLastRow As Long, _
        lngSrcLastCol As Long, lngDstLastRow As Long, _
        lngDstLastCol As Long, lngDstFirstFileRow As Long
    Dim rngSrc As Range, rngDst As Range, rngFile As Range
    Dim colFileNames As Collection
    Set colFileNames = New Collection
    Dim StartingTime As Single
    
    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.Calculate = xlCalculationManual
    
    Set wb1 = ThisWorkbook
    StartingTime = Timer
    
    'Set references up-front
    strDirContainingFiles = wb1.Sheets("Start Here").Range("B11").Value '<~ your folder
    stryears = wb1.Sheets("Start Here").Range("B12").Value '<~ years for files to include
    Set wbkDst = Workbooks.Add '<~ Dst is short for destination
    Set wksDst = wbkDst.ActiveSheet
    
    'Store all of the file names in a collection
    strFile = Dir(strDirContainingFiles & "\*.xlsx")
    Do While Len(strFile) > 0
        colFileNames.Add Item:=strFile
        strFile = Dir
    Loop
    
    ''CHECKPOINT: make sure colFileNames has the file names
    'Dim varDebug As Variant
    'For Each varDebug In colFileNames
    '    Debug.Print varDebug
    'Next varDebug
    
    'Now we can start looping through the "source" files
    'and copy their data to our destination sheet
    For lngIdx = 1 To colFileNames.Count
        
        'Assign the file path
        strFilePath = strDirContainingFiles & "\" & colFileNames(lngIdx)
        
        'Open the workbook and store a reference to the data sheet
        Set wbkSrc = Workbooks.Open(strFilePath)
        Set wksSrc = wbkSrc.Worksheets("Excel_Destination") '<~ change based on your Sheet name
        
        'Identify the last row and last column, then
        'use that info to identify the full data range
        lngSrcLastRow = LastOccupiedRowNum(wksSrc)
        lngSrcLastCol = LastOccupiedColNum(wksSrc)
        With wksSrc
            Set rngSrc = .Range(.Cells(1, 1), .Cells(lngSrcLastRow, _
                                                     lngSrcLastCol))
        End With
        
        ''CHECKPOINT: make sure we have the full source data range
        'wksSrc.Range("A1").Select
        'rngSrc.Select
        
        'If this is the first (1st) loop, we want to keep
        'the header row from the source data, but if not then
        'we want to remove it
        If lngIdx <> 1 Then
            Set rngSrc = rngSrc.Offset(1, 0).Resize(rngSrc.Rows.Count - 1)
        End If
        
        ''CHECKPOINT: make sure that we remove the header row
        ''from the source range on every loop that is not
        ''the first one
        'wksSrc.Range("A1").Select
        'rngSrc.Select
        
        'Copy the source data to the destination sheet, aiming
        'for cell A1 on the first loop then one past the
        'last-occupied row in column A on each following loop
        If lngIdx = 1 Then
            lngDstLastRow = 1
            Set rngDst = wksDst.Cells(1, 1)
        Else
            lngDstLastRow = LastOccupiedRowNum(wksDst)
            Set rngDst = wksDst.Cells(lngDstLastRow   1, 1)
        End If
        rngSrc.Copy Destination:=rngDst '<~ this is the copy / paste
        
        'Almost done! We want to add the source file info
        'for each of the data blocks to our destination
        
        'On the first loop, we need to add a "Source Filename" column
        If lngIdx = 1 Then
            lngDstLastCol = LastOccupiedColNum(wksDst)
            wksDst.Cells(1, lngDstLastCol   1) = "Source Filename"
        End If
        
        'Identify the range that we need to write the source file
        'info to, then write the info
        With wksDst
        
            'The first row we need to write the file info to
            'is the same row where we did our initial paste to
            'the destination file
            lngDstFirstFileRow = lngDstLastRow   1
            
            'Then, we need to find the NEW last row on the destination
            'sheet, which will be further down (since we pasted more
            'data in)
            lngDstLastRow = LastOccupiedRowNum(wksDst)
            lngDstLastCol = LastOccupiedColNum(wksDst)
            
            'With the info from above, we can create the range
            Set rngFile = .Range(.Cells(lngDstFirstFileRow, lngDstLastCol), _
                                 .Cells(lngDstLastRow, lngDstLastCol))
                                 
            ''CHECKPOINT: make sure we have correctly identified
            ''the range where our file names will go
            'wksDst.Range("A1").Select
            'rngFile.Select
                                 
            'Now that we have that range identified,
            'we write the file name
            rngFile.Value = wbkSrc.Name
            
        End With
        
        'Close the source workbook and repeat
        wbkSrc.Close SaveChanges:=False
        
    Next lngIdx
    
    Application.ScreenUpdating = True
    Application.EnableEvents = True
    Application.Calculate = xlCalculationAutomatic
    
    'Let the user know that the combination is done!
    MsgBox "Data combined! " & Format((Timer - StartingTime) / 86400, "hh:mm:ss")
    
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'INPUT       : Sheet, the worksheet we'll search to find the last row
'OUTPUT      : Long, the last occupied row
'SPECIAL CASE: if Sheet is empty, return 1
Public Function LastOccupiedRowNum(Sheet As Worksheet) As Long
    Dim lng As Long
    If Application.WorksheetFunction.CountA(Sheet.Cells) <> 0 Then
        With Sheet
            lng = .Cells.Find(What:="*", _
                              After:=.Range("A1"), _
                              LookAt:=xlPart, _
                              LookIn:=xlFormulas, _
                              SearchOrder:=xlByRows, _
                              SearchDirection:=xlPrevious, _
                              MatchCase:=False).Row
        End With
    Else
        lng = 1
    End If
    LastOccupiedRowNum = lng
End Function

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'INPUT       : Sheet, the worksheet we'll search to find the last column
'OUTPUT      : Long, the last occupied column
'SPECIAL CASE: if Sheet is empty, return 1
Public Function LastOccupiedColNum(Sheet As Worksheet) As Long
    Dim lng As Long
    If Application.WorksheetFunction.CountA(Sheet.Cells) <> 0 Then
        With Sheet
            lng = .Cells.Find(What:="*", _
                              After:=.Range("A1"), _
                              LookAt:=xlPart, _
                              LookIn:=xlFormulas, _
                              SearchOrder:=xlByColumns, _
                              SearchDirection:=xlPrevious, _
                              MatchCase:=False).Column
        End With
    Else
        lng = 1
    End If
    LastOccupiedColNum = lng
End Function

I know I need to add it in this section of the code. I tried creating a variable to hold the year of the file and it tests against a user inputted starting date, but the loop is going through 100 files and it crashes my excel. I don't get any errors other than the crash.

    'Store all of the file names in a collection
    Dim fileyear as long

    strFile = Dir(strDirContainingFiles & "\*.xlsx")
    Do While Len(strFile) > 0
        fileyear = left(strFile, 2)
        if fileyear >= wb1.Sheets("Start Here").Range("B12").Value then
            colFileNames.Add Item:=strFile
            strFile = Dir
        end if
    Loop

CodePudding user response:

Writing Consecutive Numbers to a Dictionary

  • The following is something like the idea presented by Daniel Dušek in the comments.
  • Here is a great dictionary resource. Here is a Youtube playlist from the same author.
Dim YearsCount As Long
YearsCount = wb1.Sheets("Start Here").Range("B12").Value

Dim LastYear As Long: LastYear = Year(Date) ' current year...
' ... or read from a cell like the years count

Dim dictYears As Object
Set dictYears = CreateObject("Scripting.Dictionary")

Dim y As Long
For y = 0 To YearsCount - 1
    dictYears(CStr(LastYear - y)) = Empty
Next y

strFile = Dir(strDirContainingFiles & "\*.xlsx")
Do While Len(strFile) > 0
    If dict.Exists(Left(strFile, 4)) Then
        colFileNames.Add Item:=strFile
        strFile = Dir
    End If
Loop
  • Related