Filter table and copy to new file in Pandas (VBA-like) dynamically-CodePudding

I have a file (CSV) containing data that looks something like this in a table format:

ID	employee	date	value1	value2
1	a	2022-01-01	123	456
2	b	2022-01-01	123	456
3	a	2022-01-01	123	456
4	c	2022-01-01	123	456
5	d	2022-01-01	123	456
6	b	2022-01-01	123	456
7	e	2022-01-01	123	456
8	e	2022-01-01	123	456

In Excel I created a macro with VBA that creates new workbooks for the table filtered by the employee column. The macro works dynamically by creating a dictionary of the unique values in column [employee] and a For Each loop that filters the original table by the key of the dictionary. In this example it would create 5 new workbooks (for employee "a", "b", "c", "d" and "e") displaying only their corresponding data. E.g. for employee a:

ID	employee	date	value1	value2
1	a	2022-01-01	123	456
3	a	2022-01-01	123	456

VBA code:

Sub Filter_Copy()

Dim rng As Range
Dim wb_new As Workbook
Dim dic As Object
Dim cell AsSub Range
Dim key As Variant

Set rng = Table1.ListObjects("table").Range
Set dic = CreateObject("Scripting.Dictionary")

For Each cell In Range("table[column]")
    dic(cell.Value) = 0
Next cell

For Each key In dic.keys
    rng.AutoFilter 5, key
    Set wb_new = Workbooks.Add
    rng.SpecialCells(xlCellTypeVisible).Copy wb_new.Worksheets(1).Range("A1")
    wb_new.Close True, ThisWorkbook.Path & "\" & key & ".xlsx"
    
Next key

End Sub

Now I want to do the exact same in Python but I am lost with using a for loop to loop through a dictionary of the unique values in [employee]. I have no troubles doing it manually for every employee by creating a dedicated dataframe for each unique value, filtering it like df.employee == "a" and saving it to_excel() but doing it dynamically is something different.

So far I have the obvious:

import pandas as pd

file = "filename.csv"

df = pd.read_csv(file)

dict = dict(df["employee"].unique())

CodePudding user response：

Use pandas.groupby and iterate through them:

groupobj = df.groupby('employee')
for name,group in groupobj:
    group.to_excel(f'./{name}.xlsx')