Hey guys so I'm making a dictionary file that has a table name and column name and table name is repeated many times for how many column names there are in the excel file. So for example
| table_name| column_name|
| ----------|------------|
| players | name |
| players | height |
| players | age |
| teams | name |
| teams | city |
| teams | owner |
and it goes down and down. I have around 1000 rows here, so I could type them up in the format that I desire but feels like it would take a lot of time. Here is the format that I'm trying to get in a list of dictionaries.
[{'players':['name', 'height', 'age']}, {'teams':['name', 'city', 'owner']}, ....]
CodePudding user response:
One option can be to read an excel file with pandas.
You can use pandas.DataFrame.groupby()
then get the result of groupby as list
with apply
. At then end use pandas.Series.to_dict()
.
import pandas as pd
file_path = "Book1.xlsx"
df = pd.read_excel(file_path)
# >>> df
# table_name column_name
# 0 players name
# 1 players height
# 2 players age
# 3 teams name
# 4 teams city
# 5 teams owner
dct = df.groupby('table_name')['column_name'].apply(list).to_dict()
# dct -> {'players': ['name', 'height', 'age'], 'teams': ['name', 'city', 'owner']}
# For converting the above 'dict', you can use the below 'list comprehension':
lst_dct = [{k:v} for k,v in dct.items()]
print(lst_dct)
Output:
[{'players': ['name', 'height', 'age']}, {'teams': ['name', 'city', 'owner']}]
CodePudding user response:
ahh thanks I'mahdi, I actually didn't see your answer, and my answer is actually pretty close to yours, just posting it just in case there is a need for a dictionary of all tables minus the list. but just saw you also included that as well :). Glad we came to the same conclusion here that pandas is a nice library to use.
import pandas
def excel_to_dict():
csvFile = pandas.read_csv('qv_columns.csv')
tables_grouped = csvFile.groupby('TABLE_NAME').
['COLUMN_NAME'].agg(list)
tables_dict = tables_grouped.to_dict()
print(tables_dict)