I am trying to read an Excel file with merged cells and create a list of dictionaries from it. The DataFrame looks like:
Name Group Type Values
0 AC01 A01 Individual Value 1
1 NaN NaN NaN Value 2
2 NaN NaN NaN Value 3
3 NaN NaN NaN Value 4
4 NaN NaN NaN Value 5
5 AC02 A02 Individual Value 6
6 NaN NaN NaN Value 7
7 NaN NaN NaN Value 8
8 NaN NaN NaN Value 9
9 NaN NaN NaN Value 10
10 AC03 A03 Individual Value 11
11 NaN NaN NaN Value 12
12 NaN NaN NaN Value 13
13 NaN NaN NaN Value 14
14 NaN NaN NaN Value 15
And to create a list of dictionaries, I am doing:
import pandas as pd
data = pd.read_excel(io="merged_cells.xlsx", usecols="A:D")
df = (
pd.DataFrame(data)
.rename(
columns={
data.columns[0]: "name",
data.columns[1]: "grp",
data.columns[2]: "type",
data.columns[3]: "values",
}
)
.to_dict(orient="records")
)
Which results in:
[{'grp': 'A01', 'name': 'AC01', 'type': 'Individual', 'values': 'Value 1'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 2'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 3'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 4'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 5'},
{'grp': 'A02', 'name': 'AC02', 'type': 'Individual', 'values': 'Value 6'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 7'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 8'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 9'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 10'},
{'grp': 'A03', 'name': 'AC03', 'type': 'Individual', 'values': 'Value 11'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 12'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 13'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 14'},
{'grp': nan, 'name': nan, 'type': nan, 'values': 'Value 15'}]
What I am trying to achieve is that the last column to be converted into a list in each dictioary to be like:
[
{
"grp": "A01",
"name": "AC01",
"type": "Individual",
"values": ["Value 1", "Value 2", "Value 3", "Value 4", "Value 5"]
}
{
"grp": "A02",
"name": "AC02",
"type": "Individual",
"values": ["Value 6", "Value 7", "Value 8", "Value 9", "Value 10"]
}
{
"grp": "A03",
"name": "AC03",
"type": "Individual",
"values": ["Value 11", "Value 12", "Value 13", "Value 14", "Value 15"]
}
]
CodePudding user response:
Use fillna to fill forward values and aggregate Values
to list after create groups:
out = df.fillna().groupby(['Name', 'Group', 'Type']) \
.agg({'Values': list}).reset_index() \
.to_dict(orient='index').values()
Output:
>>> list(out)
[{'Name': 'AC01',
'Group': 'A01',
'Type': 'Individual',
'Values': ['Value 1', 'Value 2', 'Value 3', 'Value 4', 'Value 5']},
{'Name': 'AC02',
'Group': 'A02',
'Type': 'Individual',
'Values': ['Value 6', 'Value 7', 'Value 8', 'Value 9', 'Value 10']},
{'Name': 'AC03',
'Group': 'A03',
'Type': 'Individual',
'Values': ['Value 11', 'Value 12', 'Value 13', 'Value 14', 'Value 15']}]
CodePudding user response:
IIUC, you can try:
(
df.ffill()
.groupby(df.columns.to_list()[:-1], as_index=False)
.agg(list)
.to_dict('records')
)