I have the excels file with merged cells on column A and 2 values on column B and C like on this image https://i.stack.imgur.com/wnPsf.png (sorry I'm not allowed to upload images yet) In case the image not working, I'm posting the example too:
A B C
B1 C1
A B2 C2
B3 C3
The results I want is {A:[(B1,C1),(B2,C2),(B3,C3)]}, is there any way to do this? Thank you very much
EDIT 1: this is the output of print(df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 A 0 non-null float64
1 B1 2 non-null object
2 C1 2 non-null object
dtypes: float64(1), object(2)
memory usage: 176.0 bytes
None
EDIT 2: I followed answer of @jezrael and its worked
CodePudding user response:
First forward filling missing values and then create nested lists, groupby
is used for working if multiple categories in column A
:
df = pd.DataFrame({'A': {0: 'A', 1: np.nan, 2: np.nan},
'B': {0: 'B1', 1: 'B2', 2: 'B3'},
'C': {0: 'C1', 1: 'C2', 2: 'C3'}})
df['A'] = df['A'].ffill()
d = df.groupby('A')[['B','C']].apply(lambda x: [tuple(y) for y in x.to_numpy()]).to_dict()
print (d)
{'A': [('B1', 'C1'), ('B2', 'C2'), ('B3', 'C3')]}