I have a Python dictionary whose keys are tuples of two strings and whose values are integers.
It looks like this:
mydic = { ('column1', 'index1'):33,
('column1', 'index2'):34,
('column2', 'index1'):35,
('column2', 'index2'):36 }
The first string of each tuple should be used as the column name in the dataframe, and the second string should be used as the index label.
The dataframe from this should look like this:
(index) | column1 | column2 |
---|---|---|
index1 | 33 | 35 |
index2 | 34 | 36 |
Is there any way to do this?
(Or do I have to loop through all elements of the dictionary and build the dataframe one value at a time by hand?)
CodePudding user response:
Build a pd.Series first (which will have a MultiIndex), then use pd.Series.unstack to move the first level into the column names.
import pandas as pd
df = pd.Series(mydic).unstack(0)
print(df)
        column1  column2
index1       33       35
index2       34       36
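For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach, split into two steps so the intermediate MultiIndex is visible. The variable name `s` is just for illustration and is not part of the answer above.

```python
import pandas as pd

mydic = {
    ('column1', 'index1'): 33,
    ('column1', 'index2'): 34,
    ('column2', 'index1'): 35,
    ('column2', 'index2'): 36,
}

# A dict with tuple keys becomes a Series with a two-level MultiIndex:
# level 0 holds 'column1'/'column2', level 1 holds 'index1'/'index2'.
s = pd.Series(mydic)

# unstack(0) pivots level 0 into the columns; level 1 stays as the row index.
df = s.unstack(0)
print(df)
```

Passing 0 matters here: a bare unstack() pivots the last level instead, which would put index1/index2 in the columns.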
CodePudding user response:
You can use pd.MultiIndex.from_tuples.
mydic = { ('column1', 'index1'):33,
('column1', 'index2'):34,
('column2', 'index1'):35,
('column2', 'index2'):36 }
df = pd.DataFrame(mydic.values(), index = pd.MultiIndex.from_tuples(mydic))
                 0
column1 index1  33
        index2  34
column2 index1  35
        index2  36
Both strings are still in the row index at this point, so what comes after that is just a workaround to reshape it.
df.T.stack()
          column1  column2
0 index1       33       35
  index2       34       36
Notice that the index now has two levels. Do not forget to reset it.
df.T.stack().reset_index().drop('level_0', axis = 1)
  level_1  column1  column2
0  index1       33       35
1  index2       34       36
You can rename the level_1 column if you want to. Hope it helps.
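As a rough sketch of that last step, the chain below repeats the workaround and then renames level_1 and makes it the index, so the result matches the layout asked for in the question. The new name 'index' and the variable `result` are arbitrary choices for illustration, and the explicit list(...) calls are only there to be safe across pandas versions.

```python
import pandas as pd

mydic = {
    ('column1', 'index1'): 33,
    ('column1', 'index2'): 34,
    ('column2', 'index1'): 35,
    ('column2', 'index2'): 36,
}

# One unnamed column (labelled 0) with a two-level row index.
df = pd.DataFrame(
    list(mydic.values()),
    index=pd.MultiIndex.from_tuples(list(mydic)),
)

result = (
    df.T.stack()                            # rows: (0, index1), (0, index2)
      .reset_index()                        # adds level_0 and level_1 columns
      .drop('level_0', axis=1)              # level_0 only holds the old label 0
      .rename(columns={'level_1': 'index'}) # 'index' is an arbitrary name
      .set_index('index')                   # use index1/index2 as row labels
)
print(result)
```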