Let's say we have a dataframe like this
df = pd.DataFrame(
np.arange(2, 11).reshape(-1, 3),
index=list('ABC'),
columns=pd.MultiIndex.from_arrays([
['data1', 'data2','data3'],
['F', 'K',''],
['', '','']
], names=['meter', 'Sleeper',''])
).rename_axis('Index')
df
meter data1 data2 data3
Sleeper F K
Index
A 2 3 4
B 6 7 8
C 10 11 12
df.index.names
FrozenList(['Index'])
When I do unstack
df=df.unstack().reset_index()
I'm getting this
meter Sleeper Index 0
0 data1 F A 2
1 data1 F B 5
2 data1 F C 8
3 data2 K A 3
4 data2 K B 6
5 data2 K C 9
6 data3 A 4
7 data3 B 7
8 data3 C 10
but I don't want to have data3
column under the meter
. I want to keep as it is as a column .
How can we achieve that ?
I tried
df.set_index(['meter','Sleeper']).Value(fill_value='').rename_axis(None, 1).reset_index()
but I got
>KeyError: "None of ['meter', 'Sleeper'] are in the columns"
How to pick which column to unstack a dataframe on
CodePudding user response:
Instead of unstack
, you can set data3
as index by .set_index()
with append=True
and then .stack()
on both 2 levels of column MultiIndex, as follows:
(df.set_index('data3', append=True)
.stack([0, 1])
.reset_index()
.rename({'': '0'}, axis=1)
)
Data Input:
df = pd.DataFrame(
np.arange(2, 11).reshape(-1, 3),
index=list('ABC'),
columns=pd.MultiIndex.from_arrays([
['data1', 'data2','data3'],
['F', 'K',''],
['', '','']
], names=['meter', 'Sleeper',''])
).rename_axis('Index')
df
meter data1 data2 data3
Sleeper F K
Index
A 2 3 4
B 5 6 7
C 8 9 10
Output
Index data3 meter Sleeper 0
0 A 4 data1 F 2.0
1 A 4 data2 K 3.0
2 B 7 data1 F 5.0
3 B 7 data2 K 6.0
4 C 10 data1 F 8.0
5 C 10 data2 K 9.0