I have a DataFrame like this:
   A  B  C
1  1  2  3
2  4  2  5
I want to combine each row label and column label into a new index, keeping the corresponding cell value, as below:
    Value
1A      1
1B      2
1C      3
2A      4
2B      2
2C      5
I know I can iterate through it with df.iterrows() and create a new DataFrame from that, but that is too inefficient for my dataset, which has tens of millions of observations.
CodePudding user response:
Use DataFrame.stack with a list comprehension:
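# stack() moves the column labels into an inner index level
df = df.stack().to_frame('Value')
# join each (row label, column label) pair into a single string label
df.index = [f'{a}{b}' for a, b in df.index]
print(df)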
    Value
1A      1
1B      2
1C      3
2A      4
2B      2
2C      5
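If the list comprehension becomes the bottleneck at tens of millions of rows, one option is to concatenate the two index levels as strings instead of looping over tuples in Python. This is only a sketch on the sample data from the question; the name `out` is illustrative, and whether it is actually faster on your data is an assumption worth timing.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({'A': [1, 4], 'B': [2, 2], 'C': [3, 5]}, index=[1, 2])

# stack() yields a Series with a two-level (row label, column label) index.
out = df.stack().to_frame('Value')

# Convert each level to strings and concatenate them element-wise.
rows = out.index.get_level_values(0).astype(str)
cols = out.index.get_level_values(1).astype(str)
out.index = rows + cols

print(out)
```

The result has the same labels and values as the list-comprehension version; only the way the labels are built differs.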
Or use a NumPy solution with ravel:
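import numpy as np
import pandas as pd

# column labels repeated once per row, row labels repeated once per column
c = np.tile(df.columns, len(df))
i = np.repeat(df.index, len(df.columns))
df = pd.DataFrame({'value': df.to_numpy().ravel()}, index=[f'{a}{b}' for a, b in zip(i, c)])
print(df)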
    value
1A      1
1B      2
1C      3
2A      4
2B      2
2C      5
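To confirm that both approaches produce the same frame before running either on the full dataset, a small self-contained check along these lines may help. It rebuilds the sample data from the question and uses the column name 'Value' for both results so they can be compared directly; the names `stacked`, `flat`, and `same` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({'A': [1, 4], 'B': [2, 2], 'C': [3, 5]}, index=[1, 2])

# Approach 1: stack, then flatten the two-level index into string labels.
stacked = df.stack().to_frame('Value')
stacked.index = [f'{a}{b}' for a, b in stacked.index]

# Approach 2: build the labels and the values directly with NumPy.
c = np.tile(df.columns, len(df))
i = np.repeat(df.index, len(df.columns))
flat = pd.DataFrame({'Value': df.to_numpy().ravel()},
                    index=[f'{a}{b}' for a, b in zip(i, c)])

# DataFrame.equals compares labels, values, and dtypes.
same = stacked.equals(flat)
print(same)
```

Both versions walk the frame row by row, so the labels come out in the same order (1A, 1B, 1C, 2A, ...).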