I have a pandas DataFrame like this:
   x  y  z
0  a  d  g
1  b  e  h
2  c  f  i
Now I want to convert it into a DataFrame with a single row, where every cell becomes its own column named after the original column and row index:
   z_2  z_1  z_0  y_2  y_1  y_0  x_2  x_1  x_0
0    i    h    g    f    e    d    c    b    a
I know I can do it like this, but I need to optimize the runtime, if possible without loops:
import pandas as pd

df = pd.DataFrame({"x": ["a", "b", "c"],
                   "y": ["d", "e", "f"],
                   "z": ["g", "h", "i"]})
wantedRes = pd.DataFrame()
# Insert every cell as a new first column, so the final column order is reversed.
for key, value in df.items():
    for key2, value2 in value.items():
        wantedRes.insert(loc=0, column=str(key) + "_" + str(key2), value=[value2])
CodePudding user response:
You can use .stack() for this:
s = df.stack()
df_new = pd.DataFrame([s.values], columns=[f'{j}_{i}' for i, j in s.index])
Output:
   x_0  y_0  z_0  x_1  y_1  z_1  x_2  y_2  z_2
0    a    d    g    b    e    h    c    f    i
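Note that this column order differs from the one asked for in the question. If the reversed, column-by-column order matters, a minimal sketch (assuming the same example df as in the question) is to transpose before stacking and then reverse the result:

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({"x": ["a", "b", "c"],
                   "y": ["d", "e", "f"],
                   "z": ["g", "h", "i"]})

# Transposing first makes stack() walk the data column by column:
# (x, 0), (x, 1), (x, 2), (y, 0), ...
# Reversing then gives (z, 2), (z, 1), ..., (x, 0).
s = df.T.stack()[::-1]

# Each index entry is a (column, row) pair after the transpose.
df_new = pd.DataFrame([s.values], columns=[f"{col}_{row}" for col, row in s.index])
print(df_new)
```

The only changes from the snippet above are the .T, the [::-1], and the swapped order of the pair in the f-string.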
CodePudding user response:
You can unstack, rework the index, and convert with to_frame:
s = df.unstack()
wantedRes = s.set_axis(s.index.map(lambda x: f'{x[0]}_{x[1]}'))[::-1].to_frame().T
Output:
   z_2  z_1  z_0  y_2  y_1  y_0  x_2  x_1  x_0
0    i    h    g    f    e    d    c    b    a
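Since the question is about runtime, another option worth timing is to flatten the values with NumPy and build the labels separately. This is only a sketch under the assumption that the same example df is used; it is not benchmarked here, and whether it is actually faster depends on the size and dtypes of the real data:

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({"x": ["a", "b", "c"],
                   "y": ["d", "e", "f"],
                   "z": ["g", "h", "i"]})

# Flatten column by column (Fortran order), then reverse: i, h, g, ..., a.
values = df.to_numpy().ravel(order="F")[::-1]

# Build the labels in the same column-by-column order, then reverse them too.
labels = [f"{col}_{row}" for col in df.columns for row in df.index][::-1]

wantedRes = pd.DataFrame([values], columns=labels)
print(wantedRes)
```

The label list is still built with a Python comprehension, but it only touches the column and index labels, not the cell values.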