I'd like to achieve the result below in a more efficient way. I think df.pivot might do it, but I can't get it to work. Any suggestions?
import pandas as pd
df = pd.DataFrame({'level1':['a', 'a'], 'level2':['b', 'b'], 'level3':[100, 101], 'id1':[111,222],'id2':[333,444], 'foo_value':[0.1,0.2], 'bar_value':[0.3,0.4]})
# Now I want to reshape it into the long format built by the loop below
rows = []
items = [col.replace("_value", "") for col in df.columns if col.endswith("_value")]
for _, row in df.iterrows():
    for id_col in ("id1", "id2"):
        for item in items:
            rows.append({
                "id": row[id_col],
                "item": item,
                "value": row[f"{item}_value"],
                "level1": row["level1"],
                "level2": row["level2"],
                "level3": row["level3"]
            })
reshaped_df = pd.DataFrame(rows)
CodePudding user response:
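Use DataFrame.melt twice (once for the id columns, once for the value columns), then merge the two results on the level columns: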
l = ['level1', 'level2', 'level3']
s1 = df.melt(l, value_vars=df.filter(like='id'), value_name='id')
s2 = df.melt(l, value_vars=df.filter(like='_value'), var_name='item')
out = s1.merge(s2).drop('variable', axis=1)
Result
print(out)
  level1  level2  level3   id       item  value
0      a       b     100  111  foo_value    0.1
1      a       b     100  111  bar_value    0.3
2      a       b     100  333  foo_value    0.1
3      a       b     100  333  bar_value    0.3
4      a       b     101  222  foo_value    0.2
5      a       b     101  222  bar_value    0.4
6      a       b     101  444  foo_value    0.2
7      a       b     101  444  bar_value    0.4
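Two things differ from what the question's loop produces: the item column keeps the "_value" suffix, and the merge keys on the level columns, so it only pairs ids with the right values when (level1, level2, level3) is unique per row. A possible variation, offered as a sketch rather than a definitive fix, carries the original row number through both melts and strips the suffix afterwards. The names base, ids, vals and the helper column row are made up for illustration, and it assumes the id columns are exactly id1 and id2 and that every value column ends in "_value".

```python
import pandas as pd

df = pd.DataFrame({
    'level1': ['a', 'a'], 'level2': ['b', 'b'], 'level3': [100, 101],
    'id1': [111, 222], 'id2': [333, 444],
    'foo_value': [0.1, 0.2], 'bar_value': [0.3, 0.4],
})

levels = ['level1', 'level2', 'level3']
id_cols = ['id1', 'id2']  # assumption: these are the only id columns
value_cols = [c for c in df.columns if c.endswith('_value')]

# Expose the original row number as a column ("row" is an illustrative name)
# so the merge pairs ids and values from the same source row even when the
# level columns repeat across rows.
base = df.rename_axis('row').reset_index()

# One long frame per group of columns, both keyed by the source row.
ids = (base.melt(id_vars=['row'] + levels, value_vars=id_cols, value_name='id')
           .drop(columns='variable'))
vals = base.melt(id_vars=['row'], value_vars=value_cols,
                 var_name='item', value_name='value')

# Turn "foo_value" into "foo" to match the item names built in the question.
vals['item'] = vals['item'].str.replace('_value$', '', regex=True)

# Every id of a row is combined with every item of the same row.
out = ids.merge(vals, on='row').drop(columns='row')
out = out[['id', 'item', 'value'] + levels]
```

The row order of out may differ from the loop's output; sort on id and item if the order matters.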
CodePudding user response:
You chose the hard way.
Try this:
df = df.T
df
It prints: