I have a pandas DataFrame with three columns:
X Y Z
0 1 4 True
1 2 5 True
2 3 6 False
How do I make it so that I have two columns, X and Z, with values:
X Z
0 1 True
1 2 True
2 3 False
3 4 True
4 5 True
5 6 False
CodePudding user response:
You can use melt:
In [41]: df.melt(id_vars="Z", value_vars=["X", "Y"], value_name="XY")[["XY", "Z"]]
Out[41]:
XY Z
0 1 True
1 2 True
2 3 False
3 4 True
4 5 True
5 6 False
- the identifier variable is "Z": its values are repeated as necessary against the value variables...
- ...which are "X" and "Y"
- the column that holds the X and Y values together is named "XY"; that column and "Z" are selected at the end
(You can chain .rename(columns={"XY": "X"}) if you want that column to be named X again.)
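Putting the pieces above together, here is a minimal self-contained sketch of the melt approach with the rename chained on. The DataFrame is rebuilt from the values in the question, and the name `result` is only chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6], "Z": [True, True, False]})

# Stack X and Y into a single column while repeating Z for each value.
result = (
    df.melt(id_vars="Z", value_vars=["X", "Y"], value_name="XY")
      [["XY", "Z"]]                   # drop the helper "variable" column
      .rename(columns={"XY": "X"})    # restore the original column name
)

print(result)
```

As far as I know, passing value_name="X" directly is rejected by melt because "X" is already a column of df, which is presumably why the temporary name "XY" is used first.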
CodePudding user response:
Another possible solution, based on pandas.concat:
pd.concat([df[['X','Z']], df[['Y','Z']].rename({'Y': 'X'}, axis=1)])
Output:
X Z
0 1 True
1 2 True
2 3 False
0 4 True
1 5 True
2 6 False
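Note that the index in this output repeats (0, 1, 2, 0, 1, 2), while the question asks for 0 through 5. One way to get that, sketched here on the sample data from the question, is to pass ignore_index=True to pd.concat; the name `result` is just for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6], "Z": [True, True, False]})

# Put the (Y, Z) pairs under the (X, Z) pairs, renaming Y to X so the
# columns line up, and let concat renumber the rows from 0.
result = pd.concat(
    [df[["X", "Z"]], df[["Y", "Z"]].rename(columns={"Y": "X"})],
    ignore_index=True,
)

print(result)
```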
CodePudding user response:
You could use stack after setting your index to 'Z', with some basic manipulation like renaming and dropping:
# Setup
import pandas as pd
df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6], "Z": [True, True, False]})
# Reshape
df.set_index('Z').stack().reset_index().rename({0: 'X'},axis=1).sort_values('X')[['X','Z']]
prints:
# Output
X Z
0 1 True
2 2 True
4 3 False
1 4 True
3 5 True
5 6 False
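The row labels above (0, 2, 4, 1, 3, 5) are left over from the stacked frame. If you want them to run from 0 to 5 as in the question, a possible variation is to add reset_index(drop=True) at the end. This is only a sketch on the same sample data, and `result` is an illustrative name.

```python
import pandas as pd

# Same setup as above.
df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6], "Z": [True, True, False]})

result = (
    df.set_index("Z")
      .stack()                     # X and Y values become one long Series
      .reset_index()               # columns are now "Z", "level_1" and 0
      .rename({0: "X"}, axis=1)    # name the value column X
      .sort_values("X")
      [["X", "Z"]]                 # drop the "level_1" helper column
      .reset_index(drop=True)      # renumber the rows from 0
)

print(result)
```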