Hi, I have a DataFrame in which I want to combine several columns into one, while the remaining columns are repeated for each of the combined columns. An example DataFrame:
import numpy as np
import pandas as pd

# 6 rows x 10 columns of random integers in [0, 10)
df = pd.DataFrame(np.random.randint(10, size=60).reshape(6, 10))
df.columns = ['x1', 'x2', 'x3', 'x4', 'x5', 'y1', 'y2', 'y3', 'y4', 'y5']
   x1  x2  x3  x4  x5  y1  y2  y3  y4  y5
0   2   6   9   4   3   8   6   1   0   7
1   1   4   8   7   3   0   5   7   3   1
2   6   7   4   8   1   5   7   7   8   5
3   6   3   4   8   0   8   7   2   3   8
4   8   5   6   1   6   3   2   1   1   4
5   1   3   7   5   1   6   5   3   8   5
I would like a nice way to produce the following DataFrame:
    x1  x2  x3  x4  x5  y
0    2   6   9   4   3  8
1    1   4   8   7   3  0
2    6   7   4   8   1  5
3    6   3   4   8   0  8
4    8   5   6   1   6  3
5    1   3   7   5   1  6
6    2   6   9   4   3  6
7    1   4   8   7   3  5
8    6   7   4   8   1  7
9    6   3   4   8   0  7
10   8   5   6   1   6  2
11   1   3   7   5   1  5
12   2   6   9   4   3  1
13   1   4   8   7   3  7
14   6   7   4   8   1  7
15   6   3   4   8   0  2
16   8   5   6   1   6  1
17   1   3   7   5   1  3
18   2   6   9   4   3  0
19   1   4   8   7   3  3
20   6   7   4   8   1  8
21   6   3   4   8   0  3
22   8   5   6   1   6  1
23   1   3   7   5   1  8
24   2   6   9   4   3  7
25   1   4   8   7   3  1
26   6   7   4   8   1  5
27   6   3   4   8   0  8
28   8   5   6   1   6  4
29   1   3   7   5   1  5
Is there a nice way to produce this DataFrame with Pandas functions or is it more complicated?
Thanks
CodePudding user response:
You can do this with df.melt().
df.melt(
    id_vars=['x1', 'x2', 'x3', 'x4', 'x5'],
    value_vars=['y1', 'y2', 'y3', 'y4', 'y5'],
    value_name='y',
).drop(columns='variable')
df.melt() adds a column called variable that records which original column each row came from (y1, y2, etc.), so you drop it, as in the call above.
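For anyone who wants to check this end to end, here is a minimal sketch that hard-codes the sample values from the question (instead of drawing random ones) so the result can be compared against the desired output. The names x_cols, y_cols, melted and long_df are just illustrative, not anything from the original post.

```python
import pandas as pd

# Sample values copied from the question, so the run is reproducible.
data = [
    [2, 6, 9, 4, 3, 8, 6, 1, 0, 7],
    [1, 4, 8, 7, 3, 0, 5, 7, 3, 1],
    [6, 7, 4, 8, 1, 5, 7, 7, 8, 5],
    [6, 3, 4, 8, 0, 8, 7, 2, 3, 8],
    [8, 5, 6, 1, 6, 3, 2, 1, 1, 4],
    [1, 3, 7, 5, 1, 6, 5, 3, 8, 5],
]
x_cols = [f"x{i}" for i in range(1, 6)]  # columns to repeat
y_cols = [f"y{i}" for i in range(1, 6)]  # columns to stack into one
df = pd.DataFrame(data, columns=x_cols + y_cols)

# Stack y1..y5 into a single 'y' column; the x columns are repeated per block.
melted = df.melt(id_vars=x_cols, value_vars=y_cols, value_name="y")

# 'variable' holds the source column name (y1, y2, ...) for each row.
print(melted["variable"].unique())

# Drop it to get the layout asked for in the question.
long_df = melted.drop(columns="variable")
print(long_df.shape)  # (30, 6)
print(long_df.head(12))
```

If you would rather keep track of where each value came from, you can skip the drop and pass var_name to melt() to give that column a more descriptive name than variable.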