I apologise for the title, I know it isn't the most helpful. What I'm attempting to do is restructure my data so that each of a given column is given it's own row with certain values carried over from the previous dataframe.
My Data in its current form is something like this:
ColA | ColB | ColC | val1 | val2 | val3
1 | 2 | 3 | A | B | C
4 | 5 | 6 | D | E | F
And I want to restructure it so I get a result like this:
ColA | ColB | ColC | val
1 | 2 | 3 | A
1 | 2 | 3 | B
1 | 2 | 3 | C
4 | 5 | 6 | D
4 | 5 | 6 | E
4 | 5 | 6 | F
How would I do this?
I know I could go through each row, grab the relevant data and concat a dataframe but I was hoping for a much better alternative
CodePudding user response:
Given:
ColA ColB ColC val1 val2 val3
0 1 2 3 A B C
1 4 5 6 D E F
Doing:
df.melt(['ColA', 'ColB', 'ColC'])
Output:
ColA ColB ColC variable value
0 1 2 3 val1 A
1 4 5 6 val1 D
2 1 2 3 val2 B
3 4 5 6 val2 E
4 1 2 3 val3 C
5 4 5 6 val3 F
CodePudding user response:
I think the following code should solve your problem. I created an example with the sample data you provided. The code involves stacking and merging the dataframes and columns.
df_start = pd.DataFrame()
df_start = df_start.append([[1, 2, 3, "A", "B", "C"], [4, 5, 6, "D", "E", "F"]])
df_start = df_start.rename(columns={0: "ColA", 1: "ColB", 2: "ColC", 3: "val1", 4: "val2", 5: "val3"})
df_start["vals"] = df_start.values.tolist()
df_start["vals"] = df_start["vals"].apply(lambda x: x[3:])
df_ = df_start["vals"].apply(pd.Series).stack().reset_index(level=1, drop=True).to_frame('val')
result_df = pd.merge(df_, df_start, left_index=True, right_index=True)
result_df = result_df.iloc[:, [1, 2, 3, 0]]
result_df