I have two datasets. After merging them horzontally, and sorting the columns with the following code, I get the dataset below:
df=
X | Y |
---|---|
5.2 | 6.5 |
3.3 | 7.6 |
df_year=
X | Y |
---|---|
2014 | 2014 |
2015 | 2015 |
df_all_cols = pd.concat([df, df_year], axis = 1)
sorted_columns = sorted(df_all_cols.columns)
df_all_cols_sort = df_all_cols[sorted_columns]
X | X | Y | Y |
---|---|---|---|
5.2 | 2014 | 6.5 | 2014 |
3.3 | 2015 | 7.6 | 2015 |
I am trying to make my data look like this, by stacking the dataset every 2 columns.
name | year | Variable |
---|---|---|
5.2 | 2014 | X |
3.3 | 2015 | X |
6.5 | 2014 | Y |
7.6 | 2015 | Y |
CodePudding user response:
One approach could be as follows:
0 1
0 X 5.2 2014
Y 6.5 2014
1 X 3.3 2015
Y 7.6 2015
- Next, use
df.sort_index
to sort on the original column names (i.e."X, Y"
, now appearing asindex level 1
), and get rid ofindex level 0
(df.droplevel
). - Finally, use
df.reset_index
withdrop=False
to insert index as a column and rename all the columns withdf.rename
.
res = (pd.concat([df.stack(),df_year.stack()], axis=1)
.sort_index(level=1)
.droplevel(0)
.reset_index(drop=False)
.rename(columns={'index':'Variable',0:'name',1:'year'})
)
# change the order of cols
res = res.iloc[:, [1,2,0]]
print(res)
name year Variable
0 5.2 2014 X
1 3.3 2015 X
2 6.5 2014 Y
3 7.6 2015 Y