Home > database >  Reshaping a dataframe every nth column
Reshaping a dataframe every nth column

Time:11-16

I have two datasets. After merging them horzontally, and sorting the columns with the following code, I get the dataset below:

df=

X Y
5.2 6.5
3.3 7.6

df_year=

X Y
2014 2014
2015 2015
df_all_cols = pd.concat([df, df_year], axis = 1)
sorted_columns = sorted(df_all_cols.columns)
df_all_cols_sort = df_all_cols[sorted_columns]
X X Y Y
5.2 2014 6.5 2014
3.3 2015 7.6 2015

I am trying to make my data look like this, by stacking the dataset every 2 columns.

name year Variable
5.2 2014 X
3.3 2015 X
6.5 2014 Y
7.6 2015 Y

CodePudding user response:

One approach could be as follows:

  • Apply df.stack to both dfs before feeding them to pd.concat. The result at this stage being:
       0     1
0 X  5.2  2014
  Y  6.5  2014
1 X  3.3  2015
  Y  7.6  2015
  • Next, use df.sort_index to sort on the original column names (i.e. "X, Y", now appearing as index level 1), and get rid of index level 0 (df.droplevel).
  • Finally, use df.reset_index with drop=False to insert index as a column and rename all the columns with df.rename.
res = (pd.concat([df.stack(),df_year.stack()], axis=1)
       .sort_index(level=1)
       .droplevel(0)
       .reset_index(drop=False)
       .rename(columns={'index':'Variable',0:'name',1:'year'})
       )

# change the order of cols
res = res.iloc[:, [1,2,0]]

print(res)

   name  year Variable
0   5.2  2014        X
1   3.3  2015        X
2   6.5  2014        Y
3   7.6  2015        Y
  • Related