Home > front end >  I am trying to Stack, Melt, grouby or reshape a Pandas DataFrame
I am trying to Stack, Melt, grouby or reshape a Pandas DataFrame

Time:05-18

I am trying to reshape the dataframe below (imported from a .csv), keeping the Easting, Northing and Node Name value together on the same row, but having everything 'stacked' in 4 columns. So I want data in columns V0e, V0n and Vd on top of data in columns S0_Pe, S0_Pn and S0_Pd. In reality there are 8 sets of these easting/northing/node trios. Would I need to rename V0e, V0n, S0_Pe and S0_Pn to 'Easting' and 'Northing' and Vd & S0_Pd to 'Node'? I have experimented with grouby, stack & melt but either everything ends up to two columns ('shot' and everything else) or fails to group as I want.
I have also looked at MultiIndex, with the Node in a level above the easting/northing pair, but I failed to apply it to the existing df as loaded from my .csv file.

Index   shot    V0e         V0n         Vd   S0_Pe      S0_Pn       S0_Pd
0       1001    530811.1    6764623.3   Vd   nan        nan         S0_Pd
1       1002    530808.8    6764617.4   Vd   530771.3   6764510.4   S0_Pd
2       1003    530806.6    6764611.4   Vd   nan        nan         S0_Pd
3       1004    530804.2    6764605.8   Vd   530765.6   6764499.1   S0_Pd

I don't mind if it looks like this:

Index   shot    V0e         V0n         Vd   
0       1001    530811.1    6764623.3   Vd   
1       1002    530808.8    6764617.4   Vd   
2       1003    530806.6    6764611.4   Vd   
3       1004    530804.2    6764605.8   Vd   
4       1001    nan         nan         S0_Pd
5       1002    530771.3    6764510.4   S0_Pd
6       1003    nan         nan         S0_Pd
7       1004    530765.6    6764499.1   S0_Pd

or this, I just need the coordinate pairs and node to move together:

Index   shot    V0e         V0n         Vd   
0       1001    530811.1    6764623.3   Vd   
1       1001    nan         nan         S0_Pd    
2       1002    530808.8    6764617.4   Vd   
3       1002    530771.3    6764510.4   S0_Pd    
4       1003    530806.6    6764611.4   Vd
5       1003    nan         nan         S0_Pd
6       1004    530804.2    6764605.8   Vd
7       1004    530765.6    6764499.1   S0_Pd

CodePudding user response:

You can do this pd.wide_to_long and a little column renaming to standardize:

df_rc = df.rename(columns={'V0e':'V0e:Vd', 
                   'V0n':'V0n:Vd',
                   'S0_Pe':'V0e:S0_Pd',
                   'S0_Pn':'V0n:S0_Pd'})

df_rc = df_rc.drop(['Vd', 'S0_Pd'], axis=1)

df_out = pd.wide_to_long(df_rc, 
                          ['V0e', 'V0n'], 
                          ['Index', 'shot'], 
                          'Vd', 
                          ':', 
                          '.*')\
            .reset_index()

Output:

   Index  shot     Vd       V0e        V0n
0      0  1001     Vd  530811.1  6764623.3
1      0  1001  S0_Pd       NaN        NaN
2      1  1002     Vd  530808.8  6764617.4
3      1  1002  S0_Pd  530771.3  6764510.4
4      2  1003     Vd  530806.6  6764611.4
5      2  1003  S0_Pd       NaN        NaN
6      3  1004     Vd  530804.2  6764605.8
7      3  1004  S0_Pd  530765.6  6764499.1

CodePudding user response:

You can use the often forgotten pd.lreshape to do this:

This function is a generic version of pd.wide_to_long where you pass a dictionary of {new_column name: [*columns to vertically stack]}. Then any unspecified columns in this dictionary are melted to fit the output.

import pandas as pd

out = pd.lreshape(
    df, 
    {'V0e': ['V0e', 'S0_Pe'], 
     'V0n': ['V0n', 'S0_Pn'], 
     'Vd': ['Vd', 'S0_Pd']}, 
    dropna=False
)

print(out)
   Index  shot       V0e        V0n     Vd
0      0  1001  530811.1  6764623.3     Vd
1      1  1002  530808.8  6764617.4     Vd
2      2  1003  530806.6  6764611.4     Vd
3      3  1004  530804.2  6764605.8     Vd
4      0  1001       NaN        NaN  S0_Pd
5      1  1002  530771.3  6764510.4  S0_Pd
6      2  1003       NaN        NaN  S0_Pd
7      3  1004  530765.6  6764499.1  S0_Pd
  • Related