Find the name of the column that is non-NaN

Time:03-10

I have a DataFrame defined like this:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({"col1":[1,np.nan,np.nan,np.nan,2,np.nan,np.nan,np.nan,np.nan],
                    "col2":[np.nan,3,np.nan,4,np.nan,np.nan,np.nan,5,6],
                    "col3":[np.nan,np.nan,7,np.nan,np.nan,8,9,np.nan,np.nan]})

I want to transform it into a DataFrame like:

df2 = pd.DataFrame({"col_name":['col1','col2','col3','col2','col1',
                                'col3','col3','col2','col2'], 
                    "value":[1,3,7,4,2,8,9,5,6]})

If possible, can we reverse this process too? By that I mean convert df2 into df1.

I don't want to go through the DataFrame iteratively as it becomes too computationally expensive.

CodePudding user response:

You can stack it:

# dropna() is a no-op where stack() already drops NaN, and needed where it keeps them
out = (df1.stack().dropna().astype(int).droplevel(0)
       .rename_axis('col_name').reset_index(name='value'))

Output:

  col_name  value
0     col1      1
1     col2      3
2     col3      7
3     col2      4
4     col1      2
5     col3      8
6     col3      9
7     col2      5
8     col2      6
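
If you would rather not rely on how stack treats NaN, here is a minimal sketch of the same reshaping with melt. This is an alternative I'm suggesting, not part of the approach above; the names long and out_alt are made up for illustration. It keeps the original row labels so the rows can be put back in their original order after the NaN cells are dropped.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "col1": [1, np.nan, np.nan, np.nan, 2, np.nan, np.nan, np.nan, np.nan],
    "col2": [np.nan, 3, np.nan, 4, np.nan, np.nan, np.nan, 5, 6],
    "col3": [np.nan, np.nan, 7, np.nan, np.nan, 8, 9, np.nan, np.nan],
})

# ignore_index=False keeps each cell's original row label in the index
long = df1.melt(var_name="col_name", value_name="value", ignore_index=False)

out_alt = (
    long.dropna(subset=["value"])   # keep only the cells that hold a value
        .sort_index(kind="stable")  # restore the original row order
        .astype({"value": int})     # values were float because of the NaN cells
        .reset_index(drop=True)
)
print(out_alt)
```

The stable sort only matters if a row can hold more than one non-NaN value; in that case the values of a row come out in column order.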

To go from out back to df1, you could pivot:

out1 = out.reset_index().pivot(index='index', columns='col_name', values='value')
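
For reference, a self-contained sketch of that reverse step, assuming out has the default 0..n-1 index so that reset_index() creates a column named index. The variable name wide is only for illustration. Note that pivot leaves axis names behind (index and col_name), and the values come back as floats because the empty cells are filled with NaN.

```python
import pandas as pd

out = pd.DataFrame({
    "col_name": ["col1", "col2", "col3", "col2", "col1",
                 "col3", "col3", "col2", "col2"],
    "value": [1, 3, 7, 4, 2, 8, 9, 5, 6],
})

# reset_index() exposes the row position as a regular column called "index";
# pivot then spreads "value" over one column per distinct col_name
wide = out.reset_index().pivot(index="index", columns="col_name", values="value")

# remove the axis names that pivot leaves on the rows and columns
wide = wide.rename_axis(index=None, columns=None)
print(wide)
```

Keyword arguments are used because recent pandas versions no longer accept index, columns and values positionally in pivot.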