Home > Software design >  Reorder dataframe by column
Reorder dataframe by column

Time:03-04

Given the following df:

order by this column1 column2 column n
abc abc def ghi
def def ghi jkl
ghi ghi jkl nan
jkl mno nan nan
mno nan nan nan

--> One column with a complete set of unique values

--> Several columns with a subset of values. These columns have a different length and are thus 'stacked' to the top.

What I want is to use the values in the first column to reorder the values of the other columns, ie match them against first column.

Expected outcome:

order by this column1 column2 column n
abc abc nan nan
def def def nan
ghi ghi ghi ghi
jkl nan jkl jkl
mno mno nan nan

I guess I can iterrows through 'order by this' and then use loc and something to fill with nan's? Is there a better way?

CodePudding user response:

As what you want is equivalent to checking whether each value from the "order" column is present in the other columns, you could simply do:

df.apply(lambda s: df['order'].where(df['order'].isin(s)))

output:

  order column1 column2 column_n
0   abc     abc     NaN      NaN
1   def     def     def      NaN
2   ghi     ghi     ghi      ghi
3   jkl     NaN     jkl      jkl
4   mno     mno     NaN      NaN

CodePudding user response:

Maybe I'm not getting it, but shouldn't a simple .sort_values () do that? The fact that some NaN are present seems not relevant...

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html

  • Related