Python - Sorting the values of every row in a table and get a new Pandas dataframe with original col-CodePudding

I just have a table in csv format like this

Date	Company1	Company2	Company3
01.01.2020	1,01	0,9	1
02.01.2020	0,9	2,2	2
...	...	...	...
24.10.2020	1,02	1,01	1,03

Now my aim is to sort every row in a descending order. So I finally get the following table:

Date	1	2	3
01.01.2020	Company1	Company3	Company2
02.01.2020	Company2	Company3	Company1
...	...	...	...
24.10.2020	Company3	Company1	Company2

is their a simple way to do something with python?

CodePudding user response：

You could use reshape using melt, sort_values and create a helper column, then shape back to original with the helper column as column names:

(df.melt(id_vars='Date')
   .sort_values(by='value', ascending=False)
   .assign(col=lambda d: d.groupby('Date').cumcount().add(1))
   .pivot(index='Date', columns='col', values='variable')
)

output:

col                1         2         3
Date                                    
01.01.2020  Company1  Company3  Company2
02.01.2020  Company2  Company3  Company1
24.10.2020  Company3  Company1  Company2

CodePudding user response：

One approach using np.argsort

# get column names
columns = df.columns[1:].to_numpy()

# get sorted indices
indices = np.argsort(df.iloc[:, 1:] * -1, axis=1)

# create new DataFrame
res = pd.concat([df["Date"], pd.DataFrame(columns[indices], columns=range(1, 4))], axis=1)
print(res)

Output

         Date         1         2         3
0  01.01.2020  Company1  Company3  Company2
1  02.01.2020  Company2  Company3  Company1
2  24.10.2020  Company3  Company1  Company2

CodePudding user response：

You can use .apply() on each row to sort values in descending order and get the index (i.e. column labels) of sorted sequence:

df2 = (df.set_index('Date')[['Company1', 'Company2', 'Company3']]
         .replace(r',', r'.', regex=True)
         .astype(float)
         .apply(lambda x: x.sort_values(ascending=False).index.tolist(), axis=1, result_type='expand')
         .pipe(lambda x: x.set_axis(x.columns 1, axis=1))
         .reset_index()
      )

Result:

print(df2)


         Date         1         2         3
0  01.01.2020  Company1  Company3  Company2
1  02.01.2020  Company2  Company3  Company1
2  24.10.2020  Company3  Company1  Company2