Home > Enterprise >  Python - Sorting the values of every row in a table and get a new Pandas dataframe with original col
Python - Sorting the values of every row in a table and get a new Pandas dataframe with original col

Time:10-25

I just have a table in csv format like this

Date Company1 Company2 Company3
01.01.2020 1,01 0,9 1
02.01.2020 0,9 2,2 2
... ... ... ...
24.10.2020 1,02 1,01 1,03

Now my aim is to sort every row in a descending order. So I finally get the following table:

Date 1 2 3
01.01.2020 Company1 Company3 Company2
02.01.2020 Company2 Company3 Company1
... ... ... ...
24.10.2020 Company3 Company1 Company2

is their a simple way to do something with python?

CodePudding user response:

You could use reshape using melt, sort_values and create a helper column, then shape back to original with the helper column as column names:

(df.melt(id_vars='Date')
   .sort_values(by='value', ascending=False)
   .assign(col=lambda d: d.groupby('Date').cumcount().add(1))
   .pivot(index='Date', columns='col', values='variable')
)

output:

col                1         2         3
Date                                    
01.01.2020  Company1  Company3  Company2
02.01.2020  Company2  Company3  Company1
24.10.2020  Company3  Company1  Company2

CodePudding user response:

One approach using np.argsort

# get column names
columns = df.columns[1:].to_numpy()

# get sorted indices
indices = np.argsort(df.iloc[:, 1:] * -1, axis=1)

# create new DataFrame
res = pd.concat([df["Date"], pd.DataFrame(columns[indices], columns=range(1, 4))], axis=1)
print(res)

Output

         Date         1         2         3
0  01.01.2020  Company1  Company3  Company2
1  02.01.2020  Company2  Company3  Company1
2  24.10.2020  Company3  Company1  Company2

CodePudding user response:

You can use .apply() on each row to sort values in descending order and get the index (i.e. column labels) of sorted sequence:

df2 = (df.set_index('Date')[['Company1', 'Company2', 'Company3']]
         .replace(r',', r'.', regex=True)
         .astype(float)
         .apply(lambda x: x.sort_values(ascending=False).index.tolist(), axis=1, result_type='expand')
         .pipe(lambda x: x.set_axis(x.columns 1, axis=1))
         .reset_index()
      )

Result:

print(df2)


         Date         1         2         3
0  01.01.2020  Company1  Company3  Company2
1  02.01.2020  Company2  Company3  Company1
2  24.10.2020  Company3  Company1  Company2
  • Related