I have a data frame like this:
a | b | c | d | e |
---|---|---|---|---|
a_1 | b_1 | c_1 | d_1 | e_1 |
0 | b_2 | c_2 | d_2 | e_2 |
0 | b_3 | c_3 | 0 | e_3 |
0 | 0 | c_4 | 0 | e_4 |
0 | 0 | 0 | 0 | e_5 |
I want to reorder the columns by how many non-zero values each one holds, most first, so that the data frame looks like this:
e | c | b | d | a |
---|---|---|---|---|
e_1 | c_1 | b_1 | d_1 | a_1 |
e_2 | c_2 | b_2 | d_2 | 0 |
e_3 | c_3 | b_3 | 0 | 0 |
e_4 | c_4 | 0 | 0 | 0 |
e_5 | 0 | 0 | 0 | 0 |
Here, each "letter_number" entry stands for any value not equal to 0.
CodePudding user response:
pandas >= 1.1
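We can call sort_index on the columns (axis=1) with a custom key function. The key receives the whole column Index and must return one sort value per column, here the number of non-zero entries:
df.sort_index(key=lambda cols: df[cols].ne('0').sum(), ascending=False, axis=1)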
     e    c    b    d    a
0  e_1  c_1  b_1  d_1  a_1
1  e_2  c_2  b_2  d_2    0
2  e_3  c_3  b_3    0    0
3  e_4  c_4    0    0    0
4  e_5    0    0    0    0
This assumes the zeroes are stored as the string '0' rather than as numeric 0. If they are numeric, compare with ne(0) instead.
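If you are not sure whether the zeroes are strings or numbers, here is a minimal sketch that treats both as zero. The frame is rebuilt from the table in the question, and the helper names nonzero_counts and sort_columns_by_nonzero are made up for this example, not part of pandas. It needs pandas >= 1.1 because of the key argument.

```python
import pandas as pd

# Example frame from the question; zeroes are stored as the string "0".
df = pd.DataFrame({
    "a": ["a_1", "0", "0", "0", "0"],
    "b": ["b_1", "b_2", "b_3", "0", "0"],
    "c": ["c_1", "c_2", "c_3", "c_4", "0"],
    "d": ["d_1", "d_2", "0", "0", "0"],
    "e": ["e_1", "e_2", "e_3", "e_4", "e_5"],
})


def nonzero_counts(frame):
    """Count, per column, the entries that are neither "0" nor numeric 0."""
    is_zero = frame.isin(["0", 0])
    return (~is_zero).sum()


def sort_columns_by_nonzero(frame):
    """Return the frame with columns ordered by non-zero count, largest first."""
    counts = nonzero_counts(frame)
    # The key gets the whole column Index and returns one count per label.
    return frame.sort_index(
        axis=1,
        key=lambda cols: counts.loc[cols].to_numpy(),
        ascending=False,
    )


result = sort_columns_by_nonzero(df)
print(list(result.columns))
```

Precomputing the counts once keeps the key function cheap, and the isin check is the only place that would need to change if your data marks "empty" some other way.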
Older versions
We can sort the column headers based on the predicate you described using Python's built-in sorted function:
df[sorted(df, key=lambda c: df[c].ne('0').sum(), reverse=True)]
     e    c    b    d    a
0  e_1  c_1  b_1  d_1  a_1
1  e_2  c_2  b_2  d_2    0
2  e_3  c_3  b_3    0    0
3  e_4  c_4    0    0    0
4  e_5    0    0    0    0
CodePudding user response:
You can try np.argsort with iloc (with numpy imported as np), sorting the columns by their number of zeroes in ascending order:
df.iloc[:, np.argsort(df.eq('0').sum())]
Or use sort_values:
df[df.eq('0').sum().sort_values().index]
Both give:
     e    c    b    d    a
0  e_1  c_1  b_1  d_1  a_1
1  e_2  c_2  b_2  d_2    0
2  e_3  c_3  b_3    0    0
3  e_4  c_4    0    0    0
4  e_5    0    0    0    0
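For anyone wondering why one variant uses iloc and the other plain [], here is a small self-contained sketch of both. The frame is rebuilt from the question, and the zeroes are assumed to be the string "0". I also pass an explicit stable sort kind, which is my own addition, so that columns with the same number of zeroes should keep their original relative order.

```python
import numpy as np
import pandas as pd

# Example frame from the question; zeroes are stored as the string "0".
df = pd.DataFrame({
    "a": ["a_1", "0", "0", "0", "0"],
    "b": ["b_1", "b_2", "b_3", "0", "0"],
    "c": ["c_1", "c_2", "c_3", "c_4", "0"],
    "d": ["d_1", "d_2", "0", "0", "0"],
    "e": ["e_1", "e_2", "e_3", "e_4", "e_5"],
})

# Number of "0" entries per column, indexed by column label.
zero_counts = df.eq("0").sum()

# argsort yields integer positions, so it pairs with positional .iloc.
positions = np.argsort(zero_counts.to_numpy(), kind="stable")
by_position = df.iloc[:, positions]

# sort_values keeps the labels, so its index pairs with label-based [].
labels = zero_counts.sort_values(kind="mergesort").index
by_label = df[labels]

print(list(by_position.columns))
print(list(by_label.columns))
```

Both variants sort by zero count in ascending order, which on a frame where every column has the same length is equivalent to sorting by non-zero count in descending order.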