Home > OS >  Reshaping long format dataframe to wide format according to the value of the elements in columns
Reshaping long format dataframe to wide format according to the value of the elements in columns

Time:06-20

Suppose I have the following pandas dataframe X in long format:

rank group ID
1    1     3
2    1     1
3    1     2
4    1     4
1    2     2
2    2     1
1    3     1
2    3     4
3    3     3
4    3     2
1    4     1
1    5     3
2    5     2
3    5     1
1    6     1

And I would like to reshape it to the following wide format according to the following rules:

  1. split the ID column into 4 columns n1,n2,n3,n4 representing the 4 elements (person) in the ID column.
  2. for column ni, i=1,2,3,4, the entry in the j-th row is 5 minus the ranking of i-the person in the j-th group. For example, in group 3, person 4 gets rank 2, hence the 3rd row of the n4 column is 5-2=3.
    If person i doesn't exist in group j, then the j-th entry in column ni is NA.

So basically I want to create a "score system" for person i according to the ranking: the person who is ranked 1 gets the highest score and the person who is ranked 4 gets the lowest score (or NA if that no there aren't that many people in the group).

i.e.:

group n1 n2 n3 n4
1     3  2  4  1 
2     3  4  NA NA
3     4  1  2  3
4     4  NA NA NA
5     2  3  4  NA
6     4  NA NA NA

I hope I have explained it in an understandable manner. Thank you.

CodePudding user response:

Reshape the dataframe using pivot then subtract 5 from all the values and add prefix of n to column names:

df.pivot('group', 'ID', 'rank').rsub(5).add_prefix('n')

ID      n1   n2   n3   n4
group                    
1      3.0  2.0  4.0  1.0
2      3.0  4.0  NaN  NaN
3      4.0  1.0  2.0  3.0
4      4.0  NaN  NaN  NaN
5      2.0  3.0  4.0  NaN
6      4.0  NaN  NaN  NaN
  • Related