Create an ordering dataframe depending on the ordering of items in a smaller dataframe-CodePudding

I have a dataframe that looks something like this:

   i  j
0  a  b
1  a  c
2  b  c

I would like to convert it to another dataframe that looks like this:

The idea is to look at each row in the first dataframe and assign the value 1 to the item in the first column and the value -1 for the item in the second column and 0 for all other items in the new dataframe. The second dataframe will have as many rows as the first and as many columns as the number of unique entries in the first dataframe. Thank you.

Couldn't really get a start on this.

CodePudding user response：

example

data = {'i': {0: 'a', 1: 'a', 2: 'b'}, 'j': {0: 'b', 1: 'c', 2: 'c'}}
df = pd.DataFrame(data)

df

   i  j
0  a  b
1  a  c
2  b  c

First make dummy

df1 = pd.get_dummies(df)

df1

    i_a i_b j_b j_c
0   1   0   1   0
1   1   0   0   1
2   0   1   0   1

Second make df1 index to multi-index

df1.columns = df1.columns.map(lambda x: tuple(x.split('_')))

df1

    i       j
    a   b   b   c
0   1   0   1   0
1   1   0   0   1
2   0   1   0   1

Third make j to negative value

df1.loc[:, 'j'] = df1.loc[:, 'j'].mul(-1).to_numpy()

df1

    i       j
    a   b   b   c
0   1   0   -1  0
1   1   0   0   -1
2   0   1   0   -1

Final sum i & j

df1.sum(level=1 ,axis=1)

    a   b   c
0   1   -1  0
1   1   0   -1
2   0   1   -1

we can put multiple columns as list instead of i and j

CodePudding user response：

columns = ['a', 'b', 'c']
def get_order(input_row):
    output_row[input_row[i]] = 1
    output_row[input_row[j]] = -1
    return pd.Series(output_row)
ordering_df = original_df.apply(get_order, axis = 1)