I have a dataframe that looks something like this:
i j
0 a b
1 a c
2 b c
I would like to convert it to another dataframe that looks like this:
a b c
0 1 -1 0
1 1 0 -1
2 0 1 -1
The idea is to look at each row in the first dataframe and assign the value 1 to the item in the first column and the value -1 for the item in the second column and 0 for all other items in the new dataframe. The second dataframe will have as many rows as the first and as many columns as the number of unique entries in the first dataframe. Thank you.
Couldn't really get a start on this.
CodePudding user response:
example
data = {'i': {0: 'a', 1: 'a', 2: 'b'}, 'j': {0: 'b', 1: 'c', 2: 'c'}}
df = pd.DataFrame(data)
df
i j
0 a b
1 a c
2 b c
First make dummy
df1 = pd.get_dummies(df)
df1
i_a i_b j_b j_c
0 1 0 1 0
1 1 0 0 1
2 0 1 0 1
Second make df1
index to multi-index
df1.columns = df1.columns.map(lambda x: tuple(x.split('_')))
df1
i j
a b b c
0 1 0 1 0
1 1 0 0 1
2 0 1 0 1
Third make j
to negative value
df1.loc[:, 'j'] = df1.loc[:, 'j'].mul(-1).to_numpy()
df1
i j
a b b c
0 1 0 -1 0
1 1 0 0 -1
2 0 1 0 -1
Final sum i
& j
df1.sum(level=1 ,axis=1)
a b c
0 1 -1 0
1 1 0 -1
2 0 1 -1
we can put multiple columns as list instead of i
and j
CodePudding user response:
columns = ['a', 'b', 'c']
def get_order(input_row):
output_row[input_row[i]] = 1
output_row[input_row[j]] = -1
return pd.Series(output_row)
ordering_df = original_df.apply(get_order, axis = 1)