I have a pandas dataframe with the distances between names like this:
name1 | name2 | distance |
---|---|---|
Peter | John | 3.4 |
John | James | 2.3 |
James | Peter | 1.4 |
I need to convert it to a distance matrix like the one below. The distance from a name to itself is always 0, and those rows are not in the original dataframe:
matrix | John | Peter | James |
---|---|---|---|
John | 0 | 3.4 | 2.3 |
Peter | 3.4 | 0 | 1.4 |
James | 2.3 | 1.4 | 0 |
Any help?
Thank you!
CodePudding user response:
Here is one way using pivot:
df1 = df.pivot(index = 'name1', columns = 'name2', values='distance').fillna(0)
df2 = df.pivot(index = 'name2', columns = 'name1', values='distance').fillna(df1)
df2
Output:
name1 James John Peter
name2
James 0.0 2.3 1.4
John 2.3 0.0 3.4
Peter 1.4 3.4 0.0
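One thing to watch: fillna(df1) only fills cells that df2 already has, so this appears to rely on every name showing up in both name1 and name2, as in the sample. If your pairs are stored as an upper triangle, some names occur in only one column and the result is not square. A minimal sketch of one way around that, assuming made-up names A, B, D and a hand-built list called names (neither is from the question), is to reindex both pivots on the union of names first:

```python
import pandas as pd

# Hypothetical data: "A" appears only in name1 and "D" only in name2.
df = pd.DataFrame({
    "name1": ["A", "A", "B"],
    "name2": ["B", "D", "D"],
    "distance": [1.0, 2.0, 3.0],
})

# Every name seen in either column, sorted for a stable order.
names = sorted(set(df["name1"]) | set(df["name2"]))

# Same two pivots as above, but stretched to the full square shape.
df1 = (df.pivot(index="name1", columns="name2", values="distance")
         .reindex(index=names, columns=names)
         .fillna(0))
df2 = (df.pivot(index="name2", columns="name1", values="distance")
         .reindex(index=names, columns=names)
         .fillna(df1))

print(df2)
```

The reindex step adds all-NaN rows and columns for the missing names, so the label-aligned fillna has somewhere to put the mirrored distances.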
CodePudding user response:
You can pivot and combine_first with its own transpose:
df2 = df.pivot(index='name1', columns='name2', values='distance')
df2 = df2.combine_first(df2.T).fillna(0)
Output:
James John Peter
name1
James 0.0 2.3 1.4
John 2.3 0.0 3.4
Peter 1.4 3.4 0.0
As a pipeline:
df2 = (df
    .pivot(index='name1', columns='name2', values='distance')
    .pipe(lambda d: d.combine_first(d.T))
    .fillna(0)
)
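For anyone who wants to run this end to end, here is a self-contained sketch using the sample data from the question. pivot sorts the labels alphabetically, so the last step reindexes to the John, Peter, James order the question asked for. The order list is written by hand here and is an assumption, not something pandas derives:

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "name1": ["Peter", "John", "James"],
    "name2": ["John", "James", "Peter"],
    "distance": [3.4, 2.3, 1.4],
})

# Pivot, mirror the missing half from the transpose, zero the diagonal.
matrix = (
    df.pivot(index="name1", columns="name2", values="distance")
      .pipe(lambda d: d.combine_first(d.T))
      .fillna(0)
)

# Restore the row/column order used in the question (hand-written list).
order = ["John", "Peter", "James"]
matrix = matrix.reindex(index=order, columns=order)

# Sanity checks: the matrix should be symmetric with a zero diagonal.
values = matrix.to_numpy()
assert np.allclose(values, values.T)
assert np.allclose(np.diag(values), 0)

print(matrix)
```

The two asserts are a cheap way to catch a pair that was entered twice with different distances, since that would break the symmetry.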