import pandas as pd
data = [[1, 1, 2, 1, 0], [ 2, 2, 2, 1, 4], [ 3, 1, 0, 1,4], [ 4, 1, 3, 1, 4],
[5, 1, 6, 1, 4], [ 6, 1, 2, 0, 4], [ 7, 1, 2, 7,4], [ 8, 1, 2, 1, 1],
[9, 1, 2, 1, 2], [10, 1, 2, 1, 3], [11, 1, 2, 1,5], [12, 1, 2, 1, 6]]
df = pd.DataFrame(data, columns=['Id','c1', 'c2','c3', 'c4'])
import scipy.spatial.distance
mat = scipy.spatial.distance.cdist(
df[['c1','c2','c3','c4']],
df[['c1','c2','c3','c4']],
metric='euclidean'
)
new_df = pd.DataFrame(mat, index=df['Id'], columns=df['Id'])
When I sort the full DataFrame, it works:
new_df.sort_values(by=1,ascending=True,kind="mergesort",axis=1)
but when I sort a subset of the DataFrame, it does not work:
i = 1
j = 2
new_dff = new_df[i:j]
new_dff.sort_values(by=1, ascending=True, kind="mergesort", axis=1)
CodePudding user response:
To select a subset of rows by label, use DataFrame.loc (new_df[i:j] slices by position, so it skips the row labelled 1):
i = 1
j = 2
new_dff=new_df.loc[i:j]
print (new_dff)
Id 1 2 3 4 5 6 7 \
Id
1 0.000000 4.123106 4.472136 4.123106 5.656854 4.123106 7.211103
2 4.123106 0.000000 2.236068 1.414214 4.123106 1.414214 6.082763
Id 8 9 10 11 12
Id
1 1.000000 2.000000 3.000000 5.000000 6.000000
2 3.162278 2.236068 1.414214 1.414214 2.236068
Then sorting works:
new_dff = new_dff.sort_values(by=1, ascending=True, kind="mergesort", axis=1)
print (new_dff)
Id 1 8 9 10 2 4 6 \
Id
1 0.000000 1.000000 2.000000 3.000000 4.123106 4.123106 4.123106
2 4.123106 3.162278 2.236068 1.414214 0.000000 1.414214 1.414214
Id 3 11 5 12 7
Id
1 4.472136 5.000000 5.656854 6.000000 7.211103
2 2.236068 1.414214 4.123106 2.236068 6.082763
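The difference between the two kinds of slicing can be checked on a small frame. This is only a sketch: demo and its values are made up for illustration, it just reuses the same 1-based Id labels as new_df.

```python
import pandas as pd

# Hypothetical stand-in for new_df: 1-based integer labels on both axes.
demo = pd.DataFrame(
    [[0.0, 4.0, 2.0], [4.0, 0.0, 1.0], [2.0, 1.0, 0.0]],
    index=pd.Index([1, 2, 3], name="Id"),
    columns=pd.Index([1, 2, 3], name="Id"),
)

i, j = 1, 2
by_position = demo[i:j]    # positional slice: rows at positions i..j-1
by_label = demo.loc[i:j]   # label slice: rows labelled i..j, end included

print(list(by_position.index))
print(list(by_label.index))
```

With 1-based labels, position 1 is the row labelled 2, which is why the plain slice drops the row labelled 1.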
Or, for a subset of columns, use : as the row selector to select all rows:
i = 1
j = 2
new_dff=new_df.loc[:, i:j]
print (new_dff)
Id 1 2
Id
1 0.000000 4.123106
2 4.123106 0.000000
3 4.472136 2.236068
4 4.123106 1.414214
5 5.656854 4.123106
6 4.123106 1.414214
7 7.211103 6.082763
8 1.000000 3.162278
9 2.000000 2.236068
10 3.000000 1.414214
11 5.000000 1.414214
12 6.000000 2.236068
Or both:
i = 1
j = 2
new_dff=new_df.loc[i:j, i:j]
print (new_dff)
Id 1 2
Id
1 0.000000 4.123106
2 4.123106 0.000000
CodePudding user response:
The expected output is unclear.
You are asking to sort the DataFrame's columns by the values in the row with index 1.
However, when you slice the rows with new_dff = new_df[i:j], the row with index 1 is lost, so the lookup fails and you get an error.
Do you want to subset the columns instead? new_dff = new_df.loc[:, i:j]
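If the positional slice is what you actually want, one possible workaround (an assumption about the intent, not something stated in the question) is to sort by a row label that is still present in the subset. A small sketch with a made-up frame, demo, in place of new_df:

```python
import pandas as pd

# Hypothetical stand-in for new_df: 1-based integer labels on both axes.
demo = pd.DataFrame(
    [[0.0, 4.0, 2.0], [4.0, 0.0, 1.0], [2.0, 1.0, 0.0]],
    index=pd.Index([1, 2, 3], name="Id"),
    columns=pd.Index([1, 2, 3], name="Id"),
)

subset = demo[1:2]  # positional slice: only the row labelled 2 remains

# Sorting columns by row label 1 fails, because that row is gone.
try:
    subset.sort_values(by=1, axis=1)
    raised = False
except KeyError:
    raised = True

# Sorting by a label that is still present works.
row_label = int(subset.index[0])
sorted_subset = subset.sort_values(by=row_label, axis=1, kind="mergesort")
print(sorted_subset)
```

Here row_label is taken from the subset itself, so the sort key always exists, whatever i and j are.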