I have a NumPy array of (x, y) pairs:
[[214,295], [215, 294], [229, 226], [229, 227]]
After averaging the bunch of points selected with the Z-score, the result I get is:
[[222.0, 260.5], [214.0, 295.0], [229.0, 226.0]]
Expected result:
[[average of [214, 295] and [215, 294]], [average of [229, 226] and [229, 227]]]
It should always return exactly 2 points, never 3 or more. Do I need to calculate a new Z-score separately?
Here is another example:
[[95, 132], [96, 132], [94, 133], [134, 239], [95, 131]]
Current output:
[[ 95. 132.], [134. 239.]]
If the data also had many points bunched around [134, 239], I would like a more robust way to split the two main bunches.
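import numpy as np
from scipy import stats

tempList = np.array([[214, 295], [215, 294], [229, 226], [229, 227]])
# Z-score of each column (x and y) computed over all points
z = stats.zscore(tempList, axis=0)
# True for points whose x and y are both within 1 standard deviation of the mean
z = [abs(x) < 1 and abs(y) < 1 for x, y in z]
# Points outside that band are kept unchanged
newList = tempList[[not x for x in z]]
# Points inside the band are averaged into a single point
tempList = tempList[z]
newList = np.concatenate([[tempList.mean(axis=0)], newList])
print(newList)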
Answer:
If the points are already ordered so that each consecutive pair belongs to the same bunch, it's easy to do what you describe just by reshaping the array:
import numpy as np
tempList = np.array([[214,295], [215, 294],[229, 226], [229, 227]])
tempList = tempList.reshape( (-1,2,2) )
print(tempList)
print("---")
print( tempList.mean( axis=1 ) )
Output:
[[[214 295]
  [215 294]]

 [[229 226]
  [229 227]]]
---
[[214.5 294.5]
 [229.  226.5]]
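The reshape only works when the points already arrive in order and in equal-sized groups. When that is not guaranteed, as in the second example from the question (four points in one bunch, one in the other), one possible approach is a small two-cluster k-means. This is only a rough sketch under assumptions: the helper name `two_bunch_means`, the choice to start from the two farthest-apart points, and the iteration cap are illustrative and not taken from the question. It also assumes there are exactly two bunches and at least two distinct points.

```python
import numpy as np

def two_bunch_means(points, n_iter=20):
    """Hypothetical helper: split points into two bunches, return each bunch's mean."""
    pts = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    # Assumption: the two farthest-apart points lie in different bunches,
    # so they are used as the initial centres.
    i, j = np.unravel_index(np.argmax(dists), dists.shape)
    centres = pts[[i, j]]
    for _ in range(n_iter):
        # Assign every point to its nearest centre.
        to_centres = np.linalg.norm(pts[:, None, :] - centres[None, :, :], axis=2)
        labels = to_centres.argmin(axis=1)
        # Move each centre to the mean of the points assigned to it.
        new_centres = np.array([pts[labels == k].mean(axis=0) for k in (0, 1)])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return centres

print(two_bunch_means([[214, 295], [215, 294], [229, 226], [229, 227]]))
print(two_bunch_means([[95, 132], [96, 132], [94, 133], [134, 239], [95, 131]]))
```

For the first array this gives the two means [214.5, 294.5] and [229, 226.5], the same as the reshape. The order of the two returned rows depends on which point is picked as the first centre, so sort them if you need a fixed order.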