I need to average the Y values corresponding to the values in the X array...
X=np.array([ 1, 1, 2, 2, 2, 2, 3, 3 ... ])
Y=np.array([ 10, 30, 15, 10, 16, 10, 15, 20 ... ])
In other words, the equivalents of the 1 values in the X array are 10 and 30 in the Y array, and the average of this is 20, the equivalents of the 2 values are 15, 10, 16, and 10, and their average is 12.75, and so on...
How can I calculate these average values?
CodePudding user response:
One option is to use a property of linear regression (with categorical variables):
import numpy as np
x = np.array([ 1, 1, 2, 2, 2, 2, 3, 3 ])
y = np.array([ 10, 30, 15, 10, 16, 10, 15, 20 ])
x_dummies = x[:, None] == np.unique(x)
means = np.linalg.lstsq(x_dummies, y, rcond=None)[0]
print(means) # [20. 12.75 17.5 ]
CodePudding user response:
You can try using pandas
import pandas as pd
import numpy as np
N = pd.DataFrame(np.transpose([X,Y]),
columns=['X', 'Y']).groupby('X')['Y'].mean().to_numpy()
# array([20. , 12.75, 17.5 ])
CodePudding user response:
import numpy as np
X = np.array([ 1, 1, 2, 2, 2, 2, 3, 3])
Y = np.array([ 10, 30, 15, 10, 16, 10, 15, 20])
# Only unique values
unique_vals = np.unique(X);
# Loop for every value
for val in unique_vals:
# Search for proper indexes in Y
idx = np.where(X == val)
# Mean for finded indexes
aver = np.mean(Y[idx])
print(f"Average for {val}: {aver}")
Result:
Average for 1: 20.0
Average for 2: 12.75
Average for 3: 17.5
CodePudding user response:
you can use something like the below code :
import numpy as np
X=np.array([ 1, 1, 2, 2, 2, 2, 3, 3])
Y=np.array([ 10, 30, 15, 10, 16, 10, 15, 20])
def groupby(a, b):
# Get argsort indices, to be used to sort a and b in the next steps
sidx = b.argsort(kind='mergesort')
a_sorted = a[sidx]
b_sorted = b[sidx]
# Get the group limit indices (start, stop of groups)
cut_idx = np.flatnonzero(np.r_[True,b_sorted[1:] != b_sorted[:-1],True])
# Split input array with those start, stop ones
out = [a_sorted[i:j] for i,j in zip(cut_idx[:-1],cut_idx[1:])]
return out
group_by_array=groupby(Y,X)
for item in group_by_array:
print(np.average(item))
I use the information in the below link to answer the question: Group numpy into multiple sub-arrays using an array of values
CodePudding user response:
I think this solution should work:
avg_arr = []
i = 1
while i <= np.max(x):
inds = np.where(x == i)
my_val = np.average(y[inds[0][0]:inds[0][-1]])
avg_arr.append(my_val)
i =1
Definitely, not the cleanest, but I was able to test it quickly and it does indeed work.