Replace str value in pd df by sampling from a pandas array

I have a pandas df

df = pd.DataFrame({'A': [0.1, 0.1, 0.1, 0.1, 'X'], 'B': [0.1, 0.1, 'X', 0.1, 0.1], 'C': [0.1, 'X', 'X', 'X', 'X']})

 A    B    C
 0.1  0.1  0.1
 0.1  0.1   X
 0.1   X    X
 0.1  0.1   X
  X   0.1   X

and an array

<PandasArray> [0.9999999999999304, 0.9999973764241584, 0.9999997377248664, 0.9615117313882438, 0.871479832883895, 0.9999999999998652, 0.9999999999999994, 0.9999029359407972, 0.999999984174712, 0.9944689702907784] Length: 10, dtype: float64

I would like to replace each X by sampling from the array, so that the distribution of the values in the array is reflected in the df cells that currently hold X.

I have tried

df[df == 'X'] = np.random.choice(arr, replace=True)

which gives this output

 A    B    C
 0.1  0.1  0.1
 0.1  0.1  1.0
 0.1  1.0  1.0
 0.1  0.1  1.0
 1.0  0.1  1.0

Does this randomly sample from the array, and why are the values rounded? I would like to replace X with the exact values from the array.

CodePudding user response：

Does this randomly sample from the array?

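Yes, but only once. Without a size argument, np.random.choice(arr, replace=True) returns a single value, and that one value is assigned to every X. To draw a separate value for each X, pass size equal to the number of X cells.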
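If each X should get its own draw, one possible approach is to sample as many values as there are X cells and assign them through a boolean mask. This is only a sketch: the names `values`, `mask`, `draws`, `cells` and `result` are made up for illustration, the seed is arbitrary, and `values` is a plain NumPy copy of the array from the question (a PandasArray could be converted with `np.asarray(arr)`).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0.1, 0.1, 0.1, 0.1, 'X'],
                   'B': [0.1, 0.1, 'X', 0.1, 0.1],
                   'C': [0.1, 'X', 'X', 'X', 'X']})

# Assumption: the array from the question, held as a plain NumPy array.
values = np.array([0.9999999999999304, 0.9999973764241584, 0.9999997377248664,
                   0.9615117313882438, 0.871479832883895, 0.9999999999998652,
                   0.9999999999999994, 0.9999029359407972, 0.999999984174712,
                   0.9944689702907784])

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# True wherever the frame holds the placeholder string 'X'.
mask = (df == 'X').to_numpy()

# One independent draw (with replacement) per X cell.
draws = rng.choice(values, size=int(mask.sum()), replace=True)

# Work on an object-dtype copy so the original df is left untouched.
cells = df.to_numpy(dtype=object, copy=True)
cells[mask] = draws

# All cells are numeric now, so the columns can be converted to float.
result = pd.DataFrame(cells, index=df.index, columns=df.columns).astype(float)
print(result.to_dict('list'))
```

With only six X cells the sampled values can only roughly reflect the distribution of the array; the more cells are replaced, the closer the match.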

Why are the values rounded?

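It is only a display issue: pandas rounds floats when printing (6 decimal places by default), but the stored values are exact. Converting to a dict of lists shows the real data: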

df[df == 'X'] = np.random.choice(arr, replace=True)
print(df.to_dict('list'))

{'A': [0.1, 0.1, 0.1, 0.1, 0.9999997377248664],
 'B': [0.1, 0.1, 0.9999997377248664, 0.1, 0.1], 
 'C': [0.1, 0.9999997377248664, 0.9999997377248664, 0.9999997377248664, 0.9999997377248664]}
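To see more digits in the printed frame itself, the display precision can be raised temporarily. The snippet below is a small sketch using the pandas `display.precision` option; the one-column frame and the names `text_default` and `text_full` are made up for the example.

```python
import pandas as pd

# Hypothetical one-column frame holding one of the sampled values.
df = pd.DataFrame({'A': [0.1, 0.9999997377248664]})

# Default display precision is 6 decimal places, so the value prints rounded.
text_default = df.to_string()
print(text_default)

# Raise the precision only inside this block; the stored data is unchanged.
with pd.option_context('display.precision', 16):
    text_full = df.to_string()
    print(text_full)
```

`pd.set_option('display.precision', 16)` would change the setting for the whole session instead of a single block.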