Change values in column based on condition - Python

Time:02-22

I have a dataset containing a thousand zeros, five hundred ones, and so on. I want to change the first 400 zeros to 0.3 and the next 600 zeros to 0.6. Then I want to change the first 200 ones to 1.4 and the next 300 ones to 1.8, and so on.

The point is that I want to replace each integer value with fractional values according to the specified frequencies.

Ex:

Input:
Dataset: 0,0,0,0,0,1,1,1,1,1,1
Frequency = [4,1] for 0 & [3,3] for 1
New dataset = [0.2,0.8] for 0 & [1.2,1.4] for 1

Output: 0.2,0.2,0.2,0.2,0.8,1.2,1.2,1.2,1.4,1.4,1.4

CodePudding user response:

Assuming your data points are sorted, a simple solution would be:

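import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]})

# Run lengths per original value: 0 -> [4, 1], 1 -> [3, 3]
frequencies = [
    [4, 1],
    [3, 3],
]

# Replacement for each run, in the same order as frequencies
new_values = [
    [0.2, 0.8],
    [1.2, 1.4],
]

# Flatten the run lengths: [4, 1, 3, 3]
a = np.concatenate(frequencies)
# Repeat each run's index by its length, then look up the flattened values
df['new_col'] = np.concatenate(new_values)[np.arange(len(a)).repeat(a)]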

Output:

>>> df
    col  new_col
0     0      0.2
1     0      0.2
2     0      0.2
3     0      0.2
4     0      0.8
5     1      1.2
6     1      1.2
7     1      1.2
8     1      1.4
9     1      1.4
10    1      1.4
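
The approach above relies on the column being sorted and on the frequencies adding up to the number of rows. If the rows might not be sorted, one possible variant is to rank each row among the rows that share its value and use that rank to pick the replacement. This is only a sketch: `remap_by_frequency` is a hypothetical helper name, and passing the frequencies and new values as dicts keyed by the original value is an assumption, not something from the question.

```python
import numpy as np
import pandas as pd


def remap_by_frequency(col, frequencies, new_values):
    """Hypothetical helper (not part of pandas or NumPy).

    For each original value v, the first frequencies[v][0] occurrences
    become new_values[v][0], the next frequencies[v][1] occurrences
    become new_values[v][1], and so on, in order of appearance.
    """
    col = pd.Series(col)
    # Position of each row among the rows holding the same value: 0, 1, 2, ...
    rank = col.groupby(col).cumcount()
    out = pd.Series(np.nan, index=col.index, dtype=float)
    for value, counts in frequencies.items():
        mask = col == value
        if mask.sum() != sum(counts):
            raise ValueError(f"frequencies for {value} do not match its count")
        # Expand e.g. [4, 1] and [0.2, 0.8] into [0.2, 0.2, 0.2, 0.2, 0.8]
        expanded = np.repeat(new_values[value], counts)
        out[mask] = expanded[rank[mask].to_numpy()]
    return out


# Same values as in the question, but deliberately unsorted
df = pd.DataFrame({'col': [1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1]})
df['new_col'] = remap_by_frequency(
    df['col'],
    frequencies={0: [4, 1], 1: [3, 3]},
    new_values={0: [0.2, 0.8], 1: [1.2, 1.4]},
)
```

Values that do not appear in `frequencies` are left as NaN in this sketch, and a mismatch between the frequencies and the actual count of a value raises an error rather than silently misaligning.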