Pandas data frame adding a column based on data from 2 other columns

Time:04-01

I have a data frame. One column is called fractal. It contains 0s and 1s, where a 1 marks a fractal. Here is the output of np.flatnonzero on that column, to give an idea of how often fractals occur:

[ 15  32  77  93 110 152 165 185 194 201 223 232 245 264 294 306 320 327
 347 370 380 391 409 436 447 460 474 481 500 534 549 561 579 586 599 620
 627 641 653 670 685 704 711 758 784]

Another column, df['high'], contains the daily high prices of a financial instrument.

I want to add a column to the data frame, df['f_support'], that holds the high price associated with the most recent fractal.

That high price is found 2 rows before the fractal signal. In other words, the column should repeat the same high price until the next fractal signal, at which point a new high price starts filling the column.

Looking at the output of np.flatnonzero, the column f_support should contain this:

rows    f_support value
0–14    nothing
15–31   df['high'].iloc[13]
32–76   df['high'].iloc[30]

and so on.

I hope I've conveyed this so it makes sense. There's probably an easy way to do this but it's beyond my present scope.

CodePudding user response:

IIUC:

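import numpy as np
# Integer positions of the rows where a fractal is flagged
fracloc = np.flatnonzero(df.fractal)
# On each fractal row, store the high from 2 rows earlier, then forward-fill
df.loc[df.index[fracloc], 'f_support'] = df['high'].iloc[fracloc - 2].to_numpy()
df['f_support'] = df['f_support'].ffill()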

df

    fractal       high  f_support
0         0  74.961120        NaN
1         0   2.297611        NaN
2         0  60.294702        NaN
3         0  91.874424        NaN
4         0  69.327601        NaN
..      ...        ...        ...
73        0  34.925407  61.977998
74        0  64.475880  61.977998
75        0  86.939800  61.977998
76        0  42.377974  61.977998
77        1  42.725907  86.939800
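
The same idea can also be written without positional indexing. The following is only a sketch on a small made-up frame (the toy values and the fractal positions 3 and 6 are assumptions for illustration, not the asker's data): shift the high column down by 2 rows, keep the shifted value only on fractal rows, then forward-fill.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: fractal signals at positions 3 and 6
df = pd.DataFrame({
    "fractal": [0, 0, 0, 1, 0, 0, 1, 0],
    "high": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0],
})

# shift(2) lines each row up with the high from 2 rows earlier;
# where(...) keeps that value only on fractal rows (NaN elsewhere);
# ffill() carries it forward until the next fractal row.
df["f_support"] = df["high"].shift(2).where(df["fractal"].eq(1)).ffill()

print(df)
```

One difference from the iloc[fracloc - 2] version: if a fractal falls in the first two rows, shift(2) yields NaN there, whereas a negative position passed to iloc would wrap around and pick a value from the end of the frame.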