How to correctly iterate over a pandas column considering a slice of values on Python?

Time:08-08

I happen to have the following information stored in a variable called df_trading_pair:

           Start Date  Open Price  High Price   Low Price  Close Price         Volume                End Date
0 2022-08-08 07:15:00  0.00001241  0.00001242  0.00001238   0.00001239  16808259334.0 2022-08-08 07:19:59.999
1 2022-08-08 07:20:00  0.00001238  0.00001239  0.00001235   0.00001238   7237826684.0 2022-08-08 07:24:59.999
2 2022-08-08 07:25:00  0.00001238  0.00001239  0.00001237   0.00001238   1768234135.0 2022-08-08 07:29:59.999
3 2022-08-08 07:30:00  0.00001238  0.00001239  0.00001236   0.00001236   5243964161.0 2022-08-08 07:34:59.999
4 2022-08-08 07:35:00  0.00001236  0.00001237  0.00001235   0.00001235   8802029320.0 2022-08-08 07:39:59.999
5 2022-08-08 07:40:00  0.00001234  0.00001236  0.00001233   0.00001234   3038529151.0 2022-08-08 07:44:59.999
6 2022-08-08 07:45:00  0.00001233  0.00001236  0.00001232   0.00001235   7700037899.0 2022-08-08 07:49:59.999
7 2022-08-08 07:50:00  0.00001235  0.00001237  0.00001235   0.00001237   1929229917.0 2022-08-08 07:54:59.999 

I am interested in iterating over the Open Price column using a range of 4 values, in such a way that it prints the following information on each iteration:

Iteration #1

0    0.00001241
1    0.00001238
2    0.00001238
3    0.00001238
Name: Open Price, dtype: float64

Iteration #2

1    0.00001238
2    0.00001238
3    0.00001238
4    0.00001236
Name: Open Price, dtype: float64

. . .

Iteration #5

4    0.00001236
5    0.00001234
6    0.00001233
7    0.00001235
Name: Open Price, dtype: float64

As such, I already have part of the solution and it is as follows:

for i in range(0, len(df_trading_pair)):
    slc = df_trading_pair["Open Price"].iloc[i : i + 4]
    print(slc)
    i = i + 1
    print("")

However, there is a problem and that is that my solution iterates 3 more times, printing the following after the 5th iteration:

Iteration #6

5    0.00001234
6    0.00001233
7    0.00001235
Name: Open Price, dtype: float64

Iteration #7

6    0.00001233
7    0.00001235
Name: Open Price, dtype: float64

Iteration #8

7    0.00001235
Name: Open Price, dtype: float64

How can I fix it? And, if possible, why did my solution not work as expected?

CodePudding user response:

Another solution, using .rolling:

for d in df_trading_pair["Open Price"].rolling(4):
    # Skip the partial windows at the start (fewer than 4 rows)
    if len(d) != 4:
        continue
    print(d, "\n")

Prints:

0    0.00001241
1    0.00001238
2    0.00001238
3    0.00001238
Name: Open Price, dtype: object 

1    0.00001238
2    0.00001238
3    0.00001238
4    0.00001236
Name: Open Price, dtype: object 

2    0.00001238
3    0.00001238
4    0.00001236
5    0.00001234
Name: Open Price, dtype: object 

3    0.00001238
4    0.00001236
5    0.00001234
6    0.00001233
Name: Open Price, dtype: object 

4    0.00001236
5    0.00001234
6    0.00001233
7    0.00001235
Name: Open Price, dtype: object 
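
The length check is needed because iterating over a rolling object yields one window per row, so the first few windows are shorter than the requested size. A minimal sketch of that behaviour, assuming pandas 1.1 or later (where rolling objects can be iterated) and using the Open Price values from the question; the names `open_prices`, `window_lengths` and `full_windows` are made up for illustration:

```python
import pandas as pd

# Open Price values copied from the question's sample data.
open_prices = pd.Series(
    [0.00001241, 0.00001238, 0.00001238, 0.00001238,
     0.00001236, 0.00001234, 0.00001233, 0.00001235],
    name="Open Price",
)

# One window is produced per row; the leading windows are partial.
window_lengths = [len(w) for w in open_prices.rolling(4)]

# Keep only the windows that actually contain 4 rows.
full_windows = [w for w in open_prices.rolling(4) if len(w) == 4]

print(window_lengths)
print(len(full_windows))
```

With eight rows this should leave five full windows, matching the five iterations the question asks for.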

CodePudding user response:

You can just add a condition on the length of the resulting Series:

for i in range(0, len(df_trading_pair)):
    slc = df_trading_pair["Open Price"].iloc[i : i + 4]
    if len(slc) == 4:
        print(f'Iteration #{i + 1}, len: {len(slc)}')
        print(slc)
        print("")
    else:
        # Stop at the first slice shorter than 4 rows
        break
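
As for why the original loop overshoots: `.iloc` slicing quietly clips a slice that runs past the end of the Series, so the last three windows come back shorter instead of raising an error. Reassigning `i` inside the `for` loop also has no effect, because `range` supplies the next value on every pass. An alternative to checking the length is to stop the range at the last start position that still gives a full window. A minimal sketch, using the Open Price values from the question; the names `open_prices`, `window` and `slices` are made up for illustration:

```python
import pandas as pd

# Open Price values copied from the question's sample data.
open_prices = pd.Series(
    [0.00001241, 0.00001238, 0.00001238, 0.00001238,
     0.00001236, 0.00001234, 0.00001233, 0.00001235],
    name="Open Price",
)

window = 4  # number of rows per slice

# 8 rows and a window of 4 give start positions 0..4, i.e. 5 iterations.
# If the Series is shorter than the window, the range is empty.
slices = [
    open_prices.iloc[start : start + window]
    for start in range(len(open_prices) - window + 1)
]

for number, slc in enumerate(slices, start=1):
    print(f"Iteration #{number}")
    print(slc, "\n")
```

Computing the upper bound once avoids building and discarding the short slices at the end, which is mostly a matter of taste for a frame this small.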