I have a pandas DataFrame with over 1000 rows, and I want to read values 10 rows at a time. For example, I want to count how many times Logic is 1 in the first 10 rows, then in the next 10 rows, and so on.
Time | Logic |
---|---|
1 | 0 |
2 | 1 |
3 | 0 |
4 | 0 |
. | . |
. | . |
. | . |
997 | 1 |
998 | 0 |
999 | 0 |
CodePudding user response:
If you just need an array, use the underlying numpy array (the number of rows must be a multiple of 10 for the reshape to work):
df['Logic'].to_numpy().reshape(-1,10).sum(1)
output:
array([8, 3, 6, 6, 7, 6, 3, 5, 7, 5, 4, 3, 9, 3, 7, 5, 4, 4, 3, 3, 4, 4,
5, 5, 5, 7, 6, 5, 5, 5, 7, 8, 6, 7, 4, 5, 4, 6, 6, 7, 6, 7, 2, 6,
6, 3, 3, 7, 4, 6, 5, 4, 5, 4, 3, 4, 9, 7, 4, 3, 5, 6, 5, 4, 5, 6,
6, 8, 4, 4, 7, 7, 5, 4, 3, 5, 7, 4, 3, 3, 5, 3, 5, 8, 6, 5, 6, 6,
5, 5, 7, 3, 3, 4, 7, 2, 2, 4, 6, 1])
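For a row count that does not divide evenly by 10, one possible alternative to the reshape is np.add.reduceat, which sums between given start positions and treats the trailing partial block as its own group. This is only a sketch: the small `logic` array is made-up data standing in for df['Logic'].to_numpy().

```python
import numpy as np

# hypothetical data: 25 values, so the last block holds only 5 rows
logic = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
                  0, 0, 0, 1, 0, 1, 0, 0, 0, 1,
                  1, 1, 0, 1, 1])
N = 10

# start position of each block: 0, 10, 20
starts = np.arange(0, len(logic), N)

# sum logic[0:10], logic[10:20], logic[20:]
counts = np.add.reduceat(logic, starts)
print(counts)  # [6 3 4]
```

Whether a short final block should be kept or dropped depends on what you need; drop the last element of `counts` if partial blocks should not be reported.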
With pandas, you could also use groupby and sum:
df.groupby(df.index//10)['Logic'].sum()
or, if you don't have a range index:
import numpy as np
df.groupby(np.arange(len(df))//10)['Logic'].sum()
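To illustrate why the positional version matters, here is a small sketch with a made-up frame (`df_demo`, with an arbitrary non-contiguous index such as you might get after filtering). Dividing the index labels groups by label value, while np.arange(len(df)) groups by row position.

```python
import numpy as np
import pandas as pd

# hypothetical frame whose index is not 0..n-1
df_demo = pd.DataFrame({'Logic': [1, 0, 1, 1, 0, 1]},
                       index=[10, 11, 15, 20, 21, 30])
N = 3

# groups by label // N: labels fall into buckets 3, 3, 5, 6, 7, 10
by_label = df_demo.groupby(df_demo.index // N)['Logic'].sum()

# groups by position // N: rows 0-2 and rows 3-5
by_position = df_demo.groupby(np.arange(len(df_demo)) // N)['Logic'].sum()

print(by_label.tolist())     # [1, 1, 1, 0, 1]
print(by_position.tolist())  # [2, 2]
```

Only the positional grouping gives one result per block of N consecutive rows here.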
Example:
# reproducible input
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame({'Time': np.arange(1000) + 1,
                   'Logic': np.random.choice([0, 1], 1000)})
# calculation
N = 10
out = df.groupby(df.index//N)['Logic'].sum()
output:
0 8
1 3
2 6
3 6
4 7
..
95 2
96 2
97 4
98 6
99 1
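Summing counts the ones only because Logic contains nothing but 0 and 1. If the column could hold other values, a possible variant is to compare against 1 first and sum the resulting booleans. The frame below is made-up data for illustration, not the data from the question.

```python
import numpy as np
import pandas as pd

# hypothetical column containing values other than 0 and 1
df_demo = pd.DataFrame({'Logic': [1, 2, 1, 0, 1, 1, 1, 2]})
N = 4

# True where Logic equals 1, then count the True values per block of N rows
is_one = df_demo['Logic'].eq(1)
counts = is_one.groupby(np.arange(len(df_demo)) // N).sum()

print(counts.tolist())  # [2, 3]
```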