I want to iterate through a column and, if a value meets some criterion, change the value of another column in that same row.
cycleNum = 0
first = 0
for entry in df1['Ns']:
    if entry < first:
        cycleNum = cycleNum + 1
        df1['cycleNumber'] = cycleNum
        first = 0
    else:
        df1['cycleNumber'] = cycleNum
        first = entry
So I want the cycleNumber column value to change for that row only. At the moment it seems to change the value for every row each time it runs.
I am thinking it should be something like
df1['cycleNumber', ROW] = cycleNum
but I can't figure out how to address that specific row.
CodePudding user response:
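Use df1.loc[ROW, 'cycleNumber'] to assign to a single row, where ROW is the index label of that row:
# Series.items() yields (index label, value) pairs;
# the older iteritems() was removed in pandas 2.0.
for idx, entry in df1['Ns'].items():
    if entry < first:
        cycleNum = cycleNum + 1
        df1.loc[idx, 'cycleNumber'] = cycleNum
        first = 0
    else:
        df1.loc[idx, 'cycleNumber'] = cycleNum
        first = entry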
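For reference, here is a self-contained version of that loop as a minimal sketch. The sample Ns values and the helper name assign_cycle_numbers are made up for illustration, and the column is initialised to 0 first so that it keeps an integer dtype.

```python
import pandas as pd


def assign_cycle_numbers(df):
    """Hypothetical helper: fill df['cycleNumber'] row by row using .loc."""
    df["cycleNumber"] = 0          # create the column up front (integer dtype)
    cycle_num = 0
    first = 0
    for idx, entry in df["Ns"].items():
        if entry < first:
            # Value dropped below the previous one: a new cycle starts here.
            cycle_num += 1
            first = 0
        else:
            first = entry
        # Assign to this row only (row label idx, column 'cycleNumber').
        df.loc[idx, "cycleNumber"] = cycle_num
    return df


# Made-up sample data: Ns counts up and then resets to 0.
df_demo = pd.DataFrame({"Ns": [0, 1, 2, 0, 1, 2, 0]})
assign_cycle_numbers(df_demo)
print(df_demo)
```

Both branches of the original if/else write the same value, so the assignment can sit once after the branch.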
CodePudding user response:
Starting with this fake DataFrame
    a   b   c    d
0  15   5  -9  999
1 -11  -8  11  999
2  11 -14  -1  999
3  -4  16  12  999
4   4   4  17  999
5  16  -5  -4  999
6  -3  17  18  999
7 -16  -4  17  999
8 -17 -15   6  999
9  17   2  10  999
Make a boolean Series that is True where column 'a' is less than zero, so you can use it to select/filter the rows you are interested in.
>>> lessthanzero = df['a'] < 0
>>> print(lessthanzero)
0 False
1 True
2 False
3 True
4 False
5 False
6 True
7 True
8 True
9 False
Your example increments a counter by one each time the condition is met (here, a value less than zero), which is the cumulative sum of the boolean Series.
>>> values = lessthanzero.cumsum().drop_duplicates()
>>> values
0 0
1 1
3 2
6 3
7 4
8 5
Name: a, dtype: int64
Filter and assign
>>> df.loc[lessthanzero,'d'] = values
>>> df
    a   b   c    d
0  15   5  -9  999
1 -11  -8  11    1
2  11 -14  -1  999
3  -4  16  12    2
4   4   4  17  999
5  16  -5  -4  999
6  -3  17  18    3
7 -16  -4  17    4
8 -17 -15   6    5
9  17   2  10  999
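The same cumulative-sum idea can be applied to the Ns column from the question without a loop. This is only a sketch under an assumption: it treats every drop below the previous value as the start of a new cycle. The question's loop resets first to 0 after a drop, so the two approaches can disagree when two drops occur in consecutive rows. The sample values are made up.

```python
import pandas as pd

# Made-up sample data: Ns counts up and then resets to 0.
df1 = pd.DataFrame({"Ns": [0, 1, 2, 0, 1, 2, 0]})

# True wherever the value is lower than the value in the previous row.
# The first row has no predecessor (diff is NaN), so it compares as False.
drops = df1["Ns"].diff() < 0

# Running count of drops so far = cycle number for each row.
df1["cycleNumber"] = drops.cumsum()
print(df1)
```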
Code used to make a random fake DataFrame
import numpy as np
import pandas as pd
rng = np.random.default_rng()
nrows = 10
df = pd.DataFrame(rng.integers(-20,20,(nrows,4)),columns=['a', 'b', 'c', 'd'])
df['d'] = 999
print(df.to_string())
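The generator above is unseeded, so each run produces different values. To reproduce the exact table shown earlier, a minimal sketch is to build the DataFrame from those values explicitly and repeat the same steps:

```python
import pandas as pd

# Same values as the fake DataFrame printed at the top of this answer.
df = pd.DataFrame({
    "a": [15, -11, 11, -4, 4, 16, -3, -16, -17, 17],
    "b": [5, -8, -14, 16, 4, -5, 17, -4, -15, 2],
    "c": [-9, 11, -1, 12, 17, -4, 18, 17, 6, 10],
})
df["d"] = 999

# Boolean mask: True where column 'a' is negative.
lessthanzero = df["a"] < 0

# Running count of True values; drop_duplicates keeps the first row
# at which each count appears.
values = lessthanzero.cumsum().drop_duplicates()

# Assignment aligns on the index, so only the masked rows are updated.
df.loc[lessthanzero, "d"] = values
print(df.to_string())
```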