Iterate through a column and change another column value based on it (Pandas Dataframe)

Time:10-24

I want to iterate through a column and, if a value meets some criterion, change the value of another column in that same row.

cycleNum = 0
first = 0

for entry in df1['Ns']:
    if entry < first:
        cycleNum = cycleNum + 1
        df1['cycleNumber'] = cycleNum
        first = 0
    else:
        df1['cycleNumber'] = cycleNum
        first = entry

So I want the cycleNumber value to change for that row only. At the moment it seems to change the value for every row each time the loop body runs.

I am thinking it should be something like

df1['cycleNumber', ROW] = cycleNum

but I can't figure out how to address that specific row.

CodePudding user response:

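Use df1.loc[ROW, 'cycleNumber'], where ROW is the index label of the row:

# Series.items() yields (index label, value) pairs;
# iteritems() was removed in pandas 2.0
for idx, entry in df1['Ns'].items():
    if entry < first:
        cycleNum = cycleNum + 1
        df1.loc[idx, 'cycleNumber'] = cycleNum
        first = 0
    else:
        df1.loc[idx, 'cycleNumber'] = cycleNum
        first = entry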
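
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample 'Ns' values are made up for illustration, and the cycleNumber column is created up front so it keeps an integer dtype. The last line shows a loop-free variant using diff() and cumsum(); as far as I can tell it only matches the loop when 'Ns' is non-negative and never drops on two consecutive rows, so treat it as an assumption to check against your own data.

```python
import pandas as pd

# Hypothetical sample data: 'Ns' counts up and drops back when a new cycle starts.
df1 = pd.DataFrame({"Ns": [0, 1, 2, 0, 1, 2, 3, 0, 1]})
df1["cycleNumber"] = 0  # create the column first so it stays integer

cycleNum = 0
first = 0
for idx, entry in df1["Ns"].items():
    if entry < first:
        # Value dropped compared with the previous row: a new cycle begins.
        cycleNum += 1
        first = 0
    else:
        first = entry
    # Write to this row only, addressed by its index label.
    df1.loc[idx, "cycleNumber"] = cycleNum

# Loop-free variant (assumes non-negative 'Ns' with no two consecutive drops):
# mark every row where the value fell, then take the running count of those marks.
vectorized = (df1["Ns"].diff() < 0).cumsum()

print(df1)
print(vectorized.tolist())
```

On this sample both approaches label the rows 0, 0, 0, 1, 1, 1, 1, 2, 2.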

CodePudding user response:

Starting with this fake DataFrame

    a   b   c    d
0  15   5  -9  999
1 -11  -8  11  999
2  11 -14  -1  999
3  -4  16  12  999
4   4   4  17  999
5  16  -5  -4  999
6  -3  17  18  999
7 -16  -4  17  999
8 -17 -15   6  999
9  17   2  10  999

Make a boolean Series that is True where column 'a' is less than zero, so you can use it to select the rows you are interested in.

>>> lessthanzero = df['a'] < 0
>>> print(lessthanzero)
0    False
1     True
2    False
3     True
4    False
5    False
6     True
7     True
8     True
9    False

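Your example increments a counter by one for each value that meets the condition (here, less than zero). That is the cumulative sum of the boolean Series; dropping duplicates keeps only the first row at which each counter value appears.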

>>> values = lessthanzero.cumsum().drop_duplicates()
>>> values
0    0
1    1
3    2
6    3
7    4
8    5
Name: a, dtype: int64

Filter and assign. The assignment aligns on the index, so each selected row receives its own counter value.

>>> df.loc[lessthanzero,'d'] = values
>>> df
    a   b   c    d
0  15   5  -9  999
1 -11  -8  11    1
2  11 -14  -1  999
3  -4  16  12    2
4   4   4  17  999
5  16  -5  -4  999
6  -3  17  18    3
7 -16  -4  17    4
8 -17 -15   6    5
9  17   2  10  999
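
To reproduce the steps above without depending on random data, here is a small sketch that hard-codes the DataFrame shown in this answer and applies the same mask, cumulative sum, and index-aligned assignment. Nothing here goes beyond the calls already used above; only the literal construction of the frame is added.

```python
import pandas as pd

# The fake DataFrame from this answer, written out literally.
df = pd.DataFrame({
    "a": [15, -11, 11, -4, 4, 16, -3, -16, -17, 17],
    "b": [5, -8, -14, 16, 4, -5, 17, -4, -15, 2],
    "c": [-9, 11, -1, 12, 17, -4, 18, 17, 6, 10],
    "d": 999,  # scalar is broadcast to every row
})

# Boolean mask: True where 'a' is negative.
lessthanzero = df["a"] < 0

# Running count of True values; keep the first row for each count.
values = lessthanzero.cumsum().drop_duplicates()

# Assign only to the masked rows; pandas matches 'values' to rows by index label.
df.loc[lessthanzero, "d"] = values

print(df.to_string())
```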

Code used to build the random fake DataFrame (the generator is unseeded, so the values differ on each run):

import numpy as np
import pandas as pd

rng = np.random.default_rng()
nrows = 10
df = pd.DataFrame(rng.integers(-20,20,(nrows,4)),columns=['a', 'b', 'c', 'd'])
df['d'] = 999
print(df.to_string())