How to compare two columns in a DataFrame and change the value of a third column based on that comparison?

I have the following table in pandas:

index | project | category | period | update | amount
0     | 100130  | labour   | 202201 | 202203 | 1000
1     | 100130  | labour   | 202202 | 202203 | 1000
2     | 100130  | labour   | 202203 | 202203 | 1000
3     | 100130  | labour   | 202204 | 202203 | 1000
4     | 100130  | labour   | 202205 | 202203 | 1000

My final goal is to get a table grouped by project and category with the sum of the amount column, but only from the month of the update onward. For the example above, that means the sum from 202203 through 202205, which is 3000 for project 100130 and category labour.

As a first step I tried the following loop:

for index, row in table.iterrows():
    if row["period"] < row["update"]:
        row["amount"] = 0

But:

this iteration does not work
is there a simple and less time-consuming way to do it? My table has over 60,000 rows, so iterating is probably not a good idea.

CodePudding user response：

# Set amount to 0 wherever period is earlier than update (boolean mask, no loop)
table.loc[table["period"] < table["update"], "amount"] = 0
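
For the end goal described in the question (the sum of amount per project and category, counting only periods from the update month onward), it may be simpler to filter first and then group, without zeroing anything. This is only a sketch: the sample data is rebuilt from the question's table, the names `recent` and `summary` are made up for illustration, and it assumes period and update are stored as integers in YYYYMM form so that a plain comparison orders them correctly.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
table = pd.DataFrame({
    "project": [100130] * 5,
    "category": ["labour"] * 5,
    "period": [202201, 202202, 202203, 202204, 202205],
    "update": [202203] * 5,
    "amount": [1000] * 5,
})

# Keep only rows whose period is on or after the update month.
recent = table[table["period"] >= table["update"]]

# Sum the remaining amounts per project and category.
summary = recent.groupby(["project", "category"], as_index=False)["amount"].sum()
print(summary)
```

For the sample data this gives a single row with amount 3000 for project 100130 and category labour, matching the expected result in the question.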

CodePudding user response：

I did some more research and this code seems to solve my problem:

def check_update(row):
    if row["period"] < row["update"]:
        return 0
    else:
        return row["amount"]

table["amount2"] = table.apply(check_update, axis=1)
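
Since apply with axis=1 still calls the function once per row, a column-wise version may be noticeably faster on 60,000 rows. The sketch below uses Series.where as one possible equivalent and checks it against the row-wise function on the sample data from the question. The names `vectorized` and `row_wise` are made up for illustration, and it again assumes integer YYYYMM values.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
table = pd.DataFrame({
    "project": [100130] * 5,
    "category": ["labour"] * 5,
    "period": [202201, 202202, 202203, 202204, 202205],
    "update": [202203] * 5,
    "amount": [1000] * 5,
})

def check_update(row):
    # Row-wise version from the answer above.
    if row["period"] < row["update"]:
        return 0
    else:
        return row["amount"]

# Series.where keeps amount where the condition holds and puts 0 elsewhere.
vectorized = table["amount"].where(table["period"] >= table["update"], 0)

# Row-wise result, computed only to compare the two approaches.
row_wise = table.apply(check_update, axis=1)

print(vectorized.tolist())  # [0, 0, 1000, 1000, 1000]
print(vectorized.tolist() == row_wise.tolist())  # True
```

Assigning the result with table["amount2"] = vectorized would give the same column as the apply call.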